Large-scale grid computing for content-based image retrieval

Content-based image retrieval (CBIR) technologies offer many advantages over purely text-based image search. However, one drawback of CBIR is the increased computational cost arising from tasks such as image processing, feature extraction, image classification, and object detection and recognition. Consequently, CBIR systems have suffered from a lack of scalability, which has greatly hampered their adoption for real-world public and commercial image search. At the same time, paradigms for large-scale heterogeneous distributed computing, such as Grid computing, cloud computing, and utility-based computing, are gaining traction as a way of providing more scalable and efficient solutions to large-scale computing tasks. In this paper, we present an approach in which a large distributed processing Grid has been used to apply a range of CBIR methods to a substantial number of images. By massively distributing the required computation across thousands of Grid nodes, we have achieved very high throughput with relatively low overhead. This has allowed us to analyse and index about 25 million high-resolution images thus far while using just two servers for storage and job submission. The CBIR system was developed by Imense Ltd. and is based on automated analysis and recognition of image content using a semantic ontology. It features a range of image processing and analysis modules, including image segmentation, region classification, scene analysis, object detection, and face recognition methods.
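
The distribution pattern described above, in which a large image collection is split into independent jobs that are analysed in parallel and merged into one index, can be illustrated with a minimal Python sketch. This is not the Imense system or any Grid middleware API: the function names, the batch size, and the placeholder labels are all hypothetical, and a local thread pool stands in for the Grid nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def make_batches(image_ids: List[str], batch_size: int) -> List[List[str]]:
    """Split the image list into fixed-size jobs (hypothetical job granularity)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [image_ids[i:i + batch_size] for i in range(0, len(image_ids), batch_size)]


def analyse_batch(batch: List[str]) -> Dict[str, List[str]]:
    """Hypothetical stand-in for the CBIR analysis modules.

    A real worker would run segmentation, classification, detection, etc.
    Here each image simply receives one placeholder label.
    """
    return {image_id: [f"label-for-{image_id}"] for image_id in batch}


def index_images(image_ids: List[str], batch_size: int = 4, workers: int = 3) -> Dict[str, List[str]]:
    """Dispatch batches to a worker pool and merge the partial results.

    The thread pool is an assumption standing in for remote Grid nodes;
    the submitting process only creates jobs and collects results.
    """
    index: Dict[str, List[str]] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(analyse_batch, make_batches(image_ids, batch_size)):
            index.update(partial)
    return index


if __name__ == "__main__":
    ids = [f"img{i}" for i in range(10)]
    print(len(index_images(ids)), "images indexed")
```

Because each batch is independent, the submitting side only needs to create jobs and collect results, which is consistent with the abstract's point that a small number of servers can drive a very large pool of workers.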
