AutoFocus: An Open-source Facet-Driven Enterprise Search Solution

In the final presentation of the afternoon, Jeroen Wester of Aduna described the main features of the company's open-source, facet-driven enterprise search solution, AutoFocus. AutoFocus builds on Semantic Web technologies, in particular RDF (Resource Description Framework), although a bewildering variety of related technologies - XML, SOAP, SKOS, OWL - are also employed. As well as combining components for metadata-based data integration and cross-silo search and navigation in a single enterprise search solution, AutoFocus has the advantage of being open source, meaning that its source code is freely available for customization.

AutoFocus consists of two components. AutoFocus itself is a compiled desktop application that scans and indexes documents on specified 'sources' and then provides facilities for searching them. Sources can include local file system folders, websites, IMAP email folders and remote Aduna AutoFocus Servers. The second component, AutoFocus Metadata Server, gives an AutoFocus client access to network resources. Significantly, AutoFocus Metadata Server not only offers standard content searching but also indexes and makes searchable key metadata fields embedded in source documents, such as those in the MS Office document Properties sheet and the XMP metadata in Acrobat PDF files.
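
To make the idea of searchable metadata fields concrete, the following is a minimal Python sketch of a facet index. It is not AutoFocus code: the function names, the field names (`author`, `format`) and the sample records are all hypothetical, and the metadata is assumed to have already been extracted from the documents.

```python
from collections import defaultdict


def build_facet_index(records):
    """Map each (field, value) pair to the set of documents carrying it.

    `records` is a hypothetical dict of document id -> metadata dict,
    standing in for fields extracted from document properties.
    """
    index = defaultdict(set)
    for doc_id, metadata in records.items():
        for field, value in metadata.items():
            index[(field, value)].add(doc_id)
    return index


def select(index, **facets):
    """Return the documents that match every requested facet value."""
    matches = [index.get(item, set()) for item in facets.items()]
    return set.intersection(*matches) if matches else set()


# Hypothetical sample data, for illustration only.
records = {
    "plan.pdf": {"author": "Alice", "format": "pdf"},
    "notes.doc": {"author": "Alice", "format": "doc"},
    "budget.pdf": {"author": "Bob", "format": "pdf"},
}

index = build_facet_index(records)
alice_pdfs = select(index, author="Alice", format="pdf")
all_pdfs = select(index, format="pdf")
```

Each additional facet narrows the selection by set intersection, which is the basic mechanism behind facet-driven navigation in general.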

After outlining the common problems associated with full-text search, Jeroen went on to characterize AutoFocus' approach as 'metadata-centric for metadata exploration & query formulation' and as providing 'information visualisation for search result exploration'. He continued with a simulation designed to demonstrate these distinctive features of the product, concentrating on AutoFocus' 'Cluster Map' visualization of search results.

Cluster Maps display search results as a sort of Venn diagram: documents containing only search term X or only search term Y are represented by two discrete clusters, while documents containing both X and Y are represented by a third cluster linked to the other two. The AutoFocus interface is user-customizable, so that each cluster can display simply as a single sphere showing the number of documents it contains, or can display each document as its own sphere. Hovering the mouse over a document sphere pops up a description of its location and title. Clicking a cluster displays a result list in the lower half of the window, where any item can be clicked to open it in its native application. In search mode, AutoFocus can also display a list of suggested refinement terms derived from the other keywords indexed for that source.
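
The grouping behind such a display can be sketched in a few lines of Python. This is an illustration of the general idea only, not the algorithm AutoFocus uses: the function name, the naive whitespace tokenization and the sample documents are assumptions made for the example.

```python
from collections import defaultdict


def cluster_by_terms(docs, terms):
    """Group document ids by the exact set of search terms each contains.

    `docs` is a hypothetical dict of document id -> text. Documents that
    match none of the terms are left out of the result.
    """
    clusters = defaultdict(list)
    for doc_id, text in docs.items():
        words = set(text.lower().split())  # naive tokenization, for illustration
        matched = frozenset(t for t in terms if t.lower() in words)
        if matched:
            clusters[matched].append(doc_id)
    return dict(clusters)


# Hypothetical sample documents.
docs = {
    "a.txt": "rdf metadata store",
    "b.txt": "facet search interface",
    "c.txt": "rdf facet browser",
    "d.txt": "unrelated notes",
}

clusters = cluster_by_terms(docs, ["rdf", "facet"])
for matched, ids in clusters.items():
    print(sorted(matched), len(ids), ids)
```

With two search terms this yields at most three clusters (X only, Y only, both), matching the Venn-style picture described above; the size of each list corresponds to the document count shown on a cluster.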

