Automatic subject classification for improving retrieval in Swedish repositories

The recent adoption of the Dewey Decimal Classification (DDC) in Sweden has ignited discussions about automated subject classification, especially for digital collections, which generally seem to lack subject indexing from controlled vocabularies. This is particularly problematic in the context of academic resource retrieval tasks, which require an understanding of discipline-specific terminologies and the narratives behind their internal ontologies. The experimental classification software currently available has not been adequately tested, and its usefulness is unproven, especially for Swedish-language resources. We address these issues by investigating a unifying framework of automatic subject indexing for the DDC, including an analysis of interactive visualisation features suitable for supporting these aims. We will address the disciplinary narratives behind the DDC in selected subject areas; the preliminary results will include an analysis of the data collection and a breakdown of the methodology. Major visualisation possibilities in support of the classification process are also outlined. The project will contribute significantly to Swedish information infrastructure by improving the findability of Swedish research resources through subject searching, one of the most common yet most challenging types of searching.

