David Crystal (Crystal Semantics)
Semantic targeting: past, present and future
This keynote address will look at the evolution of the linguistic approach to content analysis which Crystal has been developing over the past 20 years. It begins with the knowledge management taxonomy used for the Cambridge family of general encyclopedias, and follows its transformation into an Internet taxonomy, with applications in automatic document classification, search engine assistance, e-commerce, online advertising, and Internet security. Recent developments have brought a focus on advertising, a field which has seen ideas develop from simple keyword analysis to contextual advertising and now to semantic targeting. Crystal explores the difference between these notions, and describes current issues in the way semantic targeting is evolving, including ways of handling site sensitivity, sentiment, intention, and cultural localization.

Clifford Lynch (Coalition for Networked Information)
e-Research and new challenges in knowledge structuring
Our second keynote speaker gives a high-level overview of some of the developments in e-research and cyberinfrastructure, with emphasis on opportunities for data curation and data reuse, and with considerable attention to the humanities and social sciences as well as science and engineering. Lynch will also look at developments in "citizen science" and what might be thought of as "citizen humanities" in this context. The talk will conclude with consideration of the changing nature of publishing and authoring, particularly in the scholarly sphere, and the implications of producing structured, reusable, and interchangeable knowledge as part of the processes of scholarship and scholarly communication.

Tom Scott and Michael Smethurst (BBC)
Building coherence at
Think of the BBC as a storytelling organisation; then think of the transition needed from storytelling in the world of linear broadcasting to that of the non-linear, hypertext world of the web. The value in a website lies not in its implicit (meta)data of the domain model but rather in the way the domain model overlaps and intersects with other domains. As ever, the links are more important than the nodes because that's where the context lives: programmes:segment music:track, programmes:segment food:recipe etc. In this way we weave new 'user journeys' into and out of a domain, into and out of From archive episodes no longer available online, to a recipe page, to a chef, to another recipe and back to a recent episode. Using well-targeted, content-specific links we could not only escape the dead-end content silos that characterised but point users back to programmes that would hopefully inform, educate and all that stuff. In building and in this way we have kept everything in its right place; we've built a sane, maintainable, scalable, accessible site that search engines love and that can be easily evolved to add new features and functionality. So to anyone considering how best to build websites, we'd recommend you throw out the Photoshop and embrace Domain Driven Design and the Linked Data approach every time. Even if you never intend to publish RDF, it just works.

Image retrieval
This session will begin with an overview of the state of the art for the image retrieval market in "Still digital images - the hardest things to classify and find", given by Ian Davis of Dow Jones Client Solutions. For those of us who mostly handle retrieval from text, this will bring into focus the added difficulties and rather different needs experienced by the users of images. Davis will probe the strengths and weaknesses of the different approaches through which the challenges can be met.

While image retrieval has traditionally relied on indexing with controlled vocabularies, Chris Town, in his talk "Giving meaning to content through ontology based image retrieval", will argue that such keyword-based multimedia retrieval effectively treats images as "black boxes", since all indexing and search is based on the labels associated with a given image rather than on the image itself. Furthermore, manual image annotation is an expensive process which is prone to problems such as errors, inconsistencies, ambiguity, lack of context, and both over- and under-keywording. But the alternative approach of content-based image retrieval (CBIR) has mostly failed to gain wide adoption. Town will outline why this may be the case, with particular emphasis on the fact that CBIR solutions have not done enough to bridge the "semantic gap" between their system's retrieval model and that of the user. He will demonstrate how ontological query languages have been utilised by Imense Ltd. to provide more effective image retrieval and image analysis solutions.

A third dimension will be added by Elaine Ménard from McGill University in Canada, speaking about "Ordinary image retrieval in a multilingual context: a comparison of two indexing vocabularies". She compares traditional image indexing using a controlled vocabulary with free image indexing using an uncontrolled vocabulary. Her study also compares image retrieval within two retrieval contexts: a monolingual context, where the language of the query is the same as the indexing language, and a multilingual context, where the language of the query is different from the indexing language.

Paul Miller (The Cloud of Data)
Exploiting data in the cloud
Much of the recent attention devoted to Cloud Computing has been concerned with outsourcing of hardware or hosting of applications. Important as these trends are, Miller will argue that the Cloud is capable of far more than simple replication of existing enterprise processes. Amazon's recently announced Public Data Sets programme and the World Wide Web Consortium's (W3C) Linked Open Data community project illustrate the opportunity for re-use of public data, with licensing frameworks evolving to reflect shifting presumptions. Specifications from the Semantic Web are being put to work as enterprises such as Thomson Reuters seek to unlock value in expensively curated internal data. What happens as increasing quantities of data become accessible, as attitudes to control and ownership morph, and as technologies evolve to enhance 'enterprise' applications with insight from beyond the firewall? Where might the balance lie between comprehensiveness and insight on one hand, and security and control on the other? Miller will point us to the future.

