Venue: University College London
Legal Know-How
review by Fran Alexander
Six excellent presentations began with the legal practitioner's viewpoint, then took us through practical knowledge management work, to exciting research in using ontologies, and challenges for the future.
Legal knowledge - the practitioner's viewpoint
Melanie Farquharson, 3Kites Consulting
Melanie was a practising solicitor for 20 years, then worked in legal knowledge organization, and now works for a small consultancy firm. She described herself as a "recovering lawyer", highlighting the very specific training and mindset of lawyers, who speak "legalese" rather than English. She emphasised that despite this, a law firm is essentially a business selling knowledge, even if the lawyers themselves sometimes do not seem to think of it in those terms.
She introduced the afternoon with humour, telling an anecdote about a sketch at a legal revue in which a lawyer had fallen asleep for a hundred years to wake to find that "the computer" managed all the knowledge and all the lawyer had to do was put on a good show to entertain the clients. The story illustrates the point that more and more of the knowledge that is contained within IT systems is crucial to the business, even though the "entertaining the clients" aspect is often what gains the most prominence.
Life and the law are getting more complicated all the time, while at the same time there is more pressure to deliver value and not waste time. Being paid by the hour meant that in the past, lawyers often didn't worry if a particular task took them a long time. As law firms have had to become more businesslike, the concept of knowledge is expanding. Knowledge used to be thought of as research notes and standard template documents. Now knowledge relating to case work includes what is known about the clients, the industry, the situation, and even how to estimate and charge appropriately, so there is a much wider context.
Melanie outlined eight key use cases showing when a lawyer needs knowledge. These included "hard law" questions (such as whether a section is still in force or case law still good), questions about precedent, questions about background or general awareness of issues, background on particular types of matter (cases) - especially important for junior and trainee lawyers - and more general questions about procedures and formalities, who the relevant experts are, how to estimate for charges, as well as questions about particular clients or industries.
She described the frustration of the "Sunday afternoon syndrome" when answers are needed but there is no-one else around to ask, especially when the relevant document or piece of information is known to exist but cannot be found. Traditionally there were a range of places a lawyer would go to look for answers to the different types of question. Hard law questions were resolved by expensive online databases, but these are often not very easy to use. Standards would be kept on intranets or in document banks and databases. Broader knowhow could be found in databases or in bulletins, such as newsletters. Firms would also have practice manuals, perhaps in electronic form on an Intranet. A lawyer would need to know where to look before they could even begin their research.
A particular danger for legal research is misleading classifications. In most search scenarios, if somebody misses something it doesn't matter too much, but for lawyers, missing a single document could be devastating. So, any area where people are uncertain of, or argue over, classifications is particularly worthy of attention. Legal product developers often used difficult and complex classifications that were not user-friendly or helpful. On the other hand, over-clever systems that think they know what you want can be just as dangerous. Clever software analysing meaning is often sold as a panacea by vendors who gloss over the risks inherent in making algorithmic assumptions. "You might also be interested in" suggestions are great fun on sites like Amazon, but assuming a lawyer's interests are based on previous cases is risky, as they may be working on pharmaceuticals one day and the defence industry the next.
Systems that can incorporate different classifications and allow them to be linked are better than those that impose a single viewpoint, as different taxonomies are used for different purposes. Faceted systems allow different views on content, with mapped taxonomies and faceted search offering promising ways forward, but there is still much to be done.
Why lawyers need taxonomies - Adventures in organizing legal knowledge
Kathy Jacob and Lynley Barker, Pinsent Masons LLP
Taxonomies are often the "invisible force" that lawyers don't know they need and don't realise they are using. At Pinsent Masons a very pragmatic view of knowledge management is taken, with decisions made on a committee and consensus basis and embedded into everyday practice.
The firm has a great history of seeking external specialist help. Kate Simpson and Adrian Dale were advisers who helped the firm realise that imposing a bought-in enterprise search solution would not work unless they had good knowledge processes in place to underpin it. Kate led a series of workshops that served both to assess requirements and to gain buy-in and explain the purpose of metadata. Everybody understands findability: "why can't I find my stuff?" is a question everyone has asked at some point.
Following Kate's recommendations, a large information architecture review was undertaken and a matter (case)-centric repository put in place, supported by Autonomy iManage. In addition, an Intranet based on Microsoft Sharepoint now supports dynamic publishing. A firm-wide vocabulary to underpin the corporate taxonomy was developed and implemented. Kate undertook lots of card sorting - 38 workshops with over 200 participants from all levels - to understand how lawyers conceptualise areas of law as well as basic business terms, such as names of departments. However, this was not a one-off process, as consultation is constant and the vocabulary continues to evolve under governance. The taxonomy is very business-focused and works under the surface, so that researchers don't have to see it, with lots of behind the scenes mapping. It is held in SchemaLogic.
Terminology was carefully chosen so that labels would "sing" to lawyers in all jurisdictions - Scottish terminology being included, for example. Much work was done to encourage lawyers to add useful metadata. At first lawyers were just labelling all their documents as "other", so that option was removed from the taxonomy, forcing the lawyers to describe the content of their documents. However, the amount of manually added metadata was kept to a minimum and supplemented by automatically created metadata where possible.
For the Intranet and Enterprise Search Upgrade and Refresh projects, changing working practices (training, explaining, consulting, and communicating) was just as important as the technology. For example, explaining that naming conventions were the key to findability was a good way to gain buy-in. The team were delighted when at a workshop a senior partner declared: "if you don't name your kid's trousers when they go to school don't expect to ever see those trousers again".
Quality control of the Intranet is maintained by having a publishing team to oversee tagging of content, with a focus on high-value content, so personal workspaces could contain less well-tagged documents. Quality was also enhanced by rolling out the system first to specially chosen teams, picked because they were likely to be good adopters and so could provide good training examples.
Taxonomy Management in Practice at Clifford Chance
Mats Bergman, Clifford Chance
When Mats started at Clifford Chance (CC) about 10 years ago, there was no taxonomy management at all. Practice areas did whatever they wanted to do, so there were lots of different databases and different vocabularies. There was, however, senior support to rationalise knowledge management and to improve findability. Knowledge partners from all practice areas were brought in to help develop a suite of "gold standard" taxonomies.
CC decided not to try to impose a single corporate taxonomy, but allowed different taxonomies to remain. These diverse taxonomies were mapped so that users could use their own terms while the overall system was integrated. However, as interdependency of systems increased, updates to the taxonomies needed to be governed and managed. Wider buy-in was helped by the addition of "knowledge contributions" to annual staff appraisals, and good knowledge management practices were embedded in meetings and other routine processes. Allowing people to continue using their familiar terminology helped maintain a balance between the time needed to add tags and the benefits of improved findability of content.
This network of mapped taxonomies, held in a central taxonomy database that supplies values to enterprise search systems and holds mappings between taxonomies, underpins search and is driving organisation of the public facing website. The difference between the internal and external sites is beginning to blur as more content is published on the external facing website.
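The idea of a central store that holds mappings between taxonomies can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the taxonomy names, the terms, the concept IDs and the function names are invented, and nothing here describes the actual Clifford Chance system.

```python
from typing import List, Optional

# Hypothetical data: two practice-area vocabularies, each mapping its own
# local terms onto shared concept IDs held in one central place.
CENTRAL_MAPPINGS = {
    "banking": {"loan agreement": "C001", "security": "C002"},
    "real_estate": {"facility agreement": "C001", "charge": "C002"},
}


def to_concept(taxonomy: str, term: str) -> Optional[str]:
    """Return the shared concept ID for a local term, or None if unmapped."""
    return CENTRAL_MAPPINGS.get(taxonomy, {}).get(term.lower())


def translate(term: str, source: str, target: str) -> List[str]:
    """Find the terms in the target taxonomy that share the source term's concept."""
    concept = to_concept(source, term)
    if concept is None:
        return []
    return sorted(t for t, c in CENTRAL_MAPPINGS[target].items() if c == concept)
```

With this invented data, `translate("Loan Agreement", "banking", "real_estate")` returns `["facility agreement"]`, so each group keeps its familiar label while a search system can work on the shared ID.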
During the project to bring together some 50 siloed databases, Mats learned that you do not need to be a subject matter expert to co-ordinate such work, and in fact it can be politically helpful to be neutral. Negotiation and mediation skills are key, and communication using wireframes and mockups is also vital. The support of an experienced and qualified taxonomist was invaluable. Christine Miskin was employed to provide taxonomy-specific expertise. Finally, senior management support for the project was also necessary, with high-visibility and high-profile projects involving senior staff key to avoiding or mitigating resistance.
The next area Mats is investigating is how to use social tagging for certain types of content to improve findability on wikis and to discover common and emerging themes of interest quickly. He noted the huge instinctive trust people have in Google, even though when CC tried to use it as an internal search system within the firewall, its results were demonstrably not very good.
Textual Information Extraction and Ontologies for Legal Case-Based Reasoning
Dr Adam Zachary Wyner, University of Liverpool
Dr Wyner gained a PhD in linguistics - focusing on the syntax and semantics of adverbs - and then a second PhD in computer science. He uses this combined skillset to investigate ontology-building for legal case-based reasoning and seeks to build interesting bridges between academics and legal professionals. His research is in Artificial Intelligence and law, particularly ways of producing formal representations (knowledge models) of deontic concepts and contracts (to do with duties, obligations and permissions).
A huge amount of understanding of how language works has built up over the past 30 years. This has led to the development of concept extraction techniques, and such concepts can be linked by ontologies, which define the relationships between the concepts.
Currently flat indexes are inflexible as they do not allow further automatic processing based on relationships between terms. If a term is under a particular subheading, the index does not tell you why the indexer has grouped the concepts in that way. This means that computers cannot be programmed to understand those relationships.
String searches are powerful, but strings do not represent concepts (e.g. synonyms are different strings, but mean the same thing). Therefore strings need to be associated with underlying identifiers or thesauruses in order to use them for concept searching.
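The difference between matching strings and matching concepts can be shown with a minimal sketch. The synonym table, the concept identifiers and the function names below are invented for illustration and are not taken from Dr Wyner's tools.

```python
import re

# Hypothetical thesaurus: different strings that point at the same concept.
SYNONYM_TO_CONCEPT = {
    "lawsuit": "LEGAL_ACTION",
    "suit": "LEGAL_ACTION",
    "litigation": "LEGAL_ACTION",
    "tenant": "LESSEE",
    "lessee": "LESSEE",
}


def concepts_in(text):
    """Return the set of concept identifiers whose synonyms occur in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return {SYNONYM_TO_CONCEPT[w] for w in words if w in SYNONYM_TO_CONCEPT}


def concept_search(query, documents):
    """Return documents that mention every concept found in the query."""
    wanted = concepts_in(query)
    if not wanted:
        return []
    return [doc for doc in documents if wanted <= concepts_in(doc)]
```

A plain string search for "lawsuit" would miss a document that says "litigation", whereas here both words resolve to the same identifier before any comparison is made.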
Adam's ontologies are being designed to answer specific problems - how to find relevant precedents and how to determine the applicability of legislation to a particular case. They are not search tools in the conventional sense.
There is lots of intellectual leg-work to do in the form of legal research (time, money, expertise) in order to structure knowledge representations that will be useful, with law students being brought in to help with the work. An open-source approach is being taken, in order to make the tools widely available.
Adam stressed that he is not attempting to build "robot lawyers" (although it did cross my mind that - in reference to Melanie's humorous anecdote - clients might find these rather entertaining!). He also pointed out that he is not attempting to cover all notions or aspects of law or any kind of social factors.
His general architecture for text engineering includes lexicons, annotation rules for basic concepts, and complex protocols for searching those annotations. Using the Protégé ontology editor, classes of relevant entities, their properties, relations, and constraints were defined. These were then tested for consistency to make sure that inferences drawn from them were in fact valid.
Taxonomies and ontologies can be blended, so the two approaches can be used to support each other. The main difference between taxonomies and ontologies is that ontologies support further reasoning, whereas taxonomies leave relationships undefined. However, faceted taxonomies do contain horizontal relationships and so are a mid-point between monohierarchical taxonomies and fully defined ontologies.
A bottom-up approach to knowledge modelling was taken, starting with lists of words of similar meaning gleaned from WordNet and other thesauruses. These were then grouped and described with a cover term (a process essentially the same as traditional taxonomy building). The corpus of work to be analysed was then annotated with the cover concepts, expressed in a semantic web format. The cover concepts were expressed in the ontology, and the corpus was re-searched for the cover concepts or combinations of them.
So, example cover concepts would be the cause of action (e.g. counterfeited goods, piracy), abbreviations for districts or jurisdictions, terms relating to judgements (e.g. grant, deny, overturn, remand, affirm). When such concepts were found they were highlighted and colour-coded according to the cover concept. Of course, this method is not completely accurate (why the robot lawyer isn't quite ready to take over), but can be used as a rapid-highlight supplement to speed reading, helping lawyers with huge volumes of documents to pick out sections that might be relevant.
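The highlighting step can be sketched roughly as below. This is a simplified illustration using regular expressions, not the annotation pipeline used in the research: the concept labels, the bracket markup and the function names are assumptions, and only the example terms are taken from the talk.

```python
import re

# Cover concepts and example terms from the talk; the labels are invented.
COVER_CONCEPTS = {
    "CAUSE_OF_ACTION": ["counterfeited goods", "piracy"],
    "JUDGEMENT": ["grant", "deny", "overturn", "remand", "affirm"],
}


def build_pattern(concepts):
    """Compile one pattern for all terms, plus a term-to-label lookup."""
    lookup = {term: label for label, terms in concepts.items() for term in terms}
    # Longest terms first, so multi-word phrases are tried before single words.
    ordered = sorted(lookup, key=len, reverse=True)
    alternatives = "|".join(re.escape(term) for term in ordered)
    return re.compile(rf"\b({alternatives})\b", re.IGNORECASE), lookup


def annotate(text, concepts=COVER_CONCEPTS):
    """Wrap each matched term in a marker naming its cover concept."""
    pattern, lookup = build_pattern(concepts)

    def mark(match):
        term = match.group(1)
        return f"[{lookup[term.lower()]}: {term}]"

    return pattern.sub(mark, text)
```

Because this sketch matches exact word forms only, "affirm" is marked but "affirmed" is not, which is one small example of why such highlighting supplements reading rather than replacing it.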
The cover concepts are an attempt to solve the problem of how you undertake a conceptual search without having to process lots of synonyms. Adam referred to the cover concepts as "factoroids".
An ontological approach is preferable to an algorithmic machine-learning approach because the ontology is explicit in its rules and reasoning, whereas machine-learning algorithms are often obscure (results come out of a "black box"). So if absurd or unhelpful results are obtained, an ontology can be analysed step by step and re-edited to produce better results, whereas it is very difficult to unpick and improve machine-learning methods.
When building an ontology, you can start with some concept extraction to identify frequent terms, and use those as a basis for identifying key relationships, but the process should be iterative, with your ontology feeding back into your concept extraction to refine it, and the ontology adapted when interesting or relevant concepts or relationships are not being accounted for or the defined relationships are not producing helpful results. In practice, we usually have a "mental ontology" of the subject area, with a good idea of what the key relationships are, so you can start simply by getting that down onto paper (e.g. we know in law that the relationships between allegations, evidence and judgements are what is of interest, not relationships between judgements and what judges had for breakfast - although I find myself wondering if that would make a difference...).
The development of legal ontologies is in its infancy and there is much work to be done over the next decade. Two related conferences are JURIX, in Liverpool, UK, December 16-17, 2010, and ICAIL, Pittsburgh, USA, June 6-10, 2011.
Collaboration across boundaries
Gwenda Sippings, Head of Information, and Gerard Bredenoord, Head of Knowledge, at Linklaters LLP
Linklaters is a large, complex, international, successful, and ambitious law firm. It considers knowledge management to be a key part of the business, placing the Knowledge and Learning Department (K&L) within business services rather than business support.
Regional K&L managers and K&L advisers represent practice groups and these include marketing, finance, and human resources as well as law. Many members of staff have multi-faceted roles and very different competencies in information research. In addition, the firm is international, so as well as differences between civil and criminal law, there are multiple languages to consider, with staff who have very different ways of communicating and very diverse educational backgrounds. Nevertheless, there is a move towards increased self-research rather than mediated research.
There is a centralised physical repository for all Linklaters content, and this helps information management enormously. Information service provision includes the management of internal and external published information. Findability remains the most important aspect of knowledge management, but there is a recognition that the world changes so quickly that you cannot build the perfect system. By the time you have finished, the world has moved on. For example, Gerard was involved in a project to build a big taxonomy/thesaurus to support a search system, but they didn’t deliver it on time, so it was never implemented.
However, as Linklaters is unusual in having only one physical repository, work can focus on providing different lenses onto the content to suit different communities. These include full text search over curated legal knowledge, a specialist search solution for deal-related knowledge, and a single place for the management of information. The team also collect measurement and metrics around usage and utilisation.
For the Intranet, a very lightweight approach was taken, with no single taxonomy for navigation built up front, but a very shallow layer of top-level navigation was introduced and then anyone was allowed to choose their own categories for the pages they published on the Intranet. This helped to foster interest in the use of categories to fuel navigation, and the benefit for the K&L team is that they do not have to predict what content will be developed or how people will want to navigate to it. However, the cost is that all the category headings are manually monitored and managed. The Intranet contains about 1,000 pages so this is possible, but it is likely a more controlled approach will be needed in future as the number of pages increases.
The measure of success for the Intranet is not "stickiness", but how quickly a user can arrive, find what they are looking for, and leave. The less time they spend on the Intranet and the fewer pages they visit, the better. This is the opposite of what is considered good by websites that depend on advertising revenue, where the longer users stay and the more pages they visit, the better. There is a very open approach to managing content on the Intranet, with every page being editable by anybody within the firm, but in practice people do not tend to edit other people’s content.
The specialist search function for deal-related search is taxonomy based, with content comprising factual comparisons in tabular format. The use of structured content with good metadata provided a much richer solution to the particular business retrieval need than the free search that is available across the whole of the firm's content. One particular problem for standard enterprise search systems for Linklaters was that much of their content is structurally extremely similar and the language used very formulaic. This meant that statistically-based language processing failed to identify the handful of words that might differ between one document and another based on the same template, but those differences were key to serving up the results required. The text search system could not place the documents in context or tell what problem the searcher was trying to resolve.
However, from a business process point of view, the less tagging that staff need to do the better, so the minimum amount of intervention to improve search was identified. There were five fairly obvious metadata attributes that improved results dramatically:
(i) practice,
(ii) jurisdiction,
(iii) document type,
(iv) type of knowledge (e.g. precedent),
(v) one or two sentences by the author explaining why they thought the document was important (which seemed to me very much like a traditional cataloguing description).
These key pieces of metadata were sufficient to refine and filter search results for acceptable levels of findability. Significantly for the business, this limited metadata set was easy for staff to understand and quick to add, so cost little staff time at the content production stage, but delivered huge benefits in time saved in searching.
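How a small metadata set can refine results may be easier to see in a minimal sketch. The field names follow the five attributes listed above, but the `Document` class, the sample records and the `filter_results` function are all invented for illustration and do not describe the Linklaters system.

```python
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    practice: str
    jurisdiction: str
    document_type: str
    knowledge_type: str
    author_note: str  # one or two sentences on why the document matters


# Hypothetical records standing in for a set of full-text search results.
DOCS = [
    Document("Share purchase precedent", "Corporate", "England", "Agreement",
             "Precedent", "Standard starting point for private acquisitions."),
    Document("Note on lease renewals", "Real Estate", "Scotland", "Briefing",
             "Know-how", "Summarises recent practice."),
    Document("Loan facility precedent", "Banking", "England", "Agreement",
             "Precedent", "Firm-approved template."),
]


def filter_results(results, **criteria):
    """Keep only documents whose metadata equals every given field value.

    An unknown field name raises AttributeError rather than being ignored.
    """
    return [
        doc for doc in results
        if all(getattr(doc, field) == value for field, value in criteria.items())
    ]
```

For example, `filter_results(DOCS, jurisdiction="England", knowledge_type="Precedent")` narrows the three sample records to the two English precedents, while the author's note remains available to show alongside each hit.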
Reconciling the taxonomy needs of different users
Derek Sturdy, Tikit
Derek described legal knowledge management as comprising three key areas:
(i) research
(ii) document drafting aids
(iii) process maps.
The importance of using multiple linked taxonomies was stressed, as each can deal with a different aspect of the problem - for example, document types are very different to topic lists, what is needed when drafting a document is not the same as what is needed when trying to understand a subject area.
Publishers tend to like to sell their taxonomies as proprietary products and do not co-operate on standards. Some portmanteau publishers have attempted mapping work to produce unified searches of aggregated products, but many have failed to respond to lawyers' needs or convince lawyers of how and why their products are helpful.
This led to lawyers hand-building their own precedent repositories, which seems to be a huge failure of the KM profession and Derek raised the challenge to think about why this is. Are we not providing them with the right services to help them with this part of the work? Is it a communication failure - lawyers don't understand how we do it and what skills we have available?
Most lawyers suffer from the "garage problem": they only want to type a single word into a Google-type search, but then find that what they want is buried under piles of junk. Staff who specialise in research are often much happier using sophisticated tools to help them find what they want more easily.
It is very important to make sure that the metadata that is added is the metadata that is actually useful. This is especially important when considering automatically generated metadata, which is often used in search solutions because it is easy to collect. For example, the date a document was last updated is more useful than when it was first written or first uploaded. The name of the author may be less important than their level of authority - the question is not who is the author but do I trust them, so if I use their document as a source I won't be fired?
Interfaces that show respect to the searcher are also important - so including an "I'll tell you" folksonomic option to allow the content producer to add a tag that is not in the taxonomy, or a "go away" option in case the searcher wants a raw, unrefined search without any taxonomic filtering. These options allow the user to modify the way a search is being carried out and stop them seeing the system as something paternalistic and inflexible. The usage of such options can be identified in metrics and measurements to help spot weaknesses or new trends so that the core taxonomies can be modified.
Derek asked if there is a case for creating a common collaborative legal taxonomy. The problem is getting worse in an information-overloaded world, not better, with more taxonomies being developed all the time. The solution is mapping and the use of core concept IDs, but firms tend to be protective of the intellectual property contained in their taxonomies, and brokering agreement on standards, even as part of government-sponsored initiatives, is notoriously difficult.
Networking
The event was followed by a chance to network over a glass of wine, and with so many questions provoked by the presentations, the opportunity to talk individually to the speakers - a very welcome feature of ISKO UK events - was particularly valuable and fascinating.
The next event will be in spring 2011.


