As the hype around generative AI dies down and focus turns to practical applications, legal tech is finding new ways to limit its potential risks
‘The limits of my language are the limits of my world,’ wrote Ludwig Wittgenstein. The fact that law has its own specialised terminology may go some way to explaining the amount of attention dedicated to generative AI, particularly OpenAI’s ChatGPT and its latest iteration, GPT-4. But as generative AI moves from hot topic to practical application, legal tech is finding new ways to limit its inherent risks.
A report by the Thomson Reuters Institute, ChatGPT and Generative AI within Law Firms, which surveyed 443 firms in the UK, US and Canada in March, found that 82% of respondents believe that ChatGPT and generative AI can be used for legal work. UK firms were the most cautious – 42% thought ChatGPT should be used in legal work, compared with 52% of US and 62% of Canadian law firms. When it came to risk awareness, the division was between roles rather than locations, with 80% of partners and managing partners flagging up potential risks. This is unsurprising, as they are accountable for how the firm manages its own and its clients’ information.
The report suggests that the caution among those in senior roles might be generational, or down to a lack of knowledge about generative AI. But law firm decision-makers may also be more cynical about ChatGPT hype versus adoption: just 9% of firms surveyed in the UK and Canada (3% in the US) are currently using it or planning to use it, 34% are considering it, and 60% have no plans to use it. Other contributory factors might be the serious data breaches that have hit the news in recent weeks, and the accuracy and privacy challenges raised by large language models.
Data challenges
On 24 March OpenAI announced a security incident that allowed a nine-hour window of ‘unintentional visibility’ of user information – in other words, an irreversible public data leak. And ChatGPT’s dangerous habit of falsifying information (for example, by creating non-existent case law to support a legal argument) and asserting factual inaccuracies is routinely described as ‘hallucination’. While it is unlikely that anybody would provide GPT-generated legal documentation or advice without checking it, a combination of AI and human error could potentially lead to serious real-world consequences.
ChatGPT is not just about language and conversation – it can also write code. This gives it great potential to transform processes (by identifying and addressing bottlenecks, for example), but also brings additional risk. Earlier this month Samsung Semiconductor reportedly discovered that when employees used ChatGPT to fix problems with source code, optimise test sequences for identifying faults in chips, and convert internal meeting notes into a presentation, they were sharing the company’s trade secrets with OpenAI. This is because everything entered into the public version of ChatGPT is used to train the system, with no way to opt out or retrieve your data (the API – application programming interface – does offer an opt-out).
However, not all use cases for generative AI in legal involve confidential or client information. Recent weeks have brought announcements from law firms and legal tech vendors integrating generative AI into products and processes, as well as new roles and responsibilities, with more firms appointing heads of AI and machine learning.
Danielle Benecke, founder and director of Baker McKenzie’s machine-learning practice, spoke at LegalWeek in New York. She discussed how generative AI might transform the law firm business model by ‘finding better ways to deliver the most valuable thing [lawyers] sell, which is judgment’. The very existence of Baker McKenzie’s machine-learning practice suggests that the business model is already evolving.
Most discussions about generative AI’s application to legal have focused on collaborative efforts between law firms and legal tech vendors. In March, DLA Piper announced its implementation of Casetext’s CoCounsel, an AI legal assistant built on OpenAI’s GPT-4. According to Casetext’s website, it ‘uses dedicated servers to access GPT-4, meaning your data isn’t sent to “train” the model’, thereby avoiding Samsung-style data leaks.
Legal library
This month saw a significant shift in the legal tech vendor landscape. The merger of legal research companies vLex and Fastcase will create a legal research library of more than 1 billion documents and a global subscriber base. Fastcase CEO Ed Walters described the merger as ‘the beginning of the end of the duopoly in legal research’, referring to LexisNexis and Westlaw. As US legal commentator Bob Ambrogi observed: ‘The merger could also help accelerate global adoption of a common set of data standards for classifying legal work that has been developed by the SALI [Standards Advancement for the Legal Industry] Alliance’.
Thomson Reuters also announced that it was selling a majority stake in its Elite financial and practice management products for law firms to global alternative asset manager TPG, which will leave Elite as a standalone legal tech vendor for the first time since 2003.
Caution from the buy-side
There is less noise from the buy-side – corporate legal departments. However, many general counsel are already managing the legal implications of generative AI, so they are familiar with its opportunities and risks.
Alessandro Galtieri is deputy general counsel at multinational telecoms company Colt Technology Services, which like other telcos is looking into developing GPT-powered customer-facing chatbots. He explains that the data risk is on the customer side. While Colt teams are trained not to input company data into training models, there is always a risk that a customer will inadvertently share sensitive or potentially valuable information.
'If I put a query to five law firms all using [similar] AI models and I get five answers all the same, where is the value-add?'
Alessandro Galtieri, Colt Technology Services
While Galtieri has not worked with law firms using generative AI applications, he has concerns. The first is to ensure that the firm/vendor has an agreement with OpenAI (or another provider) to segregate their data (as per Casetext’s CoCounsel). His second point reflects Benecke’s comment about judgment: ‘If I put a query to five law firms all using [similar] AI models and I get five answers all the same, where is the value-add? Because generative AI models work on probability, they generate content but not insight. But if I ask five law firms a question about customer liability, for example, a firm that knows my business might also offer me advice about insurance, or vendor liability, or something else relevant that was not part of my original question. And that is the value-add.’
Galtieri is also concerned about firms using generative AI for text generation. ‘As a GC, I value succinct advice. The worst law firms are the ones who send 20 pages when I want a yes/no answer, or a steer to help me make a decision. I’m looking for one paragraph, with a clear bottom line. If generative AI provides a cheap and frictionless way to help associates produce more text, which it can then summarise, it doesn’t save time or money because the output needs to be checked, and you can’t check the summary if you haven’t read the original material,’ he explains.
Privacy by design
YCNBot is an open source chatbot project developed by London law firm Travers Smith and US consultancy 273 Ventures, led by Daniel Katz and Michael Bommarito (authors of the study where GPT-4 passed the bar exam), to apply generative AI models to legal services. Travers Smith announced on 14 April: ‘YCN stands for Your Company Name, and enables firms to replace the ChatGPT interface with a chatbot that plugs into Enterprise APIs of Microsoft and OpenAI. Use of the Enterprise APIs results in enhanced controls around compliance, security and data privacy.’
Travers Smith AI manager Sam Lansley explains that he and Shawn Curran, head of legal technology, had started talking to legal tech vendors about generative AI. However, they had the same data privacy concerns as Galtieri: ‘We wanted to make sure that we had an agreement that they wouldn’t store our data or use it to develop their own models.’ When they could not find the user interface they were looking for, they decided to create their own and integrate it with external services via an API, built in collaboration with 273 Ventures. ‘We wanted it to be plug and play, so that it could integrate with ChatGPT or any other generative AI provider. We are now looking at other ways of hiding personal data to protect our users.’ YCNBot is Travers Smith’s third free-to-access open source product. The others are MatMail (for email filing) and Etatonna (a contract labelling tool).
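A ‘plug and play’ design of the kind Lansley describes typically places a thin abstraction layer between the chatbot interface and whichever model provider sits behind it. The sketch below is not YCNBot’s actual code – the class and method names are assumptions made for illustration – but it shows the pattern: the front end only ever talks to a provider interface, so OpenAI, Microsoft or any other service can be swapped in behind it.

```python
from typing import Protocol


class ChatProvider(Protocol):
    """Anything that can turn a prompt into a completion."""

    def complete(self, prompt: str) -> str:
        ...


class OpenAIProvider:
    """Placeholder wrapper for a contracted OpenAI enterprise endpoint."""

    def complete(self, prompt: str) -> str:
        raise NotImplementedError("call the firm's OpenAI endpoint here")


class AzureOpenAIProvider:
    """Placeholder wrapper for a Microsoft Azure OpenAI endpoint."""

    def complete(self, prompt: str) -> str:
        raise NotImplementedError("call the firm's Azure endpoint here")


def answer(prompt: str, provider: ChatProvider) -> str:
    # The chatbot front end depends only on the ChatProvider interface,
    # so changing generative AI supplier does not change the chatbot itself.
    return provider.complete(prompt)
```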
Bommarito explains that the initial release of YCNBot is about providing governance around the use of generative AI in terms of permissioning, audit and control. YCNBot is the equivalent of an email management system for generative AI. And like an email management tool, it also provides an opportunity to catch and reject sensitive information before it is uploaded. ‘If someone copies and pastes something directly into ChatGPT, it’s too late, but YCNBot sits there like a sentry and prevents sensitive information from leaving the firm,’ he says.
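Bommarito’s ‘sentry’ can be pictured as a pre-flight check that runs on every prompt before it leaves the firm. The Python sketch below is illustrative only – the patterns, names and rejection message are assumptions rather than YCNBot’s implementation – but it captures the idea of blocking sensitive text at the gateway instead of relying on users not to paste it into a public chatbot.

```python
import re
from typing import Callable

# Illustrative patterns only; a real deployment would draw on the firm's own
# client-matter lists, classifiers and data loss prevention tooling.
SENSITIVE_PATTERNS = [
    re.compile(r"\b[A-Z]{2}\d{6}[A-D]\b"),   # UK National Insurance number
    re.compile(r"\b\d{16}\b"),               # bare 16-digit card number
    re.compile(r"(?i)\b(privileged|strictly confidential)\b"),
]


def looks_safe(prompt: str) -> bool:
    """Return True if the prompt matches none of the sensitive patterns."""
    return not any(pattern.search(prompt) for pattern in SENSITIVE_PATTERNS)


def handle_prompt(prompt: str, forward: Callable[[str], str]) -> str:
    """Screen a user prompt, forwarding it to the external model only if it passes.

    `forward` stands in for whatever enterprise API call the firm has
    contracted; rejected prompts never leave the firm's systems.
    """
    if not looks_safe(prompt):
        return "Prompt blocked: it appears to contain sensitive information."
    return forward(prompt)
```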
YCNBot can be accessed directly on GitHub, under a licence with no restrictions on use.