AI in production

The Evolution of Legal Tech and AI Agents

Discover the origins of Flank and the shift to legal automation that changed the game. Jake Jones and Richard Mabey explore co-pilots, autopilots, and the human quality of legal services in this deep dive.

Lilian Breidenbach

04 Dec 2024 • 6 min read

This is an in-depth conversation on the automation of legal work between Jake Jones, Co-Founder at Flank, and Juro's Richard Mabey on Brief Encounters. They talk about how it’s done and what it means. Over the course of three parts, Jake and Richard share the product mindsets that enable them to build AI products on the application layer, creating a comprehensive map of the legal tech landscape in the process. They discuss emergent technologies, the challenges of building with rapidly changing foundation models, and how to protect the human quality of legal services.

In this first part of the series, Jake and Richard talk about the origins and early years of Flank, how they pivoted into in-house legal teams, and how legal tech autopilots differ from co-pilots.

Part One: Origins and Early Years – Co-pilots vs. Autopilots

The Origins And Early Years Of Flank

Richard Mabey: With me on Brief Encounters today is Jake Jones, co-founder and CPO at Flank, and I’m excited to talk about what’s happening in legal automation. Tell us a bit about the company you founded.

Jake Jones: You know us from a previous lifetime as LegalOS, which we founded in 2018. Our initial premise was to solve a very particular problem: Legal services and consumers are a long way away from each other, and there’s a bottleneck between them, largely because legal services take a lot of human effort to deliver.

Initially, we sold to law firms, with the intention of empowering them to deliver better, cheaper, and faster legal services to clients. But we learned that law firms don’t necessarily want that, maybe because their incentives don’t align due to the billable hour. This led us to pivot in-house, where the service providers are the in-house legal teams, and the consumers are the commercial teams.

The problem space we then set out to solve was commercial teams having to wait on legal teams. Examples would be the days they have to wait for an NDA, or for negotiation support, or for the completion of an infosec questionnaire. We found this problem space almost entirely blank.

For about a year, we were unable to solve this problem, because the technology wasn’t there. Then ChatGPT-3.5 came along, and with it a whole new way of interfacing with technology. It allowed us to begin to build AI agents, which is what we do today.

Our products are AI agents for legal, compliance, privacy, and infosec teams, who deploy them directly to commercial teams. The agents autonomously resolve requests at the point of origin.

Richard Mabey: It’s an awesome story, and one we’ve been following for a while, because we were founded around the same time and are tackling related challenges. There was so much low-hanging fruit in those early days, because the level of automation was so low. Integrating something with Salesforce or digitizing paper documents was considered mind-blowing. But for you there was a specific pivot moment. When did you realize that what you wanted to do, which hadn’t been possible before, was now going to be possible?

Jake Jones: We spent a lot of 2022 trying to build a product that legal teams could deploy to commercial teams, for them to self-serve some of the real legal thinking. Rather than just self-serving getting an NDA, it would self-serve the answer to a complex negotiation question. Someone’s pushing back on clause 5.4. That someone is a blue chip customer and the deal size is 150k. What should I do in this specific scenario? But the interfaces we were able to build were never adopted by commercial teams, because they don’t like to adopt new products.

Richard Mabey: Was this through your own user interface, or through an integrated system?

Jake Jones: At this time, it was through our own UI that also integrated into Salesforce. But the adoption rate from the end user was so low that eventually the legal teams dropped out. As soon as ChatGPT was released, what we saw was not primarily an amazing technical innovation in what AI can do behind the scenes, but a new way of interacting with software that fits into how service people already interact with legal teams; they’re emailing, slacking, or MS teamsing them.

Now, legal teams can put something between them and this crazy traffic. That was the big shift. As soon as we saw that, we thought: It might not be today, or in the next 12 months, but at some point in the next few years all of this work will be done by AI. Why don’t we just start building and see how far we get?

Richard Mabey: And how did that feel? You had spent multiple years building the first iteration, and this felt like an evolution, but really a pivot in what you were doing. Take us back to that mindset, how it felt to take that risk of building a second company within a company.

Jake Jones: It definitely was an evolution, but in a sense it was two steps back. It was exciting, but it was tough to acknowledge that everything we’d built had to go. There was a new paradigm, and we had to grab it as quickly as possible. But then the excitement took over. In November 2022, the API hadn’t been released yet; it only arrived around March 2023, so we had three or four months where we were building a product without an API. We didn’t even know if an API was coming, and we had to navigate that.

Parsing Legal Tech Categories: Co-Pilots vs. Autopilots

Richard Mabey: It’s interesting to think about categories as well. There was a time when you could neatly put legal tech in categories: e-signature, CLM, matter management. We always struggled with that, because there was an element of self-serve in Juro and we’ve always had native e-signature, which was unusual for a CLM. But it seems there’s a kind of convergence happening. Where, if at all, do you fit into the traditional categories?

Jake Jones: When I think about categories, I’m looking at who we’re solving a problem for and what’s the exact problem we’re solving; that’s the category. It then doesn’t matter what the solution looks like. A category might end up being composed of multiple point solutions. I basically see two major new categories emerging, one of which is already pretty full of incumbents: co-pilots and autopilots.

Co-pilots are tools primarily used by lawyers to do the work they’re already doing, but faster and more efficiently. Maybe they also manage that work, operate as systems of record. For now, I broadly include CLMs in that, although it remains to be seen where they go.

The other category is frontline agents. The problem being solved here is not the lawyer needing support to do their work, but them not wanting to do a proportion of that work and needing to outsource it. Previously, it was maybe outsourced to a deal desk or paralegal, or they allowed themselves to just not review NDAs, for example. Now they can outsource the work to an agent.

Richard Mabey: The difference between co-pilots and autopilots seems to basically come down to the degree to which the task is automated and the level of trust to which the legal function will give autonomy to the business. That’s where the DNAs of Juro and Flank are similar; we’ve always tried to believe that legal will enable folks in the business to self-serve on tasks. Of course those tasks have so far been quite limited, but where does that go? In Flank today you can get markups of documents, you can get answers to really complex questions. What are the limiting factors on what the business can actually self-serve?

Jake Jones: I think it’s a walking question; we don’t know yet. To a degree, it will depend on the appetite of legal teams to automate away work. What is the work they actually want to automate away? From the adoption of CLMs, with the automation of terms based on templates, the automation of NDAs, and everything else that goes through a CLM, and how successful that is, we can clearly see that this kind of work is certainly going to be automated away fully.

The other limiting factor is the intelligence of the large language models. How intelligent do they get? And the big question is: How deterministic can they be? Because their architecture is probabilistic, and lawyers aren’t going to be fans of rolling the dice with every deal they do. So, for the builders, the question is: Can we tame the LLM to be much more deterministic than is in its nature?

In Part Two, Jake and Richard continue their deep dive into the technologies and challenges of automating legal work: they discuss legal-specific vs. open-source LLMs, whether RAG works, the critical importance of early adopters, product orthodoxies vs. realities, and the distinguishing features of AI agents.

Find out more about Flank's AI Agents here.

🎧 Listen to the full conversation on Episode 8 of Juro's Brief Encounters here.