Over this past weekend, Twitter discovered the problem that I have dedicated the past four years of my life to solving. Why don't lawyers and other non-coders use git?
This is an important question. The legal system is the essential bedrock upon which all else in society is built. The protection of civil rights, peaceful conflict resolution, access to due process, enforceable contracts — all these depend on a well functioning legal system.
The professionals who operate this system are equipped with inadequate tooling. Absent git-like version control, they rely on inefficient and error-prone manual “version control” processes.
In many cases, manual version control processes (e.g. git diff, merge, and rebase, but done by hand) take more than an hour of lawyer time per contract version and still result in errors!
Note: this problem is not specific to the law. Finance professionals, academic researchers, and legislators, among many other professionals, encounter similar version control issues. We are currently focused on law because of our domain expertise and because the problem is particularly pronounced in the law.
How do lawyers currently work?
As many have observed, lawyers face the same fundamental set of problems that coders do when collaborating. They have multiple collaborators contributing changes to a project concurrently. They need to receive approval from their colleagues prior to committing their changes. When a colleague commits new changes, they must update their current work to account for them.
Absent a concurrent version control system like git, lawyers rely on a process called "redlining" to solve these problems. A "redline" is the equivalent of a diff in git. It is a document showing the insertions and deletions between two versions of a document.
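To make the analogy concrete, here is a minimal sketch of a redline over plain text using Python's standard `difflib`. The `redline` function and its bracket markup are illustrative assumptions; real redlining products work on formatted documents, not plain strings.

```python
import difflib

def redline(original: str, modified: str) -> str:
    """Return a word-level redline with [-deleted-] and {+inserted+} markers."""
    a, b = original.split(), modified.split()
    out = []
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(a[i1:i2])
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(a[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(b[j1:j2]) + "+}")
    return " ".join(out)

# Hypothetical contract language, for illustration only.
v1 = "The Buyer shall pay the Purchase Price within 30 days"
v2 = "The Buyer shall pay the Purchase Price within 45 business days"
print(redline(v1, v2))
# The Buyer shall pay the Purchase Price within [-30-] {+45 business+} days
```

Comparing at the word level rather than the line level is a deliberate choice here: contract paragraphs are long, so a line-based diff would flag an entire paragraph for a one-word change.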
Redlining in practice
Unlike git, redlining leaves many of these problems unsolved. The following is an example of the basic steps of redlining a stock purchase agreement in the big-law context. Different practices of law might have different variations of the following process, but most follow similar general principles.
In this example, our law firm represents the buyer in an M&A deal. The seller’s counsel just sent the first draft of an agreement, V1. To turn the second draft of the agreement, V2, the firm needs input from specialists on the tax team, the IP team, and the employment team.
Kicking off this process, an M&A partner instructs a junior associate lawyer to circulate V1 to the three specialist teams to receive their redlines. Upon receiving these redlines, the M&A associate should reconcile the changes and submit those changes as the new version on their document management system.
M&A lawyer emails specialist teams asking for their contributions to the agreement.
The specialist teams each modify V1 of the agreement with their respective changes. They send the modified documents to the M&A lawyer.
Note: I’m simplifying the process a bit. The specialist’s team undergoes its own redlining process between the partner and associates on the team before returning a draft to M&A.
M&A lawyer redlines the specialist version to the original via a redlining product. Associate saves that redline to their local file system as well as the "clean" version that was sent by the specialist.
M&A lawyer manually merges the three specialists’ drafts. They do this by opening a copy of V1 of the contract in Microsoft Word on the left side of their screen. On the right side of their screen, they open one of the specialists’ redlines. Change-by-change, they modify the new document to reflect the changes in the redline. Upon adding all of the changes, the associate opens the next specialist redline and repeats the process. They continue this until all changes are integrated into the new document. This document is V2 of the agreement.
Below is a screen recording of an actual lawyer merging specialist changes into an agreement. It took them over an hour to do what our system did in under a minute. This process resulted in three mistakes.
Associate redlines the consolidated V2 against V1, and sends V2 along with the redline to the M&A partner. They also save V2 to their document management system, which serves as the source of truth for committed versions of the contract.
The above process has several issues, the most obvious of which is that it's tremendously inefficient. It can take an associate lawyer several hours to integrate changes into a 100+ page contract such as a stock purchase agreement. Equally important, however, is that it’s error-prone. If two specialists introduce conflicting changes, there is no way for the associate to know without manually reading the redlines side-by-side. It's a huge problem if an associate incorrectly reconciles a merge conflict, especially if they overwrite the changes made by a more senior colleague, since those changes could become permanent legal obligations worth millions of dollars.
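For readers who haven't thought about what "conflicting changes" means mechanically, the following is a minimal sketch of a three-way merge at paragraph granularity. It assumes every version has the same number of paragraphs (in-place edits only), which real contracts rarely satisfy; the function name and sample clauses are hypothetical.

```python
def three_way_merge(base, ours, theirs):
    """Merge two edited copies of `base`, paragraph by paragraph.

    Returns (merged, conflicts), where `conflicts` lists the indices of
    paragraphs that both sides changed in different ways.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place paragraph edits")
    merged, conflicts = [], []
    for i, (b, x, y) in enumerate(zip(base, ours, theirs)):
        if x == y or y == b:
            merged.append(x)      # both sides agree, or only `ours` changed
        elif x == b:
            merged.append(y)      # only `theirs` changed
        else:
            merged.append(b)      # true conflict: keep base text and flag it
            conflicts.append(i)
    return merged, conflicts

base = [
    "Section 1. The Purchase Price is $10,000,000.",
    "Section 2. Seller retains all intellectual property.",
    "Section 3. Closing shall occur within 30 days.",
]
tax = [
    "Section 1. The Purchase Price is $10,000,000, net of taxes.",
    base[1],
    "Section 3. Closing shall occur within 45 days.",
]
ip = [
    base[0],
    "Section 2. Seller assigns all intellectual property to Buyer.",
    "Section 3. Closing shall occur within 60 days.",
]

merged, conflicts = three_way_merge(base, tax, ip)
print(conflicts)  # [2]
```

Sections 1 and 2 merge cleanly because only one team touched each. Section 3 is flagged because both teams changed it differently, which is exactly the case a manual side-by-side read can miss.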
Why don't lawyers use git?
As any coder knows, git is designed to solve exactly the sorts of problems that the lawyer is tasked with above. When I tell coders about Version Story, they often ask me, "why don't lawyers just use git?"
While git is an excellent tool for coding, there are two reasons why it will never see adoption amongst legal professionals.
Docx. Every lawyer uses Microsoft Word. They're trained with it early in their career and regularly take advantage of idiosyncratic formatting features. Every legal contract and precedent created in the past 25 years currently is in docx (or its predecessor doc). The information underpinning the entire US legal system (and every other legal system) is in docx. Docx is as much the standard for legal drafting as JavaScript is for client-side web programming.
Git works with plain text files, not docx files. While it is technically possible for lawyers to draft in a git-compatible markdown language, the legal industry departing from the docx standard is about as likely as web browsers discontinuing support for JavaScript. It's not going to happen. (Note: you should be deeply skeptical of any startup that claims they intend to replace Microsoft Word or docx in a lawyer's workflow).
UI/UX. I started coding when I was 14, have a four-year degree in computer science, and have worked professionally as a software engineer for over 7 years. It wasn't until I was several years into my career that I achieved mastery with git. It has a notoriously steep learning curve and is a complete non-starter for non-technical professionals.
Lawyers have high standards. A simple and intuitive user interface is essential for gaining widespread adoption within the industry.
Why hasn't anyone done this yet?
Solving this problem requires an understanding of both git's capabilities and the intricacies of the legal drafting and redlining process. The set of people capable of solving it is limited to those who have worked professionally as lawyers and are also proficient coders.
My co-founder Kevin is such a person. Prior to founding Version Story, he worked both as an M&A lawyer at Simpson Thacher and as a software engineer at LinkedIn. I can confidently say I would not be able to build a product fit to the needs of lawyers without his first-hand experience guiding our design.
Over the years, several startups have attempted to solve this problem. None of them have had domain experience in the legal field. Because of that, they are unaware of the two constraints necessary to break into legal workflows — docx and UX. To sidestep the immense technical challenges of handling docx, they build in-browser text editors thinking they can get lawyers to draft in their application instead of in Microsoft Word. In their UX, they use terminology like “pull”, “push”, “rebase”, and “HEAD” forgetting how inscrutable this terminology is to the uninitiated.
Any product that seeks to bring the functionality of git to a mass audience must handle docx and mustn’t be afraid to rethink the user experience from the ground up.
Our solution
Building a solution to this problem was hard. The technical problems of comparing and rendering docx files and the UX problem of making git-like functionality intuitive required a substantial amount of creativity, determination, and patience on both fronts. Beyond that, establishing credibility in the legal industry such that top law firms could entrust us with their confidential documents presented its own set of challenges. Thankfully we had the benefit of ignorance prior to embarking on these schleps.
Technical problems
When we started Version Story, we intended to leverage third-party APIs for our document processing while focusing our energy on solving the UX challenges of reimagining git. Docx has been the standard file format for word processing since 2007, so surely there would be reliable APIs for comparing docx files and converting them to HTML and PDF.
We were wrong and, believe me, we tried not to be. While APIs exist which purport to do what we need them to, they are nowhere near adequate for legal use cases. They would work fine in demos with simple test documents, but as soon as an actual lawyer attempted to compare a 500+ page prospectus with complex tables, custom formatting, and embedded images, they would fail. Not even Microsoft Word's internal compare tool is adequate for such cases. We had to build our own custom comparison engine.
Comparing plain text is a solved problem. Since 1986, Myers Diff has been the standard algorithm for text comparison. There are plenty of helpful online resources for learning the algorithm.
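For reference, here is a compact sketch of the greedy Myers algorithm in Python. It is a textbook-style rendering for illustration (quadratic memory for the trace, no linear-space refinement), not the code of any particular tool; the function names are my own.

```python
def myers_diff(a, b):
    """Return a shortest edit script turning sequence a into sequence b."""
    n, m = len(a), len(b)
    v = {1: 0}      # v[k] = furthest x reached on diagonal k = x - y
    trace = []      # snapshot of v before each round, used for backtracking
    for d in range(n + m + 1):
        trace.append(dict(v))
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]          # step down: an insertion from b
            else:
                x = v[k - 1] + 1      # step right: a deletion from a
            y = x - k
            while x < n and y < m and a[x] == b[y]:
                x, y = x + 1, y + 1   # follow the diagonal of matches
            v[k] = x
            if x >= n and y >= m:
                return _backtrack(a, b, trace)
    raise AssertionError("unreachable: d = n + m always suffices")

def _backtrack(a, b, trace):
    """Walk the recorded rounds backwards to recover the edit operations."""
    x, y = len(a), len(b)
    ops = []
    for d in range(len(trace) - 1, -1, -1):
        v = trace[d]
        k = x - y
        if k == -d or (k != d and v[k - 1] < v[k + 1]):
            prev_k = k + 1
        else:
            prev_k = k - 1
        prev_x = v[prev_k]
        prev_y = prev_x - prev_k
        while x > prev_x and y > prev_y:
            x, y = x - 1, y - 1
            ops.append(("equal", a[x]))
        if d > 0:
            if x == prev_x:
                ops.append(("insert", b[prev_y]))
            else:
                ops.append(("delete", a[prev_x]))
        x, y = prev_x, prev_y
    ops.reverse()
    return ops

old = "the seller shall deliver the shares".split()
new = "the seller shall promptly deliver the shares".split()
for kind, token in myers_diff(old, new):
    print(kind, token)
```

The algorithm works on any sequence of comparable items, so the same function handles characters, words, or lines. That generality is also its limit: it knows nothing about tables, formatting, or nesting.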
Comparing documents is not a solved problem. Docx is an open file format defined by the Office Open XML (OpenXML) standard, which is built on XML. Under the hood, a docx file is a zipped set of XML documents. OpenXML is similar enough to HTML to be legible to those who have written HTML, but it has far fewer SDKs for interfacing with it. If you imagine how you would represent things like formatting, tables, links, etc. with HTML, that's essentially how it's represented in OpenXML.
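To see the "zipped set of XML documents" structure first-hand, the sketch below builds a toy archive in memory and reads it back with only the Python standard library. The archive is an assumption made for self-containment: it holds just `word/document.xml` and omits the other parts a file needs before Word will open it. The helper names are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the main document part.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

DOCUMENT_XML = f"""<?xml version="1.0" encoding="UTF-8"?>
<w:document xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:r><w:rPr><w:b/></w:rPr><w:t>Section 1.</w:t></w:r>
      <w:r><w:t xml:space="preserve"> Purchase Price</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""

def build_toy_docx() -> bytes:
    """Zip a single document part; NOT a complete, Word-openable docx."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", DOCUMENT_XML)
    return buffer.getvalue()

def read_runs(docx_bytes: bytes):
    """Return (text, is_bold) for each run in the main document part."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    runs = []
    for run in root.iter(f"{{{W}}}r"):
        text = "".join(t.text or "" for t in run.iter(f"{{{W}}}t"))
        # Simplification: treats any <w:b> as bold and ignores w:val="false".
        bold = run.find(f"{{{W}}}rPr/{{{W}}}b") is not None
        runs.append((text, bold))
    return runs

print(read_runs(build_toy_docx()))
```

Even in this tiny example the text of one sentence is split across two runs because of a formatting change, which hints at why comparing documents is harder than comparing strings.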
A document comparison system requires:
Decomposing the document into well-designed data structures.
Comparing text that can be arbitrarily nested in any set of different containers such as tables, text boxes, and shapes.
Defining an edit script format that can express both textual changes and structural changes (modifying formatting, changing table dimensions, etc).
Using that edit script to rebuild the document while pulling in the correct formatting from both the original and modified documents.
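As a rough illustration of the edit-script idea in the list above, here is a toy sketch in which a document is a flat list of runs and the script mixes textual and formatting operations. Every type and field name is hypothetical; this is not Version Story's format, and a real one would also need to handle nesting, tables, and much more.

```python
from dataclasses import dataclass, replace
from typing import List, Union

@dataclass(frozen=True)
class Run:
    text: str
    bold: bool = False

@dataclass(frozen=True)
class Keep:        # copy a run from the original unchanged
    index: int

@dataclass(frozen=True)
class Delete:      # drop a run from the original (shown struck out in a redline)
    index: int

@dataclass(frozen=True)
class Insert:      # add a run that exists only in the modified document
    run: Run

@dataclass(frozen=True)
class Reformat:    # structural change: same text, different formatting
    index: int
    bold: bool

EditOp = Union[Keep, Delete, Insert, Reformat]

def apply_script(original: List[Run], script: List[EditOp]) -> List[Run]:
    """Rebuild the modified document from the original plus an edit script."""
    result = []
    for op in script:
        if isinstance(op, Keep):
            result.append(original[op.index])
        elif isinstance(op, Insert):
            result.append(op.run)
        elif isinstance(op, Reformat):
            result.append(replace(original[op.index], bold=op.bold))
        # A Delete contributes nothing to the rebuilt document.
    return result

original = [Run("Seller"), Run(" shall"), Run(" use best efforts to"), Run(" deliver")]
script = [Reformat(0, bold=True), Keep(1), Delete(2), Insert(Run(" promptly")), Keep(3)]
rebuilt = apply_script(original, script)
print("".join(run.text for run in rebuilt))  # Seller shall promptly deliver
```

Keeping formatting changes as their own operation, rather than encoding them as a delete plus an insert, lets a redline show "this text was bolded" without marking the text itself as changed.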
Additionally, several elements of this system required novel algorithm designs. Unlike comparing plain text, there is no standard algorithm for comparing tables. How do we know which rows from the original document match with which rows of the modified? The same goes for columns. Furthermore, how can we determine if a row or column has been modified or if it's been wholly inserted or deleted? How do we know if we're even comparing the correct tables against each other in the first place?
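To give a feel for the row-matching question, the following is a naive baseline, not the novel algorithm described above. It aligns rows in order by maximizing total text similarity and only pairs rows whose similarity clears a threshold. The function names, the 0.6 threshold, and the sample table are all assumptions.

```python
import difflib

def row_similarity(row_a, row_b) -> float:
    """Similarity in [0, 1] between two rows, each a list of cell strings."""
    return difflib.SequenceMatcher(
        a="\t".join(row_a), b="\t".join(row_b), autojunk=False
    ).ratio()

def match_rows(old, new, threshold=0.6):
    """Align rows in order. Returns (old_index, new_index) pairs, where
    None on one side marks a deleted or inserted row."""
    n, m = len(old), len(new)
    sim = [[row_similarity(old[i], new[j]) for j in range(m)] for i in range(n)]
    # best[i][j] = highest total similarity when aligning old[i:] with new[j:]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            skip = max(best[i + 1][j], best[i][j + 1])
            pair = best[i + 1][j + 1] + sim[i][j] if sim[i][j] >= threshold else -1.0
            best[i][j] = max(skip, pair)
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if sim[i][j] >= threshold and best[i][j] == best[i + 1][j + 1] + sim[i][j]:
            pairs.append((i, j))
            i, j = i + 1, j + 1
        elif best[i][j] == best[i + 1][j]:
            pairs.append((i, None))      # row deleted from the original
            i += 1
        else:
            pairs.append((None, j))      # row inserted in the modified table
            j += 1
    pairs.extend((k, None) for k in range(i, n))
    pairs.extend((None, k) for k in range(j, m))
    return pairs

old_table = [["Item", "Price"], ["Shares", "$10,000,000"], ["Escrow", "$500,000"]]
new_table = [["Item", "Price"], ["Shares", "$12,000,000"],
             ["Holdback", "$250,000"], ["Escrow", "$500,000"]]
print(match_rows(old_table, new_table))
# [(0, 0), (1, 1), (None, 2), (2, 3)]
```

A matched pair whose rows differ (the "Shares" row here) is a modified row, while an unmatched row is an insertion or deletion. This baseline breaks down quickly in practice: it assumes rows never reorder, ignores columns entirely, and depends on a hand-picked threshold.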
Beyond the elegant world of theory and algorithms, we had to work through the long tail of edge cases. Docx is an enormous file standard with countless features you’ve never heard of. After building the initial version of our system, it took debugging hundreds of edge cases to get it to a point where it’s consistently reliable. Since the files we compare are generated outside of our system, it was impossible to anticipate all of the edge cases in advance. The only way to handle them is to work through them as they arise in production.
Finally, supporting such a system presents some thorny system-level problems. Comparing a 1000+ page document (not uncommon in the legal industry) can require 30+ minutes of execution time and 20+ GB of memory. Building robust support for these documents while not causing timeouts in the various parts of the system awaiting the completion of these operations was challenging. In hindsight, however, this has been an advantage for us because documents like these crash incumbent redline products, which can only run on Windows laptops.
The product
Replacing git’s technical language with a visual one was our critical unlock to making git-like version control accessible to non-coders. Intuitive visual actions allow users to perform the core operations of git without requiring them to memorize any confusing commands or vocabulary.
The following illustrates the stock purchase agreement process described above within the context of Version Story:
M&A lawyer adds V1 of the agreement to Version Story. The version is represented by a tile on the canvas.
M&A lawyer invites specialist colleagues to Version Story.
Specialist lawyers upload their changes. They generate a redline by drawing a line from V1 to their changes.
M&A lawyer merges these changes into a consolidated redline. They download a track changes version of this redline and resolve conflicts in Microsoft Word.
Our product does not yet comprehensively demonstrate the full functionality of git. We still have work to do to accomplish our full vision! What we’ve achieved, however, is a critical step in the direction of moving non-coders to a git-like version control system. We have validated our core hypothesis of representing git functionality with a visual canvas and now have lawyers performing git-like functions they never had access to before.
Does it work?
The following is a set of unsolicited comments from our users:
Why does this all matter?
Legal documents aren't just text on a page – they're the operating system of society. When lawyers draft and revise contracts, regulations, and legislation, they're writing the rules that govern how our world works. The stakes couldn't be higher:
A missed change in a merger agreement could cost clients millions of dollars
An overlooked revision in a regulatory filing could expose companies to legal liability
Conflicting changes in contract terms could lead to years of litigation
The inability to trace how legal language evolved can hamper future interpretation
In a 2022 article of Money Stuff titled “Brightline redline”, Matt Levine illustrates an example of these stakes, describing how Certares and Knighthead found themselves in litigation with Morgan Stanley and Brightline over a $25,000,000 credit agreement where the two sides couldn't even agree on which version was signed. "If there is a dispute about what the contract says," Levine explains, "you have to go back through the email chain to pick out which attachment everyone thought they were signing off on." The dispute centered on critical language about loan prepayments worth millions of dollars – language that one side claims was never agreed to.
The next generation of legal collaboration will not be plagued by issues like these. It will be faster, better, safer, and easier with a concurrent version control system like Version Story.
What comes next
We've kept our heads down the past four years, relentlessly building a solution to this problem and credibility in the legal industry. We now have a differentiated product that users love and that we sell to top legal teams like those of Dentons, Addleshaw Goddard, Mishcon de Reya, Barings, and Pantera Capital.
Over the coming months, we will launch to the public. In the meantime if you’d like to try Version Story for yourself, feel free to email me and I’ll add you to our whitelist! jordan@versionstory.com