By June Hsiao Liebert, Firmwide Director of Library and Research Services at Sidley Austin LLP
New artificial intelligence (AI) and data analytics products are flooding the legal information space, and they claim to do everything from predicting the outcome of a case to writing briefs. What do you really know about these products and how they work? How do you separate the valuable products from the junk? I am the global director of library and research services at a large law firm and a former law school CIO, and my team and I are constantly finding and evaluating legal information tools that can improve the work that our firm does.
Many of the newest information tools employ a mysterious algorithm that magically spits out results. Users are expected to trust that the vendor is providing results that are reliable, accurate and unbiased. Blindly trusting a third party, however, is a risky move for any law firm.
The vendors may not be willing to release details about the algorithms they are using, but you can evaluate the data that goes into the algorithms to begin with. I am also a former database programmer, so the concept of “garbage in, garbage out” is ingrained in me. We need to understand what is going into these magical algorithms in order to evaluate what is coming out.
What should you be asking your providers about the information or data they are using to create their products? Here are my top six questions:
1. What is the source of your data/information?
If a vendor is not willing to reveal the source of the data/information they are using, then you should be suspicious. It is important to know whether the source is reliable and likely to be around for a while. When it comes to legal information, there are a limited number of reliable sources available, which also means access to them may be expensive. If a small company has to purchase such information, the question is whether its business model will be sustainable over time. If the company is “creating” or gathering its own information, is it being done in a scalable and cost-effective manner?
2. How frequently is the data/information updated and what is the lag time?
Even if the data/information source is of high quality, it may be unreliable if it is not updated frequently. You need to know the frequency to make sure the product will meet your needs. You should also know what the lag time is, because the data/information available to users may be weeks behind, even if it is updated frequently.
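To make the difference between update frequency and lag time concrete, here is a minimal sketch in Python. The function names, the dates, and the seven-day tolerance are all hypothetical, chosen only for illustration. It assumes you can find the date of the newest record a product actually exposes to users.

```python
from datetime import date


def lag_in_days(newest_record_date: date, checked_on: date) -> int:
    """Days between the newest record visible to users and the day we checked."""
    return (checked_on - newest_record_date).days


def meets_requirement(newest_record_date: date, checked_on: date, max_lag_days: int) -> bool:
    """True if the observed lag is within the tolerance we set for ourselves."""
    return lag_in_days(newest_record_date, checked_on) <= max_lag_days


# Hypothetical scenario: the product is refreshed daily, but the newest
# decision it contains is dated three weeks before the day we checked.
newest = date(2024, 3, 1)
checked = date(2024, 3, 22)

print(lag_in_days(newest, checked))                        # 21
print(meets_requirement(newest, checked, max_lag_days=7))  # False
```

A product can be updated every day and still fail a check like this, which is why both questions need to be asked.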
3. Is the data being normalized or cleaned?
Normalizing data ensures that you are actually comparing apples to apples. For example, if data from one source uses a different unit of measure than another, then the results will be flawed, no matter how advanced the algorithm is. Competent vendors can tell you exactly how they clean their data and what the limitations are.
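As a rough illustration of the unit-of-measure problem, the sketch below assumes two hypothetical sources that report the same award, one in dollars and one in thousands of dollars. The field names, unit labels, and figures are invented for the example and do not describe any vendor's actual process.

```python
# Hypothetical unit labels and their multipliers to whole dollars.
UNIT_MULTIPLIERS = {"usd": 1, "usd_thousands": 1_000}


def normalize_award(amount: int, unit: str) -> int:
    """Convert an award amount to whole dollars, rejecting unknown units."""
    if unit not in UNIT_MULTIPLIERS:
        raise ValueError(f"unknown unit: {unit!r}")
    return amount * UNIT_MULTIPLIERS[unit]


# The same award as reported by two invented sources.
records = [
    {"source": "A", "amount": 250_000, "unit": "usd"},
    {"source": "B", "amount": 250, "unit": "usd_thousands"},
]

normalized = [normalize_award(r["amount"], r["unit"]) for r in records]
print(normalized)  # [250000, 250000]
```

Without the conversion, source B's award would look a thousand times smaller than source A's. Rejecting unrecognized units outright, rather than guessing, is one way a vendor can document the limits of its cleaning.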
4. How, and how often, are the quality and accuracy of the data/information being checked?
What procedures does the vendor follow to ensure that the data/information they are using is correct on an ongoing basis? It is not enough to use a reliable source. Something as simple as a small glitch in the connection during download can cause errors. Even well-established vendors are not immune to errors. In March 2016, for example, Thomson Reuters (TR) announced an error in their PDF conversion process for scanning cases, which inadvertently left out text. Although the problem affected only a small percentage of cases, it took TR more than a year to discover the error and resolve the problem.
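One common safeguard against the download glitch mentioned above is to compare a checksum of the received file against one published by the source. The sketch below uses Python's standard `hashlib` module. It assumes the source publishes a SHA-256 digest, which is an assumption on my part, and the sample text and helper names are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def matches_published_digest(data: bytes, published_hex: str) -> bool:
    """True if the received data hashes to the digest the source published."""
    return sha256_hex(data) == published_hex.lower()


original = b"Full text of the opinion."
published = sha256_hex(original)  # stand-in for a digest the source might publish
truncated = original[:-5]         # simulates a download that dropped some text

print(matches_published_digest(original, published))   # True
print(matches_published_digest(truncated, published))  # False
```

A check like this only catches corruption in transit. It would not catch an error introduced upstream, such as a faulty conversion process, which is why ongoing content-level checks still matter.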
5. Is your usage data being stored and how is it being used?
Statistics about how your organization uses or consumes external information are now a valuable commodity. Almost all vendors are tracking and storing your usage data, but how is that data being used? You should ask the vendor how they are using this data and whether they are sharing it with anyone else. Ideally, these types of privacy and security questions should be addressed in contract negotiations.
6. How does the vendor decide what data or information to include and what expertise do they have in the subject area?
Some of the newer legal information tools were created by technology specialists, who may or may not understand how a lawyer would actually use their products. In order to avoid the “garbage in” problem, it is important that the vendor has (or is working with) people who are experts in the field. They are more likely to make good decisions about what data or information to include.
These six questions are already in use by legal research analysts (also known as law librarians) to evaluate resources on a daily basis. It is not surprising that the same questions apply when evaluating new information tools that utilize AI or provide data analytics. Legal research analysts work with the underlying data and information sources on a regular basis and have a strong sense of the organization’s information needs. Familiarity with the data underlying the charts, graphs, bells, and whistles is key to critically assessing a new information tool.
If you ask your information vendors the six questions above and involve your law firm’s research analysts in the process, you can avoid wasting valuable time on tools that are unlikely to meet your needs.