The problem
It’s clear that businesses of all shapes and sizes are turning to LLMs to increase productivity. The use cases vary, and the variety of problems solved by LLMs today represents just a fraction of the future opportunities for the technology.
Yet most solutions today remain confined to “chatbots” over unstructured data. While these can provide value, our conversations with business users, IT professionals and others across the enterprise have made one thing clear: enabling natural-language questions against structured data would be the massive unlock for meaningful productivity gains.
Structured data is foundational to any enterprise, from operations to sales and marketing. However, extracting insight and value from this data in today’s paradigm requires multiple departments to contribute to its retrieval, processing, and interpretation.
For example, take dashboard creation. With an existing solution, a company or department must first engage a vendor. The vendor then onboards the client, defines the project, configures the solution, builds out the infrastructure and software, constructs a data pipeline, replicates the dashboard’s data into a data lake, builds data models, constructs the dashboards and turns them over for use. This multi-step process can take nine months to complete. And at the end of that nine-month build, the dashboards still require significant effort to change.
LLMs promise to vastly shorten this process and make it dynamic by translating a question into code and returning the answer, ideally as a chart. Yet without additional technology components and tools built into a ready-made application, such as our own Conversational Finance solution powered by our hila platform, getting an LLM to operate properly comes with its own difficulties, such as:
Overcoming the challenges
Let’s focus specifically on one of the predominant use cases inside an enterprise: text to SQL.
The most performant models and papers on Yale’s Spider leaderboard reach a credible 91.2 percent accuracy, but that remains too low for an enterprise. Indeed, the next closest techniques, from the Alibaba Group, achieved only 86.6 percent accuracy even using a combination of models and chain-of-thought reasoning.
For an enterprise to trust a system, it needs to be at 99 percent accuracy or greater. We have achieved this through a combination of techniques. First, we pre-process the question to verify that it can be answered by the available dataset. We then cross-reference the query against a vast knowledge base that is both domain-specific (such as finance knowledge for an ERP system) and company-specific (such as which months constitute a particular company’s fiscal year). Finally, after the LLM returns the SQL, we post-process the response to ensure the SQL actually answers the initial question. These guardrails around any model, public, private or fine-tuned, raise its SQL generation accuracy above 99 percent.
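To make the post-processing step concrete, here is a minimal sketch of one kind of guardrail: checking that generated SQL at least parses and only references tables and columns that exist in the target schema, by asking the database engine for a query plan without executing anything. The schema, table names and the `validate_sql` helper are illustrative assumptions, not our actual implementation.

```python
import sqlite3

# Hypothetical example schema standing in for a customer's warehouse.
SCHEMA = """
CREATE TABLE revenue (
    fiscal_month TEXT,   -- e.g. '2024-03'
    region       TEXT,
    amount       REAL
);
"""

def validate_sql(query: str, schema_sql: str = SCHEMA) -> bool:
    """Return True if the query compiles against the schema.

    EXPLAIN QUERY PLAN makes SQLite parse and plan the statement
    without running it, so syntax errors and references to unknown
    tables or columns are caught before the SQL ever executes.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)
        conn.execute(f"EXPLAIN QUERY PLAN {query}")
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

# Valid query against a known table passes the check.
print(validate_sql("SELECT SUM(amount) FROM revenue WHERE region = 'EMEA'"))
# A hallucinated table name is rejected before execution.
print(validate_sql("SELECT total FROM sales"))
```

A real post-processor would go further, for example confirming that the selected columns actually correspond to the entities mentioned in the user’s question, but a plan-only compile check is a cheap first gate.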
The key to our process is a robust set of knowledge and key SQL statements that guide the LLM in making the correct calls. This knowledge is based on both domain-specific understanding and customer-specific data. Both of these contribute to the synthetic data generation that constitutes the backbone of our system.
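The guiding-knowledge idea can be sketched as a retrieval step: before prompting the model, look up stored question/SQL pairs most similar to the incoming question and supply them as examples. The knowledge base below and the word-overlap scoring are simplified assumptions for illustration; a production system would use embedding-based retrieval over a much larger corpus.

```python
# Hypothetical knowledge base of curated question -> SQL pairs.
KNOWLEDGE_BASE = [
    {"question": "total revenue by region last quarter",
     "sql": "SELECT region, SUM(amount) FROM revenue "
            "WHERE quarter = 'Q4' GROUP BY region"},
    {"question": "which months constitute the fiscal year",
     "sql": "SELECT fiscal_month FROM calendar WHERE fiscal_year = ?"},
]

def retrieve_examples(question: str, k: int = 1) -> list[dict]:
    """Rank stored examples by word overlap with the incoming question
    and return the top k to include as few-shot prompt examples."""
    words = set(question.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda ex: len(words & set(ex["question"].split())),
        reverse=True,
    )
    return scored[:k]

# The revenue example is the closest match for a revenue question.
print(retrieve_examples("show revenue by region")[0]["question"])
```

Pairing each retrieved example with customer-specific facts (fiscal calendars, naming conventions) is also what makes the synthetic training data representative of real usage.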
In this way, Conversational Finance, our primary application for text to SQL, addresses the concerns head-on:
In addition, hila can monitor both LLMs and traditional models at scale, including models serving inference volumes in the billions per day.
As we are already working with several enterprises in these areas, we can apply the knowledge sets and common questions gained in previous engagements to future ones. This enables us to go from onboarding to production use in 90 days.
Interested in learning more about how we are helping enterprises tap into structured data in real-time?
Get in touch here.