Text2SQL in the Enterprise Context.

Oct 21, 2024

Why Text2SQL is not an easy AI task and what we provide.

Introduction

Over the past year, using Natural Language to interact with business data has become a central theme in Artificial Intelligence. One of the emerging technologies in this area is Text2SQL, which translates Natural Language requests into SQL queries that are ready to be executed on enterprise databases. This capability presents a tremendous opportunity to simplify data access, enabling even less experienced users to quickly obtain information without having to write SQL code manually.

In a modern enterprise, where data is distributed across multiple systems and often organized in complex structures, tools like Text2SQL can significantly improve operational efficiency.

However, Text2SQL alone is not sufficient to address all Data Management challenges. It often needs to be integrated with other solutions, such as automated report generation, document retrieval, and the use of LLMs (Large Language Models) to produce synthesized responses based on multiple sources.

In this article, we will explore how a modern, flexible approach that combines advanced AI techniques with more traditional methods can offer a comprehensive system for managing data-driven requests in a business context.

What Tools Can We Provide?

To understand and manage complexity, we often rely on simplifications. One of the most common is to assume that any natural language request can be resolved by a good LLM and an optimized prompt. However, the reality of Generative AI-based applications is more complex and requires various components, some powered by AI and others more “traditional.”

Let’s take a concrete example: many requests cannot be handled by a Text2SQL model alone. This tool generates an SQL query to extract data from a database, but much more is needed to provide a complete answer.

A typical workflow may include the following steps:

The user makes a request in Natural Language.
The request is converted into an SQL query and executed.
The resulting data is displayed and stored in the conversation history.
The user may make additional requests, such as: “Create a report based on the extracted data.”
Relevant documents are retrieved from the knowledge base using RAG (Retrieval-Augmented Generation) techniques.
Finally, all relevant data (relational and document-based) is passed to an LLM, which generates a synthesized response based on an appropriate prompt.
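
The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the architecture's actual implementation: the Text2SQL model and the document retrieval are replaced by hypothetical stubs (`text2sql_stub`, `retrieve_documents_stub`), the `sales` table is invented sample data in an in-memory SQLite database, and the final LLM call is omitted, so the sketch stops at the prompt that would be sent to it.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Turn:
    """One entry in the conversation history."""
    request: str
    sql: Optional[str] = None
    rows: List[Tuple] = field(default_factory=list)


def text2sql_stub(request: str) -> str:
    # Hypothetical placeholder: a real system would call a Text2SQL model here.
    return "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"


def retrieve_documents_stub(request: str) -> List[str]:
    # Hypothetical placeholder for the RAG retrieval step.
    return ["Q3 sales policy: EMEA targets were raised in July."]


def build_prompt(request: str, history: List[Turn], documents: List[str]) -> str:
    """Combine relational data from the history with retrieved documents."""
    lines = ["Answer the request using only the data and documents below.", ""]
    for turn in history:
        if turn.sql is not None:
            lines.append(f"SQL: {turn.sql}")
            lines.append(f"Rows: {turn.rows}")
    lines.append("")
    lines.extend(f"Document: {doc}" for doc in documents)
    lines.append("")
    lines.append(f"Request: {request}")
    return "\n".join(lines)


def run_demo() -> Tuple[List[Turn], str]:
    # Invented sample data, used only to make the sketch runnable.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("EMEA", 100.0), ("EMEA", 50.0), ("APAC", 70.0)],
    )
    history: List[Turn] = []

    # First three steps: request -> SQL -> execute -> store in the history.
    request = "Show total sales by region"
    sql = text2sql_stub(request)
    rows = conn.execute(sql).fetchall()
    history.append(Turn(request, sql, rows))

    # Remaining steps: follow-up request -> retrieve documents -> build the prompt.
    follow_up = "Create a report based on the extracted data."
    documents = retrieve_documents_stub(follow_up)
    prompt = build_prompt(follow_up, history, documents)
    history.append(Turn(follow_up))

    conn.close()
    return history, prompt
```

Calling `run_demo()` returns the conversation history and the prompt text. Keeping the SQL and the rows together in each `Turn` is one possible design choice: it lets the synthesis step show the LLM both the data and the query that produced it.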

The Necessary Functionality

To build a complete solution, we need several tools. Some examples:

Intelligent Routing: We need to distinguish between requests that require only SQL to retrieve data and those that need a more complex response based on all available information in the chat history.
Flexible interaction history management: It is necessary to store requests, retrieved data, and generated responses in a structured way so they can be used in the future for synthesis by an LLM.
Dynamic schema selection: When using Text2SQL, we need to efficiently manage the database schema. In an enterprise context, the schema can include hundreds of tables and thousands of fields. We cannot send the entire schema to the model but must intelligently select only the relevant portions.
SQL generation: Accurate SQL generation requires powerful models specialized in the Text2SQL task, together with careful prompt management, including examples that improve the quality of the generated SQL.
SQL cache: To avoid repeatedly executing the same queries, an SQL cache can reduce the time and resources required.
Monitoring capabilities for the entire solution: For example, through integration with an APM (Application Performance Management) tool.
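
To make three of these items more concrete, here is a minimal Python sketch of routing, schema selection, and an SQL cache. Everything in it is an assumption made for illustration: the function and class names are hypothetical, the router uses a simple keyword check where a real system would more likely use a classifier or an LLM, the schema selector ranks tables by plain word overlap rather than embeddings, and the cache never invalidates its entries.

```python
import re
from typing import Callable, Dict, List, Set, Tuple


def route(request: str) -> str:
    """Return 'synthesis' for report-style requests, otherwise 'sql'."""
    # Hypothetical keyword list; a production router would be more robust.
    keywords = ("report", "summarize", "summary", "explain")
    text = request.lower()
    return "synthesis" if any(k in text for k in keywords) else "sql"


def _tokens(text: str) -> Set[str]:
    # Split on anything that is not a letter, so "customer_id" -> {"customer", "id"}.
    return set(re.findall(r"[a-z]+", text.lower()))


def select_tables(request: str, schema: Dict[str, List[str]], top_k: int = 2) -> List[str]:
    """Rank tables by how many request words match table or column names."""
    words = _tokens(request)
    scored = []
    for table, columns in schema.items():
        names = _tokens(table + " " + " ".join(columns))
        score = len(words & names)
        if score > 0:
            scored.append((score, table))
    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [table for _, table in scored[:top_k]]


class SqlCache:
    """Cache query results keyed on lightly normalized SQL text."""

    def __init__(self, execute: Callable[[str], List[Tuple]]):
        self._execute = execute
        self._results: Dict[str, List[Tuple]] = {}
        self.hits = 0

    @staticmethod
    def _key(sql: str) -> str:
        # Collapse whitespace and drop a trailing semicolon. Case is preserved
        # because string literals are case-sensitive. This is simplistic:
        # literals that differ only in internal spacing would share a key.
        return " ".join(sql.split()).rstrip(";").strip()

    def run(self, sql: str) -> List[Tuple]:
        key = self._key(sql)
        if key in self._results:
            self.hits += 1
        else:
            self._results[key] = self._execute(sql)
        return self._results[key]
```

In this sketch, only the tables returned by `select_tables` would be included in the Text2SQL prompt, and `SqlCache.run` would wrap whatever function actually executes the query. A real cache would also need an expiry or invalidation policy, since the underlying data changes over time.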

A Reference Architecture

The following figure represents a Reference Architecture, showing the main components needed to create a complete system.

Reference Architecture (Image by the author.)

Our Contribution

The EMEA AI Specialist team, of which I am a member, has the mission of supporting clients and partners in building their solutions using the AI Services offered by the OCI platform.

Each project may have different requirements, but our goal is to provide a toolbox that, when properly configured, covers all the functionalities described, while taking into account the specific needs of the target market.

As noted above, Text2SQL is not a completely solved AI task. To help our customers, we are actively working on it, and you will see updates soon. Stay tuned!