Aura SQL RAG pipeline

Description of the SQL RAG pipeline

Introduction

ATRIA currently integrates a single RAG pipeline, which converts a natural-language request into an SQL query.

Steps in the SQL RAG chain

The SQL RAG chain consists of several stages, which are described and shown schematically below.


Figure 15. SQL RAG chain

1. Injection checking

  • Detects anomalies in the user’s query that may affect the resolution process.
  • Currently, a set of heuristic checks is applied:
    • Detection of overly long questions.
    • Detection of suspicious substrings in the query.
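
The two heuristics above can be sketched as follows. This is a minimal illustration, not the pipeline's actual implementation: the length limit, the list of suspicious substrings, and the names `check_injection` and `InjectionCheckResult` are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical values: the real threshold and substring list are not documented here.
MAX_QUESTION_CHARS = 500
SUSPICIOUS_SUBSTRINGS = ("drop table", "delete from", ";--", "ignore previous instructions")


@dataclass
class InjectionCheckResult:
    ok: bool            # True when no heuristic fired
    reasons: list       # human-readable description of each failed check


def check_injection(question: str) -> InjectionCheckResult:
    """Run simple heuristic checks on the user's question."""
    reasons = []

    # Heuristic 1: overly long questions.
    if len(question) > MAX_QUESTION_CHARS:
        reasons.append("question too long")

    # Heuristic 2: suspicious substrings (case-insensitive match).
    lowered = question.lower()
    for pattern in SUSPICIOUS_SUBSTRINGS:
        if pattern in lowered:
            reasons.append(f"suspicious substring: {pattern!r}")

    return InjectionCheckResult(ok=not reasons, reasons=reasons)
```

A question that fails any check would be rejected before it reaches the later stages; one that passes continues to table retrieval.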

2. Question translation (currently deactivated)

  • Optional step that translates the user’s query into English.
  • This step is not currently activated.

3. Candidate table retrieval

  • The system searches for candidate tables relevant to the user’s query. This is currently done with a hybrid search that combines lexical search and semantic (embedding-based) search.
  • Retrieval is currently based on the similarity between the user’s query and each table’s high-level description.
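
As a rough illustration of hybrid retrieval over table descriptions, the sketch below mixes a lexical score with a semantic score using a weighted sum. Everything here is an assumption made for the example: the weighted-sum fusion and the `alpha` parameter, the function names, and in particular `toy_embed`, which uses character-trigram counts only as a stand-in for a real embedding model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_score(query, description):
    """Fraction of distinct query tokens that also appear in the description."""
    q, d = set(tokenize(query)), set(tokenize(description))
    return len(q & d) / len(q) if q else 0.0


def toy_embed(text):
    """Stand-in for an embedding model: character-trigram counts."""
    s = " ".join(tokenize(text))
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def hybrid_search(query, table_descriptions, alpha=0.5):
    """Rank tables by alpha * lexical + (1 - alpha) * semantic, best first."""
    q_vec = toy_embed(query)
    scored = []
    for name, description in table_descriptions.items():
        score = (alpha * lexical_score(query, description)
                 + (1 - alpha) * cosine(q_vec, toy_embed(description)))
        scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical table descriptions, for illustration only.
tables = {
    "orders": "Customer orders with order date, total amount and status",
    "employees": "Employee records with names, departments and salaries",
    "products": "Product catalog with prices and stock levels",
}
ranked = hybrid_search("What is the total amount of orders per customer?", tables)
```

Because both scores lie between 0 and 1, `alpha` directly controls how much weight exact word matches get relative to the semantic similarity.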

4. SQL query generation

  • The top two retrieved tables are selected.
  • Using these tables, the user’s request is converted from natural language to an SQL query.
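
The selection and generation step might look like the sketch below. It assumes that generation works by prompting a language model with the definitions of the selected tables; the source does not describe the prompt or the model call, so the prompt wording, the function names, and the example schemas are hypothetical, and the model call itself is left out.

```python
TOP_K = 2  # the pipeline keeps the two best-ranked tables


def select_top_tables(ranked, k=TOP_K):
    """Return the names of the k highest-scoring tables.

    `ranked` is a list of (table_name, score) pairs in any order.
    """
    best = sorted(ranked, key=lambda pair: pair[1], reverse=True)[:k]
    return [name for name, _ in best]


def build_sql_prompt(question, selected_tables, schemas):
    """Assemble a text-to-SQL prompt restricted to the selected tables."""
    schema_text = "\n\n".join(schemas[name] for name in selected_tables)
    return (
        "Given the following table definitions, "
        "write one SQL query that answers the question.\n\n"
        f"{schema_text}\n\n"
        f"Question: {question}\n"
        "SQL:"
    )


# Hypothetical retrieval scores and schemas, for illustration only.
ranked = [("orders", 0.8), ("products", 0.1), ("customers", 0.6)]
schemas = {
    "orders": "CREATE TABLE orders (id INT, customer_id INT, total DECIMAL);",
    "customers": "CREATE TABLE customers (id INT, name TEXT);",
    "products": "CREATE TABLE products (id INT, price DECIMAL);",
}

top_tables = select_top_tables(ranked)
prompt = build_sql_prompt("What is the total spent by each customer?", top_tables, schemas)
```

Restricting the prompt to the selected tables keeps it short and prevents the model from referencing tables that retrieval judged irrelevant.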