
RAG

1: General RAG
2: Aura SQL RAG pipeline

RAG capability

Overview of the RAG capability, the benefits derived from its use and the current predefined RAG chains in ATRIA

Introduction to RAG technology

RAG (Retrieval Augmented Generation) is a technique for augmenting LLM knowledge with additional data. It optimizes the output of an LLM with targeted, up-to-date information without retraining the model, so the answers are grounded in specific and recent data.

The process includes three differentiated parts:

Retrieval: the system searches a knowledge base and extracts relevant information using information retrieval techniques, such as vector representations (embeddings), to find the text blocks that contain the information needed to resolve the input request.
Augmentation: the RAG model augments the user input (the prompt) by adding the relevant retrieved data. This step uses prompt engineering techniques to communicate effectively with the LLM.
Generation: the enriched prompt is sent to an LLM, which generates a response for the user that is grounded in the retrieved data.
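
The three parts can be illustrated with a minimal, self-contained Python sketch. It is not ATRIA's implementation: the knowledge base contents, the function names (`embed`, `retrieve`, `augment`, `generate`) and the bag-of-words "embedding" are assumptions made for illustration, and the LLM call is replaced by a placeholder.

```python
import math
import re
from collections import Counter

# Toy knowledge base (assumption); a real system would hold chunks of ingested documents.
KNOWLEDGE_BASE = [
    "Invoices are issued on the first day of each month.",
    "Roaming can be activated from the mobile app settings.",
    "The router must be restarted after a firmware update.",
]

def embed(text: str) -> Counter:
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Retrieval: return the k text blocks most similar to the query."""
    q = embed(query)
    ranked = sorted(KNOWLEDGE_BASE, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]

def augment(query: str, passages: list[str]) -> str:
    """Augmentation: add the retrieved text blocks to the user's prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

def generate(prompt: str) -> str:
    """Generation: placeholder for the LLM call (no model is invoked here)."""
    return f"[LLM placeholder, prompt of {len(prompt)} characters]"

question = "How do I activate roaming?"
print(generate(augment(question, retrieve(question, k=1))))
```

Updating the answers only requires changing `KNOWLEDGE_BASE`; the placeholder model itself is never retrained, which is the property the technique relies on.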

Figure 12. RAG technology

Application of RAG in ATRIA

As explained before, the LLM/LMM Experiences Builder enables the generation of LLM chains that integrate different AI technologies.

Within this capability, complex flows based on the RAG technology can be integrated.

Example case

Imagine that our platform, ATRIA, operates like a restaurant with different chefs, each specialized in a unique approach to meeting customers' needs.

A RAG model can be compared to Chef Sara, a chef who combines her traditional culinary experience with the real-time consultation of resources to enhance her recipes with the latest culinary trends worldwide, as she likes to be continuously up-to-date.

When a customer requests a nutritious and hearty meal, Sara goes beyond her own knowledge, based on already learnt techniques and recipes. Instead, she consults innovative cuisine resources: Indian cookbooks and her recent notes on advanced molecular cooking techniques. These external sources allow her to innovate and propose a unique dish: a curry foam, light and airy, with an intense spice flavor and a touch of coconut milk.

In technical terms, the RAG approach combines:
a. Generation based on prior knowledge (the internal model): equivalent to Sara's knowledge of cooking.
b. Real-time retrieval of external information: consulting cookbooks and notes represents how a RAG system looks up information in databases or dynamic sources during the response process.

This integration allows the model to provide more contextualized responses, tailored to specific needs, especially when the stored knowledge is limited or insufficient.

Currently, ATRIA incorporates the following RAG chains:

General RAG: Complex AI-driven flow for resolving generic question experiences based on FAQs
SQL RAG: RAG-based pipeline for resolving SQL queries

In upcoming versions, constructors will be able to design their own LLM chains based on RAG.

Benefits from the use of RAG technologies

Updated and targeted information: RAG allows developers to provide the latest data to the generative models, targeted to the specific use case.
Cost-effective implementation: Data in the knowledge repository can be continually updated without incurring significant costs.
Enhanced user trust: The data sources contributing to the RAG’s vector database are identifiable. This transparency allows inaccuracies in the knowledge base to be corrected or removed, which improves users’ confidence.
Improved developer control: With RAG, developers can test and improve their applications more efficiently, control and change the LLM’s information sources to adapt to changing requirements, restrict the retrieval of sensitive information according to authorization levels and ensure the LLM generates appropriate responses.

1 - General RAG

General RAG capability

Overview of the General RAG capability, encompassing the underlying technology, its application in ATRIA and the benefits derived from its use

Application in ATRIA: General RAG

ATRIA enables the creation of generic question experiences (use cases) that resolve users' requests expressed in natural language, based on FAQs, by supporting complex calls to AI models.
This is done by integrating a predefined RAG (Retrieval Augmented Generation) chain while guaranteeing security and privacy in interactions.

Figure 13. General RAG in ATRIA

The predefined RAG chain in ATRIA is called General RAG. It includes additional steps that go beyond basic Retrieval Augmented Generation by optimizing the input prompt and generating more accurate responses. See details in section Functional overview.

In upcoming versions, constructors will be able to design their own LLM chains based on RAG.

Interaction with ATRIA General RAG capability

This service is accessible via API, enabling its consumption both from Aura Platform and any external application.

Current available models

The AI-driven models currently integrated into ATRIA are included here.

Functional overview of General RAG

The use of the General RAG capability encompasses three different stages:

Data ingestion: This stage includes uploading the knowledge bases used for lexical search (keywords) and semantic search (embeddings).
The underlying processes are described in the document Import documents into ATRIA, together with tips for data curation, a process recommended before uploading the documents.
RAG chain: When a request enters ATRIA, the General RAG capability executes the predefined steps in its chain, which are described in the following figure.
Aura answer: The generated response is sent to the user.

Figure 14. General RAG stages

Zooming in on the General RAG pipeline, the chain includes the following steps:

Figure 18. General RAG chain

Security: the request is analyzed to improve security and prevent prompt injection.
Multi-language: The multi-language feature allows users to receive responses in their own language. The system automatically detects the language in the user’s request in the multi-language step of the RAG pipeline, and this language is afterwards used in the response generation stage to provide the response back to the user.
Conversation history: If there is information from previous interactions, it is analyzed to check whether it is relevant to the current query. If so, the query is rewritten using this context information.
Retrieval: Lexical and semantic retrieval from the databases returns text blocks with key information to compose the response.
Post-filtering: The retrieved text blocks are compared with the user query to determine whether they are relevant to answering the question.
Response generation: The relevant fragments are reordered and used to compose an augmented prompt, which is resolved by an LLM.
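
The order of the steps above can be sketched as a simple sequence of functions that pass a shared state along. This is only an illustration under stated assumptions: `ChainState`, `run_chain`, the step function names, the toy knowledge base and every rule inside the steps (the blocked phrase, the language detector, the pronoun test, the word-overlap retrieval) are hypothetical stand-ins, not ATRIA's actual logic, and the LLM call is a placeholder.

```python
import re
from dataclasses import dataclass, field

# Toy knowledge base (assumption); a real deployment would query ingested documents.
KNOWLEDGE_BASE = [
    "Invoices are issued on the first day of each month.",
    "Roaming can be activated from the mobile app settings.",
]

@dataclass
class ChainState:
    query: str
    history: list = field(default_factory=list)
    language: str = "en"
    passages: list = field(default_factory=list)
    answer: str = ""

def tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def security(state: ChainState) -> ChainState:
    # Hypothetical rule: block one well-known injection phrase.
    if "ignore previous instructions" in state.query.lower():
        raise ValueError("request rejected: possible prompt injection")
    return state

def multi_language(state: ChainState) -> ChainState:
    # Stand-in detector: a leading inverted question mark is read as Spanish.
    state.language = "es" if state.query.startswith("¿") else "en"
    return state

def conversation_history(state: ChainState) -> ChainState:
    # Stand-in relevance test: a pronoun suggests the query depends on the previous turn.
    if state.history and tokens(state.query) & {"it", "that", "they"}:
        state.query = f"{state.history[-1]} {state.query}"
    return state

def retrieval(state: ChainState) -> ChainState:
    # Lexical-only toy retrieval: rank text blocks by word overlap with the query.
    q = tokens(state.query)
    ranked = sorted(KNOWLEDGE_BASE, key=lambda p: len(q & tokens(p)), reverse=True)
    state.passages = ranked[:2]
    return state

def post_filtering(state: ChainState) -> ChainState:
    # Drop retrieved blocks that share no word with the query.
    q = tokens(state.query)
    state.passages = [p for p in state.passages if q & tokens(p)]
    return state

def response_generation(state: ChainState) -> ChainState:
    if not state.passages:
        state.answer = "No relevant information was found."
        return state
    context = "\n".join(state.passages)
    prompt = f"Answer in '{state.language}'.\nContext:\n{context}\nQuestion: {state.query}"
    state.answer = f"[LLM placeholder, prompt of {len(prompt)} characters]"
    return state

STEPS = [security, multi_language, conversation_history,
         retrieval, post_filtering, response_generation]

def run_chain(state: ChainState) -> ChainState:
    for step in STEPS:
        state = step(state)
    return state

result = run_chain(ChainState(query="How do I activate it?",
                              history=["Tell me about roaming."]))
print(result.passages)
```

Keeping each step as a function with the same signature makes it easy to see where a request can be stopped early (security) and where the query or the candidate text blocks are modified before the prompt is built.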

Benefits from the use of ATRIA General RAG

The General RAG predefined chain brings all the advantages of RAG technologies to the resolution of use cases, specifically generic question use cases based on FAQs.
Moreover, the General RAG capability integrates extra features that lead to more accurate responses:
- Features to avoid prompt injection
- Conversation history
- Filtering steps
Because it relies on Retrieval Augmented Generation techniques, the capability works with updated information every time an up-to-date knowledge base is uploaded into the system.

Generative feedback functionality

When testing how Generative AI/RAG capabilities work with the ATRIA web interface aura-manager, it is possible to use the feedback functionality to record the user’s satisfaction with the quality and appropriateness of the answer generated for their request. This is done by clicking the thumbs-up or thumbs-down icons.

Do you need a more detailed explanation on how Generative feedback capability works?

Access the document Generative feedback functional description
Access the document Use ATRIA web interface (aura-manager) to discover how to utilize this functionality.

2 - Aura SQL RAG pipeline

Aura SQL RAG pipeline

Description of the SQL RAG pipeline

Introduction

ATRIA currently integrates one RAG pipeline for the conversion of a request from natural language to an SQL query.

Steps in the SQL RAG chain

The use of the SQL RAG chain encompasses different stages, which are explained and schematically represented below.

Figure 15. SQL RAG chain

1. Injection checking

This step detects anomalies in the user’s query that may affect the resolution process.
Currently, a set of heuristic checks is performed:
- Detects overly long questions.
- Detects suspicious substrings in the query.
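
A minimal sketch of the two heuristics listed above follows. The length limit, the substring list and the function name `check_injection` are assumptions chosen for illustration; the values actually used by ATRIA are not stated in this document.

```python
# Hypothetical thresholds and patterns; the real values are not documented here.
MAX_QUESTION_LENGTH = 500
SUSPICIOUS_SUBSTRINGS = ("ignore previous instructions", "drop table", "--")

def check_injection(question: str) -> list[str]:
    """Return a list of findings; an empty list means no anomaly was detected."""
    findings = []
    # Heuristic 1: overly long questions.
    if len(question) > MAX_QUESTION_LENGTH:
        findings.append("question too long")
    # Heuristic 2: suspicious substrings, matched case-insensitively.
    lowered = question.lower()
    for pattern in SUSPICIOUS_SUBSTRINGS:
        if pattern in lowered:
            findings.append(f"suspicious substring: {pattern!r}")
    return findings

print(check_injection("How many orders were placed in May?"))
print(check_injection("List users; DROP TABLE users"))
```

Returning the findings instead of a single boolean lets the caller decide whether to reject the request or only log it.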

2. Question translation (currently deactivated)

Optional step for the translation of the user’s query into English.
Currently, it is not activated.

3. Candidate table retrieval

The system searches for the candidate tables that are relevant to the user’s query. This is currently done using a hybrid search that combines lexical search and semantic search (embeddings).
The table retrieval is currently based on the similarity between the user’s query and each table’s high-level description.
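
The idea of a hybrid search over table descriptions can be sketched as follows. Everything here is an assumption made for illustration: the table names and descriptions, the `alpha` weight, the Jaccard word overlap used as the lexical score, and the character-trigram cosine similarity used as a stand-in for a real embedding model.

```python
import math
import re
from collections import Counter

# Hypothetical tables with their high-level descriptions.
TABLES = {
    "orders": "Orders placed by customers, with order date, status and total amount.",
    "customers": "Customer master data: name, email, country and signup date.",
    "products": "Product catalogue with name, category and unit price.",
}

def words(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())

def lexical_score(query: str, description: str) -> float:
    """Lexical side: Jaccard overlap between the two sets of words."""
    q, d = set(words(query)), set(words(description))
    return len(q & d) / len(q | d) if q | d else 0.0

def trigram_vector(text: str) -> Counter:
    """Stand-in for an embedding: counts of character trigrams."""
    normalized = " ".join(words(text))
    return Counter(normalized[i:i + 3] for i in range(len(normalized) - 2))

def semantic_score(query: str, description: str) -> float:
    """Semantic side (approximated): cosine similarity of the trigram vectors."""
    a, b = trigram_vector(query), trigram_vector(description)
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_tables(query: str, alpha: float = 0.5, k: int = 2) -> list[str]:
    """Combine both scores with weight alpha and return the k best table names."""
    def hybrid(name: str) -> float:
        description = TABLES[name]
        return alpha * lexical_score(query, description) + (1 - alpha) * semantic_score(query, description)
    return sorted(TABLES, key=hybrid, reverse=True)[:k]

print(rank_tables("total amount of orders by status"))
```

A weighted sum is only one way to merge the two rankings; rank-based fusion is a common alternative when the two scores are not on comparable scales.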

4. SQL query generation

The top-2 results (tables) are selected.
Using these tables, the user’s request is converted from natural language to an SQL query.
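
As an illustration of this step, the sketch below keeps the two best-ranked tables and builds the prompt that would be sent to an LLM to obtain the SQL query. The schemas, the prompt wording and the function name `build_sql_prompt` are hypothetical, and the model call itself is left out.

```python
# Hypothetical table schemas; real schemas would come from the ingested metadata.
SCHEMAS = {
    "orders": "orders(id INTEGER, customer_id INTEGER, order_date DATE, total NUMERIC)",
    "customers": "customers(id INTEGER, name TEXT, country TEXT)",
    "products": "products(id INTEGER, name TEXT, unit_price NUMERIC)",
}

def build_sql_prompt(question: str, ranked_tables: list[str],
                     schemas: dict[str, str], top_k: int = 2) -> str:
    """Keep the top_k ranked tables and compose a natural-language-to-SQL prompt."""
    selected = ranked_tables[:top_k]
    schema_text = "\n".join(schemas[name] for name in selected)
    return (
        "Translate the question into a single SQL query.\n"
        "Use only these tables:\n"
        f"{schema_text}\n\n"
        f"Question: {question}\nSQL:"
    )

# The ranking is assumed to come from the candidate table retrieval step.
prompt = build_sql_prompt("Total spent per country",
                          ["orders", "customers", "products"], SCHEMAS)
print(prompt)
```

Restricting the prompt to the selected tables keeps it short and reduces the chance that the model references a table that is unrelated to the request.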