ATRIA General RAG operational workflow

This page describes the ATRIA technical operational flow for the RAG capability, specifically for the predefined chain named General RAG.

Flow diagram

Calls to the atria-rag-server component (AtriaRAG in the sequence diagram) execute the predefined RAG chain General RAG.


@startuml
title RAG API diagram
participant Application
participant Kernel #1add4d
participant AuraGatewayApi #76bbe7
participant AtriaModelGateway #f58e11
participant AtriaRAG #f5de11
participant AzureOpenAI #9476e7

Application -> Kernel: Create two-legged token with scope aura-aiservices:messaging:write
Note right of Kernel: this token needs refreshing
Kernel -> Application: Response two-legged token
Application -> Kernel: Request to aura-aiservices/generative/prompts with token and correlatorId (x-correlator)
Kernel -> AuraGatewayApi: Request to aiservices/generative/prompts with token-info header and correlatorId
AuraGatewayApi -> AuraGatewayApi: Validate request
AuraGatewayApi -> AuraGatewayApi: Generate prompt
AuraGatewayApi -> AtriaModelGateway: Send prompt to atria-model-gateway
activate AtriaModelGateway
AtriaModelGateway -> AtriaRAG: 1.0: Enrich request 
activate AtriaRAG
AtriaRAG -> AtriaRAG: securityStg

opt translateStg.enabled == true
    AtriaRAG -> AtriaRAG: 1.1: Translate user query 
    AtriaRAG -> AtriaModelGateway: Send request to LLM 
    AtriaModelGateway -> AzureOpenAI: Send request to ChatCompletion endpoint
    AzureOpenAI --> AtriaModelGateway: Response from AzureOpenAI
    AtriaModelGateway --> AtriaRAG: LLM response with translated query
end
opt cleanStg.enabled == true
    AtriaRAG -> AtriaRAG: 1.2: Clean the user query 
    AtriaRAG -> AtriaModelGateway: Send request to LLM 
    AtriaModelGateway -> AzureOpenAI: Send request to ChatCompletion endpoint
    AzureOpenAI --> AtriaModelGateway: Response from AzureOpenAI
    AtriaModelGateway --> AtriaRAG: LLM response with new cleaned query

end
opt contextStg.enable == true
    alt Ask LLM
        AtriaRAG -> AtriaModelGateway: 1.3: Request LLM to validate the conversational context
        AtriaModelGateway -> AzureOpenAI: Send request to ChatCompletion endpoint
        AzureOpenAI --> AtriaModelGateway: Response from AzureOpenAI
        AtriaModelGateway --> AtriaRAG: LLM response [SAME CONTEXT] or [DIFFERENT CONTEXT]
        AtriaRAG -> AtriaRAG: Recreate Query
    end
    alt Recreate Query 
        AtriaRAG -> AtriaModelGateway: 1.4: Call LLM to generate new question 
        AtriaModelGateway -> AzureOpenAI: Send request to ChatCompletion endpoint
        AzureOpenAI --> AtriaModelGateway: Response from AzureOpenAI
        AtriaModelGateway --> AtriaRAG: Response with new question
    end
end

AtriaRAG -> AtriaRAG: retrievalStg

opt postFilteringStg.enable == true
    AtriaRAG -> AtriaRAG: Post Filtering 
    note right: Batch request
    AtriaRAG -> AtriaModelGateway: 1.5: Request LLM for each chunk 
    AtriaModelGateway -> AzureOpenAI: Send request to ChatCompletion endpoint
    AzureOpenAI --> AtriaModelGateway: Response from AzureOpenAI
    AtriaModelGateway --> AtriaRAG: LLM response [RELEVANT] or [IGNORABLE]
end

AtriaRAG -> AtriaModelGateway: 1.6: Request LLM generativeStg 
AtriaModelGateway -> AzureOpenAI: Send request to ChatCompletion endpoint
AzureOpenAI --> AtriaModelGateway: Response from AzureOpenAI 
AtriaModelGateway --> AtriaRAG: LLM response
AtriaRAG --> AtriaModelGateway: 2: Final response 
deactivate AtriaRAG
deactivate AtriaModelGateway

AtriaModelGateway -> AuraGatewayApi: Response Model Gateway
AuraGatewayApi -> AuraGatewayApi: Process atria-model-gateway response
AuraGatewayApi -> AuraGatewayApi: Generate response
AuraGatewayApi -> Kernel: Response 200 and message with session_id
Kernel -> Application: Response 200 and message with session_id

@enduml
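
The stage ordering in the diagram can be summarized with a short sketch. This is only an illustration of how the optional stages gate the chain, not the atria-rag-server implementation: the stage names (securityStg, translateStg, cleanStg, contextStg, retrievalStg, postFilteringStg, generativeStg) come from the diagram, while ChainConfig, its flag names, and planned_stages are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class ChainConfig:
    # Hypothetical flags mirroring the "opt" blocks in the diagram.
    translate_enabled: bool = False
    clean_enabled: bool = False
    context_enabled: bool = False
    post_filtering_enabled: bool = False


def planned_stages(cfg: ChainConfig) -> list[str]:
    """Return the stage names in the order the diagram shows them."""
    # securityStg always runs first.
    stages = ["securityStg"]
    # Optional query-preparation stages (steps 1.1 to 1.4 in the diagram).
    if cfg.translate_enabled:
        stages.append("translateStg")
    if cfg.clean_enabled:
        stages.append("cleanStg")
    if cfg.context_enabled:
        stages.append("contextStg")
    # Retrieval always runs.
    stages.append("retrievalStg")
    # Optional per-chunk filtering ([RELEVANT] or [IGNORABLE], step 1.5).
    if cfg.post_filtering_enabled:
        stages.append("postFilteringStg")
    # generativeStg always runs last (step 1.6).
    stages.append("generativeStg")
    return stages


if __name__ == "__main__":
    print(planned_stages(ChainConfig(translate_enabled=True)))
```

With every optional stage disabled, the sketch yields only securityStg, retrievalStg, and generativeStg, which matches the unconditional steps in the diagram. Each enabled optional stage adds at least one round trip through AtriaModelGateway to AzureOpenAI.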