Technical guidelines

1: Build e2e experiences

1.1: Build experiences that call Generative AI
1.2: Build experiences that use General RAG
1.3: Build experiences that call NLP apps
1.4: Build experiences that call Semantic Search

2: Publish an API in Kernel
3: Configure an application
4: Hot swapping of applications
5: Agents server

5.1: Guidelines for technical developers

5.1.1: Define and configure agent
5.1.2: Create a Docker image
5.1.3: Deployment of an agent
5.1.4: Errors management

5.2: Guidelines for use cases constructors

5.2.1: Create and configure an agent
5.2.2: Create and configure an agent base

6: ATRIA configuration

6.1: ATRIA components default configuration
6.2: Create and configure a preset
6.3: Import documents into ATRIA
6.4: Create and configure an agent

7: Get Kernel access token
8: Request to Aura NLP Resolution API
9: Best practices for prompts generation
10: Request to Aura Generative API
11: Use ATRIA web interface
12: O365 Authentication
13: ATRIA error management

13.1: atria-model-gateway error management

14: Tutorial: Create new Copilot preset
15: Adjust timeouts in ATRIA
16: Create new Copilot preset (previous to Metallica)
17: Update ATRIA configuration using ConfigMap (previous to Metallica)
18: Modify prompts (previous to Metallica)

ATRIA technical guidelines

Guidelines detailing specific technical processes for different technical profiles (DevOps, use case builders, NLP experts, linguists, etc.) who want to use ATRIA capabilities or operate our AI-driven platform

Index of ATRIA technical guidelines

Other technical guidelines

These guidelines are common to ATRIA and Aura Virtual Assistant

Deploy Aura: Deployment Task Center
Develop and operate ATRIA: Developers Workspace

1 - Build e2e experiences

Build end-to-end experiences in ATRIA

“How to” workflows that schematically show the ordered steps required to build end-to-end experiences using ATRIA capabilities

Introduction

In order to leverage the available ATRIA capabilities within a use case development, specific technical tasks must be performed by different Aura teams.

The current document provides a schematic overview of the end-to-end workflow and links to the corresponding technical guidelines tailored to each team responsible for specific aspects of the project.

Build experiences that call Generative AI

Build experiences that use General RAG

Build experiences that call NLP Apps

Build experiences that call Semantic Search

1.1 - Build experiences that call Generative AI

Build experiences that call Generative AI

Workflow with the main stages to build an end-to-end experience that calls an OpenAI GPT model

Introduction

Generative AI in Aura benefits from the auto-generative capabilities of Azure OpenAI GPT models for an accurate understanding of requests and the generation of highly reliable answers.

Steps in the process

a. Prerequisites: Install and enable

Enable ATRIA components in Aura installer

GES team

Check that the required components are enabled. If not:
Enable Generative components
Enable atria-model-gateway
Enable aura-manager (ATRIA web interface)

Publish aura-gateway-api in Kernel

GES team / Kernel DevOps Team

Is aura-gateway-api published in Kernel? If not:
Publish the aura-gateway-api API in Kernel as a prerequisite to call this API

Get a Kernel token

GES team

Check if your Kernel token has already expired. If so:
Get a valid Kernel two-legged token

b. Build experience

Configure, build and test your experience with Generative/RAG

Use case constructor

Guidelines for ATRIA use cases constructors

1.2 - Build experiences that use General RAG

Build experiences that use General RAG

Workflow with the main stages to build an end-to-end experience that calls the General RAG model

Introduction

General RAG capability enables the implementation of RAG (Retrieval Augmented Generation) techniques to surpass the capabilities of LLMs in the development of generic questions use cases (based on FAQs).

Steps in the process

a. Prerequisites: Install and enable

Enable ATRIA components in Aura installer

GES team

Check that the required components are enabled. If not:
Enable RAG components
Enable atria-model-gateway
Enable atria-rag server
Enable aura-manager (ATRIA web interface)

Publish aura-gateway-api in Kernel

GES team / Kernel DevOps Team

Is aura-gateway-api published in Kernel? If not:
Publish the aura-gateway-api API in Kernel as a prerequisite to call this API

Get a Kernel token

GES team

Check if your Kernel token has already expired. If so:
Get a valid Kernel two-legged token

b. Build experience

Configure, build and test your experience with Generative/RAG

Use case constructor

Guidelines for ATRIA use cases constructors

1.3 - Build experiences that call NLP apps

Build experiences that call NLP apps

Workflow with the main stages to build an end-to-end experience that calls NLP as a Service to use an NLP app

Introduction

Within NLP as a Service, the NLP Apps capability enables channels, services or skills to connect with Aura cognitive capabilities for sending a request expressed in natural language and receiving back an accurate response via API, without the need for a conversational bot.

Steps in the process

a. Prerequisites: Install and enable

1. Enable ATRIA components in Aura installer

GES team

Is Aura NLP deployed in your Aura system? If not:
Deploy Aura NLP
Enable NLP as a Service components

2. Publish aura-gateway-api in Kernel

GES team / Kernel DevOps Team

Is aura-gateway-api published in Kernel? If not:
Publish the aura-gateway-api API in Kernel as a prerequisite to call this API

3. Get a Kernel token

GES team

Check if your Kernel token has already expired. If so:
Get a valid Kernel two-legged token

b. Configure

4. Configure an application

Use case constructor

Configure an application to connect with aura-gateway-api

c. Build & test

5. Build the understanding model

Use case constructor

Generate and deploy the NLP recognition package for your use case

6. Make request to API

Use case constructor

Make a request to Aura NLP resolution API

1.4 - Build experiences that call Semantic Search

Build experiences that call Semantic Search

How to build an end-to-end experience that uses the Semantic Search stage (OpenAI embeddings recognizer), within NLP as a service

Introduction

Within NLP as a Service, the Semantic Search capability enables the use of Azure OpenAI embeddings for the development of generic questions experiences (grounded in FAQs).

Steps in the process

a. Prerequisites: Install and enable

1. Enable ATRIA components in Aura installer

GES team

Deploy Aura NLP
Enable NLP as a Service components

2. Publish aura-gateway-api in Kernel

GES team / Kernel DevOps Team

Is aura-gateway-api published in Kernel? If not:
Publish the aura-gateway-api API in Kernel as a prerequisite to call this API

3. Get a Kernel token

GES team

Check if your Kernel token has already expired. If so:
Get a valid Kernel two-legged token

b. Configure

4. Configure an application

Use case constructor

Configure an application to connect with aura-gateway-api

c. Build & test

5. Prepare the FAQ knowledge base

Content manager

Prepare the FAQ contents and answers used by the Semantic Search stage

6. Build the understanding model

Use case constructor

Generate and deploy the NLP recognition package for your use case
For the Semantic Search capability, the stage OpenAI embeddings is used

7. Make request to API

Use case constructor

Make a request to Aura NLP resolution API

2 - Publish an API in Kernel

Publish an API in Kernel

Guidelines for the publication of an API in Kernel

Guidelines

As a prerequisite for building an experience in ATRIA, the aura-gateway-api API must be published in Kernel.

For this purpose, follow the instructions below:

Ask the Kernel team to configure, in the corresponding Kernel environment, the scopes needed to call this API:
- aura-ai-services:messaging:write: Permission to send Generative / RAG and feedback messages to Aura.
- aura-ai-services:nlp-messaging:write: Permission to send NLP as a Service messages to Aura.
Kernel accesses aura-gateway-api using an APIKey. Create this APIKey in each environment by following the instructions in Generate an APIKey.
Ask the Kernel team to configure the Aura AI Services API in the corresponding Kernel environment.
- The following settings are needed:
  - URL: https://{{aura-services-environment}}.auracognitive.com/aura-services/v2/aiservices/
  - Authorization header: APIKEY {{api-key}}
    
    Where:
    - {{aura-services-environment}} should look like svc-[country]-[environment], for instance svc-es-pre
    - {{api-key}} is a specific APIKey created for Kernel to access this endpoint. Request this APIKey from the team in charge of operating the corresponding Aura environment:
      - Aura Global team for development and staging environments
      - GES for certifications, pre-production and production environments
This process must be executed in each Kernel deployment.
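
As an illustration of how the two settings fit together, the following Python sketch composes the URL and the Authorization header from their parts. The function name `kernel_api_settings` and the sample values are hypothetical; only the URL pattern and the `APIKEY` header format come from the settings listed above.

```python
def kernel_api_settings(country: str, environment: str, api_key: str) -> dict:
    """Compose the settings requested from the Kernel team (illustrative helper)."""
    # {{aura-services-environment}} follows the pattern svc-[country]-[environment]
    services_env = f"svc-{country}-{environment}"
    return {
        "url": f"https://{services_env}.auracognitive.com/aura-services/v2/aiservices/",
        "authorization": f"APIKEY {api_key}",
    }


# Placeholder values; a real APIKey must be requested from the operating team
settings = kernel_api_settings("es", "pre", "my-api-key")
print(settings["url"])
print(settings["authorization"])
```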

3 - Configure an application

Guidelines for the configuration of an application

Comprehensive instructions for the configuration of an application to communicate with aura-gateway-api

Introduction

Prior to the development of a use case that needs aura-gateway-api to connect with an external service, the configuration of an application is required to set the specific parameters of the ATRIA AI-driven capability to be used.

Additionally, to modify an existing application through a hot swapping process, follow the guidelines in Hot swapping of Aura applications configuration.

Guidelines to create and configure an application

The creation of an application requires the following steps:

Create a task for the configuration of an application in JIRA.
Copy the tables below and paste them into the JIRA task.
Fill in all the fields corresponding to your specific ATRIA capability in the table’s column “Value in app”.
Do not modify the content of the remaining columns aside from “Value in app”.
The application must be available in the applications list of the environment, which is available through the aura-configuration-api server.
The edited sheet will serve as the basis for the subsequent validation of the application by Aura Global Team and its uploading to the system.

Mandatory parameters: parameter_name
Optional parameters: parameter_name

Application basic data

Parameter	Definition	Type	Value in app
disabled	Boolean value to enable or disable the application. By default: `false`	Boolean	complete
id	Unique application identifier	UUID	complete
name	Unique application name	String	complete
brand	Identifier of the Telefónica Brand associated to the application. Available values in the document Telefónica brands management	String	complete
nlp	Parameters to use the NLP as a Service capability. Mandatory for using this capability.	N/A	Go to NLP as a Service section
models	Parameters to use the Generative AI capability or RAG capability. Mandatory for using these capabilities.	N/A	Go to Using Generative AI / RAG section
agents	Identifiers of the agents associated with the application. Mandatory if the application requires integration with agents.	String[]	complete

Using NLP as a Service: nlp parameter

The use of the NLP as a Service capability by a channel requires the previous registration of the channel in Aura using the channels registration template.

Parameter	Definition	Type	Value in app
channelId	Identifier of the channel willing to use NLP as a Service	String	complete

Using Generative AI / RAG: models parameter

The use of Generative AI or RAG capabilities by applications requires the definition of the following parameters:

Parameter	Definition	Type
level	Level of the application in atria-model-gateway, which determines its specific access and control privileges. Default value: `user`	String
presets	Include here all the presets (configurable entities that define the instructions to work with the AI model for the resolution of a use case) that will be assigned to this application. For this purpose, follow the guidelines to include a preset in the application. Take into account that a preset must be previously created in ATRIA.	String[]
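
A minimal sketch of how the parameters from the three tables could be assembled into one application object. The nesting of `nlp` and `models` is inferred from the parameter names in the tables, so treat it as an assumption and check it against the validated template; the helper `build_application` and all sample values are hypothetical.

```python
import json
import uuid


def build_application(name, brand, channel_id=None, presets=None, agents=None):
    """Assemble an application configuration dict (illustrative structure)."""
    app = {
        "disabled": False,        # default value from the basic data table
        "id": str(uuid.uuid4()),  # unique application identifier (UUID)
        "name": name,
        "brand": brand,
    }
    if channel_id is not None:
        # Needed only when the application uses NLP as a Service
        app["nlp"] = {"channelId": channel_id}
    if presets is not None:
        # Needed only for Generative AI / RAG; "user" is the default level
        app["models"] = {"level": "user", "presets": list(presets)}
    if agents:
        # Needed only when the application integrates with agents
        app["agents"] = list(agents)
    return app


app = build_application("demo-app", "ZZZZ", presets=["demo-preset"])
print(json.dumps(app, indent=2))
```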

4 - Hot swapping of applications

Hot swapping of Aura applications configuration

Guidelines to execute modifications in Aura applications configuration through a hot swapping process

Prerequisites

The URL of aura-configuration-api must have the following format: https://{{aura-services-domain}}.auracognitive.com/aura-services/v2/configuration where:
- {{aura-services-domain}} should be svc-[country]-[environment], for instance svc-es-pre
Recommended:
- kubectl installed in your local host.
- curl installed in your local host.
- jq installed in your local host.

Access Aura Configuration API

Get the APIKey

First, we must get the APIKey, AURA_AUTHORIZATION_HEADER, of aura-configuration-api. For this purpose, follow these steps:

Execute the following command:

# substitute {{aura-environment}} with the environment you're configuring
export AURA_ENVIRONMENT={{aura-environment}}

$ kubectl -n $AURA_ENVIRONMENT get secret aura-configuration-api -o json | jq -r ".data.AURA_AUTHORIZATION_HEADER|@base64d"

Copy the value of APIKey.
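
Where jq is not available, the decoding step can be reproduced in Python. This is a sketch that assumes the JSON printed by `kubectl ... get secret ... -o json` is already held in a string; the function name `extract_api_key` and the sample secret are hypothetical.

```python
import base64
import json


def extract_api_key(secret_json: str) -> str:
    """Return the decoded AURA_AUTHORIZATION_HEADER from a Kubernetes secret."""
    secret = json.loads(secret_json)
    # Kubernetes stores secret values base64-encoded under .data
    encoded = secret["data"]["AURA_AUTHORIZATION_HEADER"]
    return base64.b64decode(encoded).decode("utf-8")


# Sample secret built locally for demonstration (not a real key)
sample = json.dumps(
    {"data": {"AURA_AUTHORIZATION_HEADER": base64.b64encode(b"APIKEY demo").decode()}}
)
print(extract_api_key(sample))
```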

Update the application configuration

To update the configuration of an application, send a PATCH request to the aura-configuration-api indicating the application to modify and the new values:

Run the following curl command to update the configuration:

# generate a valid UUID as correlator
# substitute {{correlator}} with the generated UUID
# substitute {{aura-services-domain}} with the specific information for the environment, svc-[country]-[environment]
# substitute {{applicationId}} with the identifier of the application to change
# substitute {{apikey}} with the value of the APIKey obtained in the previous step
# add the fields to update to the JSON body, next to "id"
$ curl --location --request PATCH 'https://{{aura-services-domain}}.auracognitive.com/aura-services/v2/configuration/applications/{{applicationId}}' \
--header 'correlator: {{correlator}}' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--header 'Authorization: {{apikey}}' \
--data-raw '{
    "id": "{{applicationId}}"
}'

Check the change through the following request:

# generate a valid UUID as correlator
# substitute {{correlator}} with the generated UUID
# substitute aura-services-domain with the specific information for environment svc-[country]-[environment].
# substitute {{applicationId}} with the value of application to change
# substitute {{apikey}} with the value of the APIKey obtained in the previous step
# The response will be the application configuration.
$ curl --location --request GET 'https://{{aura-services-domain}}.auracognitive.com/aura-services/v2/configuration/applications/{{applicationId}}' \
--header 'correlator: {{correlator}}' \
--header 'Accept: application/json' \
--header 'Authorization: {{apikey}}'

ℹ️ NOTE: The config-watcher runs periodically (every 5 minutes) and when it detects that the application configuration has been modified, it will restart the pods.
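
The update-and-check sequence above can also be scripted. The sketch below only builds the two requests with Python's standard library and does not send them; the helper names, the sample domain and the sample change (`disabled`) are illustrative assumptions, while the URL pattern and headers mirror the curl commands.

```python
import json
import urllib.request
import uuid

CONFIG_URL = "https://{domain}.auracognitive.com/aura-services/v2/configuration"


def _headers(api_key: str) -> dict:
    return {
        "correlator": str(uuid.uuid4()),  # fresh UUID used as correlator
        "Accept": "application/json",
        "Authorization": api_key,
    }


def build_patch(domain: str, application_id: str, api_key: str, changes: dict):
    """Build (without sending) the PATCH request that updates an application."""
    url = f"{CONFIG_URL.format(domain=domain)}/applications/{application_id}"
    body = json.dumps({"id": application_id, **changes}).encode("utf-8")
    headers = {**_headers(api_key), "Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, method="PATCH", headers=headers)


def build_get(domain: str, application_id: str, api_key: str):
    """Build (without sending) the GET request that reads the configuration back."""
    url = f"{CONFIG_URL.format(domain=domain)}/applications/{application_id}"
    return urllib.request.Request(url, method="GET", headers=_headers(api_key))


patch = build_patch("svc-es-pre", "my-app-id", "APIKEY demo", {"disabled": False})
check = build_get("svc-es-pre", "my-app-id", "APIKEY demo")
print(patch.get_method(), patch.full_url)
print(check.get_method(), check.full_url)
```

To send a request, pass it to `urllib.request.urlopen`; the response to the GET request is the application configuration.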

5 - Agents server

Agents server

These documents include comprehensive guidelines for the management of agents in ATRIA, intended for two profiles:

Technical developers in charge of creating agents

Use cases constructors responsible for the generation and configuration of an agent for their experience

Index of contents

5.1 - Guidelines for technical developers

Guidelines for technical developers

Scope: Comprehensive guidelines for the development of an agent and its integration into ATRIA

Technical developers

Introduction

ATRIA offers a framework for the development of agents to be used for the creation of experiences.

Agents can be developed either with the developer's own tools or with the libraries and tools that ATRIA provides. The current document focuses on the second option.

Agents development workflow

Prerequisites

Two main prerequisites are required prior to the development of an agent:

Python version: 3.13 or higher
ATRIA agents-server dependencies must be installed.

Workflow

The orderly steps for the creation and subsequent deployment of an agent are included below:

5.1.1 - Define and configure agent

Define and configure agent

Description of the process for defining a new agent in ATRIA

Introduction

The first step in creating a new agent in ATRIA is to define it. This is done by extending the BaseAgent class, which creates the basic structure shared by all agents and manages their configuration.

Base class

The BaseAgent class provides the functionality needed to initialize the agent, manage its configuration and interact with the agent package.

Access here the BaseAgent class in the Github repository.

The ATRIA agent-server uses this class to identify all the agents in the agent package that derive from BaseAgent.
This class provides the functionality to initialize and build the agent.

Build agent

To build an agent, extend this base class and implement at least the following methods, according to the specific goals assigned to the agent.

get_class_ref: This method is used to get the class reference of the agent. It is used to identify the agent in the system.
build: This method is used to build the agent. It is used to set up the necessary configurations and to get everything required to initialize the agent.
__call__: This method is used to call the agent. It is used to execute the agent’s functionality.
initialize: This method is optional and used to initialize the agent. For example, it can be used to set up the agent’s state or to load any necessary data.
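
The methods listed above can be sketched as follows. To keep the example runnable on its own, it defines a stand-in `BaseAgent`; the real class lives in the ATRIA repository, and its actual signatures (class or instance methods, arguments, sync or async calls) are not shown in this document, so everything below is an assumption. `EchoAgent` and its `prefix` setting are hypothetical.

```python
class BaseAgent:
    """Stand-in for ATRIA's BaseAgent so the sketch runs on its own (hypothetical)."""

    def __init__(self, config=None):
        self.config = config or {}


class EchoAgent(BaseAgent):
    """Hypothetical agent that echoes its input with a configurable prefix."""

    @classmethod
    def get_class_ref(cls) -> str:
        # Reference used to identify the agent in the system
        return "echo-agent"

    @classmethod
    def build(cls, config: dict) -> "EchoAgent":
        # Gather everything required to create the agent, then initialize it
        agent = cls(config)
        agent.initialize()
        return agent

    def initialize(self) -> None:
        # Optional: set up state or load data needed by the agent
        self.calls = 0

    def __call__(self, text: str) -> str:
        # Execute the agent's functionality
        self.calls += 1
        prefix = self.config.get("prefix", "")
        return f"{prefix}{text}"


agent = EchoAgent.build({"prefix": "echo: "})
print(agent("hello"))
```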

Configuration and information

The BaseAgent class provides a way to manage the agent’s configuration and information.

Information

The agent information is generated during the agent creation stage and can be accessed through the info field. It contains the following fields:

identifier: Unique identifier of the agent. This value corresponds to the class_ref of the agent.
name: Name of the agent. This value corresponds to the name of the agent received from the API.
deployment_name: Name of the agent deployment, used to identify the environment that corresponds to the agent.
base_name: Base name of the agent, used to identify the agent type. This value corresponds to the name of the agent class.
version: Version of the agent. This value corresponds to the version of the installed agent package.

Configuration

The agent configuration is generated during the agent creation stage from the aura-configuration-api, within the API agentBase.configuration field.

Here is an example.

 {
      "id": "XXXX-XXXX-XXXX-XXXXXXXXXXXX",
      "name": "test-agent",
      "deploymentName": "test-deployment-agent",
      "description": "A test agent",
      "communication": {
          "communicationType": "http"
      },
      "agentBase": {
          "name": "test-agent",
          "configuration": {
            "test": "test_value",
            "url": "http://localhost:8000"
          }
      }
  }

The entire content of this field is inserted, as a dictionary, into the config field of the agent being created.

To obtain a configuration value, access this config field as shown in the example below, which reads the test field from the agent’s configuration.
Use config only to read values on each request; do not store them between requests, because the configuration can be updated on the fly without restarting the component.

test_value = self.config.get("test", None)

All this configuration is at agent level, but there are also configuration values at environment level. To obtain an environment-level value, read the corresponding environment variable as shown below:

test_value = os.getenv("TEST_ENV_VAR")
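
The following sketch illustrates why configuration values should be read on every request instead of being stored. The class `ConfigAwareAgent` is a hypothetical stand-in that only mimics the `config` field described above; the `url` key and the `TEST_ENV_VAR` variable are taken from the examples in this section.

```python
import os


class ConfigAwareAgent:
    """Hypothetical agent-like class exposing a `config` dictionary."""

    def __init__(self, config: dict):
        self.config = config

    def __call__(self, path: str) -> str:
        # Agent-level value: read on every request, never cached on self
        base_url = self.config.get("url", "http://localhost:8000")
        # Environment-level value, with a fallback when the variable is unset
        suffix = os.getenv("TEST_ENV_VAR", "")
        return f"{base_url}/{path}{suffix}"


config = {"url": "http://localhost:8000"}
agent = ConfigAwareAgent(config)
first = agent("status")
config["url"] = "http://localhost:9000"  # simulates an on-the-fly config update
second = agent("status")
print(first)
print(second)
```

Because the value is fetched inside `__call__`, the second request picks up the updated URL without recreating the agent.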

5.1.2 - Create a Docker image

Docker image

Description of how to create agent images, which is necessary for future deployment.

Introduction

The Docker image is the necessary component to deploy the agent in the ATRIA environment.

The image is built from the agent package and the agent-server, and contains everything necessary to run the agent-server.

Build agent image

To build the agent image, it is necessary to use the docker command with the build option.

docker build -t <image_name> .

Where:

<image_name>: Name of the image to be created.
.: It indicates that the Dockerfile is in the current directory.

Dockerfile

The Dockerfile is the file that contains the instructions to build the image.

The Dockerfile for the agent image is located in the root directory of the agent package.

It is structured in two stages:

The first stage builds the agent package
The second stage creates the final image with the necessary dependencies and configurations and runs the agent-server.

FROM python:3.13-slim AS base

ADD packages/atria-agent-dummy /opt/atria-agent
WORKDIR /opt/atria-agent

RUN pip install -r dev-requirements.txt && \
    python -m build

FROM python:3.13-slim

WORKDIR /opt/atria-agent

RUN apt-get update && apt-get install gcc python3-dev -y

COPY --from=base /opt/atria-agent/dist/atria_agent_dummy-*.tar.gz .
COPY --from=base /opt/atria-agent/entrypoint.sh entrypoint.sh
COPY --from=base /opt/atria-agent/version.txt version.txt

RUN pip install atria_agent_dummy-*.tar.gz && rm atria_agent_dummy-*.tar.gz

ENV AGENT_PACKAGE_NAME=atria_agent_dummy

ENTRYPOINT ["./entrypoint.sh"]

The directory atria-agent-dummy is the agent package that contains the agents’ code.

The entrypoint.sh script is used to run the agent server with the necessary configurations.

#!/bin/bash

set -e

export AURA_LOGGING_MODULE_VERSION=$(cat /opt/atria-agent/version.txt)

python -m atria_agents_server

5.1.3 - Deployment of an agent

Deployment of an agent in ATRIA

Guidelines for the deployment of newly created agents in ATRIA

Introduction

The current document includes comprehensive guidelines that serve as the foundational framework for the deployment of customized agents within ATRIA.

To deploy an agent through the Aura config provisioning, generate the following files in the corresponding folders.

Agents Base

This folder must include the agents that are available to build. To add one, generate a JSON file with the data of the new agent.

Here is an example.

{
  "id": "XXXX-XXXX-XXXX-YYYYYYYYYYYYYY",
  "name": "test-agent",
  "description": "An agent test",
  "language": "python",
  "version": "1.0.0"
}

Where:

id: Unique identifier for the agent.
name: Name of the agent.
description: Description of the agent.
language: Programming language used for the agent.
version: Version of the agent.

These fields can also be added, removed or edited through the API. Note that changes made directly through the API are not persisted between releases.

Agents Deployment

In this folder, developers must define the agent to be deployed, associated with the [Docker image version](/docs/atria/technical-guidelines/agents-management/agents-technical-development/docker-image/) previously created.

For this purpose, generate a json file with the data of this new agent.

Here is an example.

{
   "id": "XXXX-XXXX-XXXX-XXXXXXXXXXXX",
   "name": "test-agent",
   "config": {},
   "secrets": {},
   "image": "XXXX/agent-test",
   "tag": "X.X.X"
 }

Where:

id: Unique identifier for the agent.
name: Name of the agent.
config: Configuration values assigned to the agent’s environment (can be empty).
secrets: Secrets assigned to the agent’s environment (can be empty).
image: Docker image of the agent.
tag: Tag of the Docker image.

These fields cannot be updated through the API.

Agents

In this folder, developers must include the definition of the agent to be deployed, together with its configuration and information.

You can deploy the same agent image with different information or configuration.

Here is an example.

{
     "id": "XXXX-XXXX-XXXX-XXXXXXXXXXXX",
     "name": "test-agent",
     "deploymentName": "test-deployment-agent",
     "description": "A test agent",
     "communication": {
         "communicationType": "http"
     },
     "agentBase": {
         "name": "test-agent",
         "configuration": {}
     }
 }

Where:

id: Unique identifier for the agent.
name: Name of the agent. This value is the name we want to give to the agent and is not related to the name we have at code level. Therefore, we can deploy several different agents with the same code base.
deploymentName: Name of the agent deployment. This value allows several agents to be grouped under the same deployment.
description: Description of the agent.
communication: Communication type of the agent, in this case it is HTTP.
agentBase: Information about the base agent, including its name and configuration.
agentBase.name: Name of the base agent. This value associates the agent with the image of the agent you want to deploy, and corresponds to the value returned by get_class_ref in the developed agent.
agentBase.configuration: Configuration parameters for the agent (can be empty).

These fields can also be added, removed or edited through the API. Note that changes made directly through the API are not persisted between releases.
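
Before provisioning, the generated files can be checked for consistency. The sketch below is a hypothetical helper, not part of ATRIA: it assumes that every field listed in the descriptions above is required, and that `agentBase.name` must match the `name` of a file in the Agents Base folder, as the examples suggest.

```python
# Fields taken from the field descriptions of each folder (assumed mandatory)
REQUIRED_FIELDS = {
    "agent_base": {"id", "name", "description", "language", "version"},
    "agent_deployment": {"id", "name", "config", "secrets", "image", "tag"},
    "agent": {"id", "name", "deploymentName", "description", "communication", "agentBase"},
}


def missing_fields(kind: str, document: dict) -> set:
    """Return the documented fields that are absent from a provisioning file."""
    return REQUIRED_FIELDS[kind].difference(document)


def check_agent(agent: dict, agent_bases: list) -> list:
    """Collect problems found in an agent definition (empty list means OK)."""
    problems = [f"missing field: {name}" for name in sorted(missing_fields("agent", agent))]
    base_names = {base["name"] for base in agent_bases}
    base_name = agent.get("agentBase", {}).get("name")
    if base_name not in base_names:
        problems.append(f"unknown agentBase.name: {base_name}")
    return problems


agent_bases = [{"id": "1", "name": "test-agent", "description": "An agent test",
                "language": "python", "version": "1.0.0"}]
agent = {"id": "2", "name": "test-agent", "deploymentName": "test-deployment-agent",
         "description": "A test agent", "communication": {"communicationType": "http"},
         "agentBase": {"name": "test-agent", "configuration": {}}}
print(check_agent(agent, agent_bases))
```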

Applications

For the agent to be used in ATRIA, it must be associated to an existing application.

For this purpose, within the general process for the configuration of an application, edit the field agents with the list of agents’ identifiers to be associated to the application.

Here is an example.

{
     "brand": "ZZZZ",
     "id": "YYYY-YYYY-YYYY-YYYYYYYYYYYY",
     "name": "test-agent-app",
     "agents": [
         "XXXX-XXXX-XXXX-XXXXXXXXXXXX"
     ]
}

These fields can also be added, removed or edited through the API. Note that changes made directly through the API are not persisted between releases.

5.1.4 - Errors management

Errors management

Description of the error handling available on the server for internal use of new agents in ATRIA

Introduction

The agents-server provides a set of error management mechanisms to ensure that agents can handle errors gracefully and provide meaningful feedback to users. This is essential for maintaining the reliability and usability of the agents.

Error Managers

The agents-server provides a set of error managers that can be used to handle errors in a consistent way. These error managers are designed to be used by agents to handle errors that occur during their execution.

The error managers are:

AgentErrorManager: This error manager is used to handle errors that occur during the execution of the agent. This results in the corresponding response and error code, depending on the exception thrown at agent level.
FastApiErrorManager: This error manager is used to handle errors that occur during the execution of the FastAPI application. This results in the corresponding response and error code, depending on the exception thrown at server level.

AgentErrorManager

All these exceptions receive a message and an error code.

This manager controls the following exceptions:

AgentBaseException

This is the base exception for all agent-related exceptions. It is used to catch any other exceptions that are not explicitly handled by the other error managers. It results in a 500 Internal Server Error response. It is formed as follows:

message: String. Default value: An agent error occurred.
error_code: String. Default value: AGENT_ERROR.

AgentNotFoundException

This exception is raised when the agent is not found in the system. It results in a 404 Not Found response. It is formed as follows:

message: String. Default value: Agent not found.
error_code: String. Default value: AGENT_NOT_FOUND.

AgentConfigException

This exception is raised when there is an error in the agent configuration. It results in a 400 Bad Request response. It is formed as follows:

message: String. Default value: Agent configuration error.
error_code: String. Default value: AGENT_CONFIG_ERROR.

AgentValidationException

This exception is raised when there is a validation error in the agent’s input. It results in a 400 Bad Request response. It is formed as follows:

message: String. Default value: Agent validation failed.
error_code: String. Default value: AGENT_VALIDATION_ERROR.

AgentExecutionException

This exception is raised when there is an error during the execution of the agent. It results in a 500 Internal Server Error response. It also receives the field detail. It is formed as follows:

message: String. Default value: Agent execution failed.
error_code: String. Default value: AGENT_EXECUTION_ERROR.
detail: String. Used to provide additional information to message. Default value: empty string.

AgentExternalServiceException

This exception is raised when there is an error in the external service that the agent is trying to access. It results in a 502 Bad Gateway response. It also receives the fields service_error_code and service_name. It is formed as follows:

message: String. Default value: External service error.
error_code: String. Default value: AGENT_EXTERNAL_SERVICE_ERROR.
service_error_code: String. Used to provide additional information by appending `{service_error_code}` to the message. Default value: empty string.
service_name: String. Used to provide additional information by appending `Service: {service_name}` to the message. Default value: empty string.

AgentModelError

message: String. Default value: Model error.
error_code: String. Default value: AGENT_MODEL_ERROR.
service_error_code: String. Used to provide additional information by appending `{service_error_code}` to the message. Default value: empty string.
service_name: String. Used to provide additional information by appending `Service: {service_name}` to the message. Default value: empty string.

AgentRateLimitError

This exception is raised when the agent exceeds the rate limit for the external service it is trying to access. It results in a 429 Too Many Requests response. It also receives the fields service_error_code and service_name. It is formed as follows:

message: String. Default value: Rate limit error.
error_code: String. Default value: AGENT_RATE_LIMIT_ERROR.
service_error_code: String. Used to provide additional information by appending `{service_error_code}` to the message. Default value: empty string.
service_name: String. Used to provide additional information by appending `Service: {service_name}` to the message. Default value: empty string.
retry_after: String. Mandatory field that adds the value Retry-After in the response header to indicate how long the client should wait before making a new request.

Usage

This manager lets you raise these exceptions from within your new agent.

To raise one of these exceptions, use a statement such as the following:

raise AgentExecutionException(message='message error', error_code='ERROR_CODE', detail='problem with the agent execution')
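
A minimal sketch of how an agent might wrap unexpected failures in one of these exceptions. The classes below are stand-ins re-created from the field descriptions above so that the example runs on its own; in a real agent they would be imported from the agents-server package. The `status_code` attribute and the helper `run_agent` are hypothetical.

```python
class AgentBaseException(Exception):
    """Stand-in base exception (hypothetical re-creation for illustration)."""

    status_code = 500  # hypothetical attribute mirroring the documented response

    def __init__(self, message="An agent error occurred.", error_code="AGENT_ERROR"):
        super().__init__(message)
        self.message = message
        self.error_code = error_code


class AgentNotFoundException(AgentBaseException):
    status_code = 404

    def __init__(self, message="Agent not found.", error_code="AGENT_NOT_FOUND"):
        super().__init__(message, error_code)


class AgentExecutionException(AgentBaseException):
    status_code = 500

    def __init__(self, message="Agent execution failed.",
                 error_code="AGENT_EXECUTION_ERROR", detail=""):
        super().__init__(message, error_code)
        self.detail = detail


def run_agent(agent_callable, payload):
    """Wrap unexpected failures in an agent-level exception."""
    try:
        return agent_callable(payload)
    except AgentBaseException:
        raise  # already an agent-level error, let the error manager handle it
    except Exception as exc:
        raise AgentExecutionException(detail=str(exc)) from exc


try:
    run_agent(lambda p: 1 / p, 0)
except AgentBaseException as err:
    print(err.status_code, err.error_code, err.detail)
```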

FastApiErrorManager

This manager controls the following server-level exceptions:

ValidationException: This exception is raised when there is a validation error in the request. It results in a 400 Bad Request response.
RequestValidationError: This exception is raised when there is a validation error in the request body. It results in a 400 Bad Request response.
ResponseValidationError: This exception is raised when there is a validation error in the response body. It results in a 400 Bad Request response.

Response Error

The error managers return a response with the following structure:

{
    "code": "NOT_FOUND",
    "message": "Agent with identifier XXXX not found.",
    "errors": [
        {
            "type": "AGENT_NOT_FOUND",
            "message": "Agent with identifier XXXX not found."
        }
    ]
}

The response contains the following fields:

code: The error code that identifies the type of error.
message: A human-readable message that describes the error.
errors: A list of errors that occurred during the execution of the agent. Each error contains the following fields:
- type: The type of error that occurred.
- message: A human-readable message that describes the error.
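
As an illustration of how a client might consume this structure, the sketch below parses an error body and builds a one-line summary. It relies only on the fields described above; the function name `summarize_error` is hypothetical.

```python
import json


def summarize_error(body: str) -> str:
    """Build a one-line summary from an error response body."""
    payload = json.loads(body)
    # Each entry in "errors" carries a "type" and a "message".
    details = ", ".join(
        f"{error['type']}: {error['message']}"
        for error in payload.get("errors", [])
    )
    summary = f"[{payload['code']}] {payload['message']}"
    return f"{summary} ({details})" if details else summary


if __name__ == "__main__":
    sample = json.dumps({
        "code": "NOT_FOUND",
        "message": "Agent with identifier XXXX not found.",
        "errors": [
            {"type": "AGENT_NOT_FOUND",
             "message": "Agent with identifier XXXX not found."}
        ],
    })
    print(summarize_error(sample))
```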

5.2 - Guidelines for use cases constructors

Guidelines for use cases constructors

Scope: Comprehensive guidelines for the use of agents in an ATRIA experience

Use cases constructors

Index of contents

5.2.1 - Create and configure an agent

Create and configure an agent

Guidelines for the configuration of ATRIA by use cases constructors when developing an experience by means of an agent

Introduction

An agent is a configuration entity in ATRIA that represents an integration point for external channels, services, or platforms.

Agents are referenced by applications to enable channel or service connectivity within the platform.

Guidelines to configure an agent

1. Create a new agent

Build the agent for your use case as a JSON file, using the available agent fields.

Once the agent JSON file is ready, execute this command to create the agent:

curl --location --request POST 'https://svc-<env>.auracognitive.com/aura-services/v2/configuration/agents/' \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/json' \
  --header 'Authorization: APIKEY XXX' \
  --data-raw '<NEW AGENT JSON>'

1.1. Modify/update an agent

If the agent needs to be modified after it has been created, follow these instructions:

Make the required changes in the agent JSON file using the available agent fields.

Once the agent JSON file is modified, execute this command to update the agent:

curl --location --request PUT 'https://svc-<env>.auracognitive.com/aura-services/v2/configuration/agents/<agentId>' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: APIKEY XXX' \
  --data '<AGENT JSON WITH MODIFICATIONS>'

1.2. Delete an agent

Execute the following command:

curl --location --request DELETE 'https://svc-<env>.auracognitive.com/aura-services/v2/configuration/agents/<agentId>' \
  --header 'Accept: application/json' \
  --header 'Authorization: APIKEY XXX'
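
The same create, update and delete calls can be prepared from Python. The sketch below only builds the request objects with the standard library and mirrors the URL and headers of the curl commands above. The helper name `build_agent_request` and the `env` value used in the usage example are assumptions for illustration.

```python
import json
from typing import Optional
from urllib.request import Request


def build_agent_request(method: str, env: str, api_key: str,
                        agent: Optional[dict] = None,
                        agent_id: str = "") -> Request:
    """Prepare a POST, PUT or DELETE request for the agents configuration API."""
    url = (f"https://svc-{env}.auracognitive.com"
           f"/aura-services/v2/configuration/agents/{agent_id}")
    headers = {
        "Accept": "application/json",
        "Authorization": f"APIKEY {api_key}",
    }
    data = None
    if agent is not None:
        # POST and PUT send the agent document as a JSON body.
        data = json.dumps(agent).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(url, data=data, headers=headers, method=method)


if __name__ == "__main__":
    request = build_agent_request("DELETE", "dev", "XXX", agent_id="<agentId>")
    print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request.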

2. Include the agent in the application

If the application for your use case does not exist, first create it following the guidelines for the configuration of an application.

Once the application exists, add the identifier of the created agent to the agents field of the application.

If you update or delete an agent, ensure that any application referencing it is also updated accordingly.
An agent must already exist before it can be added to an application.

Example to update the list of agents in an application:

curl --location --request PATCH 'https://svc-<env>.auracognitive.com/aura-services/v1/applications/<applicationId>' \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/json' \
  --header 'Authorization: APIKEY XXX' \
  --data '{
    "id": "<applicationId>",
    "agents": [
      "<agentId1>",
      "<agentId2>"
    ]
  }'

Agent fields

The fields for the characterization of an agent are summarized below, as defined in the API swagger Aura Configuration ATRIA Agents:

Field	Type	Mandatory	Description
`id`	string	Yes	Unique identifier (UUID) for the agent.
`name`	string	Yes	Name that uniquely identifies the agent in Aura.
`description`	string	No	Description of the agent.
`communication`	object	Yes	Parameters for the configuration of the communication flow. See communication configuration.
`agentBase`	object	No	Configuration of the agent base
`deploymentName`	string	No	Name of the deployment where the agent is running. If `communication` does not include the `endpoint` field, this field is used to compose the agent endpoint. `deploymentName` and `endpoint` are mutually exclusive.
`metadata`	object	No	Document metadata (version, createdAt, updatedAt, etc). See metadata.

Communication configuration (`communication`)

Field	Type	Mandatory	Description
`communicationType`	string	Yes	Type of communication. Only `http` is currently supported.
`endpoint`	string	No	HTTP endpoint where the agent listens.
`headers`	object	No	HTTP headers associated with the agent.
`timeout`	number	No	Timeout for agent communication.
`retries`	number	No	Number of retries for communication.

Agent base (`agentBase`)

Field	Type	Mandatory	Description
`name`	string	Yes	Name that uniquely identifies the agent base in Aura.
`configuration`	object	No	The configuration of the agent flow.

Metadata (`metadata`)

Field	Type	Mandatory	Description
`version`	string	No	Configuration version when the document was created.
`createdAt`	string	No	Creation date (ISO 8601).
`updatedAt`	string	No	Last update date (ISO 8601).

Example: Minimal agent configuration

{
  "id": "b1e2c3d4-5678-1234-9abc-def012345678",
  "name": "example-agent",
  "communication": {
    "communicationType": "http",
    "endpoint": "https://agent.example.com/webhook"
  }
}

Example: Full agent configuration

  {
    "id": "1870fa4a-bcc4-4a7c-88fc-c0194555a076",
    "name": "device-recommender-agent",
    "communication": {
      "communicationType": "http"
    },
    "deploymentName": "mongo-device-recommender-agent",
    "description": "An AI agent built with langgraph that provides personalized recommendations about devices by querying and analyzing data stored in a MongoDB database.",
    "agentBase": {
      "name": "device-recommender-agent",
      "configuration": {
        "conversational_agent": {
          "conversational_prompt": "Adopt the role of Aura",
          "model_params": {
            "temperature": 0.1
          },
          "model_str": "model_gw/gpt-4o-mini"
        },
        "mongo_agent": {
          "database_name": "mongo-recommender",
          "limit_query_result": 10,
          "model_params": {
            "temperature": 0.1
          },
          "model_str": "model_gw/gpt-4o-mini"
        }
      }
    },
    "metadata": {
      "createdAt": "2025-07-11T09:54:33.973Z",
      "updatedAt": "2025-07-11T09:54:33.973Z",
      "version": "10.3.0"
    }
  }

Note:

The id, name, and communication fields are mandatory.

The communicationType must be http.

If an agent is deleted, applications referencing it will be updated.
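
Before sending an agent document to the API, the rules above can be checked locally. The following is a minimal sketch, assuming that "incompatible" means `communication.endpoint` and `deploymentName` must not both be set; the function `validate_agent` is hypothetical and is not a substitute for the validation performed by the API.

```python
import uuid
from typing import List


def validate_agent(agent: dict) -> List[str]:
    """Return a list of problems found in an agent document (empty if none)."""
    problems = []
    for field in ("id", "name", "communication"):
        if field not in agent:
            problems.append(f"missing mandatory field: {field}")
    if "id" in agent:
        try:
            uuid.UUID(str(agent["id"]))
        except ValueError:
            problems.append("id is not a valid UUID")
    communication = agent.get("communication")
    if isinstance(communication, dict):
        if communication.get("communicationType") != "http":
            problems.append("communication.communicationType must be 'http'")
        # Assumption: endpoint and deploymentName are mutually exclusive.
        if "endpoint" in communication and "deploymentName" in agent:
            problems.append("endpoint and deploymentName are incompatible")
    return problems


if __name__ == "__main__":
    minimal_agent = {
        "id": "b1e2c3d4-5678-1234-9abc-def012345678",
        "name": "example-agent",
        "communication": {
            "communicationType": "http",
            "endpoint": "https://agent.example.com/webhook",
        },
    }
    print(validate_agent(minimal_agent))
```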

5.2.2 - Create and configure an agent base

Create and configure an agent base

Guidelines for the configuration of ATRIA by use cases constructors when developing an experience by means of an agent

Introduction

An agent-base is a configuration entity in ATRIA that represents the implementation code of an agent.

Agent bases are referenced by agents to deploy the configuration of the agent.

Guidelines to configure an agent-base

1. Create a new agent-base

Build the agent-base for your use case as a JSON file, using the available agent base fields.

Once the agent-base JSON file is ready, execute this command to create the agent-base:

curl --location --request POST 'https://svc-<env>.auracognitive.com/aura-services/v2/configuration/agents-base/' \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/json' \
  --header 'Authorization: APIKEY XXX' \
  --data-raw '<NEW AGENT BASE JSON>'

1.1. Modify/update an agent-base

If the agent-base needs to be modified after it has been created, follow these instructions:

Make the required changes in the agent-base JSON file using the available agent base fields.

Once the agent-base JSON file is modified, execute this command to update the agent-base:

curl --location --request PUT 'https://svc-<env>.auracognitive.com/aura-services/v2/configuration/agents-base/<agentBaseId>' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: APIKEY XXX' \
  --data '<AGENT BASE JSON WITH MODIFICATIONS>'

1.2. Delete an agent-base

Execute the following command:

curl --location --request DELETE 'https://svc-<env>.auracognitive.com/aura-services/v2/configuration/agents-base/<agentBaseId>' \
  --header 'Accept: application/json' \
  --header 'Authorization: APIKEY XXX'

2. Include the agent-base in the agent

Create the agent configuration, indicating the agentBase.name, as explained in the guidelines for the configuration of an agent.
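
The link between both entities is the agent base name. As a hedged illustration, the sketch below looks up the agent base that an agent document refers to through `agentBase.name`; the helper `find_agent_base` and the in-memory list of agent bases are assumptions made for the example.

```python
from typing import List, Optional


def find_agent_base(agent: dict, agent_bases: List[dict]) -> Optional[dict]:
    """Return the agent base whose name matches agent['agentBase']['name']."""
    wanted = agent.get("agentBase", {}).get("name")
    if wanted is None:
        # The agent does not reference any agent base.
        return None
    return next((base for base in agent_bases if base.get("name") == wanted), None)


if __name__ == "__main__":
    bases = [{"name": "agent-base-test", "language": "python"}]
    agent = {"name": "example-agent", "agentBase": {"name": "agent-base-test"}}
    print(find_agent_base(agent, bases))
```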

Agent base fields

The fields for the characterization of an agent-base are summarized below, as defined in the API swagger Aura Configuration ATRIA Agents Base:

Field	Type	Mandatory	Description
`id`	string	Yes	Unique identifier (UUID) for the agent-base.
`name`	string	Yes	Name that uniquely identifies the agent-base.
`description`	string	No	Description of the agent-base.
`language`	string	Yes	Programming language of the agent base. Currently, only `python` is supported.
`tags`	array	No	Tags that the agent base is associated with.
`version`	string	No	Version of the agent base.
`metadata`	object	No	Document metadata (version, createdAt, updatedAt, etc). See metadata.

Metadata (`metadata`)

Field	Type	Mandatory	Description
`version`	string	No	Configuration version when the document was created.
`createdAt`	string	No	Creation date (ISO 8601).
`updatedAt`	string	No	Last update date (ISO 8601).

Example: agent-base configuration

{
  "id": "cd2b534c-16c3-4d89-a87e-ec45d3939232",
  "name": "agent-base-test",
  "description": "An AI agent built with langgraph that provides personalized recommendations about devices by querying and analyzing data stored in a MongoDB database.",
  "language": "python",
  "version": "1.0.0"
}

Note:

The id, name, and language fields are mandatory.

The language must be python.

6 - ATRIA configuration

ATRIA configuration

Comprehensive description of ATRIA default configuration
Guidelines for the modification of ATRIA components configuration
Guidelines for importing documents into ATRIA

Introduction

The main ATRIA components, atria-model-gateway and atria-rag-server, are configured through two kinds of parameters: internal ones and those required when developing an experience in ATRIA.

The following documents describe these parameters and their associated fields, and define how experience constructors can modify them.

The configuration parameters can be divided into two main categories:

CONFIGURATION PARAMETERS	DESCRIPTION	TARGET USERS	RELATED DOCUMENTS
Server configuration parameters	Internal configuration for ATRIA components	ATRIA developers and installation teams	ATRIA components default configuration
preset	- Instructions to work with the AI model for the resolution of a use case - It includes a process for documents and data import into the environment	ATRIA use cases constructors	- Modify ATRIA configuration: Configure a preset - Import documents into ATRIA

6.1 - ATRIA components default configuration

ATRIA components default configuration

Description of the default configuration (internal configuration) for ATRIA components

Introduction

The default configuration of ATRIA corresponds to the server configuration, that is, the internal configuration for ATRIA components.

Within a specific configuration type, parameters are organized by component:

Fields for atria-model-gateway configuration
Fields for atria-rag-server
Common fields for both components

1. Server configuration

Fields related to the internal configuration of ATRIA components

Target users: ATRIA development and installation teams

The default server configuration fields cannot be modified by ATRIA constructors (except for prompts).

1.1. Logging configuration

Configuration field shared between atria-model-gateway and atria-rag-server that enables the configuration of logs in a customizable and independent way

The logging configuration is done through a json configuration file that is set by default, as shown below.

{
  "version": 1,
  "disable_existing_loggers": false,
  "logging": {
    "handlers": {
      "hdl2": {
        "class": "logging.StreamHandler",
        "formatter": "json",
        "level": <AUTOCOMPLETED>
      }
    },
    "loggers": {
      "atria_model_gw": {
        "level": <AUTOCOMPLETED>,
        "handlers": [
          "hdl2"
        ],
        "filters": [],
        "propagate": false
      }
    },
    "root": {
      "level": <AUTOCOMPLETED>,
      "handlers": []
    }
  }
}

Fields

The main fields are explained below. For more details, see the general Python logging documentation.

Parameter	Subparameters	Definition	Type/Default values
`version`		Version of the logging configuration	number
`disable_existing_loggers`		Whether the loggers that already exist when this configuration is applied are disabled	boolean
`handlers`		Dictionary with different logging handlers. Each key is the name of a handler
	`class`	It is configured with Python logging handlers (See Python documentation)
	`formatter`	It configures the format of logs.	`json`, `string`, `console`, `simple`
	`level`	Level of the logging event. It must be one of the listed labels	`INFO`, `ERROR`, `WARN` or `DEBUG`
`loggers`		Python dictionary in which each key is a logger name and each value is a dictionary describing how to configure the corresponding logger instance
	`level`	(Optional) Level of the logger.
	`handlers`	(Optional) List with the IDs of the handlers for this logger
	`filters`	(Optional) List with the IDs of the filters for this logger
`root`		Configuration for the root logger.
	`level`	(Optional) Level of the logger.
	`handlers`	(Optional) List with the IDs of the handlers for this logger
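
The handlers, loggers and root fields follow the dictionary schema of the Python logging module. The sketch below applies an equivalent dictionary with `logging.config.dictConfig`. It is an illustration only: the `simple` formatter and the concrete levels are assumptions, since the ATRIA file uses its own `json` formatter and autocompleted levels, and it nests these fields under a `logging` key.

```python
import logging
import logging.config

LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    # Assumed formatter; ATRIA ships its own "json" formatter.
    "formatters": {
        "simple": {"format": "%(asctime)s %(levelname)s %(name)s %(message)s"}
    },
    "handlers": {
        "hdl2": {
            "class": "logging.StreamHandler",
            "formatter": "simple",
            "level": "INFO",
        }
    },
    "loggers": {
        "atria_model_gw": {
            "level": "INFO",
            "handlers": ["hdl2"],
            "filters": [],
            "propagate": False,
        }
    },
    "root": {"level": "WARNING", "handlers": []},
}

logging.config.dictConfig(LOGGING_CONFIG)
logger = logging.getLogger("atria_model_gw")
logger.info("logging configured")
```

Because `propagate` is false, records from `atria_model_gw` are emitted once by `hdl2` and are not passed on to the root logger.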

1.2. atria-model-gateway default configuration

This section includes the parameters configured by default in atria-model-gateway:

Defaults

General-purpose field with parameters to define the behavior of atria-model-gateway

Defaults fields

Parameter	Subparameters	Definition	Type/Default values
`session_params`		(Optional) Default values for a session	object
	`window`	(Optional) Session window	number
	`timeout`	(Optional) Session expiration time	number
`service_params`		(Optional) Default values for the server	object
	`preflight_max_age`	(Optional) Preflight max age	number
`messages`		(Optional) Message options	object
	`types`	(Optional) Types of messages.	list[string]
`openai_proxy`		Activate OpenAI proxy	boolean
`trimmer`		(Optional) Expression to trim the response	string

If the timeout is 0, the last conversation in the session will not be saved, but the session history will be used.

Defaults by default

The default configuration is described as follows:

defaults:
  # Default values for a session
  session_params:
    window: 2
    timeout: 3600

  # Default values for the server
  service_params:
    preflight_max_age: 86400

  # Message options
  messages:
    types:
      - feedback

  # Activate openai proxy
  openai_proxy: false

Redis

This section includes the Redis connection configuration for atria-model-gateway.

Redis fields

Parameter	Definition	Type/Default values
`connection_mode`	(Mandatory) Connection mode	`single`, `sentinel`, `cluster`
`pool_size`	(Mandatory) Pool size	number
`database`	(Mandatory) Database	number
`password`	(Mandatory) Password	string
`uri`	(Mandatory) URI name	string
`prefix`	(Mandatory) Prefix	string
`sleep_time`	(Optional) Sleep time	number
`max_retries`	(Optional) Maximum number of retries	number

Redis by default

The default configuration for Redis is described as follows:

redis:
  connection_mode: <AUTOCOMPLETED>
  pool_size: 100
  database: <AUTOCOMPLETED>
  password: <AUTOCOMPLETED>
  uri: <AUTOCOMPLETED>
  prefix: <AUTOCOMPLETED>

Redis Subscriber

This section includes the Redis event subscriber connection configuration for atria-model-gateway.

Redis subscriber fields

Parameter	Definition	Type/Default values
`connection_mode`	(Mandatory) Connection mode	`single`, `sentinel`, `cluster`
`pool_size`	(Mandatory) Pool size	number
`database`	(Mandatory) Database	number
`password`	(Mandatory) Password	string
`uri`	(Mandatory) URI name	string
`prefix`	(Mandatory) Prefix	string
`sleep_time`	(Optional) Sleep time	number
`max_retries`	(Optional) Maximum number of retries	number
`channels`	List of channels to subscribe to	list[string]

Redis subscriber by default

The default configuration for the Redis subscriber is described as follows:

redis_subscriber:
  connection_mode: <AUTOCOMPLETED>
  pool_size: 100
  database: <AUTOCOMPLETED>
  password: <AUTOCOMPLETED>
  uri: <AUTOCOMPLETED>
  prefix: <AUTOCOMPLETED>
  channels:
    - "ApplicationConfiguration"
    - "PresetConfiguration"

Config API

Field with parameters for the API configuration for atria-model-gateway

Config API fields

Parameter	Definition	Type/Default values
`base_url`	(Mandatory) API config URL	string
`api_key`	(Mandatory) APIKey	string

Config API by default

The default configuration is described as follows:

aura_config_api:
  base_url: <AUTOCOMPLETED>
  api_key:  <AUTOCOMPLETED>

Allow logging prompts with INFO level

Field that allows prompts to be logged at INFO level in atria-model-gateway. It should only be used to debug errors in environments where debug logs are not available. Because prompts can be large, set this variable back to false as soon as it is no longer needed.

Allow logging prompts

Parameter	Definition	Type/Default values
`allow_log_prompts`	Allow logging prompts	boolean

Allow logging prompts by default

The default configuration is described as follows:

allow_log_prompts: false

Models

Predefined AI models included in atria-model-gateway by default.

The model(s) to be used must be selected when configuring an application.

Model fields

Parameter	Subparameters	Definition	Type/Default values
`type`		(Mandatory) Identifier type of model	`rag`, `openai`, `mock`, `perplexity`
`name`		(Optional) Model name. If this value does not exist, `id` is used	string
`class_params`		(Mandatory) Preset description	object
	`endpoint`	(Mandatory) Endpoint of the model	string
	`type`	(Mandatory for RAG) Type of the model	`langchain`
	`path`	(Mandatory for RAG) Path of endpoint model	string
	`azure_name`	(Mandatory for OpenAI) Azure name of the model	string
	`model_name`	(Mandatory for OpenAI) Model name	string
	`api_key`	(Mandatory for OpenAI) APIkey to be used in the model call	string
	`api_version`	(Mandatory for OpenAI) API version to be used in the model call	string
	`output`	(Mandatory for mocks) Response to be used in the model call	string
`description_params`		(Optional) Description of the model params	object
	`context_window`	(Optional) Context window of model	number
`tokenizer`		(Optional) Tokenizer of model	string

Models by default

atria-rag model

Model for using the atria-rag-server.

The default configuration is described as follows:

  atria-rag:
    type: rag
    name: Rag server model
    class_params:
      type: langchain
      endpoint: <AUTOCOMPLETED>
      path: <AUTOCOMPLETED>

gpt-4

Model for using Azure OpenAI GPT-4 model.

The default configuration is described as follows:

      gpt-4:
        type: openai
        local: false
        class_params:
          azure_name: deployment_gpt-4
          model_name: gpt-4
          api_key: <AUTOCOMPLETED>
          endpoint: <AUTOCOMPLETED>
          api_version: <AUTOCOMPLETED>
          timeout:
             timeout: 60
             read: 60
        description_params:
          context_window: 300

gpt-4o

Model for using Azure OpenAI GPT-4o model.

The default configuration is described as follows:

      gpt-4o:
        type: openai
        local: false
        class_params:
          azure_name: deployment_gpt-4o
          model_name: gpt-4o
          api_key: <AUTOCOMPLETED>
          endpoint: <AUTOCOMPLETED>
          api_version: <AUTOCOMPLETED>
          timeout:
            timeout: 60
            read: 60
          description_params:
            context_window: 128000

gpt-4o-mini

Model for using Azure OpenAI GPT-4o-mini model.

The default configuration is described as follows:

      gpt-4o-mini:
        type: openai
        local: false
        class_params:
          azure_name: deployment_gpt-4o-mini
          model_name: gpt-4o-mini
          api_key: <AUTOCOMPLETED>
          endpoint: <AUTOCOMPLETED>
          api_version: <AUTOCOMPLETED>
          timeout:
            timeout: 60
            read: 60
          description_params:
            context_window: 128000

o3-mini

Model for using Azure OpenAI o3-mini model.

The default configuration is described as follows:

      o3-mini:
        type: openai
        local: false
        class_params:
          azure_name: deployment_o3-mini
          model_name: o3-mini
          api_key: <AUTOCOMPLETED>
          endpoint: <AUTOCOMPLETED>
          api_version: <AUTOCOMPLETED>
          timeout:
            timeout: 60
            read: 60
          description_params:
            context_window: 128000

gpt-4.1-nano

Model for using Azure OpenAI gpt-4.1-nano model.

The default configuration is described as follows:

gpt-4.1-nano:
  type: openai
  local: false
  class_params:
    azure_name: deployment_gpt-4.1-nano
    model_name: gpt-4.1-nano
    api_key: <AUTOCOMPLETED>
    endpoint: <AUTOCOMPLETED>
    api_version: <AUTOCOMPLETED>
    timeout:
      timeout: 60
      read: 60
    description_params:
      context_window: 128000

perplexity-sonar

Model for using the Perplexity sonar model. This model will be available in ATRIA in upcoming releases.

The default configuration is described as follows:

perplexity-sonar:
 type: perplexity
 local: false
 class_params:
   model_name: sonar
   api_key: <AUTOCOMPLETED>
   endpoint: <AUTOCOMPLETED>
   timeout:
     timeout: 20
     read: 45
   http_raise_when_retry_limit_exceeded_recognizer: false
 description_params:
   context_window: 300

Important: This model does not support the same parameters as the previous ones. See the Microsoft documentation on API & feature support.
The model does not support the following parameters: temperature, top_p, presence_penalty, frequency_penalty, logprobs, top_logprobs, logit_bias, max_tokens.

1.3. atria-rag-server default configuration

This section includes the parameters configured by default in atria-rag-server:

LLMs

Predefined parameter that defines the Large Language Models (LLMs) used for calls between atria-model-gateway and atria-rag-server.

Currently, only one LLM is defined, with the configuration needed to connect atria-model-gateway and atria-rag-server. It cannot be modified.

LLMs fields

Parameter	Definition	Type/Default values
`name`	(Optional) LLM name. If this value does not exist, `id` is used	string
`model_type`	(Mandatory) Model type	string
`endpoint`	(Mandatory) Endpoint of the model	string

LLM by default

atria-model-gateway:

  atria_model_gateway:
    name: Local Model Gateway
    model_type: llm_manager
    endpoint: http://atria-model-gw:6391/aura-services/v1/atria-model-gw

Embeddings

Parameters to define the embeddings, vector representations to find text blocks that contain the information to resolve the input request.

Two types of Embeddings are available for use:

Local Embeddings: Generated by the atria-rag-server in local mode.
Embeddings OpenAI: Generated by OpenAI.
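
To illustrate how vector representations help find the relevant text blocks, the sketch below ranks blocks by cosine similarity against a query vector. The three-dimensional vectors and the block names are toy values invented for the example, and cosine similarity is a common choice assumed here, not necessarily the metric used by atria-rag-server.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query: Sequence[float], blocks: Dict[str, Sequence[float]]) -> str:
    """Return the name of the block whose embedding is closest to the query."""
    return max(blocks, key=lambda name: cosine_similarity(query, blocks[name]))


if __name__ == "__main__":
    blocks = {
        "billing": [0.9, 0.1, 0.0],
        "roaming": [0.1, 0.9, 0.2],
    }
    print(most_similar([0.8, 0.2, 0.0], blocks))
```

Real embeddings have hundreds of dimensions, but the comparison works the same way.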

Embeddings fields

Parameter	Definition	Type/Default values
`name`	(Mandatory) Embedding name	string
`type`	(Mandatory) Type of the model	`sentence_transformer`, `azure_openai`
`model`	(Mandatory) Used model	string
`openai_api_version`	(Mandatory to call Azure OpenAI) OpenAI API version	string
`openai_api_type`	(Mandatory to call Azure OpenAI) OpenAI API type	string
`openai_api_key`	(Mandatory to call Azure OpenAI) OpenAI APIKey	string
`azure_endpoint`	(Mandatory to call Azure OpenAI) Azure endpoint	string

Embeddings by default

The predefined embeddings in atria-rag-server are shown below:

Local Sentence Transformer from HuggingFace:

This is an open-source model available in the sentence-transformers library.

It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for several tasks such as:

Clustering
Multilingual similarity searches
Retrieval-based tasks
Classification

A brief characterization of this embedding regarding different parameters is included below:

Cost: Free to use once downloaded (local execution). No API call costs.
Latency: Low, since it runs locally without external API calls.
Performance: Satisfactory for general-purpose sentence embeddings, supporting multiple languages.
Vector Length: 384 dimensions (smaller than OpenAI’s ADA model).
Hardware Requirements: Needs a GPU for faster inference; otherwise, it can be slow on a CPU.
Model Size: Requires local storage (~120MB).
Quality: Slightly lower accuracy than larger models, especially for complex NLP tasks.

This embedding can be configured with a yaml file:

local_st:
    name: Local Sentence Transformer from HuggingFace
    type: sentence_transformer
    model: paraphrase-multilingual-MiniLM-L12-v2

Distilbert-based Local Sentence Transformer from HuggingFace

This is an open-source model available in the sentence-transformers library.

It has been trained on 215M (question, answer) pairs from diverse sources.

It maps sentences and paragraphs to a 768-dimensional dense vector space and was designed for several tasks such as:

Semantic search
Question answering
Passage retrieval

A brief characterization of this embedding regarding different parameters is included below:

Cost: Free (local execution). No API call costs.
Latency: Fast, optimized for question-answer retrieval tasks.
Performance: Outperforms MiniLM in retrieval-based tasks due to DistilBERT’s training on QA data.
Vector Length: 768 dimensions (higher than MiniLM, better at capturing semantics).
Hardware Requirements: Similar to MiniLM, requires a GPU for optimal performance.
Model Size: Larger than MiniLM (~250MB).
Quality: Primarily trained for English, not as strong for multilingual applications.

This embedding can be configured with a yaml file:

test_distilbert:
    name: Distilbert-based Local Sentence Transformer from HF
    type: sentence_transformer
    model: multi-qa-distilbert-cos-v1

OpenAI Embeddings ADA

This is one of OpenAI’s models for generating embeddings and has become a common choice for tasks such as:

Recommendation systems
Chatbots
Semantic search
Large-scale applications

A brief characterization of this embedding regarding different parameters is included below:

Cost: Paid API model (depends on token usage, $0.0001/1k Tokens). It can be expensive for high-volume applications.
Latency: API calls introduce some delay, especially in large-scale real-time applications.
Performance: State-of-the-art embeddings with high accuracy for a wide range of NLP tasks.
Hardware Requirements: No local hardware requirements, it works via API.
Vector Length: 1536 dimensions (rich semantic representation).
Quality: Strong performance across multiple languages.
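
As a rough aid for sizing, the per-token price quoted above can be turned into a cost estimate. The sketch assumes the listed price of $0.0001 per 1,000 tokens still applies; the function name and the token volume in the usage example are illustrative.

```python
PRICE_PER_1K_TOKENS_USD = 0.0001  # price quoted in the characterization above


def embedding_cost_usd(total_tokens: int) -> float:
    """Estimate the embedding cost in USD for a given number of tokens."""
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS_USD


if __name__ == "__main__":
    # Embedding a corpus of two million tokens.
    print(f"{embedding_cost_usd(2_000_000):.2f} USD")
```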

This embedding can be configured with a yaml file:

text-embedding-ada-002:
  name: text-embedding-ada-002 model from Azure OpenAI API
  type: azure_openai
  model: deployment_text-embedding-ada-002
  openai_api_version: <AUTOCOMPLETED>
  openai_api_type: azure
  openai_api_key: <AUTOCOMPLETED>
  azure_endpoint: <AUTOCOMPLETED>

Redis Subscriber

This section includes the Redis event subscriber connection configuration for the atria-rag-server.

Redis subscriber fields

Parameter	Definition	Type/Default values
`connection_mode`	(Mandatory) Connection mode	`single`, `sentinel`, `cluster`
`pool_size`	(Mandatory) Pool size	number
`database`	(Mandatory) Database	number
`password`	(Mandatory) Password	string
`uri`	(Mandatory) URI name	string
`prefix`	(Mandatory) Prefix	string
`sleep_time`	(Optional) Sleep time	number
`max_retries`	(Optional) Maximum number of retries	number
`channels`	List of channels to subscribe to	list[string]

Redis subscriber by default

The default configuration for the Redis subscriber is described as follows:

redis_subscriber:
  connection_mode: <AUTOCOMPLETED>
  pool_size: 100
  database: <AUTOCOMPLETED>
  password: <AUTOCOMPLETED>
  uri: <AUTOCOMPLETED>
  prefix: <AUTOCOMPLETED>
  channels:
    - "PresetConfiguration"

Prompts

A prompt is an input instruction given to an AI model to generate a response. It guides the AI towards the required kind of output.

ATRIA defines a default prompt for each RAG stage. It is used when a specific prompt is not defined in the preset.

Prompts structure for RAG

The hierarchy of default prompts in RAG stages is shown below:

prompts  
 |___ <stage>
        |___ default
        |       |___ text
        |       |___ args
        |___ <language>
                |___ text
                |___ args

The first level in the prompts configuration is the stage of the RAG process. Each stage has its own configuration and purpose.
Prompts configuration works at language level, so it is possible to have different prompts for different languages, indicated by the language code:
- <language>: Any language prompt configuration (ISO 639-1 Code)
- default: Default prompt configuration (in a specific language)
For each language, the prompts structure must include the fields text and args:
- text: This field contains the text of the prompt that will be sent to the language model. It includes placeholders (e.g., {query}, {target_language}) that are mandatory for the prompt to work. These placeholders will be dynamically replaced with the specific values when the prompt is executed.
- args: Optional field that contains a dictionary of arguments that will be used to replace the placeholders in the text field.
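
The structure above can be read as a lookup followed by placeholder substitution. The sketch below assumes that a missing language falls back to `default` and that placeholders use Python `str.format` syntax; both are assumptions for illustration, as is the function name `render_prompt`. It covers only stages whose language entries sit directly under the stage.

```python
def render_prompt(prompts: dict, stage: str, language: str, **values: str) -> str:
    """Pick the prompt for a stage and language, then fill its placeholders."""
    stage_config = prompts[stage]
    # Assumption: fall back to the default entry when the language is not defined.
    entry = stage_config.get(language) or stage_config["default"]
    # Values passed by the caller take precedence over the configured args.
    params = {**entry.get("args", {}), **values}
    return entry["text"].format(**params)


if __name__ == "__main__":
    prompts = {
        "cleanStg": {
            "es": {"text": "Limpie la consulta: {query}"},
            "default": {"text": "Clean the query: {query}"},
        }
    }
    print(render_prompt(prompts, "cleanStg", "fr", query="how do I pay my bill"))
```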

Default prompts in RAG stages

The following stages are currently defined in RAG:

cleanStg

This stage is responsible for cleaning the user query. It ensures that the query is in a proper format before further processing.

See how to include this stage in the default prompt code here

translationStg

This stage handles the translation of the user query into the target language, if necessary.

See how to include this stage in the default prompt code here

contextStg

This stage determines the context of the user query, ensuring it is aligned with the previous conversation or context.

Default prompts in this stage:

sameContext: Configuration to check if the query is in the same context.
recreatedQuestion: Configuration to rewrite the original question. It is composed of the following prompts:
- default: Configuration for rewriting the original question.
- system: System prompt configuration.
- human: Human prompt configuration.
system: System prompt configuration.
human: Human prompt configuration.
order: Array of strings with the prompt names in the order in which they are applied.

See how to include this stage in the default prompt code here

postFilteringStg

This stage filters the retrieved documents or data to ensure relevance to the user query.

Default prompts in this stage:

relevantDocument: Configuration to check if the document is relevant.
relevantSql: Configuration to check if the SQL data is relevant.

See how to include this stage in the default prompt code here

generativeStg

This stage generates the final response using the retrieved and filtered data.

Default prompts in this stage:

stuff: Configuration for the “stuff” strategy. It is composed of the following sub-stages:
- default: Configuration for the “stuff” strategy.
- system: System prompt configuration.
- human: Human prompt configuration.
notAnswerResponse: Configuration for responses when the question cannot be answered.
informationExtraction: Configuration for extracting information. It is composed of the following prompts:
- human1: Human prompt configuration.
- ia: AI prompt configuration.
- human: Human prompt configuration.
responseConsolidation: Configuration for consolidating the response.
sqlPrompt: Configuration for generating SQL query statements.

See how to include this stage in the default prompt code here

RAG default prompt

This section includes the prompts defined by default for the ATRIA RAG capability.

You can also access the yaml file in the GitHub repository.

In case of any discrepancy between the content of this document and that on GitHub, the GitHub version shall always be considered the most up to date.


prompts:
  cleanStg:
    es:
      text: |
        A continuación hay una consulta del usuario.
        Por favor, limpie la consulta y responda solo con la pregunta del usuario o alguna charla informal.
        -------
        {query}        
    default:
      text:
        A user query follows.
        Please clean the query and respond with just the user question or small talk. The query must be written in English.
        -------
        {query}
  translationStg:
    default:
      text: |
        Translate the following question to {target_language}: {question}

        Instructions:
        1. Maintain the formal tone of the original text.
        2. Do not translate proper names and specific terms (e.g., company names, product names, countries).
        3. Provide the translation in the same format and structure as the original text.

        Translated Text:
        Finally, return the result as a unique JSON object, with the following structure:

        ```
        {{
            "source_languge": The original question language,
            "target_language": The target language,
            "translation": The translation of the question to the target_language. ,
            "possible": true|false,
            "reason": The reason why it is possible or not possible to translate the question.
        }}
        ```        
  contextStg:
    sameContext:
      default:
        text: |
          Below is a conversation followed by a question. You must determine if the question corresponds to the same context as the conversation or if it is from a different context.
          Respond only with: [SAME CONTEXT] o [DIFFERENT CONTEXT]

          Conversation:
          {memory}

          Question:
          {query}          
      es:
        text: |
          A continuación hay una conversación y seguidamente una pregunta. Debes responder si la pregunta corresponde al mismo contexto de la conversación o es una pregunta de un contexto diferente.
          Responde únicamente con: [MISMO CONTEXTO] o [DIFERENTE CONTEXTO]

          Conversación:
          {memory}

          Pregunta:
          {query}          
    recreatedQuestion:
      default:
        default:
          text: |
            Answer with just a new question or the original question.
            Rewrite the original question only if it follows the conversation. Always rewritten question in the same language as the user's question.

            Conversation:
            {memory}

            Original question:
            {query}

            Rewritten question:            
        es:
          text: |
            Responde sólamente con una nueva pregunta.
            Reescribe la pregunta original si es una continuación de la conversación. Utiliza el idioma de la peticion del usuario para rescribir la pregunta.

            Conversación:
            {memory}

            Pregunta original:
            {query}

            Pregunta reescrita:            
      system:
        default:
          text: |
            The user text contains a query, plus the previous conversation turn.
            - If the previous conversation is relevant for the current query, incorporate it into the query and produce a rewritten query
            - else just repeat the current query.

            Always rewrite the question in the same language as the user's question.            
        es:
          text: |
            El texto del usuario contiene una consulta, además del turno anterior de la conversación.

            - Si la conversación anterior es relevante para la consulta actual, incorpórala en la consulta y produce una consulta reescrita.
            - Si no es relevante, simplemente repite la consulta actual.

            Reescribe siempre la consulta en el mismo idioma en que está formulada la consulta del  usuario.            
      human:
        default:
          text: |
            Previous conversation:
            {memory}

            Current query:
            {query}

            Rewritten query:            
        es:
          text: |
            Conversación anterior:
            {memory}

            Consulta actual:
            {query}

            Consulta reescrita:            
      order: ["system", "human"]
  postFilteringStg:
    relevantDocument:
      default:
        text: |
          Below is an excerpt of text followed by a question. You must determine if the excerpt is relevant or irrelevant for answering the question.
          Respond only with: [RELEVANT] o [IGNORABLE]

          Excerpt:
          {extract}

          Question:
          {query}          
      es:
        text: |
          A continuación hay un extracto de texto y seguidamente una pregunta. Debes responder si el extracto es relevante o ignorable para responder la pregunta.
          Responde únicamente con: [RELEVANTE] o [IGNORABLE]

          Extracto:
          {extract}

          Pregunta:
          {query}          
    relevantSql:
      default:
        text: |
          Given the following question:
          `{question}`

          Is it possible to answer, using the data contain in the following table?:
          ```sql
          {sql_table_definition}
          ```


          **Explain briefly, all your decisions**.
          First, identify which tables are necessary to answer the question. Justify why you selected each of these tables.
          Use the following format:
          ```
          I need the following tables to answer the question:
            - <table_name>: <reasoning>
            - <table_name>: <reasoning>
            ...
          ```

          Then, identify which columns are necessary to answer the question. Justify why you selected each of these columns.
          Write the list of columns you identified, and the reasoning after each column, using the following format:
          ```
          I need the following columns to answer the question:
            - <table name>:
              - <column_name>: <reasoning>
              - <column_name>: <reasoning>
              ...
            - <table_name>:
              - <column_name>: <reasoning>
              - <column_name>: <reasoning>
              ...
            ...
          ```

          Then, tell if the tables and columns you identified are enough to answer the question.
          Write the answer using the following format:
          ```
          Possible to answer the question using the former columns:
            - <reasoning>
            - Result: <Yes|No>
          ```

          Then, explain, step by step, how you would write the SQL query to answer the question, using the columns you identified.
           **Use the full qualified names of the columns**. **DO NOT USE THE `JSON_OBJECT` FUNCTION IN THE QUERY**.

          Finally, tell if the question can be answered using this format:

          ```
          {{
              "possible": true|false,
              "reason": The reason why it is possible or not possible to answer the question.
          }}
          ```          
  generativeStg:
    stuff:
      default:
        default:
          text: |
            Use the following context extractions to answer the question at the end.

            Contexto:
            {context}

            If the extracted context do not contain the answer avoid coming up with an answer, and response you do not have information for answering and kindly invite the user to make a new question.

            Question:
            {question}

            Never include information by your own using your own knowledge.
            {extra_prompt}            
        es:
          text: |
            Utilice el siguiente contexto que ha sido extraido  para responder la pregunta del final.

            Contexto:
            {context}

            Usando esta información, responde a la pregunta del usuario.
            Si la información no contiene la respuesta evita firmemente responder, di que desconoces la respuesta e invita educadamente al usuario a que formule una nueva pregunta.

            Pregunta:
            {question}

            Nunca incluyas información utilizando tus propios conocimientos.
            {extra_prompt}            
      system:
        default:
          text: |
            Respond in language {user_query_language}.

            Question:
            {question}            
          args:
            user_query_language: "#.auto.language.user_query"
        es:
          text: |
            Responde en el idioma {user_query_language}.

            Pregunta:
            {question}            
          args:
            user_query_language: "#.auto.language.user_query"
      human:
        default:
          text: |
            You are going to generate an answer for a user question or query.
            To generate the answer, take always into account all the information available in the context provided.

            Context:
            {context}

            Question:
            {question}

            Never include information by your own using your own knowledge.
            {extra_prompt}            
        es:
          text: |
            Vas a generar una respuesta para una pregunta o consulta del usuario.
            Para generar la respuesta, ten siempre en cuenta toda la información disponible en el contexto proporcionado.

            Pregunta:
            {question}

            Contexto:
            {context}

            Nunca incluyas información utilizando tus propios conocimientos.
            {extra_prompt}            
      order: ["system", "human"]
    notAnswerResponse:
      default:
        text: |
          You are a question answering agent. You have tried to answer this question: {query}
          However you do not have information to answer this.
          Please, tell the user that you are not able to answer, apologize and invite the user to make other question.
          Avoid any harmful answer, such as sexual, rude, sexist or racist.
          Respond in language {user_query_language}.

          User question:
          {query}          
        args:
          user_query_language: "#.auto.language.user_query"
      es:
        text: |
          Eres un agente de respuesta a preguntas. Has intentado responder a esta pregunta: {query}
          Sin embargo, no tienes información para responder a esto.
          Por favor, dile al usuario que no puedes responder, discúlpate e invita al usuario a hacer otra pregunta.
          Evita cualquier respuesta dañina, como sexual, grosera, sexista o racista.
          Responde en el idioma {user_query_language}.

          Pregunta del usuario:
          {query}          
        args:
          user_query_language: "#.auto.language.user_query"
    informationExtraction:
      default:
        default:
          text: |
            The original question is this: {question}
            We have provided a previous answer: {existing_answer}
            Only if necessary, refine the answer exclusively with the context below.
            ------------
            {context_str}
            ------------
            Given the new context, refine the original answer to improve the quality of the response.
            If the context is useless, respond with the exact words of the original answer.
            {extra_prompt}            
        es:
          text: |
            La pregunta original es esta: {question}
            Hemos proporcionado una respuesta previa: {existing_answer}
            Sólo si es necesario refina la respuesta exclusivamente con el contexto a continuación.
            ------------
            {context_str}
            ------------
            Dado el nuevo contexto, refina la respuesta original para mejorar la calidad de la respuesta.
            Si el contexto es inútil responde con las mismas palabras de la respuesta original.
            {extra_prompt}            
      human1:
        default:
          text: "{question}"
        es:
          text: "{question}"
      ia:
        default:
          text: "{existing_answer}"
        es:
          text: "{existing_answer}"
      human:
        default:
          text: |
            Refine the existing answer only if necessary, exclusively with the context below.
            ------------
            {context_str}
            ------------
            Given the new context, refine the original answer to improve the quality of the response.
            If the context is useless, respond with the exact words of the original answer.
            {extra_prompt}            
        es:
          text: |
            Refina la respuesta existente, sólo si es necesario, exclusivamente con el contexto a continuación.
            ------------
            {context_str}
            ------------
            Dado el nuevo contexto, refina la respuesta original para mejorar la calidad de la respuesta.
            Si el contexto es inútil responde con las mismas palabras de la respuesta original.
            {extra_prompt}            
      order: ["human1", "ia", "human"]
    responseConsolidation:
      default:
        default:
          text: |
            Below I provide you a context.
            ---------------------
            {context_str}
            ---------------------

            Given exclusively the context, and without using any prior knowledge, respond with a single sentence to the question:
            {question}

            {extra_prompt}            
        es:
          text: |
            A continuación te doy un contexto.
            ---------------------
            {context_str}
            ---------------------

            Dado exclusivamente el contexto, y sin usar ningún conocimiento previo responde con una única frase a la pregunta:
            {question}

            {extra_prompt}            
      system:
        default:
          text: |
            Below I provide you a context.
            ---------------------
            {context_str}
            ---------------------

            Given exclusively the context, and without using any prior knowledge, respond with a single sentence to the question:
            {question}

            {extra_prompt}            
        es:
          text: |
            A continuación te doy un contexto.
            ---------------------
            { context_str }
            ---------------------

            Dado exclusivamente el contexto y sin usar ningún conocimiento previo responde con una única frase a cualquier pregunta.

            { extra_prompt }            
      human:
        default:
          text: "{question}"
        es:
          text: "{question}"
      order: ["system", "human"]
    sqlPrompt:
      default:
        text: |
          Generate a SQL query statement to answer the following question:
          `{question}`

          Use the data contained in the following table, as defined in SQL:
          ```sql
          {sql_table_definition}
          ```

          The following tables, containing auxiliary information, are also available:
          ```sql
          CREATE TABLE D_CBD_Static_Geo_Area_v6 (GEO_AREA_ID VARCHAR, CBD_GEO_AREA_LEVEL1_ID VARCHAR, CBD_GEO_AREA_LEVEL2_ID VARCHAR, CBD_GEO_AREA_LEVEL3_ID VARCHAR, CBD_GEO_AREA_LEVEL4_ID VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);
              COMMENT ON TABLE D_CBD_Static_Geo_Area IS 'Geographical areas. This table contains foreign keys to the different levels of geographical areas. In particular, it contains the foreign keys to these tables: CBD_Static_Geo_Area_Level1, CBD_Static_Geo_Area_Level2, CBD_Static_Geo_Area_Level3, CBD_Static_Geo_Area_Level4. Therefore, this tables is used, via JOIN, to query the geographical information contained in the different levels of geographical areas. For instance, if you have a table T with a field GEO_AREA_ID and you need to check whether this location corresponds to the region of Asturias you will need to look for GEO_AREA_ID in this table, then extract the CBD_GEO_AREA_LEVEL4_ID and query the table CBD_Static_Geo_Area_Level4 to get the name of the region.';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area.GEO_AREA_ID IS 'Identifier of the geographical area considered. FORMAT: string containing a numerical code. This field does not contain location names.';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL1_ID IS 'Identifier of the geographical area Level 1 (max level of detail: CP or similar). FORMAT: string containing a numerical code. This field does not contain location names.';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL2_ID IS 'Identifier of the geographical area Level 2 (City/Town). FORMAT: string containing a numerical code. This field does not contain location names.';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL3_ID IS 'Identifier of the geographical area Level 3 (Province). FORMAT: string containing a numerical code. This field does not contain location names.';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL4_ID IS 'Identifier of the geographical area Level 4 (State/Region). FORMAT: string containing a numerical code. This field does not contain location names.';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area.EXTRACTION_TM IS 'Date-time of the record';

          CREATE TABLE D_CBD_Static_Geo_Area_Level2_v6 (CBD_GEO_AREA_LEVEL2_ID VARCHAR, GEO_AREA_LEVEL_DES VARCHAR, CBD_GEO_AREA_LEVEL3_ID VARCHAR, LONGITUDE_LON_CO DOUBLE, LATITUDE_LAT_CO DOUBLE, GEO_AREA_ID VARCHAR, GEO_STD_AREA_CD VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);
              COMMENT ON TABLE D_CBD_Static_Geo_Area_Level2 IS 'Geographical area level 2 (State)';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.CBD_GEO_AREA_LEVEL2_ID IS 'Identifier of the geographical area Level 2 (City/Town). FORMAT: string containing a numerical code.';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.GEO_AREA_LEVEL_DES IS 'Description associated to the identifier level 2. FORMAT: alphanumeric string containing the name of the city/town.';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.CBD_GEO_AREA_LEVEL3_ID IS 'Identifier of the geographical area Level 3 (Province)';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.LONGITUDE_LON_CO IS 'Longitude coordinates (in WGS84) associated with level 2';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.LATITUDE_LAT_CO IS 'Latitude coordinates (in WGS84) associated with level 2';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.GEO_AREA_ID IS 'Identifier of the geographical area considered. FORMAT: string containing a numerical code.';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.GEO_STD_AREA_CD IS 'Standard code of the geo area';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.EXTRACTION_TM IS 'Date-time of the record';

          CREATE TABLE D_CBD_Static_Geo_Area_Level3_v6 (CBD_GEO_AREA_LEVEL3_ID VARCHAR, GEO_AREA_LEVEL_DES VARCHAR, CBD_GEO_AREA_LEVEL4_ID VARCHAR, LONGITUDE_LON_CO DOUBLE, LATITUDE_LAT_CO DOUBLE, ISO_3166_2_CD VARCHAR, GEO_AREA_ID VARCHAR, GEO_STD_AREA_CD VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);
              COMMENT ON TABLE D_CBD_Static_Geo_Area_Level3 IS 'Geographical area level 3 (Region)';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.CBD_GEO_AREA_LEVEL3_ID IS 'Identifier of the geographical area Level 3 (Province). FORMAT: string containing a numerical code.';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.GEO_AREA_LEVEL_DES IS 'Description associated to the identifier level 3. FORMAT: alphanumeric string containing the name of the province.';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.CBD_GEO_AREA_LEVEL4_ID IS 'Identifier of the geographical area Level 4 (State/Region). FORMAT: string containing a numerical code.';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.LONGITUDE_LON_CO IS 'Longitude coordinates (in WGS84) associated with level 3';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.LATITUDE_LAT_CO IS 'Latitude coordinates (in WGS84) associated with level 3';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.ISO_3166_2_CD IS 'ISO 3166-2 associated';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.GEO_AREA_ID IS 'Identifier of the geographical area considered. FORMAT: string containing a numerical code.';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.GEO_STD_AREA_CD IS 'Standard code of the geo area';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.EXTRACTION_TM IS 'Date-time of the record';

          CREATE TABLE D_CBD_Static_Geo_Area_Level4_v6 (CBD_GEO_AREA_LEVEL4_ID VARCHAR, GEO_AREA_LEVEL_DES VARCHAR, LONGITUDE_LON_CO DOUBLE, LATITUDE_LAT_CO DOUBLE, HASC_1_CD VARCHAR, GEO_AREA_ID VARCHAR, GEO_STD_AREA_CD VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);
              COMMENT ON TABLE D_CBD_Static_Geo_Area_Level4 IS 'Geographical area level 4 (min. Detail)';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.CBD_GEO_AREA_LEVEL4_ID IS 'Identifier of the geographical area Level 4 (State/Region). FORMAT: string containing a numerical code.';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.GEO_AREA_LEVEL_DES IS 'Description associated to the identifier level 4. FORMAT: alphanumerical string containing the name of the state/region. EXAMPLE VALUES: ''Asturias'', ''Andaluc\u00eda'', etc.';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.LONGITUDE_LON_CO IS 'Longitude coordinates (in WGS84) associated with level 4';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.LATITUDE_LAT_CO IS 'Latitude coordinates (in WGS84) associated with level 4';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.HASC_1_CD IS 'Hierarchical administrative subdivision codes ';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.GEO_AREA_ID IS 'Identifier of the geographical area considered. FORMAT: string containing a numerical code.';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.GEO_STD_AREA_CD IS 'Standard code of the geo area';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';
              COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.EXTRACTION_TM IS 'Date-time of the record';

          CREATE TABLE D_CBD_Static_Station_Type_v6 (STATION_TYPE_CD VARCHAR, TECH_LEVEL_WEIGHT_QT FLOAT, STATION_TYPE_L2_DES VARCHAR, STATION_TYPE_L1_DES VARCHAR, STATION_TYPE_L2_ORDER_NUM INT, STATION_TYPE_L1_ORDER_NUM INT, STATION_TYPE_ORDER_NUM INT, CONSCIOUS_IND BOOLEAN, EXTRACTION_TM VARCHAR);
              COMMENT ON TABLE D_CBD_Static_Station_Type IS 'Station types';
              COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_CD IS 'Device type';
              COMMENT ON COLUMN D_CBD_Static_Station_Type.TECH_LEVEL_WEIGHT_QT IS 'Associated weight for the technologic level of the home';
              COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L2_DES IS 'Station type level 2';
              COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L1_DES IS 'Station type level 1';
              COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L2_ORDER_NUM IS 'Station type order level 2';
              COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L1_ORDER_NUM IS 'Station type order level 1';
              COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_ORDER_NUM IS 'Station type order';
              COMMENT ON COLUMN D_CBD_Static_Station_Type.CONSCIOUS_IND IS 'Indicates if the related device type has energy efficiency';
              COMMENT ON COLUMN D_CBD_Static_Station_Type.EXTRACTION_TM IS 'Date-time of the record';

          CREATE TABLE D_Segment_v8 (OPERATOR_ID VARCHAR, SEGMENT_ID VARCHAR, SEGMENT_DES VARCHAR, GBL_SEGMENT_ID VARCHAR, SEGMENT_GROUP_ID VARCHAR, SEGMENT_GROUP_DES VARCHAR, EXTRACTION_TM VARCHAR);
              COMMENT ON TABLE D_Segment IS 'Classifications of the customers, attending to different segmentation criteria, for marketing and management issues, according to OB criteria and its correspondence with the global segment classification';
              COMMENT ON COLUMN D_Segment.OPERATOR_ID IS 'Global Operator Identifier (Operator acting as owner of the information present in the current entity)';
              COMMENT ON COLUMN D_Segment.SEGMENT_ID IS 'Organisational segment of the client, in the OB. FORMAT: Numerical code.';
              COMMENT ON COLUMN D_Segment.SEGMENT_DES IS 'Segment description. This is the actual name of the segment. POSSIBLE VALUES: ''NTT'', ''Residencial'', ''Pymes'', ''Residencial/SC'', ''Autonomos'', ''Operadores'', ''Grandes Clientes'', ''Residencial Prepago'', ''Telefonica'', ''Sin Clasificar'', ''Empresas''';
              COMMENT ON COLUMN D_Segment.GBL_SEGMENT_ID IS 'ID of the global segment classification';
              COMMENT ON COLUMN D_Segment.SEGMENT_GROUP_ID IS 'ID code of the segmentation group';
              COMMENT ON COLUMN D_Segment.SEGMENT_GROUP_DES IS 'Description of the segmentation group. POSSIBLE VALUES: ''0.- OPERADORES'', ''1.- U.N. Empresas'', ''2.-U.N. Gran Público'', ''3.- TELEFONICA'', ''4.- SIN CLASIFICAR''';
              COMMENT ON COLUMN D_Segment.EXTRACTION_TM IS 'Date-time of the record';
          ```

          Some of the former tables contains columns in full-qualified format. For instance, these are some examples of full-qualified columns:
          ```
          record_name.field_name
          TEC_PLAT_REC.DEVICE_ID
          record_name.subrecord_name.field_name
          TEC_PLAT_REC.TEC_PLAT_SUBCOMP_REC.DEVICE_ID
          ...
          ```
          Always use the full-qualified format when referring to columns in the tables. For instance, if you need to use the column 'TEC_PLAT_REC.DEVICE_ID', you should not refer to it as 'DEVICE_ID', but as 'TEC_PLAT_REC.DEVICE_ID'.
          **Explain in detail, step by step, all your decisions**.
          If you need to filter by a higher level geographical such as a region (Comunidad Autónoma) you will need to:
          - join the `GEO_AREA_ID` field of the data table (such as `CBD_HGU_Detail_Daily`) with the `GEO_AREA_ID` field in `D_CBD_Static_Geo_Area` table
          - then join the `CBD_GEO_AREA_LEVEL4_ID` field in the `D_CBD_Static_Geo_Area` with the `CBD_GEO_AREA_LEVEL4_ID` field in the `D_CBD_Static_Geo_Area_Level4` table
          - then compare the `GEO_AREA_LEVEL_DES` field in the `D_CBD_Static_Geo_Area_Level4` table with the name of the region (e.g., 'Cantabria'), since the DESCRIPTION field does contain the actual name of the geographical area.
          **Only perform these joins if explicit filtering or grouping by geographical location is necessary**.

          First, identify which tables are necessary to answer the question. Justify why you selected each of these tables.
          Use the following format:
          ```
          I need the following tables to answer the question:
            - <table_name>: <reasoning>
            - <table_name>: <reasoning>
            ...
          ```
          Then, identify which columns are necessary to answer the question. Justify why you selected each of these columns.
          Write the list of columns you identified, and the reasoning after each column, using the following format:
          ```
          I need the following columns to answer the question:
            - <table name>:
              - <column_name>: <reasoning>
              - <column_name>: <reasoning>
              ...
            - <table_name>:
              - <column_name>: <reasoning>
              - <column_name>: <reasoning>
              ...
            ...
          ```
          Then, tell if the tables and columns you identified are enough to answer the question.
          Write the answer using the following format:
          ```
          Possible to answer the question using the former columns:
            - <reasoning>
            - Result: <Yes|No>
          ```
          Then, explain, step by step, how you would write the SQL query to answer the question, using the columns you identified. **Use the full qualified names of the columns**. **DO NOT USE THE `JSON_OBJECT` FUNCTION IN THE QUERY**.
          Finally, write the SQL query to answer the question, using the columns you identified. **DO NOT USE THE `JSON_OBJECT` FUNCTION IN THE QUERY**.
          Return the result as a unique JSON object, with the following structure:
          {{
              "result": <Write the SQL query here. **MAKE SURE THAT THE STATEMENT `SELECT JSON_OBJECT` is not used in the query and Use the full qualified names of the columns. Generate a valid SQL sentence in a single line without new line characters.**>,
              "status": "OK",
              "reason": <a reasoning explaining the query>
          }}
          If the former table does not contain the necessary data to answer the question, return the following JSON object:
          {{
              "result": null,
              "status": "ERROR",
              "reason": <a reasoning explaining the query>
          }}
          Make sure that the JSON object is correctly formatted, and can be parsed by a JSON parser.
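
The `sqlPrompt` above asks the model to reason step by step and then return a JSON object with `result`, `status`, and `reason`. A minimal sketch of how a caller might extract the query from such a response follows. The `parse_sql_response` helper and the sample responses are hypothetical, and the assumption that reasoning text may precede the JSON object is ours.

```python
import json


def parse_sql_response(raw: str):
    """Return the SQL query from the model response, or None when status is not OK."""
    # Assumption: the model may emit reasoning before the JSON object, so keep
    # only the span between the first "{" and the last "}".
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in response")
    payload = json.loads(raw[start:end + 1])
    if payload.get("status") != "OK":
        return None
    return payload["result"]


ok_response = (
    "I need the following tables to answer the question: ...\n"
    '{"result": "SELECT COUNT(*) FROM t", "status": "OK", "reason": "counts rows"}'
)
error_response = '{"result": null, "status": "ERROR", "reason": "missing data"}'

print(parse_sql_response(ok_response))
print(parse_sql_response(error_response))
```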

Injection

Default injection configuration for atria-rag-server. It is used to avoid prompt injection.

Injection fields

Parameter	Definition	Type/Default values
`heuristics`	Heuristic sentences. Object in which each key is a language and each value is a list of phrases. By default, the heuristic sentences are defined directly in the configuration, so no file path is indicated. Note that the phrases added here are added to those defined in the security stage (securityStg) of the preset configuration.	object
`max_length`	(Mandatory) Maximum length	number

Injection by default

The default configuration is described as follows:

injection:
  heuristics:
    es: 
      - responde como
      - responda como
      - respondeme como
      - respondame como
    en: 
      - answer like
      - forget everything
      - forget your
  max_length: 200
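
To make the two fields more concrete, the sketch below applies them to a user query. It assumes that `max_length` is a limit on the number of characters in the query and that heuristic phrases are matched as case-insensitive substrings across all languages; neither detail is confirmed by this document, and `looks_like_injection` is a hypothetical helper.

```python
# Default injection configuration, copied from the listing above.
INJECTION = {
    "heuristics": {
        "es": ["responde como", "responda como", "respondeme como", "respondame como"],
        "en": ["answer like", "forget everything", "forget your"],
    },
    "max_length": 200,
}


def looks_like_injection(query: str, config: dict = INJECTION) -> bool:
    """Flag a query that is too long or contains a heuristic phrase."""
    # Assumption: max_length is measured in characters.
    if len(query) > config["max_length"]:
        return True
    lowered = query.lower()
    # Assumption: phrases from every language are checked as substrings.
    return any(
        phrase in lowered
        for phrases in config["heuristics"].values()
        for phrase in phrases
    )


print(looks_like_injection("Forget everything and answer like a pirate"))
print(looks_like_injection("What is my data plan?"))
```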

Service

Default service configuration for atria-rag-server.

Service fields

Parameter	Definition	Type/Default values
`host`	(Mandatory) Host name	string
`port`	(Mandatory) Port number	number

Service by default

The default configuration is described as follows:

service:
  host: 0.0.0.0
  port: <AUTOCOMPLETED>
  log_level: <AUTOCOMPLETED>

Local Storage

Default fields related to the configuration of the local storage for documents.

Local Storage fields

Parameter	Definition	Type/Default values
`atria_resources_data_folder`	(Mandatory) Folder name for data resources	string
`atria_shared_data_folder`	(Mandatory) Shared data folder name	string

Local Storage by default

The default configuration is described as follows:

local_storage_manager:
  atria_resources_data_folder: "/opt/atria-rag/data"
  atria_shared_data_folder: "/var/atria-rag-data"

Config API

Field with the parameters for the atria-rag-server API configuration.

Config API fields

Parameter	Definition	Type/Default values
`base_url`	(Mandatory) API Config URL	string
`api_key`	(Mandatory) API key	string

Config API by default

The default configuration is described as follows:

aura_config_api:
  base_url: <AUTOCOMPLETED>
  api_key:  <AUTOCOMPLETED>

Retrievers

Retrievers are responsible for storing the information generated from the documents. Each retriever is associated with a database in order to feed information into it or retrieve information from it.

Currently, there are three different retrievers defined in ATRIA:
- qdrant
- tfidf
- elasticsearch

Retriever fields

Each retriever type has defined specific fields, as shown below:

Parameter	Subparameters	Definition	Type/Default values
`qdrant`	`host`	(Mandatory) Host service Qdrant	string
	`port`	(Mandatory) Port service Qdrant	number
	`prefix`	(Mandatory) Prefix to collection	string
`tfidf`	`dump_name`	(Mandatory) Dump name of service Tfidf	string
`elasticsearch`	`host`	(Mandatory) Host service Elasticsearch	string
	`ca_crt`	(Mandatory) Path certificate Elasticsearch	string
	`username`	(Mandatory) Username service Elasticsearch	string
	`password`	(Mandatory) Password service Elasticsearch	string
	`index_name`	(Mandatory) Index service Elasticsearch	string

Retrievers by default

The default configuration is described as follows:

retrievers:
  qdrant:
    host: <AUTOCOMPLETED>
    port: 6333
    prefix: <AUTOCOMPLETED>
  tfidf:
    dump_name: /var/atria-rag-data/tfidf/dump/

Metadata

Parameter related to the configuration of metadata in atria-rag-server

It is used to set up how metadata is used when providing responses. The retrieving operation produces a list of candidates, each of which may provide a dictionary of metadata. The metadata is used to filter the candidates and provide additional information in the response.

Metadata fields

Parameter	Subparameters	Definition	Type/Default values
`map`	`filetype`	(Optional) Type of file, typically used to specify the format	string
	`page_number`	(Optional) Page number. It could be used to identify particular pages	string
`group-by`		(Optional) Group by field names.	string
`aggregate`		(Optional) Determines how the values of duplicated fields are consolidated during grouping	string
`output_filter`		(Optional) List of fields to be displayed in the metadata	List of string
`root`		(Optional) Primary fields that will structure the final output of the metadata processing	List of string

Metadata by default

The default configuration for metadata is described as follows:

metadata:
  map:
    filetype: content-type
    page_number: page-number
  group-by: url
  aggregate: page-number
  output_filter:
    - title
    - url
    - content-type
    - page-number
    - _zxcv
  root:
    - title
    - url
    - content-type
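
To make the interplay of these fields more concrete, here is a hedged Python sketch of one way the default configuration could be applied to a list of candidates. It is an interpretation of the field descriptions, not the actual atria-rag-server code: the function name `refine_metadata`, the order of the steps and the choice to aggregate values into a list of distinct values are all assumptions.

```python
# Illustrative only: the real processing in atria-rag-server may differ.
METADATA = {
    "map": {"filetype": "content-type", "page_number": "page-number"},
    "group-by": "url",
    "aggregate": "page-number",
    "output_filter": ["title", "url", "content-type", "page-number", "_zxcv"],
    "root": ["title", "url", "content-type"],
}


def refine_metadata(candidates, config=METADATA):
    """Apply map, group-by, aggregate, output_filter and root, in that order."""
    group_field = config["group-by"]
    agg_field = config["aggregate"]
    grouped = {}
    for raw in candidates:
        # map: rename source attribute names to the configured names.
        item = {config["map"].get(k, k): v for k, v in raw.items()}
        group_key = item.get(group_field)
        if group_key not in grouped:
            grouped[group_key] = {**item, agg_field: []}
        # aggregate: collect the distinct values of the aggregated field.
        value = item.get(agg_field)
        if value is not None and value not in grouped[group_key][agg_field]:
            grouped[group_key][agg_field].append(value)

    results = []
    for entry in grouped.values():
        # output_filter: keep only the listed fields.
        kept = {k: v for k, v in entry.items() if k in config["output_filter"]}
        # root: listed fields stay at the top level, the rest go under "metadata".
        top = {k: v for k, v in kept.items() if k in config["root"]}
        nested = {k: v for k, v in kept.items() if k not in config["root"]}
        if nested:
            top["metadata"] = nested
        results.append(top)
    return results
```

Under these assumptions, two candidates from pages 1 and 3 of the same URL collapse into a single entry whose `metadata` holds `{"page-number": [1, 3]}`, with `title`, `url` and `content-type` kept at the top level.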

Language identification

Parameter related to the configuration of Language Identification in atria-rag-server

It is used to identify the language of the user’s question. The result is a dictionary containing the detected language in ISO 639-3 format and its corresponding conversion.
In addition to language identification, the user’s question is preprocessed at this stage: special characters that may cause recognition errors, such as line breaks, are removed. In case of error, the default language is returned.

Language identification is performed with the fastText library.

Language identification fields

Parameter	Definition	Type/Default values
`language_default`	(Optional) Language in ISO 639-1 format (two letters). For example: `es`	string
`score_threshold`	(Optional) Score threshold used to respond in the identified language or in the default language. For example: `0.85`	float
`model_path`	(Mandatory) Model path. For example: `/opt/atria-fasttext/fasttext_model.bin`	string
`chars_to_clean`	(Optional) Characters to be cleaned. Default: `['\n']`	list of string

Language Identification by default

The default configuration for language identification is described as follows:

language_identification:
  score_threshold: <AUTOCOMPLETED>
  language_default: <AUTOCOMPLETED>
  model_path: "/opt/atria-fasttext/fasttext_model.bin"
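
The threshold and fallback behaviour described in this section can be sketched as follows. This is a simplified illustration under stated assumptions: `predict` is a hypothetical callable standing in for the fastText model (it returns a language code and a score), the function returns a plain string rather than the dictionary mentioned above, and `es` and `0.85` are just the example values from the fields table.

```python
def identify_language(question, predict, language_default="es",
                      score_threshold=0.85, chars_to_clean=("\n",)):
    """Return the detected language, or the default when detection is unreliable."""
    cleaned = question
    for char in chars_to_clean:
        # Assumption: replace with a space so words split by a line break stay apart.
        cleaned = cleaned.replace(char, " ")
    try:
        language, score = predict(cleaned)
    except Exception:
        # In case of error, the default language is returned.
        return language_default
    # Below the threshold, fall back to the default language.
    return language if score >= score_threshold else language_default
```

Passing the model as a parameter keeps the sketch runnable without a model file and makes the fallback logic easy to test in isolation.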

6.2 - Create and configure a preset

Create and configure a preset

Guidelines for the configuration of ATRIA by use cases constructors when developing an experience by means of a preset

These guidelines correspond to a specific stage in the processes for building experiences using Generative AI or RAG, which are fully explained in the Build e2e experiences section.

Introduction

A preset is a configurable entity that defines the instructions to work with the AI model for the resolution of a use case.

These instructions include, apart from other parameters, the prompt with text to guide the AI model with the generation of the response. For example:

“Maintain the formal tone of the original text”
“If the previous conversation is relevant for the current query, incorporate it into the query and produce a rewritten query”

When developing an experience in ATRIA, use cases constructors must configure a preset for the specific ATRIA application to be used.

ATRIA use cases constructors can use the currently available default presets or they can modify them or create new ones via API.

In both scenarios, a further step is required to include the preset in the application.

Guidelines to configure a preset

1. Create a new preset

Build the preset for your use case (json file), using the available preset fields.
Do you get lost with all the preset configuration parameters? In best practices for ATRIA configuration, you can find the most commonly used parameters by experiences constructors grouped by their purpose (“I want to increase security”, “I want to activate the multi-language feature”, etc.)

When the preset json file is generated, execute this command to include it:

  curl --location --request POST 'https://svc-<env>.auracognitive.com/aura-services/v2/configuration/presets/' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header 'Authorization: APIKEY XXX' \
    --data-raw '<NEW PRESET JSON>'

1.1. Modify/update a preset

If modifications are required once the preset has been created, follow these instructions:

Make the required changes in the preset json file using the available preset fields.

When the preset is modified, execute this command to include it:

  curl --location --request PUT 'https://svc-<env>.auracognitive.com/aura-services/v2/configuration/presets/<presetID>' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: APIKEY XXX' \
  --data '<PRESET JSON WITH MODIFICATIONS>'

1.2. Delete a preset

Execute the following command:

  curl --location --request DELETE 'https://svc-<env>.auracognitive.com/aura-services/v2/configuration/presets/<presetId>' \
  --header 'Accept: application/json' \
  --header 'Authorization: APIKEY XXX'
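
For scripted workflows, the create, update and delete calls above can also be prepared from Python. The sketch below only builds the request objects with the standard library and does not send anything. The helper name `preset_request` and the environment value `dev` used in the usage note are hypothetical; the URL pattern and headers are taken from the curl commands.

```python
import json
import urllib.request

BASE = "https://svc-{env}.auracognitive.com/aura-services/v2/configuration/presets/"


def preset_request(method, env, api_key, preset=None, preset_id=None):
    """Build (but do not send) a request for the presets endpoint."""
    url = BASE.format(env=env) + (preset_id or "")
    headers = {
        "Accept": "application/json",
        "Authorization": f"APIKEY {api_key}",
    }
    data = None
    if preset is not None:
        # POST and PUT carry the preset as a JSON body.
        data = json.dumps(preset).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

For example, `preset_request("POST", "dev", "XXX", preset=my_preset)` prepares the creation call, and `preset_request("DELETE", "dev", "XXX", preset_id="<presetId>")` prepares the deletion. Passing the result to `urllib.request.urlopen` would actually send it.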

2. Include the preset in the application

An application is defined as an entity that allows the connection of a channel, service or skill with ATRIA.

If the application for your use case does not exist, you must first create it following the guidelines for the configuration of an application.

Once the application is created, assign the created preset. Two scenarios arise here:

2.1. If an existing preset is modified

Get the list of presets assigned to the application to be used from aura-configuration-api:

  curl --location 'https://svc-<env>.auracognitive.com/aura-services/v2/configuration/applications/<applicationID>' \
  --header 'Authorization: APIKEY XXX'

Check if your preset is already included in the list and, consequently, associated to your application.
If not, declare the created preset in the application following the guidelines for the configuration of an application: Use Generative AI/RAG, within the field presets.

2.2. If a new preset is created

Update aura-configuration-api to indicate to the application the complete list of presets to be used.

It is necessary to include the entire list of presets associated to the application (the existing presets and the created/modified ones)

    curl --location --request PATCH 'https://svc-<env>.auracognitive.com/aura-services/v1/applications/:applicationId' \
    --header 'Accept: application/json' \
    --header 'Authorization: APIKEY XXX' \
    --data '{
        "id": "<applicationId>",
        "models": {
            "level": <levelType>, 
            "presets": [
                <complete-new-list-of-presets>
            ]
        }
    }'

The level field, which indicates the different levels of access to the application, can only be changed by the Global Team. This command is a specific scenario in the process of modifying API configuration, described in the document Hot swapping of Aura applications configuration.

Declare the created preset in the application following the guidelines for the configuration of an application: Use Generative AI/RAG, within the field presets.
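
Because the PATCH call replaces the whole list, it helps to build the body from the presets already assigned plus the new ones. The following Python sketch shows one way to do that. It assumes that presets are referenced by their identifiers as strings and that `level` is passed through unchanged; the helper name and the sample values in the usage note are hypothetical.

```python
def build_presets_patch(application_id, level, current_presets, new_presets):
    """Return the PATCH body with the complete, de-duplicated list of presets."""
    merged = list(current_presets)
    for preset in new_presets:
        # Keep the existing order and avoid declaring the same preset twice.
        if preset not in merged:
            merged.append(preset)
    return {
        "id": application_id,
        # The level is not modified here; it is only echoed back.
        "models": {"level": level, "presets": merged},
    }
```

For instance, `build_presets_patch("app-1", "basic", ["p1", "p2"], ["p2", "p3"])` yields a body whose `presets` list is `["p1", "p2", "p3"]`.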

Preset fields

The fields for the characterization of a preset are summarized below, which are defined in the API swagger Aura Configuration API Preset.

If there is any discrepancy between the parameter definitions included in this document and those in the API swagger, the definitions established in the API shall prevail.

id: Mandatory. Preset identifier. The type is string.
name: Mandatory. Preset name. If this value does not exist, id is used. The type is string.
description: Optional. Preset description. If this value does not exist, id is used. The type is string.
group: Mandatory. This parameter is used to group requests regarding the AI technologies used to generate KPIs. The type is string. Feasible values: simple_ai (Generative AI preset) and enriched_ai (RAG preset).
session: Optional. Parameters for session configuration.
- window: Optional. The size of the session window, in queries. The type is number.
- timeout: Optional. The time in seconds after which the session will be closed if no queries are received. If it is 0, the session history will be used, but the current interaction will not be saved. The type is number.
generative: Mandatory if Generative AI is used. It indicates the use of Generative AI in the use case. If this field exists, the rag field must not exist.
- model: Mandatory. Model configuration.
  - id: Mandatory. Unique identifier of the model
  - parameters: Optional. Dictionary with all possible parameters for the model. For generative, check them here.
- injectionMaxLength: Optional. Maximum length of the user input. The type is number.
- prompts: Optional. Parameters to define the prompts with instructions used as input by the AI model to automatically generate responses.
  . The object may include properties such as text, additional parameters, and specific configurations to control the behavior of the generative model.
  . If no prompt is defined, the resolution of the use case is entirely delegated to the LLM model.
  - template: Optional. Template that includes the user’s input. It must include {MSG} for the user’s utterance. This will override (or add, if not defined) the template for the user message, as defined in the preset (Note: templates allow framing the user message to mitigate prompt injection attacks). The type is string.
  - preamble: Optional. List of phrases to be included in the model prompt.
  - examples: Optional. Examples to enrich the prompt. The type is list of strings.
  - promptMaxLength: Optional. Maximum length of the completed prompt. Used to avoid calling LLMs with wrong prompts.
  - promptRegexClean: Optional. Regex pattern to clean the query before sending it to the model. This is useful to remove unwanted characters or patterns from the query. The type is string.
rag: Mandatory if RAG technology is used. It indicates the RAG configuration. If this field exists, the generative field must not exist.
- ragType: Optional. RAG type. Values: questions-answers (by default) or sql
- model: Mandatory. Parameters for the configuration of the RAG model.
  - id: Mandatory. Unique identifier of the model to be used. The type is string.
  - parameters: Optional. Dictionary with all the possible parameters for the model.
- references: Optional. Configuration for managing references in the system. It controls the number of references the system relies on to generate a response.
  - maximum: Optional. Maximum number of returned references. The type is number.
  - baseUrl: Optional. Base URL of references that will be shown to the user as part of the response. For the data types unstructured, csv and text (defined in the field loaderType), it is required to add here the path to the public URL to be shown in the response as a clickable reference.
- stages: Mandatory. Stages of the RAG model.
  - promptSystemLanguage: Optional. Parameter to select a specific language from the ones defined in the prompt. Type: string with a two-letter ISO 639-1 code. For example: es.
  - defaultUserLanguage: Optional. Parameter used in the multi-language feature. It indicates the default response language to be used if the system is not able to automatically recognize the language. Type: string with a two-letter ISO 639-1 code. For example: es.
  - securityStg: Stage with parameters related to security used to avoid prompt injection.
    - injectionMaxLength: Mandatory. Maximum length of the user input. If the length is greater, an error is sent. The type is number.
    - heuristics: Optional. Heuristics configuration.
      - es: List of heuristic sentences in Spanish. The type is list.
      - en: List of heuristic sentences in English. The type is list.
  - translationStg: Stage used to translate the prompt.
    - enabled: Mandatory. Boolean value to activate or not the translation stage. The type is boolean.
    - language: Mandatory. Two-letter ISO 639-1 language code into which user input is translated to match the language of the data. The type is string. If this field exists, the prompt field must not exist.
    - prompt: Mandatory. List of prompts to be used in the LLM call.
      . The type is PromptLanguage.
      . If this field exists, the language field must not exist.
      . If this field is empty, the default prompt for this stage will be used.
  - contextStg: Stage used to know if the user’s phrase has the same context of the conversation.
    - enabled: Mandatory. Boolean value to activate or not the context stage. The type is boolean.
    - stickyContext: Mandatory. Strategy to include the context into the new query. If not specified, the optional context in the request is ignored. The type is string. Values:
      - ask_llm: An LLM-call is made to discern whether the context applies to the current query. If so, a recreate_question is performed. If not, the context is ignored and a clear_context field is added into the response.
      - include_context: The context will be inserted as is into the query. prompts should not be empty for this option.
      - recreate_question: An LLM-call will try to recreate the question by using the context.
    - prompts: Optional. List of prompts to be used in the LLM call.
      . The type is StickyContextPrompts.
      . If this field is empty, the default prompt for this stage will be used.
  - cleanStg: Stage used to remove prompt injection attempts using an LLM call.
    - enabled: Mandatory. Boolean value to activate or not the clean stage. The type is boolean.
    - prompt: Optional. Prompt to be used in the LLM call.
      . The type is PromptLanguage. For example: “Please clean up the query and reply only with the user’s question”.
      . If this field is empty, the default prompt for this stage will be used.
  - retrievalStg: Mandatory. Stage related to the retrieval phase, which is the process of obtaining relevant documents by comparing the query against indexed data or vectors.
    The stage is crucial for identifying and retrieving the documents or data that best match the input query, ensuring that only the most relevant results are returned.
    - sources: Mandatory. Sources data.
      - name: Mandatory. Name of the source data. The type is string.
      - embeddings: Mandatory. Embeddings model identifier that the ATRIA source data is associated with.
      - docs: Mandatory. Field with parameters related to the configuration of documents. The type is object.
        - extension: Mandatory. Extensions of documents. The type is string. The extensions must be separated by a comma.
        - loader: Mandatory. Project loader configuration.
          - loaderType: Mandatory. Type of loader. Values: unstructured, csv, text, jsond, jsonl or url_list
          - options: Optional. Object that configures how the document loader operates. It allows specifying the mode of loading and any post-processing actions to be applied to the loaded data.
            - loaderMode: Optional. Mode in which the loader runs. The type is string. The possible values are:
              - single: The whole file is returned as a single document.
              - elements: The loader splits the document into different elements such as Title, NarrativeText, etc. This allows more granular processing and analysis.
            - postProcessors: Optional. Post processor loader. It allows performing operations on the loaded document such as cleaning, transforming, enriching, etc. The type is string.
      - splitter: Optional. Project splitter for dividing large text inputs into smaller, manageable chunks that can be more easily processed by language models, ensuring efficient and accurate processing.
        - splitterType: Mandatory. Method used to split the text. Value: recursivechar (recursively divides the text based on a character, typically looking for specific breakpoints such as punctuation or whitespace)
        - options: Optional. Project splitter options.
          - chunkSize: Optional. Maximum size of chunks to be returned. The type is number.
          - chunkOverlap: Optional. Overlap in characters between chunks. The type is number.
      - retrievers: Mandatory. List of retrievers used to query and retrieve relevant data or documents from a collection based on a given query.
        - retrieverType: Mandatory. Type of the retriever. Possible values: qdrant, tfidf, or elasticsearch.
        - config: Optional. Configuration parameters for retrievers. The type is dictionary.
          - numDocs: Optional. Number of documents to retrieve. The type is number.
          - loadChunkSize: Optional. Chunk size used to load the documents in qdrant. The type is number. By default, 1000.
  - postFilteringStg: Stage in charge of processing candidates before they enter the RAG chain.
    . It prompts the project LLM for each candidate, using the query and the candidate text. The LLM determines whether the candidate text is related to the query, and if not, the candidate will be filtered out.
    . If this option is not enabled, no post-processing or filtering will take place.
    - enabled: Mandatory. Boolean value to activate or not the post-filtering stage. The type is boolean.
    - candidatesPostFiltering: Mandatory. Post-retrieval filtering applied to the candidates. It must be llm_filter (for each candidate, a very short request is made to the LLM to identify whether the candidate is relevant to answer the query. If ’no’ is decided, the candidate is filtered out)
    - prompt: Optional. Prompt to be used in the LLM call.
      . The type is PromptLanguage.
      . If this field is empty, the default prompt for this stage will be used.
  - generativeStg: Stage for handling the question and answer process.
    . It defines the strategy to solve the question, the prompts used in different stages of the process and the templates for generating responses
    - ragStrategy: Optional. Strategy to combine documents to generate a response. By default, stuff:
      - stuff: Mandatory. If stuff prompt is used, ragStrategy must be set to stuff.
      - refine: Mandatory. If informationExtraction or responseConsolidation prompts are used, ragStrategy must be set to refine.
    - prompts: Optional. List of prompts to be used in the LLM call.
      . The type is GenerationPrompts.
      . If this field is empty, the default prompt for this stage will be used.
      - #.auto.language.user_query: Parameter that activates the automatic detection of language in the user’s query (multi-language feature).
        . This parameter is included in the args field of the prompt.
        . If you use the prompt by default, the multi-language feature is already activated.
        . Example:
          ...
          default:
            text: |
              Respond in language {user_query_language}. Question: {question}
            args:
              user_query_language: "#.auto.language.user_query"
          ...
- outputRefine: Optional. It is used to set up how to provide responses. The retrieving operation produces a list of candidates, each of which may provide a dictionary of metadata.
  - candidates: Optional. It indicates whether to return the candidates in raw (useful for evaluation purposes) or not. The type is boolean, by default, false.
  - filterOutputMetadata: Optional. It is used to set up how metadata is used when providing responses. The retrieving operation produces a list of candidates, each of which may provide a dictionary of metadata.
    - map: Optional. Maps attribute names in the original data to standard or more user-friendly names for later use.
      - fileType: Optional. String representing the type of file, typically used to specify the format or content type of the file being referenced. By default, content-type
      - pageNumber: Optional. String representing a page number. It could be used to identify particular pages within a document or resource. By default, page-number
    - groupBy: Optional. groupBy and aggregate are expressed in post-map field names. By default, url
    - aggregate: Optional. It determines how the values of duplicated fields are consolidated during grouping, specifying the handling of aggregated field information. By default, page-number
    - outputFilter: Optional. List of fields to be displayed in the metadata. Type is list.
    - root: Optional. Defines the primary fields that will structure the final output of the metadata processing. Fields listed under root will remain at the top level of the response entries, while all other metadata fields will be nested under a metadata field. Type is list.
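
Some of the constraints listed above lend themselves to a quick local check before a preset is sent to the API. The Python sketch below covers only a handful of them (mandatory top-level fields, the allowed `group` values, the mutual exclusion of `generative` and `rag`, `model.id`, the `{MSG}` placeholder and the mandatory `retrievalStg`). The function name and error messages are illustrative, and the API swagger remains the authoritative definition.

```python
def validate_preset(preset: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    errors = []
    for field in ("id", "name", "group"):
        if not isinstance(preset.get(field), str) or not preset[field]:
            errors.append(f"'{field}' is mandatory and must be a non-empty string")
    if preset.get("group") not in ("simple_ai", "enriched_ai"):
        errors.append("'group' must be simple_ai or enriched_ai")

    has_generative = "generative" in preset
    has_rag = "rag" in preset
    if has_generative == has_rag:
        errors.append("exactly one of 'generative' or 'rag' must be present")

    # The model identifier is mandatory in both kinds of preset.
    section = preset.get("generative") or preset.get("rag") or {}
    if not section.get("model", {}).get("id"):
        errors.append("'model.id' is mandatory")

    # The user template, when defined, must contain the {MSG} placeholder.
    template = preset.get("generative", {}).get("prompts", {}).get("template")
    if template is not None and "{MSG}" not in template:
        errors.append("'prompts.template' must include {MSG}")

    if has_rag and "retrievalStg" not in preset["rag"].get("stages", {}):
        errors.append("'rag.stages.retrievalStg' is mandatory")
    return errors
```

A check like this catches structural mistakes early, but it does not replace the validation performed by the API itself.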

Example of preset for Generative AI capability

      {
        "id": "e27ca464-488a-435d-a508-da8a262d905f",
        "name": "openai",
        "description": "openai model",
        "brand": "",
        "contact": "",
        "group": "simple_ai",
        "session": {
          "window": 0
        },
        "generative": {
          "model": {
            "id": "openai",
            "parameters": {
              "top_p": 0.9
            }
          },
          "prompts": {
              "preamble": {
                "text": "Habla como si fueras {name}",
                "args": {
                  "name": "Napoleon"
                }
              },
              "examples":[
                "Naciste en galicia",
                "Di que tu padre era gallego"
              ],
              "promptRegexClean": "[#\\n\"]+"
          }
        }
      }

Example of preset for RAG capability

    {
        "id": "1cafcb5c-7951-4645-86d4-055d3b46fe79",
        "name": "atria-rag-gpt-35-turbo",
        "group": "enriched_ai",
        "description": "Atria rag GPT 3.5",
        "session": {
            "window": 3
        },
        "rag": {
            "ragType": "questions-answers",
            "model": {
                "id": "gpt-35-turbo",
                "parameters": {
                    "max_tokens": 4000,
                    "temperature": 1,
                    "top_p": 1
                }
            },
            "references": {
                "maximum": 3,
                "baseUrl": "project-gpt-35-turbo/pdfs"
            },
            "stages": {
                "language": "en",
                "translationStg": {
                    "enabled": true,
                    "language": "en"
                },
                "contextStg": {
                    "enabled": true,
                    "stickyContext": "ask_llm"
                },
                "cleanStg": {
                    "enabled": true
                },
                "retrievalStg": {
                    "sources": {
                        "name": "project-gpt-35-turbo",
                        "embeddings": "text-embedding-ada-002",
                        "docs": [
                            {
                                "extension": "pdf",
                                "loader": {
                                    "loaderType": "unstructured",
                                    "options": {
                                        "loaderMode": "single"
                                    }
                                }
                            },
                            {
                                "extension": "txt",
                                "loader": {
                                    "loaderType": "url_list"
                                }
                            }
                        ],
                        "splitter": {
                            "splitterType": "recursivechar",
                            "options": {
                                "chunkSize": 60,
                                "chunkOverlap": 20
                            }
                        },
                        "retrievers": [
                            {
                                "retrieverType": "qdrant",
                                "config": {
                                    "loadChunkSize": 10000
                                }
                            },
                            {
                                "retrieverType": "tfidf"
                            }
                        ]
                    }
                },
                "postFilteringStg": {
                    "enabled": true
                },
                "generativeStg": {
                    "ragStrategy": "stuff"
                }
            },
            "outputRefine": {
                "candidates": false
            }
        }
    }

6.3 - Import documents into ATRIA

Import documents into ATRIA

Guidelines for importing documents and new data into ATRIA environment

Introduction

As described in General RAG: functional overview, when using RAG capability, different databases are used for lexical and semantic search.

The documents that feed these knowledge bases must be uploaded into the environment to be used in the RAG chain and updated when required. In this framework, two processes must be considered:

a. Curate data (recommended): Firstly, it is important to curate the data to be uploaded afterwards, to optimize the recognition process.
b. Import documents: Once the data is curated, the documents must be uploaded into the system. For that purpose, apart from the general method, a hot swapping process can be executed.

a. Data curation

Data curation is the process of organizing, managing, cleaning up and maintaining data to ensure it stays relevant and valuable. Good practices in this task lead to efficient recognition by the AI model.

For this purpose, we recommend following these tips, based on research and internal analysis:

1. Data selection and cleaning

Include only data relevant to the purpose of the RAG. Redundant, irrelevant or outdated information should be removed to clean up noise that does not add value.

2. Clarity and consistency in content

Be concrete and specific: Keep the information to the point. Avoid unnecessary words or complex explanations.
Avoid ambiguous messages: Avoid vague or unclear terms that could lead to confusion. Make sure the meaning is easy to interpret.
Reinforce the message: Make the message clearer by using specific terms related to the category being discussed. Use keywords strategically to reinforce the message.
Make sure procedures are clear and include all the necessary steps: Make sure each step in tutorials is fully described, logically structured and easy to follow. Avoid fragmented or disjointed instructions.
Remove unnecessary reference information: Minimize excessive details between steps that could distract or confuse the LLM. Keep the flow simple and clear.

3. Improvements in information

Add missing content: If the product includes features similar to others but with slight variations, add a sentence explaining what is and is not supported to make the LLM more accurate.
Add similar terminology: Although you cannot control what terminology people use, mentioning common alternative terms in your content can help the LLM provide more informative answers.

4. Structure and formatting

Maintain consistent formatting: Ensure all steps follow a parallel structure (similar sentence formats and style) to improve coherence.
Simplify complex tables: Avoid blank cells and ensure every cell has a complete value. Replace symbols (e.g., checkmarks) with clear text (“Yes”, “Supported”) to improve interpretation. Rewrite footnote text to add context. Move complex information in table cells out of the table.
Avoid nested content: LLMs can have difficulty with multiple levels of nesting (e.g., steps within steps). Keep content linear and simple for better understanding.
Add summaries to tutorials or long procedures: LLMs can get “lost” with long tutorials or procedures due to context window limitations. Including a summary is a simple way to enhance results.

5. Clarification and Explanation of Concepts

Easy writing: Resolve writing issues such as wordiness, passive voice, and unclear pronouns (with ambiguous references) to make text more understandable.
Explain graphics/images in text: Clearly explain conceptual graphics through text to resolve ambiguities and avoid relying on an image-to-text model.

b. Import documents

Once the data is curated, the documents must be uploaded into the system. For that purpose, the following guidelines must be followed.

Note: The RAG does not support file names that contain whitespace.

1. Upload documents in the Azure container `atria-resources`

Insert these documents in the <preset_name>/<retrievalStg.sources.name>/<retrievalStg.sources.docs[i].extension>/ folder.
Keep in mind the allowed formats for documents, set in the preset’s variable loader.loaderType.
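
As an illustration, the target folders can be derived from the preset itself. This Python sketch assumes that `<preset_name>` corresponds to the preset's `name` field and that each `docs` entry declares a single extension. Both helper names are hypothetical, and the whitespace check assumes the restriction noted above applies to file names.

```python
import posixpath


def upload_folders(preset):
    """Return one container folder per docs entry of a RAG preset."""
    sources = preset["rag"]["stages"]["retrievalStg"]["sources"]
    return [
        posixpath.join(preset["name"], sources["name"], doc["extension"]) + "/"
        for doc in sources["docs"]
    ]


def has_whitespace(filename):
    """True if the file name contains any whitespace character."""
    return any(ch.isspace() for ch in filename)
```

Applied to the RAG preset example from the preset guidelines, this gives `atria-rag-gpt-35-turbo/project-gpt-35-turbo/pdf/` and `atria-rag-gpt-35-turbo/project-gpt-35-turbo/txt/`.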

2. Configure `docs` parameter in preset

For these documents to be used in your use case, they must be included in the preset, following these instructions.

Fill in the parameters in the docs key of your preset, which is related to the configuration of documents.

Here is an example of documents configuration. In this example, documents in the preset are separated into two folders, as we are going to load two different types of data (jsonl and pdf) into this preset.

```json
{
"retrievalStg":{
    "sources":{
        "name":"project-de-faqs",
        "embeddings":"text-embedding-ada-002",
        "docs":[
            {
            "extension":"jsonl",
            "loader":{
                "loaderType":"jsonl"
            }
            },
            {
            "extension":"pdf",
            "loader":{
                "loaderType":"unstructured",
                "options":{
                    "loaderMode":"single"
                }
            }
            }
        ],
        "splitter":{
            "splitterType":"recursivechar",
            "options":{
            "chunkSize":512,
            "chunkOverlap":160
            }
        },
        "retrievers":[
            {
            "retrieverType":"qdrant"
            },
            {
            "retrieverType":"tfidf"
            }
        ]
    }
}
}
```
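
To get a feel for what `chunkSize` and `chunkOverlap` mean in practice, the sketch below splits a text with a plain fixed-size sliding window. It is deliberately simpler than the `recursivechar` splitter, which looks for natural breakpoints, and the function name is hypothetical. It only illustrates how consecutive chunks share `chunkOverlap` characters.

```python
def split_with_overlap(text, chunk_size=512, chunk_overlap=160):
    """Split text into fixed-size chunks where neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if not text:
        return []
    # Each new chunk starts this many characters after the previous one.
    step = chunk_size - chunk_overlap
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[start:start + chunk_size] for start in range(0, last_start, step)]
```

With the values from the example (512 and 160), a 1000-character text produces three chunks, and the last 160 characters of each chunk are repeated at the start of the next one.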

3. Upload list of URLs

If you use URLs as documents ("loaderType": "url_list"), you also need to upload a file with the list of URLs in the preset folder.
Separate each URL with a line break. The file must have the extension .txt.
```
http://www.url1.com
http://www.url2.com
```
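Before uploading the list, it may be worth checking that every line is a well-formed URL. The following is a minimal sketch using only the Python standard library; the function name and the rule that only absolute http(s) URLs are accepted are assumptions for illustration.

```python
from urllib.parse import urlparse


def invalid_url_lines(text: str) -> list[str]:
    """Return the non-empty lines that are not absolute http(s) URLs."""
    bad = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        parsed = urlparse(line)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            bad.append(line)
    return bad


url_list = "http://www.url1.com\nhttp://www.url2.com\n"
print(invalid_url_lines(url_list))
```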

4. Upload jsonl or jsond files

If you use jsonl or jsond files as documents ("loaderType": "jsonl" or "loaderType": "jsond"), you also need to upload the file content in the same folder with the extension .jsonl or .jsond.

Each document's content must be provided in the page_content key:

{"page_content": "test1", "metadata": {"source": "https://www.dummy1.es/"}, "type": "Document"}
{"page_content": "test2", "metadata": {"source": "https://www.dummy2.es/"}, "type": "Document"}
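A file in this shape could be produced with a few lines of Python. This is only a sketch: the helper name `to_jsonl` and the choice of (text, source URL) pairs as input are assumptions, while the keys mirror the example lines above.

```python
import json


def to_jsonl(docs: list[tuple[str, str]]) -> str:
    """Serialize (text, source_url) pairs as one JSON object per line."""
    lines = []
    for text, source in docs:
        record = {
            "page_content": text,
            "metadata": {"source": source},
            "type": "Document",
        }
        # ensure_ascii=False keeps accented characters readable in the file
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"


content = to_jsonl([
    ("test1", "https://www.dummy1.es/"),
    ("test2", "https://www.dummy2.es/"),
])
print(content, end="")
```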

5. Add project.metadata file (optional)

Scenario 1: Unstructured, csv or text data

If the loaderType is url_list, unstructured or csv, you can optionally add a file called project.metadata with relevant information about each file. This metadata will be stored in the database and is very helpful when we want to modify the source URL.

It is important that the file is correctly tabulated and does not contain any invalid characters.

The file is composed of:

Key __global__, which contains global data that affects all the files.
Names of the specific files to which this extra data applies.

It is not necessary to define metadata for all the files in the folder.

Example:

__global__:
   url: https://www.google.com
   field1: test
   field2: test
file1.txt:
   url: https://www.dummy-url.com
   title: file1 title
file2.txt:
   url: https://www.dummy-url.com
   title: file2 title
   source: test

NOTE: From all the information added to the project.metadata when creating your use case, you can select the specific sources that will be shown to the user as part of the response, adding them to the field baseURL of the preset configuration.

Scenario 2: URL or json documents

In this case, there is no need to add the project.metadata file:

"loaderType": "url_list" -> Metadata information is included in the URLs themselves, uploaded in step 3
"loaderType": "jsonl", "loaderType": "jsond" -> Metadata information is already included in the files uploaded in step 4

6. Update the data in the environment

Finally, execute the atria-rag-generate-db job to update the data in the environment.

6.4 - Create and configure an agent

Create and configure an agent

Guidelines for the configuration of ATRIA by use cases constructors when developing an experience by means of an agent

Introduction

An agent is a configuration entity in ATRIA that represents an integration point for external channels, services, or platforms.

Each agent defines how ATRIA communicates with and manages sessions for a specific external system, specifying connection details, session parameters, and operational metadata.

Agents are referenced by applications to enable channel or service connectivity within the platform.

Guidelines to configure an agent

1. Create a new agent

Build the agent for your use case (json file), using the available agent fields.

When the agent json file is generated, execute this command to include it:

curl --location --request POST 'https://svc-<env>.auracognitive.com/aura-services/v2/configuration/agents/' \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/json' \
  --header 'Authorization: APIKEY XXX' \
  --data-raw '<NEW AGENT JSON>'

1.1. Modify/update an agent

If an agent needs to be modified after it has been created, follow these instructions:

Make the required changes in the agent json file using the available agent fields.

When the agent is modified, execute this command to update it:

curl --location --request PUT 'https://svc-<env>.auracognitive.com/aura-services/v2/configuration/agents/<agentId>' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: APIKEY XXX' \
  --data '<AGENT JSON WITH MODIFICATIONS>'

1.2. Delete an agent

Execute the following command:

curl --location --request DELETE 'https://svc-<env>.auracognitive.com/aura-services/v2/configuration/agents/<agentId>' \
  --header 'Accept: application/json' \
  --header 'Authorization: APIKEY XXX'
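The three curl commands above can also be sketched in Python with the standard library. The helper below only builds the requests and does not send them; the function name, the `dev` environment value and the `XXX` API key are placeholders assumed for the example.

```python
import json
import urllib.request
from typing import Optional

BASE = "https://svc-{env}.auracognitive.com/aura-services/v2/configuration/agents/"


def agent_request(method: str, env: str, api_key: str,
                  agent: Optional[dict] = None, agent_id: str = "") -> urllib.request.Request:
    """Build (but do not send) a request for the agents configuration API."""
    headers = {"Accept": "application/json", "Authorization": f"APIKEY {api_key}"}
    data = None
    if agent is not None:
        # POST and PUT carry the agent JSON in the body
        data = json.dumps(agent).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = BASE.format(env=env) + agent_id
    return urllib.request.Request(url, data=data, headers=headers, method=method)


agent = {
    "id": "b1e2c3d4-5678-1234-9abc-def012345678",
    "name": "example-agent",
    "communication": {"communicationType": "http",
                      "endpoint": "https://agent.example.com/webhook"},
}
create = agent_request("POST", "dev", "XXX", agent=agent)
update = agent_request("PUT", "dev", "XXX", agent=agent, agent_id=agent["id"])
delete = agent_request("DELETE", "dev", "XXX", agent_id=agent["id"])
print(create.get_method(), create.full_url)
# To actually send a request: urllib.request.urlopen(create)
```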

2. Include the agent in the application

If the application for your use case does not exist, first create it following the guidelines for the configuration of an application.

Once the application is created, assign the created agent in the field agents.

If you update or delete an agent, ensure that any application referencing it is also updated accordingly.
Remember that agents must exist to be inserted in an application.

Example to update the list of agents in an application:

curl --location --request PATCH 'https://svc-<env>.auracognitive.com/aura-services/v1/applications/<applicationId>' \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/json' \
  --header 'Authorization: APIKEY XXX' \
  --data '{
    "id": "<applicationId>",
    "agents": [
      "<agentId1>",
      "<agentId2>"
    ]
  }'

Agent fields

The fields for the characterization of an agent are summarized below, as defined in the API swagger Aura Configuration ATRIA Agents:

| Field | Type | Mandatory | Description |
|---|---|---|---|
| `id` | string | Yes | Unique identifier (UUID) for the agent. |
| `name` | string | Yes | Name that uniquely identifies the agent in Aura. |
| `description` | string | No | Description of the agent. |
| `communication` | object | Yes | Parameters for the configuration of the communication flow. See communication configuration. |
| `flowConfig` | object | No | Configuration of the agent flow. |
| `deploymentName` | string | No | Name of the deployment where the agent is running. If the `endpoint` field is not present in `communication`, this field will be used to route requests to the agent. |
| `metadata` | object | No | Document metadata (version, createdAt, updatedAt, etc). See metadata. |

Communication configuration (`communication`)

| Field | Type | Mandatory | Description |
|---|---|---|---|
| `communicationType` | string | Yes | Type of communication. Only `http` is currently supported. |
| `endpoint` | string | No | HTTP endpoint where the agent listens. |
| `headers` | object | No | HTTP headers associated with the agent. |
| `timeout` | number | No | Timeout for agent communication. |
| `retries` | number | No | Number of retries for communication. |

Metadata (`metadata`)

| Field | Type | Mandatory | Description |
|---|---|---|---|
| `version` | string | No | Configuration version when the document was created. |
| `createdAt` | string | No | Creation date (ISO 8601). |
| `updatedAt` | string | No | Last update date (ISO 8601). |

Example: Minimal agent configuration

{
  "id": "b1e2c3d4-5678-1234-9abc-def012345678",
  "name": "example-agent",
  "communication": {
    "communicationType": "http",
    "endpoint": "https://agent.example.com/webhook"
  }
}

Example: Full agent configuration

{
  "id": "b1e2c3d4-5678-1234-9abc-def012345678",
  "name": "example-agent",
  "description": "Agent for integration with Example Service",
  "communication": {
    "communicationType": "http",
    "endpoint": "https://agent.example.com/webhook",
    "headers": {
      "Authorization": "Bearer <token>"
    },
    "timeout": 30,
    "retries": 3
  },
  "flowConfig": {},
  "deploymentName": "example-deployment",
  "metadata": {
    "version": "1.0.0",
    "createdAt": "2024-05-30T10:00:00Z",
    "updatedAt": "2024-05-30T12:00:00Z"
  }
}

Note:

The id, name, and communication fields are mandatory.

The communicationType must be http.

If an agent is deleted, applications referencing it will be updated.
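The mandatory-field rules in the tables and notes above can be checked before sending the agent JSON. The sketch below is a local pre-check under those rules only; the function name and error messages are hypothetical, and the API itself may apply further validation.

```python
import uuid
from typing import Any


def agent_config_errors(agent: dict[str, Any]) -> list[str]:
    """Check an agent document against the mandatory fields listed above."""
    errors = []
    for field in ("id", "name", "communication"):
        if field not in agent:
            errors.append(f"missing mandatory field: {field}")
    if "id" in agent:
        try:
            uuid.UUID(str(agent["id"]))  # the id is described as a UUID
        except ValueError:
            errors.append("id is not a valid UUID")
    comm = agent.get("communication")
    if isinstance(comm, dict) and comm.get("communicationType") != "http":
        errors.append("communication.communicationType must be 'http'")
    return errors


minimal = {
    "id": "b1e2c3d4-5678-1234-9abc-def012345678",
    "name": "example-agent",
    "communication": {"communicationType": "http",
                      "endpoint": "https://agent.example.com/webhook"},
}
print(agent_config_errors(minimal))
```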

7 - Get Kernel access token

Get Kernel access token for Aura Gateway API

Guidelines to get a Kernel access token for working with aura-gateway-api

Steps in the process

To use the Kernel aura-aiservice API, first authenticate with the client credentials, specifying the required scopes, which depend on the specific ATRIA capability to be used:
- NLP as a Service: aura-ai-services:nlp-messaging:write
- Generative AI and RAG: aura-ai-services:messaging:write
Afterwards, refresh the token following Kernel instructions.
To obtain the real secret of your app, run the following command. The examples in this section use the app “aura-bot” in Kernel “global-int-current” with a fake password.
```
$ kubectl -n $AURA_ENVIRONMENT get secret aura-bot -o json | jq -r ".data.AURA_FP_CLIENT_SECRET|@base64d"
```

Now you can request the access_token:

# generate a valid UUID as correlator
# substitute {{correlator}} with the generated UUID
export CORRELATOR={{correlator}}
# substitute aura-bot:secret with the specific information for your Kernel client.

$ curl -i -X POST -u aura-bot:secret -H 'Content-Type: application/x-www-form-urlencoded' -H 'Cache-Control: no-cache' -H "x-correlator: $CORRELATOR" 'https://auth.global-int-current.baikalplatform.com/token' -d 'scope=aura-ai-services:messaging:write&grant_type=client_credentials'

HTTP/2 200
{"access_token":"<token>","token_type":"Bearer","expires_in":3599,"scope":"aura-ai-services:messaging:write","purpose":""}

The token expires after the number of seconds given in expires_in (3599 in the example above, roughly one hour). Once it has expired, repeat the steps above to obtain a new one.
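For reference, the same token request can be sketched in Python with the standard library. The helper names are hypothetical, `aura-bot:secret` is the fake credential used in this section, and the request is only built here, not sent.

```python
import base64
import time
import urllib.parse
import urllib.request
import uuid
from typing import Optional

TOKEN_URL = "https://auth.global-int-current.baikalplatform.com/token"


def token_request(client_id: str, client_secret: str, scope: str) -> urllib.request.Request:
    """Build the client-credentials token request (equivalent to curl -u id:secret)."""
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode(
        {"scope": scope, "grant_type": "client_credentials"}
    ).encode()
    headers = {
        "Authorization": f"Basic {basic}",
        "Content-Type": "application/x-www-form-urlencoded",
        "Cache-Control": "no-cache",
        "x-correlator": str(uuid.uuid4()),  # a fresh UUID as correlator
    }
    return urllib.request.Request(TOKEN_URL, data=body, headers=headers, method="POST")


def expires_at(token_response: dict, now: Optional[float] = None) -> float:
    """Return the epoch time at which the token stops being valid."""
    start = time.time() if now is None else now
    return start + token_response["expires_in"]


req = token_request("aura-bot", "secret", "aura-ai-services:messaging:write")
print(req.get_method(), req.full_url)
# To send: urllib.request.urlopen(req), then parse the JSON body.
```

Storing the value returned by `expires_at` next to the token is one way to know when a new token has to be requested.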

See the Kernel authentication documentation for more information.

8 - Request to Aura NLP Resolution API

Guidelines for making a request to Aura NLP Resolution API

Steps to be followed to make a request to the aura-gateway-api NLP Resolution API, for using ATRIA NLP as a Service capabilities

Introduction

The use of the ATRIA AI-driven NLP as a Service capability requires making a request to the aura-gateway-api aura-nlp-resolution-api.

For this purpose, constructors must follow the steps below.

Steps in the process

The request from the application must include different fields to be properly processed by this API:

application.id or application.name: Id or name of the application that has configured the specific pipeline to be used for the resolution of the request. If this field is empty or the channel configured in the application does not exist in the Aura NLP service, an error is returned.
message: text of the message with the request to be resolved.
Authorization header: Two-legged token.

Moreover, NLP as a Service can also handle disambiguation. In this scenario, a list of options will be provided back from the Aura NLP service.

A general request and the associated response are included below:

Request

curl --location 'https://api.environment.baikalplatform.com/aura-aiservices/v1/nlp/query' \
--header 'x-correlator: <uuid>' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--header 'Authorization: Bearer {token}' \
--data '{  
  "message": "¿Cómo puedo contactar con Movistar plus?",
  "contextFilters": [],
  "application": {
      "id": "12345678-1234-5678-9a0b-abcd73f96111"
  }
}'

Response

{
  "entities": [
    {
      "entity": "7",
      "type": "faq",
      "score": 0.9489,
      "start_index": 1,
      "end_index": 1,
      "canon": "7",
      "label": "openai-embeddings"
    }
  ],
  "questionId": "7",
  "question": "¿Cómo puedo interactuar con Movistar?",
  "text": "Atención Comercial: 900104708 Atención al Cliente: 1004. Desde el extranjero +34 699 991 004 Soporte Técnico: 1002 Clientes con TV por satélite: 900104709  Clientes con sólo satélite: 900110010  Clientes Prepago: 224430  Atención Canal+: 900 220 305  [Bot de Movistar](http://www.movistar.es/#forward) en Twitter",
  "relatedQuestions": [
    {
      "questionId": "8",
      "question": "¿Cómo puedo darme de baja?",
      "text": "Para bajas de líneas móviles de contrato, puedes solicitarlo en la sección [Bajas](https://www.movistar.es/particulares/BajasIframe/?_ga=2.253697609.1756783427.1543820522-1816496064.1527850957).\r\nPara fusión o  prepago, debes llamar al 1004.\r\nPara paquetes de TV, accede a la sección TV."
    },
    {
      "questionId": "9",
      "question": "¿Dónde puedo comprar un móvil?",
      "text": "Encuentra el móvil que necesitas en la [tienda electrónica](http://www.movistar.es/particulares/movil/moviles). Si has solicitado un smartphone a través de la web o del 1004 con la opción de recogida en tienda y no recuerdas el código de bono, [recupéralo](https://www.movistar.es/atcliente/c2c/servicio-online/index.jsp?language=es&service=consulta-bono-canje)."
    }
  ]
}

Types of responses in NLP resolution API

a. Simple response

When the Aura NLP app (pipeline) recognizes an intent, a simple response with the intent and entities recognized is returned:

Request body

{  
  "message": "Hola",
  "contextFilters": [],
  "application": {
    "id": "12345678-1234-5678-9a0b-abcd73f96111"
  }
}

Response body

{
  "intent": "intent.common.greetings",
  "entities": []
}

Request body

{  
  "message": "Mi cobertura",
  "contextFilters": [],
  "application": {
    "id": "12345678-1234-5678-9a0b-abcd73f96111"
  }
}

Response body

{
  "entities": [
    {
      "entity": "16",
      "type": "faq",
      "score": 0.99,
      "start_index": 1,
      "end_index": 1,
      "canon": "16",
      "label": "openai-embeddings"
    }
  ],
  "questionId": "16",
  "question": "¿Cuál es mi cobertura?",
  "text": "Si quieres, [consulta la cobertura de ADSL y fibra](https://www.movistar.es/coberturas/ ). Para mejorar la cobertura de tu móvil en casa o en el trabajo a través de tu ADSL o fibra puedes contratar el servicio [“Mi cobertura móvil”](http://www.movistar.es/particulares/movil/servicios/ficha/nav-mi-cobertura-movil ).",
  "relatedQuestions": [
    {
      "questionId": "84",
      "question": "¿Cómo puedo reclamar la factura que me ha llegado?",
      "text": "Sentimos que no estés de acuerdo con los conceptos facturados. Por favor, para poder ayudarte, completa el siguiente [formulario](https://www.movistar.es/particulares/atencion-cliente/escribenos/?tipo=telco&tipo_directo=12-21) y el Servicio de Atención te contestará en un plazo aproximado de 48 horas."
    }
  ]
}

c. Disambiguation response

When more than one intent is recognized, an intent.disambiguation is returned with a list of options.

Request body

{  
  "message": "Mi factura",
  "contextFilters": [],
  "application": {
    "id": "12345678-1234-5678-9a0b-abcd73f96111"
  }
}

Response body

{
  "intent": "intent.disambiguation",
  "entities": [],
  "options": [
    {
      "questionId": "32",
      "question": "¿Cómo puedo ver mi factura?",
      "text": "Para consultar y descargar las facturas del último año accede a la [web](https://www.movistar.es/cliente/areaprivada/#/facturas) o en Facturas en la sección de [cuenta](https://external-account.movistar-es-dev.svc.dev.mad.tuenti.io/redirect.php?target=account-home) de la app.",
      "relatedQuestions": []
    },
    {
      "questionId": "33",
      "question": "¿Cómo puedo pagar mis facturas pendientes?",
      "text": "Si tienes alguna factura pendiente de pago, recibirás un aviso de pago con el que abonar la deuda en cualquier oficina de correos o en las oficinas bancarias indicadas. También puedes abonarla con tarjeta llamando al 1004."
    }
  ]
}

d. Filtered response

In a generic questions use case, a FAQ can have multiple answers; the most accurate one is selected according to the context filters sent in the request:

Request body

{  
  "message": "¿Cómo puedo acceder a Mi Movistar?",
  "contextFilters": ["channelName:novum-mytelco"],
  "application": {
    "id": "12345678-1234-5678-9a0b-abcd73f96111"
  }
}

Response body

{
  "entities": [
    {
      "entity": "3",
      "type": "faq",
      "score": 1,
      "start_index": 1,
      "end_index": 1,
      "canon": "3",
      "label": "openai-embeddings"
    }
  ],
  "questionId": "3",
  "question": "¿Cómo puedo acceder a Mi Movistar?",
  "contextFilters": [
    "channelName:novum-mytelco"
  ],
  "text": "Si accedes con tu móvil mediante Mobile Connect verás todos tus productos y el consumo de la línea con la que accedes.\r\nAccediendo con contraseña tendrás acceso a todos tus productos, facturas y consumo de todas tus líneas.",
  "relatedQuestions": [
    {
      "questionId": "7",
      "question": "¿Cómo puedo interactuar con Movistar?",
      "text": "Atención Comercial: 900104708 Atención al Cliente: 1004. Desde el extranjero +34 699 991 004 Soporte Técnico: 1002 Clientes con TV por satélite: 900104709  Clientes con sólo satélite: 900110010  Clientes Prepago: 224430  Atención Canal+: 900 220 305  [Bot de Movistar](http://www.movistar.es/#forward) en Twitter"
    },
    {
      "questionId": "19",
      "question": "¿Qué tengo contratado?",
      "contextFilters": [
        "channelName:novum-mytelco"
      ],
      "text": "Puedes ver los servicios y productos que tienes contratados en la [sección de cuenta](https://external-account.movistar-es-dev.svc.dev.mad.tuenti.io/redirect.php?target=account-home). Si quieres conocer el detalle de lo que incluye tu tarifa, ve a Tu Tarifa dentro de la sección de cuenta."
    }
  ]
}

Here is an example of the same request without context filters. You can see that the texts are different and the contextFilters field is not returned:

Request body

{  
  "message": "¿Cómo puedo acceder a Mi Movistar?",
  "contextFilters": [],
  "application": {
    "id": "12345678-1234-5678-9a0b-abcd73f96111"
  }
}

Response body

{
  "entities": [
    {
      "entity": "3",
      "type": "faq",
      "score": 1,
      "start_index": 1,
      "end_index": 1,
      "canon": "3",
      "label": "openai-embeddings"
    }
  ],
  "questionId": "3",
  "question": "¿Cómo puedo acceder a Mi Movistar?",
  "text": "Puedes entrar:\r\nCon tu móvil: debes tenerlo a mano para validar el acceso.\r\nO con contraseña y con tu NIF, CIF, NIE o pasaporte: si no la recuerdas puedes regenerarla.",
  "relatedQuestions": [
    {
      "questionId": "7",
      "question": "¿Cómo puedo interactuar con Movistar?",
      "text": "Atención Comercial: 900104708 Atención al Cliente: 1004. Desde el extranjero +34 699 991 004 Soporte Técnico: 1002 Clientes con TV por satélite: 900104709  Clientes con sólo satélite: 900110010  Clientes Prepago: 224430  Atención Canal+: 900 220 305  [Bot de Movistar](http://www.movistar.es/#forward) en Twitter"
    },
    {
      "questionId": "9",
      "question": "¿Dónde puedo comprar un móvil?",
      "text": "Encuentra el móvil que necesitas en la [tienda electrónica](http://www.movistar.es/particulares/movil/moviles). Si has solicitado un smartphone a través de la web o del 1004 con la opción de recogida en tienda y no recuerdas el código de bono, [recupéralo](https://www.movistar.es/atcliente/c2c/servicio-online/index.jsp?language=es&service=consulta-bono-canje)."
    }
  ]
}

e. Custom columns response

Request body

{  
  "message": "Información de la jornada 20",
  "application": {
    "id": "12345678-1234-5678-9a0b-abcd73f96111"
  }
}

Response body

{
  "entities":[
    {
      "entity":"10",
      "type":"faq",
      "score":1,
      "start_index":1,
      "end_index":1,
      "canon":"10",
      "label":"openai-embeddings"
    }
  ],
  "questionId":"10",
  "question":"Información de la jornada 20",
  "text":"La información de toda la jornada 20",
  "speak":"La información de toda la jornada veinte",
  "mainContent":"custom",
  "custom":{
    "carrusel":[
      "1",
      "2",
      "3",
      "4"
    ],
    "data_summary":[
      {
        "gol_destacado":{
          "text":"Gol jornada 23",
          "url":"https://www.youtube.com/watch?v=MgP3zDzQ0CE"
        },
        "gol_propia_puerta":{
          "url":"https://www.20minutos.es/deportes/noticia/4146386/0/insolito-golazo-olympique-propia-puerta-psg/"
        }
      },
      {
        "gol_destacado":{
          "text":"Gol jornada 24",
          "url":"https://eldesmarque.com/actualidad/futbol/primera-laliga-santander/video-resumenes-primera/1372827-madrid-atleti-resumen-en-video-del-partido-de-la-jornada-22"
        },
        "gol_propia_puerta":{
          "url":"https://www.mundodeportivo.com/futbol/20200207/473330110580/el-insolito-gol-en-propia-puerta-del-portero-de-uruguay.html"
        }
      }
    ]
  }
}
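A client consuming this API needs to tell these response shapes apart. The sketch below distinguishes them by the fields visible in the examples above; the function name and the returned labels are hypothetical, and the real API may define further shapes.

```python
def nlp_response_kind(body: dict) -> str:
    """Classify an NLP resolution response by the fields it contains."""
    if body.get("intent") == "intent.disambiguation":
        return "disambiguation"  # body["options"] holds the candidate questions
    if "custom" in body:
        return "custom"          # custom columns are returned under "custom"
    if "questionId" in body:
        return "faq"             # a single FAQ was matched
    if "intent" in body:
        return "intent"          # simple response with intent and entities
    return "unknown"


simple = {"intent": "intent.common.greetings", "entities": []}
faq = {"entities": [], "questionId": "16", "question": "¿Cuál es mi cobertura?"}
disambiguation = {
    "intent": "intent.disambiguation",
    "entities": [],
    "options": [{"questionId": "32", "question": "¿Cómo puedo ver mi factura?"}],
}

for body in (simple, faq, disambiguation):
    print(nlp_response_kind(body))
```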

9 - Best practices for prompts generation

Best practices for prompts generation

The purpose of this document is to provide complete and practical guidelines and best practices for constructors when creating a prompt for an ATRIA use case

The use of Markdown text

It is highly recommended to use Markdown as a format for prompts due to its benefits, summarized below:

Clarity and readability

Markdown allows structuring text in a clear and hierarchical way using headings, lists and other elements. This helps both humans and automated systems understand the content.
Raw text is easy to read and understand, even without rendering. This streamlines review and collaborative work.

Easy editing and maintenance

Markdown is editable in any plain text editor, without the need for specialized tools.
It can be easily modified, versioned and maintained, especially within large or distributed teams.

Error minimization

Markdown syntax is simple and minimizes common errors associated with other markup languages such as HTML.
The visual structure allows a quick identification of inconsistencies or formatting issues.

Versatility and compatibility

Markdown is a widely supported standard: it can be easily converted into HTML, PDF, or DOCX formats, presentations and more.
Additionally, it is well-suited for integration with AI tools, static site generators, document management systems and version control systems like Git.

Portability and universality

Markdown files are lightweight and portable, enabling easy use across different platforms and devices without formatting loss.
As a plain-text format, Markdown ensures content accessibility and long-term usability, regardless of future technological changes.

Effective collaboration

Markdown facilitates collaborative work in projects with different teams or people editing or reviewing simultaneously.
It works well with intuitive and useful change-control, version-control and diff tools.

Simplicity and legibility for AI

Markdown is a lightweight markup language with a minimalist plain-text formatting syntax. This makes it easier for LLMs to identify text structure than with markup languages such as HTML or XML.
Consequently, it significantly reduces misinterpretations and errors and improves processing efficiency.

In summary, using Markdown to define prompts makes them clearer, easier to edit, minimizes errors and provides great flexibility across platforms and tools.

General formatting guidelines

Sections and subsections

Organize your content: Use Markdown syntax for headers to separate the different parts of the prompt (##, ###, etc.)
Ensure clarity: Identify each section univocally for easy reading and maintenance of the prompt. In the following example, although Markdown shows a correct structure, visualization is not adequate:

Sections lists

Add lists: If a section or sub-section includes multiple elements or fields, present them as a bulleted or numbered list for better organization. For example:

Line breaks

Do not include line breaks manually (\n): The final formatting of the prompt with line breaks will be handled by the CTO team. For now, just write the content as it should appear, without adding manual line break characters.

Quotation marks

Be consistent: Use the same type of quotation marks throughout the prompt (preferably 'single' or "straight double", depending on the project requirements)
Avoid mixing styles: Do not combine straight and curly quotation marks. Although visually similar, mismatched opening and closing quotes can lead to unclosed texts.

URLs

Be careful with URLs: the above-mentioned issue related to quotation marks affects URLs, as seen in the following example.

- Disney: incorrect → the closing quote in the URL is considered as part of the link
- Dazn: correct → the closing quote in the URL is closing the field “answer” correctly

Review unusual characters: Once the prompt creation is finished, make a comprehensive review to check no invalid or wrong characters are included. This is particularly relevant if text has been copied from external sources.

Carry out a visual review: If possible, upload your prompt into a Markdown editor to review its visualization.
- Desktop version editors: Ghostwriter, VS Code, PyCharm, etc.
- Online editors: https://markdownlivepreview.com/, https://dillinger.io/, https://stackedit.io/, etc.
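Part of this review can be automated. The following sketch flags three of the issues described in this section: mixed quotation mark styles, manual `\n` sequences and non-printable characters. The function name and the exact rules are assumptions for illustration, and an apostrophe counts as a straight quote here, so the first check is only a heuristic.

```python
CURLY_QUOTES = "\u2018\u2019\u201c\u201d"  # curly single and double quotes


def prompt_issues(prompt: str) -> list[str]:
    """Return a list of formatting issues found in a prompt."""
    issues = []
    has_curly = any(ch in CURLY_QUOTES for ch in prompt)
    has_straight = any(ch in "'\"" for ch in prompt)
    if has_curly and has_straight:
        issues.append("mixed straight and curly quotation marks")
    if "\\n" in prompt:  # a backslash followed by "n", typed by hand
        issues.append("literal \\n sequence found")
    odd = sorted({ch for ch in prompt if not ch.isprintable() and ch not in "\n\t"})
    if odd:
        codes = ", ".join(f"U+{ord(ch):04X}" for ch in odd)
        issues.append(f"non-printable characters: {codes}")
    return issues


print(prompt_issues('answer: "ok\u201d'))
```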

General content guidelines

Simple and direct language

Use clear and simple expressions instead of overly elaborate ones.
- Example: “Use a wide range…” instead of “Utilize a wide range…”.
Avoid unnecessary slang or technical terms that have not been previously defined.
Add a clear definition of relevant keywords and terms used in your prompt. For example: user_type, estado_desconocido, etc.

Language consistency

Include the main structure in English (language used for technical terms such as context, keywords, answer, action, etc.) and examples in the expected language, e.g. Spanish.
Avoid mixing languages (English and Spanish) to ease reading and implementation.

Accurate examples and definitions

Examples must be grammatically and orthographically correct, with no errors in accents, capitalization or wording.
Check that fields or variables are clearly defined before they are used.
Integrate examples in the corresponding section, not in a separate one.

Grammar and syntax review

Check grammatical agreements (singular vs. plural, gender, etc.).
Be coherent with the use of pronouns. For example: If you use “user” (singular), follow the sentence by “his/her”, not “their”, or change to “users”.
You can also use impersonal structures for ease.

Homogeneous categories structure

All categories must follow a common structure, with the same fields (context, keywords, answer, action), even if any is left empty.
If there are special values (such as unknown_status), they must be explained and applied consistently.
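The homogeneous structure rule lends itself to a quick automated check. In the sketch below, the categories of a prompt are assumed to be available as a dictionary keyed by category name; that representation, the function name and the sample categories are hypothetical and only serve to illustrate the rule.

```python
REQUIRED_FIELDS = ("context", "keywords", "answer", "action")


def missing_fields(categories: dict[str, dict]) -> dict[str, list[str]]:
    """Map each category name to the required fields it lacks."""
    report = {}
    for name, fields in categories.items():
        # An empty value is fine; only an absent field is reported.
        missing = [field for field in REQUIRED_FIELDS if field not in fields]
        if missing:
            report[name] = missing
    return report


categories = {
    "billing": {"context": "Facturas", "keywords": ["factura"], "answer": "", "action": ""},
    "coverage": {"context": "Cobertura", "answer": "Consulta la cobertura"},
}
print(missing_fields(categories))
```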

10 - Request to Aura Generative API

Guidelines for making a request to Aura Generative API

Steps to be followed to make a request to aura-gateway-api Generative API, for using ATRIA Generative or RAG capabilities

Introduction

The use of the ATRIA AI-driven Generative AI or RAG capabilities requires making a request to the aura-gateway-api Generative API.

For this purpose, constructors must follow the steps below.

aura-generative-api is a synchronous service: if the request passes validation, the call to atria-model-gateway is made and its response is sent back to the application.

Steps in the process

The request from the application must include different fields to be properly processed by this API:

application.id or application.name: Id or name of the application to be used for the resolution of the request. If this field is empty or the application does not exist in the Generative service, an error is returned.
application.preset: Name of the preset to use in atria-model-gateway.
message: text of the message with the request to be resolved.
Authorization header: Two-legged token.

Request

curl --location 'https://api.environment.baikalplatform.com/aura-aiservices/v1/generative/prompts' \
--header 'x-correlator: <uuid>' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--header 'Authorization: Bearer {token}' \
--data '{
  "application": {    
    "name": "app-name",
    "preset": "preset-default"
  },
  "message": "Hola, ¿qué es AURA?",  
  "prompt_params": {
    "preamble": "system 1",
    "template": "template 1",
    "fields_mapped": {},
    "examples": ["example 1"]
  },
  "model_params": {
    "max_tokens": 1,
    "temperature": 2,
    "top_p": 1
  }
}'

Response

{
  "message": "Hello I am Aura, how can I help you?",
  "session": {
    "id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
    "sequence": 1,
    "parameters": {
      "window": 10,
      "timeout": 30
    }
  },
  "prompt_info": {
    "sizes": {
      "completion": 100,
      "prompt": 50,
      "total": 150
    },
    "model_params": {
      "max_tokens": 100,
      "temperature": 0.5,
      "top_p": 0.5
    },
    "prompt": [
      {
        "role": "user",
        "content": "I want to know more about the beach"
      }
    ],
    "input": "I want to know more about the beach"
  }
}

Errors

Error 400: Invalid application

  {
    "code": "BAD_REQUEST",
    "message": "Invalid message. Application not found."
  }

Error 400: Preset not found for the application

  {
    "code": "BAD_REQUEST",
    "message": "Invalid message. Preset not valid for application app_name."
  }

Error 400: Invalid Args

{
    "code": "BAD_REQUEST",
    "message": "Bad Request",
    "errors": [
        {
            "code": "InvArg",
            "message": "unknown preset: dfg"
        }
    ]
}

Error 429: Quota

{
    "code": "TOO_MANY_REQUESTS",
    "message": "Too Many Request",
    "errors": [
        {
            "code": "Quota",
            "message": "The system is experiencing operational problems. We apologize for the inconvenience."
        }
    ]
}

Error 500

  {
    "code": "INTERNAL_SERVER_ERROR",
    "message": "Internal Server Error",
    "errors": [
        {
            "code": "Internal",
            "message": "The system is experiencing operational problems. We apologize for the inconvenience."
        }
    ]
  }
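A client may want to reduce these error bodies to a single summary. The sketch below parses the shapes shown above; the function name is hypothetical, and treating only 429 and 500 as retryable is an assumption of this example, not a rule stated by the API.

```python
import json

# Assumption: quota (429) and internal (500) errors may succeed on a later retry.
RETRYABLE_STATUSES = {429, 500}


def summarize_error(status: int, body: str) -> dict:
    """Flatten an error response into code, message, details and a retry hint."""
    payload = json.loads(body)
    details = [
        f"{item.get('code')}: {item.get('message')}"
        for item in payload.get("errors", [])  # "errors" is absent in some 400s
    ]
    return {
        "status": status,
        "code": payload.get("code"),
        "message": payload.get("message"),
        "details": details,
        "retryable": status in RETRYABLE_STATUSES,
    }


bad_request = '{"code": "BAD_REQUEST", "message": "Invalid message. Application not found."}'
quota = ('{"code": "TOO_MANY_REQUESTS", "message": "Too Many Request", '
         '"errors": [{"code": "Quota", "message": "operational problems"}]}')

print(summarize_error(400, bad_request))
print(summarize_error(429, quota))
```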

Recommendations for using response_format

The response_format parameter is an object that specifies the format that the model must output. It is compatible with Azure OpenAI GPT models newer than gpt-3.5-turbo-1106.

Setting it to { "type": "json_schema", "json_schema": {…} } enables structured outputs, which guarantee that the model output will match your supplied JSON schema.

How to include it in the request:

curl --location 'https://api.environment.baikalplatform.com/aura-aiservices/v1/generative/prompts' \
--header 'x-correlator: <uuid>' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--header 'Authorization: Bearer {token}' \
--data '{
  "application": {    
    "name": "app-name",
    "preset": "preset-default"
  },
  "message": "Hola, ¿qué es AURA?, genera un JSON ",  
  "prompt_params": {
    "preamble": "system 1",
    "template": "template 1",
    "fields_mapped": {},
    "examples": ["example 1"]
  },
  "model_params": {
    "max_tokens": 1,
    "temperature": 2,
    "top_p": 1,
    "response_format":{ "type": "json_object" }
  }
}'

There are two key factors that need to be present to successfully use JSON mode:

response_format={ "type": "json_object" } is set in the request.
The model is told to output JSON. Including guidance to the model that it should produce JSON as part of the messages conversation is required. We recommend adding this instruction as part of the system message.
According to OpenAI, failing to add this instruction can cause the model to “generate an unending stream of whitespace and the request could run continually until it reaches the token limit.”

If “JSON” is not included within the messages, the following error may occur:

BadRequestError: Error code: 400 - {'error': {'message': "'messages' must contain the word 'json' in some form, to use 'response_format' of type 'json_object'.", 'type': 'invalid_request_error', 'param': 'messages', 'code': None}}
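A simple pre-flight check can catch this error before the request is sent. The sketch below looks for the word "json" in the text fields of the request body; which of those fields end up in the model's messages is an assumption of this example, as is the function name.

```python
def json_mode_ready(payload: dict) -> bool:
    """Return False if JSON mode is requested but no text field mentions JSON."""
    fmt = payload.get("model_params", {}).get("response_format", {})
    if fmt.get("type") != "json_object":
        return True  # JSON mode not requested, nothing to check
    params = payload.get("prompt_params", {})
    # Assumed to be the fields that contribute to the messages conversation.
    texts = [payload.get("message", ""), params.get("preamble", ""),
             params.get("template", "")]
    texts += params.get("examples", [])
    return any("json" in text.lower() for text in texts)


payload = {
    "message": "Hola, ¿qué es AURA?, genera un JSON ",
    "prompt_params": {"preamble": "system 1", "template": "template 1",
                      "examples": ["example 1"]},
    "model_params": {"response_format": {"type": "json_object"}},
}
print(json_mode_ready(payload))
```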

Further Reference: Microsoft documentation: Learn how to use JSON mode

11 - Use ATRIA web interface

Use ATRIA web interface (aura-manager)

Guidelines for using the ATRIA web interface for testing purposes

The ATRIA web interface (aura-manager) is available for Generative AI and RAG capabilities in ATRIA

Introduction

In the current release, a web interface aura-manager has been provided for internal use to test how ATRIA Generative and RAG capabilities work.

Discover below how to use it.

Guidelines

1. Enable aura-manager

Enable aura-manager in Aura installer.

2. Access the ATRIA web interface (aura-manager)

Web chat URL: https://svc-[country-environment].auracognitive.com/aura-manager
Enter the web using Office365 authentication.

If you are interested in the underlying authentication process, see the O365 Authentication guideline.

3. Enter the application name

Prerequisite: The application must be previously configured using the applications configuration sheet with all the parameters to communicate with aura-gateway-api.
Add the exact name of the application to be used.

web-app-name

If the name of the application is wrong, a message as shown below will be displayed:

web-app-name-error

If the request fails, the following error will be displayed and the website will reload:

web-error-application

4. Select theme (optional)

If required, select a theme to change the visualization style.
Click on the meatball menu ... and select the preferred theme.

web-select-theme

5. Select preset

Once the application is selected, all the presets that this application can use are loaded.
Click on the meatball menu ... and select the specific preset to be used.

web-select-preset

web-select-preset-options

In the current version of the web interface, the option “Activate response voiceover” is deactivated.

6. Send request

Send your request in one of two ways:
- Type it in the search box
- Or use the microphone by clicking on the microphone icon
  NOTE: The microphone is only enabled in compatible web browsers:
  - Google Chrome v33 (Windows, macOS, Linux, Android)
  - Microsoft Edge v79 (Windows, macOS)
  - Firefox 52 (Windows, macOS, Linux, Android). It requires enabling media.webspeech.synth.enabled in about:config
  - Safari 14.1 (macOS, iOS)
Now, you can start a conversational flow with ATRIA to get the response you need.

web-add-request

If the request fails, the following error will be displayed:

web-error-send-message-popup web-error-send-message-chat

If this error is displayed, you should enter the request again.

7. Receive response

ATRIA will provide you with the most appropriate answer to your request.

Additionally, the information sources used to generate the response are included, so the user can have greater confidence in the answer provided and consult these references afterwards.

In the current release, the references from public URLs can be consulted directly through their corresponding links.
If the documents used as references (such as PDF files) are not publicly accessible, only the document names will be displayed. You can upload them for testing purposes before making their content available via a public URL.

aura-manager-response-and-references

8. Add feedback (Optional)

Use the thumbs-up and thumbs-down symbols to provide feedback regarding the accuracy of the response.

web-feedback

If the request fails, the following error will be displayed:

web-error-feedback

If this error is displayed, you can continue using the application and try sending the feedback again.

9. Copy response (Optional)

If required, you can copy the text response.

web-copy-text

10. Start new conversation

Click on the “New conversation” button or on the “reload” symbol to start a new conversation.

web-new-conversation

If the request fails, the following error will be displayed, and you should try to initiate a new conversation again:

web-error-create-new-conversation

12 - O365 Authentication

Office 365 Authentication

Description of the Office 365 authentication made by ATRIA

Introduction

User authentication on the ATRIA web interface is integrated with Office 365, using one internal component (oauth2-proxy) and one external component (keycloak), which is managed by Novum.

The oauth2-proxy component works as a reverse proxy, receiving requests and redirecting them to keycloak if they are not authenticated.
Keycloak manages the application users and has a connector for Office 365, so it redirects to the Office 365 login page, where users identify themselves with their www.telefonica.com corporate account.
After a successful login, the proxied website is loaded with a cookie (and, optionally, other headers) indicating that the user is already logged in.

Authentication workflow

The authentication process will be transparent for the ATRIA web interface and, therefore, for developers.

The ATRIA web interface itself may have no authentication at all, or only a basic one, because oauth2-proxy and keycloak are in charge of the entire process:

The oauth2-proxy component will be deployed, configured and operated by the Aura DevOps team.
The keycloak component will be managed by the Novum team, including granting access to a user list.

Sequence diagram

sequenceDiagram
    actor Browser
    Browser->>+OAuth2 Proxy: Request /*
    OAuth2 Proxy-->>-Browser: Redirect to Keycloak's login page
    Browser->>+Keycloak: User login
    Keycloak->>Keycloak: O365 Login
    Keycloak-->>-Browser: Redirect to /oauth2/callback
    Browser->>+OAuth2 Proxy: Request /oauth2/callback
    OAuth2 Proxy->>+Keycloak: Get access token
    Keycloak-->>-OAuth2 Proxy: Send id & access token
    OAuth2 Proxy-->>-Browser: Send session cookie and redirect to /*
    Browser->>+OAuth2 Proxy: Request /*
    OAuth2 Proxy->>+Atria web interface: Request /*
    Atria web interface-->>-OAuth2 Proxy: HTTP response
    OAuth2 Proxy-->>-Browser: HTTP response
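
The decision logic in the diagram can be summarised with a small Python simulation. This is a minimal sketch under stated assumptions, not the behaviour of the real oauth2-proxy: the login URL, the cookie name `session` and the in-memory session store are all hypothetical stand-ins.

```python
LOGIN_URL = "https://keycloak.example/login"  # hypothetical login page


def route(path, cookies, valid_sessions):
    """Forward authenticated requests to the web interface; redirect the rest to login."""
    if cookies.get("session") in valid_sessions:
        return ("forward", path)
    return ("redirect", LOGIN_URL)


def complete_login(valid_sessions, new_session_id, original_path):
    """Simulate /oauth2/callback: store the session, set the cookie, go back to the original path."""
    valid_sessions.add(new_session_id)
    return ("redirect", original_path, {"session": new_session_id})


sessions = set()
# First request: no cookie yet, so the browser is sent to the login page.
first = route("/aura-manager", {}, sessions)
# After the O365 login, the callback issues a session cookie.
_, back_to, cookie = complete_login(sessions, "abc123", "/aura-manager")
# Second request: the cookie is valid, so the request reaches the web interface.
second = route(back_to, cookie, sessions)
print(first, second)
```

The web interface never takes part in the login exchange, which is why the process is transparent for developers.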

Authentication steps

The three main authentication steps are detailed below, together with the team in charge of each one.

1. Installation

A new environment must be created using the aurak8s installer, where oauth2-proxy will be installed and configured.

Responsible teams: Novum

Once installed, it is necessary to create a new client in keycloak, with the redirection URL https://<deployed-env>/oauth2/callback and create a user group with the members that will have access.

OAuth2-proxy tips from Cross team

oauth2-proxy is designed to be installed once per environment.
Redis is required, and one instance must also be installed per environment.
In Kubernetes, a virtualserver in Nginx is used to configure ingress traffic.

Keycloak tips from Novum team

Login: The only login screen will be the one from Office 365.
Logout: Usually, it is not required. If it is used, it logs the user out of O365 (for all web apps).
CORS: Identify static REST endpoints and configure two different rules.
Error codes: The web application will typically not see any auth error code.

2. Requesting access for users

Responsible teams: Aura ATRIA team and Novum

The Aura ATRIA team must send the Novum team a list of the users who need access.
Each user must have the following data:
- Name: Full name of the user
- Email: E-mail of the user
- Group: A list of keycloak groups to which the user must be added (typically, one per environment: dev, pre and pro)
The Novum team is in charge of providing access to these users.
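
Before sending the list, it can be useful to check that every entry carries the three required fields. The Python sketch below is an assumption-laden illustration: the `AccessRequest` class, the validation rules and the group names are hypothetical, and the format actually expected by the Novum team is not defined in this document.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class AccessRequest:
    """One entry of the access list: name, email and keycloak groups."""
    name: str
    email: str
    groups: list = field(default_factory=list)

    def problems(self):
        """Return a list of human-readable issues; empty if the entry looks complete."""
        issues = []
        if not self.name.strip():
            issues.append("name is empty")
        if "@" not in self.email:
            issues.append("email looks invalid")
        if not self.groups:
            issues.append("no groups listed")
        return issues


# Hypothetical users and group names, for illustration only.
access_requests = [
    AccessRequest("Jane Doe", "jane.doe@example.com", ["atria-dev", "atria-pre"]),
    AccessRequest("John Roe", "john.roe", []),
]
for request in access_requests:
    print(asdict(request), request.problems())
```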

3. Virtualserver

Responsible teams: Aura ATRIA DevOps team

Virtualserver is used to configure Nginx. The authentication method uses two virtualservers:

aura-services virtualserver: it must be modified to add two paths:
- /aura-mf-base-atria: redirects to aura-mf-base-atria if the user is logged in; otherwise, to the next path.
- /oauth2/auth: redirects to the oauth2-proxy service.
oauth virtualserver: redirects to the oauth2-proxy service.

An example is shown below:

aura-services virtualserver /aura-mf-base-atria

    location /aura-mf-base-atria {
         auth_request /oauth2/auth;
         error_page 401 =302 https://auth-svc-ap-nine.auracognitive.com/oauth2/start?rd=$scheme://$host$request_uri;
         auth_request_set $user   $upstream_http_x_auth_request_user;
         auth_request_set $email  $upstream_http_x_auth_request_email;
         proxy_set_header X-User  $user;
         proxy_set_header X-Email $email;
         auth_request_set $token $upstream_http_authorization;
         proxy_set_header Authorization $token;
         auth_request_set $auth_cookie $upstream_http_set_cookie;
         add_header Set-Cookie $auth_cookie;
         auth_request_set $auth_cookie_name_upstream_1 $upstream_cookie_auth_cookie_name_1;
         if ($auth_cookie ~* "(; .*)") {
             set $auth_cookie_name_0 $auth_cookie;
             set $auth_cookie_name_1 "auth_cookie_name_1=$auth_cookie_name_upstream_1$1";
         }
         # Send both Set-Cookie headers now if there was a second part
         if ($auth_cookie_name_upstream_1) {
             add_header Set-Cookie $auth_cookie_name_0;
             add_header Set-Cookie $auth_cookie_name_1;
         }
         proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
         proxy_set_header Host            $http_host;
         proxy_pass http://aura-mf-base-atria:4000/aura-mf-base-atria;
    }

aura-services virtualserver /oauth2/auth

  - action:
      proxy:
        upstream: oauth2-proxy
    location-snippets: |
      proxy_pass_request_body off;
      proxy_set_header Content-Length "";
    path: /oauth2/auth

oauth virtualserver /

  routes:
  - action:
      proxy:
        upstream: oauth2-proxy
    path: /
  tls:
    secret: nginx-certificates
  upstreams:
  - name: oauth2-proxy
    port: 80
    service: oauth2-proxy

13 - ATRIA error management

ATRIA error management

Documents defining the most common errors in ATRIA components and how to handle them

Index of documents

atria-model-gateway error management

13.1 - atria-model-gateway error management

atria-model-gateway error management

This document includes the different errors returned by atria-model-gateway

Error descriptions

InvalidModelParam

description: One or more parameters provided for the model are invalid. Please verify the parameter names and values.
message: Specific and descriptive text of the error.
http status: 400

ModelFilterContent

description: The response was filtered due to the prompt triggering Azure OpenAI’s content management policy.
message: Specific and descriptive text of the error.
http status: 400

InvalidPrompt

description: The prompt format is incorrect or contains unsupported characters. Ensure it is a valid string and adheres to the formatting guidelines.
message: Specific and descriptive text of the error.
http status: 400

ModelNotFound

description: The specified model does not exist or is not available to you.
message: Specific and descriptive text of the error.
http status: 400

ContextLengthExceeded

description: The message sent (including prompt + previous messages) exceeds the token limit of the model. Reduce the size of the prompt or the conversation history.
message: Specific and descriptive text of the error.
http status: 400

InjectionAttempt

description: Injection attempt detected. The request appears to contain input designed to manipulate the system’s behavior. This request has been blocked for security reasons.
message: Specific and descriptive text of the error.
http status: 400

InternalError

description: Incoming HTTP request produces an internal error.
message: Specific and descriptive text of the error.
http status: 500

Unauthorized

description: Incoming HTTP request authorization is not valid.
message: Specific and descriptive text of the error.
http status: 500

RequestTimeout

description: The server has decided to close the connection rather than continue waiting. The response headers include the retry-after field, which indicates how long to wait before retrying.
message: Specific and descriptive text of the error.
http status: 500

QuotaError

description: Incoming HTTP request needs more quota. The response headers include the retry-after field, which indicates how long to wait before retrying.
message: Specific and descriptive text of the error.
http status: 500

AuraError

description: Generic error.
message: Specific and descriptive text of the error.
http status: 500
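
A client can use the error names above to decide whether a retry makes sense. The Python sketch below is a minimal illustration under assumptions: it takes the error name as an argument because the exact field that carries it in the response body is not described here, and it assumes that retry-after holds a number of seconds. The status codes are copied from the list above.

```python
# HTTP status per error, as listed in this document.
ERROR_STATUS = {
    "InvalidModelParam": 400,
    "ModelFilterContent": 400,
    "InvalidPrompt": 400,
    "ModelNotFound": 400,
    "ContextLengthExceeded": 400,
    "InjectionAttempt": 400,
    "InternalError": 500,
    "Unauthorized": 500,
    "RequestTimeout": 500,
    "QuotaError": 500,
    "AuraError": 500,
}

# Only these two errors are documented as including the retry-after header.
RETRYABLE = {"RequestTimeout", "QuotaError"}


def retry_delay(error_name, headers, default=1.0):
    """Return the seconds to wait before retrying, or None if the error is not retryable."""
    if error_name not in RETRYABLE:
        return None
    lowered = {key.lower(): value for key, value in headers.items()}
    try:
        # Assumption: the header value is a number of seconds.
        return float(lowered["retry-after"])
    except (KeyError, ValueError):
        return default


print(retry_delay("QuotaError", {"Retry-After": "12"}))
print(retry_delay("InvalidPrompt", {}))
```

Errors with status 400 point to a problem in the request itself, so repeating the same request unchanged is unlikely to help.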

14 - Tutorial: Create new Copilot preset

Tutorial: Create new Copilot preset using Aura Configuration API

Comprehensive guidelines for the creation of a new preset in ATRIA for Aura Copilot using aura-configuration-api

Introduction

As an example of how to create a new preset in ATRIA, this document gives detailed guidelines for creating a new Aura Copilot preset in a specific environment using aura-configuration-api.

Follow these steps in the order given:

Prerequisites
Create a new preset in aura-configuration-api
Include the new preset in an application
Upload documents and execute the generate-db job
Update Aura applications configuration via API

1. Prerequisites

Recommended installations:
- kubectl installed in your local host.
- curl installed in your local host.
- jq installed in your local host.
You must have access to the Azure container atria-resources in order to upload documents.
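
The recommended installations can be verified with a short script before starting. This Python sketch only checks that the three command-line tools are found on the PATH of the local host. It does not check their versions or the access to the Azure container, and the function name is hypothetical.

```python
import shutil

REQUIRED_TOOLS = ("kubectl", "curl", "jq")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("All recommended tools are installed.")
```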

2. Create a new preset in aura-configuration-api

A preset is a configurable entity that defines the instructions for working with the AI model to resolve the use case.

The creation of a new preset in a specific environment is a key part of ATRIA configuration. The general guidelines for this task are included in:

Modify ATRIA configuration: Create a new preset

Example of the new preset

The new preset for Aura Copilot could have the following structure and fields, including the prompt with instructions.

New preset

  {
      "id": "a2cdb523-883e-44ab-8e0b-2d164dd98346",
      "name": "new-copilot-preset",
      "group": "enriched_ai",
      "description": "New copilot preset",
      "session": {
          "window": 0,
          "timeout": 30
      },
      "rag": {
          "type": "sql",
          "model": {
              "id": "gpt-4o",
              "parameters": {
                  "max_tokens": 16384,
                  "temperature": 0.01
              }
          },
          "references": {
              "maximum": 3,
              "baseUrl": "project-copilot/jsonl"
          },
          "stages": {
              "language": "en",
              "retrievalStg": {
                  "sources": {
                      "name": "project-copilot",
                      "embeddings": "test_distilbert",
                      "docs": [
                          {
                              "extension": "jsonl",
                              "loader": {
                                  "loaderType": "jsonl"
                              }
                          }
                      ],
                      "retrievers": [
                          {
                              "retrieverType": "qdrant"
                          },
                          {
                              "retrieverType": "tfidf"
                          }
                      ]
                  }
              },
              "generativeStg": {
                  "prompts": {
                      "sqlPrompt": {
                          "default": {
                              "text": "{% raw %}\nGenerate a SQL query statement to answer the following question:\n`{question}`\n    \nUse the data contained in the following table. You have its definition in SQL and in Avro.\n{sql_table_definition}\n    \n    \nThe following tables, containing auxiliary information, are also available:\n```sql\nCREATE TABLE D_CBD_Static_Geo_Area_v6 (GEO_AREA_ID VARCHAR, CBD_GEO_AREA_LEVEL1_ID VARCHAR, CBD_GEO_AREA_LEVEL2_ID VARCHAR, CBD_GEO_AREA_LEVEL3_ID VARCHAR, CBD_GEO_AREA_LEVEL4_ID VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);\n    COMMENT ON TABLE D_CBD_Static_Geo_Area IS 'Geographical areas. This table contains foreign keys to the different levels of geographical areas. In particular, it contains the foreign keys to these tables: CBD_Static_Geo_Area_Level1, CBD_Static_Geo_Area_Level2, CBD_Static_Geo_Area_Level3, CBD_Static_Geo_Area_Level4. Therefore, this tables is used, via JOIN, to query the geographical information contained in the different levels of geographical areas. For instance, if you have a table T with a field GEO_AREA_ID and you need to check whether this location corresponds to the region of Asturias you will need to look for GEO_AREA_ID in this table, then extract the CBD_GEO_AREA_LEVEL4_ID and query the table CBD_Static_Geo_Area_Level4 to get the name of the region.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area.GEO_AREA_ID IS 'Description: Identifier of the geographical area assigned to the customer (typically the geographical area of the customer home). This identifier is a string code which values are defined in ''D_Geographical_Area'' entity. Format: alphanumeric string. Example values: ''2800983CE'', ''50059'', ''3101142CE''';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL1_ID IS 'Identifier of the geographical area Level 1 (max level of detail: CP or similar). FORMAT: string containing a numerical code. 
This field does not contain location names.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL2_ID IS 'Identifier of the geographical area Level 2 (City/Town). FORMAT: string containing a numerical code. This field does not contain location names.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL3_ID IS 'Identifier of the geographical area Level 3 (Province). FORMAT: string containing a numerical code. This field does not contain location names.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL4_ID IS 'Identifier of the geographical area Level 4 (State/Region). FORMAT: string containing a numerical code. This field does not contain location names.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area.EXTRACTION_TM IS 'Date-time of the record';\n    \nCREATE TABLE D_CBD_Static_Geo_Area_Level2_v6 (CBD_GEO_AREA_LEVEL2_ID VARCHAR, GEO_AREA_LEVEL_DES VARCHAR, CBD_GEO_AREA_LEVEL3_ID VARCHAR, LONGITUDE_LON_CO DOUBLE, LATITUDE_LAT_CO DOUBLE, GEO_AREA_ID VARCHAR, GEO_STD_AREA_CD VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);\n    COMMENT ON TABLE D_CBD_Static_Geo_Area_Level2 IS 'Geographical area level 2 (State)';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.CBD_GEO_AREA_LEVEL2_ID IS 'Identifier of the geographical area Level 2 (City/Town). FORMAT: string containing a numerical code.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.GEO_AREA_LEVEL_DES IS 'Description associated to the identifier level 2. 
FORMAT: alphanumeric string containing the name of the city/town.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.CBD_GEO_AREA_LEVEL3_ID IS 'Identifier of the geographical area Level 3 (Province)';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.LONGITUDE_LON_CO IS 'Longitude coordinates (in WGS84) associated with level 2';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.LATITUDE_LAT_CO IS 'Latitude coordinates (in WGS84) associated with level 2';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.GEO_AREA_ID IS 'Description: Identifier of the geographical area assigned to the customer (typically the geographical area of the customer home). This identifier is a string code which values are defined in ''D_Geographical_Area'' entity. Format: alphanumeric string. Example values: ''2800983CE'', ''50059'', ''3101142CE''';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.GEO_STD_AREA_CD IS 'Standard code of the geo area';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.EXTRACTION_TM IS 'Date-time of the record';\n    \nCREATE TABLE D_CBD_Static_Geo_Area_Level3_v6 (CBD_GEO_AREA_LEVEL3_ID VARCHAR, GEO_AREA_LEVEL_DES VARCHAR, CBD_GEO_AREA_LEVEL4_ID VARCHAR, LONGITUDE_LON_CO DOUBLE, LATITUDE_LAT_CO DOUBLE, ISO_3166_2_CD VARCHAR, GEO_AREA_ID VARCHAR, GEO_STD_AREA_CD VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);\n    COMMENT ON TABLE D_CBD_Static_Geo_Area_Level3 IS 'Geographical area level 3 (Region)';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.CBD_GEO_AREA_LEVEL3_ID IS 'Identifier of the geographical area Level 3 (Province). FORMAT: string containing a numerical code.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.GEO_AREA_LEVEL_DES IS 'Description associated to the identifier level 3. 
FORMAT: alphanumeric string containing the name of the province.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.CBD_GEO_AREA_LEVEL4_ID IS 'Identifier of the geographical area Level 4 (State/Region). FORMAT: string containing a numerical code.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.LONGITUDE_LON_CO IS 'Longitude coordinates (in WGS84) associated with level 3';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.LATITUDE_LAT_CO IS 'Latitude coordinates (in WGS84) associated with level 3';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.ISO_3166_2_CD IS 'ISO 3166-2 associated';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.GEO_AREA_ID IS 'Description: Identifier of the geographical area assigned to the customer (typically the geographical area of the customer home). This identifier is a string code which values are defined in ''D_Geographical_Area'' entity. Format: alphanumeric string. Example values: ''2800983CE'', ''50059'', ''3101142CE''';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.GEO_STD_AREA_CD IS 'Standard code of the geo area';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.EXTRACTION_TM IS 'Date-time of the record';\n    \nCREATE TABLE D_CBD_Static_Geo_Area_Level4_v6 (CBD_GEO_AREA_LEVEL4_ID VARCHAR, GEO_AREA_LEVEL_DES VARCHAR, LONGITUDE_LON_CO DOUBLE, LATITUDE_LAT_CO DOUBLE, HASC_1_CD VARCHAR, GEO_AREA_ID VARCHAR, GEO_STD_AREA_CD VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);\n    COMMENT ON TABLE D_CBD_Static_Geo_Area_Level4 IS 'Geographical area level 4 (min. Detail)';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.CBD_GEO_AREA_LEVEL4_ID IS 'Identifier of the geographical area Level 4 (State/Region). FORMAT: string containing a numerical code.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.GEO_AREA_LEVEL_DES IS 'Description associated to the identifier level 4. 
FORMAT: alphanumerical string containing the name of the state/region. EXAMPLE VALUES: ''Asturias'', ''Andaluc\u00eda'', etc.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.LONGITUDE_LON_CO IS 'Longitude coordinates (in WGS84) associated with level 4';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.LATITUDE_LAT_CO IS 'Latitude coordinates (in WGS84) associated with level 4';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.HASC_1_CD IS 'Hierarchical administrative subdivision codes ';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.GEO_AREA_ID IS 'Description: Identifier of the geographical area assigned to the customer (typically the geographical area of the customer home). This identifier is a string code which values are defined in ''D_Geographical_Area'' entity. Format: alphanumeric string. Example values: ''2800983CE'', ''50059'', ''3101142CE''';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.GEO_STD_AREA_CD IS 'Standard code of the geo area';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.EXTRACTION_TM IS 'Date-time of the record';\n    \nCREATE TABLE D_CBD_Static_Station_Type_v6 (STATION_TYPE_CD VARCHAR, TECH_LEVEL_WEIGHT_QT FLOAT, STATION_TYPE_L2_DES VARCHAR, STATION_TYPE_L1_DES VARCHAR, STATION_TYPE_L2_ORDER_NUM INT, STATION_TYPE_L1_ORDER_NUM INT, STATION_TYPE_ORDER_NUM INT, CONSCIOUS_IND BOOLEAN, EXTRACTION_TM VARCHAR);\n    COMMENT ON TABLE D_CBD_Static_Station_Type IS 'Station types';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_CD IS 'Description: Type of device connected to the HGU router. It used to find out which devices are connected to routers in households. Format: String. 
Example values: \"A/V Equipment\", \"Air Conditioning\", \"Air Conditioning Control\", \"Apple Handheld Device\", \"Apple Home Device\", \"AudioCast\", \"Audiocast\", \"Barcode Printer\", \"Camera\", \"Car Dash Cam\", \"Cryptominner\", \"Digital Clock\", \"Dishwasher\", \"Drone Equipment\", \"GPS\", \"Gaming Console\", \"Hyper Media Player\", \"IP Camera\", \"IPC Hub\", \"IPC Video Recorder\", \"IoT Device\", \"Key Cutting Machine\", \"Media Center\", \"Monitoring Device\", \"Multimedia Player\", \"Network Access Point\", \"Network Equipment\", \"PC\", \"PDA\", \"PIR Sensor\", \"Print Server\", \"Printer\", \"Projector\", \"Raspberry\", \"Router\", \"Security System\", \"Smart AC Control\", \"Smart Air Freshener\", \"Smart Air Fryer\", \"Smart Air Ventilator\", \"Smart Animal Feeder\", \"Smart Baby Monitor\", \"Smart Blind\", \"Smart Bulb\", \"Smart Bulb Adapter\", \"Smart Car\", \"Smart Car e-Charger\", \"Smart Display e-bike\", \"Smart Energy Analyzer\", \"Smart Home Controller\", \"Smart Home Hub\", \"Smart Humidifier\", \"Smart Hydrometer Clock\", \"Smart Kitchen Appliances\", \"Smart Kitchen Scale\", \"Smart Lamp\", \"Smart Light Dimmer\", \"Smart Lock Control\", \"Smart Plug\", \"Smart Pool\", \"Smart Power Strip\", \"Smart Purifier\", \"Smart Scale\", \"Smart Signage\", \"Smart Speaker\", \"Smart Switch\", \"Smart TV\", \"Smart Thermostat\", \"Smart Toothbrush\", \"Smart Vacuum\", \"Smart WallSocket\", \"Smart Watch\", \"Smart Watch Fit\", \"Smart WifiButton\", \"Smartphone\", \"Smartphone/Tablet\", \"Smartwatch\", \"Smartwatch Fit\", \"Solar Panel Equipment\", \"Soundbar\", \"Steam Controller\", \"Storage Device\", \"TPV\", \"TV Dongle\", \"Tablet\", \"Tempest Weather System\", \"UPS\", \"VR/AR Headset\", \"Video Doorbell\", \"Video Intercom\", \"Video STB Equipment\", \"VideointercomIP\", \"Virtual Desktop\", \"VoIP Phone\", \"WAN Extender\", \"WiFi Extender\", \"Wifi Dongle\", \"Wireless Blood Pressure Monitor\", \"Wireless Bridge\", \"Wireless 
Headphones\", \"Wireless Router + VoIP Series\", \"e-Note\", \"eBook\"';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.TECH_LEVEL_WEIGHT_QT IS 'Associated weight for the technologic level of the home';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L2_DES IS 'Description: Higher level device type grouping. Example values: \"PCs & Home Office\", \"Smartphones / Tablets / eReaders / iWatch\", \"Multimedia Entertainment\", \"Gaming\", \"Sport & Health\", \"Smart Home\", \"Unknown\", \"Network Devices\", \"Security & Control\"';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L1_DES IS 'Description: Intermediate level device type grouping. Example values: \"Smart Speakers & Audio\", \"PCs & Home Office\", \"Video Entertainment\", \"Domestic Appliances\", \"Smart Energy & Lighting\", \"Apple Handheld Device\", \"Smartphones / Tablets / eReaders\", \"Gaming\", \"Sport & Health\", \"Network Devices\", \"Security & Control\", \"IoT\"';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L2_ORDER_NUM IS 'Station type order level 2';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L1_ORDER_NUM IS 'Station type order level 1';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_ORDER_NUM IS 'Station type order';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.CONSCIOUS_IND IS 'Indicates if the related device type has energy efficiency';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.EXTRACTION_TM IS 'Date-time of the record';\n    \nCREATE TABLE D_Segment_v8 (OPERATOR_ID VARCHAR, SEGMENT_ID VARCHAR, SEGMENT_DES VARCHAR, GBL_SEGMENT_ID VARCHAR, SEGMENT_GROUP_ID VARCHAR, SEGMENT_GROUP_DES VARCHAR, EXTRACTION_TM VARCHAR);\n    COMMENT ON TABLE D_Segment IS 'Classifications of the customers, attending to different segmentation criteria, for marketing and management issues, according to OB criteria and its correspondence with the global segment classification';\n    COMMENT ON COLUMN D_Segment.OPERATOR_ID IS 
'Global Operator Identifier (Operator acting as owner of the information present in the current entity)';\n    COMMENT ON COLUMN D_Segment.SEGMENT_ID IS 'Description: Organisational segment of the client. Format: two letter string. Possible values: ''NT'' - NTT, ''GP'' - Residencial, ''PE'' - Pymes, ''RE'' - Residencial/SC, ''AU'' - Autonomos, ''OP'' - Operadores, ''GC'' - Grandes Clientes, ''RP'' - Residencial Prepago, ''TE'' - Telefonica, ''SC'' - Sin Clasificar, ''ME'' - Empresas';\n    COMMENT ON COLUMN D_Segment.SEGMENT_DES IS 'Description: Name or description of the organisational segment of the client (provides the description for each segment identifier). Format: string. Example values: ''Residencial'',  ''Pymes'', ''Autonomos'', ''Operadores'', ''Grandes Clientes'', ''Sin Clasificar''';\n    COMMENT ON COLUMN D_Segment.GBL_SEGMENT_ID IS 'ID of the global segment classification';\n    COMMENT ON COLUMN D_Segment.SEGMENT_GROUP_ID IS 'ID code of the segmentation group';\n    COMMENT ON COLUMN D_Segment.SEGMENT_GROUP_DES IS 'Description of the segmentation group';\n    COMMENT ON COLUMN D_Segment.EXTRACTION_TM IS 'Date-time of the record';\nCREATE TABLE D_Fixed_Tariff_Plan_v8 (OPERATOR_ID VARCHAR, DAY_DT VARCHAR, TARIFF_PLAN_ID VARCHAR, TARIFF_PLAN_DES VARCHAR, VOICE_IND BOOLEAN, BBAND_IND BOOLEAN, TV_IND BOOLEAN, WORKSTATION_IND BOOLEAN, APP_IND BOOLEAN, VOICE_BUNDLE_QT FLOAT, BBAND_UP_SPEED_QT FLOAT, BBAND_DOWN_SPEED_QT FLOAT, TV_TYPE_CD VARCHAR, FIXED_SERVICE_COMMERCIAL_NAME VARCHAR, COMMERCIAL_IND BOOLEAN, TARIFF_PLAN_START_DT VARCHAR, TARIFF_PLAN_END_DT VARCHAR, CONVERGENT_IND BOOLEAN, BRAND_ID VARCHAR);\n    COMMENT ON TABLE D_Fixed_Tariff_Plan_v8 IS 'Every fixed Tariff to be applied, either Commercial, Convergent, Individual, or any other, for any product&service for the fixed client base';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.OPERATOR_ID IS 'Global Operator Identifier (Operator acting as owner of the information present in the current 
entity)';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.DAY_DT IS 'Year, month and day of the data  ### Additional Information  Format: YYYYMMDD (4 digits for year, months from 01 to 12, days from 01 to 31).';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TARIFF_PLAN_ID IS 'Unique identifier of the tariff plan';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TARIFF_PLAN_DES IS 'Name/short description of the tariff plan';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.VOICE_IND IS 'Indicates whether the line has a fixed line voice service associated.  Values: 0=No; 1=Yes.';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.BBAND_IND IS 'Indicates whether the line has a Broadband service associated.  Values: 0=No; 1=Yes.';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TV_IND IS 'Indicates if the line has a TV service associated.  Values: 0=No; 1=Yes.';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.WORKSTATION_IND IS 'Indicates if the line has a workstation service associated.  Values: 0=No; 1=Yes.';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.APP_IND IS 'Indicates if the line has the \"Aplicateca service\" associated.  Values: 0=No; 1=Yes.';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.VOICE_BUNDLE_QT IS 'Amount of data associated with the voice bundle';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.BBAND_UP_SPEED_QT IS 'Broadband up speed (Mbps)';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.BBAND_DOWN_SPEED_QT IS 'Broadband down speed (Mbps)';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TV_TYPE_CD IS 'Type of TV line';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.FIXED_SERVICE_COMMERCIAL_NAME IS 'Commercial name of the service';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.COMMERCIAL_IND IS 'Indicates if TARIFF_PLAN_ID refers to the COMMERCIAL_TARIFF_ID.    
Fill-in with 1 if TARIFF_PLAN_ID refers to the COMMERCIAL_TARIFF_ID or 0 if it doesn''t    0 = Non commercial tariff  1 = commercial tariff';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TARIFF_PLAN_START_DT IS 'Start date of the tariff plan validity (that day is the first day when the tariff plan is applicable)  ### Additional Information  Format: YYYYMMDD (4 digits for year, months from 01 to 12, days from 01 to 31).';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TARIFF_PLAN_END_DT IS 'End date of the tariff plan validity (that day is the last day when the tariff plan is applicable)  ### Additional Information  Format: YYYYMMDD (4 digits for year, months from 01 to 12, days from 01 to 31).';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.CONVERGENT_IND IS 'Flag indicating if the current fixed tariff plan can be configured as a \"Convergent tariff plan\", i. e., a plan with special conditions due to the fact of including at least one Fixed line/service and one Mobile line.   0 = No (the plan can''t be configured as convergent)   1 = Yes (the plan can be configured as convergent)';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.BRAND_ID IS 'Commercial brand identifier. In order to differentiate among different brands in the same OB (e.g. Movistar, O2, Tuenti...)';\n```\nSome of the former tables contain columns in full-qualified format. For instance, these are some examples of full-qualified columns:\n```\nrecord_name.field_name\nTEC_PLAT_REC.DEVICE_ID\nrecord_name.subrecord_name.field_name\nTEC_PLAT_REC.TEC_PLAT_SUBCOMP_REC.DEVICE_ID\n...\n```\nAlways use the full-qualified format when referring to columns in the tables. For instance, if you need to use the column 'TEC_PLAT_REC.DEVICE_ID', you should not refer to it as 'DEVICE_ID', but as 'TEC_PLAT_REC.DEVICE_ID'.\n**Explain in detail, step by step, all your decisions**. 
\n# General instructions \nFollow these reasoning steps to generate the SQL query:\n- Step 1: Identify Necessary Tables\n- Step 2: Identify Useful Candidate Columns\n- Step 3: Assess if Tables and Columns are Sufficient to Answer the Question\n- Step 4: Identify Columns Contained in Maps\n- Step 5: Plan the SQL Query\n- Step 6: Write the final SQL Query and apply the rules\n- Step 7: Check that the query actually can answer the question\n- Step 8: Create the result as a JSON object \nIf you need to filter by a higher-level geographical area, such as a region (Comunidad Autónoma), you will need to:\n- join the `GEO_AREA_ID` field of the data table (such as `CBD_HGU_Detail_Daily_v10`) with the `GEO_AREA_ID` field in the `D_CBD_Static_Geo_Area_v6` table\n- then join the `CBD_GEO_AREA_LEVEL4_ID` field in the `D_CBD_Static_Geo_Area_v6` table with the `CBD_GEO_AREA_LEVEL4_ID` field in the `D_CBD_Static_Geo_Area_Level4_v6` table   \n- then compare the `GEO_AREA_LEVEL_DES` field in the `D_CBD_Static_Geo_Area_Level4_v6` table with the name of the region (e.g., 'Cantabria'), since this description field contains the actual name of the geographical area.\n**Only perform these joins if explicit filtering or grouping by geographical location is necessary**.\n# Detailed instructions\n### Step 1: Identify Necessary Tables\nFirst, identify which tables are necessary to answer the question `{question}`. Justify why you selected each of these tables. \nUse the following format:\n```\nI need the following tables to answer the question:\n- <table_name>: <reasoning>\n- <table_name>: <reasoning>\n...\n```\n### Step 2: Identify Useful Candidate Columns\nIdentify which columns are useful to answer the question `{question}`. Justify why you selected each of these columns.\nAlways include any column you think may be needed to answer the question. If there are similar columns in the table, always identify all of them. You will later choose which of them are most suitable to answer the question. 
But, at this stage, you should include **all the columns that may be useful**.\nWrite the list of candidate columns you identified, and the reasoning after each column, using the following format:\n```\nI can use the following candidate columns to answer the question (including all the columns that may be useful):\n- <table_name>:\n  - <column_name>: <copy here the full column description from schema, including possible values if present>: <reasoning>. \n  - <column_name>: <copy here the full column description from schema, including possible values if present>: <reasoning>.\n  ...\n- <table_name>:\n  - <column_name>: <copy here the full column description from schema, including possible values if present>: <reasoning>.\n  - <column_name>: <copy here the full column description from schema, including possible values if present>: <reasoning>.\n  ...\n...\n```\n### Step 3: Assess if Tables and Columns are Sufficient to Answer the Question\nState whether the tables and columns you identified are enough to answer the question `{question}`. Make sure to justify your answer and check the actual descriptions of the columns in the table definitions and the user question.\nWrite the answer using the following format:\n```\nPossible to answer the question using the former columns: \n- <reasoning>\n- Result: <Yes|No>\n```\n### Step 4: Identify Columns Contained in Maps\nSome columns are actually contained in a map structure. Since these columns need to be queried differently, you need to identify them.\nColumns with a name like '<some_name>.map.<other_name>' are contained in maps. 
\nFor instance, the column `STATIONS_DETAIL_REC.UNQ_STATION_MAP.map.STATION_TYPE_CD` is contained in a map structure called `STATIONS_DETAIL_REC.UNQ_STATION_MAP`.\nThis map structure is like this:\n```\nSTATIONS_DETAIL_REC.UNQ_STATION_MAP.map.STATION_TYPE_CD: {{\n    <key1>: {{\n        <some_field>: <some_value>,\n        \"STATION_TYPE_CD\": <station_type_value1>\n    }},\n    <key2>: {{\n        <some_other_field>: <some_other_value>,\n        \"STATION_TYPE_CD\": <station_type_value2>\n    }},\n...\n}}\n```\nTherefore, in this step, identify which columns are contained in maps since you will later need to use LATERAL VIEW EXPLODE to access the values of these maps.\n### Step 5: Plan the SQL Query  \nExplain, step by step, how you would write the SQL query to answer the question `{question}`, using the columns you identified. \n**Use the full qualified names of the columns**. **DO NOT USE THE `JSON_OBJECT` FUNCTION IN THE QUERY**.\nSome columns are contained in map structures. You can access the fields of the map using LATERAL VIEW EXPLODE. Do not use UNNEST to access the fields of the map.\nIn particular, you can create a temporary table with the exploded map and then query it. For instance, if you need to get the value of the `ABC.CDE.map.field` column, you should use the following SQL code to create a temporary table with the exploded map data and get the value of the field:\n```sql\nWITH exploded_map AS (\n  SELECT key, value.field_1, value.field_2, value.field_3  -- Select here all the columns/fields you will use later. 
\n  FROM <table_name>\n  LATERAL VIEW EXPLODE(ABC.CDE) AS key, value\n)\nSELECT exploded_map.field_1\nFROM exploded_map\n``` \nThis is another example:\n```sql\n  WITH exploded_map AS (\n  SELECT DATE, ID, CLASS_ID, RECORD.GROUP, value.CODE  -- Select here all the columns/fields you will use later.\n    FROM CBD_HGU_Detail_Daily_Aura_v10 LATERAL VIEW EXPLODE(STATIONS_DETAIL_REC.UNQ_STATION_MAP) AS key, value) \n  SELECT COUNT(DISTINCT ID) AS num_homes \n  FROM exploded_map JOIN D_Segment_v8 ON exploded_map.CLASS_ID = D_Segment_v8.CLASS_ID \n    WHERE DATE BETWEEN '2024-01-01' AND '2024-02-01' \n      AND D_Segment_v8.DESCRIPTION = 'DESCRIPTION value' \n      AND exploded_map.CODE = 'CODE value'    \n```\nHere is another example. If you need to count the number of elements in a map column named 'ABC.map' you should use code like this:\n```sql\nWITH exploded_map AS (\n  SELECT key_from_exploded_map\n  FROM <table_name>\n  LATERAL VIEW EXPLODE(ABC) AS key_from_exploded_map, value_from_exploded_map\n)\nSELECT COUNT(key_from_exploded_map)\nFROM exploded_map\n```\nTake into account that all map fields are named with the suffix `_MAP`. Take into account that you can apply the EXPLODE operation only to fields that are maps. Therefore, you should use the EXPLODE operation only on fields that end with `_MAP`.\nTo finish this step, explain how you would write the SQL query to answer the question, using the columns you identified, taking into account the previous considerations for columns contained in maps, if there are any.\n### Step 6: Write the final SQL Query and apply the rules\nFinally, write the SQL query to answer the question `{question}`, using the columns you identified. 
\nRemarks:\n**DO NOT USE THE `JSON_OBJECT` FUNCTION IN THE QUERY**.\n**IMPORTANT: The keys in the exploded maps should not be used in JOIN operations, since they are just internal keys to the map structure.**\nCheck if you need to use any of the following **business rules** to build the query:\n```json\n{{\n  \"rules\": [\n    {{\n      \"id\": \"B1\",\n      \"name\": \"Fiction\",\n      \"rule\": \"If you need to look for tariff plans including \"ficción\" contents, you will need to look for one of the following patterns in the `TARIFF_PLAN_DES` field: '%FICCION%', '%FICCIÓN%', '%SERIES%', '%CINE%', '%FUSIÓN TOTAL%', '%FUSION TOTAL%'. To make the proper comparison, you should compare using uppercase letters. For instance, use a filter like this one: `UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%FICCION%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%FICCIÓN%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%SERIES%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%CINE%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%FUSIÓN TOTAL%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%FUSION TOTAL%'`\n\"\n    }},\n    {{\n      \"id\": \"B2\",\n      \"name\": \"Disney\",\n      \"rule\": \"If you need to look for tariff plans including \"Disney\" contents, you will need to look for one of the following patterns in the `TARIFF_PLAN_DES` field: '%DISNEY%'.  To make the proper comparison, you should compare using uppercase letters. For instance, use a filter like this one: `UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%DISNEY%'`\n\"\n    }},\n    {{\n      \"id\": \"B3\",\n      \"name\": \"Football\",\n      \"rule\": \"If you need to look for tariff plans including football contents, you will need to look for one of the following patterns in the `TARIFF_PLAN_DES` field: '%FUTBOL%', '%FÚTBOL%', '%FUSION TOTAL%', '%FUSIÓN TOTAL%',  '%FUSION TA TOTAL%', '%FUSIÓN TA TOTAL%', '%LIGA%', '%CHAMPION%'. To make the proper comparison, you should compare using uppercase letters. 
For instance, use a filter like this one:  `UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%FUTBOL%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%FÚTBOL%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%FUSION TOTAL%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%FUSIÓN TOTAL%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%LIGA%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%CHAMPION%'`\n\"\n    }},\n    {{\n      \"id\": \"B4\",\n      \"name\": \"Netflix\",\n      \"rule\": \"If you need to look for tariff plans including \"Netflix\" contents, you will need to look for one of the following patterns in the `TARIFF_PLAN_DES` field: '%NETFLIX%', '%FICCIÓN%', '%FICCION%'. To make the proper comparison, you should compare using uppercase letters. For instance, use a filter like this one: `UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%NETFLIX%'`\n\"\n    }},\n    {{\n      \"id\": \"B5\",\n      \"name\": \"Promotions\",\n      \"rule\": \"If you need to look for tariff plans including \"promotions\", you will need to look for one of the following patterns in the `TARIFF_PLAN_DES` field: '%PROMO%'. To make the proper comparison, you should compare using uppercase letters. For instance, use a filter like this one: `UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%PROMO%'`\n\"\n    }},\n    {{\n      \"id\": \"B6\",\n      \"name\": \"Average age 1\",\n      \"rule\": \"You are not allowed to use the field `CBD_INFO_REC.CUST_AGE_NUM` in any query. 
You should use the field `CBD_INFO_REC.CUST_AGE_SEGMENT_CD` instead.\n\"\n    }},\n    {{\n      \"id\": \"B7\",\n      \"name\": \"Average age 2\",\n      \"rule\": \"If you need to calculate the average age of customers you should use the following calculation instead of AVG(CBD_INFO_REC.CUST_AGE_SEGMENT_CD): AVG(IF(CBD_INFO_REC.CUST_AGE_SEGMENT_CD = '1', NULL, CBD_INFO_REC.CUST_AGE_SEGMENT_CD))\n\"\n    }},\n    {{\n      \"id\": \"B8\",\n      \"name\": \"Query by customers\",\n      \"rule\": \"If you need to query by customers: if the time scope of the query is daily or weekly then you should use the `DEVICE_ID` field. If the time scope of the query is monthly or longer then you should use the `CUSTOMER_ID` field.\n\"\n    }},\n    {{\n      \"id\": \"B9\",\n      \"name\": \"Station type\",\n      \"rule\": \"The field `STATION_TYPE_L2` corresponds to a higher aggregation level than `STATION_TYPE_L1`.  `STATION_TYPE_L1` corresponds to an intermediate category, used only for analytical purposes.\n\"\n    }},\n    {{\n      \"id\": \"B10\",\n      \"name\": \"Active devices\",\n      \"rule\": \"If you need to check whether a device is active on a given date, you should use this check: `DEVICE_INFO_REC.INACTIVITY_DEVICE_INFO_NUM < 24`. If true, the device is active. 
If false, the device is inactive.\n\"\n    }},\n    {{\n      \"id\": \"B11\",\n      \"name\": \"Product penetration\",\n      \"rule\": \"If you are asked to calculate \"la penetración de un producto\" you should calculate the percentage of customers with that product.\n\"\n    }},\n    {{\n      \"id\": \"B12\",\n      \"name\": \"Obsolete routers\",\n      \"rule\": \"If you are asked for obsolete routers, you should check for those with MANUFACT_HGU_CHIPSET_DES IN ('Askey Broadcom', 'Askey Econet','MitraStar Broadcom', 'MitraStar Econet').\n\"\n    }},\n    {{\n      \"id\": \"B13\",\n      \"name\": \"High value customers\",\n      \"rule\": \"Consider as high value customers those with a monthly revenue higher than 100 (TOTAL_CUST_RV > 100).\n\"\n    }},\n    {{\n      \"id\": \"B14.1\",\n      \"name\": \"Technological level formula\",\n      \"rule\": \"If you need to check the technological level of a customer, use the following formula on the field `TECH_LEVEL_WEIGHT_QT` of the table `D_CBD_STATIC_STATION_TYPE_v6`: `SUM(COALESCE(D_CBD_STATIC_STATION_TYPE_v6.TECH_LEVEL_WEIGHT_QT,0) + CASE WHEN AMM.VALUE.STATION_BRAND_DES = 'Ubiquiti' THEN 0.8 ELSE 0 END)/COUNT(DISTINCT DAY_DT)`\n\"\n    }},\n    {{\n      \"id\": \"B14.2\",\n      \"name\": \"Technological levels\",\n      \"rule\": \"Consider as **high technological level** customers those with a value higher than or equal to 2.5. Consider as **medium technological level** customers those with a value higher than or equal to 1 and lower than 2.5. Consider as **low technological level** customers those with a value lower than 1.\n\"\n    }},\n    {{\n      \"id\": \"B15\",\n      \"name\": \"Sport\",\n      \"rule\": \"If you need to look for tariff plans including \"sport\" contents, you will need to look for one of the following patterns in the `TARIFF_PLAN_DES` field: '%DEPORTE%', '%TOTAL PLUS%', '%TOTAL SAT%PLUS%', '%MOTOR%', '%DAZN%'. 
To make the proper comparison, you should compare using uppercase letters. For instance, use a filter like this one: `(UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%DEPORTE%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%TOTAL PLUS%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%TOTAL SAT%PLUS%' -- Added to include the \"Total Satelite/Satélite Plus\" plans OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%MOTOR%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%DAZN%')`\n\"\n    }},\n    {{\n      \"id\": \"R1\",\n      \"name\": \"Temporary table fields\",\n      \"rule\": \"When you use in a filter a given field from a temporary table built using the `WITH` clause, make sure that the field is actually present in the SELECT statement defining the temporary table.\n\"\n    }},\n    {{\n      \"id\": \"R2\",\n      \"name\": \"Temporary table field naming\",\n      \"rule\": \"Example: If you write a temporary table like this: `WITH temp_table AS (SELECT field1_prefix.field1 FROM table)`, then you should refer to the field as `field1` and not as `field1_prefix.field1` in the rest of the query.\n\"\n    }},\n    {{\n      \"id\": \"R3\",\n      \"name\": \"Tariff plan\",\n      \"rule\": \"If you need to look for some specific tariffs, use the field `TARIFF_PLAN_DES` from the dimensional table D_Fixed_Tariff_Plan instead of using `CBD_INFO_REC.COMMERCIAL_TARIFF_ID`, since the latter only contains identifiers without any meaning.\n\"\n    }},\n    {{\n      \"id\": \"R4.1\",\n      \"name\": \"Station type 1\",\n      \"rule\": \"If the query uses `D_CBD_Static_Station_Type_v6.STATION_TYPE_L1_DES` or `D_CBD_Static_Station_Type_v6.STATION_TYPE_L2_DES` answer this question: does the value you are looking for match one of the possible values of these fields? Justify your answer. 
Enumerate the possible values of these fields if they are used.\n\"\n    }},\n    {{\n      \"id\": \"R4.2\",\n      \"name\": \"Station type 2\",\n      \"rule\": \"Apply this rule if the query uses a filter with the field `D_CBD_Static_Station_Type_v6.STATION_TYPE_L1_DES` or `D_CBD_Static_Station_Type_v6.STATION_TYPE_L2_DES` and the value you are looking for does not match any of the possible values of these fields. In this case, you should use the field `STATION_TYPE_CD` instead. Write the result of the previous reasoning in detail.  REMEMBER TO FIX THE QUERY TO USE THE FIELD `STATION_TYPE_CD` INSTEAD.\n\"\n    }},\n    {{\n      \"id\": \"R5\",\n      \"name\": \"Counting entities\",\n      \"rule\": \"If you need to count the number of customers, homes, devices or any other entities, you should ensure that you are actually counting distinct entities. Therefore you should use the `COUNT(DISTINCT ...)` function instead of `COUNT(...)`.\n\"\n    }},\n    {{\n      \"id\": \"R6\",\n      \"name\": \"Time scope less than a month\",\n      \"rule\": \"If you are asked to answer a question for a time scope shorter than a month (daily or weekly) you must not use the field `MONTH_DT` in your query.\n\"\n    }},\n    {{\n      \"id\": \"R7\",\n      \"name\": \"No UNION operator\",\n      \"rule\": \"Avoid using the UNION operator in your queries.\n\"\n    }},\n    {{\n      \"id\": \"R8\",\n      \"name\": \"Counting entities\",\n      \"rule\": \"If you are asked to count the number of customers, homes, devices or any other entities, you should ensure that the result is actually a count and not a list of elements. 
Therefore you should use the COUNT function.\n\"\n    }},\n    {{\n      \"id\": \"R9\",\n      \"name\": \"IoT devices\",\n      \"rule\": \"If you need to look for IoT (Internet of Things) devices, you should look for devices with `STATION_TYPE_L2_DES = 'Smart Home'`\n\"\n    }},\n    {{\n      \"id\": \"R10\",\n      \"name\": \"Router model\",\n      \"rule\": \"If you need to check the model of the router, you should use the field `MANUFACT_HGU_CHIPSET_DES` (do not use other fields such as `MANUFACTURER_FW_VER_DES`).\n\"\n    }},\n    {{\n      \"id\": \"R11\",\n      \"name\": \"Weekly period\",\n      \"rule\": \"If you need to query data from weekly period, you should start always with the first day of the week (Monday) and end with the last day of the week (Sunday).\n\"\n    }},\n    {{\n      \"id\": \"R12\",\n      \"name\": \"WiFi type\",\n      \"rule\": \"If you need to look for information on a specific WiFi type, such as 2.4 GHz or 5 GHz, you should use the specific fields corresponding to these types.  For instance, if you need to look for WiFi5 device information, you should not use the field `STATIONS_REC.WIFI_REC.ALL_TECH_REC` but the field `STATIONS_REC.WIFI_REC.TECH_5G_REC`.\n\"\n    }},\n    {{\n      \"id\": \"R13\",\n      \"name\": \"Equivalent terms for WiFi technologies\",\n      \"rule\": \"The following terms are considered equivalent: \n- `WiFi 5G`, `WiFi Technology 5G`, `WiFi5`.\n- `WiFi 2.4G`, `WiFi Technology 2.4G`, `WiFi2.4` , `WiFi2`, `WiFi Technology 2G`, `WiFi 2G`.\n\"\n    }},\n    {{\n      \"id\": \"R14\",\n      \"name\": \"Customer Satisfaction Index\",\n      \"rule\": \"The field `CSI_QT` contains the `Customer Satisfaction Index` value. It is not a quality value but a satisfaction value.  
Do not confuse it with Quality Index fields.\n\"\n    }},\n    {{\n      \"id\": \"R15\",\n      \"name\": \"Active HGU devices\",\n      \"rule\": \"The field `CUST_HGU_DEVICES_NUM` contains the number of active HGU devices of the customer, i.e. the number of active routers (HGUs) of the customer.  Do not confuse it with the number of active devices of the customer.\n\"\n    }},\n    {{\n      \"id\": \"R16\",\n      \"name\": \"Megabytes\",\n      \"rule\": \"The fields starting with `MB_` or containing `_MB_` in their name refer to Megabytes. Take this into account during your queries.\n\"\n    }},\n    {{\n      \"id\": \"R17\",\n      \"name\": \"Gigabytes\",\n      \"rule\": \"The fields starting with `GB_` or containing `_GB_` in their name refer to Gigabytes. Take this into account during your queries.\n\"\n    }},\n    {{\n      \"id\": \"R18\",\n      \"name\": \"RSSI meaning\",\n      \"rule\": \"The field `RSSI` refers to the `Received Signal Strength Indicator`. It is a measure of the power present in a received radio signal.\n\"\n    }},\n    {{\n      \"id\": \"R19\",\n      \"name\": \"Checking absence of a device\",\n      \"rule\": \"If you need to look for homes without a specific type of device, do not forget to check at least one of the following fields: `STATION_TYPE_L1_DES`, `STATION_TYPE_L2_DES`, `STATION_TYPE_CD`. In other words, you need an explicit filter checking the absence of the device.\n\"\n    }}\n  ]\n}}\n```\nExplain whether you can apply any of the rules and explain how you would apply them in the SQL query.\nAlways write your result following these steps:\n1. SQL query to answer the question `{question}`: <write the SQL query here>\n2. Reasoning: <explain why you wrote the query like that>\n3. Check of the rules, RULE BY RULE and FOR EACH RULE (one entry per rule): <write ALL the rules and tell if they are applied or not>. 
Follow this format:\n- <rule1>: Should be applied, because <reason> | Should not be applied, because <reason>\n- <rule2>: Should be applied, because <reason> | Should not be applied, because <reason>\n...\n4. Result of the execution of the rules that have been identified to be applied. Follow this format:\n- <rule1>: <result>\n- <rule2>: <result>\n...\n5. Need to fix the query because <reason>. The following changes are needed: <change_1>, <change 2>, etc. | The query is already correct.\n6. SQL query to answer the question `{question}` after considering the previous **rules**: <write the SQL query here>. FIX THE QUERY IF NECESSARY.\n### Step 7: Check that the query actually can answer the question\nCheck again if the generated query answers the question `{question}`.\nFollow these steps:\n1. Write the concepts involved in the question. Enumerate the concepts as a list. Follow this format:\n - <concept1>\n - <concept2>\n ...\n2. Write all the concepts of the question that are covered by the SQL query. Enumerate them and create a match list with the concepts from the previous step. Write down the part of the SQL query covering the concept. Take into account that conditions on specific proper names, such as model names, location names, etc, need to be explicitly checked. Follow this format:\n - <concept1>: covered in <sql query section> or not covered.\n - <concept2>: covered in <sql query section> or not covered.\n3. Find those concepts in the question that are not covered by the SQL query.\n4. Conclude whether the question can actually be answered by the generated query. Follow this format:\n - The question can be answered by the SQL query: <Yes|No>\n### Step 8: Create the result as a JSON object\nReturn the result as a unique JSON object, with the following structure:\n{{\n  \"result\": <Write the SQL query here. **MAKE SURE THAT THE STATEMENT `SELECT JSON_OBJECT` is not used in the query and Use the full qualified names of the columns. 
Generate a valid SQL sentence in a single line without new line characters.**>,\n  \"status\": \"OK\",\n  \"reason\": <a reasoning explaining the query>\n}}\nIf the former table does not contain the necessary data to answer the question, return the following JSON object:\n{{\n  \"result\": null,\n  \"status\": \"ERROR\",\n  \"reason\": <a reasoning explaining why it is not possible to answer the question>\n}}\nMake sure that the JSON object is correctly formatted, and can be parsed by a JSON parser.\n**Please, ALWAYS follow the 8 steps presented in the instructions.** Start reasoning with ### Step 1 and finish with ### Step 8.\n{% endraw %}\"\"en\": \"{% raw %}\nGenerate a SQL query statement to answer the following question:\n`{question}`\n    \nUse the data contained in the following table. You have its definition in SQL and in Avro.\n{sql_table_definition}\n    \n    \nThe following tables, containing auxiliary information, are also available:\n```sql\nCREATE TABLE D_CBD_Static_Geo_Area_v6 (GEO_AREA_ID VARCHAR, CBD_GEO_AREA_LEVEL1_ID VARCHAR, CBD_GEO_AREA_LEVEL2_ID VARCHAR, CBD_GEO_AREA_LEVEL3_ID VARCHAR, CBD_GEO_AREA_LEVEL4_ID VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);\n    COMMENT ON TABLE D_CBD_Static_Geo_Area IS 'Geographical areas. This table contains foreign keys to the different levels of geographical areas. In particular, it contains the foreign keys to these tables: CBD_Static_Geo_Area_Level1, CBD_Static_Geo_Area_Level2, CBD_Static_Geo_Area_Level3, CBD_Static_Geo_Area_Level4. Therefore, this tables is used, via JOIN, to query the geographical information contained in the different levels of geographical areas. 
For instance, if you have a table T with a field GEO_AREA_ID and you need to check whether this location corresponds to the region of Asturias you will need to look for GEO_AREA_ID in this table, then extract the CBD_GEO_AREA_LEVEL4_ID and query the table CBD_Static_Geo_Area_Level4 to get the name of the region.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area.GEO_AREA_ID IS 'Description: Identifier of the geographical area assigned to the customer (typically the geographical area of the customer home). This identifier is a string code which values are defined in ''D_Geographical_Area'' entity. Format: alphanumeric string. Example values: ''2800983CE'', ''50059'', ''3101142CE''';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL1_ID IS 'Identifier of the geographical area Level 1 (max level of detail: CP or similar). FORMAT: string containing a numerical code. This field does not contain location names.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL2_ID IS 'Identifier of the geographical area Level 2 (City/Town). FORMAT: string containing a numerical code. This field does not contain location names.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL3_ID IS 'Identifier of the geographical area Level 3 (Province). FORMAT: string containing a numerical code. This field does not contain location names.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL4_ID IS 'Identifier of the geographical area Level 4 (State/Region). FORMAT: string containing a numerical code. 
This field does not contain location names.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area.EXTRACTION_TM IS 'Date-time of the record';\n    \nCREATE TABLE D_CBD_Static_Geo_Area_Level2_v6 (CBD_GEO_AREA_LEVEL2_ID VARCHAR, GEO_AREA_LEVEL_DES VARCHAR, CBD_GEO_AREA_LEVEL3_ID VARCHAR, LONGITUDE_LON_CO DOUBLE, LATITUDE_LAT_CO DOUBLE, GEO_AREA_ID VARCHAR, GEO_STD_AREA_CD VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);\n    COMMENT ON TABLE D_CBD_Static_Geo_Area_Level2 IS 'Geographical area level 2 (State)';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.CBD_GEO_AREA_LEVEL2_ID IS 'Identifier of the geographical area Level 2 (City/Town). FORMAT: string containing a numerical code.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.GEO_AREA_LEVEL_DES IS 'Description associated to the identifier level 2. FORMAT: alphanumeric string containing the name of the city/town.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.CBD_GEO_AREA_LEVEL3_ID IS 'Identifier of the geographical area Level 3 (Province)';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.LONGITUDE_LON_CO IS 'Longitude coordinates (in WGS84) associated with level 2';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.LATITUDE_LAT_CO IS 'Latitude coordinates (in WGS84) associated with level 2';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.GEO_AREA_ID IS 'Description: Identifier of the geographical area assigned to the customer (typically the geographical area of the customer home). This identifier is a string code which values are defined in ''D_Geographical_Area'' entity. Format: alphanumeric string. 
Example values: ''2800983CE'', ''50059'', ''3101142CE''';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.GEO_STD_AREA_CD IS 'Standard code of the geo area';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.EXTRACTION_TM IS 'Date-time of the record';\n    \nCREATE TABLE D_CBD_Static_Geo_Area_Level3_v6 (CBD_GEO_AREA_LEVEL3_ID VARCHAR, GEO_AREA_LEVEL_DES VARCHAR, CBD_GEO_AREA_LEVEL4_ID VARCHAR, LONGITUDE_LON_CO DOUBLE, LATITUDE_LAT_CO DOUBLE, ISO_3166_2_CD VARCHAR, GEO_AREA_ID VARCHAR, GEO_STD_AREA_CD VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);\n    COMMENT ON TABLE D_CBD_Static_Geo_Area_Level3 IS 'Geographical area level 3 (Region)';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.CBD_GEO_AREA_LEVEL3_ID IS 'Identifier of the geographical area Level 3 (Province). FORMAT: string containing a numerical code.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.GEO_AREA_LEVEL_DES IS 'Description associated to the identifier level 3. FORMAT: alphanumeric string containing the name of the province.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.CBD_GEO_AREA_LEVEL4_ID IS 'Identifier of the geographical area Level 4 (State/Region). FORMAT: string containing a numerical code.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.LONGITUDE_LON_CO IS 'Longitude coordinates (in WGS84) associated with level 3';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.LATITUDE_LAT_CO IS 'Latitude coordinates (in WGS84) associated with level 3';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.ISO_3166_2_CD IS 'ISO 3166-2 associated';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.GEO_AREA_ID IS 'Description: Identifier of the geographical area assigned to the customer (typically the geographical area of the customer home). This identifier is a string code which values are defined in ''D_Geographical_Area'' entity. 
Format: alphanumeric string. Example values: ''2800983CE'', ''50059'', ''3101142CE''';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.GEO_STD_AREA_CD IS 'Standard code of the geo area';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.EXTRACTION_TM IS 'Date-time of the record';\n    \nCREATE TABLE D_CBD_Static_Geo_Area_Level4_v6 (CBD_GEO_AREA_LEVEL4_ID VARCHAR, GEO_AREA_LEVEL_DES VARCHAR, LONGITUDE_LON_CO DOUBLE, LATITUDE_LAT_CO DOUBLE, HASC_1_CD VARCHAR, GEO_AREA_ID VARCHAR, GEO_STD_AREA_CD VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);\n    COMMENT ON TABLE D_CBD_Static_Geo_Area_Level4 IS 'Geographical area level 4 (min. Detail)';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.CBD_GEO_AREA_LEVEL4_ID IS 'Identifier of the geographical area Level 4 (State/Region). FORMAT: string containing a numerical code.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.GEO_AREA_LEVEL_DES IS 'Description associated to the identifier level 4. FORMAT: alphanumerical string containing the name of the state/region. EXAMPLE VALUES: ''Asturias'', ''Andaluc\u00eda'', etc.';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.LONGITUDE_LON_CO IS 'Longitude coordinates (in WGS84) associated with level 4';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.LATITUDE_LAT_CO IS 'Latitude coordinates (in WGS84) associated with level 4';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.HASC_1_CD IS 'Hierarchical administrative subdivision codes ';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.GEO_AREA_ID IS 'Description: Identifier of the geographical area assigned to the customer (typically the geographical area of the customer home). This identifier is a string code which values are defined in ''D_Geographical_Area'' entity. Format: alphanumeric string. 
Example values: ''2800983CE'', ''50059'', ''3101142CE''';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.GEO_STD_AREA_CD IS 'Standard code of the geo area';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';\n    COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.EXTRACTION_TM IS 'Date-time of the record';\n    \nCREATE TABLE D_CBD_Static_Station_Type_v6 (STATION_TYPE_CD VARCHAR, TECH_LEVEL_WEIGHT_QT FLOAT, STATION_TYPE_L2_DES VARCHAR, STATION_TYPE_L1_DES VARCHAR, STATION_TYPE_L2_ORDER_NUM INT, STATION_TYPE_L1_ORDER_NUM INT, STATION_TYPE_ORDER_NUM INT, CONSCIOUS_IND BOOLEAN, EXTRACTION_TM VARCHAR);\n    COMMENT ON TABLE D_CBD_Static_Station_Type IS 'Station types';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_CD IS 'Description: Type of device connected to the HGU router. It used to find out which devices are connected to routers in households. Format: String. Example values: \"A/V Equipment\", \"Air Conditioning\", \"Air Conditioning Control\", \"Apple Handheld Device\", \"Apple Home Device\", \"AudioCast\", \"Audiocast\", \"Barcode Printer\", \"Camera\", \"Car Dash Cam\", \"Cryptominner\", \"Digital Clock\", \"Dishwasher\", \"Drone Equipment\", \"GPS\", \"Gaming Console\", \"Hyper Media Player\", \"IP Camera\", \"IPC Hub\", \"IPC Video Recorder\", \"IoT Device\", \"Key Cutting Machine\", \"Media Center\", \"Monitoring Device\", \"Multimedia Player\", \"Network Access Point\", \"Network Equipment\", \"PC\", \"PDA\", \"PIR Sensor\", \"Print Server\", \"Printer\", \"Projector\", \"Raspberry\", \"Router\", \"Security System\", \"Smart AC Control\", \"Smart Air Freshener\", \"Smart Air Fryer\", \"Smart Air Ventilator\", \"Smart Animal Feeder\", \"Smart Baby Monitor\", \"Smart Blind\", \"Smart Bulb\", \"Smart Bulb Adapter\", \"Smart Car\", \"Smart Car e-Charger\", \"Smart Display e-bike\", \"Smart Energy Analyzer\", \"Smart Home Controller\", \"Smart Home Hub\", \"Smart Humidifier\", \"Smart 
Hydrometer Clock\", \"Smart Kitchen Appliances\", \"Smart Kitchen Scale\", \"Smart Lamp\", \"Smart Light Dimmer\", \"Smart Lock Control\", \"Smart Plug\", \"Smart Pool\", \"Smart Power Strip\", \"Smart Purifier\", \"Smart Scale\", \"Smart Signage\", \"Smart Speaker\", \"Smart Switch\", \"Smart TV\", \"Smart Thermostat\", \"Smart Toothbrush\", \"Smart Vacuum\", \"Smart WallSocket\", \"Smart Watch\", \"Smart Watch Fit\", \"Smart WifiButton\", \"Smartphone\", \"Smartphone/Tablet\", \"Smartwatch\", \"Smartwatch Fit\", \"Solar Panel Equipment\", \"Soundbar\", \"Steam Controller\", \"Storage Device\", \"TPV\", \"TV Dongle\", \"Tablet\", \"Tempest Weather System\", \"UPS\", \"VR/AR Headset\", \"Video Doorbell\", \"Video Intercom\", \"Video STB Equipment\", \"VideointercomIP\", \"Virtual Desktop\", \"VoIP Phone\", \"WAN Extender\", \"WiFi Extender\", \"Wifi Dongle\", \"Wireless Blood Pressure Monitor\", \"Wireless Bridge\", \"Wireless Headphones\", \"Wireless Router + VoIP Series\", \"e-Note\", \"eBook\"';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.TECH_LEVEL_WEIGHT_QT IS 'Associated weight for the technologic level of the home';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L2_DES IS 'Description: Higher level device type grouping. Example values: \"PCs & Home Office\", \"Smartphones / Tablets / eReaders / iWatch\", \"Multimedia Entertainment\", \"Gaming\", \"Sport & Health\", \"Smart Home\", \"Unknown\", \"Network Devices\", \"Security & Control\"';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L1_DES IS 'Description: Intermediate level device type grouping. 
Example values: \"Smart Speakers & Audio\", \"PCs & Home Office\", \"Video Entertainment\", \"Domestic Appliances\", \"Smart Energy & Lighting\", \"Apple Handheld Device\", \"Smartphones / Tablets / eReaders\", \"Gaming\", \"Sport & Health\", \"Network Devices\", \"Security & Control\", \"IoT\"';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L2_ORDER_NUM IS 'Station type order level 2';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L1_ORDER_NUM IS 'Station type order level 1';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_ORDER_NUM IS 'Station type order';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.CONSCIOUS_IND IS 'Indicates if the related device type has energy efficiency';\n    COMMENT ON COLUMN D_CBD_Static_Station_Type.EXTRACTION_TM IS 'Date-time of the record';\n    \nCREATE TABLE D_Segment_v8 (OPERATOR_ID VARCHAR, SEGMENT_ID VARCHAR, SEGMENT_DES VARCHAR, GBL_SEGMENT_ID VARCHAR, SEGMENT_GROUP_ID VARCHAR, SEGMENT_GROUP_DES VARCHAR, EXTRACTION_TM VARCHAR);\n    COMMENT ON TABLE D_Segment IS 'Classifications of the customers, attending to different segmentation criteria, for marketing and management issues, according to OB criteria and its correspondence with the global segment classification';\n    COMMENT ON COLUMN D_Segment.OPERATOR_ID IS 'Global Operator Identifier (Operator acting as owner of the information present in the current entity)';\n    COMMENT ON COLUMN D_Segment.SEGMENT_ID IS 'Description: Organisational segment of the client. Format: two letter string. Possible values: ''NT'' - NTT, ''GP'' - Residencial, ''PE'' - Pymes, ''RE'' - Residencial/SC, ''AU'' - Autonomos, ''OP'' - Operadores, ''GC'' - Grandes Clientes, ''RP'' - Residencial Prepago, ''TE'' - Telefonica, ''SC'' - Sin Clasificar, ''ME'' - Empresas';\n    COMMENT ON COLUMN D_Segment.SEGMENT_DES IS 'Description: Name or description of the organisational segment of the client (provides the description for each segment identifier). 
Format: string. Example values: ''Residencial'',  ''Pymes'', ''Autonomos'', ''Operadores'', ''Grandes Clientes'', ''Sin Clasificar''';\n    COMMENT ON COLUMN D_Segment.GBL_SEGMENT_ID IS 'ID of the global segment classification';\n    COMMENT ON COLUMN D_Segment.SEGMENT_GROUP_ID IS 'ID code of the segmentation group';\n    COMMENT ON COLUMN D_Segment.SEGMENT_GROUP_DES IS 'Description of the segmentation group';\n    COMMENT ON COLUMN D_Segment.EXTRACTION_TM IS 'Date-time of the record';\nCREATE TABLE D_Fixed_Tariff_Plan_v8 (OPERATOR_ID VARCHAR, DAY_DT VARCHAR, TARIFF_PLAN_ID VARCHAR, TARIFF_PLAN_DES VARCHAR, VOICE_IND BOOLEAN, BBAND_IND BOOLEAN, TV_IND BOOLEAN, WORKSTATION_IND BOOLEAN, APP_IND BOOLEAN, VOICE_BUNDLE_QT FLOAT, BBAND_UP_SPEED_QT FLOAT, BBAND_DOWN_SPEED_QT FLOAT, TV_TYPE_CD VARCHAR, FIXED_SERVICE_COMMERCIAL_NAME VARCHAR, COMMERCIAL_IND BOOLEAN, TARIFF_PLAN_START_DT VARCHAR, TARIFF_PLAN_END_DT VARCHAR, CONVERGENT_IND BOOLEAN, BRAND_ID VARCHAR);\n    COMMENT ON TABLE D_Fixed_Tariff_Plan_v8 IS 'Every fixed Tariff to be applied, either Commercial, Convergent, Individual, or any other, for any product&service for the fixed client base';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.OPERATOR_ID IS 'Global Operator Identifier (Operator acting as owner of the information present in the current entity)';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.DAY_DT IS 'Year, month and day of the data  ### Additional Information  Format: YYYYMMDD (4 digits for year, months from 01 to 12, days from 01 to 31).';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TARIFF_PLAN_ID IS 'Unique identifier of the tariff plan';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TARIFF_PLAN_DES IS 'Name/short description of the tariff plan';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.VOICE_IND IS 'Indicates whether the line has a fixed line voice service associated.  
Values: 0=No; 1=Yes.';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.BBAND_IND IS 'Indicates whether the line has a Broadband service associated.  Values: 0=No; 1=Yes.';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TV_IND IS 'Indicates if the line has a TV service associated.  Values: 0=No; 1=Yes.';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.WORKSTATION_IND IS 'Indicates if the line has a workstation service associated.  Values: 0=No; 1=Yes.';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.APP_IND IS 'Indicates if the line has the \"Aplicateca service\" associated.  Values: 0=No; 1=Yes.';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.VOICE_BUNDLE_QT IS 'Amount of data associated with the voice bundle';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.BBAND_UP_SPEED_QT IS 'Broadband up speed (Mbps)';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.BBAND_DOWN_SPEED_QT IS 'Broadband down speed (Mbps)';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TV_TYPE_CD IS 'Type of TV line';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.FIXED_SERVICE_COMMERCIAL_NAME IS 'Commercial name of the service';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.COMMERCIAL_IND IS 'Indicates if TARIFF_PLAN_ID refers to the COMMERCIAL_TARIFF_ID.    
Fill-in with 1 if TARIFF_PLAN_ID refers to the COMMERCIAL_TARIFF_ID or 0 if it doesn''t    0 = Non commercial tariff  1 = commercial tariff';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TARIFF_PLAN_START_DT IS 'Start date of the tariff plan validity (that day is the first day when the tariff plan is applicable)  ### Additional Information  Format: YYYYMMDD (4 digits for year, months from 01 to 12, days from 01 to 31).';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TARIFF_PLAN_END_DT IS 'End date of the tariff plan validity (that day is the last day when the tariff plan is applicable)  ### Additional Information  Format: YYYYMMDD (4 digits for year, months from 01 to 12, days from 01 to 31).';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.CONVERGENT_IND IS 'Flag indicating if the current fixed tariff plan can be configured as a \"Convergent tariff plan\", i. e., a plan with special conditions due to the fact of including at least one Fixed line/service and one Mobile line.   0 = No (the plan can''t be configured as convergent)   1 = Yes (the plan can be configured as convergent)';\n    COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.BRAND_ID IS 'Commercial brand identifier. In order to differentiate among different brands in the same OB (e.g. Movistar, O2, Tuenti...)';\n```\nSome of the former tables contain columns in full-qualified format. For instance, these are some examples of full-qualified columns:\n```\nrecord_name.field_name\nTEC_PLAT_REC.DEVICE_ID\nrecord_name.subrecord_name.field_name\nTEC_PLAT_REC.TEC_PLAT_SUBCOMP_REC.DEVICE_ID\n...\n```\nAlways use the full-qualified format when referring to columns in the tables. For instance, if you need to use the column 'TEC_PLAT_REC.DEVICE_ID', you should not refer to it as 'DEVICE_ID', but as 'TEC_PLAT_REC.DEVICE_ID'. \n**Explain in detail, step by step, all your decisions**. 
\n# General instructions\nFollow these reasoning steps to generate the SQL query:\n- Step 1: Identify Necessary Tables\n- Step 2: Identify Useful Candidate Columns\n- Step 3: Assess if Tables and Columns are Sufficient to Answer the Question\n- Step 4: Identify Columns Contained in Maps\n- Step 5: Plan the SQL Query\n- Step 6: Write the final SQL Query and apply the rules\n- Step 7: Check that the query actually can answer the question\n- Step 8: Create the result as a JSON object \nIf you need to filter by a higher level geographical such as a region (Comunidad Autónoma) you will need to:\n- join the `GEO_AREA_ID` field of the data table (such as `CBD_HGU_Detail_Daily_v10`) with the `GEO_AREA_ID` field in `D_CBD_Static_Geo_Area_v6` table\n- then join the `CBD_GEO_AREA_LEVEL4_ID` field in the `D_CBD_Static_Geo_Area_v6` with the `CBD_GEO_AREA_LEVEL4_ID` field in the `D_CBD_Static_Geo_Area_Level4_v6` table   \n- then compare the `GEO_AREA_LEVEL_DES` field in the `D_CBD_Static_Geo_Area_Level4_v6` table with the name of the region (e.g., 'Cantabria'), since the DESCRIPTION field does contain the actual name of the geographical area.\n**Only perform these joins if explicit filtering or grouping by geographical location is necessary**. \n# Detailed instructions\n### Step 1: Identify Necessary Tables\nFirst, identify which tables are necessary to answer the question `{question}`. Justify why you selected each of these tables. \nUse the following format:\n```\nI need the following tables to answer the question:\n- <table_name>: <reasoning>\n- <table_name>: <reasoning>\n...\n```\n### Step 2: Identify Useful Candidate Columns\nIdentify which columns are useful to answer the question `{question}`. Justify why you selected each of these columns.\nAlways include any column you think may be needed to answer the question. If there are similar columns in the table, you should identify all of them always. You will later choose which them are more suitable to answer the question. 
But, at this stage, you should include **all the columns that may be useful**.\nWrite the list of candidate columns you identified, and the reasoning after each column, using the following format:\n```\nI can use the following candidate columns to answer the question (including all the columns that may be useful):\n- <table name>:\n  - <column_name>: <copy here the full column description from schema>, including possible values if present>: <reasoning>. \n  - <column_name>: <copy here the full column description from schema>, including possible values if present>: <reasoning>.\n  ...\n- <table_name>:\n  - <column_name>: <copy here the full column description from schema>, including possible values if present>: <reasoning>.\n  - <column_name>: <copy here the full column description from schema>, including possible values if present>: <reasoning>.\n  ...\n...\n```  \n### Step 3: Assess if Tables and Columns are Sufficient to Answer the Question\nTell if the tables and columns you identified are enough to answer the question `{question}`. Make sure to justify your answer and check the actual descriptions of the columns in the table definitions and the user question.\nWrite the answer using the following format:\n```\nPossible to answer the question using the former columns: \n- <reasoning>\n- Result: <Yes|No>\n```  \n### Step 4: Identify Columns Contained in Maps\nSome columns are actually contained in a map structure. Since these columns need to be queried differently, you need to identify them.\nColumns with a name like '<some_name>.map.<other_name>' are contained in maps. 
\nFor instance, the column `STATIONS_DETAIL_REC.UNQ_STATION_MAP.map.STATION_TYPE_CD` is contained in a map structure called `STATIONS_DETAIL_REC.UNQ_STATION_MAP`.\nThis map structure is like this:\n```\nSTATIONS_DETAIL_REC.UNQ_STATION_MAP.map.STATION_TYPE_CD: {{\n    <key1>: {{\n        <some_field>; <some_value>,\n        \"STATION_TYPE_CD\": <station_type_value1>\n    }},\n    <key2>: {{\n        <some_other_field>; <some_other_value>,\n        \"STATION_TYPE_CD\": <station_type_value2>\n    }},\n    ...\n}}\n```\nTherefore, in this step, identify which columns are contained in maps since you will later need to use LATERAL VIEW EXPLODE to access the values of these maps.  \n### Step 5: Plan the SQL Query  \nExplain, step by step, how you would write the SQL query to answer the question `{question}`, using the columns you identified. \n**Use the full qualified names of the columns**. **DO NOT USE THE `JSON_OBJECT` FUNCTION IN THE QUERY**.\nSome columns are contained in map structures. You can access the fields of the map using LATERAL VIEW EXPLODE. Do not use UNNEST to access the fields of the map.\nIn particular, you can create a temporary table with the exploded map and then query it. For instance, if you need to get the value of the `ABC.CDE.map.field` column, you should use the following SQL code to create a temporary table with the exploded map data and get the value of the field:\n```sql\nWITH exploded_map AS (\n  SELECT key, value.field_1, value,field_2, value.field_3  -- Select here all the columns/fields you will use later. 
\n  FROM <table_name>\n  LATERAL VIEW EXPLODE(ABC.CDE) AS key, value\n)\nSELECT exploded_map.field_1\nFROM exploded_map\n``` \nThis is another example:\n```sql\n  WITH exploded_map AS (\n  SELECT DATE, ID, RECORD.GROUP, value.CODE  -- Select here all the columns/fields you will use later.\n    FROM CBD_HGU_Detail_Daily_Aura_v10 LATERAL VIEW EXPLODE(STATIONS_DETAIL_REC.UNQ_STATION_MAP) AS key, value) \n  SELECT COUNT(DISTINCT ID) AS num_homes \n  FROM exploded_map JOIN D_Segment_v8 ON exploded_map.CLASS_ID = D_Segment_v8.CLASS_ID \n    WHERE DATE BETWEEN '2024-01-01' AND '2024-02-01' \n      AND D_Segment_v8.DESCRIPTION = 'DESCRIPTION value' \n      AND exploded_map.CODE = 'CODE value'    \n```\nHere is another example. If you need to count the number of elements in a map column named 'ABC.map' you should use a code like this:\n```sql\nWITH exploded_map AS (\n  SELECT key_from_exploded_map\n  FROM <table_name>\n  LATERAL VIEW EXPLODE(ABC) AS key_from_exploded_map, value_from_exploded_map\n)\nSELECT COUNT(key_from_exploded_map)\nFROM exploded_map\n```\nTake into account that all map fields are named with the suffix `_MAP`. Take into account that you can only use the operation EXPLODE to fields that are maps. Therefore, you should use the EXPLODE operation only on fields that end with `_MAP`. \nTo finish this step, explain how you would write the SQL query to answer the question, using the columns you identified, taking into account the previous considerations for columns contained in maps, if there are any.\n### Step 6: Write the final SQL Query and apply the rules\nFinally, write the SQL query to answer the question `{question}`, using the columns you identified. 
\nRemarks:\n**DO NOT USE THE `JSON_OBJECT` FUNCTION IN THE QUERY**.\n**IMPORTANT: The keys in the exploded maps should not be used in JOIN operations, since they are just internal keys to the map structure.** \nCheck if you need to use any of the following **business rules** to build the query:\n```json\n{{\n  \"rules\": [\n    {{\n      \"id\": \"B1\",\n      \"name\": \"Fiction\",\n      \"rule\": \"If you need to look for tariff plans including \"ficción\" contents, you will need to look for one the following  patterns in the `TARIFF_PLAN_DES` field: '%FICCION%', '%FICCIÓN%', '%SERIES%', '%CINE%', '%FUSIÓN TOTAL%', '%FUSION TOTAL%'. To make the proper comparison, you should use compare with uppercase letters. For instance, use a filter like this one: `UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%FICCION%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%FICCIÓN%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%SERIES%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%CINE%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%FUSIÓN TOTAL%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%FUSION TOTAL%'`\n\"\n    }},\n    {{\n      \"id\": \"B2\",\n      \"name\": \"Disney\",\n      \"rule\": \"If you need to look for tariff plans including \"Disney\" contents, you will need to look for one the following  patterns in the `TARIFF_PLAN_DES` field: '%DISNEY%'.  To make the proper comparison, you should use compare with uppercase letters. For instance, use a filter like this one: `UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%DISNEY%'`\n\"\n    }},\n    {{\n      \"id\": \"B3\",\n      \"name\": \"Football\",\n      \"rule\": \"If you need to look for tariff plans including football contents, you will need to look for one the following  patterns in the `TARIFF_PLAN_DES` field: '%FUTBOL%', '%FÚTBOL%', '%FUSION TOTAL%', '%FUSIÓN TOTAL%',  '%FUSION TA TOTAL%', '%FUSIÓN TA TOTAL%', '%LIGA%', '%CHAMPION%'. To make the proper comparison, you should use compare with uppercase letters. 
For instance, use a filter like this one:  `UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%FUTBOL%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%FÚTBOL%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%FUSION TOTAL%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%FUSIÓN TOTAL%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%LIGA%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%CHAMPION%'`\n\"\n    }},\n    {{\n      \"id\": \"B4\",\n      \"name\": \"Netflix\",\n      \"rule\": \"If you need to look for tariff plans including \"Netflix\" contents, you will need to look for one the following  patterns in the `TARIFF_PLAN_DES` field: '%NETFLIX%', '%FICCIÓN%', '%FICCION%'. To make the proper comparison, you should use compare with uppercase letters. For instance, use a filter like this one: `UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%NETFLIX%'`\n\"\n    }},\n    {{\n      \"id\": \"B5\",\n      \"name\": \"Promociones\",\n      \"rule\": \"If you need to look for tariff plans including \"promotions\", you will need to look for one the following  patterns in the `TARIFF_PLAN_DES` field: '%PROMO%'. To make the proper comparison, you should use compare with uppercase letters. For instance, use a filter like this one: `UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%PROMO%'`\n\"\n    }},\n    {{\n      \"id\": \"B6\",\n      \"name\": \"Edad promedio 1\",\n      \"rule\": \"You are not allowed to use the field `CBD_INFO_REC.CUST_AGE_NUM` in any query. 
You should use the field `CBD_INFO_REC.CUST_AGE_SEGMENT_CD` instead.\n\"\n    }},\n    {{\n      \"id\": \"B7\",\n      \"name\": \"Edad promedio 2\",\n      \"rule\": \"If you need to calculate the average age of customers you should use the  following calculation instead of AVG(CBD_INFO_REC.CUST_AGE_SEGMENT_CD): AVG(IF(CBD_INFO_REC.CUST_AGE_SEGMENT_CD = '1', NULL, CBD_INFO_REC.CUST_AGE_SEGMENT_CD))\n\"\n    }},\n    {{\n      \"id\": \"B8\",\n      \"name\": \"Query by customers\",\n      \"rule\": \"If you need to query by customers: if the time scope of the query is daily or weekly then you should use the `DEVICE_ID` field. If the time scope of the query is monthly or longer then you should use the `CUSTOMER_ID` field.\n\"\n    }},\n    {{\n      \"id\": \"B9\",\n      \"name\": \"Station type\",\n      \"rule\": \"The field `STATION_TYPE_L2` corresponds to a higher aggregation level than `STATION_TYPE_L1`.  `STATION_TYPE_L1` corresponds to an intermediate category, used only with analytical purposes.\n\"\n    }},\n    {{\n      \"id\": \"B10\",\n      \"name\": \"Active devices\",\n      \"rule\": \"If you need to check whether a device is active at a given date, you should use this check: `DEVICE_INFO_REC.INACTIVITY_DEVICE_INFO_NUM < 24`. If true, the device is active. 
If false, the device is inactive.\n\"\n    }},\n    {{\n      \"id\": \"B11\",\n      \"name\": \"Penetración de un producto\",\n      \"rule\": \"If you are asked for calculating \"la penetración de un producto\" you should calculate the percentage of customers with that product.\n\"\n    }},\n    {{\n      \"id\": \"B12\",\n      \"name\": \"Obsolete routers\",\n      \"rule\": \"If you are asked for obsolete routers, you should check for those with MANUFACT_HGU_CHIPSET_DES IN ('Askey Broadcom', 'Askey Econet','MitraStar Broadcom', 'MitraStar Econet').\n\"\n    }},\n    {{\n      \"id\": \"B13\",\n      \"name\": \"High value customers\",\n      \"rule\": \"Consider as high value customers those with a monthly revenue higher than 100 (TOTAL_CUST_RV > 100).\n\"\n    }},\n    {{\n      \"id\": \"B14.1\",\n      \"name\": \"Technological level formula\",\n      \"rule\": \"If you need to check the technological level of a customer, use the following formula on the field `TECH_LEVEL_WEIGHT_QT` of the table `D_CBD_STATIC_STATION_TYPE_v6`: `SUM(COALESCE(D_CBD_STATIC_STATION_TYPE_v6.TECH_LEVEL_WEIGHT_QT,0) + CASE WHEN AMM.VALUE.STATION_BRAND_DES = 'Ubiquiti' THEN 0.8 ELSE 0 END)/COUNT(DISTINCT DAY_DT)`\n\"\n    }},\n    {{\n      \"id\": \"B14.2\",\n      \"name\": \"Technological levels\",\n      \"rule\": \"Consider as **high technological level** customers those with a value higher or equal to 2.5. Consider as **medium technological level** customers those with a value higher or equal to 1 and lower than 2.5. Consider as **low technological level** customers those with a value lower than 1.\n\"\n    }},\n    {{\n      \"id\": \"B15\",\n      \"name\": \"Sport\",\n      \"rule\": \"If you need to look for tariff plans including \"sport\" contents, you will need to look for one the following  patterns in the `TARIFF_PLAN_DES` field: '%DEPORTE%', '%TOTAL PLUS%', '%TOTAL SAT%PLUS%', '%MOTOR%', '%DAZN%'. 
To make the proper comparison, you should use compare with uppercase letters. For instance, use a filter like this one: `(UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%DEPORTE%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%TOTAL PLUS%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%TOTAL SAT%PLUS%' -- Se añade para incluir los \"Total Satelite/Satélite Plus\" OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%MOTOR%' OR UPPER(${{TABLE}}.TARIFF_PLAN_DES) LIKE '%DAZN%')`\n\"\n    }},\n    {{\n      \"id\": \"R1\",\n      \"name\": \"Temporary table fields\",\n      \"rule\": \"When you use in a filter a given filed from a temporary table, built using the `WITH` clause, make sure that the  field is actually present in the SELECT statement defining the temporary table.\n\"\n    }},\n    {{\n      \"id\": \"R2\",\n      \"name\": \"Temporary table field naming\",\n      \"rule\": \"Example: If you write a temporary table like this: `WITH temp_table AS (SELECT field1_prefix.field1 FROM table)`,  then you should use refer to the field as `field1` and not as `field1_prefix.field1` in the rest of the query.\n\"\n    }},\n    {{\n      \"id\": \"R3\",\n      \"name\": \"Tariff plan\",\n      \"rule\": \"If you need to look for some specific tariffs, use the field `TARIFF_PLAN_DES` from the dimensional table D_Fixed_Tariff_Plan instead of using `CBD_INFO_REC.COMMERCIAL_TARIFF_ID` since this last one only contains identifiers without any meaning.\n\"\n    }},\n    {{\n      \"id\": \"R4.1\",\n      \"name\": \"Station type 1\",\n      \"rule\": \"If the query uses `D_CBD_Static_Station_Type_v6.STATION_TYPE_L1_DES` or `D_CBD_Static_Station_Type_v6.STATION_TYPE_L2_DES` answer this question: does the value you are looking for, matches one of the possible values of these fields? Justify your answer. 
Enumerate the possible values of these fields if they are used.\n\"\n    }},\n    {{\n      \"id\": \"R4.2\",\n      \"name\": \"Station type 2\",\n      \"rule\": \"Apply this rule if the query uses a filter with the field `D_CBD_Static_Station_Type_v6.STATION_TYPE_L1_DES` or `D_CBD_Static_Station_Type_v6.STATION_TYPE_L2_DES` and the value you are looking for does not match any of the possible values of these fields. In this case, you should use the field `STATION_TYPE_CD` instead. Write the result of the previous reasoning in detail.  REMEMBER TO FIX THE QUERY TO USE THE FIELD `STATION_TYPE_CD` INSTEAD.\n\"\n    }},\n    {{\n      \"id\": \"R5\",\n      \"name\": \"Counting entities\",\n      \"rule\": \"If you need to count the number of customer, homes, devices or any other entities, you should ensure that you are actually counting distinct entities. Therefore you should use the `COUNT(DISTINCT ...)` function instead of `COUNT(...)`.\n\"\n    }},\n    {{\n      \"id\": \"R6\",\n      \"name\": \"Time scope less than a month\",\n      \"rule\": \"If you are asked to answer a question for a time scope minor than a month (daily or weekly) you must not use the field `MONTH_DT` in your query.\n\"\n    }},\n    {{\n      \"id\": \"R7\",\n      \"name\": \"No UNION operator\",\n      \"rule\": \"Avoid using the UNION operator in your queries.\n\"\n    }},\n    {{\n      \"id\": \"R8\",\n      \"name\": \"Counting entities\",\n      \"rule\": \"If you are asked to count the number of customers, homes, devices or any other entities, you should ensure that the  result is actually a count and not a list of elements. 
Therefore you should use the COUNT function.\n\"\n    }},\n    {{\n      \"id\": \"R9\",\n      \"name\": \"IoT devices\",\n      \"rule\": \"If you need to look for IoT (Internet of Things) devices, you should look for devices with `STATION_TYPE_L2_DES = 'Smart Home'`\n\"\n    }},\n    {{\n      \"id\": \"R10\",\n      \"name\": \"Router model\",\n      \"rule\": \"If you need to check the model of the router, you should use the field `MANUFACT_HGU_CHIPSET_DES` (do not use other fields such as `MANUFACTURER_FW_VER_DES`).\n\"\n    }},\n    {{\n      \"id\": \"R11\",\n      \"name\": \"Weekly period\",\n      \"rule\": \"If you need to query data from weekly period, you should start always with the first day of the week (Monday) and end with the last day of the week (Sunday).\n\"\n    }},\n    {{\n      \"id\": \"R12\",\n      \"name\": \"WiFi type\",\n      \"rule\": \"If you need to look for information on a specific WiFi type, such as 2.4 GHz or 5 GHz, you should use the specific fields corresponding to these types.  For instance, if you need to look for WiFi5 device information, you should not use the field `STATIONS_REC.WIFI_REC.ALL_TECH_REC` but the field `STATIONS_REC.WIFI_REC.TECH_5G_REC`.\n\"\n    }},\n    {{\n      \"id\": \"R13\",\n      \"name\": \"Equivalent terms for WiFi technologies\",\n      \"rule\": \"The following terms are considered equivalent: \n- `WiFi 5G`, `WiFi Technology 5G`, `WiFi5`.\n- `WiFi 2.4G`, `WiFi Technology 2.4G`, `WiFi2.4` , `WiFi2`, `WiFi Technology 2G`, `WiFi 2G`.\n\"\n    }},\n    {{\n      \"id\": \"R14\",\n      \"name\": \"Customer Satisfaction Index\",\n      \"rule\": \"The field `CSI_QT` contains the `Customer Satisfaction Index` value. It is not a quality value but a satisfaction value.  
Do not confuse it with Quality Index fields.\n\"\n    }},\n    {{\n      \"id\": \"R15\",\n      \"name\": \"Active HGU devices\",\n      \"rule\": \"The field `CUST_HGU_DEVICES_NUM` contains the number of active HGU devices of the customer, i.e. the number of active routers (HGUs) of the customer.  Do not confuse it with the number of active devices of the customer.\n\"\n    }},\n    {{\n      \"id\": \"R16\",\n      \"name\": \"Megabytes\",\n      \"rule\": \"The fields starting with `MB_` or containing `_MB_` in their name refer to Megabytes. Take this into account during your queries.\n\"\n    }},\n    {{\n      \"id\": \"R17\",\n      \"name\": \"Gigabytes\",\n      \"rule\": \"The fields starting with `GB_` or containing `_GB_` in their name refer to Gigabytes. Take this into account during your queries.\n\"\n    }},\n    {{\n      \"id\": \"R18\",\n      \"name\": \"RSSI meaning\",\n      \"rule\": \"The field `RSSI` refers to the `Received Signal Strength Indicator`. It is a measure of the power present in a received radio signal.\n\"\n    }},\n    {{\n      \"id\": \"R19\",\n      \"name\": \"Checking absence of a device\",\n      \"rule\": \"If you need to look for homes without a specific type of device, you should not forget checking at least one of the following fields: `STATION_TYPE_L1_DES`, `STATION_TYPE_L2_DES`, `STATION_TYPE_CD`. In other words, you need an explicit filter checking the absence of the device.\n\"\n    }}\n  ]\n}}\n```\nExplain whether you can apply any of the rules and explain how you would apply them in the SQL query.\nAlways write your result following these steps:\n1. SQL query to answer the question `{question}`: <write the SQL query here>\n2. Reasoning: <explain why you wrote the query like that>\n3. Check of the rules, RULE BY RULE and FOR EACH RULE (one entry per rule)2. <write ALL the rules and tell if they are applied or not>. 
Follow this format:\n- <rule1>: Should be applied, because <reason> | Should not be applied, because <reason>\n- <rule2>: Should be applied, because <reason> | Should not be applied, because <reason>\n...\n4. Result of the execution of the rules that have been identified to be applied. Follow this format:\n- <rule1>: <result>\n- <rule2>: <result>\n...\n5. Need to fix the query because <reason>. The following changes are needed: <change_1>, <change 2>, etc. | The query is already correct.\n6. SQL query to answer the question `{question}` after considering the previous **rules**: <write the SQL query here>. FIX THE QUERY IF NECESSARY.\n### Step 7: Check that the query actually can answer the question\nCheck again if the generated query answers the question `{question}`.\nFollow these steps:\n1. Write the concepts involved in the question. Enumerate the concepts as a list. Follow this format:\n - <concept1>\n - <concept2>\n ...\n2. Write all the concepts of the question that are covered by the SQL query. Enumerate them and create a match list with the concepts from the previous step. Write down the part of the SQL query covering the concept. Take into account that conditions on specific proper names, such as model names, location names, etc, need to be explicitly checked. Follow this format:\n - <concept1>: covered in <sql query section> or not covered.\n - <concept2>: covered in <sql query section> or not covered.\n3. Find those concepts in the question that are not covered by the SQL query.\n4. Conclude whether the question can actually be answered by the generated query. Follow this format:\n - The question can be answered by the SQL query: <Yes|No>\n### Step 8: Create the result as a JSON object\nReturn the result as a unique JSON object, with the following structure:\n{{\n  \"result\": <Write the SQL query here. **MAKE SURE THAT THE STATEMENT `SELECT JSON_OBJECT` is not used in the query and Use the full qualified names of the columns. 
Generate a valid SQL sentence in a single line without new line characters.**>,\n  \"status\": \"OK\",\n  \"reason\": <a reasoning explaining the query>\n}}\nIf the former table does not contain the necessary data to answer the question, return the following JSON object:\n{{\n  \"result\": null,\n  \"status\": \"ERROR\",\n  \"reason\": <a reasoning explaining why it is not possible to answer the question>\n}}\nMake sure that the JSON object is correctly formatted, and can be parsed by a JSON parser.\n**Please, ALWAYS follow the 8 steps presented in the instructions.** Start reasoning with ### Step 1 and finish with ### Step 8.\n{% endraw %}"
                          }
                      }
                  }
              }
          }
      }
  }

3. Include the new preset in an application

Remember that an ATRIA application must already be created and configured for the use case.

Once the preset is fully defined and included in aura-configuration-api through the previous steps, it must be declared in the ATRIA application:

Modify ATRIA configuration: Include the new preset in an application

4. Upload documents and execute generate-db job

Follow the guidelines for uploading new or modified documents in a specific environment by editing the ConfigMap of the component (included in the general guidelines Import documents into ATRIA).

Upload the documents to the Azure container atria-resources.

Insert these documents in the new-copilot-preset/project-copilot/jsonl/ folder.
Keep in mind the allowed document formats, set in the preset’s variable loader.loaderType.

Finally, execute the atria-rag-generate-db job to update the data in the environment.

The file content must be uploaded to the same folder with the .jsonl extension, with one JSON object per line:

{"page_content": "test1", "metadata": {"source": "https://www.dummy1.es/"}, "type": "Document"}
{"page_content": "test2", "metadata": {"source": "https://www.dummy2.es/"}, "type": "Document"}
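Before uploading, it can help to check that every line of the file parses as JSON and carries the keys shown in the sample above. The following is a minimal sketch, not part of ATRIA: the function name validate_jsonl is hypothetical, and the list of required keys is an assumption inferred only from the two sample lines.

```python
import json

# Assumption: the required keys are inferred from the sample lines in this guide.
REQUIRED_KEYS = ("page_content", "metadata", "type")


def validate_jsonl(text):
    """Return a list of (line_number, problem) tuples; an empty list means no problem was found."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are ignored by this sketch
        try:
            record = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append((number, f"invalid JSON: {error.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "line is not a JSON object"))
            continue
        for key in REQUIRED_KEYS:
            if key not in record:
                problems.append((number, f"missing key: {key}"))
    return problems


sample = "\n".join([
    json.dumps({"page_content": "test1",
                "metadata": {"source": "https://www.dummy1.es/"},
                "type": "Document"}),
    '{"page_content": "test2"}',  # deliberately incomplete line
])
print(validate_jsonl(sample))
```

Running the check locally is cheaper than discovering a malformed line after the atria-rag-generate-db job has started.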

5. Update Aura applications configuration via API

Once the new preset is created, the aura-configuration-api must be updated to indicate the application that will make use of this preset.

This section covers a specific scenario of the process for modifying the API configuration, which is described in the document Hot swapping of Aura applications configuration.

    curl --location --request PATCH 'https://svc-<env>.auracognitive.com/aura-services/v2/configuration/applications/3e1cb831-d5bf-423d-8bef-4abcc53dfa97' \
    --header 'correlator: <uuid>' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header 'Authorization: APIKEY {{apikey}}' \
    --data '{
        "id": "3e1cb831-d5bf-423d-8bef-4abcc53dfa97",
        "models": {
            "presets": [
                "copilot-preset-rag",
                "copilot-reduced-preset-rag",
                "raw-gpt-4o",
                "openai-preset-gpt-35-turbo-copilot-generative",
                "openai-preset-gpt-4o-copilot-generative",
                "a2cdb523-883e-44ab-8e0b-2d164dd98346"
            ]
        }
    }'

The last entry in the list, a2cdb523-883e-44ab-8e0b-2d164dd98346, is the new preset. It is necessary to send all application presets in the request, not only the new one.
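Because the request carries the whole list, a safe habit is to build the body from the presets the application already has and append the new one. The sketch below only assembles the JSON body and does not call the API; build_presets_patch is a hypothetical helper, and the shortened list of current presets is illustrative.

```python
import json


def build_presets_patch(application_id, current_presets, new_preset):
    """Build the PATCH body: every existing preset plus the new one, without duplicates."""
    presets = list(current_presets)  # copy so the caller's list is not modified
    if new_preset not in presets:
        presets.append(new_preset)
    return {"id": application_id, "models": {"presets": presets}}


# Illustrative values taken from the request shown above (list shortened).
current = ["copilot-preset-rag", "copilot-reduced-preset-rag", "raw-gpt-4o"]
body = build_presets_patch(
    "3e1cb831-d5bf-423d-8bef-4abcc53dfa97",
    current,
    "a2cdb523-883e-44ab-8e0b-2d164dd98346",
)
print(json.dumps(body, indent=4))
```

The printed JSON can be passed as the --data argument of the curl command.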

15 - Adjust timeouts in ATRIA

Adjust timeouts in ATRIA

Guidelines for the adjustment of timeouts in ATRIA aura-gateway-api and atria-model-gateway

Adjust timeouts in aura-gateway-api

The instructions to adjust timeouts in aura-gateway-api and Nginx are detailed below:

Open the ConfigMap aura-gateway-api:
kubectl edit configmap aura-gateway-api -n <namespace>
(Replace <namespace> with the specific one)
In the config key, find and update the AURA_REQUEST_TIMEOUT field:
```
AURA_REQUEST_TIMEOUT: 490000
```
Save and close the ConfigMap.
Open the aura-services resource (note that the command below uses the resource type vs, not configmap):
kubectl edit vs aura-services -n <namespace>
(Replace <namespace> with the specific one)
In the aura-gateway-api key, find and update the read_timeout and send_timeout fields:
```
read_timeout: 495s
send_timeout: 495s
```
Save and close the editor.

Adjust timeouts in atria-model-gateway

The instructions to adjust timeouts in atria-model-gateway are detailed below:

Open the ConfigMap atria-model-gw-config:
kubectl edit cm atria-model-gw-config -n <namespace>
(Replace <namespace> with the specific one)

Now, modify the following timeouts in the corresponding models:

rag-server:
    timeout:
        timeout: 485
        read: 60
gpt-35-turbo:
    timeout:
        timeout: 240
        read: 60
gpt-4:
    timeout:
        timeout: 240
        read: 60

Save and close the ConfigMap
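The example values in this section appear to be chosen so that each inner component gives up slightly before the component in front of it: 485 for rag-server, 490000 for AURA_REQUEST_TIMEOUT and 495s for the Nginx timeouts. The sketch below checks that ordering for a set of candidate values. It assumes that AURA_REQUEST_TIMEOUT is expressed in milliseconds and the atria-model-gateway timeout in seconds; this guide does not state either unit, and check_timeout_chain is a hypothetical helper.

```python
def check_timeout_chain(layers):
    """layers: list of (name, seconds) ordered from innermost to outermost component.

    Returns a list of messages, one for each pair where the inner timeout
    is not strictly shorter than the outer one.
    """
    violations = []
    for (inner_name, inner), (outer_name, outer) in zip(layers, layers[1:]):
        if inner >= outer:
            violations.append(
                f"{inner_name} ({inner}s) should be shorter than {outer_name} ({outer}s)"
            )
    return violations


layers = [
    ("atria-model-gateway rag-server timeout", 485),         # assumed to be seconds
    ("aura-gateway-api AURA_REQUEST_TIMEOUT", 490000 / 1000),  # assumed to be milliseconds
    ("Nginx read_timeout / send_timeout", 495),               # written as 495s in the resource
]
print(check_timeout_chain(layers) or "timeouts increase from inner to outer layer")
```

If one value is raised, the others probably need to move with it so that the order is preserved.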

16 - Create new Copilot preset (previous to Metallica)

Create new Copilot preset using ConfigMap

Guidelines valid for releases previous to Metallica

This document includes a specific scenario in the process of modifying ATRIA configuration, described in the document Modify ATRIA components configuration.

Guidelines to create a new Aura Copilot preset in a specific environment using ConfigMaps and aura-configuration-api.

Follow these steps in the order shown:

Prerequisites
Create ConfigMap copy
Create a new preset in atria-model-gateway
Adjust model params
Allow Preset Access
Add a new project in atria-rag-server
Adjust max_tokens param
Adjust timeouts in aura-gateway-api and Nginx
Upload documents and execute the generate-db job
Restart the deployments
Update Aura applications configuration via API
Load original config and deployments rollback

Prerequisites

Recommended:
- kubectl installed in your local host.
- curl installed in your local host.
- jq installed in your local host.

Enable ConfigMap

As a prerequisite, we must have a KUBECONFIG with sufficient permissions and access to the environment.

We have one ConfigMap for each component:

atria-model-gateway: atria-model-gw-config
atria-rag-server: atria-rag-config
aura-gateway-api: aura-gateway-api
aura-services: aura-services

For the ConfigMap modification, use the following examples for atria-model-gateway, atria-rag-server, aura-gateway-api and aura-services respectively:

kubectl edit configmap atria-model-gw-config -n <namespace>
kubectl edit configmap atria-rag-config -n <namespace>
kubectl edit cm aura-gateway-api -n <namespace>
kubectl edit vs aura-services -n <namespace>

(Replace <namespace> with the namespace of the corresponding environment)

You can also use visual tools for this modification, such as Lens or Sublime.

Access to Azure container

You must have access to the Azure container atria-resources.

Create ConfigMap copy

Important: Before modifying anything, it is highly recommended to back up the ConfigMap content, as the format is very sensitive.

To avoid possible errors, the first thing to do is to copy the current configuration. For this purpose, execute the following commands:

kubectl get cm atria-model-gw-config -o yaml -n <namespace> > <local_file_path>/model-gw-config.yaml
kubectl get cm atria-rag-config -o yaml -n <namespace> > <local_file_path>/rag-config.yaml
kubectl get cm aura-gateway-api -o yaml -n <namespace> > <local_file_path>/gateway-config.yaml
kubectl get vs aura-services -o yaml -n <namespace> > <local_file_path>/services-config.yaml

Replace <namespace> with the specific one and <local_file_path> with the desired path.

Now you have a copy of the current configuration on your local machine.

Create a new preset in atria-model-gateway

Follow these guidelines to add a new preset in a specific environment by editing the ConfigMap of the component:

Open the ConfigMap atria-model-gw-config:
kubectl edit configmap atria-model-gw-config -n <namespace>
(Replace <namespace> with the specific one)

Warning: If the presets.yml key is wrongly formatted as a single string, run the following command (the dot in the key name must be escaped in the JSONPath expression):

kubectl get cm atria-model-gw-config -n <namespace> -o jsonpath='{.data.presets\.yml}'

Afterwards, copy the output and overwrite the whole presets.yml key. This way, you can see the content correctly and include the new preset.

In the key presets, add the new preset with the following structure:

- id: copilot-reduced-preset-rag
  model_id: atria-rag
  name: Copilot
  group: enriched_ai
  description: A RAG system built on a LangChain backend
  session_params:
    window: 0
  preset_params:
    chain: project-copilot-reduced
  model_params:
    max_ref: 3
    sticky_context: null
    candidates_post_filtering: null
    language: en
    max_tokens: 16384
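In this guide, the chain value in preset_params (project-copilot-reduced) matches the name of the project added later in atria-rag-server, so the two appear to be linked by name. Assuming that is how the preset finds its project, a quick consistency check over the two configurations could look like the sketch below. The function presets_with_unknown_chain is hypothetical and works on plain dictionaries, not on the live ConfigMaps.

```python
def presets_with_unknown_chain(presets, projects):
    """Return the ids of presets whose preset_params.chain is not a known project name."""
    unknown = []
    for preset in presets:
        chain = preset.get("preset_params", {}).get("chain")
        # Presets without a chain are skipped by this sketch.
        if chain is not None and chain not in projects:
            unknown.append(preset["id"])
    return unknown


presets = [
    {"id": "copilot-reduced-preset-rag",
     "model_id": "atria-rag",
     "preset_params": {"chain": "project-copilot-reduced"}},
]
projects = {"project-copilot-reduced": {"name": "Project Copilot"}}
print(presets_with_unknown_chain(presets, projects))
```

An empty list means every chain used by a preset has a project with the same name.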

Adjust model params

We also have to configure the model that the RAG uses when calling atria-model-gateway. This model is gpt-4o.

Open the ConfigMap atria-model-gw-config
kubectl edit configmap atria-model-gw-config -n <namespace>
(Replace <namespace> with the specific one)

Within the key models, search for gpt-4o and update the timeout value:

  timeout:
      timeout: 240
      read: 240

Within the key models, search for atria-rag and update the timeout value:
```
timeout: 485
```
Save and close the ConfigMap

Allow Preset Access

Now that we have created the new preset, we have to modify the access key to allow the application to use it.

Within the access key, look for the presets key.
In the key 3e1cb831-d5bf-423d-8bef-4abcc53dfa97 (application ID), add the preset name copilot-reduced-preset-rag to the list.
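The edit described above amounts to appending one name to a list nested under access, presets and the application ID. As a hedged illustration of that structure, the sketch below performs the same change on an in-memory dictionary. The exact layout of the access key is an assumption based on the two sentences above, and allow_preset is a hypothetical helper.

```python
def allow_preset(access, application_id, preset_id):
    """Add preset_id to the presets allowed for application_id, if not already present."""
    presets_by_app = access.setdefault("presets", {})
    allowed = presets_by_app.setdefault(application_id, [])
    if preset_id not in allowed:
        allowed.append(preset_id)
    return access


# Assumed shape of the access key: presets -> application ID -> list of preset names.
access = {"presets": {"3e1cb831-d5bf-423d-8bef-4abcc53dfa97": ["copilot-preset-rag"]}}
allow_preset(access, "3e1cb831-d5bf-423d-8bef-4abcc53dfa97", "copilot-reduced-preset-rag")
# Calling it a second time changes nothing, so the edit is safe to repeat.
allow_preset(access, "3e1cb831-d5bf-423d-8bef-4abcc53dfa97", "copilot-reduced-preset-rag")
print(access)
```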

Add a new project in atria-rag-server

Follow these guidelines to add a new project in a specific environment by editing the ConfigMap of the component:

Open the ConfigMap atria-rag-config:
kubectl edit configmap atria-rag-config -n <namespace>
(Replace <namespace> with the specific one)

Warning: If the projects.yaml.project key is wrongly formatted as a single string, run the following command:

kubectl get cm atria-rag-config -n <namespace> -o jsonpath='{.data.projects\.yaml\.project}'

Afterwards, copy the output and overwrite the whole projects.yaml.project key. This way, you can see the content correctly and include the new project.

In the key projects.yaml.project, add the new project, as shown below.

Project structure

    project-copilot-reduced:
      name: Project Copilot
      docs:
        json:
          dir: /opt/atria-rag/data/project-copilot-reduced/jsonl
          extensions: jsonl
          loader: jsonl
      embeddings: test_distilbert
      llm: copilot-rag-model-gw-raw-gpt-4-o
      solve_type: sql
      retrievers:
        qdrant:
          host: qdrant.aura-system
          port: 6333
          collection_name: project-copilot-reduced-Aura
          prefix: es-pre-970
        tfidf:
          dump_name: /var/atria-rag-data/tfidf/dump/project-copilot-reduced-Aura
      serving:
        base_url: project-copilot-reduced/jsonl
      parameters:
        candidate_only: false
      prompts:
        generate_sql_query:
          DEFAULT: |
              Generate a SQL query statement to answer the following question:
              `{question}`

              Use the data contained in the following tables.
              {sql_table_definition}

              The following tables, containing auxiliary information, are also available. They include **dimensional tables**:
              ```sql
              CREATE TABLE D_CBD_Static_Geo_Area_v6 (GEO_AREA_ID VARCHAR, CBD_GEO_AREA_LEVEL1_ID VARCHAR, CBD_GEO_AREA_LEVEL2_ID VARCHAR, CBD_GEO_AREA_LEVEL3_ID VARCHAR, CBD_GEO_AREA_LEVEL4_ID VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);
                  COMMENT ON TABLE D_CBD_Static_Geo_Area IS 'Geographical areas. This table contains foreign keys to the different levels of geographical areas. In particular, it contains the foreign keys to these tables: CBD_Static_Geo_Area_Level1, CBD_Static_Geo_Area_Level2, CBD_Static_Geo_Area_Level3, CBD_Static_Geo_Area_Level4. Therefore, this table is used, via JOIN, to query the geographical information contained in the different levels of geographical areas. For instance, if you have a table T with a field GEO_AREA_ID and you need to check whether this location corresponds to the region of Asturias you will need to look for GEO_AREA_ID in this table, then extract the CBD_GEO_AREA_LEVEL4_ID and query the table CBD_Static_Geo_Area_Level4 to get the name of the region.';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area.GEO_AREA_ID IS 'Description: Identifier of the geographical area assigned to the customer (typically the geographical area of the customer home). This identifier is a string code which values are defined in ''D_Geographical_Area'' entity. Format: alphanumeric string. Example values: ''2800983CE'', ''50059'', ''3101142CE''';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL1_ID IS 'Identifier of the geographical area Level 1 (max level of detail: CP or similar). FORMAT: string containing a numerical code. This field does not contain location names.';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL2_ID IS 'Identifier of the geographical area Level 2 (City/Town). FORMAT: string containing a numerical code. This field does not contain location names.';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL3_ID IS 'Identifier of the geographical area Level 3 (Province). FORMAT: string containing a numerical code. This field does not contain location names.';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL4_ID IS 'Identifier of the geographical area Level 4 (State/Region). FORMAT: string containing a numerical code. This field does not contain location names.';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area.EXTRACTION_TM IS 'Date-time of the record';

              CREATE TABLE D_CBD_Static_Geo_Area_Level2_v6 (CBD_GEO_AREA_LEVEL2_ID VARCHAR, GEO_AREA_LEVEL_DES VARCHAR, CBD_GEO_AREA_LEVEL3_ID VARCHAR, LONGITUDE_LON_CO DOUBLE, LATITUDE_LAT_CO DOUBLE, GEO_AREA_ID VARCHAR, GEO_STD_AREA_CD VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);
                  COMMENT ON TABLE D_CBD_Static_Geo_Area_Level2 IS 'Geographical area level 2 (State)';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.CBD_GEO_AREA_LEVEL2_ID IS 'Identifier of the geographical area Level 2 (City/Town). FORMAT: string containing a numerical code.';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.GEO_AREA_LEVEL_DES IS 'Description associated to the identifier level 2. FORMAT: alphanumeric string containing the name of the city/town.';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.CBD_GEO_AREA_LEVEL3_ID IS 'Identifier of the geographical area Level 3 (Province)';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.LONGITUDE_LON_CO IS 'Longitude coordinates (in WGS84) associated with level 2';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.LATITUDE_LAT_CO IS 'Latitude coordinates (in WGS84) associated with level 2';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.GEO_AREA_ID IS 'Description: Identifier of the geographical area assigned to the customer (typically the geographical area of the customer home). This identifier is a string code which values are defined in ''D_Geographical_Area'' entity. Format: alphanumeric string. Example values: ''2800983CE'', ''50059'', ''3101142CE''';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.GEO_STD_AREA_CD IS 'Standard code of the geo area';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.EXTRACTION_TM IS 'Date-time of the record';

              CREATE TABLE D_CBD_Static_Geo_Area_Level3_v6 (CBD_GEO_AREA_LEVEL3_ID VARCHAR, GEO_AREA_LEVEL_DES VARCHAR, CBD_GEO_AREA_LEVEL4_ID VARCHAR, LONGITUDE_LON_CO DOUBLE, LATITUDE_LAT_CO DOUBLE, ISO_3166_2_CD VARCHAR, GEO_AREA_ID VARCHAR, GEO_STD_AREA_CD VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);
                  COMMENT ON TABLE D_CBD_Static_Geo_Area_Level3 IS 'Geographical area level 3 (Region)';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.CBD_GEO_AREA_LEVEL3_ID IS 'Identifier of the geographical area Level 3 (Province). FORMAT: string containing a numerical code.';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.GEO_AREA_LEVEL_DES IS 'Description associated to the identifier level 3. FORMAT: alphanumeric string containing the name of the province.';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.CBD_GEO_AREA_LEVEL4_ID IS 'Identifier of the geographical area Level 4 (State/Region). FORMAT: string containing a numerical code.';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.LONGITUDE_LON_CO IS 'Longitude coordinates (in WGS84) associated with level 3';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.LATITUDE_LAT_CO IS 'Latitude coordinates (in WGS84) associated with level 3';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.ISO_3166_2_CD IS 'ISO 3166-2 associated';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.GEO_AREA_ID IS 'Description: Identifier of the geographical area assigned to the customer (typically the geographical area of the customer home). This identifier is a string code which values are defined in ''D_Geographical_Area'' entity. Format: alphanumeric string. Example values: ''2800983CE'', ''50059'', ''3101142CE''';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.GEO_STD_AREA_CD IS 'Standard code of the geo area';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.EXTRACTION_TM IS 'Date-time of the record';

              CREATE TABLE D_CBD_Static_Geo_Area_Level4_v6 (CBD_GEO_AREA_LEVEL4_ID VARCHAR, GEO_AREA_LEVEL_DES VARCHAR, LONGITUDE_LON_CO DOUBLE, LATITUDE_LAT_CO DOUBLE, HASC_1_CD VARCHAR, GEO_AREA_ID VARCHAR, GEO_STD_AREA_CD VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);
                  COMMENT ON TABLE D_CBD_Static_Geo_Area_Level4 IS 'Geographical area level 4 (min. Detail)';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.CBD_GEO_AREA_LEVEL4_ID IS 'Identifier of the geographical area Level 4 (State/Region). FORMAT: string containing a numerical code.';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.GEO_AREA_LEVEL_DES IS 'Description associated to the identifier level 4. FORMAT: alphanumerical string containing the name of the state/region. EXAMPLE VALUES: ''Asturias'', ''Andaluc\u00eda'', etc.';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.LONGITUDE_LON_CO IS 'Longitude coordinates (in WGS84) associated with level 4';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.LATITUDE_LAT_CO IS 'Latitude coordinates (in WGS84) associated with level 4';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.HASC_1_CD IS 'Hierarchical administrative subdivision codes ';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.GEO_AREA_ID IS 'Description: Identifier of the geographical area assigned to the customer (typically the geographical area of the customer home). This identifier is a string code which values are defined in ''D_Geographical_Area'' entity. Format: alphanumeric string. Example values: ''2800983CE'', ''50059'', ''3101142CE''';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.GEO_STD_AREA_CD IS 'Standard code of the geo area';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';
                  COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.EXTRACTION_TM IS 'Date-time of the record';

              CREATE TABLE D_CBD_Static_Station_Type_v6 (STATION_TYPE_CD VARCHAR, TECH_LEVEL_WEIGHT_QT FLOAT, STATION_TYPE_L2_DES VARCHAR, STATION_TYPE_L1_DES VARCHAR, STATION_TYPE_L2_ORDER_NUM INT, STATION_TYPE_L1_ORDER_NUM INT, STATION_TYPE_ORDER_NUM INT, CONSCIOUS_IND BOOLEAN, EXTRACTION_TM VARCHAR);
                  COMMENT ON TABLE D_CBD_Static_Station_Type IS 'Station types';
                  COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_CD IS 'Description: Type of device connected to the HGU router. It is used to find out which devices are connected to routers in households. Format: String. Example values: "A/V Equipment", "Air Conditioning", "Air Conditioning Control", "Apple Handheld Device", "Apple Home Device", "AudioCast", "Audiocast", "Barcode Printer", "Camera", "Car Dash Cam", "Cryptominner", "Digital Clock", "Dishwasher", "Drone Equipment", "GPS", "Gaming Console", "Hyper Media Player", "IP Camera", "IPC Hub", "IPC Video Recorder", "IoT Device", "Key Cutting Machine", "Media Center", "Monitoring Device", "Multimedia Player", "Network Access Point", "Network Equipment", "PC", "PDA", "PIR Sensor", "Print Server", "Printer", "Projector", "Raspberry", "Router", "Security System", "Smart AC Control", "Smart Air Freshener", "Smart Air Fryer", "Smart Air Ventilator", "Smart Animal Feeder", "Smart Baby Monitor", "Smart Blind", "Smart Bulb", "Smart Bulb Adapter", "Smart Car", "Smart Car e-Charger", "Smart Display e-bike", "Smart Energy Analyzer", "Smart Home Controller", "Smart Home Hub", "Smart Humidifier", "Smart Hydrometer Clock", "Smart Kitchen Appliances", "Smart Kitchen Scale", "Smart Lamp", "Smart Light Dimmer", "Smart Lock Control", "Smart Plug", "Smart Pool", "Smart Power Strip", "Smart Purifier", "Smart Scale", "Smart Signage", "Smart Speaker", "Smart Switch", "Smart TV", "Smart Thermostat", "Smart Toothbrush", "Smart Vacuum", "Smart WallSocket", "Smart Watch", "Smart Watch Fit", "Smart WifiButton", "Smartphone", "Smartphone/Tablet", "Smartwatch", "Smartwatch Fit", "Solar Panel Equipment", "Soundbar", "Steam Controller", "Storage Device", "TPV", "TV Dongle", "Tablet", "Tempest Weather System", "UPS", "VR/AR Headset", "Video Doorbell", "Video Intercom", "Video STB Equipment", "VideointercomIP", "Virtual Desktop", "VoIP Phone", "WAN Extender", "WiFi Extender", "Wifi Dongle", "Wireless Blood Pressure Monitor", "Wireless Bridge", "Wireless Headphones", "Wireless Router + VoIP Series", "e-Note", "eBook"';
                  COMMENT ON COLUMN D_CBD_Static_Station_Type.TECH_LEVEL_WEIGHT_QT IS 'Associated weight for the technologic level of the home';
                  COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L2_DES IS 'Description: Higher level device type grouping. Example values: "PCs & Home Office", "Smartphones / Tablets / eReaders / iWatch", "Multimedia Entertainment", "Gaming", "Sport & Health", "Smart Home", "Unknown", "Network Devices", "Security & Control"';
                  COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L1_DES IS 'Description: Intermediate level device type grouping. Example values: "Smart Speakers & Audio", "PCs & Home Office", "Video Entertainment", "Domestic Appliances", "Smart Energy & Lighting", "Apple Handheld Device", "Smartphones / Tablets / eReaders", "Gaming", "Sport & Health", "Network Devices", "Security & Control", "IoT"';
                  COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L2_ORDER_NUM IS 'Station type order level 2';
                  COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L1_ORDER_NUM IS 'Station type order level 1';
                  COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_ORDER_NUM IS 'Station type order';
                  COMMENT ON COLUMN D_CBD_Static_Station_Type.CONSCIOUS_IND IS 'Indicates if the related device type has energy efficiency';
                  COMMENT ON COLUMN D_CBD_Static_Station_Type.EXTRACTION_TM IS 'Date-time of the record';

              CREATE TABLE D_Segment_v8 (OPERATOR_ID VARCHAR, SEGMENT_ID VARCHAR, SEGMENT_DES VARCHAR, GBL_SEGMENT_ID VARCHAR, SEGMENT_GROUP_ID VARCHAR, SEGMENT_GROUP_DES VARCHAR, EXTRACTION_TM VARCHAR);
                  COMMENT ON TABLE D_Segment IS 'Classifications of the customers, attending to different segmentation criteria, for marketing and management issues, according to OB criteria and its correspondence with the global segment classification';
                  COMMENT ON COLUMN D_Segment.OPERATOR_ID IS 'Global Operator Identifier (Operator acting as owner of the information present in the current entity)';
                  COMMENT ON COLUMN D_Segment.SEGMENT_ID IS 'Description: Organisational segment of the client. Format: two letter string. Possible values: ''NT'' - NTT, ''GP'' - Residencial, ''PE'' - Pymes, ''RE'' - Residencial/SC, ''AU'' - Autonomos, ''OP'' - Operadores, ''GC'' - Grandes Clientes, ''RP'' - Residencial Prepago, ''TE'' - Telefonica, ''SC'' - Sin Clasificar, ''ME'' - Empresas';
                  COMMENT ON COLUMN D_Segment.SEGMENT_DES IS 'Description: Name or description of the organisational segment of the client (provides the description for each segment identifier). Format: string. Example values: ''Residencial'', ''Pymes'', ''Autonomos'', ''Operadores'', ''Grandes Clientes'', ''Sin Clasificar''';
                  COMMENT ON COLUMN D_Segment.GBL_SEGMENT_ID IS 'ID of the global segment classification';
                  COMMENT ON COLUMN D_Segment.SEGMENT_GROUP_ID IS 'ID code of the segmentation group';
                  COMMENT ON COLUMN D_Segment.SEGMENT_GROUP_DES IS 'Description of the segmentation group';
                  COMMENT ON COLUMN D_Segment.EXTRACTION_TM IS 'Date-time of the record';

              CREATE TABLE D_Fixed_Tariff_Plan_v8 (OPERATOR_ID VARCHAR, DAY_DT VARCHAR, TARIFF_PLAN_ID VARCHAR, TARIFF_PLAN_DES VARCHAR, VOICE_IND BOOLEAN, BBAND_IND BOOLEAN, TV_IND BOOLEAN, WORKSTATION_IND BOOLEAN, APP_IND BOOLEAN, VOICE_BUNDLE_QT FLOAT, BBAND_UP_SPEED_QT FLOAT, BBAND_DOWN_SPEED_QT FLOAT, TV_TYPE_CD VARCHAR, FIXED_SERVICE_COMMERCIAL_NAME VARCHAR, COMMERCIAL_IND BOOLEAN, TARIFF_PLAN_START_DT VARCHAR, TARIFF_PLAN_END_DT VARCHAR, CONVERGENT_IND BOOLEAN, BRAND_ID VARCHAR);
                  COMMENT ON TABLE D_Fixed_Tariff_Plan_v8 IS 'Every fixed Tariff to be applied, either Commercial, Convergent, Individual, or any other, for any product&service for the fixed client base';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.OPERATOR_ID IS 'Global Operator Identifier (Operator acting as owner of the information present in the current entity)';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.DAY_DT IS 'Year, month and day of the data  ### Additional Information  Format: YYYYMMDD (4 digits for year, months from 01 to 12, days from 01 to 31).';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TARIFF_PLAN_ID IS 'Unique identifier of the tariff plan';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TARIFF_PLAN_DES IS 'Name/short description of the tariff plan';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.VOICE_IND IS 'Indicates whether the line has a fixed line voice service associated.  Values: 0=No; 1=Yes.';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.BBAND_IND IS 'Indicates whether the line has a Broadband service associated.  Values: 0=No; 1=Yes.';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TV_IND IS 'Indicates if the line has a TV service associated.  Values: 0=No; 1=Yes.';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.WORKSTATION_IND IS 'Indicates if the line has a workstation service associated.  Values: 0=No; 1=Yes.';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.APP_IND IS 'Indicates if the line has the "Aplicateca service" associated.  Values: 0=No; 1=Yes.';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.VOICE_BUNDLE_QT IS 'Amount of data associated with the voice bundle';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.BBAND_UP_SPEED_QT IS 'Broadband up speed (Mbps)';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.BBAND_DOWN_SPEED_QT IS 'Broadband down speed (Mbps)';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TV_TYPE_CD IS 'Type of TV line';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.FIXED_SERVICE_COMMERCIAL_NAME IS 'Commercial name of the service';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.COMMERCIAL_IND IS 'Indicates if TARIFF_PLAN_ID refers to the COMMERCIAL_TARIFF_ID.    Fill-in with 1 if TARIFF_PLAN_ID refers to the COMMERCIAL_TARIFF_ID or 0 if it doesn''t    0 = Non commercial tariff  1 = commercial tariff';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TARIFF_PLAN_START_DT IS 'Start date of the tariff plan validity (that day is the first day when the tariff plan is applicable)  ### Additional Information  Format: YYYYMMDD (4 digits for year, months from 01 to 12, days from 01 to 31).';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TARIFF_PLAN_END_DT IS 'End date of the tariff plan validity (that day is the last day when the tariff plan is applicable)  ### Additional Information  Format: YYYYMMDD (4 digits for year, months from 01 to 12, days from 01 to 31).';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.CONVERGENT_IND IS 'Flag indicating if the current fixed tariff plan can be configured as a "Convergent tariff plan", i. e., a plan with special conditions due to the fact of including at least one Fixed line/service and one Mobile line.   0 = No (the plan can''t be configured as convergent)   1 = Yes (the plan can be configured as convergent)';
                  COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.BRAND_ID IS 'Commercial brand identifier. In order to differentiate among different brands in the same OB (e.g. Movistar, O2, Tuenti...)';
              ```
              Some of the former tables contain columns in full-qualified format. For instance, these are some examples of full-qualified columns:
              ```
              <record_name>.<field_name>: TEC_PLAT_REC.DEVICE_ID

              <record_name>.<subrecord_name>.<field_name>: TEC_PLAT_REC.TEC_PLAT_SUBCOMP_REC.DEVICE_ID

              ...
              ```
              Always use the full-qualified format when referring to columns in the tables. For instance, if you need to use the column 'TEC_PLAT_REC.DEVICE_ID', you should not refer to it as 'DEVICE_ID', but as 'TEC_PLAT_REC.DEVICE_ID'.

              **Explain in detail, step by step, all your decisions**.

              # General instructions

              ## How to use dimensional tables
              If you need to filter by a higher-level geographical area such as a region (Comunidad Autónoma), you will need to:
              - join the `GEO_AREA_ID` field of the data table (such as `CBD_HGU_Detail_Daily_v10`) with the `GEO_AREA_ID` field in `D_CBD_Static_Geo_Area_v6` table
                - then join the `CBD_GEO_AREA_LEVEL4_ID` field in the `D_CBD_Static_Geo_Area_v6` with the `CBD_GEO_AREA_LEVEL4_ID` field in the `D_CBD_Static_Geo_Area_Level4_v6` table
                - then compare the `GEO_AREA_LEVEL_DES` field in the `D_CBD_Static_Geo_Area_Level4_v6` table with the name of the region (e.g., 'Cantabria'), since the DESCRIPTION field does contain the actual name of the geographical area.
                **Only perform these joins if explicit filtering or grouping by geographical location is necessary**.

              If you need to filter the `CBD_Summary_HGU_Stations_Daily` table by a period of time you will need to:
              - join the `DEVICE_ID` field of the data table (such as `CBD_Summary_HGU_Detail_Daily`) with the `DEVICE_ID` field in `CBD_Summary_HGU_Stations_Daily` table
                - then join the `DAY_DT` field of the data table (such as `CBD_Summary_HGU_Detail_Daily`) with the `DAY_DT` field in `CBD_Summary_HGU_Stations_Daily` table
                **Only perform these joins if explicit filtering or grouping by detailed information at a station & interface level is necessary**.

              Use other dimensional tables in a similar way, if necessary.

              ## SQL query generation steps
              Follow these reasoning steps to generate the SQL query:
              - Step 1: Identify Necessary Tables
                - Step 2: Identify Useful Candidate Columns
                - Step 3: Assess if Tables and Columns are Sufficient to Answer the Question
                - Step 4: Plan the SQL Query
                - Step 5: Write the final SQL Query and apply the rules
                - Step 6: Check that the query actually can answer the question
                - Step 7: Create the result as a JSON object


              # Detailed instructions

              ### Step 1: Identify Necessary Tables for answering the question `{question}`
              First, identify which tables are necessary to answer the question `{question}`. Justify why you selected each of these tables.
              Use the following format:
              ```
              I need the following tables to answer the question:
              - <table_name>: <reasoning>
              - <table_name>: <reasoning>
              ...
              ```

              ### Step 2: Identify Useful Candidate Columns for answering the question `{question}`
              Identify which columns are useful to answer the question `{question}`. Justify why you selected each of these columns.
              Always include any column you think may be needed to answer the question. If there are similar columns in the table, you should always identify all of them. You will later choose which of them are more suitable to answer the question. But, at this stage, you should include **all the columns that may be useful**.
              Write the list of candidate columns you have identified and the reasoning after each column, using the following format:
              ```
              I can use the following candidate columns to answer the question (including all the columns that may be useful):
              - <table name>:
                - <column_name>: <copy here the full column description from schema, including possible values if present>: <reasoning>.
                - <column_name>: <copy here the full column description from schema, including possible values if present>: <reasoning>.
                ...
              - <table_name>:
                - <column_name>: <copy here the full column description from schema, including possible values if present>: <reasoning>.
                - <column_name>: <copy here the full column description from schema, including possible values if present>: <reasoning>.
                ...
              ...
              ```


              ### Step 3: Assess if Tables and Columns are Sufficient to Answer the Question for answering the question `{question}`
              Tell if the tables and columns you identified are enough to answer the question `{question}`. Make sure to justify your answer and check the actual descriptions of the columns in the table definitions and the user question.
              Write the answer using the following format:
              ```
              Possible to answer the question using the former columns:
              - <reasoning>
              - Result: <Yes|No>
              ```

              ### Step 4: Plan the SQL Query for answering the question `{question}`
              Explain, step by step, how you would write the SQL query to answer the question `{question}`, using the columns you identified.
              **Use the full qualified names of the columns**. **DO NOT USE THE `JSON_OBJECT` FUNCTION IN THE QUERY**.

              To finish this step, explain how you would write the SQL query to answer the question, using the columns you identified, taking into account the previous considerations for columns contained in maps, if there are any.

              ### Step 5: Write the final SQL Query and apply the rules for answering the question `{question}`
              Finally, write the SQL query to answer the question `{question}`, using the columns you identified.
              Remarks:
              **DO NOT USE THE `JSON_OBJECT` FUNCTION IN THE QUERY**.

              Check if you need to use any of the following **business rules** to build the query:
              ```json
              {{
                "rules": [
                  {{
                    "id": "B1",
                    "name": "Fiction",
                    "condition": "Look for tariff plans including \"ficción\" contents in the question `{{question}}`.\n",
                    "action": "You will need to look for one of the following patterns in the `TARIFF_PLAN_DES` field: '%FICCION%', '%FICCIÓN%', '%SERIES%', '%CINE%', '%FUSIÓN TOTAL%', '%FUSION TOTAL%'. To make the proper comparison, you should compare using uppercase letters. For instance, use a filter like this one: `UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%FICCION%' OR UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%FICCIÓN%' OR UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%SERIES%' OR UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%CINE%' OR UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%FUSIÓN TOTAL%' OR UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%FUSION TOTAL%'`\n"
                  }},
                  {{
                    "id": "B2",
                    "name": "Disney",
                    "condition": "Look for tariff plans including \"Disney\" contents in the question `{{question}}`.\n",
                    "action": "You will need to look for one of the following patterns in the `TARIFF_PLAN_DES` field: '%DISNEY%'. To make the proper comparison, you should compare using uppercase letters. For instance, use a filter like this one: `UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%DISNEY%'`\n"
                  }},
                  {{
                    "id": "B3",
                    "name": "Football",
                    "condition": "Look for tariff plans including football contents in the question `{{question}}`.\n",
                    "action": "You will need to look for one the following  patterns in the `TARIFF_PLAN_DES` field: '%FUTBOL%', '%FÚTBOL%', '%FUSION TOTAL%', '%FUSIÓN TOTAL%',  '%FUSION TA TOTAL%', '%FUSIÓN TA TOTAL%', '%LIGA%', '%CHAMPION%'. To make the proper comparison, you should use compare with uppercase letters. For instance, use a filter like this one:  `UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%FUTBOL%' OR UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%FÚTBOL%' OR UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%FUSION TOTAL%' OR UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%FUSIÓN TOTAL%' OR UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%LIGA%' OR UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%CHAMPION%'`\n"
                  }},
                  {{
                    "id": "B4",
                    "name": "Netflix",
                    "condition": "Look for tariff plans including \"Netflix\" contents in the question `{{question}}`.\n",
                    "action": "You will need to look for one of the following patterns in the `TARIFF_PLAN_DES` field: '%NETFLIX%', '%FICCIÓN%', '%FICCION%'. To make the proper comparison, you should compare using uppercase letters. For instance, use a filter like this one: `UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%NETFLIX%'`\n"
                  }},
                  {{
                    "id": "B5",
                    "name": "Promociones",
                    "condition": "Need to look for tariff plans including \"promotions\" in the question `{{question}}`.\n",
                    "action": "You will need to look for one of the following patterns in the `TARIFF_PLAN_DES` field: '%PROMO%'. To make the proper comparison, you should compare using uppercase letters. For instance, use a filter like this one: `UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%PROMO%'`\n"
                  }},
                  {{
                    "id": "B6",
                    "name": "Edad promedio 1",
                    "condition": "You are using the field `CBD_INFO_REC.CUST_AGE_NUM` in the query.\n",
                    "action": "You are not allowed to use the field `CBD_INFO_REC.CUST_AGE_NUM` in any query. You should use the field `CBD_INFO_REC.CUST_AGE_SEGMENT_CD` instead.\n"
                  }},
                  {{
                    "id": "B7",
                    "name": "Edad promedio 2",
                    "condition": "Calculate the average age of customers.\n",
                    "action": "You should use the  following calculation instead of AVG(CBD_INFO_REC.CUST_AGE_SEGMENT_CD): AVG(IF(CBD_INFO_REC.CUST_AGE_SEGMENT_CD = '1', NULL, CBD_INFO_REC.CUST_AGE_SEGMENT_CD))\n"
                  }},
                  {{
                    "id": "B8.1",
                    "name": "Query by customers",
                    "condition": "The question `{{question}}` is about customers.\n",
                    "action": "You should use the `CUSTOMER_ID` field to filter by customers.\n"
                  }},
                  {{
                    "id": "B8.2",
                    "name": "Query by homes",
                    "condition": "The question `{{question}}` is about homes.\n",
                    "action": "You should use the `DEVICE_ID` field to filter by homes.\n"
                  }},
                  {{
                    "id": "B9",
                    "name": "Station type",
                    "condition": "The field `STATION_TYPE_L1` or `STATION_TYPE_L2` are used in the query.\n",
                    "action": "The field `STATION_TYPE_L2` corresponds to a higher aggregation level than `STATION_TYPE_L1`.  `STATION_TYPE_L1` corresponds to an intermediate category, used only with analytical purposes.\n"
                  }},
                  {{
                    "id": "B10.1",
                    "name": "Computing of homes or devices (devices are also known as homes)",
                    "condition": "Check if the question: `{{question}}` is asking for a computation on devices or homes (devices are also known as homes).\n",
                    "action": "If no other condition is set, Include this constraint in the query: `DEVICE_INFO_REC.INACTIVITY_DEVICE_INFO_NUM < 24` (The device must be idle less than 24 hours)\n"
                  }},
                  {{
                    "id": "B10.2",
                    "name": "Computing of RSSI",
                    "condition": "Check if the question: `{{question}}` is asking for a computation on RSSI\n",
                    "action": "If no other condition is set, Include this constraint in the query: `DEVICE_INFO_REC.INACTIVITY_DEVICE_INFO_NUM < 24` (The device must be idle less than 24 hours)\n"
                  }},
                  {{
                    "id": "B10.3",
                    "name": "Computing of symmetrical speed",
                    "condition": "Check if the question: `{{question}}` is asking for a computation on symmetrical speed\n",
                    "action": "If no other condition is set, Include this constraint in the query: `DEVICE_INFO_REC.INACTIVITY_DEVICE_INFO_NUM < 24` (The device must be idle less than 24 hours)\n"
                  }},
                  {{
                    "id": "B11",
                    "name": "Penetración de un producto",
                    "condition": "You are asked for calculating \"la penetración de un producto\" in the question `{{question}}`.\n",
                    "action": "You should calculate the percentage of customers with that product.\n"
                  }},
                  {{
                    "id": "B12",
                    "name": "Obsolete routers",
                    "condition": "You are asked for obsolete routers in the question `{{question}}`.\n",
                    "action": "You should check for those with MANUFACT_HGU_CHIPSET_DES IN ('Askey Broadcom', 'Askey Econet','MitraStar Broadcom', 'MitraStar Econet').\n"
                  }},
                  {{
                    "id": "B13",
                    "name": "High value customers",
                    "condition": "You are asked for high value customers in the question `{{question}}`.\n",
                    "action": "Consider as high value customers those with a monthly revenue higher than 100 (TOTAL_CUST_RV > 100).\n"
                  }},
                  {{
                    "id": "B14.1",
                    "name": "Technological level formula",
                    "condition": "Check the technological level of a customer in the question `{{question}}`.\n",
                    "action": "Use the following formula on the field `TECH_LEVEL_WEIGHT_QT` of the table `D_CBD_STATIC_STATION_TYPE_v6`: `SUM(COALESCE(D_CBD_STATIC_STATION_TYPE_v6.TECH_LEVEL_WEIGHT_QT,0) + CASE WHEN AMM.VALUE.STATION_BRAND_DES = 'Ubiquiti' THEN 0.8 ELSE 0 END)/COUNT(DISTINCT DAY_DT)`\n"
                  }},
                  {{
                    "id": "B14.2",
                    "name": "Technological levels",
                    "condition": "You are asked for the technological level of a customer in the question `{{question}}`.\n",
                    "action": "Consider as **high technological level** customers those with a value higher or equal to 2.5. Consider as **medium technological level** customers those with a value higher or equal to 1 and lower than 2.5. Consider as **low technological level** customers those with a value lower than 1.\n"
                  }},
                  {{
                    "id": "B15",
                    "name": "Sport",
                    "condition": "Look for tariff plans including \"sport\" contents.\n",
                    "action": "You will need to look for one the following  patterns in the `TARIFF_PLAN_DES` field: '%DEPORTE%', '%TOTAL PLUS%', '%TOTAL SAT%PLUS%', '%MOTOR%', '%DAZN%'. To make the proper comparison, you should use compare with uppercase letters. For instance, use a filter like this one: `(UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%DEPORTE%' OR UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%TOTAL PLUS%' OR UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%TOTAL SAT%PLUS%' OR UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%MOTOR%' OR UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE '%DAZN%')`\n"
                  }},
                  {{
                    "id": "B16",
                    "name": "Residencial",
                    "condition": "The question `{{question}}` asks for homes or residential customers (B2C users).\n",
                    "action": "Use ONLY the constraint: `CBD_INFO_REC.SEGMENT_ID = 'GP'`. If you use the constraint `SEGMENT_DES = 'Residencial'`, NEVER USE the value in English ('Residential') but the value in Spanish ('Residencial').\n"
                  }},
                  {{
                    "id": "R1",
                    "name": "Temporary table fields",
                    "condition": "You use in a filter a given field from a temporary table, built using the `WITH` clause.\n",
                    "action": "Make sure that the field is actually present in the SELECT statement defining the temporary table.\n"
                  }},
                  {{
                    "id": "R2",
                    "name": "Temporary table field naming",
                    "condition": "You write a temporary table like this: `WITH temp_table AS (SELECT field1_prefix.field1 FROM table)`.\n",
                    "action": "Then you should refer to the field as `field1` and not as `field1_prefix.field1` in the rest of the query.\n"
                  }},
                  {{
                    "id": "R3",
                    "name": "Tariff plan",
                    "condition": "Look for some specific tariffs in the question `{{question}}`.\n",
                    "action": "Use the field `TARIFF_PLAN_DES` from the dimensional table D_Fixed_Tariff_Plan instead of using `CBD_INFO_REC.COMMERCIAL_TARIFF_ID` since this last one only contains identifiers without any meaning.\n"
                  }},
                  {{
                    "id": "R4.1",
                    "name": "Station type 1",
                    "condition": "The query uses `D_CBD_Static_Station_Type_v6.STATION_TYPE_L1_DES` or `D_CBD_Static_Station_Type_v6.STATION_TYPE_L2_DES`.\n",
                    "action": "Answer this question: does the value you are looking for match one of the possible values of these fields? Justify your answer. Enumerate the possible values of these fields if they are used.\n"
                  }},
                  {{
                    "id": "R4.2",
                    "name": "Station type 2",
                    "condition": "The query uses a filter with the field `D_CBD_Static_Station_Type_v6.STATION_TYPE_L1_DES` or `D_CBD_Static_Station_Type_v6.STATION_TYPE_L2_DES` and the value you are looking for does not match any of the possible values of these fields.\n",
                    "action": "You should use the field `STATION_TYPE_CD` instead. Write the result of the previous reasoning in detail.  REMEMBER TO FIX THE QUERY TO USE THE FIELD `STATION_TYPE_CD` INSTEAD.\n"
                  }},
                  {{
                    "id": "R5",
                    "name": "Counting entities",
                    "condition": "Count the number of customers, homes, devices or any other entities in the question `{{question}}`.\n",
                    "action": "You should ensure that you are actually counting distinct entities. Therefore you should use the `COUNT(DISTINCT ...)` function instead of `COUNT(...)`.\n"
                  }},
                  {{
                    "id": "R6",
                    "name": "Time scope less than a month",
                    "condition": "You are asked to answer a question for a time scope shorter than a month (daily or weekly) in the question `{{question}}`.\n",
                    "action": "you must not use the field `MONTH_DT` in your query.\n"
                  }},
                  {{
                    "id": "R7",
                    "name": "No UNION operator",
                    "condition": "You use the UNION operator in your queries.\n",
                    "action": "Avoid using the UNION operator in your queries.\n"
                  }},
                  {{
                    "id": "R8",
                    "name": "Counting entities",
                    "condition": "You are asked to count the number of customers, homes, devices or any other entities in the question `{{question}}`.\n",
                    "action": "You should ensure that the  result is actually a count and not a list of elements. Therefore you should use the COUNT function.\n"
                  }},
                  {{
                    "id": "R9",
                    "name": "IoT devices",
                    "condition": "Look for IoT (Internet of Things) devices in the question `{{question}}`.\n",
                    "action": "You should look for devices with `STATION_TYPE_L2_DES = 'Smart Home'`\n"
                  }},
                  {{
                    "id": "R10",
                    "name": "Router model",
                    "condition": "Check the model of the router in the question `{{question}}`.\n",
                    "action": "You should use the field `MANUFACT_HGU_CHIPSET_DES` (do not use other fields such as `MANUFACTURER_FW_VER_DES`).\n"
                  }},
                  {{
                    "id": "R11",
                    "name": "Weekly period",
                    "condition": "Query data from weekly period.\n",
                    "action": "You should start always with the specified day up to the same day of the following week. For instance, if you are  asked for the week starting on the day 2022-01-01, you should query data from 2022-01-01 to 2022-01-07.\n"
                  }},
                  {{
                    "id": "R12",
                    "name": "WiFi type",
                    "condition": "Look for information on a specific WiFi type, such as 2.4 GHz or 5 GHz.\n",
                    "action": "You should use the specific fields corresponding to these types.  For instance, if you need to look for WiFi5 device information, you should not use the field `STATIONS_REC.WIFI_REC.ALL_TECH_REC` but the field `STATIONS_REC.WIFI_REC.TECH_5G_REC`.\n"
                  }},
                  {{
                    "id": "R13",
                    "name": "Equivalent terms for WiFi technologies",
                    "condition": "You are looking for information on WiFi technologies.\n",
                    "action": "The following terms are considered equivalent: \n- `WiFi 5G`, `WiFi Technology 5G`, `WiFi5`.\n- `WiFi 2.4G`, `WiFi Technology 2.4G`, `WiFi2.4` , `WiFi2`, `WiFi Technology 2G`, `WiFi 2G`.\n"
                  }},
                  {{
                    "id": "R14",
                    "name": "Customer Satisfaction Index",
                    "condition": "The query uses the field `CSI_QT`.\n",
                    "action": "You should keep in mind that the field `CSI_QT` contains the `Customer Satisfaction Index` value. It is not a quality value but a satisfaction value.  Do not confuse it with Quality Index fields.\n"
                  }},
                  {{
                    "id": "R15",
                    "name": "Active HGU devices",
                    "condition": "Look for active HGU devices.\n",
                    "action": "You should keep in mind that the field `CUST_HGU_DEVICES_NUM` contains the number of active HGU devices of the customer, i.e. the number of active routers (HGUs) of the customer.  Do not confuse it with the number of active devices of the customer.\n"
                  }},
                  {{
                    "id": "R16",
                    "name": "Megabytes",
                    "condition": "The query uses fields starting with `MB_` or containing `_MB_` in their name.\n",
                    "action": "Keep in mind that fields starting with `MB_` or containing `_MB_` in their name refer to Megabytes. Take this into account during your queries.\n"
                  }},
                  {{
                    "id": "R17",
                    "name": "Gigabytes",
                    "condition": "The query uses fields starting with `GB_` or containing `_GB_` in their name.\n",
                    "action": "Keep in mind that fields starting with `GB_` or containing `_GB_` in their name refer to Gigabytes. Take this into account during your queries.\n"
                  }},
                  {{
                    "id": "R18",
                    "name": "RSSI meaning",
                    "condition": "The query uses the field `RSSI`.\n",
                    "action": "Keep in mind that the field `RSSI` refers to the `Received Signal Strength Indicator`. It is a measure of the power present in a received radio signal.\n"
                  }},
                  {{
                    "id": "R19",
                    "name": "Checking absence of a device",
                    "condition": "You need to look for homes without a specific type of device.\n",
                    "action": "You should not forget checking at least one of the following fields: `STATION_TYPE_L1_DES`, `STATION_TYPE_L2_DES`, `STATION_TYPE_CD`. In other words, you need an explicit filter checking the absence of the device.\n"
                  }}
                ]
              }}
              ```
              Explain whether you can apply any of the rules and explain how you would apply them in the SQL query.

              Always write your result following these steps:
              5.1. Question to be answered: <write again the question here>
              5.2. SQL query: <write the SQL query here>
              5.3. Reasoning: <explain why you wrote the query like that>
              5.4. Check of the rules, RULE BY RULE and FOR EACH RULE (one entry per rule). Write ALL the rules and tell if they are applied or not. Follow this format:
              - <rule1>: Should be applied, because <reason> | Should not be applied, because <reason>
              - <rule2>: Should be applied, because <reason> | Should not be applied, because <reason>
              ...
              5.5. Result of the execution of the rules that have been identified to be applied. Follow this format:
              - <rule1>: <result>
              - <rule2>: <result>
              ...
              5.6. Need to fix the query because <reason>. The following changes are needed: <change_1>, <change 2>, etc. | The query is already correct.
              5.7. SQL query to answer the question `{question}` after considering the previous **rules**: <write the SQL query here>. FIX THE QUERY IF NECESSARY. Check that the fixed query includes all the rules that should apply.


              ### Step 6: Check that the query actually can answer the question for answering the question `{question}`
              Check again if the generated query answers the question `{question}`.
              Follow these steps:
              6.1. Write the concepts involved in the question. Enumerate the concepts as a list. Follow this format:
              - <concept1>
              - <concept2>
              ...
              6.2. Write all the concepts of the question that are covered by the SQL query. Enumerate them and create a match list with the concepts from the previous step. Write down the part of the SQL query covering the concept. Take into account that conditions on specific proper names, such as model names, location names, etc., need to be explicitly checked with the description of the corresponding column. Follow this format:
              - <concept1>: covered in <sql query section> or not covered.
              - <concept2>: covered in <sql query section> or not covered.
              6.3. Find those concepts in the question that are not covered by the SQL query.
              6.4. Conclude whether the question can actually be answered by the generated query. Follow this format:
              - The question can be answered by the SQL query: <Yes|No>


              ### Step 7: Create the result as a JSON object for answering the question `{question}`
              Return the result as a unique JSON object, with the following structure:
              {{
                "result": <Write the SQL query here. **MAKE SURE THAT THE STATEMENT `SELECT JSON_OBJECT` is not used in the query and Use the full qualified names of the columns. Generate a valid SQL sentence in a single line without new line characters.**>,
                "status": "OK",
                "reason": <a reasoning explaining the query>
              }}
              If the former table does not contain the necessary data to answer the question, return the following JSON object:
              {{
                "result": null,
                "status": "ERROR",
                "reason": <a reasoning explaining why it is not possible to answer the question>
              }}
              Make sure that the JSON object is correctly formatted, and can be parsed by a JSON parser.

              **Please, ALWAYS follow the 7 steps presented in the instructions.** Start reasoning with ### Step 1 and finish with ### Step 7.

Some considerations to keep in mind:
Make sure that the LLM copilot-rag-model-gw-raw-gpt-4-o is defined within the LLMs field.
In turn, the preset defined within this LLM must be defined in the ConfigMap atria-model-gw-config.
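
The prompt above mixes single braces (`{question}`) with doubled (`{{`, `{{question}}`) and quadrupled (`${{{{TABLE}}}}`) braces. The following is a minimal sketch of what each form becomes after one substitution pass, assuming the template is rendered with Python `str.format`-style formatting. The source does not state which templating mechanism ATRIA uses, so treat this as an illustration only; `render_prompt` and the shortened `TEMPLATE` are hypothetical.

```python
# Hypothetical, shortened excerpt of the prompt; the brace patterns are copied from it.
TEMPLATE = (
    "First, identify which tables are necessary to answer the question `{question}`.\n"
    '{{"condition": "Look for \\"Disney\\" contents in the question `{{question}}`.", '
    '"action": "UPPER(${{{{TABLE}}}}.TARIFF_PLAN_DES) LIKE \'%DISNEY%\'"}}'
)


def render_prompt(template: str, question: str) -> str:
    """Apply one str.format pass (assumed behaviour, not confirmed by the source)."""
    return template.format(question=question)


rendered = render_prompt(TEMPLATE, "How many homes have Disney?")
print(rendered)
```

Under this assumption, `{question}` is replaced by the user question, `{{` and `}}` collapse to literal braces (so the JSON stays valid), `{{question}}` becomes the literal text `{question}`, and `${{{{TABLE}}}}` becomes `${{TABLE}}`. If a single brace is left unescaped inside the JSON rules, `str.format` raises an error, which is one reason to check the escaping after every prompt edit.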

Save and close the ConfigMap

Adjust max_tokens param

Open the ConfigMap atria-rag-config
kubectl edit configmap atria-rag-config -n <namespace>
(Replace <namespace> with the specific one)
In the llms key, find copilot-rag-model-gw-raw-gpt-4-o and update the max_tokens field:
```
max_tokens: 16384
```
Save and close the ConfigMap

Adjust timeouts in aura-gateway-api and Nginx

Open the ConfigMap aura-gateway-api
kubectl edit configmap aura-gateway-api -n <namespace>
(Replace <namespace> with the specific one)
In the config key, find and update the AURA_REQUEST_TIMEOUT field:
```
AURA_REQUEST_TIMEOUT: 490000
```
Save and close the ConfigMap
Open the ConfigMap aura-services
kubectl edit vs aura-services -n <namespace>
(Replace <namespace> with the specific one)
In the aura-gateway-api key, find and update the read_timeout and send_timeout fields:
```
read_timeout: 495s
send_timeout: 495s
```
Save and close the ConfigMap
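
The values above suggest that the proxy timeouts (495s) are meant to be slightly longer than the gateway timeout (490000). The following sketch checks that relationship, assuming `AURA_REQUEST_TIMEOUT` is expressed in milliseconds; the source does not state the unit, and the helper names are hypothetical.

```python
def parse_seconds(value: str) -> float:
    """Convert a value such as '495s' to a number of seconds."""
    if not value.endswith("s"):
        raise ValueError(f"expected a value in seconds such as '495s', got {value!r}")
    return float(value[:-1])


def proxy_outlasts_gateway(gateway_timeout_ms: int, read_timeout: str, send_timeout: str) -> bool:
    """Return True when both proxy timeouts are longer than the gateway timeout.

    Assumption: the gateway timeout is given in milliseconds.
    """
    gateway_seconds = gateway_timeout_ms / 1000
    return (
        parse_seconds(read_timeout) > gateway_seconds
        and parse_seconds(send_timeout) > gateway_seconds
    )


# Values taken from the steps above.
print(proxy_outlasts_gateway(490000, "495s", "495s"))  # True
```

If you raise one of the timeouts, running the same check with the new values is a quick way to confirm that the proxy would not cut the connection before the gateway gives up.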

Upload documents and execute generate-db job

Upload the documents in the Azure container atria-resources.

Remember to upload the files to the folder you defined previously in the configuration: project-copilot-reduced/jsonl
Keep in mind the allowed document formats, which are set in the project's loader variable.

Finally, execute the atria-rag-generate-db job to update the data into the environment.

Restart the deployments

Restart the atria-rag-server deployment so that the pod picks up the changes.
kubectl rollout restart deployment atria-rag -n <namespace>
Restart the atria-model-gateway deployment so that the pod picks up the changes.
kubectl rollout restart deployment atria-model-gw -n <namespace>
Restart the aura-gateway-api deployment so that the pod picks up the changes.
kubectl rollout restart deployment aura-gateway-api -n <namespace>

(Replace <namespace> with the specific one)
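
To avoid typing the namespace three times, the restart commands can be generated from a single value. This is a minimal sketch that only builds the command strings and does not execute anything; `restart_commands` is a hypothetical helper, `es-pre` is used as an example namespace, and the `kubectl rollout status` lines are an optional addition that is not part of the steps above.

```python
import shlex

# Deployment names taken from the restart steps above.
DEPLOYMENTS = ["atria-rag", "atria-model-gw", "aura-gateway-api"]


def restart_commands(namespace: str) -> list[str]:
    """Return one restart command per deployment, followed by one status check each."""
    commands = []
    for name in DEPLOYMENTS:
        commands.append(
            shlex.join(["kubectl", "rollout", "restart", "deployment", name, "-n", namespace])
        )
    for name in DEPLOYMENTS:
        # Optional addition: waits until the rollout of the deployment has finished.
        commands.append(
            shlex.join(["kubectl", "rollout", "status", "deployment", name, "-n", namespace])
        )
    return commands


for command in restart_commands("es-pre"):
    print(command)
```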

Update Aura applications configuration via API

Once the changes have been updated and saved in the ConfigMaps, the aura-configuration-api must be updated to indicate the application that will make use of this preset.

This document includes a specific scenario in the process of modifying API configuration, described in the document Hot swapping of Aura applications configuration.

    curl --location --request PATCH 'https://svc-<env>.auracognitive.com/aura-services/v2/configuration/applications/3e1cb831-d5bf-423d-8bef-4abcc53dfa97' \
    --header 'correlator: <uuid>' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header 'Authorization: APIKEY {{apikey}}' \
    --data '{
        "id": "3e1cb831-d5bf-423d-8bef-4abcc53dfa97",
        "models": {
            "presets": [
                "copilot-preset-rag",
                "copilot-reduced-preset-rag",
                "raw-gpt-4o",
                "openai-preset-gpt-35-turbo-copilot-generative",
                "openai-preset-gpt-4o-copilot-generative",
                "openai-preset-gpt-4o-mini-copilot-generative"
            ]
        }
    }'

The request must include all of the application's presets, not only the new one.
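
Because the full preset list has to be sent, it can help to build the request body from the presets the application already has plus the new one. The sketch below only builds the JSON body; how the current presets are obtained is not covered here, and `build_patch_body` and the choice of which preset counts as "new" are illustrative assumptions.

```python
import json

# Application id and preset names taken from the request above.
APP_ID = "3e1cb831-d5bf-423d-8bef-4abcc53dfa97"
CURRENT_PRESETS = [
    "copilot-preset-rag",
    "raw-gpt-4o",
    "openai-preset-gpt-35-turbo-copilot-generative",
    "openai-preset-gpt-4o-copilot-generative",
    "openai-preset-gpt-4o-mini-copilot-generative",
]


def build_patch_body(app_id: str, current_presets: list[str], new_preset: str) -> str:
    """Return the PATCH body with every existing preset plus the new one, without duplicates."""
    presets = list(current_presets)
    if new_preset not in presets:
        presets.append(new_preset)
    return json.dumps({"id": app_id, "models": {"presets": presets}}, indent=4)


print(build_patch_body(APP_ID, CURRENT_PRESETS, "copilot-reduced-preset-rag"))
```

The printed body can be passed to the `--data` option of the curl command shown above.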

Load original config and deployments rollback

To return to the original configuration, carry out the following steps:

Load the original ConfigMap atria-model-gw-config.
kubectl apply -f <local_file_path>/model-gw-config.yaml -n <namespace>
Load the original ConfigMap atria-rag-config.
kubectl apply -f <local_file_path>/rag-config.yaml -n <namespace>
Load the original ConfigMap aura-gateway-api.
kubectl apply -f <local_file_path>/gateway-config.yaml -n <namespace>
Load the original ConfigMap aura-services.
kubectl apply -f <local_file_path>/services-config.yaml -n <namespace>

(Replace <namespace> with the specific one and <local_file_path> with the desired path)

Restart the atria-model-gateway deployment so that the pod picks up the changes.
kubectl rollout restart deployment atria-model-gw -n <namespace>
Restart the atria-rag-server deployment so that the pod picks up the changes.
kubectl rollout restart deployment atria-rag -n <namespace>
Restart the aura-gateway-api deployment so that the pod picks up the changes.
kubectl rollout restart deployment aura-gateway-api -n <namespace>

(Replace <namespace> with the specific one)

17 - Update ATRIA configuration using ConfigMap (previous to Metallica)

Guidelines valid for releases previous to Metallica

This document includes a specific scenario in the process for modifying ATRIA configuration, described in the document Modify ATRIA components configuration

Guidelines to update certain ATRIA configuration parameters related to calls to Aura Copilot in the Kiss release, in a specific environment, by editing ConfigMaps. Specifically:

To modify the timeout parameter in the ATRIA gpt-4o model
To modify the SQL prompt in the atria-rag-server project
Upload files and launch the generate-db job

Enable ConfigMap

As a prerequisite, you need a KUBECONFIG with sufficient permissions and access to the environment.

We have one ConfigMap for each component:

atria-model-gateway: atria-model-gw-config
atria-rag-server: atria-rag-config

To modify a ConfigMap, use the following commands:

kubectl edit configmap atria-model-gw-config -n <namespace>
kubectl edit configmap atria-rag-config -n <namespace>

Replace <namespace> with the corresponding environment: es-pre or es-pro.

You can also use visual tools for this modification, such as Lens or Sublime.

Edit models timeouts

Guidelines for modifying the timeout parameter of the ATRIA gpt-4o model in a specific environment by editing the component's ConfigMap:

Open the ConfigMap atria-model-gw-config and look for the model gpt-4o.
kubectl edit configmap atria-model-gw-config -n <namespace>
(Replace <namespace> with the specific namespace.)
Set the timeout and read keys to 240.
Save and close the ConfigMap.
Restart the deployment so that the pod picks up the changes.
kubectl rollout restart deployment atria-model-gw -n <namespace>
(Replace <namespace> with the specific namespace.)
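
The edit in the steps above changes two values in the gpt-4o entry. The sketch below applies that change to an in-memory structure. Only the key names `timeout` and `read`, the model name and the value 240 come from this guideline; the layout of the `models` dictionary and the helper name `set_model_timeouts` are invented for illustration, and the actual layout inside atria-model-gw-config may differ.

```python
import copy


def set_model_timeouts(models, model_name, value):
    """Return a copy of `models` with `timeout` and `read` set for one model."""
    if model_name not in models:
        raise KeyError(f"model {model_name!r} not found in configuration")
    updated = copy.deepcopy(models)  # leave the original untouched as a backup
    updated[model_name]["timeout"] = value
    updated[model_name]["read"] = value
    return updated


# Hypothetical excerpt; the real ConfigMap layout may differ.
models = {
    "gpt-4o": {"timeout": 60, "read": 60},
    "other-model": {"timeout": 60, "read": 60},
}

updated = set_model_timeouts(models, "gpt-4o", 240)
print(updated["gpt-4o"])
```

Working on a copy keeps the previous values available in case the change has to be reverted.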

Edit models prompts

Guidelines for modifying the SQL prompt of the project-copilot project in atria-rag-server, in a specific environment, by editing the component's ConfigMap:

Open the ConfigMap atria-rag-config.
kubectl edit configmap atria-rag-config -n <namespace>
(Replace <namespace> with the specific namespace.)

Important: Before modifying anything, it is highly recommended to make a backup of the ConfigMap content, because the format is easily broken.

Copy the whole content of the projects.yaml.project key and paste it into a new local file. Since it is a string, you need to transform it to YAML format to make it easier to modify. You can use the YAML to string tool to convert a string to YAML and vice versa, or YAML Lint to validate the YAML format.
When writing prompts, make sure that no tab characters ('\t') slip in. In addition, the indentation of multi-line strings must be correct.
projects.yaml.project contains all projects. Search for the project to be modified: project-copilot.
Within this project, inside the prompts key, add (or modify if it already exists) the generate_sql_query field.
Once the prompt is set, copy the whole content, convert it back to a string, paste it into the projects.yaml.project key of the ConfigMap, and save.
Restart the deployment so that the pod picks up the changes.
kubectl rollout restart deployment atria-rag -n <namespace>
(Replace <namespace> with the specific namespace.)
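
The tab check mentioned in the steps above can be automated before the content is pasted back into the ConfigMap. The following is a minimal sketch using only the Python standard library; the function name `find_tabs` and the sample text are hypothetical.

```python
def find_tabs(text):
    """Return (line_number, column) pairs, 1-based, for every tab in `text`."""
    positions = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for column, char in enumerate(line, start=1):
            if char == "\t":
                positions.append((line_number, column))
    return positions


# Sample text with a tab at the start of the third line.
prompt = "DEFAULT: |\n  Generate a SQL query\n\tto answer the question\n"
print(find_tabs(prompt))  # [(3, 1)]
```

An empty list means that the text contains no tab characters. The check does not validate indentation, so a YAML validator is still needed for that.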
Generate SQL query

    DEFAULT: | 
    Generate a SQL query statement to answer the following question:
    `{question}`
    
    Use the data contained in the following table. You have its definition in SQL and in Avro.
    {sql_table_definition}
        
    The following tables, containing auxiliary information, are also available:
    ```sql
    CREATE TABLE D_CBD_Static_Geo_Area_v6 (GEO_AREA_ID VARCHAR, CBD_GEO_AREA_LEVEL1_ID VARCHAR, CBD_GEO_AREA_LEVEL2_ID VARCHAR, CBD_GEO_AREA_LEVEL3_ID VARCHAR, CBD_GEO_AREA_LEVEL4_ID VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);
        COMMENT ON TABLE D_CBD_Static_Geo_Area IS 'Geographical areas. This table contains foreign keys to the different levels of geographical areas. In particular, it contains the foreign keys to these tables: CBD_Static_Geo_Area_Level1, CBD_Static_Geo_Area_Level2, CBD_Static_Geo_Area_Level3, CBD_Static_Geo_Area_Level4. Therefore, this table is used, via JOIN, to query the geographical information contained in the different levels of geographical areas. For instance, if you have a table T with a field GEO_AREA_ID and you need to check whether this location corresponds to the region of Asturias you will need to look for GEO_AREA_ID in this table, then extract the CBD_GEO_AREA_LEVEL4_ID and query the table CBD_Static_Geo_Area_Level4 to get the name of the region.';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area.GEO_AREA_ID IS 'Description: Identifier of the geographical area assigned to the customer (typically the geographical area of the customer home). This identifier is a string code which values are defined in ''D_Geographical_Area'' entity. Format: alphanumeric string. Example values: ''2800983CE'', ''50059'', ''3101142CE''';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL1_ID IS 'Identifier of the geographical area Level 1 (max level of detail: CP or similar). FORMAT: string containing a numerical code. This field does not contain location names.';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL2_ID IS 'Identifier of the geographical area Level 2 (City/Town). FORMAT: string containing a numerical code. This field does not contain location names.';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL3_ID IS 'Identifier of the geographical area Level 3 (Province). FORMAT: string containing a numerical code. This field does not contain location names.';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area.CBD_GEO_AREA_LEVEL4_ID IS 'Identifier of the geographical area Level 4 (State/Region). FORMAT: string containing a numerical code. This field does not contain location names.';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area.EXTRACTION_TM IS 'Date-time of the record';
        
    CREATE TABLE D_CBD_Static_Geo_Area_Level2_v6 (CBD_GEO_AREA_LEVEL2_ID VARCHAR, GEO_AREA_LEVEL_DES VARCHAR, CBD_GEO_AREA_LEVEL3_ID VARCHAR, LONGITUDE_LON_CO DOUBLE, LATITUDE_LAT_CO DOUBLE, GEO_AREA_ID VARCHAR, GEO_STD_AREA_CD VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);
        COMMENT ON TABLE D_CBD_Static_Geo_Area_Level2 IS 'Geographical area level 2 (State)';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.CBD_GEO_AREA_LEVEL2_ID IS 'Identifier of the geographical area Level 2 (City/Town). FORMAT: string containing a numerical code.';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.GEO_AREA_LEVEL_DES IS 'Description associated to the identifier level 2. FORMAT: alphanumeric string containing the name of the city/town.';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.CBD_GEO_AREA_LEVEL3_ID IS 'Identifier of the geographical area Level 3 (Province)';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.LONGITUDE_LON_CO IS 'Longitude coordinates (in WGS84) associated with level 2';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.LATITUDE_LAT_CO IS 'Latitude coordinates (in WGS84) associated with level 2';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.GEO_AREA_ID IS 'Description: Identifier of the geographical area assigned to the customer (typically the geographical area of the customer home). This identifier is a string code which values are defined in ''D_Geographical_Area'' entity. Format: alphanumeric string. Example values: ''2800983CE'', ''50059'', ''3101142CE''';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.GEO_STD_AREA_CD IS 'Standard code of the geo area';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level2.EXTRACTION_TM IS 'Date-time of the record';
        
    CREATE TABLE D_CBD_Static_Geo_Area_Level3_v6 (CBD_GEO_AREA_LEVEL3_ID VARCHAR, GEO_AREA_LEVEL_DES VARCHAR, CBD_GEO_AREA_LEVEL4_ID VARCHAR, LONGITUDE_LON_CO DOUBLE, LATITUDE_LAT_CO DOUBLE, ISO_3166_2_CD VARCHAR, GEO_AREA_ID VARCHAR, GEO_STD_AREA_CD VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);
        COMMENT ON TABLE D_CBD_Static_Geo_Area_Level3 IS 'Geographical area level 3 (Region)';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.CBD_GEO_AREA_LEVEL3_ID IS 'Identifier of the geographical area Level 3 (Province). FORMAT: string containing a numerical code.';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.GEO_AREA_LEVEL_DES IS 'Description associated to the identifier level 3. FORMAT: alphanumeric string containing the name of the province.';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.CBD_GEO_AREA_LEVEL4_ID IS 'Identifier of the geographical area Level 4 (State/Region). FORMAT: string containing a numerical code.';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.LONGITUDE_LON_CO IS 'Longitude coordinates (in WGS84) associated with level 3';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.LATITUDE_LAT_CO IS 'Latitude coordinates (in WGS84) associated with level 3';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.ISO_3166_2_CD IS 'ISO 3166-2 associated';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.GEO_AREA_ID IS 'Description: Identifier of the geographical area assigned to the customer (typically the geographical area of the customer home). This identifier is a string code which values are defined in ''D_Geographical_Area'' entity. Format: alphanumeric string. Example values: ''2800983CE'', ''50059'', ''3101142CE''';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.GEO_STD_AREA_CD IS 'Standard code of the geo area';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level3.EXTRACTION_TM IS 'Date-time of the record';
        
    CREATE TABLE D_CBD_Static_Geo_Area_Level4_v6 (CBD_GEO_AREA_LEVEL4_ID VARCHAR, GEO_AREA_LEVEL_DES VARCHAR, LONGITUDE_LON_CO DOUBLE, LATITUDE_LAT_CO DOUBLE, HASC_1_CD VARCHAR, GEO_AREA_ID VARCHAR, GEO_STD_AREA_CD VARCHAR, OB_ALPHA_ID VARCHAR, EXTRACTION_TM VARCHAR);
        COMMENT ON TABLE D_CBD_Static_Geo_Area_Level4 IS 'Geographical area level 4 (min. Detail)';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.CBD_GEO_AREA_LEVEL4_ID IS 'Identifier of the geographical area Level 4 (State/Region). FORMAT: string containing a numerical code.';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.GEO_AREA_LEVEL_DES IS 'Description associated to the identifier level 4. FORMAT: alphanumerical string containing the name of the state/region. EXAMPLE VALUES: ''Asturias'', ''Andaluc\u00eda'', etc.';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.LONGITUDE_LON_CO IS 'Longitude coordinates (in WGS84) associated with level 4';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.LATITUDE_LAT_CO IS 'Latitude coordinates (in WGS84) associated with level 4';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.HASC_1_CD IS 'Hierarchical administrative subdivision codes ';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.GEO_AREA_ID IS 'Description: Identifier of the geographical area assigned to the customer (typically the geographical area of the customer home). This identifier is a string code which values are defined in ''D_Geographical_Area'' entity. Format: alphanumeric string. Example values: ''2800983CE'', ''50059'', ''3101142CE''';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.GEO_STD_AREA_CD IS 'Standard code of the geo area';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.OB_ALPHA_ID IS 'Alphanumeric Organizational Business ID';
        COMMENT ON COLUMN D_CBD_Static_Geo_Area_Level4.EXTRACTION_TM IS 'Date-time of the record';
            
    CREATE TABLE D_CBD_Static_Station_Type_v6 (STATION_TYPE_CD VARCHAR, TECH_LEVEL_WEIGHT_QT FLOAT, STATION_TYPE_L2_DES VARCHAR, STATION_TYPE_L1_DES VARCHAR, STATION_TYPE_L2_ORDER_NUM INT, STATION_TYPE_L1_ORDER_NUM INT, STATION_TYPE_ORDER_NUM INT, CONSCIOUS_IND BOOLEAN, EXTRACTION_TM VARCHAR);
        COMMENT ON TABLE D_CBD_Static_Station_Type IS 'Station types';
        COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_CD IS 'Description: Type of device connected to the HGU router. It used to find out which devices are connected to routers in households. Format: String. Example values: "A/V Equipment", "Air Conditioning", "Air Conditioning Control", "Apple Handheld Device", "Apple Home Device", "AudioCast", "Audiocast", "Barcode Printer", "Camera", "Car Dash Cam", "Cryptominner", "Digital Clock", "Dishwasher", "Drone Equipment", "GPS", "Gaming Console", "Hyper Media Player", "IP Camera", "IPC Hub", "IPC Video Recorder", "IoT Device", "Key Cutting Machine", "Media Center", "Monitoring Device", "Multimedia Player", "Network Access Point", "Network Equipment", "PC", "PDA", "PIR Sensor", "Print Server", "Printer", "Projector", "Raspberry", "Router", "Security System", "Smart AC Control", "Smart Air Freshener", "Smart Air Fryer", "Smart Air Ventilator", "Smart Animal Feeder", "Smart Baby Monitor", "Smart Blind", "Smart Bulb", "Smart Bulb Adapter", "Smart Car", "Smart Car e-Charger", "Smart Display e-bike", "Smart Energy Analyzer", "Smart Home Controller", "Smart Home Hub", "Smart Humidifier", "Smart Hydrometer Clock", "Smart Kitchen Appliances", "Smart Kitchen Scale", "Smart Lamp", "Smart Light Dimmer", "Smart Lock Control", "Smart Plug", "Smart Pool", "Smart Power Strip", "Smart Purifier", "Smart Scale", "Smart Signage", "Smart Speaker", "Smart Switch", "Smart TV", "Smart Thermostat", "Smart Toothbrush", "Smart Vacuum", "Smart WallSocket", "Smart Watch", "Smart Watch Fit", "Smart WifiButton", "Smartphone", "Smartphone/Tablet", "Smartwatch", "Smartwatch Fit", "Solar Panel Equipment", "Soundbar", "Steam Controller", "Storage Device", "TPV", "TV Dongle", "Tablet", "Tempest Weather System", "UPS", "VR/AR Headset", "Video Doorbell", "Video Intercom", "Video STB Equipment", "VideointercomIP", "Virtual Desktop", "VoIP Phone", "WAN Extender", "WiFi Extender", "Wifi Dongle", "Wireless Blood Pressure Monitor", "Wireless Bridge", 
"Wireless Headphones", "Wireless Router + VoIP Series", "e-Note", "eBook"';
        COMMENT ON COLUMN D_CBD_Static_Station_Type.TECH_LEVEL_WEIGHT_QT IS 'Associated weight for the technologic level of the home';
        COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L2_DES IS 'Description: Higher level device type grouping. Example values: "PCs & Home Office", "Smartphones / Tablets / eReaders / iWatch", "Multimedia Entertainment", "Gaming", "Sport & Health", "Smart Home", "Unknown", "Network Devices", "Security & Control"';
        COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L1_DES IS 'Description: Intermediate level device type grouping. Example values: "Smart Speakers & Audio", "PCs & Home Office", "Video Entertainment", "Domestic Appliances", "Smart Energy & Lighting", "Apple Handheld Device", "Smartphones / Tablets / eReaders", "Gaming", "Sport & Health", "Network Devices", "Security & Control", "IoT"';
        COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L2_ORDER_NUM IS 'Station type order level 2';
        COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_L1_ORDER_NUM IS 'Station type order level 1';
        COMMENT ON COLUMN D_CBD_Static_Station_Type.STATION_TYPE_ORDER_NUM IS 'Station type order';
        COMMENT ON COLUMN D_CBD_Static_Station_Type.CONSCIOUS_IND IS 'Indicates if the related device type has energy efficiency';
        COMMENT ON COLUMN D_CBD_Static_Station_Type.EXTRACTION_TM IS 'Date-time of the record';
        
    CREATE TABLE D_Segment_v8 (OPERATOR_ID VARCHAR, SEGMENT_ID VARCHAR, SEGMENT_DES VARCHAR, GBL_SEGMENT_ID VARCHAR, SEGMENT_GROUP_ID VARCHAR, SEGMENT_GROUP_DES VARCHAR, EXTRACTION_TM VARCHAR);
        COMMENT ON TABLE D_Segment IS 'Classifications of the customers, attending to different segmentation criteria, for marketing and management issues, according to OB criteria and its correspondence with the global segment classification';
        COMMENT ON COLUMN D_Segment.OPERATOR_ID IS 'Global Operator Identifier (Operator acting as owner of the information present in the current entity)';
        COMMENT ON COLUMN D_Segment.SEGMENT_ID IS 'Description: Organisational segment of the client. Format: two letter string. Possible values: ''NT'' - NTT, ''GP'' - Residencial, ''PE'' - Pymes, ''RE'' - Residencial/SC, ''AU'' - Autonomos, ''OP'' - Operadores, ''GC'' - Grandes Clientes, ''RP'' - Residencial Prepago, ''TE'' - Telefonica, ''SC'' - Sin Clasificar, ''ME'' - Empresas';
        COMMENT ON COLUMN D_Segment.SEGMENT_DES IS 'Description: Name or description of the organisational segment of the client (provides the description for each segment identifier). Format: string. Example values: ''Residencial'', ''Pymes'', ''Autonomos'', ''Operadores'', ''Grandes Clientes'', ''Sin Clasificar''';
        COMMENT ON COLUMN D_Segment.GBL_SEGMENT_ID IS 'ID of the global segment classification';
        COMMENT ON COLUMN D_Segment.SEGMENT_GROUP_ID IS 'ID code of the segmentation group';
        COMMENT ON COLUMN D_Segment.SEGMENT_GROUP_DES IS 'Description of the segmentation group';
        COMMENT ON COLUMN D_Segment.EXTRACTION_TM IS 'Date-time of the record';
    
    CREATE TABLE D_Fixed_Tariff_Plan_v8 (OPERATOR_ID VARCHAR, DAY_DT VARCHAR, TARIFF_PLAN_ID VARCHAR, TARIFF_PLAN_DES VARCHAR, VOICE_IND BOOLEAN, BBAND_IND BOOLEAN, TV_IND BOOLEAN, WORKSTATION_IND BOOLEAN, APP_IND BOOLEAN, VOICE_BUNDLE_QT FLOAT, BBAND_UP_SPEED_QT FLOAT, BBAND_DOWN_SPEED_QT FLOAT, TV_TYPE_CD VARCHAR, FIXED_SERVICE_COMMERCIAL_NAME VARCHAR, COMMERCIAL_IND BOOLEAN, TARIFF_PLAN_START_DT VARCHAR, TARIFF_PLAN_END_DT VARCHAR, CONVERGENT_IND BOOLEAN, BRAND_ID VARCHAR);
        COMMENT ON TABLE D_Fixed_Tariff_Plan_v8 IS 'Every fixed Tariff to be applied, either Commercial, Convergent, Individual, or any other, for any product&service for the fixed client base';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.OPERATOR_ID IS 'Global Operator Identifier (Operator acting as owner of the information present in the current entity)';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.DAY_DT IS 'Year, month and day of the data  ### Additional Information  Format: YYYYMMDD (4 digits for year, months from 01 to 12, days from 01 to 31).';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TARIFF_PLAN_ID IS 'Unique identifier of the tariff plan';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TARIFF_PLAN_DES IS 'Name/short description of the tariff plan';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.VOICE_IND IS 'Indicates whether the line has a fixed line voice service associated.  Values: 0=No; 1=Yes.';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.BBAND_IND IS 'Indicates whether the line has a Broadband service associated.  Values: 0=No; 1=Yes.';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TV_IND IS 'Indicates if the line has a TV service associated.  Values: 0=No; 1=Yes.';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.WORKSTATION_IND IS 'Indicates if the line has a workstation service associated.  Values: 0=No; 1=Yes.';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.APP_IND IS 'Indicates if the line has the "Aplicateca service" associated.  Values: 0=No; 1=Yes.';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.VOICE_BUNDLE_QT IS 'Amount of data associated with the voice bundle';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.BBAND_UP_SPEED_QT IS 'Broadband up speed (Mbps)';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.BBAND_DOWN_SPEED_QT IS 'Broadband down speed (Mbps)';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TV_TYPE_CD IS 'Type of TV line';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.FIXED_SERVICE_COMMERCIAL_NAME IS 'Commercial name of the service';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.COMMERCIAL_IND IS 'Indicates if TARIFF_PLAN_ID refers to the COMMERCIAL_TARIFF_ID.    Fill-in with 1 if TARIFF_PLAN_ID refers to the COMMERCIAL_TARIFF_ID or 0 if it doesn''t    0 = Non commercial tariff  1 = commercial tariff';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TARIFF_PLAN_START_DT IS 'Start date of the tariff plan validity (that day is the first day when the tariff plan is applicable)  ### Additional Information  Format: YYYYMMDD (4 digits for year, months from 01 to 12, days from 01 to 31).';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.TARIFF_PLAN_END_DT IS 'End date of the tariff plan validity (that day is the last day when the tariff plan is applicable)  ### Additional Information  Format: YYYYMMDD (4 digits for year, months from 01 to 12, days from 01 to 31).';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.CONVERGENT_IND IS 'Flag indicating if the current fixed tariff plan can be configured as a "Convergent tariff plan", i. e., a plan with special conditions due to the fact of including at least one Fixed line/service and one Mobile line.   0 = No (the plan can''t be configured as convergent)   1 = Yes (the plan can be configured as convergent)';
        COMMENT ON COLUMN D_Fixed_Tariff_Plan_v8.BRAND_ID IS 'Commercial brand identifier. In order to differentiate among different brands in the same OB (e.g. Movistar, O2, Tuenti...)';
    ```

    Some of the former tables contain columns in full-qualified format. For instance, these are some examples of full-qualified columns:
    ```
    record_name.field_name
    TEC_PLAT_REC.DEVICE_ID
    
    record_name.subrecord_name.field_name
    TEC_PLAT_REC.TEC_PLAT_SUBCOMP_REC.DEVICE_ID
    ...
    ```
    Always use the full-qualified format when referring to columns in the tables. For instance, if you need to use the column 'TEC_PLAT_REC.DEVICE_ID', you should not refer to it as 'DEVICE_ID', but as 'TEC_PLAT_REC.DEVICE_ID'.
    
    **Explain in detail, step by step, all your decisions**.

    
    # General instructions
    
    Follow these reasoning steps to generate the SQL query:
    - Step 1: Identify Necessary Tables
    - Step 2: Identify Useful Candidate Columns
    - Step 3: Assess if Tables and Columns are Sufficient to Answer the Question
    - Step 4: Identify Columns Contained in Maps
    - Step 5: Plan the SQL Query
    - Step 6: Write the final SQL Query and apply the rules
    - Step 7: Check that the query actually can answer the question
    - Step 8: Create the result as a JSON object
    
    If you need to filter by a higher-level geographical area such as a region (Comunidad Autónoma) you will need to:
    - join the `GEO_AREA_ID` field of the data table (such as `CBD_HGU_Detail_Daily_v10`) with the `GEO_AREA_ID` field in `D_CBD_Static_Geo_Area_v6` table
    - then join the `CBD_GEO_AREA_LEVEL4_ID` field in the `D_CBD_Static_Geo_Area_v6` with the `CBD_GEO_AREA_LEVEL4_ID` field in the `D_CBD_Static_Geo_Area_Level4_v6` table
    - then compare the `GEO_AREA_LEVEL_DES` field in the `D_CBD_Static_Geo_Area_Level4_v6` table with the name of the region (e.g., 'Cantabria'), since the DESCRIPTION field does contain the actual name of the geographical area.

    **Only perform these joins if explicit filtering or grouping by geographical location is necessary**.
    
    

    # Detailed instructions
    
    ### Step 1: Identify Necessary Tables
    First, identify which tables are necessary to answer the question `{question}`. Justify why you selected each of these tables. 
    Use the following format:
    ```
    I need the following tables to answer the question:
    - <table_name>: <reasoning>
    - <table_name>: <reasoning>
    ...
    ```
    
    ### Step 2: Identify Useful Candidate Columns
    Identify which columns are useful to answer the question `{question}`. Justify why you selected each of these columns.
    Always include any column you think may be needed to answer the question. If there are similar columns in the table, always identify all of them. You will later choose which of them are most suitable to answer the question. But, at this stage, you should include **all the columns that may be useful**.
    Write the list of candidate columns you identified, and the reasoning after each column, using the following format:
    ```
    I can use the following candidate columns to answer the question (including all the columns that may be useful):
    - <table_name>:
      - <column_name>: <copy here the full column description from schema, including possible values if present>: <reasoning>.
      - <column_name>: <copy here the full column description from schema, including possible values if present>: <reasoning>.
      ...
    - <table_name>:
      - <column_name>: <copy here the full column description from schema, including possible values if present>: <reasoning>.
      - <column_name>: <copy here the full column description from schema, including possible values if present>: <reasoning>.
      ...
    ...
    ```

    
    ### Step 3: Assess if Tables and Columns are Sufficient to Answer the Question
    Tell if the tables and columns you identified are enough to answer the question `{question}`. Make sure to justify your answer and check the actual descriptions of the columns in the table definitions and the user question.
    Write the answer using the following format:
    ```
    Possible to answer the question using the former columns:
    - <reasoning>
    - Result: <Yes|No>
    ```
    
    
    ### Step 4: Identify Columns Contained in Maps
    Some columns are actually contained in a map structure. Since these columns need to be queried differently, you need to identify them.
    Columns with a name like '<some_name>.map.<other_name>' are contained in maps. 
    For instance, the column `STATIONS_DETAIL_REC.UNQ_STATION_MAP.map.STATION_TYPE_CD` is contained in a map structure called `STATIONS_DETAIL_REC.UNQ_STATION_MAP`.
    This map structure is like this:
    ```
    STATIONS_DETAIL_REC.UNQ_STATION_MAP.map.STATION_TYPE_CD: {{
        <key1>: {{
            <some_field>: <some_value>,
            "STATION_TYPE_CD": <station_type_value1>
        }},
        <key2>: {{
            <some_other_field>: <some_other_value>,
            "STATION_TYPE_CD": <station_type_value2>
        }},
        ...
    }}
    ```
    Therefore, in this step, identify which columns are contained in maps since you will later need to use LATERAL VIEW EXPLODE to access the values of these maps.
    
    
    ### Step 5: Plan the SQL Query  
    Explain, step by step, how you would write the SQL query to answer the question `{question}`, using the columns you identified. 
    **Use the full qualified names of the columns**. **DO NOT USE THE `JSON_OBJECT` FUNCTION IN THE QUERY**.
    
    Some columns are contained in map structures. You can access the fields of the map using LATERAL VIEW EXPLODE. Do not use UNNEST to access the fields of the map.
    In particular, you can create a temporary table with the exploded map and then query it. For instance, if you need to get the value of the `ABC.CDE.map.field` column, you should use the following SQL code to create a temporary table with the exploded map data and get the value of the field:
    ```sql
    WITH exploded_map AS (
      SELECT key, value.field_1, value.field_2, value.field_3  -- Select here all the columns/fields you will use later.
      FROM <table_name>
      LATERAL VIEW EXPLODE(ABC.CDE) AS key, value
    )
    SELECT exploded_map.field_1
    FROM exploded_map
    ``` 
    This is another example:
    ```sql
    WITH exploded_map AS (
      SELECT DATE, ID, RECORD.GROUP, value.CODE  -- Select here all the columns/fields you will use later.
      FROM CBD_HGU_Detail_Daily_Aura_v10
      LATERAL VIEW EXPLODE(STATIONS_DETAIL_REC.UNQ_STATION_MAP) AS key, value
    )
    SELECT COUNT(DISTINCT ID) AS num_homes
    FROM exploded_map JOIN D_Segment_v8 ON exploded_map.CLASS_ID = D_Segment_v8.CLASS_ID
    WHERE DATE BETWEEN '2024-01-01' AND '2024-02-01'
      AND D_Segment_v8.DESCRIPTION = 'DESCRIPTION value'
      AND exploded_map.CODE = 'CODE value'
    ```
    Here is another example. If you need to count the number of elements in a map column named 'ABC.map' you should use a code like this:
    ```sql
    WITH exploded_map AS (
      SELECT key_from_exploded_map
      FROM <table_name>
      LATERAL VIEW EXPLODE(ABC) AS key_from_exploded_map, value_from_exploded_map
    )
    SELECT COUNT(key_from_exploded_map)
    FROM exploded_map
    ```
    Take into account that all map fields are named with the suffix `_MAP` and that the EXPLODE operation can only be applied to fields that are maps. Therefore, use the EXPLODE operation only on fields that end with `_MAP`.
    
    To finish this step, explain how you would write the SQL query to answer the question, using the columns you identified, taking into account the previous considerations for columns contained in maps, if there are any.
    
    
    ### Step 6: Write the final SQL Query and apply the rules
    Finally, write the SQL query to answer the question `{question}`, using the columns you identified. 
    Remarks:
    **DO NOT USE THE `JSON_OBJECT` FUNCTION IN THE QUERY**.
    **IMPORTANT: The keys in the exploded maps should not be used in JOIN operations, since they are just internal keys to the map structure.**
    
    Check if you need to use any of the following **business rules** to build the query:
    ```json
    {rules}
    ```
    Explain whether you can apply any of the rules and explain how you would apply them in the SQL query.
    
    Always write your result following these steps:
    1. SQL query to answer the question `{question}`: <write the SQL query here>
    2. Reasoning: <explain why you wrote the query like that>
    3. Check of the rules, RULE BY RULE and FOR EACH RULE (one entry per rule): <write ALL the rules and tell if they are applied or not>. Follow this format:
       - <rule1>: Should be applied, because <reason> | Should not be applied, because <reason>
       - <rule2>: Should be applied, because <reason> | Should not be applied, because <reason>
       ...
    4. Result of the execution of the rules that have been identified to be applied. Follow this format:
       - <rule1>: <result>
       - <rule2>: <result>
       ...
    5. Need to fix the query because <reason>. The following changes are needed: <change_1>, <change_2>, etc. | The query is already correct.
    6. SQL query to answer the question `{question}` after considering the previous **rules**: <write the SQL query here>. FIX THE QUERY IF NECESSARY.


    ### Step 7: Check that the query actually can answer the question
    Check again if the generated query answers the question `{question}`.
    Follow these steps:
    1. Write the concepts involved in the question. Enumerate the concepts as a list. Follow this format:
       - <concept1>
       - <concept2>
       ...
    2. Write all the concepts of the question that are covered by the SQL query. Enumerate them and create a match list with the concepts from the previous step. Write down the part of the SQL query covering the concept. Take into account that conditions on specific proper names, such as model names, location names, etc, need to be explicitly checked. Follow this format:
       - <concept1>: covered in <sql query section> or not covered.
       - <concept2>: covered in <sql query section> or not covered.
    3. Find those concepts in the question that are not covered by the SQL query.
    4. Conclude whether the question can actually be answered by the generated query. Follow this format:
       - The question can be answered by the SQL query: <Yes|No>
    

    ### Step 8: Create the result as a JSON object
    Return the result as a unique JSON object, with the following structure:
    {{
      "result": <Write the SQL query here. **MAKE SURE THAT THE STATEMENT `SELECT JSON_OBJECT` is not used in the query and Use the full qualified names of the columns. Generate a valid SQL sentence in a single line without new line characters.**>,
      "status": "OK",
      "reason": <a reasoning explaining the query>
    }}
    If the former table does not contain the necessary data to answer the question, return the following JSON object:
    {{
      "result": null,
      "status": "ERROR",
      "reason": <a reasoning explaining why it is not possible to answer the question>
    }}
    Make sure that the JSON object is correctly formatted, and can be parsed by a JSON parser.
    
    
    **Please, ALWAYS follow the 8 steps presented in the instructions.** Start reasoning with ### Step 1 and finish with ### Step 8.
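
The prompt above mixes single-brace placeholders (`{question}`, `{sql_table_definition}`, `{rules}`) with doubled braces around the JSON examples. This guideline does not describe how atria-rag-server renders the prompt. The sketch below assumes Python-style `str.format` semantics, which the doubled braces suggest, and shows why literal braces must stay doubled when the prompt is edited. The shortened template, the sample question and the helper names are hypothetical.

```python
# Shortened stand-in for the prompt; placeholders and doubled braces
# follow the same conventions as the full text above.
TEMPLATE = (
    "Generate a SQL query statement to answer the following question:\n"
    "`{question}`\n"
    "Business rules:\n"
    "{rules}\n"
    "Return the result as a unique JSON object:\n"
    "{{\n"
    '  "result": <SQL query>,\n'
    '  "status": "OK"\n'
    "}}\n"
)


def render(template, **values):
    """Fill the placeholders; doubled braces become single literal braces."""
    return template.format(**values)


def is_renderable(template, **values):
    """Return False for unescaped braces or placeholders without a value."""
    try:
        template.format(**values)
    except (KeyError, IndexError, ValueError):
        return False
    return True


rendered = render(TEMPLATE, question="How many homes have a Smart TV?", rules="[]")
print(rendered)

# A JSON object written with single braces is read as a placeholder and fails.
print(is_renderable('{"status": "OK"}', question="x"))
```

Under this assumption, running a check such as `is_renderable` on an edited prompt before pasting it into the ConfigMap catches a forgotten doubled brace early.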

Upload documents and execute generate-db job

Guidelines for uploading new or modified documents in a specific environment by editing the component's ConfigMap:

Upload the documents to the Azure container atria-resources.

To make it clear which project the documents belong to, place them in a folder named after the project.
Keep in mind the allowed document formats, set in the project's loader variable.
An example of folder structure is shown below.

Figure: example folder structure

If you want to update any parameter of the documents, modify the ConfigMap. For example, if the documents’ path changes, update the dir field with the new path where the documents are stored.

Open the ConfigMap atria-rag-config.
kubectl edit configmap atria-rag-config -n <namespace>
(Replace <namespace> with the specific one)
Copy the whole content of the projects.yaml.project key and paste it into a new local file. Since it is a string, convert it to YAML format to make it easier to modify. You can use the YAML to string tool to convert a string to YAML and vice versa.
Modify the docs key of the project.
Once the changes in docs are done, copy all the content, convert it back to a string, paste it into the projects.yaml.project key of the ConfigMap, and save.
Restart the atria-rag-server deployment so that the pod picks up the changes.
kubectl rollout restart deployment atria-rag -n <namespace>
(Replace <namespace> with the specific one)
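
The string-to-YAML conversion in the steps above can also be done locally. The following is a rough sketch under one loud assumption: the key content is a one-line, double-quoted value that only uses JSON-compatible escapes such as `\n` and `\"`. YAML-specific escapes are not handled, and both function names are hypothetical.

```python
import json


def string_to_yaml_text(quoted_value: str) -> str:
    """Turn a one-line, double-quoted value into multi-line YAML text."""
    # Assumption: the value only contains JSON-compatible escape sequences.
    return json.loads(quoted_value)


def yaml_text_to_string(yaml_text: str) -> str:
    """Turn multi-line YAML text back into a one-line, double-quoted value."""
    return json.dumps(yaml_text, ensure_ascii=False)


if __name__ == "__main__":
    quoted = '"project-copilot:\\n  docs:\\n    pdf:\\n      extensions: pdf\\n"'
    print(string_to_yaml_text(quoted))
```

Editing the decoded text and passing it through `yaml_text_to_string` gives a value that can be pasted back into the key.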

Here is an example of a documents configuration. The documents are split into two folders within the project because two different types of data are loaded into it.

```yaml
project-copilot:
  docs:
    pdf:
      dir: /opt/atria-rag/data/project-copilot/pdfs
      extensions: pdf
      loader: unstructured
      loader_options:
        mode: single
    url:
      dir: /opt/atria-rag/data/project-copilot/urls
      extensions: txt
      loader: url_list
```

If you use URLs, upload a file with the list of URLs in the project folder. Separate each URL with a line break. The file must have the extension .txt.
```
http://www.url1.com
http://www.url2.com
```
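
A URL list file can be checked before it is uploaded. This is a minimal sketch, not an ATRIA tool: the function name is hypothetical, and it assumes that only http and https URLs are expected and that blank lines can be skipped.

```python
from urllib.parse import urlparse


def invalid_url_lines(content: str) -> list[tuple[int, str]]:
    """Return (line_number, text) for each line that is not a single http(s) URL."""
    problems = []
    for number, line in enumerate(content.splitlines(), start=1):
        text = line.strip()
        if not text:
            continue  # assumption: blank lines are ignored in this sketch
        parsed = urlparse(text)
        has_whitespace = any(ch.isspace() for ch in text)
        if has_whitespace or parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append((number, text))
    return problems


if __name__ == "__main__":
    sample = "http://www.url1.com\nhttp://www.url2.com\n"
    print(invalid_url_lines(sample))
```

An empty list means every non-blank line holds exactly one URL with a scheme and a host.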

If you use JSONL files, upload them to the same folder with the extension .jsonl. Each line of the file is one document, and its content must be provided in the page_content key.

 {"page_content": "test content 1", "metadata": {"source": "https://www.dummy1.es/"}, "type": "Document"}
 {"page_content": "test content 2", "metadata": {"source": "https://www.dummy2.es/"}, "type": "Document"}
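
A file in this shape can be generated with a few lines of Python. This is a sketch that mirrors the example above: whether the metadata and type keys are mandatory is an assumption taken from that example, and the function name `to_jsonl` is hypothetical.

```python
import json


def to_jsonl(documents: list[dict]) -> str:
    """Serialize documents as JSON Lines, one document per line."""
    lines = []
    for doc in documents:
        if not isinstance(doc.get("page_content"), str):
            raise ValueError("each document needs a string page_content")
        record = {
            "page_content": doc["page_content"],
            # Assumption: metadata and type follow the example in this guide.
            "metadata": doc.get("metadata", {}),
            "type": "Document",
        }
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    docs = [
        {"page_content": "test content 1", "metadata": {"source": "https://www.dummy1.es/"}},
        {"page_content": "test content 2", "metadata": {"source": "https://www.dummy2.es/"}},
    ]
    print(to_jsonl(docs), end="")
```

Writing the returned string to a file with the .jsonl extension gives one JSON object per line, which is what the format requires.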

Finally, execute the atria-rag-generate-db job to update the data in the environment.

18 - Modify prompts (previous to Metallica)

Modify prompts using ConfigMap

Guidelines valid for releases previous to Metallica

This document covers a specific scenario of the prompt modification process described in the document Modify ATRIA components configuration.

Guidelines to modify the Aura prompts in a specific environment using ConfigMaps.
Follow the steps below in the given order:

Modify prompts using ConfigMap

Prerequisites

Recommended:
- kubectl installed in your local host.
- curl installed in your local host.
- jq installed in your local host.

Enable ConfigMap

As a prerequisite, you need a KUBECONFIG with sufficient permissions and access to the environment.

We have one ConfigMap for each component:

atria-model-gateway: atria-model-gw-config
atria-rag-server: atria-rag-config

For the ConfigMap modification, use the following examples for atria-model-gateway and atria-rag-server respectively:

kubectl edit configmap atria-model-gw-config -n <namespace>
kubectl edit configmap atria-rag-config -n <namespace>

(Substitute <namespace> with the corresponding environment)

You can also use visual tools for this modification, such as Lens or Sublime.

Access to Azure container

You must have access to the Azure container atria-resources.

Create ConfigMap copy

Important: Before modifying anything, back up the ConfigMap content, as its format is very sensitive.

To avoid possible errors, the first thing to do is to copy the current configuration. For this purpose, execute the following commands:

kubectl get cm atria-model-gw-config -o yaml -n <namespace> > <local_file_path>/model-gw-config.yaml
kubectl get cm atria-rag-config -o yaml -n <namespace> > <local_file_path>/rag-config.yaml

Replace <namespace> with the specific one and <local_file_path> with the desired path.

Now you have a copy of the current configuration on your local machine.
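
If backups are taken often, the two commands above can be wrapped in a small script. This is a sketch under assumptions: kubectl is on the PATH and already points at the right cluster, and the timestamped file names are a hypothetical convention, not the names used elsewhere in this guide.

```python
import subprocess
from datetime import datetime
from pathlib import Path

CONFIGMAPS = ("atria-model-gw-config", "atria-rag-config")


def backup_command(configmap: str, namespace: str) -> list[str]:
    """Build the kubectl command that dumps one ConfigMap as YAML."""
    return ["kubectl", "get", "cm", configmap, "-o", "yaml", "-n", namespace]


def backup_configmaps(namespace: str, target_dir: str) -> list[Path]:
    """Write one timestamped YAML backup per ConfigMap and return the paths."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    written = []
    for name in CONFIGMAPS:
        # check=True stops the backup if kubectl reports an error.
        result = subprocess.run(
            backup_command(name, namespace),
            check=True, capture_output=True, text=True,
        )
        path = Path(target_dir) / f"{name}-{stamp}.yaml"
        path.write_text(result.stdout)
        written.append(path)
    return written
```

Timestamped names keep earlier backups from being overwritten when the script is run more than once.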

Modify prompt in atria-model-gateway

Follow these guidelines to modify the prompt of a preset in a specific environment by editing the component's ConfigMap:

Open the ConfigMap atria-model-gw-config
kubectl edit configmap atria-model-gw-config -n <namespace>
(Replace <namespace> with the specific one)

Warning: If the presets.yml key is wrongly formatted as a single string, run the following command:

kubectl get cm atria-model-gw-config -n <namespace> -o jsonpath='{.data.presets\.yml}'

Afterwards, copy the output and overwrite the whole presets.yml key. This way, you can see the content correctly and modify the prompt.

In the presets key, within the openai-preset-gpt-4o-example-generative preset, replace the value of the preamble key with the new prompt.

- description: Atria Example Generate Response GPT4o
  group: simple_ai
  id: openai-preset-gpt-4o-example-generative
  model_id: gpt-4o
  name: Atria Example with model GPT 4o
  session_params:
    window: 0
  preamble:
    - <INSERT THE NEW PROMPT>
  query_args:
    query: ''
    data: ''
  max_input_length: 50000
  model_params:
    temperature:
      - 0.001
      - 0.001
      - 1.0
    top_p:
      - null
      - 0.0
      - 1.0

Save and close the ConfigMap

Modify presets name in atria-model-gateway

Follow these guidelines to change the name of a preset in a specific environment by editing the component's ConfigMap:

Open the ConfigMap atria-model-gw-config
kubectl edit configmap atria-model-gw-config -n <namespace>
(Replace <namespace> with the specific one)

Warning: If the presets.yml key is wrongly formatted as a single string, run the following command:

kubectl get cm atria-model-gw-config -n <namespace> -o jsonpath='{.data.presets\.yml}'

Afterwards, copy the output and overwrite the whole presets.yml key. This way, you can see the content correctly and change the name.

In the presets key, within the example-preset-rag preset, set the name key to the same value as the id (example-preset-rag).

- id: example-preset-rag
  model_id: atria-rag
  name: example-preset-rag
  group: enriched_ai
  description: A RAG system built on a LangChain backend
  session_params:
    window: 0
  preset_params:
    chain: project-example
  model_params:
    max_ref: 3
    sticky_context: null
    candidates_post_filtering: null
    language: en
    max_tokens: 16384

In the presets key, within the example-reduced-preset-rag preset, set the name key to the same value as the id (example-reduced-preset-rag).

- id: example-reduced-preset-rag
  model_id: example-reduced-preset-rag
  name: example-reduced-preset-rag
  group: enriched_ai
  description: A RAG system built on a LangChain backend
  session_params:
    window: 0
  preset_params:
    chain: project-example-reduced
  model_params:
    max_ref: 3
    sticky_context: null
    candidates_post_filtering: null
    language: en
    max_tokens: 16384

Save and close the ConfigMap
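
After renaming, the rule that name must match id can be checked automatically. The sketch below assumes the presets have already been parsed into a list of dictionaries, for example with a YAML parser, which is not shown here. The function name is hypothetical.

```python
def presets_with_mismatched_name(presets: list[dict]) -> list[str]:
    """Return the ids of the presets whose name differs from their id."""
    return [p["id"] for p in presets if p.get("name") != p["id"]]


if __name__ == "__main__":
    presets = [
        {"id": "example-preset-rag", "name": "example-preset-rag"},
        {"id": "example-reduced-preset-rag", "name": "Example"},
    ]
    for preset_id in presets_with_mismatched_name(presets):
        print(f"name does not match id for preset {preset_id}")
```

Only the presets covered by this section need to be passed in, since other presets may legitimately use a different display name.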

Modify prompt in atria-rag-server

Follow these guidelines to modify the project-example-reduced prompt in a specific environment by editing the component's ConfigMap.
(The project-example-reduced prompt is used in the example-reduced-preset-rag preset.)

Open the ConfigMap atria-rag-config
kubectl edit configmap atria-rag-config -n <namespace>
(Replace <namespace> with the specific one)

Warning: If the projects.yaml.project key is wrongly formatted as a single string, run the following command:

kubectl get cm atria-rag-config -n <namespace> -o jsonpath='{.data.projects\.yaml\.project}'

Afterwards, copy the output and overwrite the whole projects.yaml.project key. This way, you can see the content correctly and modify the corresponding prompt.

In the projects.yaml.project key, inside the project-example-reduced project, modify the prompt.

Project structure

    project-example-reduced:
      name: Project Example
      docs:
        json:
          dir: /opt/atria-rag/data/project-example-reduced/jsonl
          extensions: jsonl
          loader: jsonl
      embeddings: test_distilbert
      llm: example-rag-model-gw-raw-gpt-4-o
      solve_type: sql
      retrievers:
        qdrant:
          host: qdrant.aura-system
          port: 6333
          collection_name: project-example-reduced-Aura
          prefix: es-pre-970
        tfidf:
          dump_name: /var/atria-rag-data/tfidf/dump/project-example-reduced-Aura
      serving:
        base_url: project-example-reduced/jsonl
      parameters:
        candidate_only: false
      prompts:
        generate_sql_query:
          DEFAULT: |
            <INSERT THE NEW PROMPT>

Some considerations to keep in mind:
- Make sure that the LLM example-rag-model-gw-raw-gpt-4-o is defined within the LLMs field.
- In turn, the preset defined within this LLM must also be defined in the ConfigMap atria-model-gw-config.

Save and close the ConfigMap
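
The jsonpath expression in the warning above escapes the dots inside the key name. kubectl's JSONPath support treats an unescaped dot as a path separator, so a key such as projects.yaml.project has to be written with a backslash before each dot. The helper below is a small sketch of that rule, and its name is hypothetical.

```python
def jsonpath_for_data_key(key: str) -> str:
    """Build a kubectl jsonpath expression for a ConfigMap data key."""
    # Escape each dot so that it is read as part of the key name
    # instead of as a path separator.
    escaped = key.replace(".", "\\.")
    return "{.data." + escaped + "}"


if __name__ == "__main__":
    for key in ("presets.yml", "projects.yaml.project"):
        print(f"-o jsonpath='{jsonpath_for_data_key(key)}'")
```

The same rule applies to any ConfigMap key that contains a dot, including presets.yml in atria-model-gw-config.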

Upload documents and execute generate-db job

This step is only necessary if you have uploaded new files.

Upload the documents to the Azure container atria-resources.

Remember to upload the files to the folder you defined previously in the configuration (project-example-reduced/jsonl).
Keep in mind the allowed document formats, which are set in the project’s loader variable.

Finally, execute the atria-rag-generate-db job to update the data in the environment.

Restart the deployments

Restart the atria-rag-server deployment so that the pod picks up the changes.
kubectl rollout restart deployment atria-rag -n <namespace>
Restart the atria-model-gateway deployment so that the pod picks up the changes.
kubectl rollout restart deployment atria-model-gw -n <namespace>

(Replace <namespace> with the specific one)

Load original config and deployments rollback

To return to the original configuration, follow these steps:

Load the original ConfigMap atria-model-gw-config.
kubectl apply -f <local_file_path>/model-gw-config.yaml -n <namespace>
Load the original ConfigMap atria-rag-config.
kubectl apply -f <local_file_path>/rag-config.yaml -n <namespace>

(Replace <namespace> with the specific one and <local_file_path> with the path where the backups are stored)

Restart the atria-model-gateway deployment so that the pod picks up the changes.
kubectl rollout restart deployment atria-model-gw -n <namespace>
Restart the atria-rag-server deployment so that the pod picks up the changes.
kubectl rollout restart deployment atria-rag -n <namespace>

(Replace <namespace> with the specific one)
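
The rollback sequence above can be scripted so that the four commands always run in the same order. This is a minimal sketch: it assumes kubectl is on the PATH and that the backup files keep the names used in this guide. Both function names are hypothetical.

```python
import subprocess


def rollback_commands(namespace: str, backup_dir: str) -> list[list[str]]:
    """Return the kubectl commands for the rollback, in execution order."""
    return [
        ["kubectl", "apply", "-f", f"{backup_dir}/model-gw-config.yaml", "-n", namespace],
        ["kubectl", "apply", "-f", f"{backup_dir}/rag-config.yaml", "-n", namespace],
        ["kubectl", "rollout", "restart", "deployment", "atria-model-gw", "-n", namespace],
        ["kubectl", "rollout", "restart", "deployment", "atria-rag", "-n", namespace],
    ]


def rollback(namespace: str, backup_dir: str) -> None:
    """Run the rollback and stop at the first command that fails."""
    for command in rollback_commands(namespace, backup_dir):
        subprocess.run(command, check=True)
```

Stopping at the first failure avoids restarting the deployments when a ConfigMap could not be restored.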