Work with Kernel

1: Kernel configuration
2: Kernel datasets upload handling

Guidelines covering the tasks required to work with Kernel

Index

1 - Kernel configuration

Kernel configuration: General steps

General guidelines for carrying out in Kernel the tasks required by Aura features

Introduction

Certain Aura features require preliminary tasks to be carried out in Kernel so that they can access its integrated resources and capabilities, such as APIs and datasets.

The following sections outline the tasks that are common to all Aura features. In addition, each feature has its own specific requirements.

1. Check APIs publication in Kernel

If the Aura feature uses a Kernel API, check whether the API is published in Kernel: List of available APIs on Telefónica Kernel.
If not, follow the guidelines in the document Publish an API in Kernel.

2. Check datasets publication in Kernel

Only necessary if the Aura feature requires datasets.

Identify the datasets required to configure the feature:
- Aura entities
Check that these datasets are published in Kernel: List of available datasets on Telefónica Kernel.
If not, ask the Kernel Team to publish the latest version of the datasets in the Kernel production environment.

3. Create a Kernel application

Accessing Kernel data requires an application (Kernel client) to be created beforehand and configured with permissions to access specific datasets.

For certain Aura features, a specific Kernel application must be created from scratch. Others require an existing Kernel application, such as aura-bot-[environment], which must be specifically configured for the feature (Step 4).

Ask the Kernel Team to create a new application with a specific name (id) in Kernel for your intended environment.
Once the app is created, two parameters are provided for secure access:
- client_id: unique identifier of the consuming app acting as Kernel API client.
- client_secret: password.

4. Assign purpose/scopes to the application

Only when the data contain personal information (PI-scopes) is it necessary to create a purpose in Kernel, which defines the reason for accessing that data. In this scenario, ask the Kernel Team to generate a purpose for the new application or for the existing one required by the Aura feature.
Ask the Kernel Team to generate the scopes required for the Aura feature, which define the access level to datasets (read/write permissions on Kernel datasets).

Take into account the following considerations:
- If a purpose is required, the scopes must be associated with it.
- The following global admin scopes are mandatory for every app that reads or writes datasets:
  - admin:datasets:read
  - data:read
  - data:write
- Additionally, each Aura feature requires its specific scopes.
- The version number is not needed in the scopes.
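
As a quick sanity check of the considerations above, the following sketch compares a list of scopes against the three mandatory ones named in this guide. The helper name missing_mandatory_scopes and the space-separated input format are assumptions made for illustration; they are not part of Kernel.

```shell
# Mandatory global scopes listed in this guide.
MANDATORY_SCOPES="admin:datasets:read data:read data:write"

# missing_mandatory_scopes (hypothetical helper): prints each mandatory scope
# that is absent from the space-separated scope list passed as $1.
missing_mandatory_scopes() {
  local granted=" $1 "
  local scope
  for scope in ${MANDATORY_SCOPES}; do
    case "${granted}" in
      *" ${scope} "*) ;;        # scope is present, nothing to report
      *) echo "${scope}" ;;     # scope is missing
    esac
  done
}
```

For example, missing_mandatory_scopes "admin:datasets:read data:read" prints data:write, and an empty output means all mandatory scopes are present.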

Guidelines for Kernel configuration in specific Aura features

List of published guidelines that include specific Kernel configuration:

Billing module operation

2 - Kernel datasets upload handling

Kernel datasets upload handling

Guidelines for enabling and disabling Kernel dataset uploads in non-production environments.

Introduction

After Aura is deployed in any environment, all of its components generate KPI entity files that are uploaded to Kernel as datasets, in CSV or Avro format. These processes increase costs in both the Aura and Kernel instances:

- Higher Azure Storage consumption
- Longer execution time for Aura's Databricks cluster
- More storage needed in Kernel, both in Azure for the CSVs and for the Avro datasets

Moreover, the data generated in these environments is rarely analyzed or used.

For this reason, the proposal is to disable the upload and minimize the storage of these files in order to reduce costs, once the sanity test set has been executed and the process has been validated.

If the process later needs to be tested again, or data needs to be uploaded to validate algorithms or to use the Aura billing module, everything can be enabled again.

Prerequisites

A kubeconfig of the Aura environment must be configured.
The Azure CLI (az client) installed on your machine.
Credentials to access the Azure subscription.
Substitute <YOUR-ENV> with the corresponding pre-production environment: es-pre, es-cert, br-pre, de-pre, de-int, etc.
The installation output file (output_install/<YOUR_ENV>_info.json) to get:
- The token and the URL of the Databricks cluster.
  - Substitute <DATABRICKS_TOKEN> with Databricks cluster token.
  - Substitute <DATABRICKS_URL> with the domain of the Databricks cluster URL.
- The job_id of the Databricks job in charge of uploading the datasets to Kernel.
  - Substitute <DATABRICKS_JOB_ID> with the job_id.
- The Azure Storage account name and the blob container where the KPI entities files are stored.
  - Substitute <AZURE_COMMON_STORAGE> with STORAGE_ACCOUNT_NAME and <KPI_BLOB_CONTAINER_NAME> with its value.

PATH_TO_YOUR_OUTPUT_INSTALL_ENV_FILE=output_install/<YOUR_ENV>_info.json
STORAGE_ACCOUNT_NAME=$(cat ${PATH_TO_YOUR_OUTPUT_INSTALL_ENV_FILE}|jq -r .common_azure_storage_account_name)
STORAGE_ACCESS_KEY=$(cat ${PATH_TO_YOUR_OUTPUT_INSTALL_ENV_FILE}|jq -r .common_azure_storage_access_key)
KPI_BLOB_CONTAINER_NAME=$(cat ${PATH_TO_YOUR_OUTPUT_INSTALL_ENV_FILE}|jq -r .kpi_blob_container_name)
DATABRICKS_JOB_ID=$(cat ${PATH_TO_YOUR_OUTPUT_INSTALL_ENV_FILE}|jq -r .databricks.job_id)
DATABRICKS_URL=$(cat ${PATH_TO_YOUR_OUTPUT_INSTALL_ENV_FILE}|jq -r .databricks.url)
DATABRICKS_TOKEN=$(cat ${PATH_TO_YOUR_OUTPUT_INSTALL_ENV_FILE}|jq -r .databricks.token)
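
Because jq -r prints the literal string null when a key is missing, it can be worth confirming that every variable above actually received a value before continuing. The following is a minimal sketch; the helper name check_required_vars is hypothetical.

```shell
# check_required_vars (hypothetical helper): takes variable NAMES as arguments
# and returns non-zero if any of them is unset, empty, or the literal "null"
# (which is what jq -r prints for a missing key).
check_required_vars() {
  local rc=0 name value
  for name in "$@"; do
    eval "value=\${${name}:-}"    # read the variable whose name is in $name
    if [ -z "${value}" ] || [ "${value}" = "null" ]; then
      echo "Missing value for ${name}" >&2
      rc=1
    fi
  done
  return "${rc}"
}
```

A possible use is check_required_vars STORAGE_ACCOUNT_NAME STORAGE_ACCESS_KEY KPI_BLOB_CONTAINER_NAME DATABRICKS_JOB_ID DATABRICKS_URL DATABRICKS_TOKEN, stopping the procedure if it fails.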

Disable data uploading

Disable aura-kpis-uploader CSV files upload

Suspend aura-kpi-uploader job:

kubectl -n <YOUR-ENV> patch cronjobs kpi-uploader -p '{"spec" : {"suspend" : true }}'
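
To confirm that the patch took effect, the suspend flag of the CronJob can be read back. This is a sketch only: the helper name is_cronjob_suspended is hypothetical, and it assumes the same namespace and CronJob name used in the command above.

```shell
# is_cronjob_suspended (hypothetical helper): succeeds when the CronJob's
# .spec.suspend field is "true". $1 = namespace, $2 = CronJob name.
is_cronjob_suspended() {
  local suspended
  suspended="$(kubectl -n "$1" get cronjob "$2" -o jsonpath='{.spec.suspend}')"
  [ "${suspended}" = "true" ]
}
```

For example: is_cronjob_suspended <YOUR-ENV> kpi-uploader && echo "kpi-uploader is suspended".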

Disable aura-databricks-job Avro files upload

Pause aura-databricks-job job:
- Substitute <DATABRICKS_JOB_ID> with the DATABRICKS_JOB_ID obtained from the installation output file.

curl -XPOST --header 'Authorization: Bearer <DATABRICKS_TOKEN>' https://<DATABRICKS_URL>/api/2.1/jobs/update --data '{
   "job_id": <DATABRICKS_JOB_ID>,
   "new_settings":{
      "schedule":{
         "pause_status":"PAUSED"
      }
   }
}'
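
The same request can be built from the shell variables taken from the installation output file instead of editing the placeholders by hand. This is a sketch under assumptions: the helper names build_pause_payload and set_job_pause_status are hypothetical, DATABRICKS_URL and DATABRICKS_TOKEN are expected to be set already, and the token is assumed to be a personal access token sent with the Bearer scheme.

```shell
# build_pause_payload (hypothetical helper): prints the jobs/update request
# body for job id $1 and pause status $2 (for example PAUSED).
build_pause_payload() {
  printf '{"job_id": %s, "new_settings": {"schedule": {"pause_status": "%s"}}}' "$1" "$2"
}

# set_job_pause_status (hypothetical helper): sends the payload to the
# jobs/update endpoint used in this guide. Assumes DATABRICKS_URL and
# DATABRICKS_TOKEN are set; --fail makes curl return non-zero on HTTP errors.
set_job_pause_status() {
  curl --fail --silent --show-error -X POST \
    --header "Authorization: Bearer ${DATABRICKS_TOKEN}" \
    --data "$(build_pause_payload "$1" "$2")" \
    "https://${DATABRICKS_URL}/api/2.1/jobs/update"
}
```

With the variables loaded, the call would be set_job_pause_status "${DATABRICKS_JOB_ID}" PAUSED.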

Remove old KPI entity files generated by Aura and ATRIA components

This step is carried out by applying a removal policy to the Azure blob container where the components write the KPIs.

There are two ways of applying this change:

- Apply the policy from Azure portal
- Apply the policy using az client

Apply the policy from Azure portal

- Access the Azure portal.
- Look for the <AZURE_COMMON_STORAGE> account and the <KPI_BLOB_CONTAINER_NAME> container.
- Apply the management policy to <KPI_BLOB_CONTAINER_NAME> and to <KPI_BLOB_CONTAINER_NAME>/processed.

Apply the policy using az client

To execute this step, first log in to the Azure subscription with az login.

PATH_TO_YOUR_OUTPUT_INSTALL_ENV_FILE=output_install/<YOUR_ENV>_info.json
STORAGE_ACCOUNT_NAME=$(cat ${PATH_TO_YOUR_OUTPUT_INSTALL_ENV_FILE}|jq -r .common_azure_storage_account_name)
STORAGE_ACCESS_KEY=$(cat ${PATH_TO_YOUR_OUTPUT_INSTALL_ENV_FILE}|jq -r .common_azure_storage_access_key)
RESOURCE_GROUP=$(cat ${PATH_TO_YOUR_OUTPUT_INSTALL_ENV_FILE}|jq -r .common_resource_group)

az storage account management-policy show -g ${RESOURCE_GROUP} --account-name ${STORAGE_ACCOUNT_NAME} -o json > policy.json

KPI_BLOB_CONTAINER_NAME=$(cat ${PATH_TO_YOUR_OUTPUT_INSTALL_ENV_FILE}|jq -r .kpi_blob_container_name)

sed -i "s|${KPI_BLOB_CONTAINER_NAME}/proccesed||g" policy.json

az storage account management-policy create -g ${RESOURCE_GROUP} --account-name ${STORAGE_ACCOUNT_NAME} --policy policy.json
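
Since sed exits successfully even when its pattern matches nothing, it may help to check how many lines of policy.json contain the prefix before running the substitution and re-creating the policy. The sketch below uses only grep; the helper name count_matches is hypothetical, and the string to pass is whatever pattern is given to sed.

```shell
# count_matches (hypothetical helper): prints how many lines of file $1
# contain the fixed string $2. grep -c exits non-zero when the count is 0,
# so "|| true" keeps the helper usable under "set -e".
count_matches() {
  grep -c -F -- "$2" "$1" || true
}
```

A result of 0 from count_matches policy.json "<pattern used in sed>" means the substitution would leave policy.json unchanged; keeping a copy (cp policy.json policy.json.bak) before editing also makes the change easy to review.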

Enable data uploading

Run the deploy_core stage of the Aura installer; everything will be reconfigured and the uploads enabled again.