Atria Model Gateway metrics

List of metrics available in atria-model-gateway

http_request_duration_seconds

This metric records every incoming HTTP request received by atria-model-gateway.

It is stored as a Summary in Prometheus, so each observation carries the request duration in addition to the defined labels.

This metric allows measuring the behavior of the requests received by any given endpoint. The duration covers the time from when the request reaches atria-model-gateway until its HTTP response is returned. From it you can obtain:

The number of requests over a given period
The average/min/max duration of these requests

Labels:

method: HTTP method of the request (GET, POST, PUT, DELETE, etc.)
path: specific endpoint of the request
status_code: HTTP status code returned in the response
application: application name that is using the model
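
As a rough illustration of how the request count and average duration can be derived: Prometheus exposes a Summary as a pair of cumulative series, `<name>_count` and `<name>_sum`. The sketch below assumes two scrapes of those series for a single label combination. The `SummarySample` type, the `window_stats` helper, and all values are hypothetical, and counter resets are ignored for brevity.

```python
from dataclasses import dataclass


@dataclass
class SummarySample:
    """One scrape of a Summary series (all values here are hypothetical)."""
    timestamp: float  # scrape time, in seconds
    count: float      # value of http_request_duration_seconds_count
    total: float      # value of http_request_duration_seconds_sum, in seconds


def window_stats(first: SummarySample, last: SummarySample) -> tuple[float, float]:
    """Return (requests per second, average duration in seconds) between two scrapes."""
    elapsed = last.timestamp - first.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")
    requests = last.count - first.count
    rate = requests / elapsed
    # Average duration is the growth of the sum divided by the growth of the count.
    average = (last.total - first.total) / requests if requests else 0.0
    return rate, average


# Hypothetical scrapes taken 60 seconds apart.
first = SummarySample(timestamp=0.0, count=100.0, total=25.0)
last = SummarySample(timestamp=60.0, count=160.0, total=43.0)
rate, average = window_stats(first, last)
print(f"{rate:.2f} req/s, {average:.3f} s average")
```

Minimum and maximum values cannot be derived from `_count` and `_sum` alone; in a Summary they would only be available if the corresponding quantiles are exported.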

outgoing_request_duration_seconds

This metric records every outgoing HTTP request made by atria-model-gateway. It is stored as a Summary in Prometheus, so each observation carries the request duration in addition to the defined labels.

The metric allows measuring the behavior of the requests sent to any given endpoint:

The number of requests over a given period
The average/min/max duration of these requests

Labels:

method: HTTP method of the request (GET, POST, PUT, DELETE, etc.)
host: host and domain the request is sent to
path: specific endpoint of the request
status: HTTP status code returned in the response
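
The labels above can be used to narrow a query to one destination. The following is a minimal sketch that builds a PromQL expression for the average outgoing request duration, relying on the `_sum` and `_count` series that Prometheus exposes for a Summary. The helper name `average_duration_query`, the host `api.example.com`, and the 5-minute window are assumptions made for illustration; label values are not escaped.

```python
def average_duration_query(metric: str, labels: dict[str, str], window: str = "5m") -> str:
    """Build a PromQL expression for the average duration over a time window."""
    # Sort the labels so the generated query is deterministic.
    selector = ",".join(f'{name}="{value}"' for name, value in sorted(labels.items()))
    sum_rate = f"rate({metric}_sum{{{selector}}}[{window}])"
    count_rate = f"rate({metric}_count{{{selector}}}[{window}])"
    # Dividing the two rates gives the mean duration per request in the window.
    return f"sum({sum_rate}) / sum({count_rate})"


# Hypothetical label values for one destination host.
query = average_duration_query(
    "outgoing_request_duration_seconds",
    {"method": "POST", "host": "api.example.com"},
)
print(query)
```

The same helper works for `http_request_duration_seconds` by passing that metric name and its own labels.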

generative_tokens

This metric records the tokens used by OpenAI in atria-rag-server. It is stored as a Summary in Prometheus, so each observation carries the token usage in addition to the defined labels.

The metric allows measuring token consumption for any given OpenAI model:

The number of tokens used over a given period
The average/min/max number of tokens per request

Labels:

application: application name that is using the model
deployment_model_name: name of the deployment model
model_type: identifier of the model
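
To show how the labels can be used to break down token usage, here is a small sketch that reads samples in the Prometheus text exposition format and totals `generative_tokens_sum` and `generative_tokens_count` per application. The sample text, the label values (`app-a`, `deploy-1`, `type-x`, and so on), and the helper `tokens_per_application` are all hypothetical, and the parser is deliberately simplified: it does not handle escaped characters inside label values.

```python
import re
from collections import defaultdict

# Matches lines such as: metric_name{label="value",...} 123
SAMPLE_LINE = re.compile(r'^(?P<name>\w+)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL_PAIR = re.compile(r'(\w+)="([^"]*)"')


def tokens_per_application(exposition: str) -> dict[str, dict[str, float]]:
    """Total the token sum and observation count for each application label."""
    totals: dict[str, dict[str, float]] = defaultdict(lambda: {"sum": 0.0, "count": 0.0})
    for line in exposition.splitlines():
        match = SAMPLE_LINE.match(line.strip())
        if not match:
            continue  # skips comments, blank lines, and unlabelled samples
        name = match["name"]
        if name not in ("generative_tokens_sum", "generative_tokens_count"):
            continue
        labels = dict(LABEL_PAIR.findall(match["labels"]))
        field = "sum" if name.endswith("_sum") else "count"
        totals[labels.get("application", "")][field] += float(match["value"])
    return dict(totals)


# Hypothetical scrape output; none of these values come from a real deployment.
SAMPLE = """
# TYPE generative_tokens summary
generative_tokens_sum{application="app-a",deployment_model_name="deploy-1",model_type="type-x"} 1200
generative_tokens_count{application="app-a",deployment_model_name="deploy-1",model_type="type-x"} 10
generative_tokens_sum{application="app-a",deployment_model_name="deploy-2",model_type="type-y"} 300
generative_tokens_count{application="app-a",deployment_model_name="deploy-2",model_type="type-y"} 5
generative_tokens_sum{application="app-b",deployment_model_name="deploy-1",model_type="type-x"} 450
generative_tokens_count{application="app-b",deployment_model_name="deploy-1",model_type="type-x"} 3
"""

result = tokens_per_application(SAMPLE)
for application, totals in sorted(result.items()):
    average = totals["sum"] / totals["count"]
    print(f"{application}: {totals['sum']:.0f} tokens, {average:.0f} per request")
```

Grouping by `deployment_model_name` or `model_type` instead only requires changing the label read inside the loop.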

Last modified January 15, 2025: feat: Documentation Assistant and ATRIA for Linkin Park release #AURA-26619 [RTM] (409958c0)