Atria Model Gateway metrics

List of metrics available in atria-model-gateway

http_request_duration_seconds

This metric records every incoming HTTP request received by atria-model-gateway.

It is stored as a Prometheus Summary, so each observation carries the request duration in addition to the labels defined below.

This metric makes it possible to measure the behavior of requests to any given endpoint. The duration covers the time from when the request reaches atria-model-gateway until its HTTP response is returned. It can be used to derive:

  • The number of requests over a given time period
  • The average/min/max duration of these requests

Labels:

  • method: HTTP method of the request (GET, POST, PUT, DELETE, etc.)
  • path: specific endpoint of the request
  • status_code: HTTP status code returned in the response
  • application: application name that is using the model
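
As a rough illustration of how the two figures above could be queried, the sketch below builds PromQL expressions for the request rate and the average duration. It assumes the Summary exposes the usual `_count` and `_sum` series that Prometheus summaries provide. The helper names and the default `5m` window are hypothetical and not part of atria-model-gateway.

```python
from __future__ import annotations


def selector(metric: str, labels: dict[str, str]) -> str:
    """Render a PromQL selector such as metric{key="value"}."""
    if not labels:
        return metric
    body = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return f"{metric}{{{body}}}"


def request_rate(metric: str, labels: dict[str, str], window: str = "5m") -> str:
    """Requests per second over the window, based on the _count series."""
    count_series = selector(metric + "_count", labels)
    return f"sum(rate({count_series}[{window}]))"


def average_duration(metric: str, labels: dict[str, str], window: str = "5m") -> str:
    """Mean duration over the window: rate of _sum divided by rate of _count."""
    sum_series = selector(metric + "_sum", labels)
    count_series = selector(metric + "_count", labels)
    return f"sum(rate({sum_series}[{window}])) / sum(rate({count_series}[{window}]))"


if __name__ == "__main__":
    # Label names come from the list above; the label value is only an example.
    print(request_rate("http_request_duration_seconds", {"method": "GET"}))
    print(average_duration("http_request_duration_seconds", {"method": "GET"}))
```

The generated strings can be pasted into the Prometheus expression browser or a dashboard panel. Filtering on other labels (path, status_code, application) works the same way.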

outgoing_request_duration_seconds

This metric records every outgoing HTTP request made by atria-model-gateway. It is stored as a Prometheus Summary, so each observation carries the request duration in addition to the labels defined below.

The metric makes it possible to measure the behavior of requests sent to any given endpoint:

  • The number of requests over a given time period
  • The average/min/max duration of these requests

Labels:

  • method: HTTP method used by the request being stored (GET, POST, PUT, DELETE, etc.)
  • host: host and domain the request is sent to
  • path: specific endpoint of the request
  • status: HTTP status code returned in the response
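
To make the "average duration" figure concrete, here is a minimal sketch of the arithmetic behind it. It assumes two readings of the Summary's `_sum` and `_count` series for a single label combination, taken at different times. The `SummaryScrape` type and the function name are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SummaryScrape:
    """One reading of a Summary for a single label combination (hypothetical type)."""
    total_seconds: float  # value of outgoing_request_duration_seconds_sum
    request_count: float  # value of outgoing_request_duration_seconds_count


def average_duration(earlier: SummaryScrape, later: SummaryScrape) -> Optional[float]:
    """Mean duration of the requests observed between the two readings."""
    requests = later.request_count - earlier.request_count
    if requests <= 0:
        # No new requests, or the counters were reset (e.g. after a restart).
        return None
    return (later.total_seconds - earlier.total_seconds) / requests


if __name__ == "__main__":
    first = SummaryScrape(total_seconds=12.0, request_count=40)
    second = SummaryScrape(total_seconds=15.0, request_count=50)
    print(average_duration(first, second))
```

The same delta of `request_count` gives the number of requests made during the interval.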

generative_tokens

This metric records the tokens consumed by OpenAI models in atria-rag-server. It is stored as a Prometheus Summary, so each observation carries the token usage in addition to the labels defined below.

The metric makes it possible to measure token consumption for any given OpenAI model:

  • The number of tokens consumed over a given time period
  • The average/min/max number of tokens per request

Labels:

  • application: application name that is using the model
  • deployment_model_name: name of the deployment model
  • model_type: identifier of the model
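
As an example of using these labels, the sketch below totals token usage per application from text in the Prometheus exposition format. It assumes the Summary exposes a `generative_tokens_sum` series, as Prometheus summaries normally do. The parser is deliberately simplified: it does not handle escaped quotes or `}` characters inside label values, and the function name is hypothetical.

```python
from __future__ import annotations

import re
from collections import defaultdict

# Matches lines like: name{label="value",...} 123 [optional timestamp]
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)(?:\s+-?\d+)?$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def tokens_by_application(exposition: str, metric: str = "generative_tokens") -> dict[str, float]:
    """Sum the <metric>_sum samples, grouped by the application label."""
    totals: defaultdict[str, float] = defaultdict(float)
    for line in exposition.splitlines():
        match = SAMPLE_RE.match(line.strip())
        # Skip comments, blank lines, and every series other than <metric>_sum.
        if match is None or match["name"] != metric + "_sum":
            continue
        labels = dict(LABEL_RE.findall(match["labels"]))
        totals[labels.get("application", "")] += float(match["value"])
    return dict(totals)


if __name__ == "__main__":
    # Label values below are made up for the example.
    sample = (
        '# TYPE generative_tokens summary\n'
        'generative_tokens_sum{application="app-a",deployment_model_name="dep-1",model_type="type-x"} 120\n'
        'generative_tokens_sum{application="app-b",deployment_model_name="dep-1",model_type="type-x"} 50\n'
    )
    print(tokens_by_application(sample))
```

Grouping by deployment_model_name or model_type instead only requires changing the label that is read from each sample.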