Develop a filter for the Engine Filter API

Guidelines for understanding and creating filters for the Engine Filter API.

Introduction

A filter is a set of configurations and instructions that allow data to be grouped and queried in order to determine whether the flow of a use case should be changed at a given time. For example, if a use case queries an LLM, we can create a filter that limits the use of that LLM by user or by channel during a given period. In this documentation we build a filter that limits the use of LLMs by user and use case; it is described in detail in the sections below.



Engine Filter API

Anatomy of a filter

A filter is composed of a descriptive part, a configuration part, and an execution part. The filter is stored in JSON format; the service in charge of the storage is the Aura Configuration API [TODO url]. The execution part is written as MongoDB instructions.

Descriptive Properties

| Field name | Type | Required | Description |
|---|---|---|---|
| id | string | true | UUID that uniquely identifies a routing filter in Aura. |
| name | string | true | Name that uniquely identifies a routing filter in Aura. |
| description | string | false | Routing filter description. |
| type | string | true | Contains the type of filter. Currently, there is only one type: ‘userId’. |

    {
        "name": "preset-filter-stb-conversational-search",
        "id": "4a879583-5f76-4e6b-87c1-6250e8743dda",
        "description": "Limit the number of messages per user in a month for stb conversational search preset",
        "type": "userId"
        ....
    }

Configuration Properties

| Field name | Type | Required | Description |
|---|---|---|---|
| entities | string[] | true | Contains at least one entity necessary to generate the data for the filter. |
| dataBase | object | true | Contains an object with the collections necessary to store and process data. See database object. |
| vars | object | true | Contains an object with the custom variables necessary for any phase of the filter. Encrypted field. |
| fields | object | true | Contains the field mapping for grouping and its relationship with the previously defined entities. Encrypted field. |
| sourceFilters | object[] | false | Contains an array of objects used to filter source data when loading from entity data. Encrypted field. |

    {
        ....
        "entities": [
            "GATEWAYMESSAGE"
        ],
        "vars": {
            "llm_execution_limit": 10
        },
        "dataBase": {
            "dataFilterCollection": {
                "collectionName": "dataFilterPreset",
                "expiration": 5356800,
                "indexes": [
                    {
                        "seqId": 1
                    }
                ]
            },
            "dataSummaryCollection": {
                "collectionName": "dataSummaryStbConvSearch",
                "expiration": 5356800,
                "indexes": [
                    {
                        "month": 1,
                        "total": 1,
                        "year": 1
                    },
                    {
                        "itemId": 1,
                        "month": 1,
                        "year": 1
                    }
                ]
            }
        },
        "fields": {
            "MatchingValue": "^ef3d0603-3fef-4109-a577-0ab92f9060df$",
            "forId": "USER_ID",
            "forMatchingField": "AURA_PRESET_NAME",
            "forTime": "MESSAGE_TM"
        },
        "sourceFilters": [
            {
                "field": "USER_ID",
                "op": "notEqual",
                "val": ""
            }
        ],
        ....
    }

Execution Properties

| Field name | Type | Required | Description |
|---|---|---|---|
| match | object[] | true | Contains the MongoDB aggregation pipeline that must return grouped data. Encrypted field. |
| summary | object | true | Contains an object in MongoDB command format that inserts the obtained aggregates into a summary collection. Encrypted field. |
| actions | object[] | true | Contains an array of objects with the actions to be performed on the filtered data. |
| summaryFilter | object | true | Contains an object in MongoDB command format that selects the records that meet the filter. Encrypted field. |

        ....
        "match": [
            {
                "$match": {
                    "fieldForMatch": "{{fields.forMatchingField}}",
                    "seqId": {
                        "$gt": "{{ctx.lastSeqId|number}}"
                    },
                    "valueForMatch": {
                        "$options": "i",
                        "$regex": "{{fields.MatchingValue}}"
                    }
                }
            },
            {
                "$group": {
                    "_id": {
                        "itemId": "$itemId",
                        "month": "$month",
                        "year": "$year"
                    },
                    "seqId": {
                        "$max": "$seqId"
                    },
                    "total": {
                        "$sum": 1
                    }
                }
            },
            {
                "$project": {
                    "_id": 0,
                    "itemId": "$_id.itemId",
                    "month": "$_id.month",
                    "seqId": 1,
                    "total": 1,
                    "year": "$_id.year"
                }
            }
        ],
        "summary": {
            "filter": {
                "itemId": "{{doc.itemId}}",
                "month": "{{doc.month|number}}",
                "year": "{{doc.year|number}}"
            },
            "options": {
                "upsert": true
            },
            "update": {
                "$inc": {
                    "total": "{{doc.total|number}}"
                },
                "$set": {
                    "expiresAt": "{{ctx.expiration|expires}}",
                    "month": "{{doc.month|number}}",
                    "updatedAt": "{{__DATE_NOW__}}",
                    "year": "{{doc.year|number}}"
                }
            }
        },
        "summaryFilter": {
            "filter": {
                "month": "{{ctx.month|number}}",
                "total": {
                    "$gt": "{{vars.llm_execution_limit|number}}"
                },
                "year": "{{ctx.year|number}}"
            },
            "options": {
                "projection": {
                    "_id": 0,
                    "itemId": 1,
                    "month": 1,
                    "total": 1,
                    "year": 1
                }
            }
        }
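
The three property tables above can be turned into a quick pre-flight check before a filter is stored. The following is a minimal sketch, not part of the Engine Filter API: the function name validate_filter and its messages are hypothetical, and the required-field lists are simply copied from the tables in this document.

```python
import uuid

# Required top-level fields, copied from the three property tables above.
REQUIRED_FIELDS = {
    "descriptive": ["id", "name", "type"],
    "configuration": ["entities", "dataBase", "vars", "fields"],
    "execution": ["match", "summary", "actions", "summaryFilter"],
}


def validate_filter(definition: dict) -> list[str]:
    """Return the problems found in a filter definition (hypothetical helper)."""
    problems = []
    for part, names in REQUIRED_FIELDS.items():
        for name in names:
            if name not in definition:
                problems.append(f"missing {part} field: {name}")
    # The id is documented as a UUID, so check that it parses as one.
    if "id" in definition:
        try:
            uuid.UUID(str(definition["id"]))
        except ValueError:
            problems.append("id is not a valid UUID")
    # entities must contain at least one entity.
    if "entities" in definition and not definition["entities"]:
        problems.append("entities must contain at least one entity")
    return problems
```

An empty list means that every field marked as required is present; the check does not look inside the MongoDB instructions.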

Create, execute and check a filter

To walk through all the steps needed to create and use a filter, let’s use the example mentioned in the introduction.

Suppose we need a filter that limits each user’s access to the LLM APIs per use case. We set a limit of 150 interactions per month, and the filter must reset the count at the beginning of each month.

Create a filter. Steps

Filters use data generated for KPIs, so before defining a filter the most important step is to identify which entity contains all the fields needed to express the filter conditions. Note that currently only one entity can be used per filter; later versions may support multiple entities.

For our example we first identify the data we need:

  • Date of interaction
  • User identifier
  • Preset identifier

For our filter, the only entity that contains all these data is GATEWAYMESSAGE:

  • Interaction date -> MESSAGE_TM
  • User ID -> USER_ID
  • Preset identifier -> AURA_PRESET_NAME

The value of the preset to which we are going to add the filter is “ef3d0603-3fef-4109-a577-0ab92f9060df”.

Once the entity has been identified, we create the descriptive part of the filter:

        "name": "preset-filter-for-users",
        "id": "1a879583-5f76-4e6b-87c1-6250e874bbs",
        "description": "Limit the number of messages per user in a month for LLM Use Case",
        "type": "userId"

Now that we know which entity we are going to use, we set it:

        "entities": [
            "GATEWAYMESSAGE"
        ]

Next, we map the fields identified in the CSV:

        "fields": {
            "MatchingValue": "ef3d0603-3fef-4109-a577-0ab92f9060df",
            "forId": "USER_ID",
            "forMatchingField": "AURA_PRESET_NAME",
            "forTime": "MESSAGE_TM"
        }

We will set the limit to 150 and put it in a variable.

        "vars": {
            "llm_execution_limit": 150
        }

As the USER_ID field is optional, we filter out the records that have no USER_ID, that is, we keep only those where USER_ID notEqual "". This is done in the sourceFilters property. Although it is an array, it currently supports only one element. The available operators are major, minor, equal and notEqual. sourceFilters also supports other data types for the comparison, such as date or number; these are selected with the cast property (number or date), and the default is string.

        "sourceFilters": [
            {
                "field": "USER_ID",
                "op": "notEqual",
                "val": ""
            }
        ]
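
To make the operator and cast semantics concrete, here is a minimal sketch of how a single source filter could be evaluated against an entity record. It is an illustration, not the engine's code: the function names are hypothetical, and it assumes that major means "greater than", that minor means "less than", and that date values are ISO 8601 strings.

```python
from datetime import datetime


def cast_value(value, cast="string"):
    """Convert a raw value using the documented cast names (string is the default)."""
    if cast == "number":
        return float(value)
    if cast == "date":
        # Assumption: dates arrive as ISO 8601 strings.
        return datetime.fromisoformat(value)
    return str(value)


# Assumption: "major" is greater-than and "minor" is less-than.
OPERATORS = {
    "major": lambda a, b: a > b,
    "minor": lambda a, b: a < b,
    "equal": lambda a, b: a == b,
    "notEqual": lambda a, b: a != b,
}


def passes_source_filter(record: dict, source_filter: dict) -> bool:
    """Return True when the record satisfies the source filter."""
    cast = source_filter.get("cast", "string")
    # Assumption: a missing field is treated like an empty string.
    left = cast_value(record.get(source_filter["field"], ""), cast)
    right = cast_value(source_filter["val"], cast)
    return OPERATORS[source_filter["op"]](left, right)
```

The cast matters for ordering operators: compared as strings, "10" sorts before "9", whereas with cast set to number the comparison is numeric.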

We also need to name the collections and set the lifetime of their data in seconds.

        "dataBase": {
            "dataFilterCollection": {
                "collectionName": "dataFilterPreset",
                "expiration": 5356800
            },
            "dataSummaryCollection": {
                "collectionName": "dataSummaryFilterPreset",
                "expiration": 5356800
            }
        }
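
As a quick sanity check on the lifetime value, the arithmetic below converts the expiration from seconds to days. Only the value 5356800 comes from the example; the variable names are illustrative.

```python
from datetime import timedelta

EXPIRATION_SECONDS = 5356800  # value used in the example above

lifetime = timedelta(seconds=EXPIRATION_SECONDS)
print(lifetime.days)  # 62
```

Sixty-two days is long enough to span two consecutive 31-day months.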

We have now established all the values for the descriptive and configuration parts. Keep in mind that, once the execution part is defined, we will have to add some more data, such as indexes, if we want the filter to run with the best possible performance.

Create Execution definition

Once a filter has been created, the KPI utilities know which entities they have to send to the Engine Filter API so that it can process and group the data and then validate the filters against them.

In our example, grouping the data requires the dataFilterCollection collection, which contains the raw data extracted from the KPI entity. The configured fields are mapped to the raw document as follows:

  "forId": "USER_ID",                     -> itemId
  "forMatchingField": "AURA_PRESET_NAME", -> fieldForMatch and valueForMatch
  "forTime": "MESSAGE_TM"                 -> year, month, expires and seqId

Example:

{
    "seqId" : 1756883926388.0,                                   -> Timestamp to generate partial summaries
    "itemId" : "IN-ewXp7S8u4mL0RlKJStA",                         -> UserId
    "year" : 2025,                                               -> Year
    "month" : 8,                                                 -> Month
    "fieldForMatch" : "AURA_PRESET_NAME",                        -> fieldForMatch
    "valueForMatch" : "a652we235-3fef-4109-a577-0909d8ef234567", -> valueForMatch
    "expiresAt" : "2025-09-03T07:20:15.388Z"                     -> expiration date
}
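
The mapping from an entity record to a raw document can be sketched as follows. This is an illustration under several assumptions that this document does not confirm: the function name is hypothetical, the forTime field is taken to be an ISO 8601 timestamp with a UTC offset, seqId is taken to be that timestamp in epoch milliseconds, and the month is taken to be zero-based (the sample document above pairs month 8 with a seqId that falls in September 2025). The expiresAt field is left out.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def build_raw_document(record: dict, fields: dict) -> dict:
    """Map an entity record to the raw document shape shown above (illustrative)."""
    # Assumption: the forTime field holds an ISO 8601 timestamp with an offset.
    when = datetime.fromisoformat(record[fields["forTime"]])
    return {
        # Assumption: seqId is the timestamp in epoch milliseconds.
        "seqId": (when - EPOCH) // timedelta(milliseconds=1),
        "itemId": record[fields["forId"]],
        "year": when.year,
        # Assumption: zero-based month, inferred from the sample document.
        "month": when.month - 1,
        "fieldForMatch": fields["forMatchingField"],
        "valueForMatch": record[fields["forMatchingField"]],
    }
```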

In our example we have to look for the documents whose valueForMatch matches the value defined in fields.MatchingValue and group them by user ID, month and year, counting the number of interactions. That is, we are looking for something like this:

{
    "month" : 8,
    "itemId" : "bk3iI3GwQWO2-4YI0etc0w",
    "year" : 2025,
    "expiresAt" : "2025-09-04T09:13:48.345Z",
    "total" : 40
}

Before we start defining the execution properties, let’s explain the format in which they are defined.

JSON with PLACEHOLDERS

The JSON of the execution properties can contain placeholders, which inject variables and values taken from other properties. A placeholder is written between double curly braces, optionally followed by a data type, for example {{ctx.month|number}}.

Prefixes
  • vars: Refers to the properties defined inside the vars object. Example: vars.llm_execution_limit, which in our example has the value 150.
  • fields: Refers to the properties defined inside the fields object. Example: fields.MatchingValue.
  • doc: Refers to the document returned by MongoDB in the previous execution. It only appears in the summary property, since that property is built from the results of a previous query. Example: doc.itemId. To be usable, these properties must be included in the $project stage of the match property.
  • ctx: Contains variables that can be inserted in execution properties. These are:
    • ctx.day: Today’s day.
    • ctx.month: Current month.
    • ctx.year: Current year.
    • ctx.lastSeqId: Last seqId processed to create summaries.
    • ctx.expiration: Expiration time of the data to be generated.

Data Types
  • number: Numeric, integer or float type. Example: {{ctx.month|number}}.
  • date: Date type. Example: {{vars.EDate|date}}.
  • expires: Current date plus the placeholder value. Example: {{ctx.expiration|expires}}.
  • boolean: Boolean. Example: {{vars.isTrue|boolean}}.
  • string: String. This is the default when no type is set. Example: {{doc.itemId}} or {{doc.itemId|string}}.
  • lower: String in lowercase. Example: {{doc.itemId|lower}}.
  • upper: String in uppercase. Example: {{doc.itemId|upper}}.
  • trim: Removes whitespace from a string. Example: {{doc.itemId|trim}}.

Variables
  • __DATE_NOW__: Contains the current date. Example: "updatedAt": "{{__DATE_NOW__}}".
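
The placeholder format described in this section can be illustrated with a small resolver. This is a minimal sketch under stated assumptions, not the engine's implementation: the function names are hypothetical, only a subset of the data types is covered (date, expires and the date variable are omitted), and chaining several types in one placeholder is an assumption of the sketch.

```python
import re

PLACEHOLDER = re.compile(r"^\{\{\s*([\w.]+)((?:\|\w+)*)\s*\}\}$")


def to_number(value):
    """Return an int when the value is whole, otherwise a float."""
    number = float(value)
    return int(number) if number.is_integer() else number


# Converters for a subset of the documented data types.
CONVERTERS = {
    "number": to_number,
    "boolean": lambda v: str(v).lower() == "true",
    "string": str,
    "lower": lambda v: str(v).lower(),
    "upper": lambda v: str(v).upper(),
    "trim": lambda v: str(v).strip(),
}


def lookup(path: str, scope: dict):
    """Follow a dotted path such as vars.llm_execution_limit through nested dicts."""
    value = scope
    for key in path.split("."):
        value = value[key]
    return value


def resolve(template, scope: dict):
    """Recursively replace placeholder strings in a JSON-like structure."""
    if isinstance(template, dict):
        return {k: resolve(v, scope) for k, v in template.items()}
    if isinstance(template, list):
        return [resolve(v, scope) for v in template]
    if isinstance(template, str):
        found = PLACEHOLDER.match(template)
        if found:
            value = lookup(found.group(1), scope)
            # Apply each "|type" suffix in order; no suffix means string.
            for name in found.group(2).split("|")[1:] or ["string"]:
                value = CONVERTERS[name](value)
            return value
    return template
```

Here scope is a dict whose top-level keys are the prefixes (vars, fields, doc, ctx), so resolving the summaryFilter template with vars.llm_execution_limit set to 150 yields a numeric 150 in the $gt condition rather than a string.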

Create execution properties: match, summary and summaryFilter

Once we have data in the raw data collection, we need to do the following:

1.- Get the data that has not been processed yet, filter it by a given condition, group it (in this case by itemId, month and year) and finally return a data model from which the summaries are generated. This is achieved in the match property:

Step A: $match. Select the data to process by means of a regex.


"$match": {
    "fieldForMatch": "{{fields.forMatchingField}}",
    "seqId": {
        "$gt": "{{ctx.lastSeqId|number}}"
    },
    "valueForMatch": {
        "$options": "i",
        "$regex": "{{fields.MatchingValue}}"
    }
}

In our case this selects the elements whose fieldForMatch is “AURA_PRESET_NAME”, whose valueForMatch matches “ef3d0603-3fef-4109-a577-0ab92f9060df” and whose seqId is greater than the last processed seqId. With the placeholders resolved, the stage looks like this:


"$match": {
    "fieldForMatch": "AURA_PRESET_NAME",
    "seqId": {
        "$gt": 1099288387374
    },
    "valueForMatch": {
        "$options": "i",
        "$regex": "ef3d0603-3fef-4109-a577-0ab92f9060df"
    }
}

Step B: $group. We group by itemId, month and year, and keep the highest seqId with $max.


"$group": {
    "_id": {
        "itemId": "$itemId",
        "month": "$month",
        "year": "$year"
    },
    "seqId": {
        "$max": "$seqId"
    },
    "total": {
        "$sum": 1
    }
}

This stage groups the data, counts the matching elements and obtains the highest seqId processed. The latter is stored in ctx.lastSeqId.

Step C: $project. We return the grouped data that is used to update or create the elements in the summary collection.


"$project": {
    "_id": 0,
    "itemId": "$_id.itemId",
    "month": "$_id.month",
    "seqId": 1,
    "total": 1,
    "year": "$_id.year"
}
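
To check the expected result of steps A to C without a database, the three stages can be emulated in plain Python. This is only a sketch of the same logic over in-memory documents; the function name and parameters are hypothetical and no MongoDB call is involved.

```python
import re
from collections import defaultdict


def run_match_pipeline(raw_docs, field_for_match, matching_value, last_seq_id):
    """Pure-Python equivalent of the $match, $group and $project stages above."""
    pattern = re.compile(matching_value, re.IGNORECASE)  # "$options": "i"
    groups = defaultdict(lambda: {"seqId": 0, "total": 0})
    for doc in raw_docs:
        # Step A ($match): right field, unprocessed seqId, value matches the regex.
        if doc["fieldForMatch"] != field_for_match:
            continue
        if doc["seqId"] <= last_seq_id:
            continue
        if not pattern.search(doc["valueForMatch"]):
            continue
        # Step B ($group): count documents and keep the highest seqId per key.
        group = groups[(doc["itemId"], doc["month"], doc["year"])]
        group["seqId"] = max(group["seqId"], doc["seqId"])
        group["total"] += 1
    # Step C ($project): flatten the group key into top-level fields.
    return [
        {"itemId": item_id, "month": month, "year": year, **values}
        for (item_id, month, year), values in groups.items()
    ]
```

Each returned dict has the shape that the doc prefix exposes to the next step: itemId, month, year, seqId and total.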

2.- Update or create the grouped data. This is defined in the summary property.


"summary": {
    "filter": {
        "itemId": "{{doc.itemId}}",
        "month": "{{doc.month|number}}",
        "year": "{{doc.year|number}}"
    },
    "options": {
        "upsert": true
    },
    "update": {
        "$inc": {
            "total": "{{doc.total|number}}"
        },
        "$set": {
            "expiresAt": "{{ctx.expiration|expires}}",
            "month": "{{doc.month|number}}",
            "updatedAt": "{{__DATE_NOW__}}",
            "year": "{{doc.year|number}}"
        }
    }
}

The doc prefix refers to the result of the previous execution of match. Once summary has been executed, the dataSummaryCollection contains the data that can be queried to check whether or not they comply with the filter.
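
The effect of the upsert can be illustrated on an in-memory dict. The sketch below only mirrors the filter, $inc and $set parts of the summary property; the function name and the dict-based store are hypothetical, and expiresAt is computed as the current time plus the expiration, following the description of the expires data type.

```python
from datetime import datetime, timedelta, timezone


def apply_summary(summaries: dict, doc: dict, expiration_seconds: int, now=None) -> None:
    """Emulate the upsert in the summary property on an in-memory dict."""
    now = now or datetime.now(timezone.utc)
    key = (doc["itemId"], doc["month"], doc["year"])  # the "filter" part
    # "upsert": true -> create the summary when no document matches the filter.
    summary = summaries.setdefault(key, {"itemId": doc["itemId"], "total": 0})
    summary["total"] += doc["total"]  # "$inc"
    # "$set": refresh the expiry and bookkeeping fields on every update.
    summary.update(
        month=doc["month"],
        year=doc["year"],
        updatedAt=now,
        expiresAt=now + timedelta(seconds=expiration_seconds),
    )
```

Because total is incremented rather than overwritten, each run only has to contribute the documents that arrived after the last processed seqId.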

3.- Obtain the data that meet the filter. This is defined in the summaryFilter property.


"summaryFilter": {
    "filter": {
        "month": "{{ctx.month|number}}",
        "total": {
            "$gt": "{{vars.llm_execution_limit|number}}"
        },
        "year": "{{ctx.year|number}}"
    },
    "options": {
        "projection": {
            "_id": 0,
            "itemId": 1,
            "month": 1,
            "total": 1,
            "year": 1
        }
    }
}

In this case it looks for the records of the current month and year whose total exceeds the limit configured in vars.llm_execution_limit. The Engine Filter API stores the elements that exceed the limit in a cache each time the dataSummaryCollection is updated.
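
The selection performed by summaryFilter can be written out in plain Python to make the boundary condition explicit. This is a sketch over a list of summary dicts; the function name is hypothetical.

```python
def items_over_limit(summaries, month, year, limit):
    """Emulate summaryFilter: current month and year, total strictly above the limit."""
    projected = ("itemId", "month", "total", "year")  # the "projection" option
    return [
        {name: summary[name] for name in projected}
        for summary in summaries
        if summary["month"] == month
        and summary["year"] == year
        and summary["total"] > limit  # "$gt": strictly greater than
    ]
```

Since the comparison is $gt, a user with exactly 150 interactions is not returned yet; the user is returned once the total reaches 151.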

Check filters

Filters can be queried via Configuration API calls to https://ENV.auracognitive.com/aura-services/v2/routing-filters/FILTER_ID/items/ITEM_ID, or from within a context filter.

Example of a dialog that uses a context filter:

{
  "name": "tvCustomRecommendation",
  "dialogs": [
    {
      "id": "tv-custom-recommendation",
      "onlyIn": [
        "set-top-box"
      ],
      "allowAnonymous": true,
      "triggerConditions": [
        {
          "intent": "intent.tv.custom_recommendation",
          "contextFilters": [
            {
              "name": "limit-num-messages-LLM-user",
              "type": "RoutingByFilter",                             <-- Set the type
              "conditions": "4a879583-5f76-4e6b-87c1-6250e8743dda",  <-- FilterId
              "true": {
                "name": "send-custom-messsage",
                "breakDialogExecution": true,
                "breakFilterEval": true,
                "redirectToIntent": "intent.send-custom-message",
                "resource": "intent.send-custom-message:default.message",
                "removeBypass": true,                                <-- Remove bypass
                "suggestions": false
              }
            }
          ],
          "settings": {
            "action": "noAction",
            "sound": "positive",
            "type": "common"
          }
        }
      ],
      "bypass": {
        "duration": 4,
        "payloadName": "openai",
        "initialData": {},
        "recognizersEnabled": true,
        "recognizersBreakIntents": {
          "intent.tv.display": [
            "[Display Channel]",
            "[Display Contents]"
          ],
          "intent.navigation.section_show": [
            "[Sections]"
          ]
        }
      }
    },

To include a filter inside a context filter, the following properties must be set:

  • type: Must be “RoutingByFilter”.
  • conditions: Must contain the id of the filter we want to query.

If the dialog has an associated bypass, it is important to add “removeBypass”: true inside the “true” property.
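
For the Configuration API route given in the Check filters section, the URL can be assembled as shown below. This sketch only fills in the ENV, FILTER_ID and ITEM_ID parts of the pattern; the function name is hypothetical, and the HTTP method, authentication and response format are not covered because they are not described here.

```python
from urllib.parse import quote

URL_TEMPLATE = (
    "https://{env}.auracognitive.com/aura-services/v2/"
    "routing-filters/{filter_id}/items/{item_id}"
)


def filter_item_url(env: str, filter_id: str, item_id: str) -> str:
    """Fill ENV, FILTER_ID and ITEM_ID in the URL pattern from this document."""
    return URL_TEMPLATE.format(
        env=env,
        # Percent-encode path segments so unusual ids cannot break the path.
        filter_id=quote(filter_id, safe=""),
        item_id=quote(item_id, safe=""),
    )
```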