Aura Redis Mongo Synchronizer (ARMS)
aura-redis-mongo-synch (ARMS) is a service that uses Redis and MongoDB as a two-level cache for Bot Framework context management.
This document describes the component, its architecture, its modules, and its processes.
Aura Virtual Assistant component
Introduction
The two-level cache strategy works as follows:
- Redis is used as the first-level cache, with a short expiration time, to speed up write operations.
- MongoDB is used as the second-level cache, with a long expiration time.
aura-redis-mongo-synchronizer (ARMS) will be responsible for transferring expired data from Redis to MongoDB.
The service is implemented as a web server only to make it easier to operate and to connect it to Aura monitoring systems, such as Prometheus. It handles the data that expires in Redis by using the Pub/Sub system provided by Redis.
Aura Redis Mongo Synchronizer components
The following figure shows the main components of the aura-redis-mongo-sync.
Server
The web server is implemented with fastify, a fast web framework for Node.js.
The web server acts only as the service controller; all synchronization tasks are performed by a separate module called aura-redis-mongo-sync, or ARMS.
Modules
Redis Mongo Sync
It is responsible for synchronizing data between Redis and MongoDB.
It uses a publish/subscribe system provided by Redis to notify the module when to synchronize data from Redis to MongoDB, based on events generated by Redis regarding the changes in the status of the documents.
Controllers & Services
Once the server has accepted a request, the request reaches the corresponding controller.
Each controller processes the request through a service in charge of implementing the logic. Once the request has been processed, the controller prepares and sends the response.
There is a controller to manage requests associated with the server itself (generic controller) and a controller for each module, in charge of handling the requests of that specific module.
Generic controller
As indicated above, the generic controller handles requests that retrieve information about the status of the server itself.
- healthz: Checks whether the server is online and returns a simple status response: {"status": "ok"}.
- lastevent: Returns the date of the last synchronization between Redis and MongoDB: { status: 'ok', lastEvent: DATE }.
- shutdown: Service invoked mainly by the Kubernetes cluster to perform a graceful server shutdown.
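The response shapes above can be sketched as plain functions. This is a minimal illustration, not the actual controller: the `SyncStatus` class, its method names, and the choice of an ISO string (or `null` before the first synchronization) for `lastEvent` are assumptions made for the example.

```typescript
// Hypothetical response shapes, modeled on the examples in the text.
interface HealthzResponse {
  status: "ok";
}

interface LastEventResponse {
  status: "ok";
  // Assumption: the date is serialized as an ISO string; null before any sync.
  lastEvent: string | null;
}

// Hypothetical holder for the time of the last Redis-to-MongoDB synchronization.
class SyncStatus {
  private lastEvent: Date | null = null;

  // Called by the synchronization module after each successful sync.
  markSynchronized(at: Date): void {
    this.lastEvent = at;
  }

  // Body of a "healthz" response.
  healthz(): HealthzResponse {
    return { status: "ok" };
  }

  // Body of a "lastevent" response.
  lastevent(): LastEventResponse {
    return {
      status: "ok",
      lastEvent: this.lastEvent === null ? null : this.lastEvent.toISOString(),
    };
  }
}
```

In a real server these functions would be wired to routes of the web framework; that wiring is omitted here.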
Data Storage
The RedisMongoDbStorage class implements the Bot Framework storage interface.
It writes the context data to Redis. When reading, it tries Redis first and, if the data does not exist there, fetches it from MongoDB.
This class is included in the aura-common-utilities repository. Check the GitHub repository for further details.
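The read and write order described above can be sketched with two in-memory stores. This is not the RedisMongoDbStorage implementation: `KeyValueStore`, `InMemoryStore`, and `TwoLevelStorage` are hypothetical names, and the stores are synchronous only to keep the example short (real Redis and MongoDB clients are asynchronous).

```typescript
// Hypothetical minimal interface standing in for a Redis or MongoDB client.
interface KeyValueStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// In-memory stand-in used for both cache levels in this sketch.
class InMemoryStore implements KeyValueStore {
  private readonly data = new Map<string, string>();

  get(key: string): string | undefined {
    return this.data.get(key);
  }

  set(key: string, value: string): void {
    this.data.set(key, value);
  }
}

class TwoLevelStorage {
  private readonly firstLevel: KeyValueStore;
  private readonly secondLevel: KeyValueStore;

  constructor(firstLevel: KeyValueStore, secondLevel: KeyValueStore) {
    this.firstLevel = firstLevel;
    this.secondLevel = secondLevel;
  }

  // Writes always go to the first level (Redis in the text).
  write(key: string, context: object): void {
    this.firstLevel.set(key, JSON.stringify(context));
  }

  // Reads try the first level and fall back to the second (MongoDB in the text).
  read(key: string): unknown {
    const raw = this.firstLevel.get(key) ?? this.secondLevel.get(key);
    return raw === undefined ? undefined : JSON.parse(raw);
  }
}
```

In this sketch nothing is ever written to the second level by the storage class itself, which mirrors the split of responsibilities in the text: moving data to MongoDB is left to ARMS.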
Data structure in Redis
Redis stores its data in key:value structures, where the value can hold different data structures.
The following sections explain the ones used to store the bot context.
These structures support an expiration time: when it is reached, the content has expired and Redis deletes it. The expiration of a record in Redis emits a specific event, of type expired, that can be handled by an external module.
Context (Bot Framework Context)
The bot context is stored in a key:string structure:
Key: context:[MONGO DATABASE NAME]:[MONGO COLLECTION NAME]:[MONGO CONTEXT INDEX KEY]
Value: The context object serialized to a JSON string.
Expiration: [EXPIRATION]
Where:
- MONGO DATABASE NAME: Name of the database to use when ARMS writes this data in MongoDB.
- MONGO COLLECTION NAME: Name of the collection to use when ARMS writes this data in MongoDB.
- MONGO CONTEXT INDEX KEY: Key to use when ARMS writes this data in MongoDB.
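As an illustration of the key layout above, the following sketch builds and parses a context key. The function and field names are hypothetical, and the parser assumes that only the index key (the last part) may itself contain colons.

```typescript
// Hypothetical structure for the three parts of a context key.
interface ContextKeyParts {
  database: string;   // MONGO DATABASE NAME
  collection: string; // MONGO COLLECTION NAME
  indexKey: string;   // MONGO CONTEXT INDEX KEY
}

// Builds context:[database]:[collection]:[indexKey].
function buildContextKey(parts: ContextKeyParts): string {
  return `context:${parts.database}:${parts.collection}:${parts.indexKey}`;
}

// Splits a context key back into its parts; returns null if the layout does not match.
function parseContextKey(key: string): ContextKeyParts | null {
  const segments = key.split(":");
  if (segments.length < 4 || segments[0] !== "context") {
    return null;
  }
  const database = segments[1];
  const collection = segments[2];
  if (database === undefined || collection === undefined) {
    return null;
  }
  // Assumption: any further colons belong to the index key.
  return { database, collection, indexKey: segments.slice(3).join(":") };
}

// The value stored under the key is the context object as a JSON string.
function serializeContext(context: object): string {
  return JSON.stringify(context);
}
```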
When a key:value entry expires in Redis, an event is emitted indicating which key held the expired record (event => expired). At that point the data can no longer be retrieved, because Redis deletes the value before emitting the event. In addition, if the event is not processed, it is not emitted again. For both reasons, auxiliary data structures are needed to recover the expired data.
Shadow Keys
A Shadow Key is a key:void structure used only to emit the expiration event. Because the expiration event returns just a key, the name of the shadow key contains the key of the context that holds the data.
The shadow key also contains the SHARD, which allows each ARMS instance to subscribe only to certain shards and so improves performance.
The number of shards is defined in the variable AURA_REDIS_CONTEXT_CACHE_SHARD_COUNT: the algorithm that calculates the shard from the key generates numbers between 1 and the value of this variable.
Note that the number of shards configured in AURA_REDIS_CONTEXT_CACHE_SHARD_COUNT must be the same as the number of pods configured in the aura-redis-mongo-synchronizer.
Key: shadow-key:[SHARD]:context:[MONGO DATABASE NAME]:[MONGO COLLECTION CONTEXT]:[MONGO CONTEXT INDEX KEY]
Value: empty
Expiration: [REDIS CONTEXT DATABASE EXPIRATION]
When the service receives the expiration event for this key, it can extract the shard and the real context key from the shadow key.
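Extracting the shard and the context key from an expired shadow key could look like the sketch below. The function names and the `ShadowKeyInfo` type are hypothetical; only the key layout comes from the text.

```typescript
// Hypothetical result of parsing a shadow key.
interface ShadowKeyInfo {
  shard: number;      // SHARD segment of the shadow key
  contextKey: string; // key under which the context data is stored
}

// Builds shadow-key:[SHARD]:context:... from a shard number and a context key.
function buildShadowKey(shard: number, contextKey: string): string {
  return `shadow-key:${shard}:${contextKey}`;
}

// Parses the key delivered by the expiration event; returns null for other keys.
function parseShadowKey(expiredKey: string): ShadowKeyInfo | null {
  const match = /^shadow-key:(\d+):(context:.+)$/.exec(expiredKey);
  if (match === null) {
    return null;
  }
  const shardText = match[1];
  const contextKey = match[2];
  if (shardText === undefined || contextKey === undefined) {
    return null;
  }
  return { shard: Number(shardText), contextKey };
}
```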
Active Context List
To ensure that all data is synchronized, all active context keys are kept in a list. If an error occurs in the service and expiration notifications are lost, the data can be recovered by periodically checking this list.
The data structure used is key:score-list. A score list holds values made up of a string field and a numeric field by which the data can be sorted (in Redis, this corresponds to a sorted set). The string field stores the context key and the numeric field stores the expiration date.
Because items are removed from the list once they are synchronized, any expired record still in the list has not been processed and can be retrieved.
Key: active-context:[SHARD_VALUE]
context:[MONGO DATABASE NAME]:[MONGO COLLECTION CONTEXT]:[MONGO CONTEXT INDEX KEY 1] DATE_1
context:[MONGO DATABASE NAME]:[MONGO COLLECTION CONTEXT]:[MONGO CONTEXT INDEX KEY 2] DATE_2
...
context:[MONGO DATABASE NAME]:[MONGO COLLECTION CONTEXT]:[MONGO CONTEXT INDEX KEY N] DATE_N
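The behavior of this list can be illustrated with an in-memory stand-in. The class below is hypothetical and only mimics a scored list; with a real Redis sorted set the comparable commands would be ZADD, ZREM, and ZRANGEBYSCORE, but whether ARMS uses exactly those commands is an assumption.

```typescript
// In-memory stand-in for one active-context:[SHARD_VALUE] scored list.
class ActiveContextList {
  // Maps each context key (string field) to its expiration date in ms (numeric field).
  private readonly scores = new Map<string, number>();

  // Registers or refreshes an active context (comparable to ZADD).
  add(contextKey: string, expiresAtMs: number): void {
    this.scores.set(contextKey, expiresAtMs);
  }

  // Removes a context once it has been synchronized (comparable to ZREM).
  remove(contextKey: string): void {
    this.scores.delete(contextKey);
  }

  // Returns the keys whose expiration date has passed, oldest first
  // (comparable to ZRANGEBYSCORE with an upper bound of "now").
  expiredAt(nowMs: number): string[] {
    return Array.from(this.scores.entries())
      .filter((entry) => entry[1] <= nowMs)
      .sort((a, b) => a[1] - b[1])
      .map((entry) => entry[0]);
  }
}
```

Any key returned by `expiredAt` is a context that expired without being synchronized, which is exactly the case the periodic check is meant to recover.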
These elements are stored in lists partitioned by SHARD: each element goes into the list that corresponds to its KEY. To obtain the SHARD_VALUE, a number is calculated from the key and divided by the number of configured partitions; the SHARD_VALUE is derived from the remainder (modulo) of that division.
Using a series of lists avoids collisions between the pods of the service (ARMS). For example, if the key KEY1 yields 301 (a number obtained by performing an operation on the content of the key) and there are 5 partitions, the remainder of 301 divided by 5 is 1. Adding 1 to avoid a 0 gives 2, so the key is stored in the list active-context:2.
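The arithmetic of this example can be sketched as follows. The text does not specify the operation that turns a key into a number, so `keyToNumber` (a sum of character codes) is purely a placeholder; the modulo-plus-one step follows the example.

```typescript
// Placeholder for the unspecified "operation on the content of the key".
// Assumption: a simple sum of character codes, chosen only for illustration.
function keyToNumber(key: string): number {
  let total = 0;
  for (let i = 0; i < key.length; i++) {
    total += key.charCodeAt(i);
  }
  return total;
}

// Maps a number to a shard in the range 1..shardCount (remainder + 1).
function shardFromNumber(value: number, shardCount: number): number {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }
  return (value % shardCount) + 1;
}

// Name of the active context list in which a context key is stored.
function activeContextListKey(contextKey: string, shardCount: number): string {
  const shard = shardFromNumber(keyToNumber(contextKey), shardCount);
  return `active-context:${shard}`;
}
```

With the numbers from the example, `shardFromNumber(301, 5)` yields 2, which corresponds to the list active-context:2.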
The execution period of the periodic check of this list depends on the current system load. The variables associated with the execution period are:
- AURA_REDIS_CONTEXT_CACHE_CACHE_CHECK_UNPROCESSED_INTERVAL_MIN: the minimum interval, in seconds.
- AURA_REDIS_CONTEXT_CACHE_CACHE_CHECK_UNPROCESSED_INTERVAL_MAX: the maximum interval, in seconds.
The system will recalculate the next run based on the load. If the number of elements to be processed is very high, the interval will be reduced until it reaches the minimum, while if the load is low, the interval will be increased until it reaches the maximum.
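One possible way to recalculate the interval is sketched below. The text does not describe the actual formula, so the halving and doubling steps and the `highLoadThreshold` parameter are assumptions; only the clamping between the minimum and the maximum comes from the description.

```typescript
// Hypothetical configuration; min and max mirror the two interval variables.
interface IntervalConfig {
  minSeconds: number;        // ..._CHECK_UNPROCESSED_INTERVAL_MIN
  maxSeconds: number;        // ..._CHECK_UNPROCESSED_INTERVAL_MAX
  highLoadThreshold: number; // assumption: pending items that count as "high load"
}

// Returns the interval, in seconds, to wait before the next run.
function nextInterval(
  currentSeconds: number,
  pendingItems: number,
  config: IntervalConfig
): number {
  // Assumption: halve the interval under high load, double it otherwise.
  const proposed =
    pendingItems >= config.highLoadThreshold
      ? currentSeconds / 2
      : currentSeconds * 2;
  // Keep the result within the configured minimum and maximum.
  return Math.min(config.maxSeconds, Math.max(config.minSeconds, proposed));
}
```

Repeated runs under sustained load converge on the minimum interval, and repeated idle runs converge on the maximum, which matches the behavior described above.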
SHARD variables
To manage the SHARD and the list to be selected by the ARMS, two variables come into play in Redis stored in the key:string structure.
- AURA_REDIS_CONTEXT_CACHE_SHARD_COUNT : Number of lists to generate (partitions).
- AURA_REDIS_CONTEXT_LAST_SHARD_PROCESSED: Partition to manage at a given moment. When a service requests this value, it is incremented by one so that the next service that requests it gets the next partition, and so on.
- AURA_REDIS_CONTEXT_LAST_INDEX_SHARD_PROCESSED: Used to assign event subscriptions in order. For example, with two ARMS services and AURA_REDIS_CONTEXT_CACHE_SHARD_COUNT set to four, the first service subscribes to shards 1 and 2 and the second service to shards 3 and 4. This order is obtained by incrementing this variable.
Synchronization Flows
Read Data
The Bot Framework components will read data in the following order:
- They will try to read data from Redis.
- If Redis does not return data, they will try it in MongoDB.
ARMS will read data in two situations:
- When an expiration event is issued, ARMS obtains the context key contained in the expired KEY and reads the corresponding data.
- When the active context control service, which runs periodically, finds expired contexts that must be synchronized.
Write Data
- Bot Framework components will write all their data in Redis.
- ARMS will write the synchronized records in MongoDB.
NOTE: ARMS also writes data in Redis but only to control its core and internal behavior, incrementing SHARD, updating active context lists, etc.
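The read and write flows on the ARMS side can be put together in a small sketch. It uses in-memory maps and a set as stand-ins for Redis, MongoDB, and the active context list; the function name, the use of the context key as the MongoDB identifier, and the return value are assumptions made for the example.

```typescript
// Handles one expired shadow key: reads the context from the first level,
// writes it to the second level, and drops it from the active context list.
// Returns true when a record was synchronized.
function synchronizeExpired(
  shadowKey: string,
  redis: Map<string, string>,   // stand-in for the Redis context entries
  mongo: Map<string, unknown>,  // stand-in for the MongoDB collection
  activeContexts: Set<string>   // stand-in for the active context list
): boolean {
  // The shadow key embeds the key of the context that holds the data.
  const match = /^shadow-key:\d+:(context:.+)$/.exec(shadowKey);
  if (match === null) {
    return false;
  }
  const contextKey = match[1];
  if (contextKey === undefined) {
    return false;
  }
  // Read: the context itself is still in Redis; only the shadow key expired.
  const raw = redis.get(contextKey);
  if (raw === undefined) {
    return false;
  }
  // Write: store the parsed context in the second level.
  mongo.set(contextKey, JSON.parse(raw));
  // Internal bookkeeping: the context no longer needs to be tracked.
  activeContexts.delete(contextKey);
  return true;
}
```

The same function could be called from the periodic check, passing a shadow key rebuilt from an entry of the active context list.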
Distribution of the event subscription
As previously mentioned, every time a Shadow-Key expires in Redis, it emits an event that will be received by the ARMS services that are subscribed to that event.
Now, as the Shadow-Key contains the SHARD of the real key of the data, we can distribute the events among different ARMS to improve performance and avoid network overload.
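One way to subscribe per shard is through Redis keyspace notifications, which publish key events on channels of the form `__keyspace@<db>__:<key>` and can be matched with pattern subscriptions (PSUBSCRIBE). The text does not state that ARMS uses this mechanism, so the sketch below is an assumption; the function names and the database index parameter are hypothetical.

```typescript
// Builds a PSUBSCRIBE pattern that matches only the shadow keys of one shard.
// Assumption: keyspace notifications for expired events are enabled in Redis.
function shardChannelPattern(db: number, shard: number): string {
  return `__keyspace@${db}__:shadow-key:${shard}:*`;
}

// Extracts the shadow key from the channel name of a keyspace notification.
// Returns null when the channel does not refer to a shadow key.
function shadowKeyFromChannel(channel: string): string | null {
  const match = /^__keyspace@\d+__:(shadow-key:.+)$/.exec(channel);
  if (match === null) {
    return null;
  }
  const shadowKey = match[1];
  return shadowKey === undefined ? null : shadowKey;
}

// Patterns for all the shards assigned to one ARMS instance.
function patternsForShards(db: number, shards: number[]): string[] {
  return shards.map((shard) => shardChannelPattern(db, shard));
}
```

With keyspace channels the message payload is the event name (for example `expired`), so the key has to be taken from the channel name, as `shadowKeyFromChannel` does.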
The events to subscribe to are assigned by a method that runs periodically. The interval is set in the variable AURA_REDIS_CONTEXT_CACHE_CHECK_INDEX_SHARD_INTERVAL and is measured in seconds.
Steps to generate event subscription
- Save the current state of the service: This saves, in a Redis structure, the identifier of the current ARMS and the current time.
- Consult the number of ARMS services available: This is done by consulting the previous structure, obtaining those services that have been recently updated.
- Calculate the number of events to subscribe: This is done by dividing AURA_REDIS_CONTEXT_CACHE_SHARD_COUNT, which contains the number of partitions, by the number of services. There is a variable that can limit this assignment, AURA_REDIS_CONTEXT_CACHE_MIN_SHARD_BY_NODE: if this variable is greater than the previous result, the value of the variable is used. For example, if the result is 1, but this variable has the value 2, the service must subscribe to 2 events.
- Assign the events: Once the number of events to subscribe to is determined, the AURA_REDIS_CONTEXT_LAST_INDEX_SHARD_PROCESSED variable is consulted and incremented. The value to assign is the returned value modulo AURA_REDIS_CONTEXT_CACHE_SHARD_COUNT. In this way, consecutive values between 1 and AURA_REDIS_CONTEXT_CACHE_SHARD_COUNT are assigned.
- Subscription to Redis: Once the SHARDs to subscribe to have been assigned, the service only needs to unsubscribe from the events that do not match the current selection and subscribe to the ones it is not yet subscribed to.
For optimal performance, the value of AURA_REDIS_CONTEXT_CACHE_SHARD_COUNT should ideally be at least the number of ARMS services.
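The calculation, assignment, and subscription steps above can be sketched as follows. Several details are assumptions: the division is rounded up, the counter is an in-memory stand-in for the Redis variable (in Redis, INCR increments a value atomically and returns the new value), and the shard is computed as `((value - 1) % count) + 1` so that results fall in the range 1 to the shard count. All names are hypothetical.

```typescript
// In-memory stand-in for AURA_REDIS_CONTEXT_LAST_INDEX_SHARD_PROCESSED.
class Counter {
  private value = 0;

  // Increments the counter and returns the new value.
  incr(): number {
    this.value += 1;
    return this.value;
  }
}

// Number of shards each service subscribes to.
function shardsPerNode(
  shardCount: number,     // AURA_REDIS_CONTEXT_CACHE_SHARD_COUNT
  nodeCount: number,      // ARMS services found to be alive
  minShardByNode: number  // AURA_REDIS_CONTEXT_CACHE_MIN_SHARD_BY_NODE
): number {
  // Assumption: round up so that every shard gets at least one subscriber.
  const evenShare = Math.ceil(shardCount / Math.max(1, nodeCount));
  return Math.max(evenShare, minShardByNode);
}

// Takes consecutive shard numbers for one service from the shared counter.
function assignShards(counter: Counter, shardCount: number, amount: number): number[] {
  const shards: number[] = [];
  for (let i = 0; i < amount; i++) {
    shards.push(((counter.incr() - 1) % shardCount) + 1);
  }
  return shards;
}

// Computes which shard subscriptions to drop and which to add.
function subscriptionChanges(
  current: number[],
  assigned: number[]
): { unsubscribe: number[]; subscribe: number[] } {
  return {
    unsubscribe: current.filter((s) => assigned.indexOf(s) === -1),
    subscribe: assigned.filter((s) => current.indexOf(s) === -1),
  };
}
```

With four shards and two services, each service takes two shards: the first call to `assignShards` returns shards 1 and 2 and the second returns 3 and 4, matching the example given for AURA_REDIS_CONTEXT_LAST_INDEX_SHARD_PROCESSED.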
Example of distribution for AURA_REDIS_CONTEXT_CACHE_SHARD_COUNT = 5, and adding ARMS services until reaching 5.