Use of Grammars in Aura NLP

This section includes the description of Grammars, a deterministic recognition method used in Aura NLP for the recognition of the users’ utterances, their role in the NLP model and practical processes regarding how to use this stage in the understanding process

What are Grammars?

Grammars are a tool that provides an exact and lightweight utterance’s recognition method through a deterministic approach. Grammar uses probabilistic formalisms to recognize specific utterances from the users and to identify how to interpret them.

Aura NLP include Grammars as a stage that can be included in the NLP pipeline. It use has key limitations due to the large burden of building the language model, as Grammars are only able to recognize exact utterances. However, because of it, they constitute an interesting segment within Aura NLP, due to the existence of specific utterances produced by Aura’s users that must be recognized by Aura (such as common utterances from users or difficult ones that are hardly recognized by CLU).

Discover in the documents:

Detailed description of Grammars: Grammars engines, types of Grammars
Guidelines for the generation of Grammars in Unitex
Recognition of utterances with several entities in Grammars

Grammars engines: GrapeNLP and Unitex/GramLab

GrapeNLP is used by Aura NLP for intent recognition and entity extraction using grammars. This grammar engine is based on handcrafted grammars which describe in an exact manner the sentences that are to be recognized and the output information that is to be generated for each one, in our case, the intent the sentence corresponds to and the entities to extract.

Linguists should develop by hand the grammar that exactly recognizes the required sentences. Just in the case of ambiguity (multiple interpretations defined in the grammar for the same sentence), GrapeNLP uses a heuristic approach in order to choose one of the interpretations: the one that in the grammar uses more restrictive linguistic conditions.

Example of Grammars graphs

The core of GrapeNLP is implemented in C++ and includes a Python module to facilitate its integration with Python programs. It can analyse around 2700 sentences per second in an average computer and can be run in Ubuntu, Alpine, MacOS and Android. Moreover, it is open source and LPGL licensed, thus it can be used in commercial products.

GrapeNLP does not include a grammar editor. Instead, we use the editor included in the Unitex / GramLab platform. Unitex / GramLab is also LGPL licensed and can be installed in Windows, Linux and MacOs machines. The grammars created with Unitex are represented with graphs organized in connected boxes that linguists can easily create and update manually. Each box contains a set of possibilities for each token from the user’s utterance. The combination of different connected boxes provides a full variability of sentences to be recognized. The system also allows the generation of sub-grammars for specific Aura domains or for certain intents.

Once the grammars have been developed in Unitex, the grammar engine goes through all the graph paths from the beginning (left side) and compares box by box the user’s utterance with the grammar to evaluate the matching.

At the end of each path, a score is specified corresponding to the highest score among all the feasible paths. The output is a set of labels together with a start and end index and a score. The output is presented as a .json format.

It is important to bear in mind that, currently, grammars are used in Aura mainly for intents recognition. The grammar engine only provides recognized entities that have previously been labelled in the graphs. As another example of Grammars, the utterance “I would like to watch the film Frozen” provides the following output:

PipelineMessage:
       -OriginalMessage:
         -phrase: 'I would like to watch the film Frozen'
       -normalized_phrase: 'i would like to watch the film frozen'
       -normalized_presentable_phrase: 'I would like to watch the film Frozen'
       -annotated_phrase: ' i would like to watch the film frozen'
       -intent:  intent.tv.search'
       -score: 1.0
       -entities:
               Entity: Frozen, Type: ent.audiovisual_film_title, Score: 1.0, Start index: 31, End index: 37, Canon: frozen, Label: None, Deep Links: None

Note that, even though GrapeNLP does not make use of statistical methods or probabilities, the resulting .json includes a score field. This has been added for homogeneity with the machine learning workflow, but it is always hardcoded to 1.0 (since GrapeNLP performs exact matching, the probability is 100%). The machine learning pipeline never returns a score of 1.0, thus this field can be used for knowing whether the sentence was recognized by GrapeNLP or by an intent recognition stage (CLU, etc.).

📄 For more information regarding the use of Grammars for language recognition, please check the Unitex User Manual.

Global and local grammars

There are two types of grammars defined in Aura NLP recognition process, both based on the Grammars engine that offer a different performance depending on the location where they are executed:

Global grammars: defined and executed in Aura back-end.
Local grammars: they are a subset of the Global grammar.

The understanding process is carried out locally, in the channel side, for an agile resolution of the process, therefore allowing a significant latency reduction. It is available for a selected set of use cases.

Global and local grammars must be aligned, so there are no differences in the E2E understanding process (for instance, the same user input must provide the same result in terms of NLP recognition both global and local grammars).

Channels can automatically update their local grammars based on the grammar backend information. Moreover, the channel needs to be able to share the information with the global backend in terms of logs and KPIs.

Last modified May 14, 2025: feat: Documentation improvement for Prince release #AURA-29163 [RTM] (c69c1272)

Tags:

Categories:

Use of Grammars in Aura NLP

What are Grammars?

Grammars engines: GrapeNLP and Unitex/GramLab

Global and local grammars