Normalization pipelines

Catalog of NLP normalization pipelines to compose the NLP pipeline

The Aura Platform Team has implemented a set of normalization pipelines designed to be nested in the NLP model pipeline. Each one is built by chaining several normalization stages (normalizers).
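
As an illustration of how a pipeline can be built by chaining normalizers, here is a minimal Python sketch. It does not use the auracog_pipelines API: the NormalizationPipeline class and the lowercase and collapse_spaces functions are hypothetical stand-ins that only show the chaining idea.

```python
from typing import Callable, List

# Assumption: a normalizer is modeled here as a plain function str -> str.
Normalizer = Callable[[str], str]


def lowercase(text: str) -> str:
    """Toy counterpart of a lowercasing stage."""
    return text.lower()


def collapse_spaces(text: str) -> str:
    """Toy counterpart of a spacing stage: trims and collapses whitespace runs."""
    return " ".join(text.split())


class NormalizationPipeline:
    """Hypothetical pipeline: runs its stages in order, feeding each output to the next."""

    def __init__(self, stages: List[Normalizer]) -> None:
        self.stages = list(stages)

    def normalize(self, utterance: str) -> str:
        for stage in self.stages:
            utterance = stage(utterance)
        return utterance


pipeline = NormalizationPipeline([collapse_spaces, lowercase])
print(pipeline.normalize("  Turn   ON the TV "))
```

Under these assumptions, adding or removing a stage only changes the list passed to the constructor, which is the property that lets one catalog offer several pipeline variants.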

For each use case, choose the normalization pipeline that best fits the expected input.

For example, if numbers are expected to be spelled out in words (e.g., “one”), it is useful to include the CardinalityNormalizer stage, which turns them into digits (“1”).
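
The effect of such a stage can be sketched in a few lines of Python. This is not the CardinalityNormalizer implementation: the CARDINALS table and the normalize_cardinals function are hypothetical, cover only a handful of English words, and assume the text has already been lowercased and split on spaces.

```python
# Hypothetical, English-only lookup table used purely for illustration.
CARDINALS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "ten": "10",
}


def normalize_cardinals(text: str) -> str:
    """Replace each token found in CARDINALS with its digit form; keep the rest."""
    return " ".join(CARDINALS.get(token, token) for token in text.split())


print(normalize_cardinals("record channel one for two hours"))
```

A real stage would also need to handle compound numbers and other languages, which this toy lookup does not attempt.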

Another example is a use case that requires written requests. In that situation, it can be important to include a normalization stage that reduces transcription mistakes.

Select the normalization pipeline you need in the left menu. Each one is documented with its description and configuration.

Section       | Content                                                              | Role in the NLP process
Description   | Identification and objective of the stage in the recognition process | Descriptive purpose of the stage in the recognition process
Configuration | Required configuration for each NLP stage                            | Configuration of each stage of the NLP model

1 - Nabro

Nabro normalization pipeline

Description and stages

Nabro is a pipeline used for the normalization of the user’s utterance through the execution of the following normalizers:

  • PunctuationNormalizer
  • SplitPunctNormalizer
  • SpaceNormalizer
  • CurrencyNormalizer
  • UnicodeNormalizer
  • LowercaseNormalizer

Nabro normalization pipeline

Configuration

This pipeline requires the following configuration in the nlp.json configuration file:

For the specific language and channel, in the nlp field of this JSON file, set the key normalizer_pipeline_class to the value: auracog_pipelines.pipelines.normalization.nabro.NabroPipeline

{
  "es-es": {
    "mp": {
      "nlp": {
        "normalizer_pipeline_class": "auracog_pipelines.pipelines.normalization.nabro.NabroPipeline"
      }
    }
  }
}
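
The following Python sketch shows one way such a configuration could be read and its dotted class path resolved. How the platform actually loads nlp.json is not described here, so get_pipeline_class_path and import_from_path are hypothetical helpers; only the JSON layout and the class path come from this page.

```python
import importlib
import json


def get_pipeline_class_path(config: dict, language: str, channel: str) -> str:
    """Return the normalizer_pipeline_class value for a language and channel."""
    return config[language][channel]["nlp"]["normalizer_pipeline_class"]


def import_from_path(dotted_path: str):
    """Import 'package.module.ClassName' and return the class object."""
    module_name, _, attribute = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


raw = """
{
  "es-es": {
    "mp": {
      "nlp": {
        "normalizer_pipeline_class": "auracog_pipelines.pipelines.normalization.nabro.NabroPipeline"
      }
    }
  }
}
"""
config = json.loads(raw)
path = get_pipeline_class_path(config, "es-es", "mp")
print(path)

# The auracog_pipelines package may not be installed where this sketch runs,
# so the import helper is demonstrated with a standard-library class instead.
print(import_from_path("collections.OrderedDict"))
```

With the package installed, passing the configured path to import_from_path would be expected to return the pipeline class, although that step is an assumption and is not exercised here.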

2 - Narugo

Narugo normalization pipeline

Description and stages

Narugo is a pipeline used for the normalization of the user’s utterance through the execution of the following normalizers:

  • PunctuationNormalizer
  • SplitPunctNormalizer
  • SpaceNormalizer
  • CurrencyNormalizer
  • UnicodeNormalizer
  • LowercaseNormalizer
  • CardinalityNormalizer

Narugo normalization pipeline

Configuration

This pipeline requires the following configuration in the nlp.json configuration file:

For the specific language and channel, in the nlp field of this JSON file, set the key normalizer_pipeline_class to the value: auracog_pipelines.pipelines.normalization.narugo.NarugoPipeline

{
  "es-es": {
    "mp": {
      "nlp": {
        "normalizer_pipeline_class": "auracog_pipelines.pipelines.normalization.narugo.NarugoPipeline"
      }
    }
  }
}

3 - Naeba

Naeba normalization pipeline

Description and stages

Naeba is a pipeline used for the normalization of the user’s utterance through the execution of the following normalizers:

  • PunctuationNormalizer
  • SplitPunctNormalizer
  • SpaceNormalizer
  • CurrencyNormalizer
  • LowercaseNormalizer

Naeba normalization pipeline

Configuration

This pipeline requires the following configuration in the nlp.json configuration file:

For the specific language and channel, in the nlp field of this JSON file, set the key normalizer_pipeline_class to the value: auracog_pipelines.pipelines.normalization.naeba.NaebaPipeline

{
  "es-es": {
    "mp": {
      "nlp": {
        "normalizer_pipeline_class": "auracog_pipelines.pipelines.normalization.naeba.NaebaPipeline"
      }
    }
  }
}

4 - Nikko

Nikko normalization pipeline

Description and stages

Nikko is a pipeline used for the normalization of the user’s utterance through the execution of the following normalizers:

  • PunctuationNormalizer
  • SplitPunctNormalizer
  • SpaceNormalizer
  • CurrencyNormalizer
  • UnicodeNormalizer
  • LowercaseNormalizer
  • CardinalityNormalizer
  • PunctuationNormalizer
  • SpaceNormalizer

Nikko normalization pipeline

Configuration

This pipeline requires the following configuration in the nlp.json configuration file:

For the specific language and channel, in the nlp field of this JSON file, set the key normalizer_pipeline_class to the value: auracog_pipelines.pipelines.normalization.nikko.NikkoPipeline

{
  "es-es": {
    "mp": {
      "nlp": {
        "normalizer_pipeline_class": "auracog_pipelines.pipelines.normalization.nikko.NikkoPipeline"
      }
    }
  }
}

5 - Niseko

Niseko normalization pipeline

Description and stages

Niseko is a pipeline used for the normalization of the user’s utterance through the execution of the following normalizers:

  • PunctuationNormalizer
  • SplitPunctNormalizer
  • SpaceNormalizer
  • CurrencyNormalizer
  • UnicodeNormalizer
  • LowercaseNormalizer
  • CardinalityNormalizer
  • PunctuationNormalizer
  • SpaceNormalizer
  • StopWordsFromFileNormalizer
  • WordReplacerFromFileNormalizer

Niseko normalization pipeline

Configuration

This pipeline requires the following configuration in the nlp.json configuration file:

For the specific language and channel, in the nlp field of this JSON file, set the key normalizer_pipeline_class to the value: auracog_pipelines.pipelines.normalization.niseko.NisekoPipeline

{
  "es-es": {
    "mp": {
      "nlp": {
        "normalizer_pipeline_class": "auracog_pipelines.pipelines.normalization.niseko.NisekoPipeline"
      }
    }
  }
}

6 - Norikura

Norikura normalization pipeline

Description and stages

Norikura is a pipeline used for the normalization of the user’s utterance through the execution of the following normalizers:

  • PunctuationNormalizer
  • SplitPunctNormalizer
  • SpaceNormalizer
  • CurrencyNormalizer
  • UnicodeNormalizer
  • LowercaseNormalizer
  • StopWordsFromFileNormalizer
  • WordReplacerFromFileNormalizer

Norikura normalization pipeline

Configuration

This pipeline requires the following configuration in the nlp.json configuration file:

For the specific language and channel, in the nlp field of this JSON file, set the key normalizer_pipeline_class to the value: auracog_pipelines.pipelines.normalization.norikura.NorikuraPipeline

{
  "es-es": {
    "mp": {
      "nlp": {
        "normalizer_pipeline_class": "auracog_pipelines.pipelines.normalization.norikura.NorikuraPipeline"
      }
    }
  }
}

7 - Noro

Noro normalization pipeline

Description and stages

Noro is a pipeline used for the normalization of the user’s utterance through the execution of the following normalizers:

  • PunctuationNormalizer
  • SplitPunctNormalizer
  • SpaceNormalizer
  • CurrencyNormalizer
  • UnicodeNormalizer
  • LowercaseNormalizer
  • WordReplacerFromFileNormalizer
  • CardinalityNormalizer
  • PunctuationNormalizer
  • SpaceNormalizer

Noro normalization pipeline

Configuration

This pipeline requires the following configuration in the nlp.json configuration file:

For the specific language and channel, in the nlp field of this JSON file, set the key normalizer_pipeline_class to the value: auracog_pipelines.pipelines.normalization.noro.NoroPipeline

{
  "es-es": {
    "mp": {
      "nlp": {
        "normalizer_pipeline_class": "auracog_pipelines.pipelines.normalization.noro.NoroPipeline"
      }
    }
  }
}
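
To help compare the pipelines in this catalog, the stage lists above can be written down as data and queried. The sketch below is only a convenience for readers: PIPELINE_STAGES and pipelines_with are hypothetical names, not part of auracog_pipelines, and the lists simply transcribe the bullet lists on this page.

```python
# Stage lists transcribed from the catalog above (illustrative data, not an API).
PIPELINE_STAGES = {
    "Nabro": ["PunctuationNormalizer", "SplitPunctNormalizer", "SpaceNormalizer",
              "CurrencyNormalizer", "UnicodeNormalizer", "LowercaseNormalizer"],
    "Narugo": ["PunctuationNormalizer", "SplitPunctNormalizer", "SpaceNormalizer",
               "CurrencyNormalizer", "UnicodeNormalizer", "LowercaseNormalizer",
               "CardinalityNormalizer"],
    "Naeba": ["PunctuationNormalizer", "SplitPunctNormalizer", "SpaceNormalizer",
              "CurrencyNormalizer", "LowercaseNormalizer"],
    "Nikko": ["PunctuationNormalizer", "SplitPunctNormalizer", "SpaceNormalizer",
              "CurrencyNormalizer", "UnicodeNormalizer", "LowercaseNormalizer",
              "CardinalityNormalizer", "PunctuationNormalizer", "SpaceNormalizer"],
    "Niseko": ["PunctuationNormalizer", "SplitPunctNormalizer", "SpaceNormalizer",
               "CurrencyNormalizer", "UnicodeNormalizer", "LowercaseNormalizer",
               "CardinalityNormalizer", "PunctuationNormalizer", "SpaceNormalizer",
               "StopWordsFromFileNormalizer", "WordReplacerFromFileNormalizer"],
    "Norikura": ["PunctuationNormalizer", "SplitPunctNormalizer", "SpaceNormalizer",
                 "CurrencyNormalizer", "UnicodeNormalizer", "LowercaseNormalizer",
                 "StopWordsFromFileNormalizer", "WordReplacerFromFileNormalizer"],
    "Noro": ["PunctuationNormalizer", "SplitPunctNormalizer", "SpaceNormalizer",
             "CurrencyNormalizer", "UnicodeNormalizer", "LowercaseNormalizer",
             "WordReplacerFromFileNormalizer", "CardinalityNormalizer",
             "PunctuationNormalizer", "SpaceNormalizer"],
}


def pipelines_with(stage: str) -> list:
    """Return the names of the pipelines whose stage list contains `stage`."""
    return [name for name, stages in PIPELINE_STAGES.items() if stage in stages]


print(pipelines_with("CardinalityNormalizer"))
print(pipelines_with("StopWordsFromFileNormalizer"))
```

For instance, a use case that expects spelled-out numbers can start from the pipelines returned for CardinalityNormalizer and then narrow the choice by the other stages it needs.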