TV search plugin

Complex Logic Framework plugin for the TV search use case

Introduction

The TV search plugin resolves requests from users searching for video content.

It includes some key steps described in the following sections.

Validate input request

This step validates that the input request fulfills all the requirements from the input schema.

If the input data has an error, the returned resource is tv:video.model-validation.request.error.

Filter invalid entities

Before resolving the request, the TV search plugin runs the entity validation step listed below and prepares the data:

  1. Check that each season/episode entity contains a number. Otherwise, the entity is removed.
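
The check above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the entity type names ent.audiovisual_season and ent.audiovisual_episode and the dictionary shape (type, label, canon) are assumptions, since the text does not specify them.

```python
import re

# Hypothetical type names: the document does not name the season/episode entities.
NUMBERED_ENTITY_TYPES = {"ent.audiovisual_season", "ent.audiovisual_episode"}


def filter_invalid_entities(entities):
    """Drop season/episode entities whose value contains no digit."""
    valid = []
    for entity in entities:
        if entity.get("type") in NUMBERED_ENTITY_TYPES:
            # Assumed entity shape: a dict with "label" and "canon" keys.
            value = entity.get("label") or entity.get("canon") or ""
            if not re.search(r"\d", str(value)):
                continue  # no number found, so the entity is removed
        valid.append(entity)
    return valid
```

All other entity types pass through unchanged.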

The search for video content is based on the received entities.

The search stage consists of the following steps:

Build the search_query param

The search builds a search_query param using label values of entities to prioritize the searched video content. If the label value is null for an entity, then the canon value is used.

Entities used to search are grouped by:

  • Title entities: ent.audiovisual_documental_title, ent.audiovisual_film_title, ent.audiovisual_tvseries_title and ent.audiovisual_tvshows_title
  • Participant entities: ent.audiovisual_actor and ent.audiovisual_director
  • Genre entities: ent.audiovisual_genre and ent.audiovisual_subgenre

These entity groups are applied in the following priority order:

  1. If there are title entities and other entities, then the search_query parameter is built only using title entities.
    • For instance: title: ("Matrix" OR "La princesa prometida")
  2. If there are participant entities and other entities, then the search_query parameter is built only using participant entities.
    • All entities (ent.audiovisual_actor and ent.audiovisual_director) are used for fields actor and director.
    • For instance: actor: ("Tom Cruise" OR "Martin Scorsese") OR director: ("Tom Cruise" OR "Martin Scorsese")
  3. If there are genre entities and other entities, then the search_query parameter is built only using genre entities.
    • The ent.audiovisual_genre is used with the field content_category and the ent.audiovisual_subgenre with field genre.
    • For instance: content_category: ("MOV" OR "SER" OR "DOC") AND genre: ("TE" OR "AC")
  4. If none of the allowed entities is present, no search_query param is built.

With this prioritization, the system builds a search_query param that will be used to call the API video query for searching purposes.
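
The prioritization can be sketched in Python as shown below. It is an illustration under stated assumptions, not the plugin's implementation: entities are assumed to be dictionaries with type, label and canon keys, the function names are hypothetical, and emitting a single clause when only a genre or only a subgenre is present is a guess based on the example in step 3.

```python
TITLE_TYPES = (
    "ent.audiovisual_documental_title",
    "ent.audiovisual_film_title",
    "ent.audiovisual_tvseries_title",
    "ent.audiovisual_tvshows_title",
)
PARTICIPANT_TYPES = ("ent.audiovisual_actor", "ent.audiovisual_director")
GENRE_TYPES = ("ent.audiovisual_genre",)
SUBGENRE_TYPES = ("ent.audiovisual_subgenre",)


def _value(entity):
    # Prefer the label; fall back to the canon value when the label is null.
    label = entity.get("label")
    return label if label is not None else entity.get("canon")


def _values(entities, types):
    return [_value(e) for e in entities if e.get("type") in types]


def _any_of(values):
    return "(" + " OR ".join(f'"{v}"' for v in values) + ")"


def build_search_query(entities):
    """Return the search_query string, or None when no allowed entity exists."""
    titles = _values(entities, TITLE_TYPES)
    if titles:
        return f"title: {_any_of(titles)}"
    participants = _values(entities, PARTICIPANT_TYPES)
    if participants:
        # Actors and directors are searched together in both fields.
        group = _any_of(participants)
        return f"actor: {group} OR director: {group}"
    clauses = []
    genres = _values(entities, GENRE_TYPES)
    if genres:
        clauses.append(f"content_category: {_any_of(genres)}")
    subgenres = _values(entities, SUBGENRE_TYPES)
    if subgenres:
        clauses.append(f"genre: {_any_of(subgenres)}")
    return " AND ".join(clauses) if clauses else None
```

Checking the groups in a fixed order keeps the priority explicit: as soon as a higher-priority group has values, the lower-priority entities are ignored.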

API video query

After the search_query parameter is prepared, the plugin queries the Telefonica Kernel video API, searching through all the possible fields with the remaining words. This API call needs the following parameters:

  • user_id: Input data[aura_user][user_id]
  • administrative_number: Input data[app_context][user][account_number]
  • access_token: Input data[aura_user][access_token]
  • scopes: Input data[aura_user][scopes]
  • purposes: Input data[aura_user][purposes]
  • device_type: Input data[app_context][device][type]
  • catalog_types: List of allowed catalog types. The value can now be a list of catalog types provided by the device, as long as the values are one of the following predefined identifiers: VOD, LIVE, L7D, LCH, LSR.
  • search_query: String with a custom search query based on the received entities.
  • show_series: series.
  • profile: Input data[app_context][user][video_profile_name]. According to the video API documentation, this field is not implemented in some cases, and sending it can then return the error 501 Not Implemented. To prevent this, send the field only when the input data contains this value.
  • commercialization_types: List made by SVOD.
  • max_quality: Input data[app_context][device][max_quality]
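
The parameter list above can be assembled as in the following sketch. The function name and the decision to reject unknown catalog types with an exception are assumptions made for illustration; the document only states which values are allowed. The input data is assumed to be a nested dictionary with the keys listed above.

```python
ALLOWED_CATALOG_TYPES = {"VOD", "LIVE", "L7D", "LCH", "LSR"}


def build_video_query_params(input_data, search_query, catalog_types):
    """Collect the video API parameters from the plugin input data."""
    aura_user = input_data["aura_user"]
    user = input_data["app_context"]["user"]
    device = input_data["app_context"]["device"]

    # Assumption: unknown catalog types are rejected rather than silently dropped.
    unknown = sorted(set(catalog_types) - ALLOWED_CATALOG_TYPES)
    if unknown:
        raise ValueError(f"Unsupported catalog types: {unknown}")

    params = {
        "user_id": aura_user["user_id"],
        "administrative_number": user["account_number"],
        "access_token": aura_user["access_token"],
        "scopes": aura_user["scopes"],
        "purposes": aura_user["purposes"],
        "device_type": device["type"],
        "catalog_types": list(catalog_types),
        "search_query": search_query,
        "show_series": "series",
        "commercialization_types": ["SVOD"],
        "max_quality": device["max_quality"],
    }
    # Send profile only when the input data carries it, to avoid a 501 response.
    profile = user.get("video_profile_name")
    if profile:
        params["profile"] = profile
    return params
```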

Search response

A search can produce the following responses:

a. Receive an API error. The returned resource is tv:video.api.answer.error.

b. Receive no results at all. Then, it executes the contingency search if it isn’t disabled by configuration. If it is disabled by configuration, the returned resource is tv:video.search.no-results.

c. Receive a single result by searching for one title entity. The returned resource is tv:video.search.by-title.simple-result and params with the value of the title searched for.

d. Receive several results by searching for one title entity. The returned resource is tv:video.search.by-title.multiple-results and params with the value of the title searched for.

e. Receive a single result by searching for one actor/director entity. The returned resource is tv:video.search.by-participant.simple-result and params with the value of the participant searched for.

f. Receive several results by searching for one actor/director entity. The returned resource is tv:video.search.by-participant.multiple-results and params with the value of the participant searched for.

g. Receive a single result by searching for one genre entity. The returned resource is tv:video.search.by-genre.simple-result and params with the value of the genre searched for.

h. Receive several results by searching for one genre entity. The returned resource is tv:video.search.by-genre.multiple-results and params with the value of the genre searched for.

i. Receive a single result by searching for one genre and one subgenre entity. The returned resource is tv:video.search.by-subgenre.simple-result and params with the value of the genre and subgenre searched for.

j. Receive several results by searching for one genre and one subgenre entity. The returned resource is tv:video.search.by-subgenre.multiple-results and params with the value of the genre and subgenre searched for.

k. Receive a single result in all other cases. The returned resource is tv:video.search.by-default.simple-result.

l. Receive several results in all other cases. The returned resource is tv:video.search.by-default.multiple-results.
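
Scenarios c to l can be expressed as a single selection function, sketched below. The function name and the params keys (title, participant, genre, subgenre) are hypothetical, since the document does not give them. The sketch also assumes that a lower-priority group only counts when no higher-priority entity was used in the search, in line with how search_query is built.

```python
def select_result_resource(result_count, titles=(), participants=(),
                           genres=(), subgenres=()):
    """Pick the resource name and params for a search with at least one result."""
    if result_count < 1:
        raise ValueError("API errors and empty results are handled before this step")
    suffix = "simple-result" if result_count == 1 else "multiple-results"

    kind, params = "by-default", {}
    if titles:
        if len(titles) == 1:
            kind, params = "by-title", {"title": titles[0]}
    elif participants:
        if len(participants) == 1:
            kind, params = "by-participant", {"participant": participants[0]}
    elif len(genres) == 1 and len(subgenres) == 1:
        kind, params = "by-subgenre", {"genre": genres[0], "subgenre": subgenres[0]}
    elif len(genres) == 1 and not subgenres:
        kind, params = "by-genre", {"genre": genres[0]}
    return f"tv:video.search.{kind}.{suffix}", params
```

Any combination that is not covered by a specific case, such as two title entities, falls through to the by-default resources.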

The contingency search is a more in-depth search to get, at least, one response. This search uses the entire utterance in every search field.

Contingency search is based on the following steps:

Normalization section

A user request can contain words that carry no real value for Aura and that the system cannot use to obtain a response. For this reason, a list of forbidden words known as “ignore-words” is declared, and these words are removed from the utterance.

The first step is to transform the entire utterance received as input to lowercase, removing all non-alphanumeric characters and separators. Once the utterance has been normalized, the words that belong to the list of “ignore-words” are eliminated.
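
A minimal sketch of this normalization follows. It makes two assumptions that the text does not confirm: runs of non-alphanumeric characters are replaced by a single space so that words stay separated, and ignore-words are removed only where they match whole words. The function name is hypothetical.

```python
import re


def normalize_utterance(utterance, ignore_items):
    """Lowercase, strip non-alphanumeric characters, then drop ignore-words."""
    # Replace every run of non-alphanumeric characters with one space.
    text = re.sub(r"[\W_]+", " ", utterance.lower()).strip()
    for item in ignore_items:
        # Items are applied in list order and matched on word boundaries.
        pattern = r"\b" + re.escape(item) + r"\b"
        text = re.sub(pattern, " ", text)
    # Collapse the whitespace left behind by the removals.
    return " ".join(text.split())
```

Because the items are applied in list order, the same list in a different order can produce a different result.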

This list is declared in a resource file called normalizer_rules.json, located at the following path: src.aura_clf_video.resources.[language].normalizer_rules.json, where [language] is replaced by the language code, for example: es-es.

If the language does not have the normalizer rules defined, the default folder is used instead: src.aura_clf_video.resources.default.normalizer_rules.json

If, for example, the language is Spanish, the path to the resource where the normalization rules are defined will be: src.aura_clf_video.resources.es-es.normalizer_rules.json

  • Normalizer rules structure: a dictionary in which all the items are declared in a list, as shown below:

    {
      "ignore_items": [
        "ignore item 1",
        "ignore item 2"
      ]
    }
    
  • Validation of resource content
    Normalization is applied sequentially, in list order, so an earlier rule must not prevent a later one from matching.

    For example:

    • The utterance is “ok aura some”.
    • We define “aura” and also “ok aura” in the “ignore-words” list.
    • If we remove “aura” first from the original utterance, we obtain the following normalized utterance: “ok some”. In this case, “ok aura” does not have any effect.
    • In short, the correct order should be: first, remove “ok aura” and, after that, remove “aura”. In this case, we will obtain the final utterance as “some”.

“Ignore-words” will be automatically validated in order to prevent this behavior in every Pull Request.
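
One way such a validation could work is sketched below. The actual check that runs on Pull Requests is not described here, so this is only an assumption: it flags every pair in which an earlier item is contained, as whole words, in a later item, because the later item could then never match.

```python
import re


def find_order_conflicts(ignore_items):
    """Return (earlier, later) pairs where the earlier item hides the later one."""
    conflicts = []
    for index, earlier in enumerate(ignore_items):
        pattern = re.compile(r"\b" + re.escape(earlier) + r"\b")
        for later in ignore_items[index + 1:]:
            if pattern.search(later):
                conflicts.append((earlier, later))
    return conflicts
```

An empty result means that no containment conflict was found. Partially overlapping items are not detected by this minimal check.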

If after normalization the normalized phrase is empty, the resource returned is: tv:video.search.contingency.no-results

API video query

After normalizing, we will query the Telefonica Kernel video API by searching through all possible fields with the remaining words.

This API call needs the following parameters:

  • user_id: Input data[aura_user][user_id]
  • administrative_number: Input data[app_context][user][account_number]
  • access_token: Input data[aura_user][access_token]
  • scopes: Input data[aura_user][scopes]
  • purposes: Input data[aura_user][purposes]
  • device_type: Input data[app_context][device][type]
  • catalog_types: List of allowed catalog types. The value can now be a list of catalog types provided by the device, as long as the values are one of the following predefined identifiers: VOD, LIVE, L7D, LCH, LSR.
  • search_query: The words of the normalized phrase joined by the OR operator. This format searches across all fields. Example: for “La resistencia Shameless”, the search_query is “(La OR resistencia OR Shameless)”.
  • show_series: series.
  • profile: Input data[app_context][user][video_profile_name]. According to the video API documentation, this field is not implemented in some cases, and sending it can then return the error 501 Not Implemented. To prevent this, send the field only when the input data contains this value.
  • commercialization_types: List made by SVOD.
  • max_quality: Input data[app_context][device][max_quality]
  • current_region: Input data[app_context][location][currentRegion], if it exists. Otherwise, this param is not sent.
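
The two parameters specific to the contingency search, search_query and current_region, can be sketched as follows. Both function names are hypothetical, and the input data is assumed to be a nested dictionary.

```python
def build_contingency_query(normalized_phrase):
    """Join the words of the normalized phrase with the OR operator."""
    words = normalized_phrase.split()
    if not words:
        return None  # an empty phrase is answered before the API is called
    return "(" + " OR ".join(words) + ")"


def with_current_region(params, input_data):
    """Return a copy of params, adding current_region only when it exists."""
    location = (input_data.get("app_context") or {}).get("location") or {}
    region = location.get("currentRegion")
    result = dict(params)
    if region:
        result["current_region"] = region
    return result
```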

Contingency search response

When performing a contingency search, there are four possible scenarios for the response received by the search:

a. Receive an API error. The returned resource is tv:video.api.answer.error

b. Receive no results at all. The returned resource is tv:video.search.contingency.no-results

c. Receive a single result. The returned resource is tv:video.search.contingency.single-result

d. Receive several results. The returned resource is tv:video.search.contingency.multiple-results
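
The four scenarios map to resources as in this short sketch. The function name and the api_error flag are illustrative assumptions; how the plugin detects an API error is not described here.

```python
def select_contingency_resource(results, api_error=False):
    """Map the contingency search outcome to its resource name."""
    if api_error:
        return "tv:video.api.answer.error"
    if not results:
        return "tv:video.search.contingency.no-results"
    if len(results) == 1:
        return "tv:video.search.contingency.single-result"
    return "tv:video.search.contingency.multiple-results"
```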

Response

The response follows this response schema.

Where:

  • intent: input intent.

  • entities: input entities.

  • result_intent: This field is always MEDIA.SEARCH because it is the response associated with this domain.

  • resources: List of response resources that includes three main parameters:

  • payload: Information provided by the Kernel API in response to the search request, if a response is received. This field includes the following parameters:

    • type: The value of this field depends on the type of data included in the field data (info returned by the API):
      • If it is a value: details
      • If it is a list: content_list
    • data: It returns the information provided by the Kernel API.
  • status: Final status of the request. This field includes the following parameters:

    • code: Status code.
    • message: Status message, which describes the status code.
    • params: Parameter that sends details of status. This field does not appear if it is empty.
  • actions: Actions to follow with the result of the request.

  • conditions: Condition for the actions to be applied.
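
As an illustration of the payload and status rules, the sketch below builds one entry of the resources list. The function name is hypothetical, and nesting payload and status inside each resource is an assumption drawn from the field list above, not from the response schema itself.

```python
def build_resource(api_data, code, message, params=None):
    """Build one resource entry from the data returned by the Kernel API."""
    payload = {
        # A list of items is a content_list; a single value is details.
        "type": "content_list" if isinstance(api_data, list) else "details",
        "data": api_data,
    }
    status = {"code": code, "message": message}
    if params:
        # params is omitted from the status when it is empty.
        status["params"] = params
    return {"payload": payload, "status": status}
```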