Consider adding support for custom parsers of utterances #256

@rozele

Today, we expect the utterances JSON file to always be an array of utterances, with entities in one of two formats. One is the NLU.DevOps generic format:

[
   {
      "text": "order pizza",
      "intent": "OrderFood",
      "entities": [
        {
           "matchText": "pizza",
           "entityType": "FoodItem"
        }
      ]
   }
]

The other is the LUIS batch format:

[
   {
      "text": "order pizza",
      "intent": "OrderFood",
      "entities": [
        {
           "entity": "FoodItem",
           "startPos": 6,
           "endPos": 10
        }
      ]
   }
]
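To make the difference between the two shapes concrete, here is a minimal sketch of normalizing either one into the generic shape. The function names are hypothetical and not part of NLU.DevOps, and the sketch assumes `endPos` is an inclusive index, as the `6` to `10` span for "pizza" in the example suggests.

```python
import json


def normalize_entity(text, entity):
    """Return an entity in the generic shape, whichever format it arrived in."""
    if "matchText" in entity:
        # Already in the generic format.
        return {"matchText": entity["matchText"], "entityType": entity["entityType"]}
    # LUIS batch format: endPos is assumed inclusive, so add 1 for slicing.
    return {
        "matchText": text[entity["startPos"]:entity["endPos"] + 1],
        "entityType": entity["entity"],
    }


def normalize_utterance(utterance):
    """Normalize one labeled utterance from either format."""
    text = utterance["text"]
    return {
        "text": text,
        "intent": utterance["intent"],
        "entities": [normalize_entity(text, e) for e in utterance.get("entities", [])],
    }


generic_example = json.loads("""
[{"text": "order pizza", "intent": "OrderFood",
  "entities": [{"matchText": "pizza", "entityType": "FoodItem"}]}]
""")
batch_example = json.loads("""
[{"text": "order pizza", "intent": "OrderFood",
  "entities": [{"entity": "FoodItem", "startPos": 6, "endPos": 10}]}]
""")

# Both examples describe the same labeled utterance once normalized.
same = normalize_utterance(generic_example[0]) == normalize_utterance(batch_example[0])
```

Having to sniff the entity keys like this is the part that custom parsers would replace.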

I suspect we can simplify this, and make it possible to leverage other tooling (that is less likely to get out of sync), if we allow dependency injection of the utterance parser. One scenario I'd like to unblock: writing a simple script that takes a test utterance JSON file, sends the utterances off for prediction against LUIS, Lex, etc., and stores the unmodified results from the service directly in a JSON array.

That is, could we easily support a results file like this:

[
  {
    "query": "order pizza",
    "topScoringIntent": {
      "intent": "OrderFood",
      "score": 0.99999994
    },
    "entities": [
      {
        "entity": "pizza",
        "type": "FoodItem",
        "startIndex": 6,
        "endIndex": 10,
        "score": 0.973820746
      }
    ]
  }
]
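As a rough sketch of what a parser for this shape might do, the function below maps one raw response onto the generic shape. `parse_luis_response` is a hypothetical name. The sketch slices the match text out of `query` rather than trusting the `entity` field, which is a design choice on my part, and it assumes `endIndex` is inclusive, as the example suggests.

```python
def parse_luis_response(response):
    """Map one raw LUIS-style response (shape as in the example above) to the generic shape."""
    query = response["query"]
    return {
        "text": query,
        "intent": response["topScoringIntent"]["intent"],
        "entities": [
            {
                # Slice from the query so the match text is verbatim;
                # endIndex is assumed inclusive, hence the + 1.
                "matchText": query[e["startIndex"]:e["endIndex"] + 1],
                "entityType": e["type"],
            }
            for e in response.get("entities", [])
        ],
    }


raw = {
    "query": "order pizza",
    "topScoringIntent": {"intent": "OrderFood", "score": 0.99999994},
    "entities": [
        {"entity": "pizza", "type": "FoodItem",
         "startIndex": 6, "endIndex": 10, "score": 0.973820746}
    ],
}
parsed = parse_luis_response(raw)
```

Scores are dropped here because the generic shape has no place for them; a real parser might want to keep them.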

We could achieve this in a couple of ways.

Option 1: add flags to the compare command that specify which parser to inject:

dotnet nlu compare \
  --expected tests.json \
  --actual results.json \
  --expectedFormat luis-batch \
  --actualFormat luis-response
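Behind flags like these, the injection could be as simple as a registry keyed by format name. This is only a sketch of the idea in Python, not how NLU.DevOps is implemented: `PARSERS`, `register`, and `parse` are hypothetical names, and the two format names are taken from the command above.

```python
from typing import Callable, Dict, List

Parser = Callable[[List[dict]], List[dict]]
PARSERS: Dict[str, Parser] = {}  # hypothetical registry: format name -> parser


def register(name: str):
    """Decorator that registers a parser under a format name."""
    def decorator(fn: Parser) -> Parser:
        PARSERS[name] = fn
        return fn
    return decorator


@register("luis-batch")
def parse_luis_batch(items):
    # endPos is assumed inclusive, as in the batch example in this issue.
    return [
        {"text": u["text"], "intent": u["intent"],
         "entities": [{"entityType": e["entity"],
                       "matchText": u["text"][e["startPos"]:e["endPos"] + 1]}
                      for e in u.get("entities", [])]}
        for u in items
    ]


@register("luis-response")
def parse_luis_response(items):
    # endIndex is assumed inclusive, as in the response example in this issue.
    return [
        {"text": r["query"], "intent": r["topScoringIntent"]["intent"],
         "entities": [{"entityType": e["type"],
                       "matchText": r["query"][e["startIndex"]:e["endIndex"] + 1]}
                      for e in r.get("entities", [])]}
        for r in items
    ]


def parse(items: List[dict], fmt: str) -> List[dict]:
    """Look up the parser selected by a format flag and apply it."""
    try:
        parser = PARSERS[fmt]
    except KeyError:
        raise ValueError(f"unknown format {fmt!r}; known: {sorted(PARSERS)}") from None
    return parser(items)


expected = parse(
    [{"text": "order pizza", "intent": "OrderFood",
      "entities": [{"entity": "FoodItem", "startPos": 6, "endPos": 10}]}],
    "luis-batch",
)
actual = parse(
    [{"query": "order pizza",
      "topScoringIntent": {"intent": "OrderFood", "score": 0.99999994},
      "entities": [{"entity": "pizza", "type": "FoodItem",
                    "startIndex": 6, "endIndex": 10, "score": 0.973820746}]}],
    "luis-response",
)
```

With both files parsed into the same shape, the comparison itself does not need to know which format either file started in.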

Option 2: add an optional envelope to the utterances JSON file:

{
  "format": "luis-response",
  "utterances": [
    {
      "query": "order pizza",
      "topScoringIntent": {
        "intent": "OrderFood",
        "score": 0.99999994
      },
      "entities": [
        {
          "entity": "pizza",
          "type": "FoodItem",
          "startIndex": 6,
          "endIndex": 10,
          "score": 0.973820746
        }
      ]
    }
  ]
}
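Because the envelope is optional, the reader would need to accept both a bare array and an enveloped object. A minimal sketch of that detection, assuming a bare array falls back to a default format (the `"generic"` name and `read_utterances` are hypothetical):

```python
import json

DEFAULT_FORMAT = "generic"  # hypothetical name for today's default parsing


def read_utterances(raw: str):
    """Return (format_name, utterances) for a bare array or an enveloped file."""
    data = json.loads(raw)
    if isinstance(data, list):
        # Existing files: a bare array, parsed as they are today.
        return DEFAULT_FORMAT, data
    if isinstance(data, dict) and isinstance(data.get("utterances"), list):
        # Enveloped file: "format" selects the parser, falling back to the default.
        return data.get("format", DEFAULT_FORMAT), data["utterances"]
    raise ValueError("expected a JSON array or an object with an 'utterances' array")


bare = '[{"text": "order pizza", "intent": "OrderFood", "entities": []}]'
enveloped = '{"format": "luis-response", "utterances": [{"query": "order pizza"}]}'

bare_format, bare_items = read_utterances(bare)
env_format, env_items = read_utterances(enveloped)
```

One trade-off compared with Option 1: the envelope keeps the format next to the data, but the file is then no longer the unmodified array of service responses.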
