Description
We want users, in this case most likely myself and any other developers, to benchmark an NLU data set for entity extraction and be able to refine those entities to improve the data set.
Making sure the “human in the for loop” flow works comes down to refining the entities; however, there will be improvements that need to be made along the way, because they block the refinement process. This is a dummy ticket to collect such minor code fixes.
User stories
As a user, I want to
see visual analytics of the entities so that I know what needs to be improved
review all the incorrect entities so that I can fix them
Sounds easy, right? Especially since we have already built the intent refinement workflow. It is, however, a bit more complex than that.
With intents, we could visualize all domains, see where the intents performed worst, pick those domains, and then review all the incorrectly classified intents in those domains for refinement. With entities, it is a bit trickier.
We'll do the same with entities, but we need to group the entities together by domain, and there will be overlap: some utterances have more than one entity type, so we have to keep track of that. Furthermore, when an utterance has several entities, do we tell the user to refine all of them, or to ignore the extras? It would be super annoying to have to go back over the same utterances two or more times! This is why users should work on all of an utterance's entities at the same time. That is harder for the user, as they must know whether each entity is correct and, if not, what it should be.
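As a rough sketch of the bookkeeping this implies (the column names `domain`, `utterance`, `entity_type`, and `correct` are assumptions for illustration, not the real schema):

```python
import pandas as pd

# Hypothetical flat benchmark output: one row per (utterance, entity) pair.
results = pd.DataFrame({
    "domain": ["alarm", "alarm", "alarm", "music"],
    "utterance": ["wake me at 7 am", "wake me at 7 am", "set an alarm", "play jazz"],
    "entity_type": ["time", "date", "time", "genre"],
    "correct": [False, True, True, False],
})

# Group by utterance so each utterance is reviewed exactly once,
# even when it carries several entity types.
per_utterance = (
    results.groupby(["domain", "utterance"])
    .agg(entity_types=("entity_type", list),
         all_correct=("correct", "all"))
    .reset_index()
)

# Only utterances with at least one wrong entity need human review.
to_review = per_utterance[~per_utterance["all_correct"]]
print(to_review)
```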
Ergo, it is better for a user to review incorrect entries in batches: they get an overview for the domain, with example entries where the entities are correct, and then go through correcting no more than 100 entries at a time.
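A minimal sketch of the batching, assuming a hypothetical `make_batches` helper (not an existing function in the codebase):

```python
import pandas as pd

def make_batches(df: pd.DataFrame, batch_size: int = 100) -> list[pd.DataFrame]:
    """Split a domain's review rows into batches of at most batch_size rows."""
    return [df.iloc[i:i + batch_size] for i in range(0, len(df), batch_size)]

# e.g. 250 flagged utterances for a domain become batches of 100, 100 and 50.
flagged = pd.DataFrame({"utterance": [f"utt {i}" for i in range(250)]})
print([len(b) for b in make_batches(flagged)])  # [100, 100, 50]
```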
This means, however, that we will have to adapt our flow from the intent refinement. With intent refinement, we recorded into CSVs by domain and intent. Here we will record by domain in batches, then merge those batches into one CSV for the whole domain. If the user is lucky, they will only have to do one batch per domain.
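The save-and-merge flow could look something like the sketch below; the `refinement/<domain>/` directory layout and file names are assumptions made for illustration:

```python
from pathlib import Path
import pandas as pd

out_dir = Path("refinement/alarm")  # hypothetical layout: one folder per domain
out_dir.mkdir(parents=True, exist_ok=True)

# Write one CSV per reviewed batch (batches as produced by the sketch above).
batches = [pd.DataFrame({"utterance": ["wake me at 7 am"], "entity_type": ["time"]})]
for i, batch in enumerate(batches):
    batch.to_csv(out_dir / f"batch_{i:03d}.csv", index=False)

# Merge all batch CSVs into a single CSV for the domain ...
domain_df = pd.concat(
    (pd.read_csv(p) for p in sorted(out_dir.glob("batch_*.csv"))),
    ignore_index=True,
)
domain_df.to_csv(out_dir / "alarm_refined.csv", index=False)

# ... then merge the per-domain CSVs into one for the whole data set.
dataset_df = pd.concat(
    (pd.read_csv(p) for p in sorted(Path("refinement").glob("*/*_refined.csv"))),
    ignore_index=True,
)
dataset_df.to_csv(Path("refinement") / "all_domains_refined.csv", index=False)
```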
DoD
benchmark entities over whole data set
graph analysis of entities for the whole data set
benchmark entities per domain
graphs of entities per domain
add incorrect_entities_report to macro_entities_refinement.py
ipysheet refinement of a batch in the domain (see the sketch after this list)
save to a CSV of batches
merge with CSV for the whole domain
merge with the CSV for the whole data set
benchmark again
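For the ipysheet step, a minimal sketch of what a batch-editing cell might look like in the refinement notebook; the file path and column contents are illustrative assumptions:

```python
from IPython.display import display
import ipysheet
import pandas as pd

# Load one batch of utterances flagged for review (hypothetical path).
batch = pd.read_csv("refinement/alarm/batch_000.csv")

# Render the batch as an editable grid; the reviewer fixes entity labels
# directly in the cells.
sheet = ipysheet.from_dataframe(batch)
display(sheet)

# After the reviewer edits the grid, pull the corrected values back out
# and overwrite the batch CSV.
corrected = ipysheet.to_dataframe(sheet)
corrected.to_csv("refinement/alarm/batch_000.csv", index=False)
```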