Add batch analyze API support for recognizer #1506

Open · jimmyxie-figma opened this issue Jan 6, 2025 · 6 comments

jimmyxie-figma commented Jan 6, 2025

Is your feature request related to a problem? Please describe.

Currently, the BatchAnalyzerEngine works by iterating through a list or dictionary and analyzing (and anonymizing) the values one by one. While this is not an issue for the predefined recognizers, and the built-in NLP engine has improvements to support batch inference, it does pose an efficiency problem for transformer recognizers, leaving resources idle and inference throughput low.
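For context, a minimal sketch of the current flow (the texts are illustrative):

```python
from presidio_analyzer import AnalyzerEngine, BatchAnalyzerEngine

analyzer = AnalyzerEngine()
batch_analyzer = BatchAnalyzerEngine(analyzer_engine=analyzer)

texts = ["My name is John Smith", "Call me at 212-555-0199"]

# Each text is analyzed one by one: every recognizer (including any
# transformer-based one) runs per text, so the model never sees more
# than a single input at a time.
results = batch_analyzer.analyze_iterator(texts, language="en")
```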

Describe the solution you'd like

We want to build batch inference API support for recognizers. Early testing shows that even with a small batch size of 4, a BERT-like transformer speeds up inference by 3x without any additional resource or memory usage.

The exact implementation is still up for discussion. One potential solution would be adding a batch recognizer mix-in, where we batch-analyze the batch recognizers first and pass the results to the regular analyze for extension, similar to nlp_engine.process_batch.
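A rough sketch of what such a mix-in could look like (BatchRecognizerMixin and analyze_batch are hypothetical names, not part of Presidio today):

```python
from typing import List

from presidio_analyzer import RecognizerResult
from presidio_analyzer.nlp_engine import NlpArtifacts


class BatchRecognizerMixin:
    """Hypothetical mix-in marking an EntityRecognizer as batch-capable."""

    def analyze_batch(
        self,
        texts: List[str],
        entities: List[str],
        nlp_artifacts_batch: List[NlpArtifacts],
    ) -> List[List[RecognizerResult]]:
        # Default: fall back to one-by-one analysis via the recognizer's
        # existing analyze(). A transformer recognizer would override this
        # with true batched inference, and the engine would merge these
        # results with the output of the regular per-text analyze pass.
        return [
            self.analyze(text, entities, artifacts)
            for text, artifacts in zip(texts, nlp_artifacts_batch)
        ]
```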

Describe alternatives you've considered
N/A

Additional context
N/A

omri374 (Contributor) commented Jan 7, 2025

Thanks for the suggestion! One of the reasons we chose to use spacy-huggingface-pipelines is its batch support for transformer-based models. Have you experimented with this option? In general, I agree that exposing an analyze_batch option for recognizers is a good idea, with the BatchAnalyzerEngine calling analyze_batch instead of analyze for each recognizer.

jimmyxie-figma (Author) commented Jan 7, 2025

@omri374 thanks for the quick reply. I would imagine the transformers_nlp_engine does the same thing as well. Never mind, I see from the comments that it's just a wrapper around that package.

The problem is that we are currently using two transformer models (and not for the purpose of multi-language support), so that approach wouldn’t work. Another way forward for us would be to fine-tune our own models and consolidate to one.

jimmyxie-figma (Author) commented

Alternatively, we could support multiple NLP engines and extend the NLP artifacts. If we decide to provide better support for batching, my team is happy to help with the PR.

omri374 (Contributor) commented Jan 8, 2025

Will the option of having a batch_analyze for recognizers be useful in your case? If yes, a PR would be great. Essentially, we could add a batch_analyze method to the EntityRecognizer base class, and have it iterate through texts and pass them to the analyze method. In specific cases, like the transformers recognizer, we could override this method with a different implementation for batch mode. WDYT?
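As an illustration, a transformers-based recognizer could override such a batch_analyze like this (the method name, the batch size, and the entity-label mapping are assumptions for discussion; the Hugging Face pipeline batching itself is real):

```python
from typing import List, Optional

from presidio_analyzer import EntityRecognizer, RecognizerResult
from presidio_analyzer.nlp_engine import NlpArtifacts
from transformers import pipeline


class BatchedTransformersRecognizer(EntityRecognizer):
    """Sketch of a recognizer whose batch_analyze does real batched inference."""

    def __init__(self, model_name: str, supported_entities: List[str]):
        super().__init__(supported_entities=supported_entities)
        self.ner_pipeline = pipeline(
            "token-classification", model=model_name, aggregation_strategy="simple"
        )

    def load(self) -> None:
        pass  # model already loaded in __init__

    def analyze(self, text, entities, nlp_artifacts=None):
        # Single-text path simply reuses the batch path.
        return self.batch_analyze([text], entities)[0]

    def batch_analyze(
        self,
        texts: List[str],
        entities: List[str],
        nlp_artifacts_batch: Optional[List[NlpArtifacts]] = None,
    ) -> List[List[RecognizerResult]]:
        # The pipeline batches internally; this is where the ~3x speedup
        # mentioned above would come from.
        outputs = self.ner_pipeline(texts, batch_size=4)
        return [
            [
                RecognizerResult(
                    entity_type=span["entity_group"],
                    start=span["start"],
                    end=span["end"],
                    score=float(span["score"]),
                )
                for span in text_output
                if span["entity_group"] in entities
            ]
            for text_output in outputs
        ]
```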

jimmyxie-figma (Author) commented

@omri374 The issue is that BatchAnalyzer is just a thin wrapper around the regular Analyzer; the magic happens in the Analyzer.analyze function. Adding batch_analyze to the EntityRecognizer would be a good starting point, but I feel like a deeper refactor might be needed between the two analyzers.

I’ll try to cut a PR to add the batch analyze API to the EntityRecognizers. In our codebase, we’ll create the following (see the sketch after this list):

  • A CustomAnalyzer inheriting from the regular Analyzer, adding a batch API
  • A CustomBatchAnalyzer with the batch iteration function overridden to use the new batch API
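A minimal sketch of that plan, assuming a batch API shaped like the batch_analyze discussed above (all class and method names here are hypothetical, not Presidio's):

```python
from typing import Iterable, List

from presidio_analyzer import AnalyzerEngine, BatchAnalyzerEngine, RecognizerResult


class CustomAnalyzer(AnalyzerEngine):
    """Regular Analyzer plus a batch entry point."""

    def analyze_batch(
        self, texts: List[str], language: str
    ) -> List[List[RecognizerResult]]:
        # Naive placeholder that delegates to the single-text path; the
        # real version would route batch-capable recognizers through
        # their batch_analyze and only fall back per text for the rest.
        return [self.analyze(text=text, language=language) for text in texts]


class CustomBatchAnalyzer(BatchAnalyzerEngine):
    """BatchAnalyzerEngine whose iteration uses the batch API above."""

    def analyze_iterator(
        self, texts: Iterable[str], language: str, **kwargs
    ) -> List[List[RecognizerResult]]:
        return self.analyzer_engine.analyze_batch(
            [str(t) for t in texts], language=language
        )
```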

omri374 (Contributor) commented Jan 15, 2025

In the BatchAnalyzerEngine, there's a separation between the NLP engine phase and the recognizers phase. I wonder if we can call process_batch in the NLP engine, and then do a batch run through the recognizers, with similar logic/configuration to AnalyzerEngine.analyze. This way we wouldn't need the CustomAnalyzer, and we'd just update the code in BatchAnalyzerEngine.
So this line:

```python
results = self.analyzer_engine.analyze(
```

would be replaced with a call to self.run_recognizers_in_batch or something like that, with all the configuration coming from self.analyzer_engine.

I must say I haven't thought about this deeply enough to know if it's viable.
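A sketch of how that two-phase flow could look, with run_recognizers_in_batch written out as the hypothetical method from the comment above (the registry and recognizer calls follow the current public attributes, but treat the details as a starting point rather than the final design):

```python
from typing import Iterable, List

from presidio_analyzer import BatchAnalyzerEngine, RecognizerResult


class BatchAnalyzerEngineSketch(BatchAnalyzerEngine):
    """Proposed flow: one batched NLP pass, then a batched recognizer pass."""

    def analyze_iterator(
        self, texts: Iterable[str], language: str, **kwargs
    ) -> List[List[RecognizerResult]]:
        texts = [str(t) for t in texts]

        # Phase 1: batch NLP processing (process_batch already exists).
        nlp_artifacts_batch = [
            artifacts
            for _, artifacts in self.analyzer_engine.nlp_engine.process_batch(
                texts=texts, language=language
            )
        ]

        # Phase 2: replaces the per-text analyzer_engine.analyze() call.
        return self.run_recognizers_in_batch(texts, nlp_artifacts_batch, language)

    def run_recognizers_in_batch(
        self,
        texts: List[str],
        nlp_artifacts_batch: List,
        language: str,
    ) -> List[List[RecognizerResult]]:
        recognizers = self.analyzer_engine.registry.get_recognizers(
            language=language, all_fields=True
        )
        results: List[List[RecognizerResult]] = []
        for text, artifacts in zip(texts, nlp_artifacts_batch):
            text_results: List[RecognizerResult] = []
            for recognizer in recognizers:
                # A batch-capable recognizer would receive the whole list
                # here instead of being called once per text.
                text_results.extend(
                    recognizer.analyze(text, recognizer.supported_entities, artifacts)
                    or []
                )
            results.append(text_results)
        return results
```

Note this skips the post-processing (deduplication, score threshold, context enhancement) that AnalyzerEngine.analyze applies, which is part of the "similar logic/configuration" that would need to be carried over.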
