
WIP: Update model at runtime #21

Closed
wants to merge 3 commits into from

Conversation

bakwc
Owner

@bakwc bakwc commented Apr 13, 2018

Ability to add text fragments to a model at runtime; relates to #18

@bakwc bakwc added enhancement WIP Work In Progress labels Apr 13, 2018
@iosadchiy

A very useful feature. Is there any chance to see it merged?

Also, I don't quite understand how it updates the perfect hash: if the text fragment to be added contains an n-gram that is not present in the hash, how do you add it? Don't you have to re-calculate the perfect hash from scratch?

@bakwc
Owner Author

bakwc commented Jun 11, 2018

A very useful feature. Is there any chance to see it merged?

I'll try to continue working on this, but I'm not sure when.

Also, I don't quite understand how it updates the perfect hash: if the text fragment to be added contains an n-gram that is not present in the hash, how do you add it? Don't you have to re-calculate the perfect hash from scratch?

I planned to add a separate hash table for additional words. It would only be suitable for adding a small number of new words / sentences at runtime, for example in text editors where you want to add some exceptions / ignore some errors.
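The separate-table idea described above can be sketched roughly as follows (a minimal illustration with hypothetical names, not the PR's actual code; JamSpell itself is C++, and a plain dict stands in for the perfect-hash table built at training time):

```python
class OverlayModel:
    """Immutable base table plus a small mutable overlay for runtime additions."""

    def __init__(self, base_freqs):
        # base_freqs stands in for the perfect-hash table built at training
        # time; it is never modified after construction.
        self._base = dict(base_freqs)
        self._overlay = {}  # runtime additions live here

    def add_fragment(self, text):
        # Count units from the new fragment into the overlay only
        # (single words here, for simplicity; real models use n-grams).
        for word in text.lower().split():
            self._overlay[word] = self._overlay.get(word, 0) + 1

    def frequency(self, word):
        # Overlay counts are added on top of the base frequency, so words
        # unknown to the perfect hash still get a nonzero count.
        return self._base.get(word, 0) + self._overlay.get(word, 0)
```

The perfect hash stays untouched, which is why this only works for a small volume of runtime additions: the overlay is an ordinary mutable table that grows with every added fragment.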

@iosadchiy

Actually, what I'm trying to do is to first create the model based on n-gram frequencies (#31) and then update it with some domain-specific texts. What could be the best approach here?

It seems like the functionality from this PR is not the best choice (due to the use of an additional hash table).

The other approach is to store all the n-grams in the model (without loading them into memory). This should allow re-training the model on additional data.

Maybe you'll suggest something better?

@bakwc
Owner Author

bakwc commented Jun 16, 2018

The other approach is to store all the ngrams in the model (without loading them into memory). This should allow to re-train the model on additional data.

It could lead to performance issues if we go to disk every time we need to get a frequency.

@Jbiloki

Jbiloki commented Apr 10, 2019

Is there a main function to invoke these commands?

@rprilepskiy

@bakwc, do you (by any chance) have plans to solve the conflicts?

@bakwc
Owner Author

bakwc commented Jan 27, 2020

Sorry, I currently have no time :(

@bakwc
Owner Author

bakwc commented Oct 1, 2020

Done in Pro version.

JamSpellPro is available at jamspell.com

@bakwc bakwc closed this Oct 1, 2020
4 participants