Fulltext and suggestion handling of multiple languages #734
Comments
@mgautierfr I’m quite interested in this proposal because it might also solve the problem of a ZIM file with articles in different languages. Could you please elaborate on how this could work, for both indexing and search? |
For now, the language is a property of the whole database. At search time, I see different strategies:
|
@mgautierfr What you propose seems appropriate to me. But:
|
Yes and no.
Yes, we can store at the database level a list of all the articles' languages, and so know which languages to build the query in. |
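To make the idea of building the query in every language listed at the database level more concrete, here is a minimal sketch in plain Python. It is not libzim or Xapian code: the stopword lists, the suffix-stripping "stemmers", and the helper names (`analyze`, `build_query`, `matches`) are all hypothetical stand-ins chosen only for illustration. Each document is assumed to be indexed with its own language's rules, and the query is analyzed once per language and the sub-queries are OR-ed together.

```python
# Toy illustration only: these stopword lists and suffix-stripping "stemmers"
# are hypothetical stand-ins, not libzim or Xapian behaviour.
STOPWORDS = {
    "en": {"the", "of", "and", "a"},
    "fr": {"le", "la", "les", "de", "des", "et"},
}
SUFFIXES = {
    "en": ("ing", "es", "s"),
    "fr": ("ement", "es", "s"),
}


def analyze(text, lang):
    """Lowercase, drop the language's stopwords, strip at most one suffix."""
    terms = []
    for word in text.lower().split():
        if word in STOPWORDS[lang]:
            continue
        for suffix in SUFFIXES[lang]:
            # Only strip when a stem of at least three letters remains.
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                word = word[: -len(suffix)]
                break
        terms.append(word)
    return terms


def build_query(text, db_languages):
    """Analyze the query once per language stored at the database level."""
    return {lang: analyze(text, lang) for lang in db_languages}


def matches(query, doc_terms):
    """OR semantics: match if all terms of at least one sub-query are present."""
    return any(terms and set(terms) <= doc_terms for terms in query.values())
```

With a French article indexed as `set(analyze("le chat de Paris", "fr"))`, the query `build_query("les chats de Paris", ["en", "fr"])` matches through its French sub-query, while the same query built for `["en"]` alone does not, because the English rules keep "les" and "de" as search terms.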
This ticket is a follow-up to kiwix/libkiwix#785
The current libzim search features do not work well with content in different languages, whether that content is in the same ZIM file or not. A search can basically apply only one language strategy (so only one stemmer and only one stopword list).
As a consequence, the multi-ZIM search/suggestion feature is limited to one language, which is annoying under certain circumstances.
We had a first short list of approaches to go forward on this at kiwix/libkiwix#785 (comment).