Add quoted search functionality on browse sources page #1737
base: develop
…into issue-1632
r'"(.*?)"', general_str
) # Extract terms in quotes
unquoted_terms = re.findall(
r"\b[\w,-.]+\b", re.sub(r'"(.*?)"', "", general_str)
You could also do this with one regex with look-aheads/look-behinds:
(?<!\")\b[\w,-.]+\b(?!\")
Probably clearer what's happening here with the way you have it now...
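For reference, a minimal sketch comparing the two approaches outside of Django. The helper names `split_terms` and `single_pass_terms` are hypothetical and not part of this PR; only the regexes are taken from the diff and the suggestion above. It seems worth checking how the single-regex version behaves on quoted phrases of three or more words, since an interior word has no adjacent quote for the look-arounds to see:

```python
import re

QUOTED = r'"(.*?)"'                        # pattern from the diff
TERM = r"\b[\w,-.]+\b"                     # pattern from the diff
SINGLE_PASS = r'(?<!")\b[\w,-.]+\b(?!")'   # pattern from the review suggestion


def split_terms(general_str):
    """Two-step approach: pull out quoted phrases, then tokenize what is left."""
    quoted_terms = re.findall(QUOTED, general_str)
    unquoted_terms = re.findall(TERM, re.sub(QUOTED, "", general_str))
    return quoted_terms, unquoted_terms


def single_pass_terms(general_str):
    """Look-around approach: skip words that directly touch a double quote."""
    return re.findall(SINGLE_PASS, general_str)


# Two-word phrase: both approaches agree on the unquoted terms.
print(split_terms('"North Vancouver" library MS-12'))
print(single_pass_terms('"North Vancouver" library MS-12'))

# Three-word phrase: the interior word "of" is not adjacent to a quote.
print(split_terms('"City of Vancouver" library'))
print(single_pass_terms('"City of Vancouver" library'))
```

With a three-word phrase, the look-around version also returns the interior word as an unquoted term, which is another argument for keeping the two-step version.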
# Add quoted terms to the Q object with exact matching (iexact)
for term in quoted_terms:
holding_institution_q |= Q(
holding_institution__name__unaccent__iexact=term
I don't think iexact is correct here...
If someone searches for "North Vancouver", don't I want any results where "North Vancouver" shows up somewhere in one of these fields, not just results where a field is exactly "North Vancouver"?
I think contains is probably still the method; it's just that we're not searching single words, but sometimes groups of words, depending on the quotes.
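To make the difference concrete, here is a rough pure-Python emulation of the two lookups. It is only an illustration under stated assumptions: `unaccent`, `iexact`, and `icontains` below are hypothetical stand-ins built on `unicodedata`, not the Django lookups or the PostgreSQL `unaccent` extension, and the field values are made up.

```python
import unicodedata


def unaccent(text):
    """Approximate accent stripping: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def iexact(field_value, term):
    """Stand-in for unaccent + iexact: the whole field must equal the term."""
    return unaccent(field_value).casefold() == unaccent(term).casefold()


def icontains(field_value, term):
    """Stand-in for unaccent + icontains: the term may appear anywhere."""
    return unaccent(term).casefold() in unaccent(field_value).casefold()


# Made-up field value, for illustration only.
name = "North Vancouver Museum and Archives"

print(iexact(name, "North Vancouver"))     # whole-field comparison
print(icontains(name, "North Vancouver"))  # phrase found inside the field
print(icontains("Bibliothèque nationale", "bibliotheque"))  # accents ignored
```

Under this emulation, the quoted phrase only matches the longer institution name with the contains-style check, which is the behaviour described in the comment above.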
Fixed some bugs in the search functionality on the Browse Sources page by adding support for quoted phrase matching, allowing exact matches for phrases (e.g., "North Vancouver").
Enabled unaccented search terms to match fields with or without accents, applying both partial (icontains) and exact (iexact) matching logic across key fields like institution name, city, shelfmark, and more.
Updated the query logic to handle quoted and unquoted terms separately using regex for better granularity and consistency.
Added tests for these changes.
Resolves #1632. We should proceed with #435 as part of a more robust fix for these types of issues.