Deprecate tokenize parameter in field definition #602
Merged
These are some proposed changes for the `1.0.0-SNAPSHOT` branch around the handling of tokenization of text-based fields.

- The `tokenize` parameter in the field definition is deprecated and no longer used.
- `TEXT` type fields will always be tokenized.
- `ATOM` type fields will never be tokenized.
- `ATOM` norms are always omitted, which is the default in ES. Ideally, we would want to make this configurable, but the gRPC type does not allow for detecting when this is unset (and `TEXT` fields need the opposite default). Since this is not a commonly used option, I think hard coding the value is ok for now.
- `ATOM` fields are no longer indexed for search when the `search` property is false.

Additionally:

- Uses of the `tokenize` parameter have been removed.
- The `search` parameter has been added to `ATOM` fields storing `doc_id` values in schemas, as they are no longer searchable without it. It would be good to switch these to `_ID` fields at some point, but that is beyond the scope of this branch.
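For reviewers, the rules above can be summarized in a small sketch. This is not the code in this branch: `FieldDef`, `AtomIndexOptions`, `is_tokenized`, and `atom_index_options` are hypothetical names used only for illustration, and the `search=False` default is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class FieldType(Enum):
    TEXT = "TEXT"
    ATOM = "ATOM"


@dataclass(frozen=True)
class FieldDef:
    """Hypothetical stand-in for the field definition (not the real gRPC type)."""
    name: str
    type: FieldType
    search: bool = False    # assumed default, for illustration only
    tokenize: bool = False  # deprecated: still accepted, but never consulted


@dataclass(frozen=True)
class AtomIndexOptions:
    indexed: bool
    tokenized: bool
    omit_norms: bool


def is_tokenized(field: FieldDef) -> bool:
    # Tokenization now depends only on the field type:
    # TEXT is always tokenized, ATOM never is. `field.tokenize` is ignored.
    return field.type is FieldType.TEXT


def atom_index_options(field: FieldDef) -> AtomIndexOptions:
    if field.type is not FieldType.ATOM:
        raise ValueError(f"{field.name} is not an ATOM field")
    return AtomIndexOptions(
        indexed=field.search,  # only indexed for search when `search` is true
        tokenized=False,       # ATOM is never tokenized
        omit_norms=True,       # hard coded, as described above
    )
```

Under these assumptions, an `ATOM` field holding `doc_id` values is only searchable when its definition sets `search`, which is why the schemas were updated:

```python
doc_id = FieldDef(name="doc_id", type=FieldType.ATOM, search=True)
options = atom_index_options(doc_id)
```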