WIP: Support constraints on distance column in KNN queries, for pagination and range queries #166

Status: Open (1 commit, base: main)

@asg017 asg017 commented Jan 11, 2025

refs #165

.load dist/vec0
create virtual table vec_items using vec0(
  vector float[1]
);

insert into vec_items(rowid, vector)
  select value, json_array(value) from generate_series(1, 100);

select vec_to_json(vector), distance
from vec_items
where vector match '[1]'
  and k = 5;

/*
┌─────────────────────┬──────────┐
│ vec_to_json(vector) │ distance │
├─────────────────────┼──────────┤
│ '[1.000000]'        │ 0.0      │
│ '[2.000000]'        │ 1.0      │
│ '[3.000000]'        │ 2.0      │
│ '[4.000000]'        │ 3.0      │
│ '[5.000000]'        │ 4.0      │
└─────────────────────┴──────────┘
*/


select vec_to_json(vector), distance
from vec_items
where vector match '[1]'
  and k = 5
  -- the new magic
  and distance > 4.0;

/*
┌─────────────────────┬──────────┐
│ vec_to_json(vector) │ distance │
├─────────────────────┼──────────┤
│ '[6.000000]'        │ 5.0      │
│ '[7.000000]'        │ 6.0      │
│ '[8.000000]'        │ 7.0      │
│ '[9.000000]'        │ 8.0      │
│ '[10.000000]'       │ 9.0      │
└─────────────────────┴──────────┘
*/
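
To make the pagination use case concrete, here is a rough sketch of the cursor loop a client might run: fetch `k` rows, remember the last `distance`, and pass it back as the lower bound for the next page. It is only a model of the semantics shown above, not the vec0 implementation: it uses a plain SQLite table through Python's `sqlite3`, with `abs(value - query)` standing in for the vector distance. The `items` table and the `paginate_by_distance` helper are hypothetical names invented for this illustration.

```python
import sqlite3


def paginate_by_distance(conn, query, k):
    """Yield pages of (value, distance), using the last distance as the cursor."""
    last = -1.0  # distances are non-negative, so -1 admits every row on page one
    while True:
        rows = conn.execute(
            """
            select value, abs(value - ?) as distance
            from items
            where abs(value - ?) > ?   -- stand-in for `and distance > :last`
            order by distance
            limit ?                    -- stand-in for `and k = :k`
            """,
            (query, query, last, k),
        ).fetchall()
        if not rows:
            return
        yield rows
        last = rows[-1][1]  # the largest distance on this page becomes the cursor


# Plain table standing in for vec_items (assumption: 1-D values 1..100).
conn = sqlite3.connect(":memory:")
conn.execute("create table items(value real)")
conn.executemany(
    "insert into items values (?)", [(float(i),) for i in range(1, 101)]
)

pages = paginate_by_distance(conn, 1.0, 5)
first_page = next(pages)   # values 1..5, distances 0..4
second_page = next(pages)  # values 6..10, distances 5..9
```

With the vec0 table, the equivalent step would presumably be to rerun the query above with `and distance > 9.0` to fetch the third page.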

TODO

  • tests
  • docs
  • edge cases: what happens when multiple items have the same distance
  • should distances be f64?
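
The tie edge case can be sketched with the same kind of model. If several rows share the distance that ends a page, a strict `distance > last` cursor skips the remaining ties. The snippet below again uses a plain SQLite table (not vec0) with `abs(value - query)` as a stand-in distance, and compares the strict cursor against a cursor that also tracks the last row id. The tie-breaking variant is one possible approach on an ordinary table, not something this PR implements; all table and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table items(id integer primary key, value real)")
# Ids 2, 3 and 4 are all at distance 1.0 from the query value 1.0.
conn.executemany(
    "insert into items(id, value) values (?, ?)",
    [(1, 1.0), (2, 2.0), (3, 2.0), (4, 2.0), (5, 3.0)],
)

QUERY = 1.0


def ids_strict(k):
    """Page with `distance > last` only; ties at a page boundary are lost."""
    seen, last = [], -1.0
    while True:
        rows = conn.execute(
            "select id, abs(value - ?) as distance from items "
            "where abs(value - ?) > ? order by distance, id limit ?",
            (QUERY, QUERY, last, k),
        ).fetchall()
        if not rows:
            return seen
        seen += [row[0] for row in rows]
        last = rows[-1][1]


def ids_tiebreak(k):
    """Page with a (distance, id) cursor so tied rows are not skipped."""
    seen, last, last_id = [], -1.0, 0
    while True:
        rows = conn.execute(
            "select id, abs(value - ?) as distance from items "
            "where abs(value - ?) > ? or (abs(value - ?) = ? and id > ?) "
            "order by distance, id limit ?",
            (QUERY, QUERY, last, QUERY, last, last_id, k),
        ).fetchall()
        if not rows:
            return seen
        seen += [row[0] for row in rows]
        last, last_id = rows[-1][1], rows[-1][0]


strict_ids = ids_strict(2)      # ids 3 and 4 are never returned
tiebreak_ids = ids_tiebreak(2)  # every id is returned exactly once
```

In this model the strict cursor returns only ids 1, 2 and 5 with a page size of 2, which is the behavior the edge-case item above would need to pin down in tests.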
