Binary quantization - evaluate quality #484

Open
cduk opened this issue Nov 29, 2024 · 1 comment

Comments

@cduk

cduk commented Nov 29, 2024

Feature request

Is there a way to receive the embeddings in binary-quantized (BQ) format? Right now I receive the full-precision embedding and quantize it in the client, but I wonder whether I'm missing a way to get it directly from the server in binary format. If static quantization thresholds at zero, there is no need to profile the data, so the server can remain stateless.
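For reference, the client-side step described above could look roughly like the sketch below. It assumes the simplest scheme: one bit per dimension, set when the value is greater than zero, packed most-significant-bit first. The names `binarize_at_zero` and `hamming_distance` are made up for illustration and are not part of any server or library API.

```python
from typing import Sequence


def binarize_at_zero(embedding: Sequence[float]) -> bytes:
    """Pack one bit per dimension: 1 if the value is > 0, else 0."""
    # Round the byte count up so dimensions that are not a multiple of 8 still fit.
    packed = bytearray((len(embedding) + 7) // 8)
    for i, value in enumerate(embedding):
        if value > 0:
            # Most significant bit first within each byte (an assumed convention).
            packed[i // 8] |= 0x80 >> (i % 8)
    return bytes(packed)


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count differing bits between two packed codes of equal length."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
```

Because the threshold is fixed at zero, nothing has to be fitted on a dataset, and a 1024-dimensional float32 embedding (4096 bytes) shrinks to 128 bytes.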

Motivation

More compact response from server.

Your contribution

N/A

@michaelfeil
Owner

There is an '--embedding_dtype' parameter for this CLI. You can discover other parameters by running the CLI with --help; each one comes with a description.

There are few other open-source implementations of embedding_dtype, so let me know whether the results you get from it are good. It uses an English dataset for quantization.
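One rough way to judge whether the results are good is to check how much of the full-precision top-k survives after binarization. The sketch below is only an illustration under stated assumptions: it binarizes by sign at zero (as in the question, not the dataset-based quantization mentioned above), ranks by cosine similarity on the floats and by Hamming distance on the bits, and reports the overlap. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def sign_bits(vec: Sequence[float]) -> List[int]:
    """Binarize by sign: 1 for values > 0, else 0."""
    return [1 if v > 0 else 0 for v in vec]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def top_k_overlap(
    query: Sequence[float], corpus: Sequence[Sequence[float]], k: int
) -> float:
    """Fraction of the float top-k that also appears in the binary top-k."""
    indices = range(len(corpus))
    # Reference ranking: highest cosine similarity first.
    by_cosine = sorted(indices, key=lambda i: -cosine(query, corpus[i]))[:k]
    # Binary ranking: smallest Hamming distance first (ties keep index order).
    q_bits = sign_bits(query)
    c_bits = [sign_bits(v) for v in corpus]
    by_hamming = sorted(indices, key=lambda i: hamming(q_bits, c_bits[i]))[:k]
    return len(set(by_cosine) & set(by_hamming)) / k
```

Averaging `top_k_overlap` over a set of real queries, for example at k=10, gives a single number that can be compared between models or languages.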

@michaelfeil michaelfeil changed the title Binary quantization Binary quantization - evaluate quality Dec 2, 2024