[Llama] Remove inplace read for KVCache #849

Merged: 1 commit, Jan 21, 2025

Groverkss
Contributor

The in-place read for KVCache doesn't seem to serve any purpose other than telling the KVCache how many partitions to read from. In reality, the number of partitions for a KVCache is fixed (as the name K/V cache implies), so passing this parameter only adds overhead.
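
To illustrate the idea, here is a toy sketch in plain Python. It is not the code touched by this PR: `ToyKVCache`, `read_into`, and `read` are hypothetical names, and the storage is a pair of lists rather than real tensors. It only contrasts a read whose partition count is implied by caller-supplied destination buffers with a read that relies on the fixed K/V pair.

```python
# Toy illustration only; all names here are hypothetical, not the project's API.
K, V = 0, 1  # a K/V cache always has exactly these two partitions


class ToyKVCache:
    def __init__(self, num_slots):
        # One storage list per partition: index 0 holds keys, index 1 holds values.
        self.partitions = [[None] * num_slots for _ in (K, V)]

    def write(self, slot, key, value):
        self.partitions[K][slot] = key
        self.partitions[V][slot] = value

    def read_into(self, dest_partitions, slots):
        # "Before" style: the caller allocates destination buffers, and the
        # length of that list is what tells the cache how many partitions to read.
        for p, dest in enumerate(dest_partitions):
            for i, slot in enumerate(slots):
                dest[i] = self.partitions[p][slot]
        return dest_partitions

    def read(self, slots):
        # "After" style: the partition count is fixed, so the cache simply
        # returns the key entries and the value entries.
        keys = [self.partitions[K][s] for s in slots]
        values = [self.partitions[V][s] for s in slots]
        return keys, values


cache = ToyKVCache(num_slots=4)
cache.write(0, "k0", "v0")
cache.write(2, "k2", "v2")

old_style = cache.read_into([[None, None], [None, None]], slots=[0, 2])
new_style = cache.read(slots=[0, 2])
```

In this sketch both calls yield the same keys and values; the second form just drops the extra argument and the caller-side buffer allocation.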

Member

dan-garvey left a comment

+15, -263 is what I like to see. If we don't need all this, that's great.

@Groverkss
Contributor Author

The flux test failure seems unrelated, so I'm merging.

Groverkss merged commit d4298df into nod-ai:main on Jan 21, 2025
32 of 33 checks passed
2 participants