Releases · kclhi/llama.cpp

30 May 17:25

5921b8f

b3051 Latest

llama : cache llama_token_to_piece (#7587)

* llama : cache llama_token_to_piece

ggml-ci

* llama : use vectors and avoid has_cache

ggml-ci

* llama : throw on unknown tokenizer types

ggml-ci

* llama : print a log of the total cache size
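The changes listed above (caching token-to-piece conversions, storing them in vectors, throwing on unknown tokenizer types, and reporting the total cache size) can be illustrated with a minimal sketch. This is not the code from the commit: `PieceCache`, `TokenizerType`, and the `convert` callback are hypothetical names introduced only for illustration, and the real llama.cpp data structures may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical tokenizer kinds; not the actual llama.cpp enum.
enum class TokenizerType { SPM, BPE, Unknown };

// Hypothetical cache: one precomputed piece per token id, stored in a vector.
struct PieceCache {
    std::vector<std::string> pieces; // indexed by token id

    // Runs the (possibly slow) conversion once per token id at construction.
    PieceCache(TokenizerType type, int32_t n_vocab,
               const std::function<std::string(int32_t)> & convert) {
        if (type == TokenizerType::Unknown) {
            throw std::runtime_error("unknown tokenizer type");
        }
        if (n_vocab < 0) {
            throw std::invalid_argument("n_vocab must be non-negative");
        }
        pieces.reserve(static_cast<size_t>(n_vocab));
        for (int32_t id = 0; id < n_vocab; ++id) {
            pieces.push_back(convert(id));
        }
        assert(pieces.size() == static_cast<size_t>(n_vocab));
    }

    // Lookup is a bounds-checked vector access; no conversion is repeated.
    const std::string & piece(int32_t id) const {
        return pieces.at(static_cast<size_t>(id));
    }

    // Total number of bytes held by the cached pieces, suitable for logging.
    size_t total_bytes() const {
        size_t n = 0;
        for (const auto & p : pieces) {
            n += p.size();
        }
        return n;
    }
};
```

In this sketch the cache is filled eagerly, so a separate "has cache" flag is unnecessary: once construction succeeds, every token id in range has an entry.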

Assets 21

cudart-llama-bin-win-cu11.7.1-x64.zip (293 MB, 2024-05-30T17:25:17Z)
cudart-llama-bin-win-cu12.2.0-x64.zip (413 MB, 2024-05-30T17:25:27Z)
llama-b3051-bin-macos-arm64.zip (41.9 MB, 2024-05-30T17:25:40Z)
llama-b3051-bin-macos-x64.zip (38.6 MB, 2024-05-30T17:25:42Z)
llama-b3051-bin-ubuntu-x64.zip (46.5 MB, 2024-05-30T17:25:44Z)
llama-b3051-bin-win-avx-x64.zip (6.66 MB, 2024-05-30T17:25:47Z)
llama-b3051-bin-win-avx2-x64.zip (6.64 MB, 2024-05-30T17:25:48Z)
llama-b3051-bin-win-avx512-x64.zip (6.66 MB, 2024-05-30T17:25:49Z)
llama-b3051-bin-win-clblast-x64.zip (7.84 MB, 2024-05-30T17:25:50Z)
llama-b3051-bin-win-cuda-cu11.7.1-x64.zip (65.2 MB, 2024-05-30T17:25:52Z)
Source code (zip) (2024-05-30T16:01:41Z)
Source code (tar.gz) (2024-05-30T16:01:41Z)
