[ Speculative decoding ] Support different tokenizers for draft and main models #7232

Triggered via pull request January 23, 2025 10:12
Status Success
Total duration 37m 14s
Artifacts

causal_lm_cpp.yml

on: pull_request
Matrix: cpp-beam_search_causal_lm-ubuntu
cpp-multinomial-greedy_causal_lm-ubuntu: 18m 31s
cpp-greedy_causal_lm-windows: 36m 21s
cpp-greedy_causal_lm-Qwen-7B-Chat: 12m 45s
cpp-beam_search_causal_lm-Qwen1_5-7B-Chat: 33m 6s
cpp-beam_search_causal_lm-Phi-2: 17m 11s
cpp-beam_search_causal_lm-notus-7b-v1: 31m 21s
cpp-speculative_decoding_lm-ubuntu: 28m 53s
cpp-prompt_lookup_decoding_lm-ubuntu: 9m 50s
cpp-Phi-1_5: 7m 58s
cpp-greedy_causal_lm-redpajama-3b-chat: 12m 1s
cpp-chat_sample-ubuntu: 15m 3s
visual_language_chat_sample-ubuntu-minicpm_v2_6: 7m 15s
visual_language_chat_sample-ubuntu-llava_1_5 / visual_language_chat_sample-ubuntu-llava: 29m 53s
visual_language_chat_sample-ubuntu-llava_next / visual_language_chat_sample-ubuntu-llava: 17m 19s
visual_language_chat_sample-ubuntu-internvl2: 14m 7s
cpp-continuous-batching-ubuntu: 15m 30s
cpp-continuous-batching-windows: 26m 24s
cpp-continuous-batching-macos: 22m 26s
visual_language_chat_sample-ubuntu-qwen2vl: 12m 27s
ci/gha_overall_status_causal_lm: 0s
Annotations

1 warning
ci/gha_overall_status_causal_lm
ubuntu-latest pipelines will use ubuntu-24.04 soon. For more details, see https://github.com/actions/runner-images/issues/10636