Hello,
1. I have set up Video-LLaMA from this repo and downloaded all the checkpoints needed for inference. The Gradio demo `demo_video.py` runs perfectly fine; there are no errors when loading the vision encoder, BLIP-2, or LLaMA-2 weights.
2. However, inference gives completely wrong answers, and the model does not seem to be using the vision encoder at all. For example, asking the model about the following photo produces irrelevant answers.
3. I wonder what may have gone wrong in my setup.
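One common cause of "loads without errors but ignores the image" is a checkpoint whose keys silently fail to match the model, since a non-strict load reports nothing unless you ask. The sketch below (an assumption about how to debug this, not Video-LLaMA's actual loading code; the real model class and checkpoint paths differ) shows how to surface missing and unexpected keys after loading, demonstrated on a toy module:

```python
# Hedged debugging sketch: check how much of a model a checkpoint
# actually covered. A silently empty vision branch would show up
# as a long list of missing keys.
import torch
import torch.nn as nn

def report_checkpoint_coverage(model: nn.Module, state_dict: dict):
    """Non-strict load; return parameter keys the checkpoint missed
    and checkpoint entries the model never used."""
    result = model.load_state_dict(state_dict, strict=False)
    return list(result.missing_keys), list(result.unexpected_keys)

# Toy stand-in for the real model: the checkpoint deliberately
# omits the bias, so it appears under "missing".
model = nn.Linear(4, 2)
ckpt = {"weight": torch.zeros(2, 4)}
missing, unexpected = report_checkpoint_coverage(model, ckpt)
print(missing)     # → ['bias']
print(unexpected)  # → []
```

Running the same check on the full model (loading the downloaded checkpoint with `torch.load` and passing its state dict in) would reveal whether the vision-encoder and Q-Former weights were actually applied or silently skipped.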
Thank you very much!!