Request / Solution:
I would like to be able to configure the chat history limit of my embed widgets. This could work in either of two ways:
- The embed widget simply adopts the setting of its corresponding workspace.
- An additional setting in the embed configuration, either a direct setting like "Max chats per day" or something like "Enable chat history override".
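The override idea from the second option could be sketched roughly as follows. Note that the key name `chatHistoryLimit`, the config shapes, and the default of 20 below are all hypothetical, chosen for illustration; they are not existing AnythingLLM settings:

```python
def effective_history_limit(workspace_config: dict, embed_config: dict) -> int:
    """Return the chat-history limit an embed widget should use.

    The embed-level value (hypothetical key "chatHistoryLimit") overrides
    the workspace default when set; otherwise the embed simply inherits
    the workspace setting, matching the first option above.
    """
    embed_limit = embed_config.get("chatHistoryLimit")  # None = no override
    if embed_limit is not None:
        return embed_limit
    return workspace_config.get("chatHistoryLimit", 20)  # assumed default

# Example: workspace allows 20 messages, embed overrides down to 2
workspace = {"chatHistoryLimit": 20}
embed = {"chatHistoryLimit": 2}
print(effective_history_limit(workspace, embed))  # → 2
print(effective_history_limit(workspace, {}))     # → 20
```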
Alternatively, if you can advise me on the best approach, I could try to implement it myself in a PR.
Problem:
I get more hallucinations when the chat history is set to a higher number. For example, I enable two history messages in my workspace (in combination with RAG, it is slow enough :D). This leads to substantially fewer hallucinations in a longer, ongoing chat. In particular, when hallucinations stem from ungrounded (bad RAG) data or a suboptimal prompt, they seem to accumulate over the course of the conversation.
Therefore, in order to benefit from this technique for reducing hallucinations, I would love to be able to limit the chat history for the embed widget.
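The technique itself is plain truncation: before the prompt is assembled, keep only the last N messages. A minimal sketch of the idea (the function and message shape are my own illustration, not AnythingLLM code):

```python
def trim_history(messages: list[dict], limit: int) -> list[dict]:
    """Keep only the most recent `limit` messages.

    Older turns are dropped entirely, so stale or hallucinated answers
    from earlier in the conversation cannot feed back into the prompt.
    """
    if limit <= 0:
        return []
    return messages[-limit:]

history = [
    {"role": "user", "content": "Q1"},
    {"role": "assistant", "content": "A1"},
    {"role": "user", "content": "Q2"},
    {"role": "assistant", "content": "A2"},
]
print(trim_history(history, 2))  # only the last user/assistant pair survives
```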
Example:
I ask 5 questions, always the same ones in the same order. On the sixth question, the workspace set to two history messages gives me correct information. And in general, when I continue asking about my documents (RAG), the results contain far fewer hallucinations; the quality is better.
As soon as I use the embed widget, or set the chat history to 20, the chatbot starts making up additional information on the sixth question. Continuing the conversation leads to a lot of invented information.
Additional question:
Is the chat history compressed when more messages are allowed, so that responses do not slow down? Or how does it work? I have my LLM token context window set to 12,000 tokens.
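On the context-window question: a common alternative to a fixed message count is a token budget, where the oldest messages are dropped until the history fits. This is only a guess at how such a limit might interact with a 12,000-token window, sketched with a crude word-count stand-in for real tokenization:

```python
def fit_to_budget(messages: list[dict], max_tokens: int) -> list[dict]:
    """Drop oldest messages until the (approximate) token count fits.

    Token counting here is a crude whitespace split; a real implementation
    would use the model's tokenizer. Newest messages are kept first.
    """
    kept: list[dict] = []
    used = 0
    for msg in reversed(messages):          # walk newest -> oldest
        cost = len(msg["content"].split())  # stand-in for token count
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    kept.reverse()                          # restore chronological order
    return kept

msgs = [
    {"role": "user", "content": "one two three"},
    {"role": "assistant", "content": "four five"},
    {"role": "user", "content": "six"},
]
print(fit_to_budget(msgs, 3))  # only the two newest messages fit
```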
Of course, I could perhaps improve the behaviour with a longer chat history through prompt engineering, but I would need to experiment with that.
Thank you very much for all the amazing upgrades you do and for reacting so fast to issues! It is surely a lot of work, but the app is quite amazing. Blessings.