
[FEAT]: Embed Widget limits chat history to workspace setting (or has customizable setting) #2953

Open
Alminc91 opened this issue Jan 8, 2025 · 0 comments
Assignees
Labels
enhancement New feature or request feature request

Comments

@Alminc91

Alminc91 commented Jan 8, 2025

What would you like to see?

Request / Solution:
I would like to be able to configure the chat history limit of my embed widgets. This could work in one of two ways:

  1. The embed widget simply adopts the setting of the corresponding workspace.
  2. There is an additional setting in the embed configuration. It can be a direct setting like "Max chats per day" or something like "Enable Chat history override".

Or if you advise me what the best approach could be, maybe I could try to fix it myself with a PR.
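To illustrate what I mean by option 2, here is a rough sketch of the fallback logic I have in mind. All field and function names here are placeholders of my own invention, not AnythingLLM's actual schema or API:

```javascript
// Hypothetical embed config with an optional per-widget history override.
// Every name below is a placeholder, not AnythingLLM's real schema.
const embedConfig = {
  workspaceSlug: "my-workspace",
  chatHistoryOverride: 2, // null/undefined => fall back to the workspace setting
};

// Pick the effective history limit for an embed chat:
// use the widget override if set, otherwise the workspace value.
function effectiveHistoryLimit(embedConfig, workspaceConfig) {
  if (Number.isInteger(embedConfig.chatHistoryOverride)) {
    return embedConfig.chatHistoryOverride;
  }
  return workspaceConfig.historyLimit ?? 20; // assumed workspace default
}

// Keep only the most recent `limit` messages before building the prompt.
function trimHistory(messages, limit) {
  return limit > 0 ? messages.slice(-limit) : [];
}
```

With something like this, a widget that sets no override would behave exactly as today, while a widget with `chatHistoryOverride: 2` would send only the last two messages.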

Problem:
I get more hallucinations when the chat history is set to a higher number. For example, when I enable a two-message history in my workspace (in combination with RAG it is slow enough :D), I get substantially fewer hallucinations in longer, ongoing chats. Especially when hallucinations happen because of ungrounded (bad RAG) data or a non-optimal prompt, they seem to accumulate over the conversation.
Therefore, in order to benefit from this technique for reducing hallucinations, I would love to limit the chat history for the embed widget as well.

Example:
I ask 5 questions, always the same ones in the same order. On the sixth question, the workspace set to a two-message history gives me correct information. And in general, when I continue asking about my documents (RAG), the results contain far fewer hallucinations. The quality is better.

As soon as I use the embed widget or set the chat history to 20, the chatbot starts making up additional information for the sixth question. Continuing the conversation leads to a lot of invented information.

Additional question:

  1. Is the chat history compressed when more messages are allowed, in order to not slow down responses? Or how does it work? I have my LLM token context window set to 12,000 tokens.
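For context, this is the kind of trimming I would naively expect if the history has to fit a token budget. This is purely my guess at a generic approach, not how AnythingLLM actually handles it, and `countTokens` is a crude stand-in rather than a real tokenizer:

```javascript
// Guesswork sketch: drop the oldest messages until the remaining history
// fits a token budget. Real systems would use the model's own tokenizer;
// this stand-in just approximates ~4 characters per token.
function countTokens(text) {
  return Math.ceil(text.length / 4);
}

function fitHistoryToBudget(messages, maxTokens) {
  const kept = [];
  let used = 0;
  // Walk from newest to oldest, keeping messages while the budget allows.
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = countTokens(messages[i].content);
    if (used + cost > maxTokens) break;
    kept.unshift(messages[i]);
    used += cost;
  }
  return kept;
}
```

If something like this is what happens internally, then a 12,000-token window with a 20-message history would still send a lot of old conversation along, which might explain the accumulating hallucinations I see.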

Of course, I could maybe do something with prompt engineering to improve the behaviour with a longer chat history, but I would need to experiment with that.

Thank you very much for all the amazing upgrades you do and for reacting so fast to issues! It is surely a lot of work, but the app is quite amazing. Blessings.
