> **Warning**
> At this time, this is a personal project and not intended for distribution.
My primary use for generative AI leveraging large language models is scientific research and code development. While today's LLMs are quite adept at solving most problems, I often want to feed research articles and/or open-source projects to the LLM for additional context. Many of those research articles are likely outside the scope of the LLM's training data.
Chatterbox is a collection of LangGraph workflows (graphs) made up of components (nodes, conditional edges, and utilities), sketched below. Each workflow has an associated frontend web application built with Streamlit.
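As a rough illustration of that node/conditional-edge structure, here is a minimal sketch using the public LangGraph API; the state fields and node names are illustrative, not Chatterbox's actual graphs:

```python
# A minimal LangGraph sketch; node names and state fields are illustrative.
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    question: str
    documents: list[str]
    answer: str

def retrieve(state: State) -> dict:
    # Collect candidate documents for the question (stubbed here).
    return {"documents": ["..."]}

def generate(state: State) -> dict:
    # Produce an answer from the retrieved documents (stubbed here).
    return {"answer": "..."}

def has_documents(state: State) -> str:
    # Conditional edge: route to generation only if retrieval found anything.
    return "generate" if state["documents"] else END

builder = StateGraph(State)
builder.add_node("retrieve", retrieve)
builder.add_node("generate", generate)
builder.add_edge(START, "retrieve")
builder.add_conditional_edges("retrieve", has_documents)
builder.add_edge("generate", END)
graph = builder.compile()
```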
Run the app to launch the Streamlit chat (requires a `.env` file with API keys):

```sh
uv run python app.py --chat
```

Or:

```sh
uv run --env-file .env -- python st_chat.py
```
One of the objectives of this project is to explore different large language models for different agents. Chatterbox does this by providing a function `get_llm_model` that returns a `BaseChatModel` for each LLM defined in the `LargeLanguageModelsEnum`.
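Selecting a model might then look like the following sketch; the import path, enum member, and keyword arguments are assumptions about the Chatterbox API, not its exact signatures:

```python
# A usage sketch; the module path, enum member, and parameters below are
# assumptions, not Chatterbox's exact API.
from chatterbox.language_models import LargeLanguageModelsEnum, get_llm_model

llm = get_llm_model(
    LargeLanguageModelsEnum.OPENAI_GPT_4O,  # hypothetical enum member
    temperature=0.0,
)
print(llm.invoke("What is a conditional edge in LangGraph?").content)
```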
Using the language models provided by Anthropic, OpenAI, or Fireworks requires an API key, which must be stored in the `.env` file.
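A minimal `.env` might look like the sketch below; the variable names shown are the defaults read by the corresponding LangChain integrations, so confirm they match what Chatterbox expects:

```
ANTHROPIC_API_KEY=...
OPENAI_API_KEY=...
FIREWORKS_API_KEY=...
```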
Collecting information from the web, arXiv, and PDF documents will usually provide the context necessary to answer a question, provided you get the chunking right and have enough context in each document.
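For instance, chunking a PDF might look like the following sketch, using standard LangChain loaders and splitters; the file name and chunk sizes are illustrative, not Chatterbox's settings:

```python
# A chunking sketch; chunk_size and chunk_overlap are illustrative values.
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

docs = PyPDFLoader("paper.pdf").load()  # hypothetical input file
splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
chunks = splitter.split_documents(docs)
```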
Another option is to build a collection of documents from research notes. To do this, I write my research notes in LaTeX files (.tex) and use the ... to load them and save them to a Chroma database.
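The loading utility is elided above; as a generic stand-in, the sketch below uses LangChain components to index `.tex` files into Chroma (the embedding model and paths are assumptions, not Chatterbox's actual helper):

```python
# A generic stand-in for the elided loader, not Chatterbox's actual helper.
from pathlib import Path
from langchain_chroma import Chroma
from langchain_community.document_loaders import TextLoader
from langchain_openai import OpenAIEmbeddings

docs = []
for tex_file in Path("notes").glob("*.tex"):  # hypothetical notes directory
    docs.extend(TextLoader(str(tex_file)).load())

vectorstore = Chroma.from_documents(
    docs,
    embedding=OpenAIEmbeddings(),  # embedding model is an assumption
    persist_directory="chroma_db",
)
```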
The chat app is the simplest application ...
To run:

```sh
uv run python app.py --chat
```
The objective of this graph is to summarize all documents relevant to the input research prompt.
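Invoking such a graph might look like the sketch below; the builder function, module path, and state keys are hypothetical, inferred only from the description above:

```python
# Hypothetical invocation; `build_summarize_graph` and the state keys are
# assumptions, not Chatterbox's actual API.
from chatterbox.workflows import build_summarize_graph  # hypothetical module

graph = build_summarize_graph()
result = graph.invoke({"research_prompt": "diffusion models for protein design"})
for summary in result["summaries"]:
    print(summary)
```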