This project is a question-answering system that uses the llama3 model to answer questions about a PDF document. It uses the gpt4all embedding model to generate embeddings for both the question and the document. Following the retrieval-augmented generation (RAG) pattern, those embeddings are used to retrieve the most relevant paragraphs from the document, which are then added to the llama3 model's context to generate the answer.
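The retrieval step described above can be sketched in plain Python. This is a toy illustration, not the project's actual code: the `embed` function below is a stand-in character-frequency embedding (the real project uses the gpt4all embedding model via LangChain), and the paragraph list is made up for the example.

```python
import math

def embed(text):
    # Toy embedding: a 26-dim character-frequency vector.
    # Stand-in for the gpt4all embedding model used by the project.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, paragraphs, k=1):
    # Rank document paragraphs by similarity to the question embedding
    # and keep the top k as context for the LLM.
    q = embed(question)
    ranked = sorted(paragraphs, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]

# Hypothetical document paragraphs for illustration.
paragraphs = [
    "Ollama serves local large language models.",
    "The invoice total is due within thirty days.",
]
question = "How do I run a local language model?"
context = retrieve(question, paragraphs)

# The retrieved paragraphs are prepended to the prompt sent to llama3.
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
```

In the real pipeline the same idea applies, except the PDF is first split into chunks, the embeddings come from gpt4all, and the assembled prompt is sent to llama3 through Ollama.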
- Download and install ollama (instructions here).
- Download and run the llama3 model (instructions here).
```shell
git clone https://github.com/cleversonledur/doctalk-pdf-langchain-rag.git
cd doctalk-pdf-langchain-rag
pip install -r requirements.txt
```
Run the following command to start the program:
```shell
python main.py -f <path_to_pdf_file>
```
Depending on the size of the PDF file, it may take a few minutes to load the document and generate the embeddings.
Start talking to the bot:
```
[DOCTALK] Ask your question (my.pdf): What is it about?
```
To exit the program, type `exit`.
Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests.
This project is licensed under the MIT License.