Source code for paper "Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language models".
If you find this project useful, feel free to ⭐️ it and give it a citation!
Auto-RAG is an autonomous iterative retrieval model centered on the LLM's powerful decision-making capabilities. Auto-RAG models the interaction between the LLM and the retriever through multi-turn dialogue, employs iterative reasoning to determine when and what to retrieve, ceasing the iteration when sufficient external knowledge is available, and subsequently providing the answer to the user.
- GUI interaction: We provide a deployable user interaction interface. After inputting a question, Auto-RAG autonomously engages in interaction with the retriever without any human intervention. Users have the option to decide whether to display the details of the interaction between Auto-RAG and the retriever.
- To interact with Auto-RAG in your browser, follow the guide for GUI interaction.
We provide trained Auto-RAG models using the synthetic data. Please refer to https://huggingface.co/ICTNLP/Auto-RAG-Llama-3-8B-Instruct.
- Environment requirements: Python 3.12, FlexRAG.
conda env create autorag python=3.12
pip install flexrag
- Clone Auto-RAG's repo.
git clone https://github.com/ictnlp/Auto-RAG.git
cd Auto-RAG
- Download corpus and prepare the retriever
We use the wiki corpus provided by DPR project. You can prepare the dense retriever by runing the following command:
bash scripts/prepare_retriever.sh
We use vLLM to deploy the model for inference. You can update the parameters in vllm.sh to adjust the GPU and model path configuration, then execute:
bash scripts/deploy.sh
To interact with Auto-RAG in your browser, run the following command:
bash scripts/run_gui.sh
Tip
The interaction process between Auto-RAG and the retriever can be optionally displayed by adjusting a toggle.
You can also run Auto-RAG as a FlexRAG assistant. To do this, execute the following command:
ENCODER_PATH='intfloat/e5-base-v2'
MODEL_NAME="<name of your deployed vllm model>"
BASE_URL="<your model url>"
DATA_PATH="<path to the test data>"
python -m flexrag.entrypoints.run_assistant \
user_module="<Auto-RAG path>" \
data_path=$DATA_PATH \
assistant_type=autorag \
autorag_config.model_name=$MODEL_NAME \
autorag_config.base_url=$BASE_URL \
autorag_config.database_path=wiki \
autorag_config.index_type=faiss \
autorag_config.query_encoder_config.encoder_type=hf \
autorag_config.query_encoder_config.hf_config.model_path=$ENCODER_PATH \
eval_config.metrics_type=[retrieval_success_rate,generation_f1,generation_em] \
eval_config.retrieval_success_rate_config.eval_field=text \
eval_config.response_preprocess.processor_type=[simplify_answer] \
log_interval=10
Note
Experimental results show that Auto-RAG outperforms all baselines across six benchmarks.
This project is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.
If this repository is useful for you, please cite as:
@article{yu2024autorag,
title={Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models},
author={Tian Yu and Shaolei Zhang and Yang Feng},
year={2024},
eprint={2411.19443},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2411.19443},
}
If you have any questions, feel free to contact yutian23s@ict.ac.cn
.