where is the starter.yaml referred in Tutorial doc? #1067

Open
txsing opened this issue Dec 27, 2024 · 4 comments

Comments


txsing commented Dec 27, 2024

The tutorial document says:

Let’s find an optimal RAG pipeline with AutoRAG! After you prepare your evaluation dataset, you need to have a config YAML file. There are few pre-made config YAML files at our GitHub repo sample_config folder. We highly recommend using pre-made config YAML files for starter. Download starter.yaml file to your local environment, and you are ready to go.

However, I could not find the starter.yaml file in the repo. Did you rename it? Where can I find a config file like this so I can quickly try AutoRAG?
Thanks a lot.

Contributor

e7217 commented Dec 30, 2024

@txsing
I think it may have been changed.
If you need a simple example, how about referring to the link below?
The maintainers provide the example repository.

https://github.com/Marker-Inc-Korea/AutoRAG-tutorial

Author

txsing commented Jan 2, 2025

Thank you very much! I’ll take a look. It would still be great if the tutorial document could be updated to help more beginners avoid getting stuck at the very early stages.

txsing closed this as completed Jan 2, 2025
txsing reopened this Jan 2, 2025

Author

txsing commented Jan 2, 2025

@txsing I think it may have been changed. If you need a simple example, how about referring to the link below? The maintainers provide the example repository.

https://github.com/Marker-Inc-Korea/AutoRAG-tutorial

It seems this tutorial repo is also not compatible with the latest version of AutoRAG. I got stuck at the very first step again. Is there up-to-date tutorial code for beginners?

Contributor

e7217 commented Jan 2, 2025

@txsing
The tutorial may not reflect the most recent version of the code.
If you just want to run the sample, I recommend installing AutoRAG[parse]==0.3.10 as the documentation may not be up-to-date.
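
Before running the sample, it may help to confirm which version is actually installed. The snippet below is only a minimal sketch using the standard library; the distribution name "AutoRAG" and the helper names are assumptions for illustration, not part of AutoRAG itself.

```python
from importlib import metadata

# Version suggested in this thread for running the sample.
PINNED_VERSION = "0.3.10"


def installed_version(dist_name="AutoRAG"):
    """Return the installed version string, or None if the package is absent.

    The distribution name "AutoRAG" is an assumption taken from the
    install spec quoted in this thread.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(version, pinned=PINNED_VERSION):
    """True only when the installed version equals the pinned one exactly."""
    return version == pinned


if __name__ == "__main__":
    found = installed_version()
    if matches_pin(found):
        print(f"AutoRAG {found} is installed, matching the pinned version.")
    else:
        # Quoting the spec keeps shells such as zsh from expanding the brackets.
        print(f'Found {found!r}; consider: pip install "AutoRAG[parse]=={PINNED_VERSION}"')
```

If the check reports a different version, reinstalling with the pinned spec in a fresh virtual environment is probably the simplest way to match the tutorial.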

If needed, please follow these steps:

  1. Install the additional package:
     pip install pycryptodome
  2. Run the parser:
     python run_parse.py

     with this configuration.

     modules:
       - module_type: langchain_parse
         parse_method: [ pdfminer, pypdf, pymupdf ]
       # - module_type: llamaparse
       #   result_type: markdown
       #   language: en
       # - module_type: table_hybrid_parse
       #   text_parse_module: langchain_parse
       #   text_params:
       #     parse_method: pdfplumber
       #   table_parse_module: llamaparse
       #   table_params:
       #     result_type: markdown
       #     language: en
  3. Run the chunker:
     python run_chunk.py --raw_path ./parsed_raw/0.parquet
  4. Run the QA maker:
     python make_qa.py --raw_path ./parsed_raw/0.parquet --corpus_path ./chunked_corpus/0.parquet --qa_size 5

     Remember this:
     [Image]
  5. Run the evaluator:
     python main.py --config ./config/tutorial.yaml
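
The script steps above could also be chained from a single file. This is only a rough sketch: it assumes the scripts (run_parse.py, run_chunk.py, make_qa.py, main.py) sit in the current working directory, and the helper names build_commands and run_all are made up for illustration. It defaults to a dry run that only prints the commands, so each step can still be checked by hand.

```python
import subprocess
import sys


def build_commands(
    raw_path="./parsed_raw/0.parquet",
    corpus_path="./chunked_corpus/0.parquet",
    qa_size=5,
    config="./config/tutorial.yaml",
):
    """Return the parse, chunk, QA, and evaluate commands as argv lists.

    Script names and flags are copied from the steps in this thread;
    the defaults are the paths used there.
    """
    py = sys.executable  # run each script with the current interpreter
    return [
        [py, "run_parse.py"],
        [py, "run_chunk.py", "--raw_path", raw_path],
        [py, "make_qa.py", "--raw_path", raw_path,
         "--corpus_path", corpus_path, "--qa_size", str(qa_size)],
        [py, "main.py", "--config", config],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the chain at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(build_commands(), dry_run=True)
```

Calling run_all(build_commands(), dry_run=False) would execute the steps in order and stop on the first error. The pip install step is left out on purpose, since it only needs to happen once.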

I hope this helps!
