Generate data with Ollama, then use a simple script to build a pre-training dataset.
Start the Ollama API with ollama serve (refer to the Ollama documentation for how to do this).
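Once the server is running, a request can be sent to it over HTTP. The following is a minimal sketch, not the code in Synth.py: it assumes the server listens on Ollama's default local address (http://localhost:11434) and uses the /api/generate endpoint with streaming turned off. The helper names build_request and generate are made up for illustration.

```python
import json
import urllib.request

# Assumption: Ollama's default local address; change it if your server differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming POST request for the /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, url: str = OLLAMA_URL) -> str:
    """Send one prompt to a running Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt, url)) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False the reply is a single JSON object with a "response" field.
    return body["response"]
```

With the server up, calling generate("your-model-name", "Say hello.") should return the model's reply as a string; replace the placeholder with a model you have pulled locally.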
Run python Synth.py
When prompted, enter the name of the Ollama model you want to use.
Then enter your prompts one at a time and type done to end manual prompt entry, or place one prompt per line in the txt file.
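The two ways of supplying prompts (manual entry ending with done, or one prompt per line in a txt file) could be handled roughly as below. This is a sketch under assumptions, since Synth.py's own code is not shown here: the function names are hypothetical, and skipping blank lines is a choice made for this example.

```python
from pathlib import Path
from typing import Callable, List


def prompts_from_file(path: str) -> List[str]:
    """Read one prompt per line from a text file, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def prompts_from_input(read_line: Callable[[str], str] = input) -> List[str]:
    """Collect prompts interactively until the user types 'done'."""
    prompts: List[str] = []
    while True:
        line = read_line("Prompt (done to finish): ").strip()
        if line.lower() == "done":
            break
        if line:  # ignore empty entries
            prompts.append(line)
    return prompts
```

Passing the reader as a parameter (input by default) keeps the interactive loop easy to exercise without a terminal.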
Set data="" to the directory that contains the txt files.
For example, if the txt files are stored in ./cleaned3:
data="./cleaned3"
Run python alltxttojson.py
You will get the data{}.json file
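For orientation, the conversion from a directory of txt files to JSON might look like the sketch below. It is not the actual alltxttojson.py: the record layout ({"text": ...}), the function names, and the output file name used in the usage note are all assumptions made for this example.

```python
import json
from pathlib import Path
from typing import Dict, List


def txt_dir_to_records(data: str) -> List[Dict[str, str]]:
    """Load every .txt file in the directory `data` into a list of records."""
    records: List[Dict[str, str]] = []
    for txt_file in sorted(Path(data).glob("*.txt")):
        text = txt_file.read_text(encoding="utf-8").strip()
        if text:  # skip empty files
            records.append({"text": text})  # assumed schema, not confirmed by the script
    return records


def write_json(records: List[Dict[str, str]], out_path: str) -> None:
    """Write the records to a JSON file, keeping non-ASCII text readable."""
    Path(out_path).write_text(
        json.dumps(records, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
```

A call such as write_json(txt_dir_to_records("./cleaned3"), "example_output.json") would then write one JSON array containing all the texts; the output name here is only a placeholder.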