Any guide for training our own .pt? #10
Comments
This method does not train a model in advance. It fine-tunes the CLIP language model at runtime so that your input text embedding becomes similar to the style embedding you supply. So you just use https://github.com/vicgalle/stable-diffusion-aesthetic-gradients/blob/main/scripts/gen_aesthetic_embeddings.py to generate the image embedding with the pretrained CLIP image encoder.
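For readers who want to see the idea in code, here is a minimal sketch of what a script like gen_aesthetic_embeddings.py does conceptually: encode a handful of style images with a pretrained CLIP image encoder, average the normalized embeddings, and save the result as a .pt file. The folder path, image count, and use of the OpenAI `clip` package are illustrative assumptions, not the repo's exact script.

```python
# Minimal sketch (not the repo script verbatim): build an aesthetic embedding
# by averaging CLIP image embeddings of a few style images and saving it as .pt.
import glob

import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-L/14", device=device)  # CLIP image encoder

image_paths = glob.glob("my_style_images/*.jpg")  # e.g. 3 to a few thousand images
with torch.no_grad():
    embs = []
    for path in image_paths:
        image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0).to(device)
        emb = model.encode_image(image)
        embs.append(emb / emb.norm(dim=-1, keepdim=True))  # normalize each embedding
    # The aesthetic embedding is simply the mean of the per-image embeddings.
    aesthetic_embedding = torch.cat(embs).mean(dim=0, keepdim=True)

torch.save(aesthetic_embedding, "my_style.pt")
```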
Thanks a lot for the reply, I got it. What I mean exactly is: how many pictures (or what batch size) should I prepare to create the embedding?
According to the repo owner's example, 3 images should be sufficient.
Thank you guys. As vicgalle mentions: "fantasy.pt: created from https://huggingface.co/datasets/ChristophSchuhmann/improved_aesthetics_6.5plus by filtering only the images with the word 'fantasy' in the caption. The top 2000 images by score are selected for the embedding." So he used 2000 pictures. Also, what about the batch size?
According to the chapter "Using your own embeddings", he uses only 3 images as input to create an embedding and gets acceptable results. But as you said, the author used thousands of images to create the embeddings he actually ships, so more is probably better. What do you mean by "batch"?
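As a side note on reproducing the fantasy.pt recipe quoted above, here is a hypothetical sketch of the selection step using the Hugging Face datasets library. The column names TEXT, AESTHETIC_SCORE, and URL are assumptions; check the dataset card for the actual schema. The dataset stores image URLs rather than pixels, so the selected images still have to be downloaded before encoding them as in the earlier sketch.

```python
# Hypothetical sketch: pick images for a "fantasy" aesthetic embedding from the
# improved_aesthetics_6.5plus metadata. Column names are assumptions; verify them
# against the dataset card before running.
from datasets import load_dataset

ds = load_dataset("ChristophSchuhmann/improved_aesthetics_6.5plus", split="train")

# Keep only rows whose caption mentions "fantasy".
fantasy = ds.filter(
    lambda row: row["TEXT"] is not None and "fantasy" in row["TEXT"].lower()
)

# Take the top 2000 rows by aesthetic score.
fantasy = fantasy.sort("AESTHETIC_SCORE", reverse=True)
fantasy = fantasy.select(range(min(2000, len(fantasy))))

# Download the images behind these URLs, then encode them with the CLIP image
# encoder as in the sketch above to build fantasy.pt.
print(fantasy[0]["URL"])
```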
Thank you for the great work first, but is there any guide for training our own .pt? E.g., how many sample pictures should be used to train one, and what batch size? Thanks.