
Any guide for training our own .pt? #10

Open
tcflying opened this issue Oct 25, 2022 · 5 comments

Comments


tcflying commented Oct 25, 2022

Thank you for the great work first of all, but is there any guide for training our own .pt? E.g., how many sample pictures should be used for training, and what batch size? Thanks.


eeyrw commented Oct 26, 2022

This method does not train a model in advance. It fine-tunes the CLIP language model at runtime to make your input text embedding similar to the style embedding you created. So you just use https://github.com/vicgalle/stable-diffusion-aesthetic-gradients/blob/main/scripts/gen_aesthetic_embeddings.py to generate an image embedding with the pretrained CLIP image encoder.
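For reference, the core of that script is simply encoding your images with the pretrained CLIP image encoder and averaging the results into one vector. A minimal sketch of the averaging step below — random tensors stand in for real CLIP image features (loading CLIP itself and the helper name `make_aesthetic_embedding` are assumptions for illustration, not the repo's exact code):

```python
import torch

def make_aesthetic_embedding(image_features: torch.Tensor) -> torch.Tensor:
    """Average per-image CLIP features into a single style embedding.

    image_features: (N, D) tensor, one row per image, as produced by
    CLIP's image encoder. N can be as small as 3.
    """
    emb = image_features.mean(dim=0, keepdim=True)   # (1, D)
    emb = emb / emb.norm(dim=-1, keepdim=True)       # unit-normalize
    return emb

# Stand-in for real CLIP features of 3 images (D=512 for ViT-B/32)
feats = torch.randn(3, 512)
embedding = make_aesthetic_embedding(feats)
# torch.save(embedding, "my_style.pt")  # this .pt is what you pass at generation time
```

The point is that "training" here is just a forward pass over your images plus a mean, which is why so few images can already work.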

@tcflying
Author

Thanks a lot for the reply, I got it. To be exact, I mean: how many pictures (or what batch) should I prepare to create the embedding?


eeyrw commented Oct 26, 2022

According to the repo owner's example, 3 images should be sufficient.

@tcflying
Author

> According to the repo owner's example, 3 images should be sufficient.

Thank you guys. As vicgalle mentions:

fantasy.pt: created from https://huggingface.co/datasets/ChristophSchuhmann/improved_aesthetics_6.5plus by filtering only the images with the word "fantasy" in the caption. The top 2000 images by score are selected for the embedding.
flower_plant.pt: created from https://huggingface.co/datasets/ChristophSchuhmann/improved_aesthetics_6.5plus by filtering only the images with the word "plant", "flower", "floral", "vegetation" or "garden" in the caption. The top 2000 images by score are selected for the embedding.

So he used 2000 pics. Also, what batch size? Um...

@eeyrw
Copy link

eeyrw commented Oct 26, 2022

According to the chapter "Using your own embeddings", he used 3 images as input to create an embedding and achieved acceptable performance. But as you said, the author used thousands of images to create the embeddings he actually ships. Maybe more is better. What do you mean by "batch"?
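To make the "no training in advance" point concrete: at generation time, the text encoder is fine-tuned for just a handful of steps so that its output moves toward the saved aesthetic embedding. A toy sketch of that personalization loop follows — a plain `Linear` layer stands in for CLIP's real text transformer, and the learning rate, step count, and variable names are illustrative assumptions, not the repo's actual code:

```python
import torch

torch.manual_seed(0)
D = 512

# The saved .pt is just an averaged, unit-normalized CLIP image embedding.
aesthetic = torch.nn.functional.normalize(torch.randn(1, D), dim=-1)

# Stand-in "text encoder": the real method fine-tunes CLIP's text transformer.
encoder = torch.nn.Linear(D, D)
tokens = torch.randn(1, D)  # stand-in for the tokenized prompt

def similarity() -> float:
    """Cosine similarity between the current text embedding and the style."""
    with torch.no_grad():
        t = torch.nn.functional.normalize(encoder(tokens), dim=-1)
        return (t * aesthetic).sum().item()

sim_before = similarity()

opt = torch.optim.Adam(encoder.parameters(), lr=0.01)
for _ in range(50):  # only a few gradient steps at generation time
    text_emb = torch.nn.functional.normalize(encoder(tokens), dim=-1)
    loss = -(text_emb * aesthetic).sum()  # maximize cosine similarity
    opt.zero_grad()
    loss.backward()
    opt.step()

sim_after = similarity()  # the text embedding is now closer to the style
```

Since nothing in this loop depends on how many images went into the `.pt` (they were already averaged into one vector), the image count only affects how representative the style embedding is, not the runtime cost.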
