clip-hallucin-interrogator.py uses the original wordlists and, in addition, CLIP's own words, obtained via gradient ascent over many diverse images.
⚠️ The 'hallucin' words are NOT filtered. Will very likely contain sensitive / offensive / NSFW words that may be produced even if you have PG-13 images.
⚠️ Contains 'sneaky' offensive words that cannot be detected by simple NLP. Hence I cannot confidently provide a separate (safe, filtered) version.
🔎 A real example CLIP word: 'aggravfckremove' - that's 'aggravated' plus (look again) an expletive plus 'remove'. Yes. It's a CLIP concept of 'being angry and violent'.
⚠️ Alas, clip-hallucin-interrogator.py is for research / personal (and responsible) use only.
Remember that you can use --mode negative (choices=['best', 'classic', 'fast', 'negative']) to steer away from unwanted concepts with a negative prompt.
Consider replacing data/wCLIPy_negative.txt with a blacklist wordlist (search "blacklist words github"). But still, don't use this for publicly accessible stuff.
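If you do swap in a downloaded blacklist, here is a minimal sketch for normalizing it into a one-word-per-line negative wordlist (the function name and the assumed blacklist format - one word per line, `#` for comments - are illustrative; adapt to whatever list you download):

```python
from pathlib import Path

def make_negative_wordlist(blacklist_path, out_path="data/wCLIPy_negative.txt"):
    """Normalize a downloaded blacklist into a clean negative wordlist:
    lowercase, stripped, deduplicated, sorted, one word per line."""
    words = set()
    for line in Path(blacklist_path).read_text(encoding="utf-8").splitlines():
        w = line.strip().lower()
        if w and not w.startswith("#"):  # skip blanks and comment lines
            words.add(w)
    Path(out_path).write_text("\n".join(sorted(words)) + "\n", encoding="utf-8")
    return len(words)
```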
👉 Also remember: Every chosen word is the best match for the image (according to CLIP, at least; and CLIP guides your generative model).
☝️🤓
# Example usage:
# Fine-tuned SAE-CLIP (huggingface.co/zer0int), including additional 'trippywords':
python clip-hallucin-interrogator.py --output csv --outfile saeclipwords --image_folder images --m_clip zer0int/CLIP-SAE-ViT-L-14 --trippywords
# LongCLIP with a smaller batch size:
python clip-hallucin-interrogator.py --output csv --outfile longclipwords --image_folder images --m_clip zer0int/LongCLIP-SAE-ViT-L-14 --chunk_size 1024
# Fine-tuned GmP-CLIP with BLIP-2, save both .txt and .csv:
python clip-hallucin-interrogator.py --output both --outfile gmpblip2 --image_folder images --m_clip zer0int/CLIP-GmP-ViT-L-14 --m_caption blip2-2.7b
# Fine-tuned LongGmP-CLIP with BLIP-2 in 'fast' mode:
python clip-hallucin-interrogator.py --output csv --outfile longgmpblip2 --image_folder images --m_clip zer0int/CLIP-GmP-ViT-L-14 --m_caption blip2-2.7b --mode fast
# See all options!
python clip-hallucin-interrogator.py --help
👨💻🤖
DIY wordlist using CLIP gradient ascent (yes, all above caveats apply ⚠️):
🔎 Why? CLIP knows best what CLIP sees (and what it will subsequently guide a diffusion model into).
But gradient ascent is computationally expensive. Give it a few representative images, get a CLIP 'opinion', and re-use the words in CLIP-Interrogator for all images!✨
Usage: python diy-0-run-gradient-ascent.py --img_folder path/to/myimages (or --img_folder images as example)
Then: python diy-1-preprocess-words.py. Result: a .txt file with all the words.
Clean them (manually review, delete weird ones), and replace data/ownwords.txt with the file.
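Conceptually, the gradient ascent step optimizes text embeddings to maximize CLIP's image-text cosine similarity, then reads the result back as words. A dependency-light toy sketch of that principle (all names here are illustrative; the real script works through CLIP's actual encoders and tokenizer):

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def ascend(image_emb, steps=300, lr=0.1, seed=0):
    """Toy gradient ascent: rotate a random 'text embedding' toward a fixed
    'image embedding' by following the gradient of cosine similarity.
    (The real script optimizes token embeddings through CLIP's text encoder
    and decodes them back into tokens - the 'hallucin' words.)"""
    rng = np.random.default_rng(seed)
    t = rng.normal(size=image_emb.shape[0])
    t /= np.linalg.norm(t)
    v = image_emb / np.linalg.norm(image_emb)
    for _ in range(steps):
        grad = v - (t @ v) * t   # gradient of cos(t, v) for unit-norm t
        t += lr * grad
        t /= np.linalg.norm(t)   # keep t on the unit sphere
    return t

rng = np.random.default_rng(1)
img = rng.normal(size=8)      # stand-in for a CLIP image embedding
txt = ascend(img)             # 'text embedding' pulled toward the image
```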
🕵️
Optional: If you gave CLIP a lot of images, you'll likely have an overwhelming amount of words:
Use python diy-2-make-clusters-DBSCAN.py to leverage a clustering algorithm to sort them out.
Clustering with DBSCAN will do 80% of the job, but you'll still have to categorize and review the individual clusters manually.
[Or use the OpenAI API and ask GPT-4o or a similar SOTA model]. Ain't no simple NLP gonna understand CLIP's crazy words!
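The clustering step can be sketched like this, assuming scikit-learn; the 2-D points are toy stand-ins for real word embeddings, and eps / min_samples need tuning per dataset:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy stand-in: two tight concept clusters plus one outlier token.
# In practice you'd embed each gradient-ascent word (e.g. with CLIP's
# text encoder) and cluster those vectors instead.
words = ["cat", "dogs", "kitten", "skyline", "cityscape", "rooftops", "asdfghj"]
emb = np.array([
    [0.0, 0.1], [0.1, 0.0], [0.05, 0.05],   # 'animals'-like cluster
    [5.0, 5.1], [5.1, 5.0], [5.05, 5.05],   # 'city'-like cluster
    [20.0, -20.0],                          # noise token
])

labels = DBSCAN(eps=0.5, min_samples=2).fit_predict(emb)
clusters = {}
for word, label in zip(words, labels):
    clusters.setdefault(int(label), []).append(word)  # label -1 = noise
```

Points within eps of each other merge into a cluster; anything that can't reach min_samples neighbors gets label -1 - those are your 'asdfghj'-style noise words to delete.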
Example of CLIP words belonging to (I guess) an 'animals' concept cluster found with DBSCAN.
The script also saves plots - check them. Very scattered clusters? Likely noise words. Tight cluster that seems to include your concept (even if in a weird way)? Valid words.
Later clusters (higher number) are more likely to contain noise, e.g. 'asdfghj' (yes, that's a token in CLIP). Remove suspicious / unclear words.
Edit diy-1-preprocess-words.py to point to the folder with the edited clusters, then save the result to data/ownwords.txt.
Remember to use the argument --ownwords to include your DIY words.
If they don't show up in the result, they were worse than the other choices in CLIP Interrogator / didn't generalize to all of your images. Try again using more images.