Adversarial AI

#NotHotDog

img/nothotdog.jpg

Slides

Literature

Blog posts:

Exercise: Generating adversarial examples & LLM Prompt Injection

  • Use cleverhans: https://github.com/cleverhans-lab/cleverhans (Python/ML). Try out the MNIST tutorial with torch. Afterwards, experiment with other image classification datasets, other models, or attacks other than the fast gradient sign method (FGSM). How does the epsilon value in FGSM affect the attack and the resulting adversarial examples? Setup tip: if the tutorial setup is buggy, install torch, torchvision, and cleverhans with pip.
  • Try out LLM prompt injection attacks with https://gandalf.lakera.ai/ (Web). This is less technical than the exercise above. How many levels can you solve? What kinds of prompts worked?
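
To build some intuition for the epsilon question before running the cleverhans tutorial, here is a minimal sketch of FGSM on a toy logistic-regression classifier, written in plain Python. It is not the cleverhans API and not the tutorial's model: the weights `w`, bias `b`, input `x`, and the helper names `predict` and `fgsm` are all hypothetical, chosen only for illustration. FGSM moves every input feature by exactly `eps` in the direction of the sign of the loss gradient, so `eps` bounds the per-feature change (the L-infinity norm of the perturbation).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """One FGSM step: x_adv = x + eps * sign(dLoss/dx).

    For binary cross-entropy on logistic regression, the gradient of the
    loss with respect to the input is (p - y) * w.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical toy model and input (assumptions, not from the tutorial).
w = [2.0, -1.0, 0.5]
b = 0.0
x = [0.5, 0.2, 0.1]
y = 1  # true label

for eps in (0.0, 0.1, 0.3, 0.5):
    x_adv = fgsm(w, b, x, y, eps)
    # Confidence in the true class shrinks as eps grows.
    print(f"eps={eps:.1f}  p(class 1)={predict(w, b, x_adv):.3f}")
```

In this toy setup, a larger `eps` lowers the model's confidence in the true class until the prediction flips, at the cost of a more visible perturbation. For images you would typically also clip `x_adv` back to the valid pixel range, which this sketch leaves out.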

Target nets:

Dataset resources:

GAN resources:

https://machinelearningmastery.com/resources-for-getting-started-with-generative-adversarial-networks/

More interesting links

Attacking LLMs tutorial

https://gandalf.lakera.ai/