# Awesome Machine Learning Reliability

*Figure from "Explaining and Harnessing Adversarial Examples" by Goodfellow et al., ICLR 2015.*

A curated list of awesome papers on machine learning reliability, inspired by Awesome Machine Learning On Source Code and Awesome Adversarial Machine Learning.

## Contents

- Conferences
- Blogs
- Papers
- Tools
- License

## Conferences

- ACM Conference on Computer and Communications Security (CCS)
- IEEE Symposium on Security and Privacy (S&P)
- USENIX Security Symposium (USENIX Security)
- The Network and Distributed System Security Symposium (NDSS)
- International Conference on Learning Representations (ICLR)
- Annual Conference on Neural Information Processing Systems (NeurIPS)
- International Conference on Machine Learning (ICML)
- Conference on Empirical Methods in Natural Language Processing (EMNLP)
- Annual Meeting of the Association for Computational Linguistics (ACL)

## Blogs

- Cleverhans
- Adversarial Robustness - Theory and Practice
- Gradient Science
- Attacking Machine Learning with Adversarial Examples (OpenAI)

## Papers

### Adversarial Attacks

- [ICLR14] Intriguing Properties of Neural Networks - Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus.
- [ICLR15] Explaining and Harnessing Adversarial Examples - Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. (The fast gradient sign method introduced here is sketched after this list.)
- [S&P17] Towards Evaluating the Robustness of Neural Networks - Nicholas Carlini and David Wagner. [code] [talk]
- [ICML18] Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples - Anish Athalye, Nicholas Carlini, and David Wagner. [code] [talk]
- [CVPR18] Fooling Vision and Language Models Despite Localization and Attention Mechanism - Xiaojun Xu, Xinyun Chen, Chang Liu, Anna Rohrbach, Trevor Darrell, and Dawn Song.
- [IJCAI17] Tactics of Adversarial Attack on Deep Reinforcement Learning Agents - Yen-Chen Lin, Zhang-Wei Hong, Yuan-Hong Liao, Meng-Li Shih, Ming-Yu Liu, and Min Sun.
- [EuroS&P16] The Limitations of Deep Learning in Adversarial Settings - Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z. Berkay Celik, and Ananthram Swami.
- [CVPR16] DeepFool: a simple and accurate method to fool deep neural networks - Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard.
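
To make the core idea concrete, here is a minimal sketch of the fast gradient sign method (FGSM) from the Goodfellow et al. (ICLR15) entry above, written in PyTorch; `model` is an assumed placeholder for any differentiable classifier, and inputs are assumed to lie in [0, 1].

```python
# Minimal FGSM sketch (Goodfellow et al., ICLR 2015). Assumes a PyTorch
# classifier `model` and inputs scaled to [0, 1]; epsilon bounds the
# L-infinity size of the perturbation.
import torch
import torch.nn.functional as F

def fgsm(model, x, y, epsilon=0.03):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # One step in the direction that increases the loss, then clip back
    # to the valid input range.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

Despite its simplicity, this one-step attack already fools many undefended image classifiers, which is what made the paper's observation so striking.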

### Black-box Attacks

- [Arxiv16] Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples - Nicolas Papernot, Patrick McDaniel, and Ian Goodfellow.
- [AISec17] ZOO: Zeroth Order Optimization based Black-box Attacks to Deep Neural Networks without Training Substitute Models - Pin-Yu Chen, Huan Zhang, Yash Sharma, Jinfeng Yi, and Cho-Jui Hsieh. (Its gradient estimate is sketched after this list.)
- [Arxiv17] Query-Efficient Black-box Adversarial Examples - Andrew Ilyas, Logan Engstrom, Anish Athalye, and Jessy Lin.
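
The ZOO entry above works from query access alone; a minimal sketch of its coordinate-wise finite-difference gradient estimate, with `loss_fn` standing in for the attacker's black-box objective:

```python
# Zeroth-order coordinate gradient estimate in the spirit of ZOO
# (Chen et al., AISec 2017). `loss_fn` is the black-box attack objective,
# queried at two symmetric points around x along coordinate i.
import numpy as np

def zoo_coordinate_grad(loss_fn, x, i, h=1e-4):
    e = np.zeros_like(x)
    e.flat[i] = h
    return (loss_fn(x + e) - loss_fn(x - e)) / (2 * h)
```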

### Physical-World Attacks and Natural Perturbations

- [Arxiv19] Natural Adversarial Examples - Dan Hendrycks, Kevin Zhao, Steven Basart, Jacob Steinhardt, and Dawn Song. [dataset]
- [CVPR18] Robust Physical-World Attacks on Deep Learning Models - Kevin Eykholt, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, and Dawn Song.
- [ICML18] Synthesizing Robust Adversarial Examples - Anish Athalye, Logan Engstrom, Andrew Ilyas, and Kevin Kwok. (The expectation-over-transformation loss is sketched after this list.)
- [CVPR17 Workshop] NO Need to Worry about Adversarial Examples in Object Detection in Autonomous Vehicles - Jiajun Lu, Hussein Sibai, Evan Fabry, and David Forsyth. [slides]
- [ICLR17] Adversarial Examples in the Physical World - Alexey Kurakin, Ian Goodfellow, and Samy Bengio.
- [ICLR19] Benchmarking Neural Network Robustness to Common Corruptions and Perturbations - Dan Hendrycks and Thomas Dietterich.
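
A sketch of the expectation-over-transformation (EOT) loss behind the Athalye et al. (ICML18) entry above: the attack optimizes the expected loss under random transformations so the perturbation survives them physically. `random_transform` is a hypothetical placeholder (e.g., random rotation or scaling).

```python
# Expectation-over-Transformation loss sketch (Athalye et al., ICML 2018).
# Averaging the loss over sampled transformations yields perturbations that
# remain adversarial under those transformations.
import torch
import torch.nn.functional as F

def eot_loss(model, x_adv, y, random_transform, n_samples=10):
    losses = [F.cross_entropy(model(random_transform(x_adv)), y)
              for _ in range(n_samples)]
    return torch.stack(losses).mean()
```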

### Defenses and Detection

- [ICLR18] Towards Deep Learning Models Resistant to Adversarial Attacks - Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. [code (mnist)] [code (cifar10)] (The PGD attack at its core is sketched after this list.)
- [CVPR18] Defense against Adversarial Attacks Using High-Level Representation Guided Denoiser - Fangzhou Liao, Ming Liang, Yinpeng Dong, Tianyu Pang, Xiaolin Hu, and Jun Zhu. [code]
- [Arxiv18] Adversarial Logit Pairing - Harini Kannan, Alexey Kurakin, and Ian Goodfellow. [code]
- [ICLR18] Generating Natural Adversarial Examples - Zhengli Zhao, Dheeru Dua, and Sameer Singh.
- [AISec17] Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods - Nicholas Carlini and David Wagner.
- [NDSS18] Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks - Weilin Xu, David Evans, and Yanjun Qi. (Sketched after this list.)
- [NeurIPS18] Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples - Guanhong Tao, Shiqing Ma, Yingqi Liu, and Xiangyu Zhang.
- [NDSS19] NIC: Detecting Adversarial Samples with Neural Network Invariant Checking - Shiqing Ma, Yingqi Liu, Guanhong Tao, Wen-Chuan Lee, and Xiangyu Zhang.
- [S&P16] Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks - Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, and Ananthram Swami.
- [ICLR18] Attacking Binarized Neural Networks - Angus Galloway, Graham W. Taylor, and Medhat Moussa.
- [ICLR19] Defensive Quantization: When Efficiency Meets Robustness - Ji Lin, Chuang Gan, and Song Han.
- [CCS17] MagNet: A Two-Pronged Defense against Adversarial Examples - Dongyu Meng and Hao Chen.
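
A minimal sketch of the projected gradient descent (PGD) attack used for adversarial training in the Madry et al. (ICLR18) entry above; training on `pgd(model, x, y)` instead of `x` gives the basic adversarial-training loop. The model is an assumed PyTorch classifier with inputs in [0, 1].

```python
# PGD attack sketch (Madry et al., ICLR 2018): iterated gradient-sign steps,
# each projected back into the epsilon L-infinity ball around the clean input.
import torch
import torch.nn.functional as F

def pgd(model, x, y, epsilon=0.03, alpha=0.007, steps=10):
    # Random start inside the epsilon-ball.
    x_adv = (x + torch.empty_like(x).uniform_(-epsilon, epsilon)).clamp(0.0, 1.0)
    for _ in range(steps):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        F.cross_entropy(model(x_adv), y).backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()
            x_adv = x + (x_adv - x).clamp(-epsilon, epsilon)  # project
            x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()
```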
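
And a sketch of the detection idea in the feature-squeezing entry above: compare the model's output on the raw input with its output on a "squeezed" copy, and flag large disagreement. Bit-depth reduction is one of several squeezers used in the paper; the threshold here is a tunable assumption.

```python
# Feature-squeezing detection sketch (Xu et al., NDSS 2018).
import numpy as np

def reduce_bit_depth(x, bits=4):
    # Quantize inputs in [0, 1] to 2**bits levels.
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def looks_adversarial(predict_proba, x, threshold=1.0):
    # A large L1 distance between predictions on the raw and squeezed
    # inputs is treated as evidence of an adversarial example.
    p_raw = predict_proba(x)
    p_squeezed = predict_proba(reduce_bit_depth(x))
    return np.abs(p_raw - p_squeezed).sum() > threshold
```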

### Natural Language and Speech

- [Arxiv18] Identifying and Controlling Important Neurons in Neural Machine Translation - Anthony Bau, Yonatan Belinkov, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, and James Glass.
- [Arxiv18] Robust Neural Machine Translation with Joint Textual and Phonetic Embedding - Hairong Liu, Mingbo Ma, Liang Huang, Hao Xiong, and Zhongjun He.
- [Arxiv18] Improving the Robustness of Speech Translation - Xiang Li, Haiyang Xue, Wei Chen, Yang Liu, Yang Feng, and Qun Liu.
- [Arxiv18] Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples - Minhao Cheng, Jinfeng Yi, Huan Zhang, Pin-Yu Chen, and Cho-Jui Hsieh.
- [Arxiv18] Greedy Attack and Gumbel Attack: Generating Adversarial Examples for Discrete Data - Puyudi Yang, Jianbo Chen, Cho-Jui Hsieh, Jane-Ling Wang, and Michael I. Jordan.
- [ICLR18] Synthetic and Natural Noise Both Break Neural Machine Translation - Yonatan Belinkov and Yonatan Bisk. (Character-swap noise is sketched after this list.)
- [ACL18] Towards Robust Neural Machine Translation - Yong Cheng, Zhaopeng Tu, Fandong Meng, Junjie Zhai, and Yang Liu.
- [ACL18] Did the Model Understand the Question? - Pramod Kaushik Mudrakarta, Ankur Taly, Mukund Sundararajan, and Kedar Dhamdhere.
- [ACL18] Trick Me If You Can: Adversarial Writing of Trivia Challenge Questions [Student Research Workshop] - Eric Wallace and Jordan Boyd-Graber.
- [EMNLP18] Generating Natural Language Adversarial Examples - Moustafa Alzantot, Yash Sharma, Ahmed Elgohary, Bo-Jhang Ho, Mani Srivastava, and Kai-Wei Chang.
- [NAACL18] Adversarial Example Generation with Syntactically Controlled Paraphrase Networks - Mohit Iyyer, John Wieting, Kevin Gimpel, and Luke Zettlemoyer.
- [COLING18] On Adversarial Examples for Character-Level Neural Machine Translation - Javid Ebrahimi, Daniel Lowd, and Dejing Dou.
- [ICLR17] Adversarial Training Methods for Semi-Supervised Text Classification - Takeru Miyato, Andrew M. Dai, and Ian Goodfellow.
- [EMNLP17] Adversarial Examples for Evaluating Reading Comprehension Systems - Robin Jia and Percy Liang.
- [MILCOM16] Crafting Adversarial Input Sequences for Recurrent Neural Networks - Nicolas Papernot, Patrick McDaniel, Ananthram Swami, and Richard Harang.
- [USENIX Security 16] Hidden Voice Commands - Nicholas Carlini, Pratyush Mishra, Tavish Vaidya, Yuankai Zhang, Micah Sherr, Clay Shields, David Wagner, and Wenchao Zhou. [talk]
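
A sketch of one synthetic noise type from the Belinkov & Bisk (ICLR18) entry above: swapping two adjacent inner characters per word, which alone is enough to sharply degrade character-level NMT.

```python
# Adjacent-character swap noise in the spirit of Belinkov & Bisk (ICLR 2018).
import random

def swap_noise(sentence, rng=random.Random(0)):
    noisy = []
    for word in sentence.split():
        if len(word) > 3:
            # Swap two adjacent characters, keeping the first and last intact.
            i = rng.randrange(1, len(word) - 2)
            word = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        noisy.append(word)
    return " ".join(noisy)

print(swap_noise("adversarial examples break translation models"))
```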

### Verification and Provable Defenses

- [ICML18] Differentiable Abstract Interpretation for Provably Robust Neural Networks - Matthew Mirman, Timon Gehr, and Martin Vechev.
- [ICML18] Provable defenses against adversarial examples via the convex outer adversarial polytope - Eric Wong and J. Zico Kolter. [code]
- [ICLR18] Certified Defenses against Adversarial Examples - Aditi Raghunathan, Jacob Steinhardt, and Percy Liang.
- [Arxiv18] On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models - Sven Gowal, Krishnamurthy Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Relja Arandjelovic, Timothy Mann, and Pushmeet Kohli. (One propagation step is sketched after this list.)
- [Arxiv18] Training for Faster Adversarial Robustness Verification via Inducing ReLU Stability - Kai Y. Xiao, Vincent Tjeng, Nur Muhammad Shafiullah, and Aleksander Madry.
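
A sketch of one interval-bound-propagation step from the Gowal et al. entry above: push elementwise lower and upper bounds through an affine layer and then a ReLU. Shapes and NumPy usage are illustrative assumptions, not a verified implementation.

```python
# One IBP step (Gowal et al., 2018): interval arithmetic through an affine
# layer, then a ReLU (monotone, so it maps bounds to bounds elementwise).
import numpy as np

def ibp_affine_relu(lower, upper, W, b):
    center = (upper + lower) / 2.0
    radius = (upper - lower) / 2.0
    new_center = W @ center + b
    new_radius = np.abs(W) @ radius  # worst case grows with |W|
    return (np.maximum(new_center - new_radius, 0.0),
            np.maximum(new_center + new_radius, 0.0))
```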

### Testing and Debugging

- [Arxiv19] Machine Learning Testing: Survey, Landscapes and Horizons - Jie M. Zhang, Mark Harman, Lei Ma, and Yang Liu.
- [FSE18] MODE: Automated Neural Network Model Debugging via State Differential Analysis and Input Selection - Shiqing Ma, Yingqi Liu, Wen-Chuan Lee, Xiangyu Zhang, and Ananth Grama.
- [Arxiv18] Testing Untestable Neural Machine Translation: An Industrial Case - Wujie Zheng, Wenyu Wang, Dian Liu, Changrong Zhang, Qinsong Zeng, Yuetang Deng, Wei Yang, Pinjia He, and Tao Xie.
- [ASE18] DeepGauge: Multi-Granularity Testing Criteria for Deep Learning Systems - Lei Ma, Felix Juefei-Xu, Fuyuan Zhang, Jiyuan Sun, Minhui Xue, Bo Li, Chunyang Chen, Ting Su, Li Li, Yang Liu, Jianjun Zhao, and Yadong Wang.
- [ICSE18] DeepTest: Automated Testing of Deep-Neural-Network-driven Autonomous Cars - Yuchi Tian, Kexin Pei, Suman Jana, and Baishakhi Ray.
- [SOSP17] DeepXplore: Automated Whitebox Testing of Deep Learning Systems - Kexin Pei, Yinzhi Cao, Junfeng Yang, and Suman Jana.
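
A sketch of the neuron-coverage metric introduced in the DeepXplore entry above: the fraction of neurons activated above a threshold by at least one test input. The `(n_inputs, n_neurons)` activation matrix is an assumed, pre-collected artifact from the model under test.

```python
# Neuron coverage sketch (Pei et al., SOSP 2017).
import numpy as np

def neuron_coverage(activations, threshold=0.0):
    # activations: (n_inputs, n_neurons) post-activation values.
    covered = (activations > threshold).any(axis=0)
    return float(covered.mean())
```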

### Interpretability

- [KDD16] "Why Should I Trust You?": Explaining the Predictions of Any Classifier - Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. [code] [slides] [video]
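
A from-scratch sketch of the local-surrogate idea behind LIME from the entry above (not the `lime` library's API): sample around the input, weight samples by proximity, and fit a weighted linear model whose coefficients act as local feature importances. scikit-learn's Ridge regression is used for the surrogate; `predict_proba` is the assumed black-box model.

```python
# LIME-style local explanation sketch (Ribeiro et al., KDD 2016).
import numpy as np
from sklearn.linear_model import Ridge

def lime_explain(predict_proba, x, n_samples=500, scale=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Perturb the instance and weight samples by proximity to it.
    samples = x + rng.normal(0.0, scale, size=(n_samples, x.shape[0]))
    weights = np.exp(-((samples - x) ** 2).sum(axis=1) / (2 * scale ** 2))
    # Fit a weighted linear surrogate to the black-box probabilities
    # of the class being explained (here, class 1).
    target = predict_proba(samples)[:, 1]
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(samples, target, sample_weight=weights)
    return surrogate.coef_  # per-feature local importance
```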

### Surveys and Studies

- [Arxiv17] Adversarial Examples: Attacks and Defenses for Deep Learning - Xiaoyong Yuan, Pan He, Qile Zhu, and Xiaolin Li.
- [Arxiv18] Adversarial Examples - A Complete Characterisation of the Phenomenon - Alexandru Constantin Serban and Erik Poll.
- [Arxiv18] Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey - Naveed Akhtar and Ajmal Mian.
- [Arxiv19] Adversarial Examples: Opportunities and Challenges - Jiliang Zhang and Chen Li.
- [ECCV18] Is Robustness the Cost of Accuracy? -- A Comprehensive Study on the Robustness of 18 Deep Image Classification Models - Dong Su, Huan Zhang, Hongge Chen, Jinfeng Yi, Pin-Yu Chen, and Yupeng Gao. [code]

### Malware

- [Arxiv17] Black-Box Attacks against RNN based Malware Detection Algorithms - Weiwei Hu and Ying Tan.

## Tools

- Trustworthy Machine Learning - A suite of tools for making machine learning secure and trustworthy.

## License

To the extent possible under law, Zhuangbin Chen has waived all copyright and related or neighboring rights to Awesome Machine Learning Reliability. This work is published from: China.