Highlights [0/2]

Related [17/53]

  • [1]
  • [2] DNNs combined with Gaussian processes are shown to be more robust to adversarial examples.
  • [3] proposes another adversarial attack method.
  • [4]
  • [5] assumes that training data may be adversarially manipulated by attackers, e.g., in spam/fraud/intrusion detection. They formulate this as a game between the defender and the attacker.
  • [6] leverages adversarial examples to help interpret the mechanisms of DNNs.
  • [7]
  • [9] masks off "redundant" features to defend against adversarial samples. However, this also limits the model's ability to generalize.
  • [10] estimates density ratios: the ratio between pairs of real samples is close to 1, while the ratio between real and adversarial samples is far from 1.
  • [11] hypothesizes that neural networks are too linear to resist linear adversarial perturbations, e.g., FGSM (a formula sketch follows this list).
  • [12]
  • [13] tries a (de-noising) autoencoder to recover adversarial samples. Although their experiments look promising, they only use MNIST as a benchmark (a sketch of this kind of pre-processing defense follows this list).
  • [14] trains an attack model end-to-end to automatically generate adversarial samples in black-box attack settings, instead of relying on the transferability of adversarial samples.
  • [15] implies that an ensemble of weak defenses is not sufficient to provide a strong defense against adversarial examples.
  • [16] proposes a taxonomy of adversarial machine learning and formulates it as a game between defender and attacker. It is a high-level discussion of adversarial machine learning.
  • [17] proposes a min-max training procedure to enhance model robustness: essentially, maximize the least perturbation needed to generate an adversarial sample from each data point (a training-loop sketch follows this list).
  • [18] generates adversarial paragraphs in a naive way.
  • [19]
  • [20]
  • [21]
  • [22]
  • [23]
  • [24]
  • [25]
  • [26]
  • [27]
  • [28]
  • [29]
  • [30]
  • [31]
  • [32]
  • [33]
  • [34]
  • [35]
  • [36]
  • [41] smooths out the gradient around data samples with a distillation technique, which enhances the model's resilience to adversarial noise with minimal impact on the model's performance (a note on the temperature softmax follows this list).
  • [38] shows that, with a small change in pixel intensity, most images in MNIST can be crafted into a desired target category different from their actual one.
  • [39] successfully applies the fast gradient method and the forward derivative method to RNN classifiers.
  • [40] introduces a black-box attack against oracle systems, i.e., settings where the attacker only has access to the target system's output, by leveraging the transferability of adversarial samples. It also demonstrates that the attack applies to non-DNN systems, specifically k-NN, although with a much lower success rate.
  • [37] shows that the transferability of adversarial samples is not limited to the same class of models, but rather extends across different model families, e.g., deep networks, support vector machines, logistic regression, decision trees, and ensembles.
  • [42]
  • [43]
  • [44]
  • [45]
  • [46]
  • [47]
  • [49] shows that adversarial images appear in large and dense regions of the pixel space.
  • [50]
  • [51]
  • [52]
  • [53]
  • [54]
  • [55]
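
For the FGSM entry above ([11]), a minimal formula sketch of the fast gradient sign method; the notation (loss J, parameters θ, label y, budget ε) is standard, not quoted from the paper:

```latex
% Fast gradient sign method (FGSM) from [11]: a single-step,
% max-norm-bounded perturbation of the input x along the sign
% of the loss gradient.
x_{\mathrm{adv}} = x + \epsilon \cdot \operatorname{sign}\!\left(\nabla_{x} J(\theta, x, y)\right)
```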
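
For the autoencoder defense of [13], a minimal PyTorch sketch of a denoising-autoencoder pre-processor; the class name, layer sizes, and flattened 784-dimensional input are illustrative assumptions, not details from the paper:

```python
import torch
import torch.nn as nn

class DenoisingDefense(nn.Module):
    """Sketch of a denoising-autoencoder pre-processor in the spirit of [13].

    The autoencoder is assumed to be trained to map noisy/adversarial
    inputs back to clean ones; layer sizes here are illustrative only.
    """

    def __init__(self, classifier: nn.Module, dim: int = 784, hidden: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(hidden, dim), nn.Sigmoid())
        self.classifier = classifier

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reconstruct ("denoise") the input, then classify the reconstruction.
        recon = self.decoder(self.encoder(x))
        return self.classifier(recon)
```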
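
For the min-max procedure of [17], a PyTorch-style sketch of generic min-max (adversarial) training with the inner maximization approximated by a single FGSM step; this illustrates the general idea rather than the exact algorithm of the paper, and `model`, `optimizer`, and `epsilon` are placeholders:

```python
import torch
import torch.nn.functional as F

def min_max_training_step(model, optimizer, x, y, epsilon=0.1):
    """One generic min-max training step (inner max ~ one FGSM step).

    A sketch of adversarial training in the spirit of [17]/[28], not the
    exact procedure of either paper; epsilon and the single-step inner
    maximization are illustrative assumptions.
    """
    # Inner maximization: craft a worst-case perturbation of x.
    x_adv = x.clone().detach().requires_grad_(True)
    inner_loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(inner_loss, x_adv)[0]
    x_adv = (x + epsilon * grad.sign()).detach()

    # Outer minimization: update the model on the perturbed inputs.
    optimizer.zero_grad()
    outer_loss = F.cross_entropy(model(x_adv), y)
    outer_loss.backward()
    optimizer.step()
    return outer_loss.item()
```

In practice the inner maximization is often run for several projected gradient steps rather than one, as in [28].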
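
For the distillation defense of [41], the key ingredient is a softmax at an elevated temperature T, which softens the output distribution and flattens the gradients an attacker exploits; a formula sketch over the logits z(x):

```latex
% Softmax at temperature T used in defensive distillation [41];
% the distilled network is trained on soft labels produced at the
% same high T and then deployed at T = 1.
F_i(x) = \frac{\exp\left(z_i(x)/T\right)}{\sum_{j} \exp\left(z_j(x)/T\right)}
```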

References

[1] Shumeet Baluja and Ian Fischer. Adversarial transformation networks: Learning to generate adversarial examples. abs/1703.09387, 2017. [ http ]
[2] John Bradshaw, Alexander G de G Matthews, and Zoubin Ghahramani. Adversarial examples, uncertainty, and transfer testing robustness in Gaussian process hybrid deep networks. 2017.
[3] Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. abs/1608.04644, 2016. [ http ]
[4] Moustapha Cisse, Piotr Bojanowski, Edouard Grave, Yann Dauphin, and Nicolas Usunier. Parseval networks: Improving robustness to adversarial examples. In International Conference on Machine Learning, pages 854--863, 2017.
[5] Nilesh Dalvi, Pedro Domingos, Sumit Sanghai, Deepak Verma, et al. Adversarial classification. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 99--108. ACM, 2004.
[6] Yinpeng Dong, Hang Su, Jun Zhu, and Fan Bao. Towards interpretable deep neural networks by leveraging adversarial examples. 2017.
[7] Alhussein Fawzi, Omar Fawzi, and Pascal Frossard. Analysis of classifiers' robustness to adversarial perturbations. abs/1502.02590, 2015. [ http ]
[8] Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, and Pascal Frossard. Robustness of classifiers: from adversarial to random noise. In Advances in Neural Information Processing Systems, pages 1624--1632, 2016.
[9] Ji Gao, Beilun Wang, Zeming Lin, Weilin Xu, and Yanjun Qi. Deepcloak: Masking deep neural network models for robustness against adversarial samples. 2017.
[10] Lovedeep Gondara. Detecting adversarial samples using density ratio estimates. 2017.
[11] I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and Harnessing Adversarial Examples. December 2014. [ arXiv ]
[12] Kathrin Grosse, Praveen Manoharan, Nicolas Papernot, Michael Backes, and Patrick McDaniel. On the (statistical) detection of adversarial examples. 2017.
[13] Shixiang Gu and Luca Rigazio. Towards deep neural network architectures robust to adversarial examples. abs/1412.5068, 2014. [ http ]
[14] Jamie Hayes and George Danezis. Machine learning as an adversarial service: Learning black-box adversarial examples. 2017.
[15] Warren He, James Wei, Xinyun Chen, Nicholas Carlini, and Dawn Song. Adversarial example defenses: Ensembles of weak defenses are not strong. 2017.
[16] Ling Huang, Anthony D Joseph, Blaine Nelson, Benjamin IP Rubinstein, and JD Tygar. Adversarial machine learning. In Proceedings of the 4th ACM workshop on Security and artificial intelligence, pages 43--58. ACM, 2011.
[17] Ruitong Huang, Bing Xu, Dale Schuurmans, and Csaba Szepesvári. Learning with a strong adversary. abs/1511.03034, 2015. [ http ]
[18] Robin Jia and Percy Liang. Adversarial examples for evaluating reading comprehension systems. 2017.
[19] J. Kos, I. Fischer, and D. Song. Adversarial Examples for Generative Models. February 2017. [ arXiv ]
[20] J. Kos and D. Song. Delving Into Adversarial Attacks on Deep Policies. May 2017. [ arXiv ]
[21] A. Kurakin, I. Goodfellow, and S. Bengio. Adversarial Examples in the Physical World. July 2016. [ arXiv ]
[22] Alexey Kurakin, Ian J. Goodfellow, and Samy Bengio. Adversarial machine learning at scale. abs/1611.01236, 2016. [ http ]
[23] Bin Liang, Hongcheng Li, Miaoqiang Su, Xirong Li, Wenchang Shi, and Xiaofeng Wang. Detecting adversarial examples in deep networks with adaptive noise reduction. 2017.
[24] Yen-Chen Lin, Zhang-Wei Hong, Yuan-Hong Liao, Meng-Li Shih, Ming-Yu Liu, and Min Sun. Tactics of adversarial attack on deep reinforcement learning agents. 2017.
[25] Yanpei Liu, Xinyun Chen, Chang Liu, and Dawn Song. Delving into transferable adversarial examples and black-box attacks. abs/1611.02770, 2017. [ http ]
[26] Daniel Lowd and Christopher Meek. Adversarial learning. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pages 641--647. ACM, 2005.
[27] Jiajun Lu, Theerasit Issaranon, and David Forsyth. Safetynet: Detecting and rejecting adversarial examples robustly. 2017.
[28] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. 2017.
[29] Dongyu Meng and Hao Chen. Magnet: a two-pronged defense against adversarial examples. 2017.
[30] Jan Hendrik Metzen, Tim Genewein, Volker Fischer, and Bastian Bischoff. On detecting adversarial perturbations. 2017.
[31] Jan Hendrik Metzen, Mummadi Chaithanya Kumar, Thomas Brox, and Volker Fischer. Universal adversarial perturbations against semantic image segmentation. 2017.
[32] Takeru Miyato, Shin-ichi Maeda, Masanori Koyama, Ken Nakae, and Shin Ishii. Distributional smoothing with virtual adversarial training. 2015.
[33] Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, and Pascal Frossard. Universal adversarial perturbations. 2016.
[34] Konda Reddy Mopuri, Utsav Garg, and R Venkatesh Babu. Fast feature fool: A data independent approach to universal adversarial perturbations. 2017.
[35] Taesik Na, Jong Hwan Ko, and Saibal Mukhopadhyay. Cascade adversarial machine learning regularized with a unified embedding. 2017.
[36] Andrew P Norton and Yanjun Qi. Adversarial-playground: A visualization suite showing how adversarial examples fool deep learning. 2017.
[37] N. Papernot, P. McDaniel, and I. Goodfellow. Transferability in Machine Learning: From Phenomena to Black-Box Attacks Using Adversarial Samples. May 2016. [ arXiv ]
[38] Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z. Berkay Celik, and Ananthram Swami. The limitations of deep learning in adversarial settings. abs/1511.07528, 2015. [ http ]
[39] Nicolas Papernot, Patrick McDaniel, Ananthram Swami, and Richard E. Harang. Crafting adversarial input sequences for recurrent neural networks. abs/1604.08275, 2016. [ http ]
[40] Nicolas Papernot, Patrick Drew McDaniel, Ian J. Goodfellow, Somesh Jha, Z. Berkay Celik, and Ananthram Swami. Practical black-box attacks against deep learning systems using adversarial examples. abs/1602.02697, 2016. [ http ]
[41] Nicolas Papernot, Patrick Drew McDaniel, Xi Wu, Somesh Jha, and Ananthram Swami. Distillation as a defense to adversarial perturbations against deep neural networks. abs/1511.04508, 2015. [ http ]
[42] Sungrae Park, Jun-Keon Park, Su-Jin Shin, and Il-Chul Moon. Adversarial dropout for supervised and semi-supervised learning. 2017.
[43] Andras Rozsa, Manuel Günther, and Terrance E Boult. Adversarial robustness: Softmax versus openmax. 2017.
[44] Sara Sabour, Yanshuai Cao, Fartash Faghri, and David J Fleet. Adversarial manipulation of deep representations. 2015.
[45] Suranjana Samanta and Sameep Mehta. Towards crafting text adversarial samples. 2017.
[46] Sailik Sengupta, Tathagata Chakraborti, and Subbarao Kambhampati. Securing deep neural nets against adversarial attacks with moving target defense. 2017.
[47] Chang Song, Hsin-Pai Cheng, Chunpeng Wu, Hai Li, Yiran Chen, and Qing Wu. A multi-strength adversarial training method to mitigate adversarial attacks. 2017.
[48] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian J. Goodfellow, and Rob Fergus. Intriguing properties of neural networks. abs/1312.6199, 2013. [ http ]
[49] Pedro Tabacof and Eduardo Valle. Exploring the space of adversarial images. 2015.
[50] Florian Tramèr, Alexey Kurakin, Nicolas Papernot, Dan Boneh, and Patrick McDaniel. Ensemble adversarial training: Attacks and defenses. 2017.
[51] Florian Tramèr, Nicolas Papernot, Ian Goodfellow, Dan Boneh, and Patrick McDaniel. The space of transferable adversarial examples. 2017.
[52] Beilun Wang, Ji Gao, and Yanjun Qi. A theoretical framework for robustness of (deep) classifiers under adversarial noise. 2016.
[53] Y. Wang, S. Jha, and K. Chaudhuri. Analyzing the Robustness of Nearest Neighbors to Adversarial Examples. June 2017. [ arXiv ]
[54] Cihang Xie, Jianyu Wang, Zhishuai Zhang, Yuyin Zhou, Lingxi Xie, and Alan Yuille. Adversarial examples for semantic segmentation and object detection. 2017.
[55] Huan Xu, Constantine Caramanis, and Shie Mannor. Robustness and regularization of support vector machines. Journal of Machine Learning Research, 10(Jul):1485--1510, 2009.