[6] Zhang, H., Yu, Y., Jiao, J., Xing, E. P., Ghaoui, L. E., & Jordan, M. I. (2019). Theoretically principled trade-off between robustness and accuracy. ICML .
(1) f3arwin requires more computational time than PGD-AT for large models (≈3× training slowdown due to population evaluation). (2) The attack may fail on models with extremely non-smooth decision boundaries where crossover becomes destructive. (3) For very high-dimensional inputs (e.g., 224×224×3), the perturbation search space remains challenging without dimensionality reduction. f3arwin
Author: (Generated for academic demonstration) Affiliation: AI Robustness Lab Date: April 17, 2026 Abstract The vulnerability of deep neural networks (DNNs) to adversarial examples—inputs perturbed imperceptibly to induce misclassification—remains a critical challenge for deploying AI in security-sensitive domains. Existing defense mechanisms, such as adversarial training, often rely on static threat models or gradient-based attacks, which can be circumvented by black-box or evolutionary search methods. This paper introduces f3arwin (Fast Flexible Evolutionary Framework for Adversarial Robustness Without Input Normalization), a novel framework that leverages genetic algorithms (GAs) to generate diverse, transferable adversarial perturbations and simultaneously harden DNNs against them. Unlike gradient-based approaches, f3arwin operates in a black-box setting, requires no differentiability of the target model, and adapts its mutation and crossover operators dynamically. We evaluate f3arwin on CIFAR-10 and ImageNet subsets, achieving a success rate of 94.2% against undefended ResNet-50 models and improving adversarial robustness by 37% after evolutionary defensive distillation. The results demonstrate that evolutionary robustness strategies offer a complementary, query-efficient alternative to gradient-based defenses. 1. Introduction Adversarial examples exploit the linearity and non-robust features of DNNs (Goodfellow et al., 2015; Ilyas et al., 2019). While gradient-based attacks (e.g., FGSM, PGD) are common, they assume white-box access and differentiable loss surfaces. Real-world systems often obscure gradients, and defenses like gradient masking can thwart these attacks. Evolutionary algorithms (EAs) require only final model outputs (scores or labels), making them ideal for black-box adversarial generation. [6] Zhang, H
[5] Su, J., Vargas, D. V., & Sakurai, K. (2018). One pixel attack for fooling deep neural networks. IEEE Transactions on Evolutionary Computation . While gradient-based attacks (e.g.
[2] Goodfellow, I. J., Shlens, J., & Szegedy, C. (2015). Explaining and harnessing adversarial examples. ICLR .
$$F(\delta) = \underbrace\mathbbI[f_\theta(x+\delta) \neq y] \cdot (1 - \textsoftmax(f_\theta(x+\delta)) y) \textMisclassification confidence - \lambda \cdot \frac_2\epsilon \sqrtd$$
50% Complete
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.