Jacobian norm with Selective Input Gradient Regularization for interpretable adversarial defense

Liu, Deyin; Wu, Lin Yuanbo; Li, Bo; Boussaid, Farid; Bennamoun, Mohammed; Xie, Xianghua; Liang, Chengwu; Wu, Yuanbo

doi:10.1016/j.patcog.2023.109902

Journal article

Pattern Recognition, Volume: 145, Start page: 109902

Swansea University Authors: Xianghua Xie, Yuanbo Wu

DOI (Published version): 10.1016/j.patcog.2023.109902

Published in:	Pattern Recognition
ISSN:	0031-3203
Published:	Elsevier BV 2024
URI:	https://cronfa.swan.ac.uk/Record/cronfa64108

Abstract:	Deep neural networks (DNNs) can be easily deceived by imperceptible alterations known as adversarial examples. These examples can lead to misclassification, posing a significant threat to the reliability of deep learning systems in real-world applications. Adversarial training (AT) is a popular technique used to enhance robustness by training models on a combination of corrupted and clean data. However, existing AT-based methods often struggle to handle transferred adversarial examples that can fool multiple defense models, thereby falling short of meeting the generalization requirements for real-world scenarios. Furthermore, AT typically fails to provide interpretable predictions, which are crucial for domain experts seeking to understand the behavior of DNNs. To overcome these challenges, we present a novel approach called Jacobian norm and Selective Input Gradient Regularization (J-SIGR). Our method leverages Jacobian normalization to improve robustness and introduces regularization of perturbation-based saliency maps, enabling interpretable predictions. By adopting J-SIGR, we achieve enhanced defense capabilities and promote high interpretability of DNNs. We evaluate the effectiveness of J-SIGR across various architectures by subjecting it to powerful adversarial attacks. Our experimental evaluations provide compelling evidence of the efficacy of J-SIGR against transferred adversarial attacks, while preserving interpretability. The project code can be found at https://github.com/Lywu-github/jJ-SIGR.git.
Keywords:	Selective input gradient regularization, Jacobian normalization, Adversarial robustness
College:	Faculty of Science and Engineering
Funders:	This work was partially supported by NSFC U19A2073, 62002096, 62001394, 62372150, 62176086.
Start Page:	109902

Similar Items