
Journal article

Jacobian norm with Selective Input Gradient Regularization for interpretable adversarial defense

Deyin Liu, Lin Yuanbo Wu, Bo Li, Farid Boussaid, Mohammed Bennamoun, Xianghua Xie, Chengwu Liang, Yuanbo Wu

Pattern Recognition, Volume: 145, Start page: 109902

Swansea University Authors: Xianghua Xie, Yuanbo Wu

  • Proof under embargo until: 22nd August 2024


Published in: Pattern Recognition
ISSN: 0031-3203
Published: Elsevier BV 2024
Online Access: Check full text

URI: https://cronfa.swan.ac.uk/Record/cronfa64108
Abstract: Deep neural networks (DNNs) can be easily deceived by imperceptible alterations known as adversarial examples. These examples can lead to misclassification, posing a significant threat to the reliability of deep learning systems in real-world applications. Adversarial training (AT) is a popular technique used to enhance robustness by training models on a combination of corrupted and clean data. However, existing AT-based methods often struggle to handle transferred adversarial examples that can fool multiple defense models, thereby falling short of meeting the generalization requirements for real-world scenarios. Furthermore, AT typically fails to provide interpretable predictions, which are crucial for domain experts seeking to understand the behavior of DNNs. To overcome these challenges, we present a novel approach called Jacobian norm and Selective Input Gradient Regularization (J-SIGR). Our method leverages Jacobian normalization to improve robustness and introduces regularization of perturbation-based saliency maps, enabling interpretable predictions. By adopting J-SIGR, we achieve enhanced defense capabilities and promote high interpretability of DNNs. We evaluate the effectiveness of J-SIGR across various architectures by subjecting it to powerful adversarial attacks. Our experimental evaluations provide compelling evidence of the efficacy of J-SIGR against transferred adversarial attacks, while preserving interpretability. The project code can be found at https://github.com/Lywu-github/jJ-SIGR.git.
Keywords: Selective input gradient regularization, Jacobian normalization, Adversarial robustness
College: Faculty of Science and Engineering
Funders: This work was partially supported by NSFC U19A2073, 62002096, 62001394, 62372150, 62176086.
Start Page: 109902
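
The abstract above names two loss ingredients: a Jacobian-norm term for robustness and a regularizer on perturbation-based saliency maps for interpretability. For readers who want a concrete picture before visiting the linked repository, the following is a minimal PyTorch-style sketch of a combined objective in that spirit. It is an illustration under stated assumptions, not the authors' J-SIGR implementation: the function name jsigr_style_loss, the weights lambda_jac and lambda_sigr, the keep_ratio masking rule, and the single random-projection estimate of the Jacobian Frobenius norm are all hypothetical choices made here for brevity.

```python
import torch
import torch.nn.functional as F

def jsigr_style_loss(model, x, y, lambda_jac=0.01, lambda_sigr=0.1, keep_ratio=0.5):
    """Cross-entropy plus (a) a Jacobian-norm penalty and (b) a selective
    input-gradient penalty. Hypothetical sketch; not the paper's exact loss."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)

    # (a) Jacobian Frobenius-norm estimate via one random projection:
    # for v ~ N(0, I), E_v ||J^T v||_2^2 = ||J||_F^2, so a single sample
    # gives an unbiased estimate without forming the full Jacobian.
    v = torch.randn_like(logits)
    (jvp,) = torch.autograd.grad((logits * v).sum(), x, create_graph=True)
    jac_penalty = jvp.pow(2).flatten(1).sum(dim=1).mean()

    # (b) Selective input gradient regularization: penalize the loss
    # gradient only on the least-salient fraction of input features,
    # leaving the most salient (explanatory) pixels unconstrained.
    (g,) = torch.autograd.grad(ce, x, create_graph=True)
    sal = g.abs().flatten(1)                        # per-feature saliency
    k = max(1, int(keep_ratio * sal.size(1)))
    thresh = sal.kthvalue(k, dim=1, keepdim=True).values
    mask = (sal <= thresh).float()                  # low-saliency positions
    sigr_penalty = (mask * sal.pow(2)).sum(dim=1).mean()

    return ce + lambda_jac * jac_penalty + lambda_sigr * sigr_penalty
```

In an adversarial-training setting such as the one the abstract describes, the cross-entropy term would typically be evaluated on adversarially perturbed inputs rather than clean x; the two penalties then discourage large input-output sensitivity while keeping the saliency map concentrated on genuinely informative pixels. For the authors' actual objective, see the repository linked in the abstract.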