
Journal article

Jacobian norm with Selective Input Gradient Regularization for interpretable adversarial defense

Deyin Liu, Lin Yuanbo Wu, Bo Li, Farid Boussaid, Mohammed Bennamoun, Xianghua Xie, Chengwu Liang, Yuanbo Wu

Pattern Recognition, Volume: 145, Start page: 109902

Swansea University Authors: Xianghua Xie, Yuanbo Wu

  • Proof under embargo until: 22nd August 2024

Abstract

Deep neural networks (DNNs) can be easily deceived by imperceptible alterations known as adversarial examples. These examples can lead to misclassification, posing a significant threat to the reliability of deep learning systems in real-world applications. Adversarial training (AT) is a popular technique used to enhance robustness by training models on a combination of corrupted and clean data. However, existing AT-based methods often struggle to handle transferred adversarial examples that can fool multiple defense models, thereby falling short of meeting the generalization requirements for real-world scenarios. Furthermore, AT typically fails to provide interpretable predictions, which are crucial for domain experts seeking to understand the behavior of DNNs. To overcome these challenges, we present a novel approach called Jacobian norm and Selective Input Gradient Regularization (J-SIGR). Our method leverages Jacobian normalization to improve robustness and introduces regularization of perturbation-based saliency maps, enabling interpretable predictions. By adopting J-SIGR, we achieve enhanced defense capabilities and promote high interpretability of DNNs. We evaluate the effectiveness of J-SIGR across various architectures by subjecting it to powerful adversarial attacks. Our experimental evaluations provide compelling evidence of the efficacy of J-SIGR against transferred adversarial attacks, while preserving interpretability. The project code can be found at https://github.com/Lywu-github/jJ-SIGR.git.

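The record reproduces only the abstract, not the paper's equations, so the following is a minimal PyTorch sketch of how the two regularizers named there — a Jacobian norm penalty and selective input gradient regularization — might be combined into one training objective. The function name `jsigr_loss`, the weights `lambda_jac` and `lambda_sigr`, the one-sample random-projection estimate of the Jacobian norm, and the top-k saliency selection rule are all illustrative assumptions, not the authors' formulation.

```python
import torch
import torch.nn.functional as F

def jsigr_loss(model, x, y, lambda_jac=0.01, lambda_sigr=0.1, k_frac=0.1):
    """Hypothetical J-SIGR-style objective: cross-entropy plus a Jacobian
    norm penalty and a selective input-gradient penalty. A sketch of the
    ideas named in the abstract, not the paper's exact loss."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)

    # Jacobian norm term: a one-sample random-projection estimate,
    # proportional to the squared Frobenius norm of d(logits)/dx.
    v = torch.randn_like(logits)
    v = v / v.norm(dim=1, keepdim=True)
    vjp = torch.autograd.grad((logits * v).sum(), x, create_graph=True)[0]
    jac = vjp.pow(2).flatten(1).sum(1).mean()

    # Selective input gradient regularization: penalize loss gradients
    # w.r.t. the input only *outside* the top-k saliency positions, so
    # the surviving saliency map stays sparse and readable.
    g = torch.autograd.grad(ce, x, create_graph=True)[0]
    sal = g.abs().flatten(1)
    k = max(1, int(k_frac * sal.size(1)))
    kth = sal.topk(k, dim=1).values[:, -1:]   # k-th largest per sample
    mask = (sal < kth).float()                # 1 where gradient is "noise"
    sigr = (sal * mask).pow(2).sum(1).mean()

    return ce + lambda_jac * jac + lambda_sigr * sigr
```

In a training loop this would simply replace the plain cross-entropy call: `loss = jsigr_loss(model, x, y); loss.backward(); optimizer.step()`. Tuning `lambda_jac` and `lambda_sigr` trades clean accuracy against robustness, mirroring the robustness/interpretability trade-off the abstract describes.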

Published in: Pattern Recognition
ISSN: 0031-3203
Published: Elsevier BV, 2024
DOI: 10.1016/j.patcog.2023.109902
Keywords: Selective input gradient regularization, Jacobian normalization, Adversarial robustness
Funding: This work was partially supported by NSFC U19A2073, 62002096, 62001394, 62372150, 62176086.
Online Access: http://dx.doi.org/10.1016/j.patcog.2023.109902

URI: https://cronfa.swan.ac.uk/Record/cronfa64108