Fine-pruning: Defending against backdooring attacks on deep neural networks

Kang Liu, Brendan Dolan-Gavitt, Siddharth Garg

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Deep neural networks (DNNs) provide excellent performance across a wide range of classification tasks, but their training requires high computational resources and is often outsourced to third parties. Recent work has shown that outsourced training introduces the risk that a malicious trainer will return a backdoored DNN that behaves normally on most inputs but causes targeted misclassifications or degrades the accuracy of the network when a trigger known only to the attacker is present. In this paper, we provide the first effective defenses against backdoor attacks on DNNs. We implement three backdoor attacks from prior work and use them to investigate two promising defenses, pruning and fine-tuning. We show that neither, by itself, is sufficient to defend against sophisticated attackers. We then evaluate fine-pruning, a combination of pruning and fine-tuning, and show that it successfully weakens or even eliminates the backdoors, i.e., in some cases reducing the attack success rate to 0% with only a 0.4% drop in accuracy for clean (non-triggering) inputs. Our work provides the first step toward defenses against backdoor attacks in deep neural networks.
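The defense summarized above can be illustrated with a small sketch. The snippet below shows only the pruning half of fine-pruning, in plain NumPy: channels that stay dormant on clean (trusted) inputs are masked off, on the intuition that backdoor behavior hides in neurons that clean data rarely exercises; in the full defense the pruned network is then fine-tuned on clean data to recover accuracy and disrupt surviving backdoor neurons. The function name, array shapes, and pruning fraction here are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def fine_prune_mask(activations, prune_frac):
    """Mask off the channels least active on clean inputs.

    activations: (n_samples, n_channels) activations of the layer to
        prune, recorded while running clean (trusted) validation data.
    prune_frac: fraction of channels to disable.
    Returns a 0/1 mask over channels.
    """
    mean_act = activations.mean(axis=0)      # average activation per channel
    n_prune = int(len(mean_act) * prune_frac)
    order = np.argsort(mean_act)             # least active channels first
    mask = np.ones_like(mean_act)
    mask[order[:n_prune]] = 0.0              # dormant channels are pruned
    return mask

# Toy example: channel 0 is nearly dormant on clean data (a backdoor
# candidate), while channels 1-3 fire normally.
acts = np.array([[0.01, 0.9, 0.5, 0.7],
                 [0.02, 0.8, 0.6, 0.9]])
mask = fine_prune_mask(acts, prune_frac=0.25)
# mask -> [0., 1., 1., 1.]: the dormant channel is disabled
```

After applying such a mask to the layer's outputs, the fine-tuning step would continue training the masked network on clean data only, which is what distinguishes fine-pruning from pruning alone.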

Original language: English (US)
Title of host publication: Research in Attacks, Intrusions, and Defenses - 21st International Symposium, RAID 2018, Proceedings
Editors: Michael Bailey, Sotiris Ioannidis, Manolis Stamatogiannakis, Thorsten Holz
Publisher: Springer-Verlag
Pages: 273-294
Number of pages: 22
ISBN (Print): 9783030004699
DOI: 10.1007/978-3-030-00470-5_13
State: Published - Jan 1 2018
Event: 21st International Symposium on Research in Attacks, Intrusions and Defenses, RAID 2018 - Heraklion, Greece
Duration: Sep 10 2018 - Sep 12 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11050 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Keywords

  • Backdoor
  • Deep learning
  • Fine-tuning
  • Pruning
  • Trojan

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Liu, K., Dolan-Gavitt, B., & Garg, S. (2018). Fine-pruning: Defending against backdooring attacks on deep neural networks. In M. Bailey, S. Ioannidis, M. Stamatogiannakis, & T. Holz (Eds.), Research in Attacks, Intrusions, and Defenses - 21st International Symposium, RAID 2018, Proceedings (pp. 273-294). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11050 LNCS). Springer-Verlag. https://doi.org/10.1007/978-3-030-00470-5_13
