LAVA

Large-Scale Automated Vulnerability Addition

Brendan Dolan-Gavitt, Patrick Hulin, Engin Kirda, Tim Leek, Andrea Mambretti, Wil Robertson, Frederick Ulrich, Ryan Whelan

    Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

    Abstract

    Work on automating vulnerability discovery has long been hampered by a shortage of ground-truth corpora with which to evaluate tools and techniques. This lack of ground truth prevents authors and users of tools alike from being able to measure such fundamental quantities as miss and false alarm rates. In this paper, we present LAVA, a novel dynamic taint analysis-based technique for producing ground-truth corpora by quickly and automatically injecting large numbers of realistic bugs into program source code. Every LAVA bug is accompanied by an input that triggers it whereas normal inputs are extremely unlikely to do so. These vulnerabilities are synthetic but, we argue, still realistic, in the sense that they are embedded deep within programs and are triggered by real inputs. Using LAVA, we have injected thousands of bugs into eight real-world programs, including bash, tshark, and the GNU coreutils. In a preliminary evaluation, we found that a prominent fuzzer and a symbolic execution-based bug finder were able to locate some but not all LAVA-injected bugs, and that interesting patterns and pathologies were already apparent in their performance. Our work forms the basis of an approach for generating large ground-truth vulnerability corpora on demand, enabling rigorous tool evaluation and providing a high-quality target for tool developers.
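To make the abstract's two central ideas concrete, here is a minimal Python sketch. It is not taken from the paper and is not LAVA's implementation. The toy parser `parse_record`, the trigger constant `MAGIC`, and the helper `miss_and_false_alarm_rates` are all hypothetical names invented for illustration. The first function shows a bug that only one specific input triggers, while ordinary inputs pass through unharmed. The second shows how a corpus with known injected bugs lets a tool's miss and false alarm rates be computed, using one common definition of each rate as an assumption.

```python
import struct

# Hypothetical trigger value for this sketch (the ASCII bytes "lava").
MAGIC = 0x6C617661


def parse_record(data: bytes) -> int:
    """Toy parser: return the payload byte selected by a header index."""
    if len(data) < 8:
        raise ValueError("record too short")
    # Bytes 0-3: an input-derived field the toy program otherwise ignores.
    tag = struct.unpack_from("<I", data, 0)[0]
    # Byte 4 selects one of the three payload bytes; always in range.
    index = data[4] % 3
    payload = data[5:8]
    # Injected bug (illustrative): only when the input-controlled tag
    # equals MAGIC is the index pushed out of bounds.
    if tag == MAGIC:
        index += 1000
    return payload[index]  # raises IndexError only for the trigger input


def miss_and_false_alarm_rates(injected: set, reported: set) -> tuple:
    """Score a tool's reports against the known set of injected bug IDs.

    Assumed definitions: miss rate is the fraction of injected bugs not
    reported; false alarm rate is the fraction of reports that match no
    injected bug.
    """
    missed = injected - reported
    false_alarms = reported - injected
    miss_rate = len(missed) / len(injected) if injected else 0.0
    false_alarm_rate = len(false_alarms) / len(reported) if reported else 0.0
    return miss_rate, false_alarm_rate
```

In this sketch a random 4-byte tag matches `MAGIC` with probability 2^-32, which is the sense in which normal inputs are extremely unlikely to trigger the bug, and the accompanying triggering input serves as proof that the bug is real.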

    Original language: English (US)
    Title of host publication: Proceedings - 2016 IEEE Symposium on Security and Privacy, SP 2016
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 110-121
    Number of pages: 12
    ISBN (Electronic): 9781509008247
    DOI: https://doi.org/10.1109/SP.2016.15
    State: Published - Aug 16 2016
    Event: 2016 IEEE Symposium on Security and Privacy, SP 2016 - San Jose, United States
    Duration: May 23 2016 to May 25 2016

    Other

    Other: 2016 IEEE Symposium on Security and Privacy, SP 2016
    Country: United States
    City: San Jose
    Period: 5/23/16 to 5/25/16

    Fingerprint

    Pathology
    Dynamic analysis

    ASJC Scopus subject areas

    • Safety, Risk, Reliability and Quality
    • Computer Networks and Communications
    • Software

    Cite this

    Dolan-Gavitt, B., Hulin, P., Kirda, E., Leek, T., Mambretti, A., Robertson, W., ... Whelan, R. (2016). LAVA: Large-Scale Automated Vulnerability Addition. In Proceedings - 2016 IEEE Symposium on Security and Privacy, SP 2016 (pp. 110-121). [7546498] Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/SP.2016.15

