Use fewer instances of the letter "i": Toward writing style anonymization

Andrew W.E. McDonald, Sadia Afroz, Aylin Caliskan, Ariel Stolerman, Rachel Greenstadt

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    This paper presents Anonymouth, a novel framework for anonymizing writing style. Without accounting for style, anonymous authors risk identification. The framework provides a tool for testing the consistency of anonymized writing style and a mechanism for adaptive attacks against stylometry techniques. It defines the steps necessary to anonymize documents and implements them. A key contribution of this work is the framework itself, including novel methods for identifying which features of a document need to change and how they must be changed to anonymize it. In our experiment, 80% of the user study participants were able to anonymize their documents with respect to a fixed corpus and a limited feature set. However, modifying pre-written documents was found to be difficult, and the anonymization did not hold up against more extensive feature sets. Anonymouth is only a first step toward a tool that achieves stylometric anonymity with respect to state-of-the-art authorship attribution techniques; the topic needs further exploration before significant anonymity can be achieved.
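
As a rough illustration of what "identifying which features need to change and how" might look like for a single feature, the sketch below compares the relative frequency of one letter in a document against the average over a set of reference documents and phrases the result as a suggestion. It is not Anonymouth's actual method: the function names, the `tolerance` threshold, and the use of a plain mean as the target value are all assumptions made for illustration.

```python
from collections import Counter


def letter_frequency(text: str, letter: str) -> float:
    """Fraction of alphabetic characters in `text` equal to `letter` (case-insensitive)."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    if not letters:
        return 0.0
    return Counter(letters)[letter.lower()] / len(letters)


def suggest_change(document: str, reference_docs: list[str], letter: str,
                   tolerance: float = 0.005) -> str:
    """Suggest how to move one letter's frequency toward a target value.

    Assumption: the target is the mean frequency over `reference_docs`.
    A real anonymization tool would likely choose targets differently.
    """
    if not reference_docs:
        raise ValueError("reference_docs must not be empty")
    current = letter_frequency(document, letter)
    target = sum(letter_frequency(d, letter) for d in reference_docs) / len(reference_docs)
    if current > target + tolerance:
        return f'Use fewer instances of the letter "{letter}"'
    if current < target - tolerance:
        return f'Use more instances of the letter "{letter}"'
    return f'Keep your use of the letter "{letter}" as it is'


if __name__ == "__main__":
    doc = "i insist it is init"
    others = ["the quick brown fox", "jumps over a lazy dog"]
    print(suggest_change(doc, others, "i"))
```

A single letter frequency is only one of many possible stylometric features; the abstract's caveat that anonymization "did not hold up" against more extensive feature sets suggests why a one-feature check like this would not be sufficient on its own.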

    Original language: English (US)
    Title of host publication: Privacy Enhancing Technologies - 12th International Symposium, PETS 2012, Proceedings
    Pages: 299-318
    Number of pages: 20
    DOI: https://doi.org/10.1007/978-3-642-31680-7_16
    State: Published - Jul 30 2012
    Event: 12th International Symposium on Privacy Enhancing Technologies, PETS 2012 - Vigo, Spain
    Duration: Jul 11 2012 to Jul 13 2012

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 7384 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 12th International Symposium on Privacy Enhancing Technologies, PETS 2012
    Country: Spain
    City: Vigo
    Period: 7/11/12 to 7/13/12

    Keywords

    • anonymity
    • machine learning
    • privacy
    • stylometry

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • Computer Science (all)

    Cite this

    McDonald, A. W. E., Afroz, S., Caliskan, A., Stolerman, A., & Greenstadt, R. (2012). Use fewer instances of the letter "i": Toward writing style anonymization. In Privacy Enhancing Technologies - 12th International Symposium, PETS 2012, Proceedings (pp. 299-318). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7384 LNCS). https://doi.org/10.1007/978-3-642-31680-7_16

    @inproceedings{a829b38be4654f8ab13888d1bc87c5c3,
    title = "Use fewer instances of the letter {"}i{"}: Toward writing style anonymization",
    keywords = "anonymity, machine learning, privacy, stylometry",
    author = "McDonald, {Andrew W.E.} and Sadia Afroz and Aylin Caliskan and Ariel Stolerman and Rachel Greenstadt",
    year = "2012",
    month = "7",
    day = "30",
    doi = "10.1007/978-3-642-31680-7_16",
    language = "English (US)",
    isbn = "9783642316791",
    series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
    pages = "299--318",
    booktitle = "Privacy Enhancing Technologies - 12th International Symposium, PETS 2012, Proceedings",

    }

    TY - GEN
    T1 - Use fewer instances of the letter "i"
    T2 - Toward writing style anonymization
    AU - McDonald, Andrew W.E.
    AU - Afroz, Sadia
    AU - Caliskan, Aylin
    AU - Stolerman, Ariel
    AU - Greenstadt, Rachel
    PY - 2012/7/30
    Y1 - 2012/7/30
    KW - anonymity
    KW - machine learning
    KW - privacy
    KW - stylometry
    UR - http://www.scopus.com/inward/record.url?scp=84864225669&partnerID=8YFLogxK
    UR - http://www.scopus.com/inward/citedby.url?scp=84864225669&partnerID=8YFLogxK
    U2 - 10.1007/978-3-642-31680-7_16
    DO - 10.1007/978-3-642-31680-7_16
    M3 - Conference contribution
    AN - SCOPUS:84864225669
    SN - 9783642316791
    T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    SP - 299
    EP - 318
    BT - Privacy Enhancing Technologies - 12th International Symposium, PETS 2012, Proceedings
    ER -