A Concentration of Measure Approach to Database De-anonymization

Farhad Shirani, Siddharth Garg, Elza Erkip

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, matching of correlated high-dimensional databases is investigated. A stochastic database model is considered where the correlation among the database entries is governed by an arbitrary joint distribution. Concentration of measure theorems such as typicality and laws of large numbers are used to develop a database matching scheme and derive necessary conditions for successful matching. Furthermore, it is shown that these conditions are tight through a converse result which characterizes a set of distributions on the database entries for which reliable matching is not possible. The necessary and sufficient conditions for reliable matching are evaluated in the cases when the database entries are independent and identically distributed as well as under Markovian database models.

Original language: English (US)
Title of host publication: 2019 IEEE International Symposium on Information Theory, ISIT 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2748-2752
Number of pages: 5
ISBN (Electronic): 9781538692912
DOI: https://doi.org/10.1109/ISIT.2019.8849392
State: Published - Jul 2019
Event: 2019 IEEE International Symposium on Information Theory, ISIT 2019 - Paris, France
Duration: Jul 7 2019 → Jul 12 2019

Publication series

Name: IEEE International Symposium on Information Theory - Proceedings
Volume: 2019-July
ISSN (Print): 2157-8095

Conference

Conference: 2019 IEEE International Symposium on Information Theory, ISIT 2019
Country: France
City: Paris
Period: 7/7/19 → 7/12/19

Fingerprint

Concentration of Measure
Necessary Conditions
Law of large numbers
Joint Distribution
Converse
Identically distributed
High-dimensional
Sufficient Conditions
Arbitrary

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Information Systems
  • Modeling and Simulation
  • Applied Mathematics

Cite this

Shirani, F., Garg, S., & Erkip, E. (2019). A Concentration of Measure Approach to Database De-anonymization. In 2019 IEEE International Symposium on Information Theory, ISIT 2019 - Proceedings (pp. 2748-2752). [8849392] (IEEE International Symposium on Information Theory - Proceedings; Vol. 2019-July). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ISIT.2019.8849392

A Concentration of Measure Approach to Database De-anonymization. / Shirani, Farhad; Garg, Siddharth; Erkip, Elza.

2019 IEEE International Symposium on Information Theory, ISIT 2019 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. p. 2748-2752 8849392 (IEEE International Symposium on Information Theory - Proceedings; Vol. 2019-July).

Shirani, F, Garg, S & Erkip, E 2019, A Concentration of Measure Approach to Database De-anonymization. in 2019 IEEE International Symposium on Information Theory, ISIT 2019 - Proceedings, 8849392, IEEE International Symposium on Information Theory - Proceedings, vol. 2019-July, Institute of Electrical and Electronics Engineers Inc., pp. 2748-2752, 2019 IEEE International Symposium on Information Theory, ISIT 2019, Paris, France, 7/7/19. https://doi.org/10.1109/ISIT.2019.8849392
Shirani F, Garg S, Erkip E. A Concentration of Measure Approach to Database De-anonymization. In 2019 IEEE International Symposium on Information Theory, ISIT 2019 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2019. p. 2748-2752. 8849392. (IEEE International Symposium on Information Theory - Proceedings). https://doi.org/10.1109/ISIT.2019.8849392
Shirani, Farhad ; Garg, Siddharth ; Erkip, Elza. / A Concentration of Measure Approach to Database De-anonymization. 2019 IEEE International Symposium on Information Theory, ISIT 2019 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. pp. 2748-2752 (IEEE International Symposium on Information Theory - Proceedings).
@inproceedings{cb115f1e3bfa4456ab6f568ca52e04d7,
title = "A Concentration of Measure Approach to Database De-anonymization",
abstract = "In this paper, matching of correlated high-dimensional databases is investigated. A stochastic database model is considered where the correlation among the database entries is governed by an arbitrary joint distribution. Concentration of measure theorems such as typicality and laws of large numbers are used to develop a database matching scheme and derive necessary conditions for successful matching. Furthermore, it is shown that these conditions are tight through a converse result which characterizes a set of distributions on the database entries for which reliable matching is not possible. The necessary and sufficient conditions for reliable matching are evaluated in the cases when the database entries are independent and identically distributed as well as under Markovian database models.",
author = "Farhad Shirani and Siddharth Garg and Elza Erkip",
year = "2019",
month = "7",
doi = "10.1109/ISIT.2019.8849392",
language = "English (US)",
series = "IEEE International Symposium on Information Theory - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "2748--2752",
booktitle = "2019 IEEE International Symposium on Information Theory, ISIT 2019 - Proceedings",
}

TY - GEN

T1 - A Concentration of Measure Approach to Database De-anonymization

AU - Shirani, Farhad

AU - Garg, Siddharth

AU - Erkip, Elza

PY - 2019/7

Y1 - 2019/7

AB - In this paper, matching of correlated high-dimensional databases is investigated. A stochastic database model is considered where the correlation among the database entries is governed by an arbitrary joint distribution. Concentration of measure theorems such as typicality and laws of large numbers are used to develop a database matching scheme and derive necessary conditions for successful matching. Furthermore, it is shown that these conditions are tight through a converse result which characterizes a set of distributions on the database entries for which reliable matching is not possible. The necessary and sufficient conditions for reliable matching are evaluated in the cases when the database entries are independent and identically distributed as well as under Markovian database models.

UR - http://www.scopus.com/inward/record.url?scp=85073150095&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85073150095&partnerID=8YFLogxK

U2 - 10.1109/ISIT.2019.8849392

DO - 10.1109/ISIT.2019.8849392

M3 - Conference contribution

AN - SCOPUS:85073150095

T3 - IEEE International Symposium on Information Theory - Proceedings

SP - 2748

EP - 2752

BT - 2019 IEEE International Symposium on Information Theory, ISIT 2019 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

ER -