Right-protected data publishing with provable distance-based mining

Spyros I. Zoumpoulis, Michail Vlachos, Nikolaos Freris, Claudio Lucchese

Research output: Contribution to journal (Article)

Abstract

Protection of one's intellectual property is a topic with important technological and legal facets. We provide mechanisms for establishing the ownership of a dataset consisting of multiple objects. The algorithms also preserve properties of the dataset that are important for mining operations, and so guarantee both right protection and utility preservation. We consider a right-protection scheme based on watermarking. Because watermarking may distort the original distance graph, our watermarking methodology preserves important distance relationships, such as the Nearest Neighbors (NN) of each object and the Minimum Spanning Tree (MST) of the original dataset. This leads to preservation of any mining operation that depends on the ordering of distances between objects, such as NN-search and classification, as well as many visualization techniques. We prove fundamental lower and upper bounds on the distance between objects post-watermarking. In particular, we establish a restricted isometry property, i.e., tight bounds on the contraction/expansion of the original distances. We use this analysis to design fast algorithms for NN-preserving and MST-preserving watermarking that drastically prune the vast search space. We observe a speedup of two orders of magnitude over the exhaustive schemes, without any sacrifice in NN or MST preservation.
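
As a rough illustration of what NN and MST preservation mean, the following Python sketch checks by brute force whether a perturbed copy of a small point set keeps the same nearest neighbors and the same minimum spanning tree. It is not the watermarking algorithm of the paper: the function names, the toy points, and the hand-picked perturbations are all assumptions made for this example, and the perturbations merely stand in for a watermark.

```python
import math


def nearest_neighbors(points):
    """Return, for each point, the index of its nearest other point."""
    n = len(points)
    return [
        min((j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]))
        for i in range(n)
    ]


def mst_edges(points):
    """Return the MST edges (as frozensets of indices) via Prim's algorithm."""
    n = len(points)
    in_tree = {0}
    edges = set()
    while len(in_tree) < n:
        # Cheapest edge that connects the tree to a point outside it.
        i, j = min(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda e: math.dist(points[e[0]], points[e[1]]),
        )
        edges.add(frozenset((i, j)))
        in_tree.add(j)
    return edges


def preserves_nn_and_mst(original, perturbed):
    """Return (NN preserved?, MST preserved?) for two aligned point lists."""
    return (
        nearest_neighbors(original) == nearest_neighbors(perturbed),
        mst_edges(original) == mst_edges(perturbed),
    )


# Hypothetical toy data: four 2-D objects and two perturbed copies.
original = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (4.0, 1.5)]
mild = [(0.02, 0.0), (0.98, 0.01), (4.01, 0.0), (4.0, 1.52)]
strong = [(0.0, 0.0), (2.8, 0.0), (4.0, 0.0), (4.0, 1.5)]

print(preserves_nn_and_mst(original, mild))    # (True, True)
print(preserves_nn_and_mst(original, strong))  # (False, True)
```

The second perturbation moves one object far enough to change two nearest-neighbor relations while leaving the MST edges intact, which shows that the two properties are distinct and must be checked separately. An exhaustive check of this kind is the sort of cost the abstract's fast algorithms are meant to avoid.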

Original language: English (US)
Article number: 6529087
Pages (from-to): 2014-2028
Number of pages: 15
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 26
Issue number: 8
DOIs: https://doi.org/10.1109/TKDE.2013.90
State: Published - Jan 1 2014

Fingerprint

Watermarking
Intellectual property
Visualization

Keywords

  • minimum spanning tree (MST)
  • nearest neighbors (NN)
  • restricted isometry property (RIP)
  • Watermarking

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics

Cite this

Right-protected data publishing with provable distance-based mining. / Zoumpoulis, Spyros I.; Vlachos, Michail; Freris, Nikolaos; Lucchese, Claudio.

In: IEEE Transactions on Knowledge and Data Engineering, Vol. 26, No. 8, 6529087, 01.01.2014, p. 2014-2028.

Zoumpoulis, SI, Vlachos, M, Freris, N & Lucchese, C 2014, 'Right-protected data publishing with provable distance-based mining', IEEE Transactions on Knowledge and Data Engineering, vol. 26, no. 8, 6529087, pp. 2014-2028. https://doi.org/10.1109/TKDE.2013.90
Zoumpoulis, Spyros I. ; Vlachos, Michail ; Freris, Nikolaos ; Lucchese, Claudio. / Right-protected data publishing with provable distance-based mining. In: IEEE Transactions on Knowledge and Data Engineering. 2014 ; Vol. 26, No. 8. pp. 2014-2028.
@article{2f923fa2b49c414c8d5f3709716f6fe4,
title = "Right-protected data publishing with provable distance-based mining",
keywords = "minimum spanning tree (MST), nearest neighbors (NN), restricted isometry property (RIP), Watermarking",
author = "Zoumpoulis, {Spyros I.} and Michail Vlachos and Nikolaos Freris and Claudio Lucchese",
year = "2014",
month = "1",
day = "1",
doi = "10.1109/TKDE.2013.90",
language = "English (US)",
volume = "26",
pages = "2014--2028",
journal = "IEEE Transactions on Knowledge and Data Engineering",
issn = "1041-4347",
publisher = "IEEE Computer Society",
number = "8",

}

TY - JOUR
T1 - Right-protected data publishing with provable distance-based mining
AU - Zoumpoulis, Spyros I.
AU - Vlachos, Michail
AU - Freris, Nikolaos
AU - Lucchese, Claudio
PY - 2014/1/1
Y1 - 2014/1/1
KW - minimum spanning tree (MST)
KW - nearest neighbors (NN)
KW - restricted isometry property (RIP)
KW - Watermarking
UR - http://www.scopus.com/inward/record.url?scp=84904627055&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84904627055&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2013.90
DO - 10.1109/TKDE.2013.90
M3 - Article
AN - SCOPUS:84904627055
VL - 26
SP - 2014
EP - 2028
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
SN - 1041-4347
IS - 8
M1 - 6529087
ER -