Subspace search and visualization to make sense of alternative clusterings in high-dimensional data

Andrada Tatu, Fabian Maaß, Ines Färber, Enrico Bertini, Tobias Schreck, Thomas Seidl, Daniel Keim

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    In explorative data analysis, the data under consideration often resides in a high-dimensional (HD) data space. Many methods are currently available to analyze this type of data. Automatic approaches proposed so far include dimensionality reduction and cluster analysis, whereas visual-interactive methods aim to provide effective visual mappings to show, relate, and navigate HD data. Furthermore, almost all of these methods conduct the analysis from a single perspective, meaning that they consider the data either in the original HD data space or in a reduced version thereof. Additionally, HD data spaces often consist of combined features that measure different properties, in which case the particular relationships between the various properties may not be clear to analysts a priori, since they can only be revealed if appropriate feature combinations (subspaces) of the data are taken into consideration. Considering just a single subspace is, however, often not sufficient, since different subspaces may show complementary, conjoint, or contradicting relations between data items. Useful information may consequently remain embedded in sets of subspaces of a given HD input data space. Relying on the notion of subspaces, we propose a novel method for the visual analysis of HD data in which we employ an interestingness-guided subspace search algorithm to detect a candidate set of subspaces. Based on appropriately defined subspace similarity functions, we visualize the subspaces and provide navigation facilities to interactively explore large sets of subspaces. Our approach allows users to effectively compare and relate subspaces with respect to involved dimensions and clusters of objects. We apply our approach to synthetic and real data sets. We thereby demonstrate its support for understanding HD data from different perspectives, effectively yielding a more complete view of HD data.
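The abstract says subspaces are compared "with respect to involved dimensions and clusters of objects" but does not state the similarity functions themselves. As a rough illustration only, the sketch below assumes a Jaccard index for both views: one over the sets of dimensions, one over the pairs of objects that share a cluster. The function names (`dimension_similarity`, `co_clustered_pairs`, `cluster_similarity`) and the choice of Jaccard are assumptions made for this example, not the definitions used in the paper.

```python
from itertools import combinations


def dimension_similarity(a, b):
    """Jaccard index of two subspaces given as sets of dimension indices.

    Assumption: the paper's dimension-based similarity may be defined differently.
    """
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def co_clustered_pairs(labels):
    """Set of object-index pairs that share a cluster label (None marks noise)."""
    return {
        (i, j)
        for i, j in combinations(range(len(labels)), 2)
        if labels[i] is not None and labels[i] == labels[j]
    }


def cluster_similarity(labels_a, labels_b):
    """Jaccard index over the co-clustered object pairs of two clusterings.

    Assumption: a stand-in for a cluster-based subspace similarity.
    """
    pairs_a = co_clustered_pairs(labels_a)
    pairs_b = co_clustered_pairs(labels_b)
    if not pairs_a and not pairs_b:
        return 1.0
    return len(pairs_a & pairs_b) / len(pairs_a | pairs_b)


# Two hypothetical subspaces sharing dimensions 1 and 2.
print(dimension_similarity({0, 1, 2}, {1, 2, 3}))  # 0.5

# Cluster labels of the same four objects in each subspace.
print(cluster_similarity([0, 0, 1, 1], [0, 0, 0, 1]))  # 0.25
```

Two subspaces can score high on one measure and low on the other, for example when they use different dimensions yet group the objects the same way, which is the kind of complementary or contradicting relation the abstract describes.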

    Original language: English (US)
    Title of host publication: IEEE Conference on Visual Analytics Science and Technology 2012, VAST 2012 - Proceedings
    Pages: 63-72
    Number of pages: 10
    DOIs: https://doi.org/10.1109/VAST.2012.6400488
    State: Published - 2012
    Event: 2012 IEEE Conference on Visual Analytics Science and Technology, VAST 2012 - Seattle, WA, United States
    Duration: Oct 14 2012 to Oct 19 2012

    Other

    Other: 2012 IEEE Conference on Visual Analytics Science and Technology, VAST 2012
    Country: United States
    City: Seattle, WA
    Period: 10/14/12 to 10/19/12

    Fingerprint

    Cluster analysis
    Navigation
    Visualization

    Keywords

    • H.2.8 [Database Applications]: Data mining
    • H.3.3 [Information Search and Retrieval]: Selection process
    • I.3.3 [Picture/Image Generation]: Display algorithms

    ASJC Scopus subject areas

    • Computer Science Applications
    • Computer Vision and Pattern Recognition

    Cite this

    Tatu, A., Maaß, F., Färber, I., Bertini, E., Schreck, T., Seidl, T., & Keim, D. (2012). Subspace search and visualization to make sense of alternative clusterings in high-dimensional data. In IEEE Conference on Visual Analytics Science and Technology 2012, VAST 2012 - Proceedings (pp. 63-72). [6400488] https://doi.org/10.1109/VAST.2012.6400488

    @inproceedings{ca7a58e40acb425e828b163b13ce21f6,
    title = "Subspace search and visualization to make sense of alternative clusterings in high-dimensional data",
    keywords = "H.2.8 [Database Applications]: Data mining, H.3.3 [Information Search and Retrieval]: Selection process, I.3.3 [Picture/Image Generation]: Display algorithms",
    author = "Andrada Tatu and Fabian Maa{\ss} and Ines F{\"a}rber and Enrico Bertini and Tobias Schreck and Thomas Seidl and Daniel Keim",
    year = "2012",
    doi = "10.1109/VAST.2012.6400488",
    language = "English (US)",
    isbn = "9781467347532",
    pages = "63--72",
    booktitle = "IEEE Conference on Visual Analytics Science and Technology 2012, VAST 2012 - Proceedings",

    }

    TY - GEN

    T1 - Subspace search and visualization to make sense of alternative clusterings in high-dimensional data

    AU - Tatu, Andrada

    AU - Maaß, Fabian

    AU - Färber, Ines

    AU - Bertini, Enrico

    AU - Schreck, Tobias

    AU - Seidl, Thomas

    AU - Keim, Daniel

    PY - 2012

    Y1 - 2012

    KW - H.2.8 [Database Applications]: Data mining

    KW - H.3.3 [Information Search and Retrieval]: Selection process

    KW - I.3.3 [Picture/Image Generation]: Display algorithms

    UR - http://www.scopus.com/inward/record.url?scp=84872923174&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84872923174&partnerID=8YFLogxK

    U2 - 10.1109/VAST.2012.6400488

    DO - 10.1109/VAST.2012.6400488

    M3 - Conference contribution

    SN - 9781467347532

    SP - 63

    EP - 72

    BT - IEEE Conference on Visual Analytics Science and Technology 2012, VAST 2012 - Proceedings

    ER -