Abstract
Today, the data available to tackle many scientific challenges is vast in quantity and diverse in nature. The exploration of heterogeneous information spaces requires suitable mining algorithms as well as effective visual interfaces. Most existing systems concentrate either on mining algorithms or on visualization techniques. Although visual methods developed in information visualization have been helpful, improved understanding of a complex, large, high-dimensional dataset requires an effective projection of that dataset onto a lower-dimensional (2D or 3D) manifold. This paper introduces a flexible visual data mining framework that combines advanced projection algorithms developed in the machine learning domain with visual techniques developed in the information visualization domain. The framework follows Shneiderman’s mantra to provide an effective user interface. The advantage of such an interface is that the user is directly involved in the data mining process. We integrate principled projection methods, such as Generative Topographic Mapping (GTM) and Hierarchical GTM (HGTM), with powerful visual techniques, such as magnification factors, directional curvatures, parallel coordinates, billboarding, and user interaction facilities, to provide an integrated visual data mining framework. Results on a real-life high-dimensional dataset from the chemoinformatics domain are also reported and discussed. The projection results of GTM are analytically compared with those of other traditional projection methods, and HGTM is shown to provide additional value for large datasets. The computational complexity of these algorithms is discussed to demonstrate their suitability for the visual data mining framework.
Original language | English |
---|---|
Title of host publication | Workshop on Multimedia Data Mining “Merging Multimedia and Data Mining Research” |
Editors | Zhongfei Zhang, Florent Masseglia, Ramesh Jain, Alberto Del Bimbo |
Publisher | ACM |
Pages | 143-152 |
Number of pages | 10 |
Publication status | Published - 20 Aug 2006 |
Bibliographical note
© ACM, 2006. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Workshop on Multimedia Data Mining “Merging Multimedia and Data Mining Research”, 2006 http://www.fortune.binghamton.edu/MDM2006/ MDM/KDD2006 Seventh International Workshop on Multimedia Data Mining "Merging Multimedia and Data Mining Research" Held in conjunction with the KDD conference, 20 August 2006, Philadelphia (US)
Keywords
- visual data mining
- data visualization
- principled projection
- algorithms
- information visualization techniques