Heterogeneous clustering ensemble method for combining different cluster results

Hye Sung Yoon, Sun Young Ahn, Sang Ho Lee, Sung Bum Cho, Ju Han Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review

31 Citations (Scopus)

Abstract

Biological data sets have been growing rapidly with the technological advances in bioinformatics. Data mining techniques have been used extensively to detect interesting patterns in large databases. In bioinformatics, clustering techniques can be applied to find underlying genetic and biological interactions without using prior information about the data sets. However, many clustering algorithms are available in practice, and different algorithms may generate dissimilar clustering results because of bio-data characteristics and experimental assumptions. In this paper, we propose a novel heterogeneous clustering ensemble scheme that uses a genetic algorithm to generate high-quality, robust clustering results that reflect the characteristics of bio-data. The proposed method combines the results of various clustering algorithms with the crossover operation of a genetic algorithm, and is founded on the concept of using evolutionary processes to select the most commonly inherited characteristics. Our framework proved applicable to a real data set, and the optimal clustering results generated by the proposed method are detailed in this paper. Experimental results demonstrate that the proposed method yields better clustering results than applying the single best clustering algorithm.
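
The idea in the abstract, combining the outputs of several clustering algorithms through genetic crossover, can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the authors' published algorithm: this page does not give their encoding, fitness function or operators. The label-vector encoding, the mean Rand index fitness, the greedy label alignment and all function names (`rand_index`, `fitness`, `align`, `crossover`, `evolve`) are hypothetical choices made only for this example.

```python
import random
from itertools import combinations


def rand_index(a, b):
    """Fraction of item pairs on which two partitions agree (same vs. different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def fitness(candidate, base_partitions):
    """Assumed fitness: mean Rand index between the candidate and every base partition."""
    return sum(rand_index(candidate, p) for p in base_partitions) / len(base_partitions)


def align(reference, other):
    """Rename each label of `other` to the label of `reference` it overlaps most.

    Greedy and not necessarily one-to-one; good enough for a sketch.
    """
    mapping = {}
    for label in set(other):
        members = [reference[i] for i, lab in enumerate(other) if lab == label]
        mapping[label] = max(set(members), key=members.count)
    return [mapping[lab] for lab in other]


def crossover(parent_a, parent_b, rng):
    """One-point crossover on label vectors, after aligning parent_b to parent_a."""
    aligned_b = align(parent_a, parent_b)
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + aligned_b[point:]


def evolve(base_partitions, generations=50, seed=0):
    """Start from the base clusterings and keep the fittest offspring (elitist)."""
    rng = random.Random(seed)
    population = [list(p) for p in base_partitions]
    size = len(population)
    for _ in range(generations):
        parent_a, parent_b = rng.sample(population, 2)
        population.append(crossover(parent_a, parent_b, rng))
        # Sort by fitness and drop the weakest individual, so the best one always survives.
        population.sort(key=lambda p: fitness(p, base_partitions), reverse=True)
        population = population[:size]
    return population[0]


if __name__ == "__main__":
    # Three hypothetical clusterings of six items, e.g. from three different algorithms.
    base = [[0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 2], [0, 0, 1, 1, 1, 1]]
    best = evolve(base, generations=20, seed=1)
    print(best, fitness(best, base))
```

Because the initial population is the set of base clusterings and replacement is elitist, the returned partition is never less consistent with the ensemble (under this assumed fitness) than the best single input clustering.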

Original language: English
Title of host publication: Data Mining for Biomedical Applications - PAKDD 2006 Workshop, BioDM 2006, Proceedings
Pages: 82-92
Number of pages: 11
DOI: https://doi.org/10.1007/11691730_9
State: Published - 14 Jul 2006
Event: PAKDD 2006 Workshop on Data Mining for Biomedical Applications, BioDM 2006 - Singapore, Singapore
Duration: 9 Apr 2006 to 9 Apr 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3916 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: PAKDD 2006 Workshop on Data Mining for Biomedical Applications, BioDM 2006
Country: Singapore
City: Singapore
Period: 9/04/06 to 9/04/06

Fingerprint

Ensemble Methods
Clustering Methods
Clustering Algorithms
Clustering
Bioinformatics
Data Mining
Genetic Algorithms
Prior Information
Crossover
Ensemble
Experimental Results
Interaction
Demonstrate

Cite this

Yoon, H. S., Ahn, S. Y., Lee, S. H., Cho, S. B., & Kim, J. H. (2006). Heterogeneous clustering ensemble method for combining different cluster results. In Data Mining for Biomedical Applications - PAKDD 2006 Workshop, BioDM 2006, Proceedings (pp. 82-92). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3916 LNBI). https://doi.org/10.1007/11691730_9
Yoon, Hye Sung ; Ahn, Sun Young ; Lee, Sang Ho ; Cho, Sung Bum ; Kim, Ju Han. / Heterogeneous clustering ensemble method for combining different cluster results. Data Mining for Biomedical Applications - PAKDD 2006 Workshop, BioDM 2006, Proceedings. 2006. pp. 82-92 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{92748d89dce540d2b5b87a00aff232c6,
title = "Heterogeneous clustering ensemble method for combining different cluster results",
abstract = "Biological data set sizes have been growing rapidly with the technological advances that have occurred in bioinformatics. Data mining techniques have been used extensively as approaches to detect interesting patterns in large databases. In bioinformatics, clustering algorithm technique for data mining can be applied to find underlying genetic and biological interactions, without considering prior information from datasets. However, many clustering algorithms are practically available, and different clustering algorithms may generate dissimilar clustering results due to bio-data characteristics and experimental assumptions. In this paper, we propose a novel heterogeneous clustering ensemble scheme that uses a genetic algorithm to generate high quality and robust clustering results with characteristics of bio-data. The proposed method combines results of various clustering algorithms and crossover operation of genetic algorithm, and is founded on the concept of using the evolutionary processes to select the most commonly-inherited characteristics. Our framework proved to be available on real data set and the optimal clustering results generated by means of our proposed method are detailed in this paper. Experimental results demonstrate that the proposed method yields better clustering results than applying a single best clustering algorithm.",
author = "Yoon, {Hye Sung} and Ahn, {Sun Young} and Lee, {Sang Ho} and Cho, {Sung Bum} and Kim, {Ju Han}",
year = "2006",
month = "7",
day = "14",
doi = "10.1007/11691730_9",
language = "English",
isbn = "3540331042",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
volume = "3916 LNBI",
pages = "82--92",
booktitle = "Data Mining for Biomedical Applications - PAKDD 2006 Workshop, BioDM 2006, Proceedings",
}

Yoon, HS, Ahn, SY, Lee, SH, Cho, SB & Kim, JH 2006, Heterogeneous clustering ensemble method for combining different cluster results. in Data Mining for Biomedical Applications - PAKDD 2006 Workshop, BioDM 2006, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 3916 LNBI, pp. 82-92, PAKDD 2006 Workshop on Data Mining for Biomedical Applications, BioDM 2006, Singapore, Singapore, 9/04/06. https://doi.org/10.1007/11691730_9

Heterogeneous clustering ensemble method for combining different cluster results. / Yoon, Hye Sung; Ahn, Sun Young; Lee, Sang Ho; Cho, Sung Bum; Kim, Ju Han.

Data Mining for Biomedical Applications - PAKDD 2006 Workshop, BioDM 2006, Proceedings. 2006. p. 82-92 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3916 LNBI).

TY - GEN
T1 - Heterogeneous clustering ensemble method for combining different cluster results
AU - Yoon, Hye Sung
AU - Ahn, Sun Young
AU - Lee, Sang Ho
AU - Cho, Sung Bum
AU - Kim, Ju Han
PY - 2006/7/14
Y1 - 2006/7/14

N2 - Biological data set sizes have been growing rapidly with the technological advances that have occurred in bioinformatics. Data mining techniques have been used extensively as approaches to detect interesting patterns in large databases. In bioinformatics, clustering algorithm technique for data mining can be applied to find underlying genetic and biological interactions, without considering prior information from datasets. However, many clustering algorithms are practically available, and different clustering algorithms may generate dissimilar clustering results due to bio-data characteristics and experimental assumptions. In this paper, we propose a novel heterogeneous clustering ensemble scheme that uses a genetic algorithm to generate high quality and robust clustering results with characteristics of bio-data. The proposed method combines results of various clustering algorithms and crossover operation of genetic algorithm, and is founded on the concept of using the evolutionary processes to select the most commonly-inherited characteristics. Our framework proved to be available on real data set and the optimal clustering results generated by means of our proposed method are detailed in this paper. Experimental results demonstrate that the proposed method yields better clustering results than applying a single best clustering algorithm.

UR - http://www.scopus.com/inward/record.url?scp=33745771946&partnerID=8YFLogxK
U2 - 10.1007/11691730_9
DO - 10.1007/11691730_9
M3 - Conference contribution
SN - 3540331042
SN - 9783540331049
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 82
EP - 92
BT - Data Mining for Biomedical Applications - PAKDD 2006 Workshop, BioDM 2006, Proceedings
ER -

Yoon HS, Ahn SY, Lee SH, Cho SB, Kim JH. Heterogeneous clustering ensemble method for combining different cluster results. In Data Mining for Biomedical Applications - PAKDD 2006 Workshop, BioDM 2006, Proceedings. 2006. p. 82-92. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/11691730_9