Extraction of informative genes from multiple microarray data integrated by rank-based approach

Dongwan Hong, Jeehee Yoon, Jongkeun Lee, Sanghyun Park, Jong-Il Kim

Research output: Contribution to journal › Article › Research › peer-review

Abstract

By converting the expression values of each sample into the corresponding rank values, the rank-based approach enables the direct integration of multiple microarray data produced by different laboratories and/or different techniques. In this study, we verify through statistical and experimental methods that informative genes can be extracted from multiple microarray data integrated by the rank-based approach (briefly, integrated rank-based microarray data). First, after showing that a nonparametric technique can be used effectively as a scoring metric for rank-based microarray data, we prove that the scoring results from integrated rank-based microarray data are statistically significant. Next, through experimental comparisons, we show that the informative genes from integrated rank-based microarray data are statistically more significant than those of single-microarray data. In addition, by comparing the lists of informative genes extracted from experimental data, we show that the rank-based data integration method extracts more significant genes than the z-score-based normalization technique or the rank products technique. Public cancer microarray data were used for our experiments and the marker genes list from the CGAP database was used to compare the extracted genes. The GO database and the GSEA method were also used to analyze the functionalities of the extracted genes.
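The rank-based integration described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the nonparametric scoring technique, so a rescaled Mann-Whitney U statistic is used here as an assumption, and the gene names, class labels, expression values, and function names (`rank_transform`, `integrate`, `u_score`, `score_genes`) are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[str, Sequence[float]]  # (class label, expression value per gene)


def rank_transform(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks of one sample's values; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value equals the current one (a tie).
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


def integrate(datasets: Sequence[Sequence[Sample]]) -> List[Tuple[str, List[float]]]:
    """Rank-transform every sample on its own, then pool samples from all datasets."""
    return [(label, rank_transform(values))
            for dataset in datasets
            for label, values in dataset]


def u_score(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Mann-Whitney U rescaled to [0, 1]: 0 is no separation, 1 is complete separation."""
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0
            for a in group_a for b in group_b)
    return abs(u / (len(group_a) * len(group_b)) - 0.5) * 2


def score_genes(samples: Sequence[Tuple[str, List[float]]], genes: Sequence[str],
                class_a: str, class_b: str) -> Dict[str, float]:
    """Score each gene by how well its ranks separate the two classes."""
    scores = {}
    for g, gene in enumerate(genes):
        a = [ranks[g] for label, ranks in samples if label == class_a]
        b = [ranks[g] for label, ranks in samples if label == class_b]
        scores[gene] = u_score(a, b)
    return scores


# Hypothetical toy data: two labs measuring the same four genes on different scales.
genes = ["g1", "g2", "g3", "g4"]
lab_a = [("tumor", [9.0, 2.0, 5.0, 7.0]), ("normal", [1.0, 4.0, 6.0, 5.0])]
lab_b = [("tumor", [900.0, 150.0, 400.0, 300.0]), ("normal", [100.0, 300.0, 800.0, 500.0])]

integrated = integrate([lab_a, lab_b])
scores = score_genes(integrated, genes, "tumor", "normal")
```

Because each sample is ranked independently, the two labs' measurement scales never meet, and the pooled samples can be scored together. In this toy data g1, g2, and g3 separate the classes completely (score 1.0), while g4 scores 0.5.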

Original language: English
Pages (from-to): 841-854
Number of pages: 14
Journal: IEICE Transactions on Information and Systems
Volume: E94-D
Issue number: 4
DOIs: 10.1587/transinf.E94.D.841
State: Published - 1 Jan 2011

Fingerprint

Microarrays
Genes
Data integration

Keywords

  • Data integration
  • Informative gene
  • Microarray data
  • Significance test

Cite this

Hong, Dongwan; Yoon, Jeehee; Lee, Jongkeun; Park, Sanghyun; Kim, Jong-Il. / Extraction of informative genes from multiple microarray data integrated by rank-based approach. In: IEICE Transactions on Information and Systems. 2011; Vol. E94-D, No. 4. pp. 841-854.
@article{430b51efa73b430e906b775d687ea6a7,
title = "Extraction of informative genes from multiple microarray data integrated by rank-based approach",
abstract = "By converting the expression values of each sample into the corresponding rank values, the rank-based approach enables the direct integration of multiple microarray data produced by different laboratories and/or different techniques. In this study, we verify through statistical and experimental methods that informative genes can be extracted from multiple microarray data integrated by the rank-based approach (briefly, integrated rank-based microarray data). First, after showing that a nonparametric technique can be used effectively as a scoring metric for rank-based microarray data, we prove that the scoring results from integrated rank-based microarray data are statistically significant. Next, through experimental comparisons, we show that the informative genes from integrated rank-based microarray data are statistically more significant than those of single-microarray data. In addition, by comparing the lists of informative genes extracted from experimental data, we show that the rank-based data integration method extracts more significant genes than the z-score-based normalization technique or the rank products technique. Public cancer microarray data were used for our experiments and the marker genes list from the CGAP database was used to compare the extracted genes. The GO database and the GSEA method were also used to analyze the functionalities of the extracted genes.",
keywords = "Data integration, Informative gene, Microarray data, Significance test",
author = "Dongwan Hong and Jeehee Yoon and Jongkeun Lee and Sanghyun Park and Jong-Il Kim",
year = "2011",
month = "1",
day = "1",
doi = "10.1587/transinf.E94.D.841",
language = "English",
volume = "E94-D",
pages = "841--854",
journal = "IEICE Transactions on Information and Systems",
issn = "0916-8532",
publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
number = "4",
}


TY  - JOUR
T1  - Extraction of informative genes from multiple microarray data integrated by rank-based approach
AU  - Hong, Dongwan
AU  - Yoon, Jeehee
AU  - Lee, Jongkeun
AU  - Park, Sanghyun
AU  - Kim, Jong-Il
PY  - 2011/1/1
Y1  - 2011/1/1
AB  - By converting the expression values of each sample into the corresponding rank values, the rank-based approach enables the direct integration of multiple microarray data produced by different laboratories and/or different techniques. In this study, we verify through statistical and experimental methods that informative genes can be extracted from multiple microarray data integrated by the rank-based approach (briefly, integrated rank-based microarray data). First, after showing that a nonparametric technique can be used effectively as a scoring metric for rank-based microarray data, we prove that the scoring results from integrated rank-based microarray data are statistically significant. Next, through experimental comparisons, we show that the informative genes from integrated rank-based microarray data are statistically more significant than those of single-microarray data. In addition, by comparing the lists of informative genes extracted from experimental data, we show that the rank-based data integration method extracts more significant genes than the z-score-based normalization technique or the rank products technique. Public cancer microarray data were used for our experiments and the marker genes list from the CGAP database was used to compare the extracted genes. The GO database and the GSEA method were also used to analyze the functionalities of the extracted genes.
KW  - Data integration
KW  - Informative gene
KW  - Microarray data
KW  - Significance test
UR  - http://www.scopus.com/inward/record.url?scp=79953331937&partnerID=8YFLogxK
U2  - 10.1587/transinf.E94.D.841
DO  - 10.1587/transinf.E94.D.841
M3  - Article
VL  - E94-D
SP  - 841
EP  - 854
JO  - IEICE Transactions on Information and Systems
JF  - IEICE Transactions on Information and Systems
SN  - 0916-8532
IS  - 4
ER  -