Prediction of axillary lymph node metastasis in primary breast cancer patients using a decision tree-based model

Masahiro Takada, Masahiro Sugimoto, Yasuhiro Naito, Hyeong Gon Moon, Wonshik Han, Dongyoung Noh, Masahide Kondo, Katsumasa Kuroi, Hironobu Sasano, Takashi Inamoto, Masaru Tomita, Masakazu Toi

Research output: Contribution to journal (Article)

20 Citations (Scopus)

Abstract

Background: The aim of this study was to develop a new data-mining model to predict axillary lymph node (AxLN) metastasis in primary breast cancer. To achieve this, we used a decision tree-based prediction method, the alternating decision tree (ADTree).

Methods: Clinical datasets for primary breast cancer patients who underwent sentinel lymph node biopsy or AxLN dissection without prior treatment were collected from three institutes (institute A, n = 148; institute B, n = 143; institute C, n = 174) and were used for variable selection, model training and external validation, respectively. The models were evaluated using area under the receiver operating characteristic (ROC) curve analysis to discriminate node-positive patients from node-negative patients.

Results: The ADTree model selected 15 of 24 clinicopathological variables in the variable selection dataset. The resulting area under the ROC curve values were 0.770 [95% confidence interval (CI): 0.689-0.850] for the model training dataset and 0.772 (95% CI: 0.689-0.856) for the validation dataset, demonstrating high accuracy and generalization ability of the model. The bootstrap value of the validation dataset was 0.768 (95% CI: 0.763-0.774).

Conclusions: Our prediction model showed high accuracy for predicting nodal metastasis in patients with breast cancer using commonly recorded clinical variables. Therefore, our model might help oncologists in the decision-making process for primary breast cancer patients before starting treatment.
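The evaluation described in the abstract (area under the ROC curve with a bootstrap estimate) can be illustrated with a minimal sketch. This is not the authors' code: the function names `roc_auc` and `bootstrap_auc`, the percentile interval, the number of resamples and the synthetic scores are all assumptions made for illustration. Only the sample size of 174 is taken from the abstract (the validation dataset), and the printed numbers have no relation to the published results.

```python
import random


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc(labels, scores, n_resamples=500, seed=0):
    """Mean AUC over bootstrap resamples plus a 95% percentile interval.

    Assumption: a plain percentile bootstrap; the paper may have used a
    different bootstrap scheme.
    """
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int(0.025 * n_resamples)]
    upper = aucs[int(0.975 * n_resamples) - 1]
    return sum(aucs) / n_resamples, lower, upper


# Synthetic stand-in for a validation set: 174 cases, hypothetical scores.
rng = random.Random(42)
labels = [1 if rng.random() < 0.3 else 0 for _ in range(174)]
scores = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]

auc = roc_auc(labels, scores)
mean_auc, lower, upper = bootstrap_auc(labels, scores)
print(f"AUC = {auc:.3f}")
print(f"bootstrap mean = {mean_auc:.3f} (95% interval: {lower:.3f}-{upper:.3f})")
```

The pairwise definition is used here because it needs no external library and makes the meaning of the AUC explicit: the probability that a randomly chosen node-positive patient receives a higher score than a randomly chosen node-negative patient.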

Original language: English
Article number: 54
Journal: BMC Medical Informatics and Decision Making
Volume: 12
Issue number: 1
DOI: 10.1186/1472-6947-12-54
State: Published - 18 Jun 2012

Fingerprint

Decision Trees
Lymph Nodes
Breast Neoplasms
Neoplasm Metastasis
Confidence Intervals
ROC Curve
Sentinel Lymph Node Biopsy
Aptitude
Data Mining
Lymph Node Excision
Decision Making
Datasets
Therapeutics

Keywords

  • Alternating decision tree
  • Breast cancer
  • Data mining
  • Lymph node metastasis

Cite this

Takada, Masahiro ; Sugimoto, Masahiro ; Naito, Yasuhiro ; Moon, Hyeong Gon ; Han, Wonshik ; Noh, Dongyoung ; Kondo, Masahide ; Kuroi, Katsumasa ; Sasano, Hironobu ; Inamoto, Takashi ; Tomita, Masaru ; Toi, Masakazu. / Prediction of axillary lymph node metastasis in primary breast cancer patients using a decision tree-based model. In: BMC Medical Informatics and Decision Making. 2012 ; Vol. 12, No. 1.
@article{7a0a8df5e4f34a14be8fdda0b63d5c15,
title = "Prediction of axillary lymph node metastasis in primary breast cancer patients using a decision tree-based model",
keywords = "Alternating decision tree, Breast cancer, Data mining, Lymph node metastasis",
author = "Masahiro Takada and Masahiro Sugimoto and Yasuhiro Naito and Moon, {Hyeong Gon} and Wonshik Han and Dongyoung Noh and Masahide Kondo and Katsumasa Kuroi and Hironobu Sasano and Takashi Inamoto and Masaru Tomita and Masakazu Toi",
year = "2012",
month = "6",
day = "18",
doi = "10.1186/1472-6947-12-54",
language = "English",
volume = "12",
journal = "BMC Medical Informatics and Decision Making",
issn = "1472-6947",
publisher = "BioMed Central Ltd.",
number = "1",

}

TY - JOUR

T1 - Prediction of axillary lymph node metastasis in primary breast cancer patients using a decision tree-based model

AU - Takada, Masahiro

AU - Sugimoto, Masahiro

AU - Naito, Yasuhiro

AU - Moon, Hyeong Gon

AU - Han, Wonshik

AU - Noh, Dongyoung

AU - Kondo, Masahide

AU - Kuroi, Katsumasa

AU - Sasano, Hironobu

AU - Inamoto, Takashi

AU - Tomita, Masaru

AU - Toi, Masakazu

PY - 2012/6/18

Y1 - 2012/6/18

KW - Alternating decision tree

KW - Breast cancer

KW - Data mining

KW - Lymph node metastasis

UR - http://www.scopus.com/inward/record.url?scp=84862207098&partnerID=8YFLogxK

U2 - 10.1186/1472-6947-12-54

DO - 10.1186/1472-6947-12-54

M3 - Article

C2 - 22695278

AN - SCOPUS:84862207098

VL - 12

JO - BMC Medical Informatics and Decision Making

JF - BMC Medical Informatics and Decision Making

SN - 1472-6947

IS - 1

M1 - 54

ER -