RISS Academic Research Information Service


      KCI-listed, SCOPUS, SCIE

      Improving classification accuracy of cancer types using parallel hybrid feature selection on microarray gene expression data


      https://www.riss.kr/link?id=A106437968


      Multilingual Abstract

      Background: Data mining techniques extract previously unknown knowledge from large datasets. Microarray gene expression (MGE) data plays a major role in predicting cancer type, but because MGE data is very large, applying traditional data mining approaches to it is time-consuming. Parallel programming frameworks such as Hadoop, Spark and Mahout are therefore needed to ease the computational load.
      Objective: Not all gene expressions are needed for prediction, so selecting the important genes is essential for improving classification accuracy. Feature selection algorithms are therefore parallelized and executed on the Spark framework to eliminate unnecessary genes and identify only the predictive genes in much less time, without affecting prediction accuracy.
      Methods: A parallelized hybrid feature selection (HFS) method is proposed. It applies parallelized correlation feature subset selection followed by rank-based feature selection methods. The selected subset of genes is evaluated using parallel classification algorithms. The resulting accuracy values are compared with those of existing rank-weight feature selection and parallelized recursive feature selection methods, and with the values obtained by executing parallelized HFS on DistributedWekaSpark.
      Results: The classification accuracy obtained with the proposed parallelized HFS method is 97% for gastric cancer and 79% for childhood leukemia. The proposed method improved classification accuracy by roughly 4% to 15% compared with previous methods.
      Conclusion: The results indicate that the proposed parallelized feature selection algorithm scales to growing medical data and predicts cancer sub-types in less time and with higher accuracy.
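
The two-stage idea described under Methods (a correlation-based subset filter followed by a rank-based filter) can be sketched roughly as below. This is an illustrative, standard-library-only sketch and not the authors' Spark implementation: the function names, the 0.9 redundancy threshold, the greedy Pearson-correlation filter used in place of a full correlation feature subset evaluator, the signal-to-noise ranking score, and the toy data are all assumptions made for the example. The thread pool only mirrors the "one independent task per gene" structure that a framework such as Spark would distribute.

```python
from concurrent.futures import ThreadPoolExecutor
from math import sqrt
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def relevance_scores(columns, labels, workers=4):
    """Score every gene by |corr(gene, class label)|, one independent task per gene."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda col: abs(pearson(col, labels)), columns))


def correlation_subset(columns, scores, max_redundancy=0.9):
    """Stage 1 (assumed simplification): visit genes from most to least relevant,
    skip genes with zero relevance, and skip any gene that is highly correlated
    with a gene that has already been kept."""
    kept = []
    for j in sorted(range(len(columns)), key=lambda j: -scores[j]):
        if scores[j] == 0.0:
            continue
        if all(abs(pearson(columns[j], columns[k])) < max_redundancy for k in kept):
            kept.append(j)
    return kept


def signal_to_noise(col, labels):
    """Rank score for a two-class problem: |mean0 - mean1| / (sd0 + sd1)."""
    a = [v for v, y in zip(col, labels) if y == 0]
    b = [v for v, y in zip(col, labels) if y == 1]
    gap = abs(mean(a) - mean(b))
    spread = pstdev(a) + pstdev(b)
    if spread == 0:
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def hybrid_select(columns, labels, top_k, max_redundancy=0.9, workers=4):
    """Stage 1 correlation filter, then Stage 2 rank-based cut to top_k genes."""
    scores = relevance_scores(columns, labels, workers)
    subset = correlation_subset(columns, scores, max_redundancy)
    ranked = sorted(subset, key=lambda j: -signal_to_noise(columns[j], labels))
    return ranked[:top_k]


# Toy data (hypothetical): each inner list is one gene measured on six samples.
labels = [0, 0, 0, 1, 1, 1]
columns = [
    [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],  # gene 0: informative
    [2.0, 2.8, 1.6, 5.6, 6.4, 6.0],  # gene 1: noisier near-copy of gene 0
    [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],  # gene 2: constant, carries no signal
    [0.5, 2.5, 1.0, 2.0, 0.7, 2.4],  # gene 3: weak, noisy signal
]
selected = hybrid_select(columns, labels, top_k=2)
print(selected)
```

On this toy input the constant gene and the redundant near-copy are dropped in the first stage, and the second stage orders the survivors by their ranking score. In CPython the thread pool gives no real speed-up for CPU-bound scoring; it stands in for a distributed map over genes.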



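      Journal History

      Date | Type | Details | Listing status
      2023 | Evaluation scheduled | Eligible to apply for overseas-DB journal evaluation (overseas-indexed journal evaluation) |
      2020-01-01 | Evaluation | Listed-journal status maintained (overseas-indexed journal evaluation) | KCI-listed
      2015-01-01 | Evaluation | Listed-journal status maintained (listing maintained) | KCI-listed
      2012-05-07 | Journal title change | Korean title: 한국유전학회지 -> Genes & Genomics | KCI-listed
      2011-01-01 | Evaluation | Listed-journal status maintained (listing maintained) | KCI-listed
      2009-01-01 | Evaluation | Listed-journal status maintained (listing maintained) | KCI-listed
      2008-04-14 | Journal title change | Foreign-language title: Korean Journal of Genetics -> Genes and Genomics | KCI-listed
      2007-01-01 | Evaluation | Listed-journal status maintained (listing maintained) | KCI-listed
      2004-01-01 | Evaluation | Selected as listed journal (candidate, 2nd round) | KCI-listed
      2003-01-01 | Evaluation | Candidate 1st round PASS (candidate, 1st round) | KCI candidate
      2002-01-01 | Evaluation | Candidate-journal status maintained (candidate, 1st round) | KCI candidate
      1999-07-01 | Evaluation | Selected as candidate journal (new evaluation) | KCI candidate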

      Journal Citation Information

      Base year | 2016
      WOS-KCI integrated IF (2-year) | 0.51
      KCIF (2-year) | 0.12
      KCIF (3-year) | 0.38
      KCIF (4-year) | 0.32
      KCIF (5-year) | 0.27
      Centrality index (3-year) | 0.258
      Immediacy index | 0.02
