An Identification Method for Breast Cancer Subtypes Based on Data Mining Technology
Cite this article: YANG Shao-hua, CHEN Dong-dong, ZHANG Xu, HE Lin. An Identification Method for Breast Cancer Subtypes Based on Data Mining Technology[J]. Journal of Southwest Agricultural University, 2018, 40(5): 113-116.
Authors: YANG Shao-hua  CHEN Dong-dong  ZHANG Xu  HE Lin
Affiliations: School of Mathematics and Statistics, Southwest University; Institute of Botany, Chinese Academy of Sciences
Funding: National Natural Science Foundation of China (11701471); Chongqing Basic Science and Frontier Technology Research Project (cstc2017jcyjAX0476)
Abstract: The random forest algorithm can rank features by importance, which can improve both running efficiency and classification accuracy. In this study, analysis of variance (ANOVA) and the random forest algorithm were used to screen breast cancer genes. With the selected genes, the random forest, support vector machine (SVM), and k-nearest neighbor (KNN) algorithms reached test-set accuracies of 95.6%, 92.9%, and 92.7%, respectively. GATA3 and ESR1 were identified as the two most important genes for distinguishing the different subtypes of breast cancer.

Keywords: data mining; microarray; breast cancer; classification
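
The abstract describes screening genes with a statistical filter before classification. The following is a minimal sketch of that idea only, not the authors' implementation: it ranks features by a one-way ANOVA F-statistic, keeps the top-scoring ones, and classifies with k-nearest neighbors. The random forest and SVM steps are omitted, the data are synthetic stand-ins rather than microarray measurements, and all function names (`anova_f`, `select_top_features`, `knn_predict`, `make_synthetic`) are hypothetical.

```python
import math
import random
from collections import Counter


def anova_f(values, labels):
    """One-way ANOVA F-statistic of one feature across the class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )
    if ss_within == 0:
        return math.inf
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_features(X, y, top_k):
    """Return the indices of the top_k features with the largest F-statistic."""
    scores = [anova_f([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:top_k]


def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k training points closest to x (Euclidean)."""
    nearest = sorted(zip(X_train, y_train), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def make_synthetic(n_per_class, rng):
    """Synthetic data (assumption): features 0 and 1 carry signal, 2-9 are noise."""
    X, y = [], []
    for label, shift in (("A", 0.0), ("B", 3.0)):
        for _ in range(n_per_class):
            row = [rng.gauss(shift, 1.0), rng.gauss(-shift, 1.0)]
            row += [rng.gauss(0.0, 1.0) for _ in range(8)]
            X.append(row)
            y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic(40, rng)
X_test, y_test = make_synthetic(20, rng)

# Screen features on the training set only, then classify the held-out set.
selected = select_top_features(X_train, y_train, top_k=2)
train_sel = [[row[j] for j in selected] for row in X_train]
test_sel = [[row[j] for j in selected] for row in X_test]
predictions = [knn_predict(train_sel, y_train, x) for x in test_sel]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)

print("selected features:", sorted(selected))
print("test accuracy:", accuracy)
```

Selecting features on the training split alone keeps the test-set accuracy from being inflated by information leaked during screening.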
This article is indexed in CNKI and other databases.