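Abstract
Cardiovascular diseases are now highly prevalent and are among the leading causes of death. Accurate and timely diagnosis of coronary artery disease, one type of cardiovascular disease, is very important. Angiography, an invasive method, is used as the gold standard for diagnosing coronary artery disease precisely and determining its severity. However, angiography is costly, requires a high level of expertise, and can cause serious complications. For this reason, data mining is being investigated as a way to provide cheaper and more effective methods. This study applied data mining methods to develop a classification model for coronary artery disease risk and compared the results and correct classification rates obtained by the classification methods. The heart disease dataset from the Cleveland Clinic, containing 303 records and 14 variables, was used for this purpose. To carry out the necessary computations and obtain the models, the 1R, J48 decision tree, Naive Bayes, and multilayer artificial neural network (ANN) classification methods were applied in the Weka software package. The results show that the multilayer ANN classification method gave the best result in detecting coronary artery disease, with an accuracy of 83.498%. It was followed by the Naive Bayes and optimized J48 decision tree algorithms.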
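The reported accuracy can be related back to the dataset size with a short arithmetic check. The sketch below assumes that the accuracy was computed over all 303 records (as would be the case under cross-validation, for example); the evaluation protocol is not stated in this section, so the resulting counts are an inference rather than reported figures.

```python
# Assumption: the reported accuracy was measured over all 303 records
# (for example via cross-validation); the protocol is not stated here.
TOTAL_RECORDS = 303
REPORTED_ACCURACY_PCT = 83.498

# Closest whole number of correctly classified records.
correct = round(TOTAL_RECORDS * REPORTED_ACCURACY_PCT / 100)
misclassified = TOTAL_RECORDS - correct

# Recompute the percentage from the integer count as a consistency check.
recomputed_pct = round(100 * correct / TOTAL_RECORDS, 3)

print(correct, misclassified, recomputed_pct)  # 253 50 83.498
```

Under that assumption, 83.498% corresponds to 253 of the 303 records being classified correctly and 50 being misclassified.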
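Conclusion
Data mining algorithms play an important role in identifying coronary artery disease and determining its risk factors. This study used the 1R, J48 decision tree (pruned, unpruned, and optimized), Naive Bayes, and multilayer artificial neural network (ANN) classification methods to determine coronary artery disease risk. The classification algorithms were compared in terms of accuracy, true positive (TP) rate, false positive (FP) rate, precision, F-measure, ROC, and time. In terms of accuracy, the best result was obtained with the multilayer ANN classification method, at 83.498%. Although the artificial neural network model has the highest accuracy, it behaves like a black box when it comes to interpretation and application. The Naive Bayes algorithm, despite its simplicity, is among the most accurate algorithms. The J48 decision tree has a moderate level of accuracy but offers interpretability to specialist physicians and researchers. The choice of model should therefore take into account the particular circumstances of the application.

The results of this study are expected to guide the clinical decisions of cardiovascular specialists in the diagnosis and treatment of patients presenting with suspected coronary artery disease, and in selecting the right patient group for invasive procedures. In addition, the data mining models developed here can reduce medical errors, unwarranted variation in practice, and healthcare costs, thereby improving patient safety and quality of life.

In this study, the classification algorithms were applied to the 14 variables in the dataset. Future studies could use optimization algorithms to examine the variables in the dataset in more detail before applying classification algorithms. In model applications, the consequences that the TP and FN rates may lead to could also be considered and evaluated from a risk management perspective. Future studies could furthermore develop algorithms that optimize these two rates (TP and FN) simultaneously.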
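Several of the comparison criteria named above follow directly from a binary confusion matrix. The following is a minimal sketch of those definitions; the class name `ConfusionMatrix` and the counts are hypothetical and chosen only for illustration, and they are not results from this study. ROC and time are omitted because they cannot be derived from a single confusion matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix with 'disease present' as the positive class."""
    tp: int  # diseased patients predicted as diseased
    fn: int  # diseased patients predicted as healthy (missed cases)
    fp: int  # healthy patients predicted as diseased (false alarms)
    tn: int  # healthy patients predicted as healthy

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fn + self.fp + self.tn)

    @property
    def tp_rate(self) -> float:
        # Also called recall or sensitivity.
        return self.tp / (self.tp + self.fn)

    @property
    def fn_rate(self) -> float:
        # Complement of the TP rate: share of diseased patients missed.
        return self.fn / (self.tp + self.fn)

    @property
    def fp_rate(self) -> float:
        return self.fp / (self.fp + self.tn)

    @property
    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def f_measure(self) -> float:
        # Harmonic mean of precision and recall.
        p, r = self.precision, self.tp_rate
        return 2 * p * r / (p + r)


# Hypothetical counts for illustration only (not taken from the study).
cm = ConfusionMatrix(tp=40, fn=10, fp=5, tn=45)
print(f"accuracy={cm.accuracy:.3f} TP rate={cm.tp_rate:.3f} "
      f"FN rate={cm.fn_rate:.3f} FP rate={cm.fp_rate:.3f} "
      f"precision={cm.precision:.3f} F-measure={cm.f_measure:.3f}")
```

Because the TP rate and FN rate are computed over the same diseased patients, they always sum to 1. In a risk management reading, a missed case (FN) and a false alarm (FP) usually carry different costs, which is why accuracy alone may not settle the choice of model.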


