TY  - JOUR
AU  - A. Kahraman
AB  - Cardiovascular disease continues to pose a major global health challenge, highlighting the critical importance of early detection in mitigating cardiac-related issues. There is a significant demand for reliable diagnostic alternatives. Applying diverse machine learning algorithms to health data may offer a more precise diagnostic approach. Machine learning-based decision support systems that utilize patients' clinical parameters present a promising solution for diagnosing cardiovascular disease. In this research, we collected extensive publicly available healthcare records. We integrated medical datasets based on common features to implement several machine learning models aimed at exploring the potential for more robust predictions of cardiovascular disease (CVD). The merged dataset initially contained 323,680 samples sourced from multiple databases. Following data preprocessing steps including cleaning, alignment of features, and removal of missing values, the final dataset consisted of 311,710 samples used for model training and evaluation. In our experiments, the CatBoost model achieved the highest area under the curve (AUC), up to 94.1%.
AD  - Department of Computer Engineering, Faculty of Engineering, Balıkesir University, Balıkesir, Türkiye.
AN  - 41446898
BT  - Front Artif Intell
C5  - HIT & Telehealth
DO  - 10.3389/frai.2025.1694450
DP  - NLM
ET  - 20251209
JF  - Front Artif Intell
LA  - eng
PY  - 2025
SN  - 2624-8212
SP  - 1694450
ST  - Machine learning techniques for improved prediction of cardiovascular diseases using integrated healthcare data
T1  - Machine learning techniques for improved prediction of cardiovascular diseases using integrated healthcare data
T2  - Front Artif Intell
TI  - Machine learning techniques for improved prediction of cardiovascular diseases using integrated healthcare data
U1  - HIT & Telehealth
U3  - 10.3389/frai.2025.1694450
VL  - 8
VO  - 2624-8212
Y1  - 2025
ER  - 