Integrating Machine Learning Ensembles and Large Language Models for Heart Disease Prediction Using Voting Fusion
#Heart Disease Prediction #Machine Learning Ensembles #Large Language Models #Hybrid AI Systems #Medical Diagnosis #Cardiovascular Disease #Gemini 2.5 Flash
📌 Key Takeaways
- Machine learning ensembles outperformed large language models in heart disease prediction
- A hybrid approach combining ML ensembles and LLMs achieved the highest accuracy (96.62%)
- The research used a dataset of 1,190 patient records to test various prediction methods
- Large language models work better when integrated with ML models rather than used alone
- This hybrid method shows promise for more reliable clinical decision-support tools
📖 Full Retelling
On February 25, 2026, a team led by Md. Tahsin Amin, with seven co-authors, published a study showing how combining machine learning ensembles with large language models can improve heart disease prediction accuracy. The work addresses the need for early detection of cardiovascular disease, which remains the leading cause of death worldwide. The team developed and tested a hybrid approach that leverages the strengths of traditional machine learning algorithms and advanced language models, validating the method on a dataset of 1,190 patient records. Traditional ensemble methods, including Random Forest, XGBoost, LightGBM, and CatBoost, achieved strong results with 95.78% accuracy and an ROC-AUC of 0.96, while large language models tested on their own showed only moderate performance: 78.9% accuracy in zero-shot settings and 72.6% in few-shot settings. The most significant finding was that a hybrid voting fusion combining the ML ensemble with LLM reasoning via Gemini 2.5 Flash achieved the best results overall, reaching 96.62% accuracy with an AUC of 0.97, demonstrating that large language models perform best when integrated with traditional machine learning rather than used in isolation.
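The voting fusion described above can be sketched as a weighted average of the ensemble's soft-vote probability and a probability derived from the LLM's verdict. This is a minimal illustration, not the paper's actual implementation: the fusion weights, the threshold, and all function names below are assumptions.

```python
# Hypothetical sketch of a hybrid ML-LLM "voting fusion".
# The weights (w_ml, w_llm) and the 0.5 threshold are illustrative
# assumptions; the paper does not publish its exact fusion rule.

def ensemble_probability(model_probs):
    """Soft vote: mean positive-class probability across the ensemble
    (e.g. one value each from Random Forest, XGBoost, LightGBM, CatBoost)."""
    return sum(model_probs) / len(model_probs)

def fuse(ensemble_prob, llm_prob, w_ml=0.7, w_llm=0.3):
    """Weighted fusion of the ensemble probability with a probability
    mapped from the LLM's reasoning (mapping assumed to exist upstream)."""
    return w_ml * ensemble_prob + w_llm * llm_prob

def predict(model_probs, llm_prob, threshold=0.5):
    """Return (label, fused probability); label 1 means disease predicted."""
    p = fuse(ensemble_probability(model_probs), llm_prob)
    return int(p >= threshold), p

# Example: four ensemble members and an LLM-derived probability.
label, p = predict([0.91, 0.88, 0.93, 0.90], llm_prob=0.80)
```

The point of the weighted design is that the better-calibrated tabular ensemble dominates, while the LLM contribution nudges borderline cases, which is consistent with the paper's observation that LLMs help most in combination rather than alone.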
🏷️ Themes
Medical AI, Machine Learning, Healthcare Technology
Original Source
Computer Science > Machine Learning, arXiv:2602.22280 [Submitted on 25 Feb 2026]
Title: Integrating Machine Learning Ensembles and Large Language Models for Heart Disease Prediction Using Voting Fusion
Authors: Md. Tahsin Amin, Tanim Ahmmod, Zannatul Ferdus, Talukder Naemul Hasan Naem, Ehsanul Ferdous, Arpita Bhattacharjee, Ishmam Ahmed Solaiman, Nahiyan Bin Noor
Abstract: Cardiovascular disease is the primary cause of death globally, necessitating early identification, precise risk classification, and dependable decision-support technologies. The advent of large language models provides new zero-shot and few-shot reasoning capabilities, even though machine learning algorithms, especially ensemble approaches like Random Forest, XGBoost, LightGBM, and CatBoost, are excellent at modeling complex, non-linear patient data and routinely beat logistic regression. This research predicts cardiovascular disease using a merged dataset of 1,190 patient records, comparing traditional machine learning models (95.78% accuracy, ROC-AUC 0.96) with open-source large language models accessed via OpenRouter APIs. A hybrid fusion of the ML ensemble and LLM reasoning under Gemini 2.5 Flash achieved the best results (96.62% accuracy, 0.97 AUC), showing that LLMs (78.9% accuracy on their own) work best when combined with ML models rather than used alone. ML ensembles achieved the highest stand-alone performance (95.78% accuracy, ROC-AUC 0.96), while LLMs performed moderately in zero-shot (78.9%) and worse in few-shot (72.6%) settings.
The proposed hybrid method improved robustness in uncertain cases, illustrating that ensemble ML remains the best choice for structured tabular prediction but can be integrated into hybrid ML-LLM systems for a minor accuracy increase and op...
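The zero-shot versus few-shot comparison in the abstract can be illustrated with a prompt-construction sketch. Everything here is an assumption for illustration: the feature names, the prompt wording, and the label format are not taken from the paper.

```python
# Hedged sketch of assembling zero-shot and few-shot prompts from tabular
# patient records. Feature names (age, chol, max_hr) and prompt text are
# illustrative assumptions, not the study's actual prompts.

def format_record(rec):
    """Flatten a feature dict into a 'key=value' string for the prompt."""
    return ", ".join(f"{k}={v}" for k, v in rec.items())

def build_prompt(patient, examples=()):
    """Zero-shot when `examples` is empty; few-shot when labeled
    (record, answer) pairs are prepended as in-context demonstrations."""
    lines = ["Predict heart disease (yes/no) from the patient record."]
    for ex_rec, ex_label in examples:
        lines.append(f"Record: {format_record(ex_rec)} -> {ex_label}")
    lines.append(f"Record: {format_record(patient)} -> ")
    return "\n".join(lines)

patient = {"age": 57, "chol": 240, "max_hr": 140}
zero_shot = build_prompt(patient)
few_shot = build_prompt(
    patient,
    examples=[({"age": 63, "chol": 290, "max_hr": 120}, "yes")],
)
```

In the study, few-shot prompting (72.6%) actually underperformed zero-shot (78.9%), a reminder that in-context examples do not guarantee gains on tabular classification.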