Hybrid machine learning models outperform in predicting gut-microbiome-related diseases
The human gut, home to over 100 trillion microbes, is often called the "second brain" due to its major role in digestion, immunity, and overall health. Imbalances in gut microbiota (dysbiosis) are linked to several diseases, including metabolic, autoimmune, and neurological disorders.
Recent advances in sequencing and bioinformatics make it possible to detect disease patterns in gut microbiota data. In this project, we apply machine learning and ensemble models to classify disease status from gut microbiota profiles. The aim is to improve prediction accuracy in support of early diagnosis.
Each .ipynb file includes:
- Data Preprocessing (Data Cleaning, Class Balancing, Dimensionality Reduction)
- Implementation of Base Models (Random Forest, SVM, KNN, Naïve Bayes)
- Implementation of Ensemble Models using Stacking (EM1 and EM2)
- Implementation of MLP and Neural Network Architecture
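The steps listed above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the notebooks' actual code: the `oversample_minority` helper, the choice of random oversampling for class balancing, PCA for dimensionality reduction, the logistic regression meta-learner, and all hyperparameters are assumptions made for the example. The actual composition of EM1 and EM2 is defined in the notebooks.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.utils import resample


def oversample_minority(X, y, random_state=0):
    """Hypothetical helper: randomly duplicate samples of smaller classes
    until every class matches the size of the largest class."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [], []
    for cls in classes:
        X_cls = X[y == cls]
        if len(X_cls) < target:
            extra = resample(
                X_cls,
                replace=True,
                n_samples=target - len(X_cls),
                random_state=random_state,
            )
            X_cls = np.vstack([X_cls, extra])
        X_parts.append(X_cls)
        y_parts.append(np.full(target, cls))
    return np.vstack(X_parts), np.concatenate(y_parts)


# Synthetic stand-in for an abundance table: 300 samples, 200 features,
# with an imbalanced binary label (roughly 70% / 30%).
X, y = make_classification(
    n_samples=300,
    n_features=200,
    n_informative=20,
    weights=[0.7, 0.3],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Balance only the training split so the test split stays untouched.
X_bal, y_bal = oversample_minority(X_train, y_train)

# Base models named in the list above.
base_models = [
    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ("svm", SVC(random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("nb", GaussianNB()),
]

# Scaling -> PCA -> stacked ensemble with an assumed meta-learner.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=30, random_state=0),
    StackingClassifier(
        estimators=base_models,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    ),
)
model.fit(X_bal, y_bal)

y_pred = model.predict(X_test)
acc = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {acc:.3f}")
```

Putting the scaler and PCA inside the pipeline means both are fitted on the training data only, which avoids leaking test-set statistics into the model. Random oversampling duplicates samples, so the internal cross-validation folds of the stacking step may share copies of the same sample; a synthetic oversampler or class weights would be alternatives.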
Notebooks:
- DeepMicro_Obesity.ipynb (D1)
- DeepMicro_Cirrhosis.ipynb (D4)
- DeepMicro_Colorectal_cancer.ipynb (D5)
- DeepMicro_T2D.ipynb (D6)
- DeepMicro_WT2D.ipynb (D2)
- DeepMicro_IBD.ipynb (D3)
- Yachida.ipynb (Y1)
- Kaggle.ipynb (K1)
- TaxoNN_Cirrhosis.ipynb (T1)
- TaxoNN_T2D.ipynb (T2)
Dependencies: pandas, numpy, scikit-learn, tensorflow, matplotlib
- Oh, M., Zhang, L. DeepMicro: deep representation learning for disease prediction based on microbiome data. Sci Rep 10, 6026 (2020)
- Sharma, D., Paterson, A.D., Xu, W. TaxoNN: ensemble of neural networks on stratified microbiome data for disease prediction. Bioinformatics 36(17), 4544–4550 (2020)
- Muller, E., Algavi, Y.M. & Borenstein, E. The gut microbiome-metabolome dataset collection: a curated resource for integrative meta-analysis. npj Biofilms Microbiomes 8, 79 (2022)
- Kaggle: Human Metagenomics Dataset