This project uses machine learning to predict COVID-19 infection from patient symptoms and demographic data. The goal is to support early detection, better resource allocation, and more efficient healthcare delivery.
- Predicts COVID-19 infection using Logistic Regression, Decision Trees, Random Forest, and XGBoost
- Workflow includes EDA, feature selection, and hyperparameter tuning
- Implements train-test splitting for robust model evaluation
- Provides insights into key symptoms affecting diagnosis
- Scalable approach for future disease prediction models
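
As a rough illustration of the train-test split and model comparison listed above, here is a minimal sketch. It runs on synthetic data, not the project's dataset: the three binary symptom columns, the label rule, and the 80/20 split ratio are all assumptions made for the example. XGBoost is left out so the sketch depends only on NumPy and scikit-learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
n_samples = 1000

# Hypothetical binary symptom flags (e.g. cough, fever, shortness of breath).
X = rng.integers(0, 2, size=(n_samples, 3))

# Synthetic label: each extra symptom raises the chance of a positive case.
positive_prob = 0.05 + 0.28 * X.sum(axis=1)
y = (rng.random(n_samples) < positive_prob).astype(int)

# Hold out 20% of the rows for evaluation, keeping the class ratio stable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: {scores[name]:.4f}")
```

The accuracies printed here reflect the synthetic data only and are not comparable to the project's reported results.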
├── data/ # Dataset used for training/testing
├── models/ # Trained machine learning models
├── notebooks/ # Jupyter notebooks with analysis & visualization
├── scripts/ # Python scripts for data processing & training
├── README.md # Project documentation (this file)
└── requirements.txt # Dependencies and libraries
- Programming Language: Python
- Libraries: Pandas, NumPy, Scikit-Learn, Matplotlib, Seaborn, XGBoost
- Data Storage: MySQL (for structured patient records)
- Model Evaluation: Accuracy, Precision, Recall
| Model | Accuracy |
|---|---|
| Logistic Regression | 93.14% |
| Decision Tree | 94.46% |
| Random Forest | 94.46% |
| XGBoost Classifier | 94.46% |
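
The table reports accuracy only. Precision and recall, which the project also lists as evaluation metrics, follow from the same confusion counts. The helper below is a small sketch of the standard definitions; the function name `classification_metrics` and the sample labels are made up for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    # Precision: of the predicted positives, how many were truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the true positives, how many the model found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Illustrative labels only, not project data.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

For a screening task, recall is often worth watching closely, since a missed positive case (a false negative) is not visible in accuracy when positives are rare.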
- Data Collection: Patient symptom and demographic data
- EDA & Feature Engineering: Data cleaning, correlation analysis, feature selection
- Model Training: Multiple ML models trained & tuned for performance
- Evaluation: Comparison of models on accuracy, precision, and recall
- Deployment Considerations: Scalability for predicting other diseases
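
The tuning step in the workflow above could look something like the following sketch, which uses scikit-learn's `GridSearchCV` on a decision tree. The parameter grid, the 5-fold cross-validation, and the synthetic data are assumptions for the example; the project's actual search space is not documented here.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n_samples = 600

# Synthetic binary features; the label is positive when at least two of the
# first three features are set, with 10% of labels flipped as noise.
X = rng.integers(0, 2, size=(n_samples, 4))
y = (X[:, :3].sum(axis=1) >= 2).astype(int)
flip = rng.random(n_samples) < 0.1
y = np.where(flip, 1 - y, y)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical search space; adjust to the model and data at hand.
param_grid = {"max_depth": [2, 3, 5], "min_samples_leaf": [1, 5, 20]}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,                # 5-fold cross-validation on the training split only
    scoring="accuracy",
)
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)  # scored with the refit best model
print(best_params, round(test_accuracy, 4))
```

Tuning on cross-validation folds of the training split keeps the test split untouched until the final score, so the reported accuracy is not inflated by the search.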
- Cough, Fever, and Shortness of Breath are strong indicators of COVID-19.
- Decision Tree, Random Forest, and XGBoost reached the same accuracy (94.46%), ahead of Logistic Regression (93.14%).
- Feature scaling & encoding techniques improved model performance.
- Potential for future applications in detecting other infectious diseases.
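
To make the scaling and encoding finding more concrete, here is one way such preprocessing might be wired up with a scikit-learn `Pipeline`. The column names (`age`, `sex`, `cough`, `fever`, `shortness_of_breath`), the synthetic data, and the choice of logistic regression are assumptions for this sketch, not the project's actual schema or method.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(7)
n_samples = 500

# Hypothetical patient table with one numeric, one categorical,
# and three binary symptom columns.
df = pd.DataFrame({
    "age": rng.integers(18, 90, size=n_samples),
    "sex": rng.choice(["female", "male"], size=n_samples),
    "cough": rng.integers(0, 2, size=n_samples),
    "fever": rng.integers(0, 2, size=n_samples),
    "shortness_of_breath": rng.integers(0, 2, size=n_samples),
})

# Synthetic label: positive when at least two symptoms are present.
symptoms = ["cough", "fever", "shortness_of_breath"]
y = (df[symptoms].sum(axis=1) >= 2).astype(int)

# Scale the numeric column, one-hot encode the categorical one,
# and pass the binary symptom flags through unchanged.
preprocess = ColumnTransformer(
    transformers=[
        ("scale_age", StandardScaler(), ["age"]),
        ("encode_sex", OneHotEncoder(handle_unknown="ignore"), ["sex"]),
    ],
    remainder="passthrough",
)

clf = Pipeline([
    ("preprocess", preprocess),
    ("model", LogisticRegression(max_iter=1000)),
])
clf.fit(df, y)

# Output columns: age, sex=female, sex=male, then the three symptoms.
transformed = clf.named_steps["preprocess"].transform(df)
coefs = clf.named_steps["model"].coef_[0]
print(transformed.shape, coefs.round(2))
```

Because the synthetic label depends only on the symptom flags, their coefficients come out positive here; on real data, coefficient signs and sizes would need to be read with more care.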
- Johns Hopkins University COVID-19 Dataset
- UCI Machine Learning Repository
- Research Papers on ML-based COVID-19 Diagnosis
# Clone the repository
git clone https://github.com/your-username/covid19-ml-prediction.git
cd covid19-ml-prediction
# Install dependencies
pip install -r requirements.txt
# Run model training
python scripts/train_model.py
Contributions are welcome! Feel free to fork the repo, create a feature branch, and submit a PR.
This project is licensed under the MIT License - see the LICENSE file for details.
💡 Like this project? Give it a ⭐ on GitHub!