A curated portfolio of 26 end-to-end machine learning projects — spanning healthcare AI, real-time computer vision, NLP chatbots, time series forecasting, and classical ML. Each project applies theory to a practical problem, with several fully deployed as web and GUI applications.
| 📁 Projects | 🏷️ Domains | 🚀 Deployed Apps | 🖥️ GUI Apps | ⭐ GitHub Stars |
|---|---|---|---|---|
| 26 | 6 | 5 | 3 | 1.3k+ |
Legend: 🟢 Beginner 🟡 Intermediate 🔴 Advanced | 🌐 Web App 🖥️ GUI App 📓 Notebook
6 Projects
| Project | Description | Tools & Algorithms | Level | Type |
|---|---|---|---|---|
| Brain Tumor Detection | Detects tumors in MRI scans using a CNN. Upload a scan and get a real-time prediction. | PyTorch · CNN · Flask | 🔴 | 🌐 |
| Diabetes Prediction | Predicts diabetes likelihood from 8 health markers (including glucose, BMI, insulin, and age) using the Pima Indians dataset. | scikit-learn · SVM · Flask | 🟡 | 🌐 |
| Heart Disease Prediction | Predicts cardiac risk from 13 clinical features with ~92% accuracy. | scikit-learn · Logistic Reg. · Flask | 🟡 | 🌐 |
| Arrhythmia Classification | Classifies 16 arrhythmia types from 279 ECG features (UCI dataset). | SVM · KNN · Decision Tree | 🟡 | 📓 |
| Medical Chatbot | NLP chatbot mapping user-described symptoms to diagnoses via a curated medical knowledge base. | NLTK · TF-IDF · Flask | 🔴 | 🌐 |
| MoA Prediction | Predicts drug biological activity from gene expression and cell viability data (Kaggle competition). | PyTorch · TabNet · Multi-label | 🔴 | 📓 |
9 Projects
| Project | Description | Tools & Algorithms | Level | Type |
|---|---|---|---|---|
| Driver Drowsiness Detection | Monitors driver eye state via Eye Aspect Ratio (EAR) and triggers an audio alert on drowsiness. | OpenCV · dlib · EAR | 🟡 | 🖥️ |
| Distracted Driver Detection | Classifies 10 distracted behaviors (texting, eating, phone call, etc.) from dashboard camera images. | CNN · Keras · ImageDataGenerator | 🔴 | 📓 |
| Lane Line Detection | Overlays detected road lane lines on images/video using Canny edge detection and Hough transforms. | OpenCV · Canny · Hough Transform | 🟢 | 🖥️ |
| Human Detection & Counting | Detects and counts people in live video or images using HOG + SVM. | OpenCV · HOG · SVM | 🟢 | 🖥️ |
| Gender & Age Detection | Predicts gender and age group from a face image using pre-trained Caffe models. | OpenCV DNN · Caffe Models | 🟡 | 🖥️ |
| Image Colorization | Adds realistic color to grayscale photos using the Zhang et al. deep colorization network. | OpenCV DNN · Zhang et al. · LAB space | 🟡 | 📓 |
| Smile Selfie Capture | Auto-captures a photo the instant a smile is detected in the webcam feed. No button needed. | OpenCV · Haar Cascades | 🟢 | 🖥️ |
| Emoji Creator from Emotions | Detects real-time facial emotions via webcam and overlays the matching emoji on screen. | OpenCV · CNN · FER dataset | 🟡 | 🖥️ |
| Human Activity Recognition | Classifies activities (walking, sitting, standing) from pose estimation keypoints over time. | LSTM · Keras · 2D Pose Estimation | 🔴 | 📓 |
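The Eye Aspect Ratio (EAR) behind Driver Drowsiness Detection can be sketched in a few lines. The version below assumes six (x, y) landmarks per eye in the usual p1..p6 order (the order dlib's 68-point shape predictor uses for each eye). The threshold, the frame count, and the function names are illustrative assumptions, not values taken from the project.

```python
import math


def _dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def eye_aspect_ratio(eye):
    """EAR for one eye given six landmarks ordered p1..p6.

    p1 and p4 are the horizontal eye corners; p2, p3 lie on the upper lid
    and p6, p5 on the lower lid directly beneath them.
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = _dist(p2, p6) + _dist(p3, p5)
    horizontal = _dist(p1, p4)
    return vertical / (2.0 * horizontal)


# Hypothetical defaults: tune both for your camera, frame rate, and subject.
EAR_THRESHOLD = 0.25
CONSEC_FRAMES = 20


def is_drowsy(ear_per_frame, threshold=EAR_THRESHOLD, min_frames=CONSEC_FRAMES):
    """Return True once EAR stays below `threshold` for `min_frames` frames in a row."""
    run = 0
    for ear in ear_per_frame:
        run = run + 1 if ear < threshold else 0
        if run >= min_frames:
            return True
    return False
```

Requiring a run of consecutive low-EAR frames, rather than a single frame, keeps ordinary blinks from triggering the alert.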
7 Projects
| Project | Description | Tools & Algorithms | Level | Type |
|---|---|---|---|---|
| Iris Flower Classification | Classic benchmark — classifies iris species from petal/sepal measurements. Ideal for comparing classifiers side-by-side. | KNN · SVM · Decision Tree · Naive Bayes | 🟢 | 📓 |
| Wine Quality Prediction | Predicts wine quality score (3–8) from 11 physicochemical properties like acidity, sulfates, and alcohol. | Random Forest · XGBoost | 🟡 | 📓 |
| Loan Repayment Prediction | Predicts whether a LendingClub borrower will repay based on credit history, income, and loan purpose. | Random Forest · XGBoost · Class Balancing | 🟡 | 📓 |
| College Admission Prediction | Estimates graduate admission probability from GRE, TOEFL, GPA, and research experience. | Linear Reg. · Ridge · Lasso · SVR | 🟢 | 📓 |
| Employee Turnover Prediction | Identifies employees at high risk of leaving using HR data (satisfaction, evaluations, workload, promotions). | Decision Tree · Random Forest | 🟡 | 📓 |
| Property Maintenance Fines | Predicts fine compliance from Detroit's blight dataset — a real-world class-imbalance problem (Michigan Data Science Team). | Gradient Boosting · SMOTE · AUC optimization | 🔴 | 📓 |
| Research Topic Prediction | Classifies academic papers into topic categories using NLP-based feature extraction on titles/abstracts. | TF-IDF · Naive Bayes · SVM · NLTK | 🟡 | 📓 |
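Property Maintenance Fines lists SMOTE for handling class imbalance. The core idea, interpolating between a minority sample and one of its nearest minority neighbours, can be sketched from scratch as below. This is a simplified illustration with hypothetical names (`smote_like_oversample`, `k`, `seed`), not the project's code; in practice a maintained implementation such as `SMOTE` from the imbalanced-learn package is the usual choice.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create `n_new` synthetic minority samples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its `k` nearest minority neighbours.
    `minority` is a list of equal-length numeric tuples.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic
```

Oversampling of this kind should be applied to the training split only, so that synthetic points never leak into validation or test data.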
2 Projects
| Project | Description | Tools & Algorithms | Level | Type |
|---|---|---|---|---|
| AI Room Booking Chatbot | Hotel room booking chatbot using IBM Watson. Handles slot-filling, availability queries, and booking confirmations through a web interface. | IBM Watson Assistant · Watson Discovery | 🟡 | 🌐 |
| Medical Chatbot | Symptom-to-diagnosis NLP chatbot with multi-turn conversation support. (Also listed under Healthcare.) | NLTK · Flask · TF-IDF · Cosine Similarity | 🔴 | 🌐 |
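The TF-IDF and cosine similarity matching listed for the Medical Chatbot can be illustrated with a small from-scratch sketch. It uses only the standard library rather than NLTK or scikit-learn, a smoothed IDF, and a toy knowledge base whose entries are placeholders invented for this example; all function names are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_tfidf(docs):
    """Return (idf, vectors): per-term IDF weights and one sparse vector per doc."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF so that terms seen in every document keep a non-zero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]
    return idf, vectors


def vectorize(text, idf):
    """TF-IDF vector for a query; terms outside the vocabulary are ignored."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    return {t: tf * idf[t] for t, tf in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, idf, vectors):
    """Return (index, score) of the knowledge-base entry most similar to the query."""
    q = vectorize(query, idf)
    scores = [cosine(q, v) for v in vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy placeholder entries, invented for illustration only.
KNOWLEDGE_BASE = [
    "runny nose sneezing sore throat",
    "itchy red skin rash",
    "headache sensitivity to light nausea",
]
IDF, VECTORS = build_tfidf(KNOWLEDGE_BASE)
```

A chatbot built this way would typically fall back to a clarifying question when the best score is zero or below some cutoff, since a low score means the query shares little vocabulary with any entry.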
2 Projects
| Project | Description | Tools & Algorithms | Level | Type |
|---|---|---|---|---|
| Multi-Store Sales Prediction | Forecasts daily sales for 50 items across 10 stores using three time series approaches and model ensembling. | ARIMA · Facebook Prophet · LSTM (Keras) | 🔴 | 📓 |
| IPL Score Prediction | Predicts first-innings T20 scores from ball-by-ball match data with deep EDA and multiple regression models. | Linear/Ridge Reg. · Random Forest · ANN | 🟡 | 📓 |
1 Project
| Project | Description | Tools & Algorithms | Level | Type |
|---|---|---|---|---|
| The Battle of Neighborhoods | IBM Capstone — clusters city neighborhoods using Foursquare API data to recommend optimal business locations. | K-Means · Foursquare API · Folium · Geopy | 🟡 | 📓 |
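The K-Means step listed for The Battle of Neighborhoods alternates between assigning points to the nearest centroid and recomputing each centroid as the mean of its members. The sketch below is a plain-Python illustration on generic feature tuples; the function name, parameters, and random initialisation are assumptions, and a real notebook would more likely rely on scikit-learn's `KMeans`.

```python
import random


def kmeans(points, k, max_iter=50, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples.

    Returns (centroids, labels), where labels[i] is the cluster index of points[i].
    Assumes `points` holds at least `k` entries.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return centroids, labels
```

Because the result depends on the initial centroids, it is common to run the algorithm with several seeds and keep the clustering with the lowest within-cluster distance.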
Every project follows a consistent layout for easy navigation and reuse:
ProjectName/
│
├── 📂 data/ # Raw and processed datasets
├── 📂 notebooks/ # Jupyter notebooks (EDA → Training → Evaluation)
├── 📂 models/ # Saved weights (.pkl / .h5 / .pt)
├── 📂 static/ # CSS, JS, images (Flask apps)
├── 📂 templates/ # Jinja2 HTML templates (Flask apps)
├── 📂 src/
│ ├── preprocess.py # Data cleaning & feature engineering
│ ├── train.py # Model training pipeline
│ └── predict.py # Inference logic
├── app.py # Flask entry point (web apps)
├── requirements.txt # Python dependencies
└── README.md # Project-specific documentation
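For contributors starting a new project, the layout above can be created with a short script. This is a convenience sketch only: `scaffold_project` and its `web_app` flag are hypothetical names, and the repository itself does not ship such a helper.

```python
from pathlib import Path

# Directory and file names mirror the layout shown above.
COMMON_DIRS = ["data", "notebooks", "models", "src"]
FLASK_DIRS = ["static", "templates"]
SRC_FILES = ["preprocess.py", "train.py", "predict.py"]


def scaffold_project(root, name, web_app=False):
    """Create an empty project skeleton under `root` and return its path.

    Flask-specific folders and app.py are added only when `web_app` is True.
    Existing files are left untouched.
    """
    project = Path(root) / name
    for directory in COMMON_DIRS + (FLASK_DIRS if web_app else []):
        (project / directory).mkdir(parents=True, exist_ok=True)

    files = [project / "src" / f for f in SRC_FILES]
    files += [project / "requirements.txt", project / "README.md"]
    if web_app:
        files.append(project / "app.py")
    for f in files:
        f.touch(exist_ok=True)
    return project
```

For example, `scaffold_project(".", "YourProjectName", web_app=True)` creates the full Flask variant in the current directory.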
Python 3.7+ | pip | Git
# Clone the repository
git clone https://github.com/shsarv/Machine-Learning-Projects.git
cd Machine-Learning-Projects
# Navigate to any project
cd "Heart Disease Prediction [END 2 END]"
# (Recommended) Create a virtual environment
python -m venv venv
source venv/bin/activate # Linux / macOS
venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
# For Flask web apps
python app.py
# → Open http://127.0.0.1:5000
# For notebooks
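jupyter notebook
# Deploy a Flask app to Heroku (requires the Heroku CLI, with gunicorn listed in requirements.txt)
heroku login
heroku create your-app-name
echo "web: gunicorn app:app" > Procfile
# Commit the Procfile so it is included in the push
git add Procfile
git commit -m "Add Procfile"
git push heroku main
heroku open
We welcome contributions to this project! If you would like to improve the existing codebase or contribute a new project, feel free to submit a pull request. Before submitting, please follow these steps: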
- Fork this repo
- Branch: `git checkout -b feature/YourProjectName`
- Structure your folder with a `README.md` and `requirements.txt`
- Commit: `git commit -m "Add: YourProjectName"`
- Push: `git push origin feature/YourProjectName`
- Open a Pull Request → target `main`
Please read CONTRIBUTING.md and follow the Code of Conduct.
- Integrate Explainable AI (XAI) models for better understanding of predictions in complex models.
- Add Docker support for easy containerization of all projects.
- Incorporate CI/CD pipelines using GitHub Actions for automated testing and deployment.
- Migrate some projects to Streamlit for interactive dashboards.
- Explore Reinforcement Learning for game-based AI projects.
- Expand the NLP section to include text summarization, translation, and more chatbot capabilities.
- Official Python Documentation: python.org
- Flask Documentation: flask.palletsprojects.com
- Scikit-learn User Guide: scikit-learn.org
- Keras Documentation: keras.io
- TensorFlow Documentation: tensorflow.org
- PyTorch Documentation: pytorch.org
For a deeper understanding of AI, machine learning, and data science, I recommend the following courses:
- Coursera - Machine Learning by Andrew Ng
- Coursera - AI for Everyone
- Kaggle Learn - Data Science
- The wonderful Kaggle community, which provided open datasets and insightful discussions.
- Udemy, Coursera, and edX instructors who have helped me build a solid foundation in AI.
Distributed under the MIT License. See LICENSE for more information.
Sarvesh Sharma
