Project Overview

A curated portfolio of 26 end-to-end machine learning projects spanning healthcare AI, real-time computer vision, NLP chatbots, time series forecasting, and classical ML. Each project applies theory to a practical problem, and several are fully deployed as web or GUI applications.

📊 Repository at a Glance

| 📁 Projects | 🏷️ Domains | 🚀 Deployed Apps | 🖥️ GUI Apps | ⭐ GitHub Stars |
| 26 | 6 | 5 | 3 | 1.3k+ |

All Projects by Category

Legend:   🟢 Beginner   🟡 Intermediate   🔴 Advanced  |  🌐 Web App   🖥️ GUI App   📓 Notebook

🏥 Healthcare & Medical AI

6 Projects

| Project | Description | Tools & Algorithms | Level | Type |
| Brain Tumor Detection | Detects tumors in MRI scans using a CNN. Upload a scan and get a real-time prediction. | PyTorch · CNN · Flask | 🔴 | 🌐 |
| Diabetes Prediction | Predicts diabetes likelihood from 8 health markers (glucose, BMI, insulin, age) using the Pima Indians dataset. | scikit-learn · SVM · Flask | 🟡 | 🌐 |
| Heart Disease Prediction | Predicts cardiac risk from 13 clinical features with ~92% accuracy. | scikit-learn · Logistic Reg. · Flask | 🟡 | 🌐 |
| Arrhythmia Classification | Classifies 16 arrhythmia types from 279 ECG features (UCI dataset). | SVM · KNN · Decision Tree | 🟡 | 📓 |
| Medical Chatbot | NLP chatbot mapping user-described symptoms to diagnoses via a curated medical knowledge base. | NLTK · TF-IDF · Flask | 🔴 | 🌐 |
| MoA Prediction | Predicts drug biological activity from gene expression and cell viability data (Kaggle competition). | PyTorch · TabNet · Multi-label | 🔴 | 📓 |

🎥 Computer Vision & OpenCV

9 Projects

| Project | Description | Tools & Algorithms | Level | Type |
| Driver Drowsiness Detection | Monitors driver eye state via Eye Aspect Ratio (EAR) and triggers an audio alert on drowsiness. | OpenCV · dlib · EAR | 🟡 | 🖥️ |
| Distracted Driver Detection | Classifies 10 distracted behaviors (texting, eating, phone call, etc.) from dashboard camera images. | CNN · Keras · ImageDataGenerator | 🔴 | 📓 |
| Lane Line Detection | Overlays detected road lane lines on images/video using Canny edge detection and Hough transforms. | OpenCV · Canny · Hough Transform | 🟢 | 🖥️ |
| Human Detection & Counting | Detects and counts people in live video or images using HOG + SVM. | OpenCV · HOG · SVM | 🟢 | 🖥️ |
| Gender & Age Detection | Predicts gender and age group from a face image using pre-trained Caffe models. | OpenCV DNN · Caffe Models | 🟡 | 🖥️ |
| Image Colorization | Adds realistic color to grayscale photos using the Zhang et al. deep colorization network. | OpenCV DNN · Zhang et al. · LAB space | 🟡 | 📓 |
| Smile Selfie Capture | Auto-captures a photo the instant a smile is detected in the webcam feed. No button needed. | OpenCV · Haar Cascades | 🟢 | 🖥️ |
| Emoji Creator from Emotions | Detects real-time facial emotions via webcam and overlays the matching emoji on screen. | OpenCV · CNN · FER dataset | 🟡 | 🖥️ |
| Human Activity Recognition | Classifies activities (walking, sitting, standing) from pose estimation keypoints over time. | LSTM · Keras · 2D Pose Estimation | 🔴 | 📓 |
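
The Driver Drowsiness Detection entry relies on the Eye Aspect Ratio (EAR). As a rough illustration, the sketch below computes EAR from six eye landmarks with the commonly cited formula EAR = (|p2 - p6| + |p3 - p5|) / (2 |p1 - p4|) and flags drowsiness when EAR stays low for several consecutive frames. The landmark ordering, the 0.25 threshold, the 3-frame window, and the function names are assumptions for illustration, not values taken from the project.

```python
import math

# Hypothetical tuning values; a real application would calibrate these per camera and user.
EAR_THRESHOLD = 0.25
MIN_CLOSED_FRAMES = 3


def _distance(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def eye_aspect_ratio(eye):
    """Compute EAR from six (x, y) landmarks ordered p1..p6.

    Assumed ordering: p1 and p4 are the horizontal eye corners,
    p2 and p3 lie on the upper lid, p6 and p5 on the lower lid.
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = _distance(p2, p6) + _distance(p3, p5)
    horizontal = _distance(p1, p4)
    return vertical / (2.0 * horizontal)


def is_drowsy(ear_values, threshold=EAR_THRESHOLD, min_frames=MIN_CLOSED_FRAMES):
    """Return True if EAR stays below the threshold for min_frames consecutive frames."""
    run = 0
    for ear in ear_values:
        run = run + 1 if ear < threshold else 0
        if run >= min_frames:
            return True
    return False


# Synthetic landmarks for an open and a nearly closed eye (not real detector output).
OPEN_EYE = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
CLOSED_EYE = [(0, 0), (1, 0.1), (2, 0.1), (3, 0), (2, -0.1), (1, -0.1)]
```

An open eye yields a noticeably larger ratio than a closed one, which is why a simple threshold on a run of frames can separate a blink from sustained eye closure.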

📈 Classical ML & Prediction

7 Projects

| Project | Description | Tools & Algorithms | Level | Type |
| Iris Flower Classification | Classic benchmark: classifies iris species from petal/sepal measurements. Ideal for comparing classifiers side-by-side. | KNN · SVM · Decision Tree · Naive Bayes | 🟢 | 📓 |
| Wine Quality Prediction | Predicts wine quality score (3–8) from 11 physicochemical properties like acidity, sulfates, and alcohol. | Random Forest · XGBoost | 🟡 | 📓 |
| Loan Repayment Prediction | Predicts whether a LendingClub borrower will repay based on credit history, income, and loan purpose. | Random Forest · XGBoost · Class Balancing | 🟡 | 📓 |
| College Admission Prediction | Estimates graduate admission probability from GRE, TOEFL, GPA, and research experience. | Linear Reg. · Ridge · Lasso · SVR | 🟢 | 📓 |
| Employee Turnover Prediction | Identifies employees at high risk of leaving using HR data (satisfaction, evaluations, workload, promotions). | Decision Tree · Random Forest | 🟡 | 📓 |
| Property Maintenance Fines | Predicts fine compliance from Detroit's blight dataset, a real-world class-imbalance problem (Michigan Data Science Team). | Gradient Boosting · SMOTE · AUC optimization | 🔴 | 📓 |
| Research Topic Prediction | Classifies academic papers into topic categories using NLP-based feature extraction on titles/abstracts. | TF-IDF · Naive Bayes · SVM · NLTK | 🟡 | 📓 |

💬 NLP & Conversational AI

2 Projects

| Project | Description | Tools & Algorithms | Level | Type |
| AI Room Booking Chatbot | Hotel room booking chatbot using IBM Watson. Handles slot-filling, availability queries, and booking confirmations through a web interface. | IBM Watson Assistant · Watson Discovery | 🟡 | 🌐 |
| Medical Chatbot | Symptom-to-diagnosis NLP chatbot with multi-turn conversation support. (Also listed under Healthcare.) | NLTK · Flask · TF-IDF · Cosine Similarity | 🔴 | 🌐 |
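
The Medical Chatbot pairs TF-IDF with cosine similarity. A minimal, dependency-free sketch of that retrieval idea follows; the project itself may well use library implementations instead. The two knowledge-base entries, the whitespace tokenizer, the smoothed IDF variant, and all function names are made up for illustration and carry no medical meaning.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real projects often use NLTK)."""
    return text.lower().split()


def build_idf(documents):
    """Smoothed inverse document frequency for every term in the corpus."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1.0 for term, df in doc_freq.items()}


def tfidf_vector(text, idf):
    """Sparse TF-IDF vector as a dict; terms unseen in the corpus are dropped."""
    counts = Counter(tokenize(text))
    return {term: count * idf[term] for term, count in counts.items() if term in idf}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, knowledge_base, idf):
    """Return (label, score) of the entry most similar to the query."""
    query_vec = tfidf_vector(query, idf)
    scored = [
        (label, cosine_similarity(query_vec, tfidf_vector(text, idf)))
        for label, text in knowledge_base.items()
    ]
    return max(scored, key=lambda item: item[1])


# Toy, invented knowledge base used only to exercise the functions above.
KNOWLEDGE_BASE = {
    "common cold": "runny nose sneezing sore throat",
    "migraine": "severe headache nausea light sensitivity",
}
IDF = build_idf(list(KNOWLEDGE_BASE.values()))
```

Calling `best_match("I have a headache and nausea", KNOWLEDGE_BASE, IDF)` ranks the second toy entry highest, because only that entry shares terms with the query.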

📊 Time Series & Business Analytics

2 Projects

| Project | Description | Tools & Algorithms | Level | Type |
| Multi-Store Sales Prediction | Forecasts daily sales for 50 items across 10 stores using three time series approaches and model ensembling. | ARIMA · Facebook Prophet · LSTM (Keras) | 🔴 | 📓 |
| IPL Score Prediction | Predicts first-innings T20 scores from ball-by-ball match data with deep EDA and multiple regression models. | Linear/Ridge Reg. · Random Forest · ANN | 🟡 | 📓 |

🗺️ Geospatial & Data Science

1 Project

| Project | Description | Tools & Algorithms | Level | Type |
| The Battle of Neighborhoods | IBM Capstone: clusters city neighborhoods using Foursquare API data to recommend optimal business locations. | K-Means · Foursquare API · Folium · Geopy | 🟡 | 📓 |

🛠️ Tech Stack

Languages & Environments: Python, Jupyter, Google Colab

Machine Learning & Deep Learning: scikit-learn, TensorFlow, Keras, PyTorch, XGBoost

Computer Vision & NLP: OpenCV, NLTK, IBM Watson

Data & Visualization: Pandas, NumPy, Matplotlib, Seaborn

Deployment: Flask, Heroku, Tkinter

📁 Project Structure

Every project follows a consistent layout for easy navigation and reuse:

ProjectName/
│
├── 📂 data/                  # Raw and processed datasets
├── 📂 notebooks/             # Jupyter notebooks (EDA → Training → Evaluation)
├── 📂 models/                # Saved weights (.pkl / .h5 / .pt)
├── 📂 static/                # CSS, JS, images  (Flask apps)
├── 📂 templates/             # Jinja2 HTML templates  (Flask apps)
├── 📂 src/
│   ├── preprocess.py         # Data cleaning & feature engineering
│   ├── train.py              # Model training pipeline
│   └── predict.py            # Inference logic
├── app.py                    # Flask entry point  (web apps)
├── requirements.txt          # Python dependencies
└── README.md                 # Project-specific documentation
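
To make the hand-off between src/train.py, models/, and src/predict.py more concrete, here is a minimal sketch of saving parameters to a .pkl file and loading them again for inference. It is only an illustration of the layout: the file name model.pkl, the dictionary of logistic-regression weights, and every function name are hypothetical, and individual projects may instead pickle a full estimator or use .h5 / .pt files.

```python
import math
import pickle
import tempfile
from pathlib import Path


def save_model(params, path):
    """Write model parameters to disk, creating the models/ folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as handle:
        pickle.dump(params, handle)


def load_model(path):
    """Read model parameters back.

    Only unpickle files you created yourself: loading a pickle can run arbitrary code.
    """
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


def predict(params, features):
    """Logistic-regression style inference: returns (label, probability)."""
    z = params["bias"] + sum(w * x for w, x in zip(params["weights"], features))
    probability = 1.0 / (1.0 + math.exp(-z))
    return int(probability >= 0.5), probability


def demo():
    """Round-trip hypothetical weights through a temporary models/ folder."""
    with tempfile.TemporaryDirectory() as tmp:
        model_path = Path(tmp) / "models" / "model.pkl"
        save_model({"weights": [1.0, -1.0], "bias": 0.0}, model_path)
        params = load_model(model_path)
        return predict(params, [2.0, 1.0])
```

Keeping training and inference in separate modules that communicate only through a file in models/ is what lets app.py serve predictions without rerunning training.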

🚀 Getting Started

Prerequisites

Python 3.7+  |  pip  |  Git

Clone & Run

# Clone the repository
git clone https://github.com/shsarv/Machine-Learning-Projects.git
cd Machine-Learning-Projects

# Navigate to any project
cd "Heart Disease Prediction [END 2 END]"

# (Recommended) Create and activate a virtual environment
python -m venv venv
source venv/bin/activate        # Linux / macOS
# venv\Scripts\activate         # Windows (run this instead of the line above)

# Install dependencies
pip install -r requirements.txt

# For Flask web apps
python app.py
# → Open http://127.0.0.1:5000

# For notebooks
jupyter notebook

Deploy to Heroku

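heroku login
heroku create your-app-name
# gunicorn must be listed in requirements.txt for this Procfile to work
echo "web: gunicorn app:app" > Procfile
# Commit the Procfile so it is included in the push
git add Procfile
git commit -m "Add Procfile"
git push heroku main
heroku open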
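
In the Procfile, `app:app` tells gunicorn to import the module app (app.py) and serve the WSGI callable named app inside it. In the Flask projects that callable is the Flask object, which implements the same interface. The sketch below uses a bare WSGI function from the standard library purely to show that contract; the routes, the response text, and the `call_app` helper are assumptions for illustration, not code from any project here.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Bare WSGI callable; gunicorn would find it via `app:app` if saved as app.py."""
    if environ.get("PATH_INFO", "/") == "/":
        status, body = "200 OK", b"ok"
    else:
        status, body = "404 Not Found", b"not found"
    headers = [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]


def call_app(wsgi_app, path="/"):
    """Invoke a WSGI callable in-process, without starting a server."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(wsgi_app(environ, start_response))
    return captured["status"], body
```

If the Flask object in a project is named something other than app, or lives in a different module, the Procfile entry has to change to match (module name before the colon, variable name after it).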

Contributions 🌱

Contributions are welcome, whether they improve an existing project or add a new one. To submit a pull request, follow these steps:

  1. Fork this repo
  2. Branch: git checkout -b feature/YourProjectName
  3. Structure your folder with a README.md and requirements.txt
  4. Commit: git commit -m "Add: YourProjectName"
  5. Push: git push origin feature/YourProjectName
  6. Open a Pull Request → target main

Please read CONTRIBUTING.md and follow the Code of Conduct.

Future Enhancements:

  • Integrate Explainable AI (XAI) techniques to make predictions from complex models easier to interpret.
  • Add Docker support for easy containerization of all projects.
  • Incorporate CI/CD pipelines using GitHub Actions for automated testing and deployment.
  • Migrate some projects to Streamlit for interactive dashboards.
  • Explore Reinforcement Learning for game-based AI projects.
  • Expand the NLP section to include text summarization, translation, and more chatbot capabilities.

📚 Resources and References

For a deeper understanding of AI, machine learning, and data science, I recommend the following courses:

  • Coursera - Machine Learning by Andrew Ng
  • Coursera - AI for Everyone by Andrew Ng
  • Kaggle Learn - Data Science

⭐ Acknowledgments

  • The wonderful Kaggle community, which provided open datasets and insightful discussions.
  • Udemy, Coursera, and edX instructors who have helped me build a solid foundation in AI.

License

Distributed under the MIT License. See LICENSE for more information.

Maintained By


Sarvesh Sharma