This repository contains course materials, assignments, and implementations for the Probabilistic Machine Learning course at the Department of Computer Science (DIKU), University of Copenhagen.
This course covers fundamental and advanced topics in probabilistic machine learning, including Bayesian inference, variational methods, generative models, and modern deep probabilistic models. The course combines theoretical foundations with practical implementation using Python and PyTorch.
Week 1: Introduction to Probabilistic ML, Variational Autoencoders (VAE)
- Basic regression and probabilistic modeling
- VAE implementations and theory
- Linear regression from a probabilistic perspective
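For the probabilistic view of linear regression, a Gaussian prior on the weights plus Gaussian noise gives a closed-form Gaussian posterior. A minimal NumPy sketch on made-up data (the precisions `alpha` and `beta` are illustrative choices, not values from the course):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 2x - 1 + noise (toy example)
X = np.column_stack([np.ones(50), rng.uniform(-1, 1, 50)])  # bias column + feature
true_w = np.array([-1.0, 2.0])
y = X @ true_w + rng.normal(0, 0.1, 50)

alpha, beta = 1.0, 100.0  # prior precision, noise precision (1 / 0.1**2)

# Posterior over weights is Gaussian N(m, S):
#   S = (alpha*I + beta*X^T X)^-1,  m = beta * S X^T y
S = np.linalg.inv(alpha * np.eye(2) + beta * X.T @ X)
m = beta * S @ X.T @ y

print(m)  # posterior mean, close to true_w
```

With informative data the posterior mean approaches the least-squares solution; the prior term `alpha * np.eye(2)` is what regularizes it.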
Week 2: Expectation-Maximization and Variational Inference
- EM algorithm implementation
- Gaussian Mixture Models (GMM)
- Variational inference fundamentals
- Tutorial on GMM with solutions
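The EM iteration for a Gaussian mixture alternates computing responsibilities (E-step) and re-estimating parameters from responsibility-weighted data (M-step). A toy one-dimensional sketch with two components, on synthetic data and hand-picked initial values:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two well-separated 1D clusters (synthetic data)
x = np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(3, 0.5, 200)])

# Initial mixing weights, means, variances
pi, mu, var = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])

def normal_pdf(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

for _ in range(50):
    # E-step: responsibilities r[n, k] = p(z_n = k | x_n)
    r = pi * normal_pdf(x[:, None], mu, var)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: weighted re-estimation of all parameters
    Nk = r.sum(axis=0)
    pi = Nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / Nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk

print(np.sort(mu))  # approaches the true means [-2, 3]
```

Each iteration is guaranteed not to decrease the data log-likelihood, which is the defining property of EM.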
Week 3: Sampling Methods and Monte Carlo
- Importance sampling
- Numerical integration techniques
- Cauchy distribution sampling
- MCMC methods using Pyro
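Importance sampling reweights draws from a tractable proposal to estimate expectations under a target. A sketch estimating the Gaussian tail probability P(Z > 3), where naive Monte Carlo almost never lands in the tail; the shifted proposal here is an illustrative choice, not the course's:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Target quantity: P(Z > 3) for Z ~ N(0, 1)
n = 100_000
proposal = stats.norm(loc=3, scale=1)  # proposal shifted onto the tail
x = proposal.rvs(size=n, random_state=rng)

# Importance weights: target density / proposal density
w = stats.norm.pdf(x) / proposal.pdf(x)
estimate = np.mean((x > 3) * w)

exact = stats.norm.sf(3)  # exact survival function value
print(estimate, exact)
```

Because the proposal covers the tail, the weighted estimator has far lower variance than drawing directly from N(0, 1).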
Week 4: Gaussian Processes and Advanced Sampling
- Gaussian process regression
- NUTS (No-U-Turn Sampler) implementation
- Stochastic Variational Inference (SVI)
- Real-world GP applications (Mauna Loa CO₂ data)
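Once a kernel is chosen, GP regression has closed-form posterior equations. A small NumPy sketch with an RBF kernel on toy data (not the Mauna Loa series; hyperparameters are picked by hand rather than optimized):

```python
import numpy as np

def rbf(a, b, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel k(a, b) = sigma^2 * exp(-|a-b|^2 / (2 l^2))
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

# Noisy observations of sin(x) (toy data)
rng = np.random.default_rng(3)
X = np.linspace(0, 2 * np.pi, 10)
y = np.sin(X) + rng.normal(0, 0.05, X.size)
noise = 0.05**2

# Standard GP posterior mean and covariance at test points
Xs = np.linspace(0, 2 * np.pi, 100)
K = rbf(X, X) + noise * np.eye(X.size)
Ks = rbf(Xs, X)
mean = Ks @ np.linalg.solve(K, y)
cov = rbf(Xs, Xs) - Ks @ np.linalg.solve(K, Ks.T)

print(np.max(np.abs(mean - np.sin(Xs))))  # small interpolation error
```

The diagonal of `cov` gives the predictive variance, which grows away from the training inputs.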
Week 5: Diffusion Models
- Introduction to diffusion processes
- Score-based generative models
- Implementation of basic diffusion models
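Score-based models learn the gradient of the log-density, the score ∇ₓ log p(x). For a Gaussian the score is available in closed form, which makes a handy sanity check (an illustrative example, not course code):

```python
import numpy as np

def log_density(x, sigma):
    # log N(x; 0, sigma^2)
    return -0.5 * x**2 / sigma**2 - 0.5 * np.log(2 * np.pi * sigma**2)

def score(x, sigma):
    # Analytic score: gradient of the log-density w.r.t. x is -x / sigma^2
    return -x / sigma**2

# Verify the analytic score against a central finite difference
sigma, eps = 1.5, 1e-5
xs = np.array([-2.0, 0.3, 1.7])
numeric = (log_density(xs + eps, sigma) - log_density(xs - eps, sigma)) / (2 * eps)
print(numeric, score(xs, sigma))  # the two should agree closely
```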
Week 6: Denoising Diffusion Probabilistic Models (DDPM)
- Advanced DDPM implementations
- Template notebooks for DDPM training
- Generated samples and visualizations
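The DDPM forward process has a closed-form marginal, q(x_t | x_0) = N(√ᾱ_t x_0, (1 − ᾱ_t)I), so any noising step can be sampled directly without iterating. A scalar NumPy sketch using the commonly used linear β-schedule (the schedule values are assumed defaults, not necessarily the course's):

```python
import numpy as np

# Linear beta schedule and cumulative product alpha_bar
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

def q_sample(x0, t, rng):
    # Closed-form forward step:
    # x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps
    eps = rng.normal(size=np.shape(x0))
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1 - alpha_bar[t]) * eps

rng = np.random.default_rng(4)
x0 = np.ones(10_000)
xT = q_sample(x0, T - 1, rng)
print(xT.mean(), xT.std())  # near N(0, 1) at the final step
```

At t = T the signal coefficient √ᾱ_t is nearly zero, which is why the reverse (generative) process can start from pure Gaussian noise.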
Autoencoders/: Various autoencoder implementations
- Standard autoencoders
- Variational autoencoders (VAE)
- PCA-based autoencoders
- Anomaly detection with autoencoders
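A linear autoencoder with tied weights recovers the PCA subspace, and computing PCA via the SVD makes the encoder/decoder pair explicit. A toy sketch on synthetic low-rank data:

```python
import numpy as np

rng = np.random.default_rng(5)

# Data lying near a 2-D subspace of R^10 (synthetic)
latent = rng.normal(size=(500, 2))
W = rng.normal(size=(2, 10))
X = latent @ W + rng.normal(0, 0.01, size=(500, 10))
X = X - X.mean(axis=0)  # PCA assumes centered data

# PCA via SVD: the optimal *linear* autoencoder with tied weights
U, s, Vt = np.linalg.svd(X, full_matrices=False)
encode = Vt[:2].T          # top-2 principal directions
Z = X @ encode             # "encoder": project to latent space
X_hat = Z @ encode.T       # "decoder": transpose of the encoder

mse = np.mean((X - X_hat) ** 2)
print(mse)  # reconstruction error near the noise floor
```

Nonlinear autoencoders generalize this by replacing the projection and its transpose with learned neural networks.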
CNN/: Convolutional Neural Networks
- MNIST digit classification with PyTorch
- CNN architectures and best practices
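A minimal MNIST-style CNN in PyTorch might look as follows; this is an illustrative architecture, not the repository's exact model:

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Two conv blocks followed by a linear classifier (illustrative)."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                              # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                              # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, n_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = SmallCNN()
logits = model(torch.randn(8, 1, 28, 28))  # batch of fake MNIST-shaped inputs
print(logits.shape)  # torch.Size([8, 10])
```

Training would pair this with `nn.CrossEntropyLoss` and an optimizer such as Adam, as is standard for MNIST classification.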
GPs/: Gaussian Processes
- Scikit-learn GP implementations
- Advanced GP modeling techniques
VAE/: Advanced Variational Autoencoders
- Complete VAE implementations
- Training pipelines and visualizations
- Model architectures and utilities
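For a diagonal-Gaussian encoder and a standard-normal prior, the KL term of the VAE ELBO has a closed form, KL = ½ Σ (μ² + σ² − log σ² − 1), which most implementations use directly. A small sanity-check sketch of that formula:

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    # KL( N(mu, sigma^2) || N(0, I) ) for a diagonal Gaussian:
    # 0.5 * sum(mu^2 + sigma^2 - log sigma^2 - 1)
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - log_var - 1)

# Zero when the approximate posterior equals the prior, positive otherwise
print(kl_to_standard_normal(np.zeros(4), np.zeros(4)))  # 0.0
print(kl_to_standard_normal(np.ones(4), np.zeros(4)))   # 2.0
```

In a VAE this term is added to the (negative) reconstruction log-likelihood to form the negative ELBO that training minimizes.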
- Assignment Solutions: Complete solutions for course assignments
- Exam/: Exam materials and solutions
- Section A: DDPM implementations and theory
- Section B: Advanced topics and applications
- Python 3.8+
- CUDA-compatible GPU (optional but recommended)
- Apple Silicon Mac with MPS support (optional)
- PyTorch: Primary deep learning framework
- Pyro: Probabilistic programming and Bayesian inference
- scikit-learn: Gaussian processes and traditional ML
- NumPy/SciPy: Numerical computing
- Matplotlib/Seaborn: Visualization
- Jupyter: Interactive notebooks
- TensorBoard: Experiment tracking
- ArviZ: Bayesian analysis and visualization
- TorchInfo: Model architecture summaries
- Bayesian Inference: Prior, likelihood, and posterior distributions
- Variational Methods: Variational inference and approximation techniques
- Expectation-Maximization: Parameter estimation for latent variable models
- Monte Carlo Methods: Sampling techniques and numerical integration
- Gaussian Processes: Non-parametric Bayesian modeling
- Variational Autoencoders (VAE): Deep generative modeling
- Autoencoders: Dimensionality reduction and representation learning
- Diffusion Models: Score-based and DDPM approaches
- Gaussian Mixture Models: Classical mixture modeling
- Stochastic Variational Inference: Scalable Bayesian methods
- MCMC Sampling: NUTS and advanced sampling techniques
- Neural Networks in a Bayesian Setting: Uncertainty quantification
- Anomaly Detection: Probabilistic approaches
- Week 1: Start with `week1/week_1.ipynb` for basic probabilistic regression
- Autoencoders: Explore `Autoencoders/Autoencoder.ipynb` for a hands-on implementation
- VAE: Check `week1/vae_simple.ipynb` for variational autoencoder basics
- Gaussian Processes: Run `week4/gaussian_processes.ipynb` for a GP introduction
- Reproducible research with fixed random seeds
- GPU acceleration (CUDA/MPS) when available
- Type annotations for better code quality
- Cross-validation and proper model evaluation
- Visualization of results and model behavior
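The seeding and device-selection points above can be sketched as follows; this is a common PyTorch pattern, not necessarily the repository's exact utilities:

```python
import random
import numpy as np
import torch

def set_seed(seed: int = 0) -> None:
    # Fix all relevant RNGs for reproducible runs
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)

def get_device() -> torch.device:
    # Prefer CUDA, then Apple-Silicon MPS, then fall back to CPU
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

set_seed(42)
a = torch.randn(3)
set_seed(42)
b = torch.randn(3)
print(torch.equal(a, b), get_device())  # True, plus the selected device
```

Models and tensors are then moved with `.to(get_device())` so the same code runs on CUDA, MPS, or CPU.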
By the end of this course, students will be able to:
- Implement fundamental probabilistic ML algorithms from scratch
- Apply Bayesian inference to real-world problems
- Design and train deep generative models
- Use modern probabilistic programming frameworks
- Critically evaluate uncertainty in machine learning models
- Implement advanced sampling techniques for complex distributions
- Course lecture notes available in the `HTML/` directory
- Model checkpoints and trained models in the respective `models/` folders
- Generated samples and experiment results in `figures/` directories
- Utility scripts for notebook management in `scripts/`