Open-Source Learning Tools

Machine Learning Explained

Interactive tools for understanding ML algorithms step by step. Pick a topic and start exploring. The project is under active development, with new apps on the way.

14 interactive apps · 4 guided tours
📚
⚙️
7 tools
🔧
ML_02 / ML_03
Data Preparation & Feature Engineering
Complete pipeline from data types through EDA to imputation, scaling, encoding, and feature selection — all with the Heart Failure dataset.
EDA · Preparation · Interactive
📈
Linear Regression
Simple and Multiple Linear Regression with evaluation metrics, regularization, Grid Search and K-Fold Cross Validation.
Evaluation Metrics · Regularization · Cross Validation
📊
η² Categorical–Numerical Associations
An interactive explainer: η² as an effect size for categorical → numerical associations, and its reading as explained variance.
NC Association · Explained Variance · Data Prep
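As a rough illustration of the quantity this explainer covers, η² can be computed as the between-group sum of squares divided by the total sum of squares. The sketch below assumes the data arrive as a dict mapping each category to its numerical values; the function name `eta_squared` and the toy data are hypothetical, not taken from the app.

```python
from statistics import fmean


def eta_squared(groups):
    """Return eta squared = SS_between / SS_total for {category: [values]}."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = fmean(all_values)
    # Total variation of the numerical variable around its grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    # Variation explained by the category means (weighted by group size).
    ss_between = sum(
        len(values) * (fmean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    return ss_between / ss_total


# Hypothetical toy data: one numerical feature split by a two-level category.
toy = {"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]}
eta2 = eta_squared(toy)
print(f"eta squared: {eta2:.4f}")
```

A value near 1 means the category accounts for most of the variance; a value of 0 means all category means coincide.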
🎯
Bias & Variance Tradeoff
Bias–Variance & Generalization Explorer. Inspired by S. Raschka, STAT 479 — Overfitting and Underfitting.
Overfitting · Bias–Variance · S. Raschka
⚖️
Grid Search & K-Fold Cross Validation
How GridSearchCV calculates validation-fold statistics to find the best α for Lasso, and how refit=True leverages your full training data.
GridSearch · K-Fold CV · Scikit-Learn
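The idea behind the search can be sketched without scikit-learn: for each candidate α, fit on k−1 folds, score on the held-out fold, average the fold scores, then refit the winner on all training data. The code below is a stand-in, not GridSearchCV itself. It assumes a single feature with no intercept, so the Lasso solution has a closed form (soft-thresholding, using the 1/(2n) scaling of the squared-error term), and it picks the lowest mean validation MSE. All helper names are hypothetical.

```python
def soft_threshold(z, alpha):
    """Shrink z toward zero by alpha (the Lasso proximal step)."""
    if z > alpha:
        return z - alpha
    if z < -alpha:
        return z + alpha
    return 0.0


def fit_lasso_1d(xs, ys, alpha):
    """Closed-form 1-D Lasso without intercept: min (1/2n)*sum((y-w*x)^2) + alpha*|w|."""
    n = len(xs)
    rho = sum(x * y for x, y in zip(xs, ys)) / n
    denom = sum(x * x for x in xs) / n
    return soft_threshold(rho, alpha) / denom


def mse(xs, ys, w):
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def k_fold_indices(n, k):
    """Split range(n) into k contiguous, unshuffled folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def grid_search_cv(xs, ys, alphas, k=3):
    folds = k_fold_indices(len(xs), k)
    mean_scores = {}
    for alpha in alphas:
        fold_scores = []
        for val_idx in folds:
            held_out = set(val_idx)
            train_idx = [i for i in range(len(xs)) if i not in held_out]
            w = fit_lasso_1d([xs[i] for i in train_idx],
                             [ys[i] for i in train_idx], alpha)
            fold_scores.append(mse([xs[i] for i in val_idx],
                                   [ys[i] for i in val_idx], w))
        # The validation-fold statistic used for model selection.
        mean_scores[alpha] = sum(fold_scores) / k
    best_alpha = min(mean_scores, key=mean_scores.get)
    # Refit step: train the winning alpha on the full training data.
    best_w = fit_lasso_1d(xs, ys, best_alpha)
    return best_alpha, best_w, mean_scores


# Hypothetical noise-free toy data, y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0 * x for x in xs]
best_alpha, best_w, mean_scores = grid_search_cv(xs, ys, [0.0, 0.5, 1.0], k=3)
print(best_alpha, best_w)
```

On noise-free data the unpenalized fit wins; with noisy data a nonzero α may score better on the held-out folds.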
🔗
Sklearn Pipeline
Chaining preprocessing steps and a model into a single, reproducible workflow with scikit-learn Pipelines.
Preprocessing · Pipelines · Scikit-Learn
📐
ML_07
PCA (SVD & Eigendecomposition)
Principal Component Analysis by two equivalent approaches: Singular Value Decomposition of the standardized data and eigendecomposition of its covariance matrix.
PCA · SVD · Dim. Reduction · Feature Extraction
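Why the two routes agree can be shown in a few lines. This is a sketch under stated assumptions: X is the n × p data matrix after standardizing each column, the SVD is the thin one, and the symbols (U, Σ, V, C, Z) are chosen here for illustration rather than taken from the app.

```latex
% Assumption: X is n-by-p with standardized (zero-mean, unit-variance) columns.
\begin{align}
X &= U \Sigma V^{\top}, \qquad U^{\top} U = I, \quad V^{\top} V = I \\
C &= \frac{1}{n-1} X^{\top} X
   = \frac{1}{n-1} V \Sigma U^{\top} U \Sigma V^{\top}
   = V \left( \frac{\Sigma^{2}}{n-1} \right) V^{\top} \\
\lambda_i &= \frac{\sigma_i^{2}}{n-1}, \qquad Z = X V = U \Sigma
\end{align}
```

The columns of V are therefore the eigenvectors of C (the principal directions), each eigenvalue is a squared singular value divided by n − 1, and the projected scores Z can be read from either decomposition.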
🧬
Softmax Regression (Multinomial Logistic Regression)
Step-by-step walkthrough of the forward pass (prediction) and the backward pass (training via error backpropagation), with a live simulator.
Multiclass Classification · Softmax · Error backpropagation
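The two passes can be sketched for a single training example. The code below assumes a cross-entropy loss, for which the gradient with respect to the logits reduces to the predicted probabilities minus the one-hot target. The function names, learning rate, and toy values are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(W, b, x):
    """Forward pass: one logit per class, then softmax probabilities."""
    logits = [sum(w * xj for w, xj in zip(row, x)) + bk
              for row, bk in zip(W, b)]
    return softmax(logits)


def gradients(W, b, x, y_onehot):
    """Backward pass for cross-entropy loss: dL/dlogits = probs - y."""
    probs = forward(W, b, x)
    delta = [p - y for p, y in zip(probs, y_onehot)]
    grad_W = [[d * xj for xj in x] for d in delta]
    grad_b = list(delta)
    return grad_W, grad_b


def sgd_step(W, b, x, y_onehot, lr=0.1):
    """One gradient-descent update on a single example."""
    grad_W, grad_b = gradients(W, b, x, y_onehot)
    new_W = [[w - lr * g for w, g in zip(row, grow)]
             for row, grow in zip(W, grad_W)]
    new_b = [bk - lr * g for bk, g in zip(b, grad_b)]
    return new_W, new_b


# Hypothetical toy problem: 3 classes, 2 features, true class index 1.
W = [[0.0, 0.0] for _ in range(3)]
b = [0.0, 0.0, 0.0]
x = [1.0, 2.0]
y = [0.0, 1.0, 0.0]

before = forward(W, b, x)[1]
W2, b2 = sgd_step(W, b, x, y)
after = forward(W2, b2, x)[1]
print(f"P(true class) before: {before:.3f}, after: {after:.3f}")
```

With all-zero weights the model starts at uniform probabilities, and one update moves probability mass toward the true class.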
⚖️
1 tool
🧬
SMOTE & SMOTE-NC
Step-by-step walkthrough of synthetic oversampling. See every distance calculation, k-NN selection, and synthetic sample generation live.
Oversampling · k-NN · Chawla et al.
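The core SMOTE step for continuous features can be sketched as follows: find the k nearest minority neighbours of a point, pick one at random, and interpolate between the two. This sketch assumes Euclidean distance and covers plain SMOTE only; the categorical handling of SMOTE-NC is not shown. Function names and toy points are hypothetical.

```python
import math
import random


def k_nearest(point, others, k):
    """Return the k points in `others` closest to `point` (Euclidean)."""
    return sorted(others, key=lambda o: math.dist(point, o))[:k]


def smote_sample(point, minority, k, rng):
    """Create one synthetic sample on the segment between point and a neighbour."""
    candidates = [m for m in minority if m is not point]
    neighbor = rng.choice(k_nearest(point, candidates, k))
    lam = rng.random()  # interpolation factor in [0, 1)
    return [p + lam * (q - p) for p, q in zip(point, neighbor)]


# Hypothetical minority-class points; the last one is far from the rest.
minority = [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0], [5.0, 5.0]]
rng = random.Random(0)
synthetic = smote_sample(minority[0], minority, k=2, rng=rng)
print(synthetic)
```

Because the new point lies on a line segment between two existing minority samples, it stays inside the region the minority class already occupies.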
🧠
🤖
4 tools
🤖
Vaswani et al. 2017
Attention Is All You Need
Interactive guide to the Transformer architecture — self-attention, Q/K/V matrices, multi-head attention, positional encoding, masking, and the full encoder-decoder pipeline.
Self-Attention · Transformers · Encoder-Decoder
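Scaled dot-product attention, softmax(QKᵀ/√d_k)V, is small enough to sketch in plain Python. The version below handles a single head and adds an optional causal mask of the kind used in decoders. The helper names and the toy matrices are hypothetical, and learned projection matrices are left out.

```python
import math


def softmax(row):
    shift = max(row)
    exps = [math.exp(v - shift) for v in row]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(M):
    return [list(col) for col in zip(*M)]


def attention(Q, K, V, causal=False):
    """Single-head scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    scores = matmul(Q, transpose(K))
    scaled = [[s / math.sqrt(d_k) for s in row] for row in scores]
    if causal:
        # Position i may only attend to positions j <= i.
        scaled = [[s if j <= i else -math.inf for j, s in enumerate(row)]
                  for i, row in enumerate(scaled)]
    weights = [softmax(row) for row in scaled]
    return matmul(weights, V), weights


# Hypothetical toy input: 3 tokens, dimension 2, with Q = K = V.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, weights = attention(X, X, X, causal=True)
for row in weights:
    print([round(w, 3) for w in row])
```

With the causal mask on, the first token can only attend to itself, so its output is exactly its own value vector.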
🔮
GPT-Decoder Simulator
17 step-by-step stages of the forward pass: tokenization → embedding → positional encoding → Q/K/V → masked attention → FFN → logits → softmax → prediction.
GPT · Decoder · Forward Pass
🔍
RAG-System Simulator
Retrieval-Augmented Generation step by step — embedding, vector search, context injection, and generation pipeline.
RAG · Embedding · Retrieval
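The retrieval half of the pipeline can be sketched with a deliberately crude embedding. The code below assumes a bag-of-words count vector as a stand-in for a real embedding model, ranks documents by cosine similarity, and pastes the top hit into a prompt. The generation step is omitted, and every name and document here is hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Hypothetical stand-in for an embedding model: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing terms
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=1):
    """Vector search: rank documents by similarity to the query embedding."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, context_docs):
    """Context injection: place retrieved text ahead of the question."""
    context = "\n".join(context_docs)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


docs = [
    "lasso adds an l1 penalty",
    "pca projects data onto principal components",
    "smote creates synthetic minority samples",
]
query = "what does pca do"
top = retrieve(query, docs, top_k=1)
prompt = build_prompt(query, top)
print(prompt)
```

A real system would swap `embed` for a learned embedding model and the sorted scan for a vector index, but the flow of embed, search, inject stays the same.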
Hu et al. 2021
Low-Rank Adaptation (LoRA Fine-Tuning)
A parameter-efficient fine-tuning method that freezes pre-trained weights and learns a small rank-r update — dramatically reducing trainable parameters.
LoRA · Fine-Tuning · Efficient
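The parameter savings are easy to check by hand. For a frozen d × k weight matrix, a rank-r update B·A with B of shape d × r and A of shape r × k trains r·(d + k) values instead of d·k. The sketch below counts both and shows the adapted forward pass h = W₀x + (α/r)·B(Ax). The function names, the α/r scaling argument, and the toy matrices are assumptions for illustration.

```python
def lora_param_counts(d, k, r):
    """Return (full fine-tuning params, LoRA trainable params) for one d x k matrix."""
    return d * k, r * (d + k)


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(W0, A, B, x, alpha, r):
    """h = W0 x + (alpha / r) * B (A x); W0 stays frozen, only A and B train."""
    base = matvec(W0, x)
    low_rank = matvec(B, matvec(A, x))
    return [h + (alpha / r) * u for h, u in zip(base, low_rank)]


full, lora = lora_param_counts(d=4096, k=4096, r=8)
print(f"full: {full:,}  LoRA: {lora:,}  ratio: {lora / full:.4%}")

# Hypothetical toy layer: d = 2, k = 3, r = 1, with B initialised to zero.
W0 = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
A = [[0.5, -0.5, 1.0]]
B = [[0.0], [0.0]]
x = [1.0, 1.0, 1.0]
h = lora_forward(W0, A, B, x, alpha=2.0, r=1)
print(h)
```

Starting with B at zero means the adapted layer initially reproduces the frozen layer exactly, so training begins from the pre-trained behaviour.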
RAG & Fine-Tuned LLM Evaluation Guide
Evaluating Retrieval-Augmented Generation systems and fine-tuned Large Language Models. An introduction with examples, mathematical derivations, and a discussion of common evaluation metrics.
LoRA · Fine-Tuning · Efficient
🎓
🗺️
4 tours