Open-Source Learning Tools

Machine Learning Explained

Interactive tools for understanding ML algorithms step by step. Pick a topic and start exploring. The project is under active development, with new apps on the way.

14 interactive apps · 4 guided tours

📚

⚙️

ML Foundations 8 tools

Data Preparation & Feature Engineering

Complete pipeline from data types through EDA to imputation, scaling, encoding, and feature selection — all with the Heart Failure dataset.

EDA · Preparation · Interactive

Linear Regression

Simple and Multiple Linear Regression with evaluation metrics, regularization, Grid Search and K-Fold Cross Validation.

Evaluation Metrics · Regularization · Cross Validation

η² Categorical–Numerical Associations

An interactive explainer of η² (eta squared): the effect size for the association between a categorical and a numerical variable, read as the share of variance explained.

NC Association · Explained Variance · Data Prep
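As a rough illustration of the quantity this tool explains, the sketch below assumes the standard one-way definition η² = SS_between / SS_total. The function name `eta_squared` and the toy data are made up for this example and are not taken from the app.

```python
import numpy as np

def eta_squared(categories, values):
    """Share of the variance in `values` explained by group membership."""
    categories = np.asarray(categories)
    values = np.asarray(values, dtype=float)
    grand_mean = values.mean()
    # Total sum of squares around the grand mean
    ss_total = ((values - grand_mean) ** 2).sum()
    # Between-group sum of squares: group size times squared mean offset
    ss_between = 0.0
    for level in np.unique(categories):
        group = values[categories == level]
        ss_between += group.size * (group.mean() - grand_mean) ** 2
    return ss_between / ss_total

# Toy data (hypothetical): two groups with clearly different means
groups = ["a", "a", "b", "b"]
y = [1.0, 3.0, 5.0, 7.0]
eta2 = eta_squared(groups, y)  # 16 / 20 = 0.8
```

A value near 1 means the categories separate the numerical variable almost completely; a value near 0 means they carry little information about it.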

Bias & Variance Tradeoff

Bias–Variance & Generalization Explorer. Inspired by S. Raschka, STAT 479 — Overfitting and Underfitting.

Overfitting · Bias–Variance · S. Raschka

Grid Search & K-Fold Cross Validation

How GridSearchCV calculates validation-fold statistics to find the best α for Lasso, and how refit=True leverages your full training data.

GridSearch · K-Fold CV · Scikit-Learn
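A minimal sketch of the mechanism described above, assuming synthetic data, a four-value α grid, and 5 folds (all chosen here for illustration). It shows that the mean validation score is the average of the per-fold scores and that `refit=True` leaves a model trained on all of the training data.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic regression data (assumption): only two features matter
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))
y = X @ np.array([3.0, 0.0, -2.0, 0.0, 0.0]) + rng.normal(scale=0.5, size=120)

alphas = [0.001, 0.01, 0.1, 1.0]
n_splits = 5
search = GridSearchCV(
    Lasso(max_iter=10_000),
    param_grid={"alpha": alphas},
    cv=KFold(n_splits=n_splits, shuffle=True, random_state=0),
    scoring="neg_mean_squared_error",
    refit=True,  # retrain the best alpha on the full training set
)
search.fit(X, y)

# One row per fold, one column per alpha
split_scores = np.array(
    [search.cv_results_[f"split{i}_test_score"] for i in range(n_splits)]
)
mean_scores = search.cv_results_["mean_test_score"]
best_alpha = search.best_params_["alpha"]
final_model = search.best_estimator_  # fitted on all 120 rows
```

Because the scoring is a negated error, the best α is the one with the highest mean score across folds.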

Sklearn Pipeline

Chaining preprocessing steps and a model into a single, reproducible workflow with scikit-learn Pipelines.

Preprocessing · Pipelines · Scikit-Learn
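A small sketch of such a chained workflow. The step names, the choice of imputer, scaler, and classifier, and the synthetic data are assumptions for illustration and may differ from what the app demonstrates.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (assumption)
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X[rng.random(X.shape) < 0.05] = np.nan  # inject a few missing values

pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # fill missing values
    ("scale", StandardScaler()),                   # zero mean, unit variance
    ("model", LogisticRegression()),               # final estimator
])

# fit() learns every preprocessing step and the model from the same data;
# predict() replays the fitted steps in order before calling the model.
pipe.fit(X, y)
preds = pipe.predict(X)
```

Keeping the preprocessing inside the pipeline means cross-validation refits the imputer and scaler on each training fold, which avoids leaking validation data into those steps.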

PCA (SVD & Eigendecomposition)

Principal Component Analysis by two equivalent approaches: Singular Value Decomposition of the standardized data and eigendecomposition of its covariance matrix.

PCA · SVD · Dim. Reduction · Feature Extraction
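The equivalence of the two routes can be checked numerically. The sketch below uses random data (an assumption) and compares the eigenvalues of the covariance matrix of the standardized data with the squared singular values divided by n - 1.

```python
import numpy as np

# Synthetic correlated data (assumption)
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

# Standardize: zero mean, unit sample variance per column
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
n = Z.shape[0]

# Route 1: eigendecomposition of the covariance matrix of Z
cov = (Z.T @ Z) / (n - 1)
eigvals, eigvecs = np.linalg.eigh(cov)   # ascending order
order = np.argsort(eigvals)[::-1]        # sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Route 2: SVD of Z itself
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
svd_vars = S**2 / (n - 1)                # variance along each component

# Principal component scores from either route (equal up to sign)
scores_eig = Z @ eigvecs
scores_svd = U * S
```

The component directions agree only up to sign, which is why comparisons between `eigvecs` and `Vt.T` should use absolute values.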

Softmax Regression (Multinomial Logistic Regression)

Step-by-step walkthrough of the forward pass (prediction) and the backward pass (training via error backpropagation), with a live simulator.

Multiclass Classification · Softmax · Error backpropagation
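The two passes can be sketched in a few lines of NumPy. This is a minimal version assuming cross-entropy loss, plain batch gradient descent, and synthetic data; the function names, learning rate, and step count are illustrative and not taken from the simulator.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

def train_step(X, Y_onehot, W, b, lr=0.5):
    # Forward pass: logits -> class probabilities -> cross-entropy loss
    probs = softmax(X @ W + b)
    loss = -np.mean(np.sum(Y_onehot * np.log(probs + 1e-12), axis=1))
    # Backward pass: gradient of the mean loss w.r.t. the logits is (p - y) / n
    grad_logits = (probs - Y_onehot) / X.shape[0]
    W = W - lr * (X.T @ grad_logits)
    b = b - lr * grad_logits.sum(axis=0)
    return W, b, loss

# Synthetic 3-class problem (assumption): labels come from a hidden linear rule
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
labels = np.argmax(X @ rng.normal(size=(2, 3)), axis=1)
Y_onehot = np.eye(3)[labels]

W, b = np.zeros((2, 3)), np.zeros(3)
losses = []
for _ in range(200):
    W, b, loss = train_step(X, Y_onehot, W, b)
    losses.append(loss)
```

With zero-initialized weights every class starts at probability 1/3, so the first loss equals ln 3; it then falls as the weights align with the hidden rule.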

⚖️

Imbalanced Data 1 tool

SMOTE & SMOTE-NC

Step-by-step walkthrough of synthetic oversampling. See every distance calculation, k-NN selection, and synthetic sample generation live.

Oversampling · k-NN · Chawla et al.
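The three stages named above (distance calculation, k-NN selection, synthetic sample generation) can be sketched for numerical features as follows. The helper `smote_sample` and the toy points are hypothetical; SMOTE-NC's handling of categorical features is not covered here.

```python
import numpy as np

def smote_sample(minority, index, k, rng):
    """Create one synthetic point from minority[index] (numerical features only)."""
    x = minority[index]
    # 1. Euclidean distance from x to every minority sample
    dists = np.linalg.norm(minority - x, axis=1)
    dists[index] = np.inf  # exclude the point itself
    # 2. Indices of the k nearest minority neighbours
    neighbours = np.argsort(dists)[:k]
    # 3. Pick one neighbour and interpolate at a random position between the two
    chosen = minority[rng.choice(neighbours)]
    gap = rng.random()  # uniform in [0, 1)
    return x + gap * (chosen - x)

# Toy minority class (hypothetical data)
minority = np.array([[1.0, 1.0], [1.5, 1.2], [2.0, 0.8], [1.2, 1.9], [5.0, 5.0]])
rng = np.random.default_rng(0)
synthetic = smote_sample(minority, index=0, k=3, rng=rng)
```

Each synthetic point lies on the line segment between an existing minority sample and one of its neighbours, so it never leaves the region already occupied by the minority class.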

🧠

🤖

Deep Learning & NLP 5 tools

Vaswani et al. 2017

Attention Is All You Need

Interactive guide to the Transformer architecture — self-attention, Q/K/V matrices, multi-head attention, positional encoding, masking, and the full encoder-decoder pipeline.

Self-Attention · Transformers · Encoder-Decoder
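A single-head sketch of scaled dot-product self-attention, with an optional causal mask. The dimensions and random weights are assumptions; multi-head attention and positional encoding are left out to keep the example short.

```python
import numpy as np

def softmax(x):
    """Row-wise softmax with max-subtraction for numerical stability."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, W_q, W_k, W_v, causal=False):
    # Project the same input into queries, keys, and values
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    # Similarity of every query with every key, scaled by sqrt(d_k)
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    if causal:
        # Block attention to later positions (strict upper triangle)
        future = np.triu(np.ones(scores.shape, dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    weights = softmax(scores)
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8  # illustrative sizes
X = rng.normal(size=(seq_len, d_model))
W_q, W_k, W_v = (rng.normal(size=(d_model, d_k)) for _ in range(3))

output, weights = self_attention(X, W_q, W_k, W_v)
_, masked_weights = self_attention(X, W_q, W_k, W_v, causal=True)
```

Each row of `weights` sums to 1 and says how much one token draws from every other token; with `causal=True` the entries above the diagonal are exactly zero.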

GPT-Decoder Simulator

17 step-by-step stages of the forward pass: tokenization → embedding → positional encoding → Q/K/V → masked attention → FFN → logits → softmax → prediction.

GPT · Decoder · Forward Pass
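The last three stages of that chain (logits → softmax → prediction) can be sketched on their own. The four-word vocabulary, the random hidden state, and the output matrix `W_unembed` are hypothetical stand-ins for what the earlier stages would produce, and greedy selection is just one possible decoding choice.

```python
import numpy as np

vocab = ["the", "cat", "sat", "mat"]  # hypothetical toy vocabulary
rng = np.random.default_rng(0)
d_model = 8

# Stand-ins for the decoder output at the last position and the output projection
hidden = rng.normal(size=d_model)
W_unembed = rng.normal(size=(d_model, len(vocab)))

# Logits: one raw score per vocabulary entry
logits = hidden @ W_unembed
# Softmax: turn scores into a probability distribution
probs = np.exp(logits - logits.max())
probs /= probs.sum()
# Prediction: greedy choice of the most likely token
next_token = vocab[int(np.argmax(probs))]
```

Sampling from `probs` instead of taking the argmax would give more varied output at the cost of determinism.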

RAG-System Simulator

Retrieval-Augmented Generation step by step — embedding, vector search, context injection, and generation pipeline.

RAG · Embedding · Retrieval
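A toy sketch of the first three stages (embedding, vector search, context injection). The bag-of-words `embed` function is a deliberately crude stand-in for a real embedding model, and the documents, query, and prompt template are invented for illustration; the generation step is left to whichever language model receives the prompt.

```python
import numpy as np

documents = [
    "LoRA freezes pre-trained weights and learns a low-rank update.",
    "SMOTE creates synthetic minority samples between nearest neighbours.",
    "PCA projects data onto directions of maximum variance.",
]

def tokenize(text):
    return [w.strip(".,?!").lower() for w in text.split()]

vocabulary = sorted({w for doc in documents for w in tokenize(doc)})

def embed(text):
    """Toy bag-of-words embedding (assumption); unit length if non-zero."""
    tokens = tokenize(text)
    vec = np.array([tokens.count(w) for w in vocabulary], dtype=float)
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# Embedding: one vector per document, stored as the "vector index"
doc_matrix = np.stack([embed(d) for d in documents])

def retrieve(query, top_k=1):
    # Vector search: cosine similarity reduces to a dot product of unit vectors
    scores = doc_matrix @ embed(query)
    best = np.argsort(scores)[::-1][:top_k]
    return [documents[i] for i in best]

# Context injection: place the retrieved text in front of the question
query = "How does SMOTE generate synthetic samples?"
context = retrieve(query)
prompt = f"Context:\n{context[0]}\n\nQuestion: {query}\nAnswer:"
```

Only the second document shares words with the query, so it is the one injected into the prompt.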

Low-Rank Adaptation (LoRA Fine-Tuning)

A parameter-efficient fine-tuning method that freezes pre-trained weights and learns a small rank-r update — dramatically reducing trainable parameters.

LoRA · Fine-Tuning · Efficient
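The parameter saving is easy to see on a single weight matrix. The sketch below assumes a 64×64 layer with rank r = 4 and the common conventions of an α/r scaling factor and a zero-initialized B; the sizes are illustrative and no training loop is shown.

```python
import numpy as np

d_out, d_in, r, alpha = 64, 64, 4, 8  # illustrative sizes (assumption)
rng = np.random.default_rng(0)

W0 = rng.normal(size=(d_out, d_in))         # pre-trained weights, frozen
A = rng.normal(scale=0.01, size=(r, d_in))  # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, starts at zero

def lora_forward(x):
    # Frozen path plus the scaled rank-r update B @ A
    return W0 @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
full_params = W0.size           # 64 * 64 = 4096
lora_params = A.size + B.size   # 4 * 64 + 64 * 4 = 512
```

Because B starts at zero, the adapted layer initially reproduces the pre-trained output exactly, and only the 512 adapter values would receive gradient updates instead of all 4096 weights.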

RAG & Fine-Tuned LLM Evaluation Guide

Evaluating Retrieval-Augmented Generation Systems and Fine-Tuned Large Language Models. An introduction with examples, mathematical derivation, and discussion of common evaluation metrics.

LoRA · Fine-Tuning · Efficient

🎓

🗺️

Apps with Guided Tours 4 tours

🇬🇧 English

PCA Interactive — Guided Tour

10-step walkthrough of projection, covariance, eigendecomposition, scree plots, SVD, and real-data PC regression — with "Try this" prompts at every step.

Guided Tour · PCA · SVD · 9 Tabs

Classification Interactive — Guided Tour

12-step walkthrough across 11 tabs: logistic regression, training, confusion matrix, decision boundary, metrics, ROC/PR curves, imbalanced data, multiclass, and class weighting — with concrete "Try this" prompts.

Guided Tour · Logistic Regression · ROC · Softmax · 11 Tabs

🇩🇪 Deutsch

PCA Interaktiv — Geführte Tour

10-step tour: projection, covariance, eigendecomposition, scree plots, SVD, and real-data PC regression. The app is entirely in German.

Guided Tour · PCA · SVD · 9 Tabs

Klassifikation Interaktiv — Geführte Tour

12-step tour across 11 tabs: logistic regression, training, confusion matrix, decision boundary, metrics, ROC/PR curves, imbalanced data, multiclass, and class weighting, with concrete "Try this" tasks. The app is entirely in German.

Guided Tour · Logistic Regression · ROC · Softmax · 11 Tabs