Kernel Methods

Jan 27, 2026 ML Math Derivations 28 min read

ML Math Derivations (8): Support Vector Machines

Complete SVM derivation from maximum margin to Lagrangian duality, KKT conditions, soft margin, kernel trick, and SMO algorithm with step-by-step proofs and Python code.

Dec 30, 2021 Kernel Methods 38 min read

Kernel Methods (8): Deep Kernel Learning vs Deep Learning — A Practitioner's Guide

Deep kernel learning combines neural feature extractors with kernel methods. Covers when to pick kernels over deep nets, a hyperparameter tuning playbook, common failure modes, and a final 5-step kernel decision flowchart.

Dec 24, 2021 Kernel Methods 52 min read

Kernel Methods (7): Large-Scale Kernels — Nystrom Approximation and Random Fourier Features

Exact kernel methods cost O(n^3) time in the number of training points n. Nystrom approximation and Random Fourier Features bring that down to time linear in n while keeping most of the kernel trick's expressive power.

Dec 19, 2021 Kernel Methods 34 min read

Kernel Methods (6): Gaussian Processes — When Kernels Meet Bayesian Inference

Gaussian Processes turn kernels into a Bayesian model — posterior with uncertainty, marginal likelihood for hyperparameters, and the kernel as a prior over functions.

Dec 14, 2021 Kernel Methods 44 min read

Kernel Methods (5): Kernel SVM, Kernel PCA, and Kernel Ridge Regression

The classic algorithms, kernelized — SVM's dual form, Kernel PCA's eigendecomposition in feature space, and Kernel Ridge's closed-form solution. With sklearn code and worked examples.

Dec 9, 2021 Kernel Methods 44 min read

Kernel Methods (4): Common Kernel Families — RBF, Matern, Polynomial, Periodic, and More

A tour of the kernels you'll actually use: RBF (Gaussian), polynomial, linear, Matern, periodic, sigmoid. When to pick which, hyperparameter intuition, and how kernels combine.

Dec 4, 2021 Kernel Methods 44 min read

Kernel Methods (3): RKHS — The Theoretical Soul of Kernel Methods

Reproducing Kernel Hilbert Space — the function space where kernel methods live. The reproducing property, the representer theorem, and why finite-data optimization works in infinite dimensions.

Nov 29, 2021 Kernel Methods 76 min read

Kernel Methods (2): Mathematical Foundations — Positive-Definite Kernels and Mercer's Theorem

What makes a function a valid kernel? Positive-definiteness, the Gram matrix test, and Mercer's theorem — the spectral decomposition that justifies the kernel trick.

Nov 24, 2021 Kernel Methods 66 min read

Kernel Methods (1): Why We Need Them — Hitting the Ceiling of Linear Algorithms

Linear algorithms can't capture non-linear patterns. The kernel trick lets you keep the linear algorithm's elegance and still model non-linear relationships, without ever computing the high-dimensional feature map explicitly. Part 1 of an …