<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>PCA on Chen Kai Blog</title><link>https://www.chenk.top/en/tags/pca/</link><description>Recent content in PCA on Chen Kai Blog</description><generator>Hugo</generator><language>en</language><lastBuildDate>Thu, 05 Feb 2026 09:00:00 +0000</lastBuildDate><atom:link href="https://www.chenk.top/en/tags/pca/index.xml" rel="self" type="application/rss+xml"/><item><title>ML Math Derivations (17): Dimensionality Reduction and PCA</title><link>https://www.chenk.top/en/ml-math-derivations/17-dimensionality-reduction-and-pca/</link><pubDate>Thu, 05 Feb 2026 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/ml-math-derivations/17-dimensionality-reduction-and-pca/</guid><description>&lt;p>&lt;figure class="article-figure">
 &lt;img src="https://blog-pic-ck.oss-cn-beijing.aliyuncs.com/posts/en/ml-math-derivations/17-Dimensionality-Reduction-and-PCA/illustration_1.png" alt="ML Math Derivations (17): Dimensionality Reduction and PCA — Chapter overview" loading="lazy" decoding="async" class="content-image">
 
&lt;/figure>
&lt;/p>
&lt;hr>
&lt;h2 id="what-you-will-learn" class="heading-anchor">What You Will Learn&lt;a href="#what-you-will-learn" class="heading-link" aria-label="Permalink to this section" title="Copy link to this section">#&lt;/a>
&lt;/h2>&lt;p>Feed a clustering algorithm &lt;span class="math-inline">$10{,}000$&lt;/span>
-dimensional data and it will most likely fail, not because the algorithm is broken, but because &lt;strong>high-dimensional space is a hostile environment for distance-based learning&lt;/strong>. Volume concentrates in thin shells near the boundary, the ratio of nearest- to farthest-neighbour distances tends to &lt;span class="math-inline">$1$&lt;/span>
, and &amp;ldquo;closeness&amp;rdquo; stops carrying information. Dimensionality reduction is the response: project the data into a lower-dimensional space while keeping the structure that actually matters.&lt;/p></description></item><item><title>Essence of Linear Algebra (15): Linear Algebra in Machine Learning</title><link>https://www.chenk.top/en/linear-algebra/15-linear-algebra-in-machine-learning/</link><pubDate>Wed, 09 Apr 2025 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/linear-algebra/15-linear-algebra-in-machine-learning/</guid><description>&lt;p>Ask any senior ML engineer &amp;ldquo;what math do you actually use day to day?&amp;rdquo; and the answer is almost always &lt;strong>linear algebra&lt;/strong>. Calculus shows up in derivations; probability shows up in modeling; but the runtime of a real ML system is dominated by matrix-vector multiplies, decompositions, and projections. PyTorch&amp;rsquo;s &lt;code>Linear&lt;/code>, scikit-learn&amp;rsquo;s &lt;code>PCA&lt;/code>, Spark MLlib&amp;rsquo;s &lt;code>ALS&lt;/code>, and a Transformer&amp;rsquo;s attention head are all the same primitive in different costumes.&lt;/p>
&lt;p>This chapter covers the algorithms used in production ML systems — PCA, LDA, SVM with kernels, matrix factorization for recommenders, regularized linear regression, neural network layers, and attention — and explains the linear algebra behind each. We focus on intuition first, then geometry, and finally formulas.&lt;/p></description></item><item><title>Essence of Linear Algebra (9): Singular Value Decomposition — The Crown Jewel of Linear Algebra</title><link>https://www.chenk.top/en/linear-algebra/09-singular-value-decomposition/</link><pubDate>Wed, 26 Feb 2025 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/linear-algebra/09-singular-value-decomposition/</guid><description>&lt;p>&lt;figure class="article-figure">
 &lt;img src="https://blog-pic-ck.oss-cn-beijing.aliyuncs.com/posts/en/linear-algebra/09-singular-value-decomposition/illustration_1.png" alt="Essence of Linear Algebra (9): Singular Value Decomposition — The Crown Jewel of Linear Algebra — Chapter overview" loading="lazy" decoding="async" class="content-image">
 
&lt;/figure>
&lt;/p>
&lt;hr>
&lt;h2 id="why-svd-earns-the-crown" class="heading-anchor">Why SVD Earns the Crown&lt;a href="#why-svd-earns-the-crown" class="heading-link" aria-label="Permalink to this section" title="Copy link to this section">#&lt;/a>
&lt;/h2>&lt;p>The spectral theorem of &lt;a href="https://www.chenk.top/en/linear-algebra/08-symmetric-matrices-and-quadratic-forms/">Chapter 8&lt;/a>
 gave us &lt;span class="math-inline">$A = Q\Lambda Q^T$&lt;/span>
 — a beautifully clean factorisation, but &lt;strong>only for symmetric matrices&lt;/strong>. Most matrices that show up in practice are not symmetric, and many are not even square:&lt;/p>
&lt;ul>
&lt;li>a photograph stored as a &lt;span class="math-inline">$1920 \times 1080$&lt;/span>
 pixel matrix,&lt;/li>
&lt;li>a Netflix-style user&amp;ndash;movie rating matrix (millions of rows, thousands of columns),&lt;/li>
&lt;li>a document&amp;ndash;term matrix in NLP (documents by vocabulary),&lt;/li>
&lt;li>a gene-expression matrix in bioinformatics.&lt;/li>
&lt;/ul>
&lt;p>The singular value decomposition (SVD) lifts both restrictions: every real matrix, square or not, factors as&lt;/p>
&lt;span class="math-block">$$
A = U\,\Sigma\,V^{\!\top}.
$$&lt;/span>
&lt;p>
This is the most powerful, most universally applicable decomposition in all of linear algebra.&lt;/p></description></item><item><title>Kernel Methods (5): Kernel SVM, Kernel PCA, and Kernel Ridge Regression</title><link>https://www.chenk.top/en/kernel-methods/05-kernel-algorithms/</link><pubDate>Tue, 14 Dec 2021 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/kernel-methods/05-kernel-algorithms/</guid><description>&lt;p>Your features are two-dimensional, your data is clearly a circle inside a circle, and &lt;code>LinearSVC&lt;/code> is at 50% accuracy with the wide-eyed look of an algorithm that genuinely believes a straight line is the answer. You stare at the scatter plot, you stare at the model, and somewhere in the back of your head the words &lt;em>kernel SVM&lt;/em> surface. You type &lt;code>kernel='rbf'&lt;/code>, the accuracy jumps to 98%, and the rest of the afternoon you wonder what exactly just happened, and why the same trick also gives you a Kernel PCA that unfolds a Swiss roll and a Kernel Ridge regressor that fits a sine wave with three lines of code.&lt;/p></description></item></channel></rss>