<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>3D Reconstruction on Chen Kai Blog</title><link>https://www.chenk.top/en/tags/3d-reconstruction/</link><description>Recent content in 3D Reconstruction on Chen Kai Blog</description><generator>Hugo</generator><language>en</language><lastBuildDate>Wed, 23 Apr 2025 09:00:00 +0000</lastBuildDate><atom:link href="https://www.chenk.top/en/tags/3d-reconstruction/index.xml" rel="self" type="application/rss+xml"/><item><title>Essence of Linear Algebra (17): Linear Algebra in Computer Vision</title><link>https://www.chenk.top/en/linear-algebra/17-linear-algebra-in-computer-vision/</link><pubDate>Wed, 23 Apr 2025 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/linear-algebra/17-linear-algebra-in-computer-vision/</guid><description>&lt;p>Computer vision is the science of teaching machines to see. What is striking is how thoroughly the whole field reduces to linear algebra: an image is a matrix, a geometric transformation is a matrix product, a camera is a &lt;span class="math-inline">$3 \times 4$&lt;/span>
 projection matrix, two-view geometry is the equation &lt;span class="math-inline">$\mathbf{x}_2^\top \mathbf{F}\, \mathbf{x}_1 = 0$&lt;/span>
, and 3D reconstruction is a sparse linear least-squares problem. Seen through that lens, what looked like a zoo of algorithms turns out to be a small set of linear-algebraic ideas applied repeatedly.&lt;/p></description></item><item><title>Tennis-Scene Computer Vision: From Paper Survey to Production</title><link>https://www.chenk.top/en/standalone/tennis-cv-system-design/</link><pubDate>Sat, 31 Aug 2024 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/standalone/tennis-cv-system-design/</guid><description>&lt;p>A 6.7 cm tennis ball travels at over 200 km/h. Reconstructing its 3D trajectory from eight 4K cameras in real time, while also classifying each player&amp;rsquo;s stroke, involves &lt;strong>small-object detection, multi-view geometry, Kalman filtering, physics modeling, and human-pose estimation&lt;/strong>, all at once. This post follows the order of a real deployment: state the constraints, survey the literature, choose an approach, build it, and lay out a millisecond-by-millisecond budget for production.&lt;/p>
&lt;hr>
&lt;h2 id="what-you-will-learn" class="heading-anchor">What You Will Learn&lt;a href="#what-you-will-learn" class="heading-link" aria-label="Permalink to this section" title="Copy link to this section">#&lt;/a>
&lt;/h2>&lt;ul>
&lt;li>Why traditional detectors collapse on 10–20 px tennis balls and how the TrackNet family of models fixes that&lt;/li>
&lt;li>Multi-camera calibration, PTP synchronisation, and DLT triangulation in code and math&lt;/li>
&lt;li>A 9-state Kalman filter coupled with a drag-plus-Magnus ODE for trajectory prediction&lt;/li>
&lt;li>Action recognition: rule-based templates vs. end-to-end learning, and when each wins&lt;/li>
&lt;li>How to fit detection → 3D → tracking → pose → analytics into a 16.7 ms per-frame budget (60 fps)&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Prerequisites&lt;/strong>: pinhole camera model, basic Kalman filtering, and some PyTorch inference experience.&lt;/p></description></item></channel></rss>