<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Contrastive Learning on Chen Kai Blog</title><link>https://www.chenk.top/en/tags/contrastive-learning/</link><description>Recent content in Contrastive Learning on Chen Kai Blog</description><generator>Hugo</generator><language>en</language><lastBuildDate>Wed, 31 Dec 2025 09:00:00 +0000</lastBuildDate><atom:link href="https://www.chenk.top/en/tags/contrastive-learning/index.xml" rel="self" type="application/rss+xml"/><item><title>Recommendation Systems (11): Contrastive Learning and Self-Supervised Learning</title><link>https://www.chenk.top/en/recommendation-systems/11-contrastive-learning/</link><pubDate>Wed, 31 Dec 2025 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/recommendation-systems/11-contrastive-learning/</guid><description>&lt;p>Classical recommenders learn from one signal: did a user click, watch, or buy? That signal is precious, but it is also brutally sparse. Most users touch fewer than 1% of the catalogue, most items are touched by fewer than 0.1% of users, and a brand-new item or user has nothing at all. Optimising a model directly against such sparse labels almost guarantees overfitting on the head and silence on the tail.&lt;/p></description></item><item><title>Transfer Learning (8): Multimodal Transfer</title><link>https://www.chenk.top/en/transfer-learning/08-multimodal-transfer/</link><pubDate>Thu, 12 Jun 2025 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/transfer-learning/08-multimodal-transfer/</guid><description>&lt;p>How can a model classify an image of a Burmese cat correctly without ever having seen a label &amp;ldquo;Burmese cat&amp;rdquo;? Traditional supervised learning typically needs thousands of labeled examples per class. 
CLIP, released by OpenAI in 2021, sidesteps that constraint entirely: it learns to embed images and natural-language descriptions in the same vector space, so &amp;ldquo;classification&amp;rdquo; reduces to picking whichever of your candidate sentences sits closest to the image.&lt;/p></description></item><item><title>HCGR: Hyperbolic Contrastive Graph Representation Learning for Session-based Recommendation</title><link>https://www.chenk.top/en/standalone/hcgr/</link><pubDate>Wed, 01 May 2024 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/standalone/hcgr/</guid><description>&lt;p>A user opens a sneaker app, taps &amp;ldquo;running shoes,&amp;rdquo; drills into a brand, then a price band, and finally a single SKU. This trajectory forms a &lt;em>tree&lt;/em>: each click narrows the candidate set roughly multiplicatively. In Euclidean space, you need many dimensions to keep all the leaves of the tree apart because volume grows only polynomially with radius. In hyperbolic space, volume grows &lt;em>exponentially&lt;/em> with radius, so the tree fits naturally: a few dimensions are enough to keep the long tail untangled.&lt;/p></description></item></channel></rss>