Standalone Articles

Jan 21, 2026 Standalone 12 min read

Solving Constrained Mean-Variance Portfolio Optimization Using Spiral Optimization

Apply the Spiral Optimization Algorithm (SOA) to mean-variance portfolio problems with buy-in thresholds and cardinality constraints. Covers the MINLP formulation, penalty methods, and a performance comparison.

Dec 31, 2025 Standalone 23 min read

AI Agents Complete Guide: From Theory to Industrial Practice

A practitioner-grade guide to building AI agents: planning (CoT/ReAct/ToT), memory architectures, tool use, reflection, multi-agent patterns, frameworks (LangChain, LangGraph, AutoGen, CrewAI), evaluation, and production …

Oct 15, 2025 Standalone 27 min read

Prompt Engineering Complete Guide: From Zero to Advanced Optimization

Master prompt engineering from zero-shot basics to Tree of Thoughts, DSPy, and automated optimization. Includes benchmarks, code, and a debugging toolkit.

Sep 22, 2025 Standalone 10 min read

Low-Rank Matrix Approximation and the Pseudoinverse: From SVD to Regularization

From the least-squares view to the Moore-Penrose pseudoinverse, the four Penrose conditions, computation via SVD, truncated SVD, Tikhonov regularization, and modern applications from PCA to LoRA.

Jul 24, 2025 Standalone 14 min read

Reparameterization Trick & Gumbel-Softmax: A Deep Dive

Make sense of the reparameterization trick and Gumbel-Softmax: why gradients can flow through sampling, how temperature trades bias for variance, and the practical pitfalls of training discrete latent variables …

Jun 21, 2025 Standalone 16 min read

Symplectic Geometry and Structure-Preserving Neural Networks

Learn how physics-informed neural networks can preserve energy and symplectic structure. Covers HNN, LNN, SympNet, symplectic integrators, and four classical experiments.

Jun 21, 2025 Standalone 15 min read

LLM Workflows and Application Architecture: Enterprise Implementation Guide

From a single API call to a production LLM platform: workflow patterns, RAG, model routing, deployment, cost levers, observability, and enterprise integration, with the trade-offs that actually matter.

Mar 31, 2025 Standalone 10 min read

Prefix-Tuning: Optimizing Continuous Prompts for Generation

Prefix-Tuning adapts frozen LLMs by learning continuous key/value vectors injected into attention. Covers the method, reparameterization, KV-cache mechanics, and comparisons with prompt tuning, adapters, and LoRA.

Dec 6, 2024 Standalone 14 min read

Vim Essentials: Modal Editing, Motions, and a Repeatable Workflow

Learn Vim by understanding its grammar (modes, operators + motions, text objects) rather than by memorizing shortcuts. A practical, beginner-friendly guide with a one-week practice plan.

Oct 12, 2024 Standalone 13 min read

MoSLoRA: Mixture-of-Subspaces in Low-Rank Adaptation

MoSLoRA boosts LoRA expressivity by mixing multiple low-rank subspaces with a lightweight mixer. Covers when vanilla LoRA fails, mixer design choices, and tuning tips.

Oct 7, 2024 Standalone 21 min read

Tennis-Scene Computer Vision: From Paper Survey to Production

A complete CV system for tennis: detection of small, high-speed objects, multi-camera 3D reconstruction, physics-based trajectory prediction, and pose-based action recognition. From the literature down to a 16.7 ms-per-frame …

Dec 16, 2023 Standalone 11 min read

HCGR: Hyperbolic Contrastive Graph Representation Learning for Session-based Recommendation

HCGR embeds session graphs in the Lorentz model of hyperbolic space and trains them with InfoNCE-style contrastive learning. This review unpacks why hierarchical session intent fits hyperbolic geometry, how Lorentz …

Oct 15, 2023 Standalone 14 min read

Kernel Methods: From Theory to Practice (RKHS, Common Kernels, and Hyperparameter Tuning)

Understand the kernel trick, RKHS theory, and practical kernel selection. Covers RBF, polynomial, Matérn, and periodic kernels with sklearn code and a tuning flowchart.

Sep 20, 2023 Standalone 6 min read

Position Encoding Brief: From Sinusoidal to RoPE and ALiBi

A practitioner's tour of Transformer position encoding: why attention needs it at all, how sinusoidal/learned/relative/RoPE/ALiBi schemes differ, and which one to pick when long-context extrapolation matters.

Sep 1, 2023 Standalone 25 min read

LAMP Stack on Alibaba Cloud ECS: From Fresh Instance to Production-Ready Web Server

Set up a LAMP stack (Linux, Apache, MySQL, PHP) on Alibaba Cloud ECS. Covers security groups, service installation, Discuz deployment, source compilation, hardening, and three-tier scale-out.

Aug 26, 2023 Standalone 13 min read

Variational Autoencoder (VAE): From Intuition to Implementation and Troubleshooting

Build a VAE from scratch in PyTorch. Covers the ELBO objective, reparameterization trick, posterior collapse fixes, beta-VAE, and a complete training pipeline.

Aug 22, 2023 Standalone 10 min read

paper2repo: GitHub Repository Recommendation for Academic Papers

paper2repo aligns academic papers with GitHub repositories in a shared embedding space using a constrained GCN. Covers the joint heterogeneous graph, the WARP ranking loss, the cosine alignment constraint, and the full …

Jul 13, 2023 Standalone 14 min read

Session-based Recommendation with Graph Neural Networks (SR-GNN)

SR-GNN turns a click session into a directed weighted graph and runs a gated GNN to predict the next item. Covers session-graph construction, GGNN updates, attention-based session pooling, training, benchmarks, and the …

Mar 13, 2023 Standalone 19 min read

Learning Rate: From Basics to Large-Scale Training

A practitioner's guide to the single most important hyperparameter: why a too-large LR makes training diverge, how warmup and schedules really work, the LR range test, the LR-batch-size-weight-decay coupling, and recent ideas like WSD, …

Jan 15, 2023 Standalone 12 min read

Graph Contextualized Self-Attention Network (GC-SAN) for Session-based Recommendation

GC-SAN combines a session-graph GGNN (local transitions) with multi-layer self-attention (global dependencies) for session-based recommendation. Covers graph construction, message passing, attention fusion, and where the …

Dec 27, 2022 Standalone 13 min read

Lipschitz Continuity, Strong Convexity & Nesterov Acceleration

Three concepts that demystify most of optimization: Lipschitz smoothness fixes the maximum step size, strong convexity sets the convergence rate and guarantees a unique minimizer, and Nesterov acceleration replaces kappa …

Dec 9, 2022 Standalone 10 min read

Optimizer Evolution: From Gradient Descent to Adam (and Beyond, 2025)

One article that traces the full lineage GD -> SGD -> Momentum -> NAG -> AdaGrad -> RMSProp -> Adam -> AdamW, then onwards to Lion / Sophia / Schedule-Free. Each step is framed by the specific failure of the previous …

Nov 26, 2022 Standalone 12 min read

LLMGR: Integrating Large Language Models with Graphical Session-Based Recommendation

LLMGR uses an LLM as the semantic engine for session-based recommendation and a GNN as the ranker. Covers the hybrid encoding layer, two-stage prompt tuning, ~8.68% HR@20 lift, and how to deploy without running an LLM …

Aug 5, 2022 Standalone 21 min read

Multimodal LLMs and Downstream Tasks: A Practitioner's Guide

End-to-end map of multimodal LLMs: vision-language alignment, cross-modal fusion, the CLIP/BLIP/LLaVA families, downstream tasks (VQA, captioning, grounding, OCR), fine-tuning trade-offs, benchmarks, and what it takes to …

Aug 1, 2022 Standalone 20 min read

Operating System Fundamentals: A Deep Dive

Walk through processes, virtual memory, file systems, the I/O stack, system calls, and schedulers, with real numbers and commands you can use to verify each claim on a Linux box.

Jul 25, 2022 Standalone 17 min read

Proximal Operator: From Moreau Envelope to ISTA/FISTA and ADMM

A systematic walk through the proximal operator: convex-analysis basics, the Moreau envelope, closed-form proxes, and how they power ISTA, FISTA, ADMM, LASSO, and SVM in practice.

Jul 22, 2022 Standalone 14 min read

Graph Neural Networks for Learning Equivariant Representations of Neural Networks

Represent a neural network as a directed graph (neurons as nodes, weights as edges) and use a GNN to produce permutation-equivariant embeddings. The right symmetry unlocks generalisation prediction, network …