<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Function-Calling on Chen Kai Blog</title><link>https://www.chenk.top/en/tags/function-calling/</link><description>Recent content in Function-Calling on Chen Kai Blog</description><generator>Hugo</generator><language>en</language><lastBuildDate>Thu, 02 Apr 2026 09:00:00 +0000</lastBuildDate><atom:link href="https://www.chenk.top/en/tags/function-calling/index.xml" rel="self" type="application/rss+xml"/><item><title>LLM Engineering (7): Function Calling and Tool Use</title><link>https://www.chenk.top/en/llm-engineering/07-function-calling/</link><pubDate>Thu, 02 Apr 2026 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/llm-engineering/07-function-calling/</guid><description>&lt;p>Function calling connects an LLM to the world outside its weights. It combines chat-template details (&lt;a href="https://www.chenk.top/en/llm-engineering/02-tokenization/">Chapter 2&lt;/a>
), structured-output kernels (&lt;a href="https://www.chenk.top/en/llm-engineering/05-inference/">Chapter 5&lt;/a>
), and prompt engineering (&lt;a href="https://www.chenk.top/en/llm-engineering/09-prompting/">Chapter 9&lt;/a>
). This chapter explores what happens under the hood, the guarantees you can rely on, and the agent-loop patterns that handle real workloads.&lt;/p>
&lt;p>The intellectual lineage matters. Tool use as an LLM capability traces back to two near-simultaneous papers in 2022: &lt;strong>MRKL Systems&lt;/strong> (Karpas et al., AI21), which proposed expert routing among neuro-symbolic modules, and &lt;strong>ReAct&lt;/strong> (&lt;a href="https://arxiv.org/abs/2210.03629" target="_blank" rel="noopener noreferrer">Yao et al., 2022 &lt;span aria-hidden="true" style="font-size:0.75em; opacity:0.55; margin-left:2px;">↗&lt;/span>&lt;/a>
), which interleaved chain-of-thought reasoning with tool actions. &lt;strong>Toolformer&lt;/strong> (&lt;a href="https://arxiv.org/abs/2302.04761" target="_blank" rel="noopener noreferrer">Schick et al., 2023 &lt;span aria-hidden="true" style="font-size:0.75em; opacity:0.55; margin-left:2px;">↗&lt;/span>&lt;/a>
) showed that tool use can be taught in a self-supervised way, generating training data by having a model insert tool-call markers into existing text. By 2024 every frontier model had post-training data structured around the tool-use format, and tool calling moved from &amp;ldquo;research demo&amp;rdquo; to &amp;ldquo;API feature.&amp;rdquo;&lt;/p></description></item><item><title>Aliyun Bailian (2): The Qwen LLM API in Production</title><link>https://www.chenk.top/en/aliyun-bailian/02-qwen-llm-api/</link><pubDate>Thu, 26 Feb 2026 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/aliyun-bailian/02-qwen-llm-api/</guid><description>&lt;p>Of all the articles in this series, this one covers most of the production wins. The other models are interesting, but the LLMs are what every product I&amp;rsquo;ve shipped on Bailian calls every minute of every day. The official Qwen API reference is dense and complete; this article is the readable companion that guides you through it.&lt;/p>
&lt;p>&lt;figure class="article-figure">
 &lt;img src="https://blog-pic-ck.oss-cn-beijing.aliyuncs.com/posts/en/aliyun-bailian/02-qwen-llm-api/illustration_1.png" alt="Aliyun Bailian (2): The Qwen LLM API in Production — Chapter overview" loading="lazy" decoding="async" class="content-image">
 
&lt;/figure>
&lt;/p>
&lt;hr>
&lt;h2 id="pick-the-right-qwen-variant-for-the-workload" class="heading-anchor">Pick the right Qwen variant for the workload&lt;a href="#pick-the-right-qwen-variant-for-the-workload" class="heading-link" aria-label="Permalink to this section" title="Copy link to this section">#&lt;/a>
&lt;/h2>&lt;p>The Qwen family is large. Some teams overspend by defaulting to &lt;code>qwen-max&lt;/code> everywhere; others underspend on quality by defaulting to &lt;code>qwen-turbo&lt;/code>. The right answer is &amp;ldquo;match variant to job&amp;rdquo;:&lt;/p></description></item></channel></rss>