
Functional Analysis (12): Functional Analysis in Action — PDE and Quantum Mechanics

Lax-Milgram for elliptic PDE, variational methods, quantum observables as self-adjoint operators, and Stone's theorem — where the abstract theory meets concrete applications.

Eleven articles is a long time to spend on infrastructure. Normed spaces, Banach and Hilbert structure, dual spaces, weak topologies, bounded and unbounded operators, the spectral theorem, semigroups, distributions, Sobolev spaces — every one of those chapters paid for itself with a clean abstract result, but a reader could be forgiven for wondering when the abstraction was going to do anything. This final article is where I make good on the implicit promise of the series: every theorem we built was built because some concrete problem demanded it, and pulling those threads together gives us the modern toolkit for partial differential equations and quantum mechanics.

Two pillars hold up the modern theory of PDE. The Lax-Milgram theorem turns existence-of-solutions for a wide class of elliptic equations into a single Hilbert-space lemma; given an estimate (coercivity) and a continuity bound, the solution is guaranteed and unique without any classical regularity hypothesis. The Galerkin method then converts the same variational identity into a finite-dimensional linear system computable on a laptop, with explicit error bounds — the entire finite element method, the workhorse of computational engineering, is a corollary.

Two more pillars hold up quantum mechanics. The spectral theorem for unbounded self-adjoint operators says that any observable in quantum mechanics — energy, position, angular momentum — has a real spectrum and a projection-valued measure that determines all measurement statistics. Stone’s theorem says that any one-parameter group of unitary symmetries arises from a self-adjoint generator, so symmetries and conserved quantities are exactly the same data. Schrodinger’s equation is then automatic: it is the unitary group generated by the Hamiltonian, and unitarity is conservation of probability. The whole framework is so tight that it is almost embarrassing how little physics one needs to add on top.

What I want to convey in this final article is not a survey but a sense of the architecture. The reason to learn functional analysis is not so you can solve the Poisson equation; there are better tools for that one specific equation. The reason is that the same architecture handles the Stokes equations, Maxwell’s equations, the elasticity equations, the linearised Navier-Stokes equations, the heat equation, the wave equation, the Schrödinger equation, the Klein-Gordon equation, and most of the other linear PDE of mathematical physics, by varying only the bilinear form (or generator) and the Hilbert space it acts on. That uniformity is the prize.


The Payoff: Analysis Serves Applications

Functional analysis emerged in the early 20th century from two converging demands: the need to solve integral and differential equations rigorously (Fredholm, Hilbert, Riesz), and the need for a mathematical framework for quantum mechanics (von Neumann, Dirac, Stone). The tools we have developed — completeness, duality, spectral decomposition, weak derivatives — were created because concrete problems required them.

The flow of ideas has always been bidirectional. PDE theory motivated Sobolev spaces and distribution theory. Quantum mechanics demanded unbounded self-adjoint operators and the spectral theorem. And the abstract theory, once developed, revealed connections and simplifications invisible from the concrete side: the Lax-Milgram theorem unifies dozens of existence proofs for different elliptic equations, and Stone’s theorem shows that Schrodinger’s equation and Maxwell’s equations share the same abstract structure.

Fredholm theory for elliptic PDE


Lax-Milgram Theorem and Elliptic Boundary Value Problems

The theorem

Lax-Milgram: coercivity implies existence

Theorem (Lax-Milgram). Let $V$ be a real Hilbert space. Let $a: V \times V \to \mathbb{R}$ be a bilinear form satisfying:

  1. Continuity (boundedness): There exists $M > 0$ such that $|a(u, v)| \le M\|u\|\|v\|$ for all $u, v \in V$ .
  2. Coercivity: There exists $\alpha > 0$ such that $a(u, u) \ge \alpha\|u\|^2$ for all $u \in V$ .

Let $F: V \to \mathbb{R}$ be a bounded linear functional. Then there exists a unique $u \in V$ such that

$$ a(u, v) = F(v) \quad \text{for all } v \in V. $$

Moreover, $\|u\| \le \frac{1}{\alpha}\|F\|_{V^*}$ .

Lax-Milgram theorem: existence and uniqueness for coercive bilinear forms

Note: unlike the Riesz representation theorem, the bilinear form $a$ need not be symmetric. This is the key generalisation that makes Lax-Milgram applicable to non-symmetric problems like convection-diffusion equations: $-\Delta u + \mathbf{b}\cdot\nabla u = f$ has a non-symmetric form because of the first-order term $\mathbf{b}\cdot\nabla u$ , but Lax-Milgram still applies (provided $\mathbf{b}$ is small enough relative to the Laplacian for coercivity to hold).

Proof

Proof. By the Riesz representation theorem, for each fixed $u \in V$ , the map $v \mapsto a(u, v)$ is a bounded linear functional on $V$ (by continuity of $a$ ), hence there exists a unique $Au \in V$ with

$$ a(u, v) = \langle Au, v \rangle \quad \text{for all } v \in V. $$

The map $A: V \to V$ is linear (by bilinearity of $a$ ) and bounded: $\|Au\| = \sup_{\|v\|=1} |\langle Au, v \rangle| = \sup_{\|v\|=1} |a(u,v)| \le M\|u\|$ .

Similarly, by Riesz, there exists $f \in V$ with $F(v) = \langle f, v \rangle$ for all $v$ . The equation $a(u, v) = F(v)$ for all $v$ becomes $\langle Au, v \rangle = \langle f, v \rangle$ for all $v$ , i.e., $Au = f$ . We need to show $A$ is bijective.

Injectivity. Coercivity gives $\alpha\|u\|^2 \le a(u, u) = \langle Au, u \rangle \le \|Au\|\|u\|$ , so $\|Au\| \ge \alpha\|u\|$ for all $u$ . If $Au = 0$ then $u = 0$ .

Closed range. The estimate $\|Au\| \ge \alpha\|u\|$ implies that $A$ has closed range. Indeed, if $Au_n \to y$ , then $(u_n)$ is Cauchy (since $\|u_n - u_m\| \le \alpha^{-1}\|Au_n - Au_m\|$ ), so $u_n \to u$ for some $u \in V$ , and $Au = y$ by continuity of $A$ .

Dense range. Suppose $y \perp \text{Range}(A)$ , i.e., $\langle Au, y \rangle = 0$ for all $u$ . Taking $u = y$ : $0 = \langle Ay, y \rangle = a(y, y) \ge \alpha\|y\|^2$ , so $y = 0$ . Hence $\text{Range}(A)^\perp = \{0\}$ , meaning $\text{Range}(A)$ is dense.

Since $\text{Range}(A)$ is both closed and dense, $\text{Range}(A) = V$ . So $A$ is bijective, and $u = A^{-1}f$ is the unique solution. The bound $\|u\| \le \alpha^{-1}\|Au\| = \alpha^{-1}\|f\| = \alpha^{-1}\|F\|_{V^*}$ follows from the coercivity estimate. $\square$
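In finite dimensions the theorem reduces to linear algebra, which makes the roles of $\alpha$ and the a priori bound easy to see. The sketch below is an illustration under stated assumptions: $V = \mathbb{R}^2$ with the Euclidean inner product, and a non-symmetric matrix `A` and right-hand side `f` chosen for this example only (neither comes from the text).

```python
import math

# Illustrative data (assumed for this sketch): a(u, v) = v . (A u) on R^2.
A = [[2.0, 1.0],
     [-1.0, 2.0]]
f = [1.0, 3.0]

def a(u, v):
    """Bilinear form a(u, v) = v^T A u (non-symmetric because A is)."""
    Au = [A[0][0] * u[0] + A[0][1] * u[1],
          A[1][0] * u[0] + A[1][1] * u[1]]
    return v[0] * Au[0] + v[1] * Au[1]

def norm(x):
    return math.hypot(x[0], x[1])

# Coercivity constant: the skew part of A cancels in a(u, u), leaving 2|u|^2.
alpha = 2.0

# Solve A u = f by Cramer's rule; det A is nonzero, as coercivity requires.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
u = [(f[0] * A[1][1] - A[0][1] * f[1]) / det,
     (A[0][0] * f[1] - f[0] * A[1][0]) / det]

bound = norm(f) / alpha
print("u =", u)
print("||u|| =", norm(u), "<= ||f||/alpha =", bound)
```

The form is not symmetric, so there is no energy to minimise, yet the solution exists, is unique, and obeys $\|u\| \le \|f\|/\alpha$ .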

The proof is short but every step uses something from the previous eleven articles: Riesz representation (Article 3), boundedness (Article 6), the closed-range property of operators that are bounded below (Article 6 again), the orthogonal decomposition (Article 3). This is what I meant by paying for infrastructure: the Lax-Milgram theorem feels almost trivial once the infrastructure is in place, and that triviality is the whole point.

Application to elliptic PDE

Example: the Poisson equation with Dirichlet boundary conditions. Let $\Omega \subset \mathbb{R}^n$ be bounded with Lipschitz boundary. Consider

$$ \begin{cases} -\Delta u = f & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} $$

where $f \in L^2(\Omega)$ .

Weak formulation. Multiply by $v \in H_0^1(\Omega)$ and integrate by parts:

$$ \int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega fv \, dx \quad \text{for all } v \in H_0^1(\Omega). $$

Set $V = H_0^1(\Omega)$ , $a(u, v) = \int_\Omega \nabla u \cdot \nabla v \, dx$ , and $F(v) = \int_\Omega fv \, dx$ .

Checking the hypotheses:

  • Continuity: $|a(u, v)| \le \|\nabla u\|_{L^2}\|\nabla v\|_{L^2} \le \|u\|_{H^1}\|v\|_{H^1}$ , so $M = 1$ .
  • Coercivity: By the Poincare inequality, $\|u\|_{L^2} \le C_P\|\nabla u\|_{L^2}$ for $u \in H_0^1(\Omega)$ , hence $\|u\|_{H^1}^2 \le (1 + C_P^2)\|\nabla u\|_{L^2}^2 = (1 + C_P^2) a(u, u)$ , so $a(u, u) \ge \alpha\|u\|_{H^1}^2$ with $\alpha = 1/(1 + C_P^2)$ .
  • Boundedness of $F$ : $|F(v)| \le \|f\|_{L^2}\|v\|_{L^2} \le \|f\|_{L^2}\|v\|_{H^1}$ .

Lax-Milgram applies, and we get a unique weak solution $u \in H_0^1(\Omega)$ with $\|u\|_{H^1} \le (1 + C_P^2)\|f\|_{L^2}$ . This is existence-and-uniqueness for the Dirichlet problem on any bounded Lipschitz domain, with no smoothness on the data and no constructive procedure involved. Compare to the classical existence theory (Perron’s method, balayage), which requires considerably more work and yields less.

Numerical sanity check. Take $\Omega = (0, 1)$ in one dimension and $f \equiv 1$ . The classical solution is $u(x) = \tfrac{1}{2}x(1-x)$ , which lies in $H_0^1$ with $\|\nabla u\|_{L^2}^2 = \int_0^1 (1/2 - x)^2\,dx = 1/12$ and $\|u\|_{L^2}^2 = \int_0^1 \tfrac{1}{4}x^2(1-x)^2\,dx = 1/120$ . So $\|u\|_{H^1}^2 = 11/120 \approx 0.092$ , that is, $\|u\|_{H^1} \approx 0.303$ . With the sharp Poincaré constant $C_P = 1/\pi$ on $(0,1)$ and $\|f\|_{L^2} = 1$ , the Lax-Milgram bound gives $\|u\|_{H^1} \le (1 + C_P^2)\|f\|_{L^2} = (1 + 1/\pi^2) \approx 1.10$ : a loose bound, but within the right order of magnitude. The actual norm is roughly a quarter of the bound ($0.303/1.10 \approx 0.28$).
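The two integrals and the comparison with the bound are easy to reproduce. A minimal sketch, assuming a hand-rolled composite Simpson rule (the helper `simpson` is illustrative, not a library function) and the Poincaré constant $C_P = 1/\pi$ on $(0,1)$ :

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def u(x):
    return 0.5 * x * (1 - x)      # classical solution of -u'' = 1

def du(x):
    return 0.5 - x

grad_sq = simpson(lambda x: du(x) ** 2, 0.0, 1.0)   # ||u'||^2
l2_sq = simpson(lambda x: u(x) ** 2, 0.0, 1.0)      # ||u||^2
h1_norm = math.sqrt(grad_sq + l2_sq)

C_P = 1 / math.pi                  # Poincare constant on (0, 1)
bound = (1 + C_P ** 2) * 1.0       # ||f||_{L^2} = 1 for f = 1

print("||u'||^2 =", grad_sq, " ||u||^2 =", l2_sq)
print("||u||_H1 =", h1_norm, " bound =", bound, " ratio =", h1_norm / bound)
```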

A second check on a 2D example. Take $\Omega$ the unit disc and $f \equiv 1$ ; the solution is the radially symmetric $u(r) = (1 - r^2)/4$ , so $|\nabla u| = r/2$ . Then $\|\nabla u\|_{L^2(\Omega)}^2 = \int_0^1 (r^2/4) \cdot 2\pi r\,dr = \pi/8$ , $\|u\|_{L^2(\Omega)}^2 = \int_0^1 (1-r^2)^2/16 \cdot 2\pi r\,dr = \pi/48$ , and $\int_\Omega u\,dx = \int_0^1 (1-r^2)/4 \cdot 2\pi r\,dr = \pi/8$ . Writing $J(v) = \tfrac{1}{2}\int_\Omega |\nabla v|^2\,dx - \int_\Omega fv\,dx$ for the Dirichlet energy, we get $J(u) = \pi/16 - \pi/8 = -\pi/16$ . The variational characterisation says this is the minimum value of $J$ over $H_0^1(\Omega)$ . Any other $v \in H_0^1$ gives a strictly larger $J(v)$ . This is a property one can sometimes use as a numerical certificate: a candidate $u$ solving the variational identity to small residual must be close to the true minimiser.

Variants

Non-zero Dirichlet boundary data. If the boundary condition is $u = g$ on $\partial\Omega$ with $g$ in the trace space $H^{1/2}(\partial\Omega)$ , find $G \in H^1(\Omega)$ with $\gamma_0 G = g$ (possible by surjectivity of the trace operator). Set $w = u - G$ ; then $w \in H_0^1$ satisfies $-\Delta w = f + \Delta G$ — a homogeneous Dirichlet problem with modified data — and Lax-Milgram applies to it.

Neumann problems. For $-\Delta u = f$ with $\partial u/\partial n = g$ , the bilinear form is the same but $V = H^1(\Omega)/\mathbb{R}$ (quotient by constants, since constants are in the kernel) and the data is $F(v) = \int fv + \int_{\partial\Omega} gv$ , requiring the compatibility condition $\int f + \int_{\partial\Omega} g = 0$ .
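To see where the compatibility condition comes from, test the weak Neumann formulation with the constant function $v \equiv 1$ (a one-line check, using the same sign conventions as above):

$$ 0 = \int_\Omega \nabla u \cdot \nabla 1 \, dx = \int_\Omega f \, dx + \int_{\partial\Omega} g \, dS. $$

So $F$ must vanish on constants, which is exactly what makes it a well-defined functional on the quotient $H^1(\Omega)/\mathbb{R}$ .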

General elliptic operators. $-\sum_{i,j}\partial_j(a_{ij}\partial_i u) + \sum_i b_i \partial_i u + cu = f$ with measurable coefficients $a_{ij}$ satisfying ellipticity ($\sum a_{ij}\xi_i\xi_j \ge \lambda|\xi|^2$ with $\lambda > 0$ ) — Lax-Milgram applies provided the lower-order terms are small enough or have the right sign.


Worked Numerical Example

Take $\Omega = (0,1)$ and the reaction-diffusion problem $-u'' + u = 1$ with $u(0)=u(1)=0$ . The bilinear form is $a(u,v) = \int_0^1 (u'v' + uv)\,dx$ on $V=H_0^1(0,1)$ . Continuity holds with $M=1$ . Coercivity holds with $\alpha=1$ because $a(u,u) = \|u\|_{H^1}^2$ . The functional is $F(v) = \int_0^1 v\,dx$ , with dual norm $\|F\|_{V^*} \le 1$ . Lax-Milgram guarantees $\|u\|_{H^1} \le 1$ .

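The exact solution is $u(x) = 1 - \frac{\cosh(x-1/2)}{\cosh(1/2)}$ . We can verify the bound directly. Using the weak identity with $v=u$ , we get $\|u\|_{H^1}^2 = \int_0^1 u\,dx = 1 - 2\tanh(1/2) \approx 0.0758$ . The actual norm is $\sqrt{0.0758} \approx 0.275$ , safely below the theoretical bound of $1$ . If we weaken the reaction term to $-u'' + 0.01u = 1$ , the zeroth-order term alone only gives $a(u,u) \ge 0.01\|u\|_{L^2}^2$ , but on $H_0^1(0,1)$ the Poincaré inequality rescues coercivity: $a(u,u) \ge \|u'\|_{L^2}^2 \ge \|u\|_{H^1}^2/(1 + 1/\pi^2)$ , so $\alpha \approx 0.91$ and the bound is $\|u\|_{H^1} \le 1.10$ , against an exact norm of about $0.30$ . The constant degrades to $\alpha = 0.01$ only when no Poincaré inequality is available, for instance with Neumann conditions on $V = H^1(0,1)$ : there the solution is the constant $u \equiv 100$ , with $\|u\|_{H^1} = 100 = \|F\|_{V^*}/\alpha$ , so the bound is attained. The theorem tracks the stiffness of the operator, and the boundary conditions are part of the operator.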
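The weak identity with $v = u$ can be checked by quadrature. The sketch below (helper names are illustrative) integrates $u'^2 + u^2$ and $u$ separately for the exact solution of $-u'' + u = 1$ and compares the resulting norm with the Lax-Milgram bound of $1$ .

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def u(x):
    return 1 - math.cosh(x - 0.5) / math.cosh(0.5)

def du(x):
    return -math.sinh(x - 0.5) / math.cosh(0.5)

# Left side of the weak identity: a(u, u) = ||u||_{H^1}^2.
h1_sq = simpson(lambda x: du(x) ** 2 + u(x) ** 2, 0.0, 1.0)
# Right side: F(u) = integral of u over (0, 1).
load = simpson(u, 0.0, 1.0)

print("a(u, u) =", h1_sq)
print("F(u)    =", load)
print("||u||_H1 =", math.sqrt(h1_sq), " (Lax-Milgram bound: 1)")
```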

Variational Methods

From PDE to optimization

Variational formulation: minimizing energy functional

Many PDE arise as Euler-Lagrange equations for energy functionals. Consider the energy

$$ J(u) = \frac{1}{2}\int_\Omega |\nabla u|^2 \, dx - \int_\Omega fu \, dx $$

defined on $V = H_0^1(\Omega)$ . Its first variation (Frechet derivative) at $u$ in direction $v$ is

$$ J'(u)(v) = \int_\Omega \nabla u \cdot \nabla v \, dx - \int_\Omega fv \, dx. $$

Setting $J'(u) = 0$ gives the weak form of $-\Delta u = f$ . So the weak solution is exactly the critical point of $J$ .
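Because $J$ is quadratic, the expansion along a line $t \mapsto u + tv$ is exact, and writing it out shows both the first variation and why a critical point is a minimiser:

$$ J(u + tv) = J(u) + t\left(\int_\Omega \nabla u \cdot \nabla v \, dx - \int_\Omega fv \, dx\right) + \frac{t^2}{2}\int_\Omega |\nabla v|^2 \, dx. $$

The coefficient of $t$ is $J'(u)(v)$ . If it vanishes for every $v$ , then $J(u + v) - J(u) = \tfrac{1}{2}\int_\Omega |\nabla v|^2\,dx \ge 0$ , with equality only for $v = 0$ in $H_0^1(\Omega)$ .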

Variational formulation: weak solution as critical point of energy

Direct method of the calculus of variations

When the bilinear form is symmetric (as in the Dirichlet energy), the weak solution is not just a critical point but a minimizer of $J$ . The direct method seeks the minimizer:

  1. Show $J$ is bounded below.
  2. Show $J$ is coercive: $J(u) \to \infty$ as $\|u\| \to \infty$ . This follows from $J(u) \ge \frac{1}{2}\|\nabla u\|_{L^2}^2 - \|f\|_{L^2}\|u\|_{L^2} \ge \frac{1}{2}\alpha\|u\|_{H^1}^2 - C\|u\|_{H^1}$ — the quadratic term wins.
  3. Take a minimizing sequence $u_n$ with $J(u_n) \to \inf J$ . Coercivity bounds $\|u_n\|$ , so by reflexivity of Hilbert space, a subsequence converges weakly: $u_n \rightharpoonup u$ .
  4. Show $J$ is weakly lower semicontinuous: $J(u) \le \liminf J(u_n)$ . This is automatic for convex continuous $J$ : the map $u \mapsto \|\nabla u\|_{L^2}^2$ is convex and continuous, hence weakly lsc, and the linear term $u \mapsto \int_\Omega fu\,dx$ is weakly continuous.
  5. Conclude $J(u) = \inf J$ , so $u$ is the minimizer.

The non-symmetric case does not reduce to a minimization problem. Existence then comes from Lax-Milgram directly rather than from the direct method, and one cannot interpret the solution as an energy minimum.

Hilbert’s nineteenth and twentieth problems

Hilbert’s nineteenth problem (1900) asked whether minimizers of regular variational problems are always analytic. Hilbert’s twentieth problem asked: does every Dirichlet-type variational problem have a solution?

The answer to the twentieth problem turns out to be: yes, provided one looks in the right space. Classical analysts in the 1900s did not have such a space, and the existence theory bogged down in special cases. Beppo Levi (1906) and later Tonelli (1923) clarified that the natural setting was what we now call $H^1$ — functions whose energy integral $\int|\nabla u|^2$ is finite. The completeness of this space (which we proved in Article 11) is essential: without it, the minimizing sequence has no limit. The weak lower semicontinuity is also essential: without it, the limit need not be a minimizer. Both ingredients require the functional-analytic infrastructure that took another half-century to mature.

This historical episode illustrates why functional analysis was needed. The variational approach to PDE requires completeness of the function space and compactness properties (weak compactness of bounded sets in Hilbert spaces), which are exactly the tools functional analysis provides.

The Galerkin method

The variational formulation naturally leads to approximation methods. Choose finite-dimensional subspaces $V_h \subset V$ (e.g., finite element spaces) and solve

$$ a(u_h, v_h) = F(v_h) \quad \text{for all } v_h \in V_h. $$

Lax-Milgram applies in $V_h$ (which inherits coercivity from $V$ ), giving a unique $u_h$ . The Cea lemma provides the error estimate:

$$ \|u - u_h\| \le \frac{M}{\alpha} \inf_{v_h \in V_h} \|u - v_h\|. $$

The approximation error is controlled by the best approximation error in $V_h$ , up to the constant $M/\alpha$ (the condition number of the bilinear form). This is the theoretical foundation of the finite element method.
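The estimate follows in one line from Galerkin orthogonality. Subtracting the discrete problem from the continuous one gives $a(u - u_h, w_h) = 0$ for all $w_h \in V_h$ , so for any $v_h \in V_h$ (taking $w_h = u_h - v_h$ ):

$$ \alpha\|u - u_h\|^2 \le a(u - u_h, u - u_h) = a(u - u_h, u - v_h) \le M\|u - u_h\|\|u - v_h\|. $$

Dividing by $\alpha\|u - u_h\|$ and taking the infimum over $v_h$ gives the stated bound.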

From Galerkin to finite elements

In practice, the finite-dimensional subspaces $V_h$ are constructed by partitioning $\Omega$ into small elements (triangles in 2D, tetrahedra in 3D) and defining piecewise polynomial functions on each element. For piecewise linear elements on a triangulation with mesh size $h$ , the best approximation error satisfies $\inf_{v_h \in V_h}\|u - v_h\|_{H^1} \le Ch\|u\|_{H^2}$ (for $u \in H^2$ ). The Cea lemma then gives

$$ \|u - u_h\|_{H^1} \le \frac{M}{\alpha}Ch\|u\|_{H^2}, $$

showing first-order convergence in $h$ . Higher-order elements (piecewise quadratics, cubics, etc.) give faster convergence rates: $\|u - u_h\|_{H^1} \le Ch^k\|u\|_{H^{k+1}}$ for $k$ -th order polynomials, provided $u$ is regular enough.
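First-order convergence is easy to observe in one dimension. The following sketch is an illustration, not a general finite element code: it assumes the model problem $-u'' = 1$ on $(0,1)$ with $u(0) = u(1) = 0$ , a uniform mesh, piecewise linear hat functions, and a hand-written tridiagonal solver. The function names are hypothetical. The error is measured in the energy norm $\|(u - u_h)'\|_{L^2}$ , using the identity $\|(u-u_h)'\|^2 = \|u'\|^2 - \|u_h'\|^2$ that Galerkin orthogonality gives for a symmetric form.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm. lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def fem_energy_error(n):
    """Energy-norm error of the P1 Galerkin solution on n equal elements."""
    h = 1.0 / n
    m = n - 1                                  # number of interior nodes
    diag = [2.0 / h] * m                       # stiffness matrix tridiag(-1, 2, -1)/h
    off = [-1.0 / h] * m
    load = [h] * m                             # integral of f * hat_i for f = 1
    nodes = [0.0] + solve_tridiagonal(off, diag, off, load) + [0.0]
    # u_h is linear on each element, so ||u_h'||^2 is a finite sum.
    uh_energy = sum((nodes[i + 1] - nodes[i]) ** 2 / h for i in range(n))
    u_energy = 1.0 / 12.0                      # ||u'||^2 for u = x(1 - x)/2
    return math.sqrt(max(u_energy - uh_energy, 0.0))

for n in (8, 16, 32, 64):
    err = fem_energy_error(n)
    print(f"n = {n:3d}   error = {err:.6e}   error / h = {err * n:.4f}")
```

The ratio error/h stays constant as the mesh is refined, which is the first-order rate predicted by the Cea lemma combined with the interpolation estimate.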

The Lax-Milgram framework makes the convergence analysis clean: the entire theory reduces to (1) approximation properties of $V_h$ and (2) the condition number $M/\alpha$ of the bilinear form. These two factors are completely independent. Practical FEM software exploits this separation aggressively: refine the mesh where $u$ is rough (better approximation) and use preconditioners to reduce the effective condition number (smaller $M/\alpha$ ).

Nonlinear problems: the Browder-Minty theorem

The Lax-Milgram theorem handles linear problems. For nonlinear elliptic PDE, the appropriate generalisation is the Browder-Minty theorem: if $A: V \to V^*$ is a monotone, coercive, hemicontinuous operator on a reflexive Banach space $V$ , then $A$ is surjective — for every $f \in V^*$ , the equation $Au = f$ has a solution.

Monotonicity ($\langle Au - Av, u - v \rangle \ge 0$ for all $u, v$ ) takes over the role that linearity plays in Lax-Milgram; coercivity is still assumed, now in the form $\langle Au, u \rangle/\|u\| \to \infty$ as $\|u\| \to \infty$ . Hemicontinuity (the map $t \mapsto \langle A(u + tv), w \rangle$ is continuous) replaces full continuity. This framework covers the $p$ -Laplacian $-\text{div}(|\nabla u|^{p-2}\nabla u) = f$ , the Euler-Lagrange equation for the energy $J(u) = \frac{1}{p}\int |\nabla u|^p - \int fu$ , a genuinely nonlinear problem (for $p \ne 2$ ) that requires going beyond Hilbert spaces to $W_0^{1,p}(\Omega)$ .


Worked Numerical Example

Consider $J(u) = \frac{1}{2}\int_0^1 (u')^2\,dx - \int_0^1 \sin(\pi x)u\,dx$ on $H_0^1(0,1)$ . The Euler-Lagrange equation is $-u'' = \sin(\pi x)$ , solved by $u(x) = \pi^{-2}\sin(\pi x)$ . Plugging back, $\frac{1}{2}\int_0^1 (u')^2\,dx = \frac{1}{4\pi^2}$ and $\int_0^1 \sin(\pi x)u\,dx = \frac{1}{2\pi^2}$ , so $J(u) = \frac{1}{4\pi^2} - \frac{1}{2\pi^2} = -\frac{1}{4\pi^2} \approx -0.02533$ .

Take a naive trial function $v(x) = x(1-x)$ , which satisfies the boundary conditions but ignores the forcing shape. Compute $J(v)$ explicitly: the quadratic term is $\frac{1}{2}\int_0^1 (1-2x)^2\,dx = 1/6 \approx 0.16667$ . The linear term is $\int_0^1 x(1-x)\sin(\pi x)\,dx = 4/\pi^3 \approx 0.12901$ . So $J(v) \approx 0.16667 - 0.12901 = 0.03766$ . The inequality $J(v) > J(u)$ holds: $0.03766 > -0.02533$ . The energy gap is $0.06299$ . This gap equals $\frac{1}{2}\|(u-v)'\|_{L^2}^2$ , half the squared distance in the energy norm, as one sees by expanding the quadratic functional $J$ around its minimiser. Direct computation gives $\|(u-v)'\|_{L^2}^2 = \frac{1}{2\pi^2} + \frac{1}{3} - \frac{8}{\pi^3} \approx 0.12598$ , exactly twice the gap. The variational principle is not an abstraction; it is an algebraic identity you can check with a calculator.
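The same check takes a few lines of code. This is a sketch under simple assumptions: a composite Simpson rule written by hand, and the energy-norm distance taken to be $\|(u-v)'\|_{L^2}$ , the norm induced by the bilinear form $a(u,v) = \int_0^1 u'v'\,dx$ . All function names are illustrative.

```python
import math

def simpson(g, a, b, n=400):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def J(w, dw):
    """Energy J(w) = 1/2 int (w')^2 dx - int sin(pi x) w dx on (0, 1)."""
    return simpson(lambda x: 0.5 * dw(x) ** 2 - math.sin(math.pi * x) * w(x), 0.0, 1.0)

def u(x):
    return math.sin(math.pi * x) / math.pi ** 2    # exact minimiser

def du(x):
    return math.cos(math.pi * x) / math.pi

def v(x):
    return x * (1 - x)                             # naive trial function

def dv(x):
    return 1 - 2 * x

J_u, J_v = J(u, du), J(v, dv)
gap = J_v - J_u
half_dist_sq = 0.5 * simpson(lambda x: (du(x) - dv(x)) ** 2, 0.0, 1.0)

print("J(u) =", J_u, " J(v) =", J_v)
print("gap  =", gap, " 1/2 ||(u - v)'||^2 =", half_dist_sq)
```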

Quantum Mechanics: States, Observables, Spectra

The mathematical framework

Quantum mechanics operators: position, momentum, energy

In the Hilbert space formulation of quantum mechanics (von Neumann, 1932):

  • States are unit vectors $\psi \in H$ (or more precisely, rays $\{\lambda\psi : |\lambda| = 1\}$ ) in a separable Hilbert space $H$ .
  • Observables are self-adjoint operators $A: \mathcal{D}(A) \to H$ .
  • Measurement outcomes are elements of the spectrum $\sigma(A) \subset \mathbb{R}$ .
  • Expectation value of observable $A$ in state $\psi$ : $\langle A \rangle_\psi = \langle A\psi, \psi \rangle$ (when $\psi \in \mathcal{D}(A)$ ).
  • Probability of measuring $A$ in a Borel set $B \subset \mathbb{R}$ : $\text{Prob}(A \in B) = \|E(B)\psi\|^2$ , where $E$ is the projection-valued measure from the spectral theorem.

Quantum observables as self-adjoint operators with spectral measures

Why self-adjoint? The spectral theorem guarantees real spectrum (measurement outcomes are real numbers), a spectral decomposition (measurement probabilities are well-defined), and a functional calculus (functions of observables make sense). Merely symmetric operators lack these properties — recall from Article 9 that a symmetric operator can have $\sigma = \mathbb{C}$ , in which case the very notion of “the values of the observable” is undefined.

The whole formal apparatus of quantum mechanics (wave functions, expectation values, transition probabilities, perturbation theory) is encoded in this five-line dictionary. Everything else is computation.

Example: the hydrogen atom

The Hilbert space is $H = L^2(\mathbb{R}^3)$ . The Hamiltonian (energy observable) is

$$ \hat{H} = -\frac{\hbar^2}{2m}\Delta - \frac{e^2}{|x|}, $$

a Schrödinger operator with Coulomb potential. This is an unbounded self-adjoint operator on $\mathcal{D}(\hat{H}) = H^2(\mathbb{R}^3)$ (the Kato-Rellich theorem establishes self-adjointness by treating $-e^2/|x|$ as a relatively bounded perturbation of the Laplacian; in dimension $n = 3$ the Coulomb singularity at the origin is locally square-integrable, which makes the potential “small” relative to the Laplacian, with relative bound zero).

The spectral theorem gives:

  • Discrete spectrum (bound states): eigenvalues $E_n = -13.6\,\text{eV}/n^2$ for $n = 1, 2, 3, \ldots$ , with finite-dimensional eigenspaces of dimension $n^2$ (the familiar quantum number degeneracies).
  • Continuous spectrum (scattering states): the interval $[0, \infty)$ , corresponding to unbound electrons.

The spectral decomposition $\hat{H} = \int \lambda \, dE(\lambda)$ encodes all measurable predictions about energy: the probability of measuring energy in an interval $[a, b]$ is $\|E([a, b])\psi\|^2$ , and the expectation value is $\langle \hat{H}\psi, \psi \rangle = \int \lambda \, d\|E(\lambda)\psi\|^2$ .

Bound states and scattering states

The structure of the spectrum encodes the physics. A bound state — an electron trapped near the proton — corresponds to an eigenfunction $\psi_n$ with $\hat{H}\psi_n = E_n\psi_n$ and $E_n < 0$ . The wavefunction $\psi_n$ decays exponentially at infinity; the electron is genuinely localised. A scattering state corresponds to a generalised eigenfunction in the continuous spectrum at some $E \ge 0$ ; the wavefunction does not decay (it is not in $L^2$ ), but suitable wave packets built from these generalised eigenfunctions are in $L^2$ and propagate to infinity.

For hydrogen, the bound state energies are explicit: $E_n = -m_e e^4/(2\hbar^2 n^2) = -13.6\,\text{eV}/n^2$ in CGS units. The ground state $n=1$ has energy $-13.6\,\text{eV}$ , with the spatial wavefunction $\psi_{100}(r) = \pi^{-1/2}a_0^{-3/2}e^{-r/a_0}$ where $a_0 = \hbar^2/(m_e e^2) \approx 0.529$ Angstrom is the Bohr radius. A short check that $\psi_{100} \in H^2(\mathbb{R}^3)$ : the function and its first two derivatives all decay exponentially, so the integrals converge at infinity; near the origin the second derivatives of $e^{-r/a_0}$ grow like $1/r$ , which is square-integrable in three dimensions because the volume element contributes $r^2\,dr$ . The same $1/r$ growth in $\Delta\psi_{100}$ is what cancels the Coulomb singularity in the eigenvalue equation.
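The ground-state numbers can be reproduced by radial quadrature. The sketch below makes two assumptions that are not in the text: it works in atomic units ($\hbar = m_e = e = 1$ , so $a_0 = 1$ and energies come out in Hartree), and it converts with 1 Hartree ≈ 27.211386 eV. It checks the normalisation of $\psi_{100}$ and evaluates $\langle \hat{H}\psi, \psi \rangle$ as kinetic plus potential energy.

```python
import math

HARTREE_IN_EV = 27.211386   # assumed conversion factor, rounded

def simpson(g, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

R_MAX, N = 30.0, 6000       # the integrands are negligible beyond R_MAX

def psi(r):
    return math.exp(-r) / math.sqrt(math.pi)    # psi_100 with a_0 = 1

def dpsi(r):
    return -psi(r)                              # radial derivative

def shell(r):
    return 4.0 * math.pi * r * r                # area of the sphere of radius r

norm_sq = simpson(lambda r: shell(r) * psi(r) ** 2, 0.0, R_MAX, N)
kinetic = 0.5 * simpson(lambda r: shell(r) * dpsi(r) ** 2, 0.0, R_MAX, N)
# The 1/r in the Coulomb potential cancels one power of r from the shell area.
potential = -simpson(lambda r: 4.0 * math.pi * r * psi(r) ** 2, 0.0, R_MAX, N)
energy_ev = (kinetic + potential) * HARTREE_IN_EV

print("norm^2 =", norm_sq)
print("kinetic =", kinetic, " potential =", potential, "(Hartree)")
print("ground-state energy =", energy_ev, "eV")
```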

Eigenfunction expansion of solutions

The $n = 2$ shell has energy $-3.4\,\text{eV}$ , and four states (one $2s$ and three $2p$ ). Their energies coincide because of an “accidental” $SO(4)$ symmetry of the Coulomb problem (the Runge-Lenz vector commutes with $\hat{H}$ ). The degeneracy $n^2$ on level $n$ is a consequence of this symmetry, not of $SO(3)$ alone — for a non-Coulomb radial potential the $s$ , $p$ , $d$ states would split. This kind of symmetry-spectral analysis is exactly what the spectral theorem and Stone’s theorem make rigorous; the formal manipulations of physicists are theorems once one specifies the operator domains.

Bound states and scattering states in the Coulomb spectrum

The dichotomy spectrum-positive-versus-spectrum-negative is universal: any non-relativistic quantum system with a confining potential (potential going to $\infty$ at infinity) has pure point spectrum and no scattering, while any system with a barrier or asymptotically constant potential has a continuous spectrum and well-defined scattering theory. The mathematical machinery — the Putnam-Kato theorem, the Mourre commutator method, the Weyl essential spectrum theorem — is built on top of the spectral framework of Articles 8 and 9.

The uncertainty principle

For two self-adjoint operators $A, B$ with $\psi \in \mathcal{D}(AB) \cap \mathcal{D}(BA)$ , the Robertson uncertainty relation states

$$ \Delta_\psi A \cdot \Delta_\psi B \ge \frac{1}{2}|\langle [A, B]\psi, \psi \rangle|, $$

where $\Delta_\psi A = \sqrt{\langle (A - \langle A \rangle_\psi)^2\psi, \psi \rangle}$ is the standard deviation.

For position $Q$ and momentum $P = -i\hbar d/dx$ , $[Q, P] = i\hbar I$ , giving the Heisenberg uncertainty principle $\Delta Q \cdot \Delta P \ge \hbar/2$ . The proof uses the Cauchy-Schwarz inequality in $H$ :

$$ |\langle [A,B]\psi, \psi \rangle| = 2|\text{Im}\,\langle A'\psi, B'\psi \rangle| \le 2\|A'\psi\|\|B'\psi\| = 2\Delta A \cdot \Delta B, $$

where $A' = A - \langle A \rangle I$ and $B' = B - \langle B \rangle I$ .

Heisenberg uncertainty principle from inner product inequality

The uncertainty principle is sometimes presented as a deep physical principle. From the functional-analytic viewpoint it is just Cauchy-Schwarz applied to commutators — the depth lies entirely in the realisation that observables are self-adjoint operators that fail to commute, not in the inequality itself. The non-commutativity is the physics; the inequality is the bookkeeping.

Quantum symmetries and conservation laws

A symmetry is represented by a unitary (or anti-unitary) operator $U$ on $H$ . Wigner’s theorem states that any bijection on the set of pure states preserving transition probabilities $|\langle \psi, \phi \rangle|^2$ is implemented by such an operator.

A continuous symmetry — a one-parameter family $U(t) = e^{itA}$ — is generated by a self-adjoint operator $A$ (by Stone’s theorem, below). The associated conservation law states that $A$ is conserved by the dynamics: if $[\hat{H}, A] = 0$ , then $\langle A \rangle_{\psi(t)}$ is constant. This is the quantum analogue of Noether’s theorem:

  • Time translation symmetry $\leftrightarrow$ energy conservation.
  • Spatial translation symmetry $\leftrightarrow$ momentum conservation.
  • Rotation symmetry $\leftrightarrow$ angular momentum conservation.

All three are rigorous consequences of Stone’s theorem and the spectral theorem.


Worked Numerical Example

Take the harmonic oscillator ground state $\psi_0(x) = \pi^{-1/4}e^{-x^2/2}$ in $L^2(\mathbb{R})$ with $\hbar=m=\omega=1$ . Position operator $Q$ multiplies by $x$ , momentum $P = -i d/dx$ . Compute variances explicitly. $\langle Q \rangle = 0$ by symmetry. $\langle Q^2 \rangle = \frac{1}{\sqrt{\pi}}\int_{-\infty}^\infty x^2 e^{-x^2}\,dx = 1/2$ . So $\Delta Q = \sqrt{1/2}$ . For momentum, $\psi_0'(x) = -x\psi_0(x)$ , so $P\psi_0 = ix\psi_0$ . $\langle P \rangle = 0$ . $\langle P^2 \rangle = \int |\psi_0'|^2\,dx = \int x^2 |\psi_0|^2\,dx = 1/2$ . So $\Delta P = \sqrt{1/2}$ . The product $\Delta Q \Delta P = 1/2$ . The commutator $[Q,P] = iI$ gives the Robertson bound $\frac{1}{2}|\langle iI\psi_0, \psi_0 \rangle| = 1/2$ . The inequality saturates exactly. If we scale the state to $\psi_\sigma(x) = (\pi\sigma^2)^{-1/4}e^{-x^2/(2\sigma^2)}$ , then $\Delta Q = \sigma/\sqrt{2}$ and $\Delta P = 1/(\sigma\sqrt{2})$ . The product remains $1/2$ regardless of $\sigma$ . The uncertainty principle is a rigid geometric constraint on Fourier pairs, not a measurement limitation.
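A quadrature check of the scaling claim, as a minimal sketch with $\hbar = 1$ . The helper names are illustrative; the means $\langle Q \rangle$ and $\langle P \rangle$ are taken to be zero by symmetry, as in the text, so the variances are just second moments.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def spreads(sigma):
    """Return (Delta Q, Delta P) for the Gaussian psi_sigma, with hbar = 1."""
    c = (math.pi * sigma ** 2) ** (-0.25)

    def psi(x):
        return c * math.exp(-x * x / (2.0 * sigma ** 2))

    def dpsi(x):
        return -x / sigma ** 2 * psi(x)

    half = 10.0 * sigma                      # integrands are negligible beyond this
    q2 = simpson(lambda x: (x * psi(x)) ** 2, -half, half)   # <Q^2>
    p2 = simpson(lambda x: dpsi(x) ** 2, -half, half)        # <P^2> = ||psi'||^2
    return math.sqrt(q2), math.sqrt(p2)

for sigma in (0.5, 1.0, 2.0):
    dq, dp = spreads(sigma)
    print(f"sigma = {sigma}: Delta Q = {dq:.6f}, Delta P = {dp:.6f}, product = {dq * dp:.6f}")
```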

Stone’s Theorem and Schrödinger Dynamics

Statement

Stone’s theorem: unitary groups and self-adjoint generators

Theorem (Stone, 1932). Let $A$ be a (possibly unbounded) self-adjoint operator on a Hilbert space $H$ . Then the family

$$ U(t) = e^{itA}, \quad t \in \mathbb{R}, $$

defined via the spectral theorem as $U(t) = \int e^{it\lambda} \, dE(\lambda)$ , is a strongly continuous one-parameter unitary group: $U(0) = I$ , $U(t+s) = U(t)U(s)$ , $U(t)^* = U(-t)$ , and $t \mapsto U(t)\psi$ is continuous for each $\psi$ .

Conversely, every strongly continuous one-parameter unitary group $\{U(t)\}_{t \in \mathbb{R}}$ on $H$ has the form $U(t) = e^{itA}$ for a unique self-adjoint operator $A$ .

Stone theorem: self-adjoint generator of a unitary group

Proof outline

Forward direction. Given self-adjoint $A$ with spectral measure $E$ , define $U(t) = \int e^{it\lambda} \, dE(\lambda)$ . Each $U(t)$ is well-defined since $|e^{it\lambda}| = 1$ .

  • Unitarity: $U(t)^*U(t) = \int |e^{it\lambda}|^2\,dE = I$ .
  • Group: $U(t)U(s) = U(t+s)$ from $e^{it\lambda}e^{is\lambda} = e^{i(t+s)\lambda}$ .
  • Strong continuity: $\|U(t)\psi - \psi\|^2 = \int |e^{it\lambda}-1|^2\,d\|E(\lambda)\psi\|^2 \to 0$ by dominated convergence.
  • Generator: $(U(t)\psi-\psi)/t \to iA\psi$ for $\psi \in \mathcal{D}(A)$ .
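The four properties can be watched in the smallest non-trivial case. The sketch below assumes a specific example chosen for illustration: the $2 \times 2$ matrix $A$ with rows $(0, 1)$ and $(1, 0)$ , whose eigenvalues are $\pm 1$ with spectral projections $P_\pm = \tfrac{1}{2}(I \pm A)$ . The spectral integral then collapses to a two-term sum.

```python
import cmath

# Spectral projections of A = [[0, 1], [1, 0]] for the eigenvalues +1 and -1.
P_PLUS = [[0.5, 0.5], [0.5, 0.5]]
P_MINUS = [[0.5, -0.5], [-0.5, 0.5]]
IDENTITY = [[1, 0], [0, 1]]

def U(t):
    """U(t) = e^{it} P_plus + e^{-it} P_minus, the 2x2 case of the spectral integral."""
    ep, em = cmath.exp(1j * t), cmath.exp(-1j * t)
    return [[ep * P_PLUS[i][j] + em * P_MINUS[i][j] for j in range(2)]
            for i in range(2)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(X):
    """Conjugate transpose."""
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def dist(X, Y):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

t, s = 0.7, -1.3
group_defect = dist(matmul(U(t), U(s)), U(t + s))
unitarity_defect = dist(matmul(dagger(U(t)), U(t)), IDENTITY)

# Difference quotient (U(h) - I)/h should approach iA as h -> 0.
h = 1e-6
Uh = U(h)
quotient = [[(Uh[i][j] - IDENTITY[i][j]) / h for j in range(2)] for i in range(2)]
generator_defect = dist(quotient, [[0, 1j], [1j, 0]])

print("group law defect:", group_defect)
print("unitarity defect:", unitarity_defect)
print("generator defect:", generator_defect)
```

In infinite dimensions the sum becomes an integral against $dE(\lambda)$ and the difference quotient converges only on $\mathcal{D}(A)$ , but the algebra is the same.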

Converse. Given $\{U(t)\}$ , define $A_0 = -i\lim_{t\to 0}(U(t)-I)/t$ on the natural domain. Steps:

  1. $A_0$ is densely defined and symmetric.
  2. $A_0$ is essentially self-adjoint (via Cayley transform; deficiency indices both zero).
  3. The closure $A = \overline{A_0}$ generates $U(t)$ via spectral theorem.

Schrödinger’s equation

In quantum mechanics, the time evolution of a state $\psi$ is governed by the Schrodinger equation:

$$ i\hbar \frac{d\psi}{dt} = \hat{H}\psi, \quad \psi(0) = \psi_0. $$

By Stone’s theorem with $A = -\hat{H}/\hbar$ (the sign matches the convention $U(t) = e^{itA}$ ), the solution is $\psi(t) = e^{-it\hat{H}/\hbar}\psi_0$ . Stone’s theorem guarantees:

  • Existence and uniqueness of the evolution for any initial state $\psi_0 \in H$ , even though $\hat{H}$ is unbounded.
  • Unitarity ($\|U(t)\psi_0\| = \|\psi_0\|$ ), conservation of probability.
  • Reversibility ($U(-t) = U(t)^{-1}$ ), time-reversal symmetry.
  • Energy conservation: if $\psi_0$ is an eigenstate of $\hat{H}$ with eigenvalue $E$ , then $\psi(t) = e^{-iEt/\hbar}\psi_0$ .

Schrödinger equation as the unitary group generated by the Hamiltonian

What is striking is how little one has to add to functional analysis to get quantum mechanics. The framework — observables are self-adjoint, symmetries are unitary, time evolution is the unitary group of the energy — is essentially mandated by the spectral and Stone theorems once one accepts the linear Hilbert space structure. The actual physics is in the choice of $\hat{H}$ , and that choice involves no functional analysis at all.

A worked time evolution

Take the harmonic oscillator $\hat{H} = -\tfrac{\hbar^2}{2m}\partial_x^2 + \tfrac{1}{2}m\omega^2 x^2$ on $L^2(\mathbb{R})$ . Its spectrum is purely discrete: $E_n = \hbar\omega(n + 1/2)$ with eigenfunctions $\psi_n$ proportional to Hermite functions. Given an initial state $\psi(0) = \sum c_n\psi_n$ , the time-evolved state is $\psi(t) = \sum c_n e^{-iE_n t/\hbar}\psi_n$ . The evolution mixes phases but preserves all $|c_n|^2$ : measurement statistics in the energy basis are time-independent, while expectations of position and momentum oscillate at frequency $\omega$ (this is the quantum analogue of the classical orbit). All of this is rigorous: the sum converges in $L^2$ because $\sum |c_n|^2 < \infty$ , and term-by-term phase rotation is exactly the spectral-theorem definition of $e^{-it\hat{H}/\hbar}$ .
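The oscillation of the position expectation can be made explicit for a two-level superposition. Assume the initial state is $c_0\psi_0 + c_1\psi_1$ , where $\psi_0, \psi_1$ are the two lowest eigenfunctions, and use the standard matrix element $\langle Q\psi_1, \psi_0 \rangle = \sqrt{\hbar/(2m\omega)}$ together with $\langle Q\psi_n, \psi_n \rangle = 0$ (parity). Since $E_1 - E_0 = \hbar\omega$ ,

$$ \langle Q \rangle_{\psi(t)} = 2\sqrt{\frac{\hbar}{2m\omega}}\,\text{Re}\left(c_1\overline{c_0}\,e^{-i\omega t}\right), \qquad \text{so for } c_0 = c_1 = \tfrac{1}{\sqrt{2}}: \quad \langle Q \rangle_{\psi(t)} = \sqrt{\frac{\hbar}{2m\omega}}\cos(\omega t). $$

The probabilities $|c_0|^2$ and $|c_1|^2$ never change; only the relative phase rotates, and that rotation is the oscillation.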

For a Hamiltonian with both bound and scattering states, like the Coulomb problem, the construction is identical except the spectral integral has both a sum (bound states) and a Lebesgue integral (continuous spectrum). The unitary group acts as a phase on each piece. There is no need to find a closed-form propagator; the spectral theorem is the propagator.


Worked Numerical Example

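Consider the free Hamiltonian $\hat{H} = -\frac{1}{2}\partial_x^2$ on $L^2(\mathbb{R})$ . Take initial data $\psi_0(x) = (2/\pi)^{1/4}e^{-x^2}$ , normalized so $\|\psi_0\|^2=1$ . Stone’s theorem says $U(t) = e^{-it\hat{H}}$ is unitary, so $\|\psi(t)\|=1$ for all $t$ . At $t = 1$ the evolved state is again a Gaussian, with density $|\psi(1,x)|^2 = \sqrt{2/\pi}\,5^{-1/2}e^{-2x^2/5}$ , and integrating it confirms this: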

$$\|\psi(1)\|^2 = \frac{\sqrt{2/\pi}}{\sqrt{5}} \int_{-\infty}^\infty e^{-2x^2/5}\,dx = \frac{\sqrt{2/\pi}}{\sqrt{5}} \cdot \sqrt{\frac{5\pi}{2}} = 1.$$

Probability is conserved exactly. If we replace $\hat{H}$ with a non-self-adjoint operator like $-\frac{1}{2}\partial_x^2 + ix$ , the generator loses self-adjointness, Stone’s theorem no longer applies, and the same calculation for this initial Gaussian yields $\|\psi(t)\|^2 = e^{(t^2 + t^4)/2}$ , which grows without bound. Unitarity is not automatic; it is purchased with self-adjointness.
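Norm conservation under the free evolution can be checked at several times, not only at $t = 1$ . The sketch below assumes the standard closed form for a freely evolving Gaussian, $\psi(t,x) = (2/\pi)^{1/4}(1+2it)^{-1/2}e^{-x^2/(1+2it)}$ , which solves $i\partial_t\psi = -\tfrac{1}{2}\partial_x^2\psi$ with the initial data above; the quadrature helper is illustrative.

```python
import cmath
import math

def psi(t, x):
    """Free evolution of psi_0(x) = (2/pi)^{1/4} exp(-x^2) (assumed closed form)."""
    z = 1 + 2j * t
    return (2 / math.pi) ** 0.25 / cmath.sqrt(z) * cmath.exp(-x * x / z)

def norm_sq(t, half_width=60.0, n=6000):
    """Trapezoid approximation of the integral of |psi(t, x)|^2 over the line."""
    h = 2 * half_width / n
    total = 0.0
    for i in range(n + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * abs(psi(t, x)) ** 2
    return total * h

for t in (0.0, 1.0, 5.0):
    print(f"t = {t}: ||psi(t)||^2 = {norm_sq(t):.12f}")
```

The wave packet spreads (its width grows like $\sqrt{1 + 4t^2}$ ) while the amplitude drops by exactly the compensating factor.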

Regularity Theory: Brief Overview

The Lax-Milgram theorem gives a weak solution $u \in H_0^1(\Omega)$ to $-\Delta u = f$ . But is $u$ actually smooth? Does it satisfy the equation in the classical sense?

Elliptic regularity answers: if the data and boundary are smooth, the solution is smooth.

Theorem (Interior regularity). If $u \in H^1(\Omega)$ is a weak solution of $-\Delta u = f$ with $f \in H^k(\Omega)$ , then $u \in H^{k+2}_{\text{loc}}(\Omega)$ .

Theorem (Boundary regularity). If $\Omega$ has $C^{k+2}$ boundary, $f \in H^k(\Omega)$ , and $u \in H_0^1(\Omega)$ is the weak solution, then $u \in H^{k+2}(\Omega)$ and $\|u\|_{H^{k+2}} \le C\|f\|_{H^k}$ .

Consequence (bootstrap to classical). If $f \in C^\infty(\overline{\Omega})$ and $\partial\Omega$ is smooth, then the weak solution is $C^\infty(\overline{\Omega})$ — a classical solution. The Sobolev embedding theorem converts $H^k$ regularity into $C^m$ regularity once $k$ is large enough.

Strategy:

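  1. Difference quotient method: For interior regularity, test the weak formulation with $v = -D_k^{-h}(\zeta^2 D_k^h u)$, where $D_k^h u(x) = (u(x + he_k) - u(x))/h$ is the difference quotient in direction $e_k$ and $\zeta$ is a smooth cutoff supported inside $\Omega$. Coercivity bounds $\|\zeta D_k^h \nabla u\|_{L^2}$ uniformly in $h$, which gives $u \in H^2_{\text{loc}}(\Omega)$. Iterate on derivatives of $u$ to reach higher $k$.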

  2. Flattening the boundary: Near $\partial\Omega$ , use a diffeomorphism to straighten the boundary, then apply the interior argument tangentially. The normal direction requires an additional argument using the equation itself.

This interplay between weak existence (functional analysis) and classical regularity (estimates) is the heart of modern PDE theory.

Schauder estimates and Hölder regularity

For equations with Hölder continuous coefficients ($a_{ij} \in C^{0,\alpha}$), the appropriate regularity theory uses Schauder estimates rather than Sobolev estimates. The result: if $\partial\Omega$ is of class $C^{2,\alpha}$, $f \in C^{0,\alpha}(\overline{\Omega})$, and the coefficients are $C^{0,\alpha}$, then the solution $u$ of the Dirichlet problem with zero boundary data lies in $C^{2,\alpha}(\overline{\Omega})$, with $\|u\|_{C^{2,\alpha}} \le C\|f\|_{C^{0,\alpha}}$.

Schauder estimates use the freezing-coefficients technique (approximate variable-coefficient operators by constant-coefficient ones) and the explicit Newton potential. A key analytical tool in modern proofs is the Campanato characterisation of Hölder spaces.

Maximum principles

A complementary approach uses maximum principles: if $-\Delta u \le 0$ in a bounded domain $\Omega$ ($u$ is subharmonic), then $u$ achieves its maximum on $\partial\Omega$. Applied to $-u$, this says a superharmonic function ($-\Delta u \ge 0$) achieves its minimum on $\partial\Omega$. The strong maximum principle (Hopf) sharpens this: unless $u$ is constant, the maximum is achieved only on the boundary.

Maximum principles give pointwise qualitative information (positivity, comparison) that norm estimates alone do not. Combined with Lax-Milgram and elliptic regularity, they give a remarkably complete picture of elliptic PDE.

For example: if $f \ge 0$ , the weak solution of $-\Delta u = f$ with zero Dirichlet data is nonnegative. The proof tests with $v = u_- = \max(-u, 0)$ , observes $\int |\nabla u_-|^2 = -\int f u_- \le 0$ , and concludes $u_- \equiv 0$ . Positivity of the Green’s function is downstream of this argument; so are comparison principles for free-boundary problems and obstacle problems.
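A discrete analogue is easy to probe. In the sketch below (a standard second-order finite-difference Dirichlet Laplacian on $(0,1)$; the grid size and the random right-hand side are arbitrary illustrative choices), the inverse matrix plays the role of the Green’s function. Its entries are all positive, so $f \ge 0$ forces $u \ge 0$.

```python
import numpy as np

n = 50                      # number of interior grid points (arbitrary)
h = 1.0 / (n + 1)

# Finite-difference matrix for -u'' with u(0) = u(1) = 0.
A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

# A^{-1} equals h times the sampled Green's function of the continuous problem.
G = np.linalg.inv(A)
print("smallest entry of A^{-1}:", G.min())

rng = np.random.default_rng(0)
f = rng.random(n)           # nonnegative data
u = np.linalg.solve(A, f)
print("min of u:", u.min())
```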

The complete pipeline

The functional-analytic approach to elliptic boundary value problems follows a clear pipeline:

  1. Weak formulation. Write the PDE in variational form $a(u, v) = F(v)$ for all $v$ in a Sobolev space $V$ .
  2. Existence and uniqueness. Apply Lax-Milgram (or Browder-Minty for nonlinear problems).
  3. Regularity. Promote $u$ from $V$ (e.g., $H^1$ ) to $H^{k+2}$ , $C^{k,\alpha}$ , or $C^\infty$ , depending on the smoothness of the data and boundary.
  4. Qualitative properties. Maximum principles, comparison theorems, spectral theory.
  5. Approximation. Galerkin/finite elements for numerical computation, with error bounds from Céa’s lemma.

Each step relies on different tools, yet the framework is unified and modular. This is the enduring contribution of the functional-analytic approach: it separates existence from regularity from computation, allowing each question to be addressed with optimal techniques.


Worked Numerical Example

Solve $-u'' = f$ on $(0,1)$ with $u(0)=u(1)=0$ and $f(x) = x^{-1/4}$. Note $f \in L^2(0,1)$ because $\int_0^1 x^{-1/2}\,dx = 2 < \infty$, but $f \notin C^0([0,1])$ because it blows up at $x = 0$. Lax-Milgram gives $u \in H_0^1(0,1)$, and elliptic regularity with $k = 0$ predicts $u \in H^2(0,1)$.

Integrate twice: $u'(x) = -\frac{4}{3}x^{3/4} + C$, $u(x) = -\frac{16}{21}x^{7/4} + Cx + D$. Boundary conditions force $D=0$ and $C=16/21$. So $u(x) = \frac{16}{21}(x - x^{7/4})$. Compute the second derivative: $u''(x) = -x^{-1/4}$. Its $L^2$ norm squared is exactly $2$, so $u \in H^2(0,1)$. However, $\lim_{x\to 0^+} u''(x) = -\infty$, so $u \notin C^2([0,1])$. The Sobolev embedding $H^2(0,1) \hookrightarrow C^1([0,1])$ holds (indeed $u'(x) = \frac{16}{21} - \frac{4}{3}x^{3/4}$ is continuous up to the boundary, with $u'(0) = 16/21$), but embedding into $C^2$ fails. The regularity theorem gives exactly what the data pays for: $f \in L^2$ buys two weak derivatives, not two classical ones. The numbers match the theory without slack.
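A quick numerical cross-check, as a sketch: the snippet solves the same boundary value problem with a second-order finite-difference scheme (an illustrative choice of discretization; any consistent scheme would serve) and compares against the closed form $u(x) = \tfrac{16}{21}(x - x^{7/4})$. The right-hand side is sampled only at interior nodes, so the singularity of $f$ at $x = 0$ is never evaluated.

```python
import numpy as np

def solve_dirichlet(f, n):
    """Finite-difference solve of -u'' = f on (0,1), u(0)=u(1)=0, n interior nodes."""
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    return x, np.linalg.solve(A, f(x))

def f(x):
    return x ** (-0.25)

def exact(x):
    return (16.0 / 21.0) * (x - x ** 1.75)

for n in (50, 100, 200, 400):
    x, u = solve_dirichlet(f, n)
    print(f"n={n:4d}  max error = {np.max(np.abs(u - exact(x))):.3e}")
```

The error should shrink as the mesh is refined, though more slowly than the usual $O(h^2)$ rate one sees for smooth data, since $u$ has only limited regularity at $x = 0$.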

Counterexample: Why the Definition Cannot Be Weakened

Coercivity in the Lax-Milgram theorem is not a technicality; it is the only thing preventing the bilinear form from collapsing on a subspace. Drop $\alpha > 0$ to $\alpha = 0$ (positive semi-definiteness) and existence fails immediately.

Take $V = H^1(0,1)$ with the standard norm. Define $a(u,v) = \int_0^1 u'v'\,dx$ . This form is continuous with $M=1$ . It is positive semi-definite: $a(u,u) = \|u'\|_{L^2}^2 \ge 0$ . But it is not coercive on $H^1$ because constants lie in the kernel: $a(1,1) = 0$ while $\|1\|_{H^1} = 1$ . The Poincaré inequality fails without zero boundary conditions.

Now choose the functional $F(v) = \int_0^1 v\,dx$ , which is bounded on $H^1$ . The variational problem asks for $u \in H^1$ such that $\int_0^1 u'v'\,dx = \int_0^1 v\,dx$ for all $v \in H^1$ . Test with $v(x) \equiv 1$ . The left side is $\int_0^1 u' \cdot 0\,dx = 0$ . The right side is $\int_0^1 1\,dx = 1$ . We get $0 = 1$ . No solution exists.

The failure is structural. The operator $A: u \mapsto -u''$ with Neumann boundary conditions has a one-dimensional kernel (constants). The range is orthogonal to the kernel, so a right-hand side $F(v) = \int_0^1 fv\,dx$ is admissible only if it satisfies the compatibility condition $\int_0^1 f\,dx = 0$. Ours has $f \equiv 1$ and violates it. Lax-Milgram sidesteps this entire Fredholm alternative machinery by demanding $\alpha > 0$, which forces the kernel to be trivial and the range to be everything. If you cannot prove coercivity, you do not have a Lax-Milgram problem; you have a saddle-point or Fredholm problem, and the solution theory changes completely.
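The same obstruction survives discretization. The sketch below assembles the piecewise-linear finite element stiffness matrix for $a(u,v) = \int_0^1 u'v'\,dx$ on a uniform mesh with no boundary condition imposed (mesh size and variable names are illustrative assumptions). The constant vector lies in the kernel, the load vector for $F(v) = \int_0^1 v\,dx$ is not orthogonal to it, and pinning a single node restores invertibility.

```python
import numpy as np

n = 20                        # number of elements (arbitrary)
h = 1.0 / n

# P1 stiffness matrix on n+1 nodes with natural (Neumann) boundary conditions.
K = (2.0 * np.eye(n + 1) - np.eye(n + 1, k=1) - np.eye(n + 1, k=-1)) / h
K[0, 0] = K[-1, -1] = 1.0 / h

# Load vector for F(v) = integral of v: the integrals of the hat functions.
b = np.full(n + 1, h)
b[0] = b[-1] = h / 2.0

ones = np.ones(n + 1)
print("max |K @ 1| =", np.abs(K @ ones).max())    # constants are in the kernel
print("rank(K) =", np.linalg.matrix_rank(K), "of", n + 1)
print("1 . b =", ones @ b)                        # nonzero: compatibility fails

# Pin u = 0 at the first node by deleting its row and column.
K_pinned = K[1:, 1:]
print("cond(K_pinned) =", np.linalg.cond(K_pinned))
```

Since $\mathbf{1}^T K u = 0$ for every $u$ while $\mathbf{1}^T b \ne 0$, the linear system $Ku = b$ has no solution, which is the discrete form of $0 = 1$.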

Why I Care

I spent three days in my first year of graduate school staring at a singular stiffness matrix. I was coding a finite element solver for a steady-state heat equation with pure Neumann boundary conditions on a square domain. The mesh was fine, the assembly routine checked out, but the linear solver returned NaNs. The condition number was $10^{16}$ . I blamed quadrature rules, then mesh orientation, then floating-point accumulation.

A postdoc walked by, glanced at the boundary condition flags, and said: “You are trying to invert the Laplacian on $H^1$ . Quotient out the constants or pin a node.” I added one line to fix $u(x_0)=0$ at a corner vertex. The condition number dropped to $412$ . The solver converged in 14 iterations.

That was the moment Lax-Milgram stopped being a theorem on a blackboard and became a diagnostic tool. The coercivity constant $\alpha$ is not an abstract lower bound; it bounds from below the smallest eigenvalue of your stiffness matrix, measured against the Gram matrix of the $V$-norm. If the form has a kernel, as it does for pure Neumann data, your matrix is singular. The functional analysis tells you exactly which degree of freedom is floating and how to constrain it. I have never written a PDE solver since without checking the coercivity of the discrete form first. The abstraction saves you from debugging ghosts.

Common Pitfall

Beginners routinely conflate symmetric and self-adjoint operators in quantum mechanics. The belief is: if $\langle A\phi, \psi \rangle = \langle \phi, A\psi \rangle$ for all test functions, then $A$ is self-adjoint and the spectral theorem applies. This is false. Symmetry only requires the identity on a dense domain; self-adjointness requires the domain of the adjoint to match exactly.

Take the momentum operator $P = -i\,d/dx$ on $L^2[0,1]$. Define it on $\mathcal{D}(P) = C_c^\infty(0,1)$, smooth functions vanishing near the endpoints. With the inner product taken linear in the first slot, integration by parts gives $\langle P\phi, \psi \rangle - \langle \phi, P\psi \rangle = -i[\phi\bar{\psi}]_0^1 = 0$ for $\phi, \psi \in \mathcal{D}(P)$. So $P$ is symmetric.

Now compute the adjoint domain $\mathcal{D}(P^*)$. It consists of all $\psi \in L^2$ such that $\phi \mapsto \langle P\phi, \psi \rangle$ is bounded in the $L^2$ norm on $\mathcal{D}(P)$. Distribution theory shows $\mathcal{D}(P^*) = H^1[0,1]$ with no boundary conditions, and $P^*$ acts as $-i\,d/dx$ there. Pick $\phi(x) = e^{2\pi i x}$ and $\psi(x) = 1$, both in $\mathcal{D}(P^*)$. The boundary term $-i[\phi\bar{\psi}]_0^1 = -i(1-1) = 0$ happens to vanish here, but pick $\psi(x)=x$ instead. The term is $-i[\phi\,x]_0^1 = -i \ne 0$, so $P^*$ is not even symmetric. The adjoint is strictly larger: $\mathcal{D}(P) \subsetneq \mathcal{D}(P^*)$. $P$ is not self-adjoint. It has deficiency indices $(1,1)$ and admits a one-parameter family of self-adjoint extensions, indexed by the boundary condition $\psi(1) = e^{i\theta}\psi(0)$. Without picking $\theta$, there is no unique unitary group $e^{itP}$, probability leaks at the boundaries, and the spectral theorem is inapplicable. Symmetry is cheap; self-adjointness costs boundary conditions.
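The boundary term can be checked by direct quadrature. In this sketch the inner product is assumed linear in the first slot, $\langle f, g \rangle = \int_0^1 f\bar{g}\,dx$, the derivatives are supplied analytically, and the trapezoid helper is illustrative scaffolding. For $\phi = e^{2\pi i x}$ and $\psi = x$, the symmetry defect and the boundary expression should agree and be nonzero.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20001)
dx = x[1] - x[0]

def inner(f, g):
    """<f, g> = integral over [0,1] of f * conj(g), by the trapezoid rule."""
    integrand = f * np.conj(g)
    return dx * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

phi = np.exp(2j * np.pi * x)
dphi = 2j * np.pi * phi          # phi'
psi = x.astype(complex)
dpsi = np.ones_like(psi)         # psi'

P_phi = -1j * dphi               # P = -i d/dx applied to phi
P_psi = -1j * dpsi               # P = -i d/dx applied to psi

defect = inner(P_phi, psi) - inner(phi, P_psi)
boundary = -1j * (phi[-1] * np.conj(psi[-1]) - phi[0] * np.conj(psi[0]))
print("<P phi, psi> - <phi, P psi> =", defect)
print("-i [phi conj(psi)]_0^1      =", boundary)
```

Swapping in a $\psi$ that satisfies $\psi(1) = e^{i\theta}\psi(0)$ with the same $\theta$ as $\phi$ makes both numbers vanish, which is the boundary condition doing its job.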

Where to Go from Here

This series has covered twelve articles and the core of a graduate functional analysis course. But functional analysis is a vast subject. Here are some directions for further study.

Operator algebras and C*-algebras. The algebra $B(H)$ has a rich structure studied in C*-algebra and von Neumann algebra theory. The Gelfand-Naimark theorem characterises abstract C*-algebras as norm-closed *-subalgebras of $B(H)$. Fundamental to QFT and statistical mechanics.

Nonlinear functional analysis. The Schauder fixed-point theorem, degree theory, and the calculus of variations for nonlinear functionals extend the linear theory. Navier-Stokes, Yang-Mills, Einstein field equations all require nonlinear methods.

Microlocal analysis. Pseudodifferential operators and Fourier integral operators refine distribution theory to study regularity of solutions to variable-coefficient PDE. The wavefront set encodes both position-space and frequency-space information about singularities.

Interpolation theory. Riesz-Thorin and Marcinkiewicz interpolation provide tools for $L^p$ bounds by interpolating between endpoints. Connects to harmonic analysis and singular integral operators.

Spectral geometry. “Can one hear the shape of a drum?” (Kac, 1966) asks how much geometric information about $\Omega$ is encoded in the spectrum of the Dirichlet Laplacian. Weyl’s law $N(\lambda) \sim C_n\,\text{vol}(\Omega)\lambda^{n/2}$ gives the leading asymptotics of the eigenvalue counting function $N(\lambda)$. The subject connects functional analysis to differential geometry and number theory.

Quantum information theory. Positive trace-class operators of unit trace are the density matrices (mixed quantum states). Von Neumann entropy, quantum channels, and entanglement measures are all studied with operator-theoretic tools.

Random matrix theory and free probability. The eigenvalue statistics of large self-adjoint random matrices have universal limits described by random-matrix ensembles. The free probability of Voiculescu generalises classical probability to non-commutative algebras. Both subjects are functional-analytic at heart.

Index theory. The Atiyah-Singer index theorem connects the analytic index of an elliptic operator (dimension of kernel minus dimension of cokernel) to topological invariants of the underlying manifold. The proof uses the spectral theorem, semigroup methods, and pseudodifferential operators in essential ways. Modern proofs via supersymmetric quantum mechanics (Witten) bring the QM toolkit to bear on geometric topology.

Scattering theory. The Lax-Phillips theory and the Mourre commutator method describe the asymptotic behaviour of $e^{-it\hat{H}}\psi$ for large $|t|$. The wave operators $\Omega_\pm = \lim_{t \to \pm\infty} e^{it\hat{H}}e^{-it\hat{H}_0}$ (strong limits on $H_{ac}(\hat{H}_0)$) encode the long-time scattering. When the two wave operators have the same range, the S-matrix $S = \Omega_+^*\Omega_-$ is a unitary on $H_{ac}(\hat{H}_0)$ whose matrix elements are the scattering amplitudes that experiment measures. The full machinery rests on the spectral theorem, Stone’s theorem, and a careful analysis of the absolutely continuous part of $\hat{H}$.

The common thread across all these directions is the idea that launched functional analysis over a century ago: by abstracting to the right level of generality, we can see the structural reasons behind diverse concrete phenomena, and the abstract insight guides us to new results that would be invisible from any single application domain.

That is the enduring power of the subject, and the reason a twelve-article series fits comfortably inside a graduate student’s first encounter with the field while still feeling like the tip of an iceberg. The iceberg is real; what we have covered is enough to read most of the modern PDE and mathematical physics literature with comprehension, even if not yet with fluency. Fluency takes time and concrete problems. Both are available; only the reader can supply them.

A closing personal note. I have written eighty thousand words on functional analysis at this point, and I am still not sure I have the architecture quite right. Every time I revisit a topic (the resolvent identity, the spectral theorem, the trace operator, Céa’s lemma) I find a new angle, a new connection, a new way of phrasing it that fits the rest of the structure better. That endless refinement is not a sign that the field is unfinished. It is a sign that the field is alive, that the abstractions are deep enough to keep yielding new structure, and that no single presentation can exhaust their content. If anything in this series spurs the same kind of refinement-loop in you, the series will have done its job.

The next time you write down a PDE, ask three questions: what is the natural Hilbert space, what is the bilinear form, and what is the data class? If you can answer those, the rest of the analysis is downstream of theorems we have proved, and the existence-and-uniqueness theory is largely an exercise. The answers may not be obvious — choosing the right $V$ for free-boundary problems is genuinely hard, and the Stokes equations require a divergence-free Sobolev space rather than the simple $H^1_0$ — but the discipline of asking the questions in that order is what the functional-analytic revolution gave us. Use it.


Specific Questions Ahead

This series ends at twelve, but the architecture we built is a foundation, not a ceiling. The natural continuation moves from linear existence theory to nonlinear critical point theory and geometric analysis. Here is what comes next if you follow the path:

  1. How do we find solutions when the energy functional is not convex and has no global minimum?
  2. What replaces compactness when the Sobolev embedding $H^1 \hookrightarrow L^{2^*}$ fails to be compact at the critical exponent?
  3. How do we handle operators that are monotone but not coercive, or functionals that are only lower semicontinuous in the weak topology?
  4. Can we extract multiple solutions from the topology of the underlying function space rather than from linearity?

You are equipped to tackle these because you now have the full linear toolkit: weak convergence and reflexivity (Article 5), compact embeddings and Rellich-Kondrachov (Article 11), spectral decomposition for linearised stability (Article 9), and the variational framework that turns PDE into calculus on Banach spaces. Nonlinear analysis does not discard these tools; it weaponises them. The direct method of the calculus of variations becomes the Mountain Pass Theorem. Compactness failures become concentration-compactness principles. Spectral gaps become Morse indices.

The first concrete result you will meet is the Mountain Pass Theorem (Ambrosetti-Rabinowitz, 1973). It states that if a $C^1$ functional $J$ on a Banach space satisfies the Palais-Smale compactness condition, has a strict local minimum at $0$ in the sense that $J \ge J(0) + \rho$ on some sphere $\|u\| = r$ with $\rho > 0$, and returns to or below $J(0)$ at some point $e$ with $\|e\| > r$, then there exists a critical point at the saddle level $c = \inf_{\gamma} \max_{t} J(\gamma(t)) \ge J(0) + \rho$, where $\gamma$ ranges over continuous paths connecting $0$ to $e$. The proof is a minimax argument built on a deformation lemma, with the Palais-Smale condition supplying the compactness. It produces nontrivial solutions of semilinear elliptic equations like $-\Delta u = |u|^{p-2}u$ (with $2 < p < 2^*$ and zero Dirichlet data on a bounded domain), where Lax-Milgram is useless because the problem is nonlinear and the energy functional is neither convex nor bounded below.
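To make that last example concrete (a sketch under the standard assumptions that $\Omega$ is bounded, $u = 0$ on $\partial\Omega$, and $2 < p < 2^*$), the functional and its derivative on $H_0^1(\Omega)$ are

$$J(u) = \frac{1}{2}\int_\Omega |\nabla u|^2\,dx - \frac{1}{p}\int_\Omega |u|^p\,dx, \qquad J'(u)v = \int_\Omega \nabla u\cdot\nabla v\,dx - \int_\Omega |u|^{p-2}u\,v\,dx,$$

so critical points of $J$ are exactly weak solutions. The Sobolev inequality gives $J(u) \ge \tfrac{1}{2}\|\nabla u\|_{L^2}^2 - C\|\nabla u\|_{L^2}^p$, which is strictly positive on a small sphere because $p > 2$, while $J(tu) \to -\infty$ as $t \to \infty$ for any fixed $u \ne 0$. That is the mountain-pass geometry: a ring of high ground around $0$ and a low point beyond it.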

The transition from linear to nonlinear is not a change of subject. It is a change of topology. You have spent twelve articles learning how to control limits, how to measure size, and how to decompose operators. Those skills do not expire. They are the only things that keep nonlinear problems from dissolving into formal manipulation. Pick a concrete equation, write down the functional, check the compactness, and run the minimax. The machinery is ready.
