
Functional Analysis (8): Spectral Theory — Decomposing Operators
The spectrum generalizes eigenvalues to infinite dimensions — the spectral theorem for bounded self-adjoint operators and continuous functional calculus give us a complete decomposition.
When I first saw the word “spectrum” used for an operator I assumed it was a fancy synonym for “set of eigenvalues.” That is the right intuition for matrices and, apart from the point $0$, for compact operators, and it is exactly what one wants in introductory linear algebra. The trouble is that it is wrong as soon as the operator is not compact. The position operator $(Mf)(x) = x f(x)$ on $L^2[0, 1]$ has no eigenvalues: any eigenfunction would have to satisfy $x f(x) = \lambda f(x)$ a.e., which forces $f = 0$ almost everywhere away from the single point $x = \lambda$, hence $f = 0$ in $L^2$. And yet $\lambda I - M$ is not invertible for $\lambda \in [0, 1]$: it is multiplication by $\lambda - x$, whose reciprocal is unbounded on $[0, 1]$, so no bounded inverse exists.
So we need a notion of “spectral value” that is broader than eigenvalue. The idea is to define the spectrum as the set of $\lambda$ for which $\lambda I - T$ fails to be a bounded invertible operator, for any reason. This makes the spectrum a property of the operator’s invertibility structure, not just its eigenvector structure, and it works uniformly for compact and non-compact operators alike. The full reward — the spectral theorem for bounded self-adjoint operators — promotes the diagonalization picture from compact to general self-adjoint operators by replacing finite or countable diagonals with continuous integrals against a spectral measure. This article is the unhurried walk through that reward.
The Spectrum, Defined#
Let $T \in B(X)$ be a bounded operator on a complex Banach space $X$. The resolvent set is
$$ \rho(T) = \{\lambda \in \mathbb{C} : \lambda I - T \text{ is bijective and } (\lambda I - T)^{-1} \text{ is bounded}\}. $$
The spectrum is the complement: $\sigma(T) = \mathbb{C} \setminus \rho(T)$. By the open mapping theorem, bijectivity of $\lambda I - T$ already guarantees a bounded inverse, so the second condition is automatic. The spectrum is therefore the set of $\lambda$ for which $\lambda I - T$ fails to be either injective or surjective.
This binary distinction (injective / surjective) lets us refine the spectrum into pieces.

- Point spectrum $\sigma_p(T) = \{\lambda : \lambda I - T \text{ is not injective}\}$ — the eigenvalues, in the usual sense.
- Continuous spectrum $\sigma_c(T) = \{\lambda : \lambda I - T \text{ is injective with dense range, but not surjective}\}$ — no eigenvector, but $\lambda I - T$ is “almost” surjective.
- Residual spectrum $\sigma_r(T) = \{\lambda : \lambda I - T \text{ is injective, range not dense}\}$ — no eigenvector, and the range misses a substantial part of the space.
These three sets are disjoint and their union is $\sigma(T)$ . For self-adjoint operators on Hilbert space, the residual spectrum is empty — a small but useful fact.
A few examples are worth concrete computation.
Example 1 (matrix). For $A \in M_n(\mathbb{C})$ , the spectrum is the eigenvalues, all in $\sigma_p$ . There is no continuous or residual spectrum: in finite dimensions, injective implies surjective. The whole “three pieces” story collapses to the matrix eigenvalue story.
Example 2 (compact operator). Article 7 told us that for compact $T$, every nonzero spectral value is an eigenvalue. So $\sigma(T) \setminus \{0\} \subset \sigma_p(T)$, while $0$ itself (which always belongs to $\sigma(T)$ when the space is infinite-dimensional) may fall in $\sigma_p(T)$, $\sigma_c(T)$, or $\sigma_r(T)$. The Volterra operator on $L^2[0, 1]$ has $\sigma = \{0\}$, with $0 \in \sigma_c$.
Example 3 (multiplication on $L^2[0, 1]$). $(M f)(x) = x f(x)$. For $\lambda \in [0, 1]$, $\lambda I - M$ is multiplication by $\lambda - x$, which is injective (since $\lambda - x \neq 0$ except on a null set) but not surjective (the image is $\{g \in L^2 : g(x)/(\lambda - x) \in L^2\}$, a proper subset that misses, for example, the constant function $1$). For every $\lambda \in [0, 1]$, endpoints included, the image is dense, because it contains every $L^2$ function that vanishes in a neighborhood of $\lambda$; so $\lambda \in \sigma_c$. For $\lambda \notin [0, 1]$, $\lambda I - M$ is invertible (its inverse is multiplication by $1/(\lambda - x) \in L^\infty$), so $\lambda \in \rho$. Conclusion: $\sigma(M) = [0, 1]$, all continuous spectrum, no eigenvalues.
The third example is the prototype non-compact self-adjoint operator. It shows in stark form why “find the eigenvalues” is the wrong question for general operators: this operator has no eigenvalues, but its spectrum is a whole interval. The right question is “describe the spectral measure,” and the answer for $M$ is “Lebesgue measure on $[0, 1]$ , in disguise.”
The Resolvent and Its Analyticity#
For $\lambda \in \rho(T)$, define the resolvent
$$ R(\lambda; T) = (\lambda I - T)^{-1}. $$
The resolvent is the technical workhorse of spectral theory. Its first virtue: as a function of $\lambda$ , it is operator-valued and analytic on $\rho(T)$ .

Theorem. $\rho(T)$ is open, and $\lambda \mapsto R(\lambda; T)$ is analytic from $\rho(T)$ to $B(X)$ , in the sense that it has a convergent power series expansion at every point of $\rho(T)$ .
Proof. Fix $\lambda_0 \in \rho(T)$. For $\lambda$ near $\lambda_0$, write $\lambda I - T = (\lambda_0 I - T)\bigl(I - (\lambda_0 - \lambda) R(\lambda_0; T)\bigr)$ and invert the second factor by a Neumann series:
$$ R(\lambda; T) = R(\lambda_0; T) \sum_{n=0}^\infty (\lambda_0 - \lambda)^n R(\lambda_0; T)^n. $$
The series converges in operator norm whenever $|\lambda - \lambda_0| < \|R(\lambda_0; T)\|^{-1}$. So every such $\lambda$ lies in $\rho(T)$, which is therefore open, and the resolvent is given locally by a convergent power series in $\lambda$, hence is analytic. $\square$
A standard consequence is the first resolvent identity: $R(\lambda) - R(\mu) = (\mu - \lambda) R(\lambda) R(\mu)$ , valid for $\lambda, \mu \in \rho(T)$ . It is the operator analog of the partial-fraction identity $1/(\lambda - t) - 1/(\mu - t) = (\mu - \lambda)/((\lambda - t)(\mu - t))$ , and it is what makes the resolvent useful in contour integration.
The spectrum is non-empty and bounded. For any $T \in B(X)$ on a complex Banach space, $\sigma(T) \neq \emptyset$ and $\sigma(T) \subset \{\lambda : |\lambda| \leq \|T\|\}$ . The bound is the Neumann series argument: for $|\lambda| > \|T\|$ , the series $\sum (T/\lambda)^n / \lambda$ converges to $R(\lambda; T)$ . Non-emptiness uses that if $\sigma(T) = \emptyset$ , then $R(\lambda; T)$ would be entire and bounded, hence constant by Liouville, but it goes to zero as $|\lambda| \to \infty$ , hence is identically zero — contradiction. The non-emptiness of the spectrum is thus a complex-analytic theorem that has no analog in real Banach spaces.
Worked Numerical Example#
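Take $A = \begin{pmatrix} 2 & 1 \\ 0 & 4 \end{pmatrix}$, with $\sigma(A) = \{2, 4\}$. At $\lambda_0 = 6$ the resolvent is
$$ R(6; A) = (6I - A)^{-1} = \begin{pmatrix} 4 & -1 \\ 0 & 2 \end{pmatrix}^{-1} = \begin{pmatrix} 0.25 & 0.125 \\ 0 & 0.5 \end{pmatrix}. $$
The Neumann series around $\lambda_0 = 6$, evaluated at $\lambda = 5$, reads
$$ R(5; A) = R(6; A) \sum_{n=0}^\infty (6-5)^n R(6; A)^n = \sum_{n=0}^\infty R(6; A)^{n+1}. $$
Here $\|R(6; A)\|_2 \approx 0.52$, so the norm criterion guarantees convergence for $|\lambda - 6| < 1.92$, and $\lambda = 5$ lies inside that disk. Summing the geometric series as $R(6; A)\,(I - R(6; A))^{-1}$ gives
$$ R(5; A) = \begin{pmatrix} 1/3 & 1/3 \\ 0 & 1 \end{pmatrix}. $$
Direct inversion confirms this: $(5I - A)^{-1} = \begin{pmatrix} 3 & -1 \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1/3 & 1/3 \\ 0 & 1 \end{pmatrix}$. The resolvent is not an abstract limit; it is a concrete geometric series whose exact radius of convergence is the distance from $\lambda_0$ to the nearest spectral point (here $2$; the norm estimate $1.92$ is a slightly conservative bound).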
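The same check can be run numerically. The sketch below assumes $A = \begin{pmatrix} 2 & 1 \\ 0 & 4 \end{pmatrix}$, which is the matrix read off from the displayed $6I - A$; the helper names `resolvent` and `neumann_resolvent` are illustrative choices, not library functions.

```python
import numpy as np

# Assumed example matrix, read off from 6I - A = [[4, -1], [0, 2]].
A = np.array([[2.0, 1.0], [0.0, 4.0]])

def resolvent(lam, T):
    """Direct resolvent R(lam; T) = (lam*I - T)^{-1}."""
    return np.linalg.inv(lam * np.eye(T.shape[0]) - T)

def neumann_resolvent(lam, lam0, T, terms=200):
    """Partial sum of R(lam0) * sum_n (lam0 - lam)^n R(lam0)^n."""
    R0 = resolvent(lam0, T)
    total = np.zeros_like(R0)
    term = np.eye(T.shape[0])            # holds (lam0 - lam)^n R0^n
    for _ in range(terms):
        total += term
        term = (lam0 - lam) * (term @ R0)
    return R0 @ total

R5_series = neumann_resolvent(5.0, 6.0, A)
R5_direct = resolvent(5.0, A)
print(R5_series)
print(np.allclose(R5_series, R5_direct))
```

Moving `lam` further from `lam0` than the distance to the spectrum (for example `lam = 3.5`) makes the partial sums blow up, which is one way to see the radius of convergence in practice.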
Spectral Radius#
Define the spectral radius $r(T) = \sup\{|\lambda| : \lambda \in \sigma(T)\}$ . The neat fact is that this geometric quantity equals an analytic limit:

Spectral radius formula. $r(T) = \lim_{n \to \infty} \|T^n\|^{1/n}$ .
The limit exists by Fekete’s lemma applied to the sub-multiplicative sequence $\|T^n\|$ . For $|\lambda| > r(T)$ , the Neumann series $\sum T^n / \lambda^{n+1}$ converges (root test), giving $\lambda \in \rho(T)$ . The opposite direction comes from the analyticity of the resolvent: it must have a singularity somewhere on the circle $|\lambda| = r(T)$ , which forces $\sigma(T)$ to touch that circle.
Why this is shocking. The left side is purely about the spectrum (a geometric property of the operator). The right side is purely about iterating $T$ (an analytic property). They are equal. In particular, an operator with $\|T^n\|^{1/n} \to 0$ is quasinilpotent, with spectrum $\{0\}$. The Volterra operator has this property. More generally, any operator with $r(T) < 1$ behaves asymptotically like a contraction, even if $\|T\| > 1$: for every $q$ with $r(T) < q < 1$ we have $\|T^n\| \leq q^n$ for all large $n$, so $\|T^n\| \to 0$ geometrically.
Try a small numerical example. Take the $3 \times 3$ matrix
$$ A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}. $$
Then $A^2$ has a single $1$ in the top-right corner and $A^3 = 0$. So $\|A^n\| = 0$ for $n \geq 3$, and $r(A) = 0$. The spectrum of $A$ is $\{0\}$ (it is nilpotent). Now take $B = A + 0.01 \cdot I$. The spectrum is $\{0.01\}$, and since $B^n = 0.01^n I + n \, 0.01^{n-1} A + \binom{n}{2} 0.01^{n-2} A^2$ exceeds $0.01^n$ only by a polynomial factor in $n$, we get $\|B^n\|^{1/n} \to 0.01$. The eigenvalue equals the asymptotic growth rate: that is the spectral radius formula in three lines.
Worked Numerical Example#
Take $T = \begin{pmatrix} 0 & 3 \\ 1/3 & 0 \end{pmatrix}$, a non-normal matrix with eigenvalues $\pm 1$, so $r(T) = 1$. Its powers are
$$ T^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I, \quad T^3 = T, \quad T^4 = I. $$
The sequence alternates: $T^{2k} = I$, $T^{2k+1} = T$. The spectral norm of the identity is $\|I\|_2 = 1$. For $T$, the singular values are the square roots of the eigenvalues of $T^*T = \text{diag}(1/9, 9)$, so $\|T\|_2 = 3$. The sequence $\|T^n\|^{1/n}$ therefore alternates between $1^{1/(2k)} = 1$ and $3^{1/(2k+1)}$. For $n=1$ the value is $3$; for $n=3$, $3^{1/3} \approx 1.442$; for $n=5$, $3^{1/5} \approx 1.246$; for $n=9$, $3^{1/9} \approx 1.130$; for $n=99$, $3^{1/99} \approx 1.011$. The odd subsequence decreases monotonically to $1$, the constant value of the even subsequence, so the limit is $1 = r(T)$. The formula holds even though $\|T^n\|$ itself never settles down, oscillating between $1$ and $3$ forever. The asymptotic growth rate sees only the spectral radius, not the transient non-normality.
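A few lines of NumPy reproduce these numbers. The matrix below is an assumption: it is one $2 \times 2$ matrix satisfying $T^2 = I$ and $T^*T = \text{diag}(1/9, 9)$, as the example requires. The name `gelfand_term` is a hypothetical helper.

```python
import numpy as np

# Assumed matrix: one choice with T^2 = I and T^T T = diag(1/9, 9).
T = np.array([[0.0, 3.0], [1.0 / 3.0, 0.0]])

def gelfand_term(T, n):
    """Return ||T^n||_2 ** (1/n), the n-th term of the spectral radius formula."""
    Tn = np.linalg.matrix_power(T, n)
    return np.linalg.norm(Tn, 2) ** (1.0 / n)

spectral_radius = max(abs(np.linalg.eigvals(T)))

for n in (1, 2, 3, 5, 9, 99, 100):
    print(n, round(gelfand_term(T, n), 4))
print("r(T) =", spectral_radius)
```

The printed values alternate between exactly $1$ (even $n$) and $3^{1/n}$ (odd $n$), both approaching the spectral radius.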
Self-Adjoint Operators on Hilbert Space#
From here on, $H$ is a complex Hilbert space and $T \in B(H)$ is self-adjoint, meaning $T = T^*$ . The world becomes much friendlier.

The spectrum of a self-adjoint operator is real. For $\lambda = a + ib \in \mathbb{C}$ with $b \neq 0$ , the operator $T - \lambda I$ satisfies
$$ \|(T - \lambda I) x\|^2 = \|(T - aI)x\|^2 + b^2 \|x\|^2 \geq b^2 \|x\|^2, $$
so $T - \lambda I$ is bounded below, hence injective with closed range. By self-adjointness, $\text{range}(T - \lambda I)^\perp = \ker(T - \overline{\lambda} I)$, and the same estimate applied to $\overline{\lambda}$ shows this kernel is trivial. So the range is both closed and dense, hence all of $H$; $T - \lambda I$ is invertible, and $\lambda \in \rho(T)$.
The spectrum of a self-adjoint operator therefore lies on $\mathbb{R}$ . In fact, it lies in $[-\|T\|, \|T\|]$ , and (a finer estimate) in $[m, M]$ where $m = \inf_{\|x\|=1} \langle Tx, x \rangle$ and $M = \sup_{\|x\|=1} \langle Tx, x\rangle$ , with $m, M \in \sigma(T)$ .
Residual spectrum is empty. For real $\lambda$, self-adjointness gives $\text{range}(\lambda I - T)^\perp = \ker(\lambda I - T)$. So if $\lambda I - T$ is injective, its range is automatically dense, and $\lambda$ cannot belong to the residual spectrum; hence the whole spectrum is point or continuous. No third type. This dichotomy is what makes the spectral theorem for self-adjoint operators so clean: every spectral value is either an eigenvalue (with eigenvector) or a continuous spectrum value (with “approximate eigenvectors”: Weyl sequences $x_n$ with $\|x_n\| = 1$ and $(T - \lambda I)x_n \to 0$).
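To make Weyl sequences concrete, here is a minimal numerical sketch for the multiplication operator $(Mf)(x) = x f(x)$ on $L^2[0, 1]$. It assumes the particular choice $f_n = $ normalized indicator of $[\lambda - 1/n, \lambda + 1/n]$, for which a direct integration gives $\|(M - \lambda) f_n\| = 1/(\sqrt{3}\, n)$; the function name `weyl_residual` and the grid size are illustrative.

```python
import numpy as np

def weyl_residual(lam, n, grid=200_000):
    """||(M - lam) f_n|| for f_n = normalized indicator of [lam - 1/n, lam + 1/n]."""
    x = (np.arange(grid) + 0.5) / grid        # midpoint grid on [0, 1]
    dx = 1.0 / grid
    f = (np.abs(x - lam) <= 1.0 / n).astype(float)
    f /= np.sqrt(np.sum(f**2) * dx)           # normalize so that ||f_n|| = 1
    return np.sqrt(np.sum(((x - lam) * f) ** 2) * dx)

for n in (10, 100, 1000):
    print(n, weyl_residual(0.5, n), 1.0 / (np.sqrt(3.0) * n))
```

The residual shrinks like $1/n$ while $\|f_n\| = 1$, so $\lambda = 0.5$ is an approximate eigenvalue even though no true eigenfunction exists.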
The Continuous Functional Calculus#
This is the key idea, and it is the cleanest one in spectral theory. Given $T = T^*$ bounded with spectrum $\sigma(T) \subset [m, M] \subset \mathbb{R}$ , we want to apply functions to $T$ . For polynomials this is trivial: $p(T) = \sum a_n T^n$ . For power series with radius of convergence exceeding $\|T\|$ , the same. But what about the function $f(t) = e^t$ on the spectrum? Or $f(t) = \sqrt{t}$ on $[0, M]$ for positive $T$ ? Or $f(t) = \mathbf{1}_{(\lambda_0, \infty)}(t)$ , the indicator of an interval?

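The clean answer for continuous $f$ is the continuous functional calculus. (The indicator function is not continuous in general; it is handled by the Borel extension of the calculus described later.)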

Theorem (Continuous Functional Calculus). Let $T = T^* \in B(H)$ . There is a unique map $\Phi: C(\sigma(T)) \to B(H)$ such that
- $\Phi$ is a $*$ -algebra homomorphism: $\Phi(fg) = \Phi(f)\Phi(g)$ , $\Phi(\overline{f}) = \Phi(f)^*$ , $\Phi(1) = I$ , $\Phi(t) = T$ .
- $\Phi$ is an isometry: $\|\Phi(f)\| = \|f\|_{C(\sigma(T))} = \sup_{\lambda \in \sigma(T)} |f(\lambda)|$ .
- $\sigma(\Phi(f)) = f(\sigma(T))$ (spectral mapping theorem).
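The construction goes through the polynomial case first: define $\Phi(p) = p(T)$ for polynomials and verify $\|p(T)\| = \|p\|_{C(\sigma(T))}$. This is where self-adjointness enters: $p(T)$ is normal, so its norm equals its spectral radius, and $\sigma(p(T)) = p(\sigma(T))$ by the polynomial spectral mapping theorem. Then extend by density to continuous functions, using the Weierstrass approximation theorem on the compact set $\sigma(T)$. The end product is a way to define $f(T)$ for any $f$ continuous on the spectrum, with all the algebraic and norm properties one would want.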
In particular, we have $e^T$ , $\sqrt{T}$ (for positive $T$ ), $|T| = (T^*T)^{1/2}$ for general $T$ , and a range of useful operator-functions. The continuous functional calculus is what makes applied operator theory possible: anywhere one wants to compute $f(T)$ for a self-adjoint $T$ , this gives a clean way to do it.
A small numerical instance. Take $T = \text{diag}(1, 2, 3)$ on $\mathbb{C}^3$ . Then $\sigma(T) = \{1, 2, 3\}$ , and for $f$ continuous, $\Phi(f) = \text{diag}(f(1), f(2), f(3))$ — the functional calculus reduces to applying $f$ to the eigenvalues. For an operator with continuous spectrum, the same idea works but the “diagonal” is replaced with a multiplication operator on a function space. This is essentially the content of the spectral theorem.
Worked Numerical Example#
Take $T = \text{diag}(0.1, 0.4, 0.8)$ on $\mathbb{C}^3$ and $f(t) = t e^{-t}$. Evaluating $f$ on the spectrum:
$$ f(0.1) = 0.1 e^{-0.1} \approx 0.09048, $$
$$ f(0.4) = 0.4 e^{-0.4} \approx 0.26813, $$
$$ f(0.8) = 0.8 e^{-0.8} \approx 0.35946. $$
Thus $f(T) = \text{diag}(0.09048, 0.26813, 0.35946)$. The operator norm $\|f(T)\|_2$ is the maximum absolute diagonal entry, which is $0.35946$. Now compute the sup-norm of $f$ on the spectrum: $\max\{|f(0.1)|, |f(0.4)|, |f(0.8)|\} = 0.35946$. They match exactly, confirming the isometry property $\|\Phi(f)\| = \|f\|_{C(\sigma(T))}$. If we instead took $g(t) = \sqrt{t}$, we get $g(T) = \text{diag}(0.3162, 0.6325, 0.8944)$, with norm $0.8944$. The functional calculus does not require $f$ to be a polynomial or analytic; continuity on the spectrum is sufficient (and on a finite spectrum every function is continuous), and the norm equality is exact, not approximate. This is why numerical routines for matrix functions can evaluate $f$ on the eigenvalues directly when the matrix is normal.
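The diagonal case hides the change of basis, so here is a hedged sketch for a non-diagonal symmetric matrix with the same spectrum. The rotation `Q` and the helper `apply_function` are assumptions made for illustration; the only library routines used are NumPy's `eigh`, `qr`, and `norm`.

```python
import numpy as np

def apply_function(T, f):
    """f(T) for a real symmetric T via the eigendecomposition T = V diag(w) V^T."""
    w, V = np.linalg.eigh(T)
    return V @ np.diag(f(w)) @ V.T

# Hypothetical test operator: diag(0.1, 0.4, 0.8) in a rotated orthonormal basis.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
T = Q @ np.diag([0.1, 0.4, 0.8]) @ Q.T
T = (T + T.T) / 2                         # symmetrize against rounding

f = lambda t: t * np.exp(-t)
fT = apply_function(T, f)

op_norm = np.linalg.norm(fT, 2)           # operator norm of f(T)
sup_norm = max(abs(f(np.array([0.1, 0.4, 0.8]))))   # sup of |f| on the spectrum
print(op_norm, sup_norm)

sqrtT = apply_function(T, np.sqrt)        # square root of a positive operator
print(np.allclose(sqrtT @ sqrtT, T))
```

The two printed norms agree to rounding error, and `sqrtT @ sqrtT` recovers `T`, which is the homomorphism property $\Phi(g)^2 = \Phi(g^2)$ for $g(t) = \sqrt{t}$.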
The Spectral Theorem for Bounded Self-Adjoint Operators#
There are several equivalent formulations. The two I find most useful are the multiplication operator form and the spectral measure form.

Theorem (Spectral Theorem, Multiplication Form). Let $T \in B(H)$ be self-adjoint. There exist a measure space $(\Omega, \mu)$ , a unitary $U: H \to L^2(\Omega, \mu)$ , and a bounded measurable function $h: \Omega \to \mathbb{R}$ such that
$$ U T U^{-1} = M_h, $$where $M_h$ is multiplication by $h$ . In words: every bounded self-adjoint operator is unitarily equivalent to a multiplication operator on some $L^2$ space.
This is the right generalization of “every Hermitian matrix is diagonalizable.” For matrices, $\Omega$ is finite (the index set of eigenvalues, with multiplicity), $\mu$ is counting measure, and $M_h$ is the diagonal matrix. For general bounded self-adjoint operators, $\Omega$ may be a continuum and $\mu$ a continuous measure, but the structural picture is the same: in the right basis, $T$ is multiplication by a real-valued function.
Theorem (Spectral Theorem, Spectral Measure Form). Let $T \in B(H)$ be self-adjoint. There exists a unique projection-valued measure $E$ on the Borel sets of $\sigma(T)$ such that
$$ T = \int_{\sigma(T)} \lambda \, dE(\lambda), $$where the integral is interpreted weakly: $\langle T x, y \rangle = \int \lambda \, d \langle E(\lambda) x, y \rangle$ . The measure $E$ assigns to each Borel set $B \subset \sigma(T)$ an orthogonal projection $E(B)$ , with $E(\emptyset) = 0$ , $E(\sigma(T)) = I$ , and countable additivity for disjoint unions.
This is the form that physicists love: it is the operator-theoretic content of “an observable has a spectrum, and a measurement projects onto an eigenspace.” The projection-valued measure $E$ is what tells you, for each Borel set $B$ , the orthogonal projection onto “the part of the state living in spectral region $B$ .”
For compact self-adjoint operators (article 7), $E$ is a sum of eigenprojections: finite-rank projections at the nonzero eigenvalues, plus the projection onto the kernel, which may have infinite rank. For multiplication by $x$ on $L^2[0, 1]$, $E(B)$ is multiplication by $\mathbf{1}_B$, the projection onto functions supported on $B$. Both cases are special instances of the same theorem.
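In finite dimensions the projection-valued measure can be written down directly. The sketch below uses a hypothetical $2 \times 2$ symmetric matrix with eigenvalues $1$ and $3$; `spectral_projection` is an illustrative helper that builds $E([a, b])$ from the eigenvectors returned by `numpy.linalg.eigh`.

```python
import numpy as np

def spectral_projection(T, a, b):
    """E([a, b]) for real symmetric T: projection onto eigenvectors with eigenvalue in [a, b]."""
    w, V = np.linalg.eigh(T)
    cols = V[:, (w >= a) & (w <= b)]
    return cols @ cols.T

T = np.array([[2.0, 1.0], [1.0, 2.0]])      # hypothetical example; eigenvalues 1 and 3
E_low = spectral_projection(T, 0.0, 2.0)    # isolates the eigenvalue 1
E_high = spectral_projection(T, 2.5, 4.0)   # isolates the eigenvalue 3

psi = np.array([1.0, 0.0])                  # a unit vector
p_low = psi @ E_low @ psi                   # <E(B) psi, psi> for B = [0, 2]
p_high = psi @ E_high @ psi
print(p_low, p_high)
print(np.allclose(1.0 * E_low + 3.0 * E_high, T))
```

The last line is the finite-dimensional version of $T = \int \lambda \, dE(\lambda)$, and the two numbers $\langle E(B)\psi, \psi\rangle$ sum to $1$ because the two sets cover the spectrum.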
Examples to Internalize#
Example A (multiplication on $L^2$ ). $(Mf)(x) = m(x) f(x)$ on $L^2(\Omega, \mu)$ for a real-valued bounded measurable $m$ . Spectrum is the essential range of $m$ :
$$ \sigma(M) = \{\lambda : \mu(\{x : |m(x) - \lambda| < \varepsilon\}) > 0 \text{ for all } \varepsilon > 0\}. $$The spectral measure is $E(B) = M_{\mathbf{1}_{m^{-1}(B)}}$ , multiplication by the indicator of the preimage of $B$ . This is the “model” example to which every self-adjoint operator is unitarily equivalent.

Example B (right shift on $\ell^2$ ). $S(x_1, x_2, \ldots) = (0, x_1, x_2, \ldots)$ . Not self-adjoint: $S^*$ is the left shift $(x_1, x_2, \ldots) \mapsto (x_2, x_3, \ldots)$ . Spectrum: $\sigma(S) = \overline{\mathbb{D}} = \{|\lambda| \leq 1\}$ . Point spectrum of $S$ : empty (no $\ell^2$ eigenvector). Point spectrum of $S^*$ : the open disk $\mathbb{D}$ , with eigenvectors $(1, \lambda, \lambda^2, \ldots)$ for $|\lambda| < 1$ . The asymmetry between $S$ and $S^*$ is a textbook illustration that point spectrum is not a self-adjointness-flavored quantity.
Example C (Laplacian on $L^2(\mathbb{R})$ , technically unbounded but instructive). $\Delta f = f''$ . Via Fourier transform, $\Delta$ becomes multiplication by $-|\xi|^2$ on $L^2(\mathbb{R})$ . Spectrum: $\sigma(\Delta) = (-\infty, 0]$ , all continuous spectrum. There are no $L^2$ eigenfunctions. The “eigenfunctions” $e^{i\xi x}$ are not in $L^2$ , they are generalized eigenfunctions. This is the prototype of how Fourier analysis is spectral theory in disguise.
Example D (integral operator with smoothing kernel on $L^2[0, 1]$ ). $K(x, y) = e^{-|x-y|^2}$ , a Gaussian kernel. The operator is compact and self-adjoint, so eigenvalues form a sequence going to zero. The eigenfunctions can be approximated numerically by discretizing on a fine grid and diagonalizing the resulting matrix. The eigenvalues decay roughly exponentially, matching the smoothness of the kernel.
These examples are worth building intuition around. Almost any operator one meets in mathematical physics is a variation on one of A, B, C, D, or a combination; A, C, and D are the self-adjoint models, and B is the standard non-self-adjoint contrast.
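Example D can be tried directly. The sketch below is a simple midpoint (Nyström-style) discretization of the Gaussian kernel operator; the grid size `N = 200` is an arbitrary choice, and the computed eigenvalues are approximations to those of the integral operator, not exact values.

```python
import numpy as np

N = 200
x = (np.arange(N) + 0.5) / N                 # midpoint nodes on [0, 1]
h = 1.0 / N                                  # quadrature weight
K = np.exp(-(x[:, None] - x[None, :]) ** 2)  # Gaussian kernel K(x, y)
eigs = np.linalg.eigvalsh(h * K)[::-1]       # eigenvalues in descending order

print(eigs[:8])
print(eigs.sum())                            # approximates the trace, int K(x, x) dx = 1
```

The leading eigenvalue carries most of the trace and the rest fall off very quickly, which is the numerical signature of a smooth kernel.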
A Numerical Example to Anchor the Picture#
Consider the bounded self-adjoint operator $(T f)(x) = (1 - x^2) f(x) + \int_{-1}^1 K(x, y) f(y) \, dy$ on $L^2[-1, 1]$, with $K(x, y)$ a small symmetric Hilbert-Schmidt kernel. The first part is multiplication by $1 - x^2$, with continuous spectrum $[0, 1]$. The second part is a compact self-adjoint operator. By Weyl's theorem, a compact perturbation does not change the essential spectrum, so $[0, 1]$ remains in the spectrum of the full operator; what the compact part can add is a discrete set of eigenvalues of finite multiplicity outside $[0, 1]$.
Numerically, one discretizes the interval into $N = 1000$ points, builds the resulting $1000 \times 1000$ symmetric matrix, and diagonalizes. The eigenvalues cluster densely on $[0, 1]$ (approximating the continuous spectrum) and possibly have a few outliers (approximating discrete eigenvalues). As $N \to \infty$ , the cluster of eigenvalues fills out $[0, 1]$ at a rate predicted by the spectral density of the multiplication operator, and the outliers stabilize. This is the qualitative picture of operator spectra: in the limit, multiplication operators give continuous spectrum, compact operators give discrete eigenvalues, and combinations give a mix. The spectral measure form of the theorem is the structural statement that captures both regimes.

Concretely: for the multiplication operator $M_g$ on $L^2[0, 1]$ with $g(x) = x$ , the spectral measure $E(B) = M_{\mathbf{1}_{B}}$ is multiplication by the indicator of $B$ , and $\langle E(B) f, f \rangle = \int_B |f|^2 \, dx$ . This is the Lebesgue measure of $B$ weighted by $|f|^2$ . The eigenfunctions, in the strict sense, do not exist; the right replacement is the spectral measure.
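The discretization described above can be sketched in a few lines. The kernel here is an assumption chosen for simplicity: the constant kernel $K(x, y) = c$ with $c = 0.5$, which gives a rank-one (hence compact) perturbation. For this choice the outlier solves $1 = c \int_{-1}^{1} dx / (\lambda - 1 + x^2)$, which puts it near $1.74$.

```python
import numpy as np

N = 1000
x = -1.0 + (np.arange(N) + 0.5) * (2.0 / N)   # midpoint grid on [-1, 1]
h = 2.0 / N                                   # quadrature weight
c = 0.5                                       # hypothetical kernel K(x, y) = c (rank one)

# Multiplication by 1 - x^2 plus the discretized integral operator.
T = np.diag(1.0 - x**2) + c * h * np.ones((N, N))
eigs = np.linalg.eigvalsh(T)

bulk = eigs[eigs <= 1.0]                      # approximates the continuous spectrum [0, 1]
outliers = eigs[eigs > 1.0]                   # approximates the discrete eigenvalues
print(bulk.min(), bulk.max(), outliers)
```

Increasing `N` makes the bulk denser in $[0, 1]$ while the single outlier stays put, matching the qualitative picture in the text.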
The Decomposition $\sigma_{ac} \cup \sigma_{sc} \cup \sigma_{pp}$ in Practice#
The decomposition $\sigma(T) = \sigma_{ac}(T) \cup \sigma_{sc}(T) \cup \sigma_{pp}(T)$ (absolutely continuous, singular continuous, pure point) is what comes out of the spectral measure form once one applies the Lebesgue decomposition theorem to the scalar spectral measures $\mu_x(B) = \langle E(B) x, x \rangle$. Most operators of physical interest have $\sigma_{sc} = \emptyset$, but counterexamples exist (random Schrödinger operators with sparse potentials, almost-Mathieu operators at certain parameter values), and these are subtle and interesting.
Why does the decomposition matter? In quantum mechanics, the parts have physical meaning. Pure point spectrum corresponds to bound states (electrons trapped near a nucleus, particles in a confining potential). Absolutely continuous spectrum corresponds to scattering states (free particles, particles that can escape to infinity). Singular continuous spectrum is exotic but real: it corresponds to states that are neither bound nor scattering, with anomalous transport properties. The RAGE theorem (Ruelle, Amrein, Georgescu, Enss) makes the bound-versus-scattering dichotomy precise in terms of time evolution. The proof that ordinary atomic Hamiltonians have only point and absolutely continuous spectrum (no singular continuous part) is a deep result in mathematical physics, obtained through dilation-analytic methods and Mourre's commutator estimates, and it took decades to establish.
The continuous functional calculus extends to a Borel functional calculus, allowing $f(T)$ for any bounded Borel function $f$ on the spectrum — in particular for indicator functions, which gives back the projection-valued spectral measure $E(B) = \mathbf{1}_B(T)$ . So the continuous and Borel functional calculi together give the spectral theorem; conversely, the spectral theorem implies them by integration against $E$ . The whole story is a tight three-way equivalence.
Computing Spectra: A Practitioner’s Catalog#
Some operators come up so often that knowing their spectra by heart is useful. A short catalog.
Identity, $I$ : $\sigma(I) = \{1\}$ . Trivial, but worth stating. The identity has a single eigenvalue.
Diagonal multiplication on $\ell^2$ : $T(x_1, x_2, \ldots) = (\lambda_1 x_1, \lambda_2 x_2, \ldots)$ with $\lambda_n$ bounded. Spectrum is the closure of $\{\lambda_n\}$. Each $\lambda_n$ is an eigenvalue. Limit points of $\{\lambda_n\}$ that are not themselves among the $\lambda_n$ are continuous spectrum.
Multiplication by $g$ on $L^2(\Omega, \mu)$ : spectrum is the essential range of $g$ . Pure point spectrum at the values $\lambda$ where $\mu(g^{-1}(\{\lambda\})) > 0$ (so $g$ is constant on a set of positive measure); continuous spectrum elsewhere.
Right shift on $\ell^2$ : $\sigma(S) = \overline{\mathbb{D}}$ , point spectrum empty, residual spectrum the open disk $\mathbb{D}$ , continuous spectrum the unit circle. The residual-spectrum mass on $\mathbb{D}$ goes away when we take the adjoint $S^*$ (left shift): then $\mathbb{D}$ becomes point spectrum.
Discrete Laplacian on $\ell^2(\mathbb{Z})$ : $(\Delta x)_n = x_{n+1} + x_{n-1} - 2 x_n$ . Via the Fourier transform $\ell^2(\mathbb{Z}) \cong L^2([-\pi, \pi])$ , this is multiplication by $2\cos\theta - 2$ for $\theta \in [-\pi, \pi]$ . So the spectrum is $[-4, 0]$ , all continuous, no eigenvalues. The picture matches what physicists call the “tight-binding” model band structure.
Continuous Laplacian $\Delta$ on $L^2(\mathbb{R})$ : unbounded, but instructive: spectrum $(-\infty, 0]$ via Fourier transform.
Volterra integral operator $(V f)(x) = \int_0^x f(y) \, dy$ on $L^2[0, 1]$ : compact, $\sigma(V) = \{0\}$. As an aside, the $n$-th iterate has kernel $(x-y)^{n-1}/(n-1)!$ on $y \leq x$ and satisfies $\|V^n\| \leq 1/n!$ (the bound is not an equality: already $\|V\| = 2/\pi$), so $\|V^n\|^{1/n} \to 0$ confirms $r(V) = 0$ via the spectral radius formula.
Toeplitz operator $T_g$ with continuous symbol $g$ on the unit circle, acting on the Hardy space $H^2$ : spectrum is the curve $g(\mathbb{T})$ together with all points around which the curve has nonzero winding number (this is a remarkable result of Brown and Halmos, 1964). Toeplitz operators are essentially “Fourier multipliers acting on positive frequencies,” and their spectral theory is a major subject in operator theory and harmonic analysis.
Hardy-Hilbert operator $(Hf)(x) = \int_0^\infty f(y)/(x+y)\,dy$ on $L^2(0, \infty)$ : bounded with operator norm exactly $\pi$ (Hilbert’s inequality), with continuous spectrum $[0, \pi]$ that one can compute explicitly via the Mellin transform.
These examples are not just trivia. They are the building blocks of intuition: when faced with a new operator, one should ask which of these it resembles. Most operators in practice are perturbations of these models, or combinations.
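The Volterra entry is easy to probe numerically. The sketch below uses an assumed discretization (weight $h$ below the diagonal, $h/2$ on it) of $(Vf)(x) = \int_0^x f(y)\,dy$; the matrix norms are approximations to the operator norms, and the grid size `N = 400` is arbitrary.

```python
import numpy as np
from math import factorial

N = 400
h = 1.0 / N
# Discretized Volterra operator: full weight h below the diagonal, half weight on it.
V = h * (np.tril(np.ones((N, N)), k=-1) + 0.5 * np.eye(N))

# Spectral norms of the first six powers.
norms = [np.linalg.norm(np.linalg.matrix_power(V, n), 2) for n in range(1, 7)]

for n, norm_n in enumerate(norms, start=1):
    print(n, norm_n, norm_n ** (1.0 / n), 1.0 / factorial(n))
```

The first norm comes out close to $2/\pi \approx 0.637$, each norm stays below $1/n!$, and the roots $\|V^n\|^{1/n}$ drift toward $0$, as quasinilpotence predicts.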
Worked Numerical Example#
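Take the $4 \times 4$ Dirichlet truncation of $-\Delta$, the negative of the discrete Laplacian from the catalog restricted to four sites:
$$ \Delta_4 = \begin{pmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 2 \end{pmatrix}. $$
Its eigenvalues are $\lambda_k = 2 - 2\cos(k\pi/5)$ for $k = 1, \ldots, 4$:
$$ \lambda_1 = 2 - 2\cos(\pi/5) \approx 2 - 1.6180 = 0.3820, $$
$$ \lambda_2 = 2 - 2\cos(2\pi/5) \approx 2 - 0.6180 = 1.3820, $$
$$ \lambda_3 = 2 - 2\cos(3\pi/5) \approx 2 + 0.6180 = 2.6180, $$
$$ \lambda_4 = 2 - 2\cos(4\pi/5) \approx 2 + 1.6180 = 3.6180. $$
All four eigenvalues sit strictly inside $(0, 4)$. For the $N \times N$ version the eigenvalues are $\lambda_k = 2 - 2\cos(k\pi/(N+1))$, and as $N$ increases the set $\{\lambda_k\}_{k=1}^N$ becomes dense in $[0, 4]$. For $N=100$, the smallest eigenvalue is $2 - 2\cos(\pi/101) \approx 0.00097$, and the largest is $\approx 3.9990$. The eigenvalues fill the interval $[0, 4]$, but not uniformly: the angles $k\pi/(N+1)$ are equally spaced, so the eigenvalues themselves follow the arcsine density $1/(\pi\sqrt{\lambda(4-\lambda)})$ and crowd toward the endpoints. This matches the catalog entry: the infinite discrete Laplacian on $\ell^2(\mathbb{Z})$ has spectrum $[-4, 0]$ (the sign flips because $\Delta_4$ discretizes $-\Delta$), all continuous. The finite eigenvalues are not noise around the continuous spectrum; they sample the symbol $2 - 2\cos\theta$ at equally spaced angles, a Riemann-sum sampling of the continuous spectral picture.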
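These eigenvalues are easy to confirm numerically. The sketch below builds the tridiagonal matrix for any `N` and compares `numpy.linalg.eigvalsh` against the closed form $2 - 2\cos(k\pi/(N+1))$; the helper names are illustrative.

```python
import numpy as np

def dirichlet_laplacian(N):
    """N x N tridiagonal matrix with 2 on the diagonal and -1 next to it."""
    return 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)

def exact_eigenvalues(N):
    """Closed-form eigenvalues 2 - 2 cos(k pi / (N + 1)), in ascending order."""
    k = np.arange(1, N + 1)
    return 2.0 - 2.0 * np.cos(k * np.pi / (N + 1))

for N in (4, 100):
    computed = np.linalg.eigvalsh(dirichlet_laplacian(N))
    print(N, computed.min(), computed.max(),
          np.allclose(computed, exact_eigenvalues(N)))
```

A histogram of `exact_eigenvalues(1000)` shows the crowding near $0$ and $4$ that the arcsine density predicts.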
Spectral Theory in Numerical Linear Algebra#
A practical aside. The whole apparatus of spectral theory has direct counterparts in numerical linear algebra. The QR algorithm computes eigenvalues by iterating a shifted similarity transformation; the underlying convergence proof uses the spectral mapping theorem and rate estimates from the spectral gap. The Lanczos algorithm computes eigenvalues of large symmetric matrices by building a Krylov subspace and exploiting orthogonality; the analysis uses the Rayleigh quotient and Courant-Fischer min-max. ARPACK, the standard library for large sparse eigenvalue problems, implements the implicitly restarted Arnoldi method (which reduces to an implicitly restarted Lanczos method for symmetric matrices), typically combined with shift-and-invert transformations justified by spectral mapping.
When one studies operator spectra and computes them numerically, the same structural theorems govern both. The error analysis of finite-dimensional approximations to infinite-dimensional spectral problems is the subject of spectral approximation theory (Chatelin, Anselone), and it is one of the cleanest applications of operator theory to scientific computing. The take-home message: there is no firewall between operator theory and numerical analysis. The same theorems are used on both sides; only the implementation details differ.
Why This Matters: Quantum Observables#
In quantum mechanics, observables (energy, momentum, position) are self-adjoint operators on a Hilbert space. The spectrum of an observable is exactly the set of possible measurement outcomes. The spectral measure $E$ encodes the probability distribution of outcomes: in state $\psi$ , the probability of measuring an outcome in Borel set $B$ is $\langle E(B) \psi, \psi \rangle$ . This is not an analogy — it is the literal mathematical foundation of quantum mechanics, formulated by von Neumann in 1932.
The mystery of why eigenvalues of an operator should correspond to physical measurement outcomes has a structural answer: the mathematical structure of “self-adjoint operator on a Hilbert space” was reverse-engineered from the empirical observation that physical observables have real-valued outcomes with definite probabilities. Spectral theory is, in this sense, a piece of physics formulated in mathematical language. The continuous functional calculus tells you how to compute $f(\hat H)$ , where $\hat H$ is the Hamiltonian, and that includes $e^{-it\hat H}$ — the time evolution operator. Spectral theory is what makes the Schrödinger equation actually solvable in any nontrivial sense.
Article 12 returns to this in detail. For now: spectral theory is the linear-algebraic infrastructure of quantum mechanics, and it is also the right setting for a vast amount of PDE.
The Spectral Mapping Theorem and Its Uses#
A small but useful statement: $\sigma(f(T)) = f(\sigma(T))$ for any continuous $f$ on $\sigma(T)$. So the spectrum of $T^2$ is $\{\lambda^2 : \lambda \in \sigma(T)\}$, the spectrum of $e^T$ is $\{e^\lambda : \lambda \in \sigma(T)\}$, and so on. The polynomial case is direct: factor $f(z) - \mu = c \prod (z - \lambda_j)$ with $c \neq 0$; then $f(T) - \mu I = c \prod (T - \lambda_j I)$ is a product of commuting factors, invertible iff every factor is, so $\mu \in \sigma(f(T))$ iff some root $\lambda_j$ lies in $\sigma(T)$, that is, iff $\mu \in f(\sigma(T))$. The continuous case follows by approximation.
This is what lets us compute spectra of operators built from $T$ via algebraic operations or functional calculus. If $T \geq 0$ (positive self-adjoint), then $T^{1/2}$ is well-defined and self-adjoint, with $\sigma(T^{1/2}) = \sqrt{\sigma(T)}$ . If $T = T^*$ with spectrum in $[m, M]$ , then $(T - mI)/(M - m)$ has spectrum in $[0, 1]$ , normalizing the operator.
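A quick numerical check of both statements, under the assumption of a small hypothetical symmetric matrix and the polynomial $p(z) = z^2 - 4z + 1$ (both chosen only for illustration):

```python
import numpy as np

T = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])          # hypothetical symmetric example

def p(z):
    return z**2 - 4.0 * z + 1.0

spec_T = np.linalg.eigvalsh(T)
pT = T @ T - 4.0 * T + np.eye(3)         # p(T), built from matrix products
spec_pT = np.linalg.eigvalsh(pT)

print(np.sort(p(spec_T)), spec_pT)       # the two lists should agree

# Normalization: (T - m I)/(M - m) has spectrum in [0, 1].
m, M = spec_T.min(), spec_T.max()
spec_normalized = np.linalg.eigvalsh((T - m * np.eye(3)) / (M - m))
print(spec_normalized)
```

Here $p$ maps two different eigenvalues of $T$ to the same value, so $p(T)$ has a repeated eigenvalue: the spectral mapping theorem is a statement about sets, and multiplicities can merge.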
Why “Spectrum”? A Historical Aside#
The word “spectrum” was Hilbert’s coinage. In his 1906 lectures on integral equations, he noticed that the eigenvalues of certain symmetric integral kernels formed a “spectrum” reminiscent of the discrete lines of atomic emission spectra. The physical spectrum and the operator spectrum then evolved together: by 1925, when Heisenberg formulated matrix mechanics, the eigenvalues of energy operators were quite literally the observed spectral lines of atoms. The mathematical name for the structure preceded the physical interpretation by two decades, but the two became indistinguishable. “Spectrum” is one of the rare cases where a mathematical term and its physical referent are not just analogous but historically continuous.
Spectral Projections and the Riesz Functional Calculus#
Before unbounded operators, one more tool: the Riesz functional calculus for general bounded operators (not necessarily self-adjoint). For $T \in B(X)$ on a Banach space and $f$ holomorphic on a neighborhood of $\sigma(T)$ , one defines
$$ f(T) = \frac{1}{2\pi i} \oint_\Gamma f(\lambda) R(\lambda; T) \, d\lambda, $$where $\Gamma$ is a contour enclosing $\sigma(T)$ in the domain of $f$ . The integral is operator-valued and converges by the analyticity of the resolvent.
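In particular, taking $f = \mathbf{1}_U$ for a clopen subset $U \subset \sigma(T)$ (i.e., $U$ and $\sigma(T) \setminus U$ are both closed, so $U$ is separated from the rest of the spectrum by a positive distance; an isolated eigenvalue is the typical case), one gets a Riesz spectral projection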
$$ P_U = \frac{1}{2\pi i} \oint_{\Gamma_U} R(\lambda; T) \, d\lambda, $$where $\Gamma_U$ encloses only the part of the spectrum in $U$ . The projection $P_U$ commutes with $T$ and decomposes $X$ into the direct sum of the $T$ -invariant subspaces $\text{range}(P_U)$ and $\text{range}(I - P_U)$ , on each of which the spectrum of $T$ is reduced to either $U$ or $\sigma(T) \setminus U$ . This is the operator-theoretic version of “isolating an eigenvalue” or “splitting off an invariant subspace.”
For self-adjoint operators on a Hilbert space, the Riesz functional calculus and the continuous (or Borel) functional calculus agree where they overlap; the Borel calculus extends further, since we can apply non-holomorphic functions like indicators of arbitrary Borel sets. For general operators, the Riesz calculus is the strongest tool available, and it is what enables results like the Jordan canonical form for compact operators and the structure theory of operator semigroups.
A small worked example. Consider the matrix $A = \text{diag}(1, 2, 3, 4) + \varepsilon N$ , where $N$ is some nilpotent perturbation and $\varepsilon$ is small. The Riesz projection $P_1$ associated with the spectral component near $\lambda = 1$ is approximately $\text{diag}(1, 0, 0, 0)$ for small $\varepsilon$ , perturbed by an $O(\varepsilon)$ correction computable by the contour integral. This is how perturbation theory in quantum mechanics (Rayleigh-Schrödinger) is rigorously set up — the spectral projections of the unperturbed operator are deformed analytically as the perturbation is turned on, and the projections track the eigenvalues continuously as long as no level crossings occur.
Worked Numerical Example#
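Take $A = \text{diag}(1, 5, 10)$ on $\mathbb{C}^3$ and let $\Gamma$ be a small circle around $\lambda = 1$ that encloses neither $5$ nor $10$. The resolvent is diagonal:

$$ R(\lambda; A) = \text{diag}\left(\frac{1}{\lambda-1}, \frac{1}{\lambda-5}, \frac{1}{\lambda-10}\right). $$

Integrating entry by entry gives the Riesz projection

$$ P = \frac{1}{2\pi i} \oint_\Gamma R(\lambda; A) \, d\lambda = \text{diag}\left( \frac{1}{2\pi i}\oint_\Gamma \frac{d\lambda}{\lambda-1}, \frac{1}{2\pi i}\oint_\Gamma \frac{d\lambda}{\lambda-5}, \frac{1}{2\pi i}\oint_\Gamma \frac{d\lambda}{\lambda-10} \right). $$

By the residue theorem the first integral equals $1$ (its pole lies inside $\Gamma$) and the other two vanish (their poles lie outside), so

$$ P = \text{diag}(1, 0, 0). $$

Applying $P$ to any vector $(x, y, z)$ yields $(x, 0, 0)$, the component in the eigenspace for $\lambda=1$. The computation requires no eigenvector solving; it extracts the invariant subspace purely from contour integration of the resolvent. If we perturb $A$ to $A_\varepsilon = A + \varepsilon \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$, the matrix stays upper triangular, so its eigenvalues remain $1, 5, 10$ and the same contour still yields a rank-1 projection for every $\varepsilon$. Its range is still spanned by $(1, 0, 0)$, but the projection is now oblique: $P_\varepsilon = \begin{pmatrix} 1 & -\varepsilon/4 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$, which depends analytically on $\varepsilon$. For a perturbation that moves the eigenvalues, the same conclusion holds as long as none of them crosses $\Gamma$.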
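The contour integral in this example can be approximated numerically. The sketch below assumes $A = \text{diag}(1, 5, 10)$ (read off from the resolvent above), uses a circle of radius 2 around $\lambda = 1$, and picks $\varepsilon = 0.3$ arbitrarily. The function name `riesz_projection` is hypothetical, and the comparison matrix `expected` comes from a hand computation with the left eigenvector $(1, -\varepsilon/4, 0)$.

```python
import numpy as np

def riesz_projection(A, center, radius, nodes=200):
    """Trapezoid-rule approximation of (1/(2*pi*i)) * oint (lam*I - A)^{-1} d(lam)
    over the circle |lam - center| = radius."""
    n = A.shape[0]
    I = np.eye(n)
    P = np.zeros((n, n), dtype=complex)
    for k in range(nodes):
        w = radius * np.exp(2j * np.pi * k / nodes)  # w = lam - center
        P += w * np.linalg.inv((center + w) * I - A)
    return P / nodes

eps = 0.3                      # arbitrary illustrative perturbation size
A = np.diag([1.0, 5.0, 10.0])
A_eps = A.copy()
A_eps[0, 1] = eps              # the upper-triangular perturbation from the text

P0 = riesz_projection(A, center=1.0, radius=2.0)
P_eps = riesz_projection(A_eps, center=1.0, radius=2.0)

# Hand-computed oblique projection for the perturbed matrix.
expected = np.zeros((3, 3))
expected[0, 0] = 1.0
expected[0, 1] = -eps / 4.0

print(np.round(P0.real, 6))
print(np.round(P_eps.real, 6))
```

Both outputs should be idempotent with trace one; only the perturbed one has a nonzero off-diagonal entry, which is the tilt of the projection.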
A Detour Through Spectral Theory of Normal Operators#
Self-adjoint operators are a special case of normal operators: $T$ is normal if $T T^* = T^* T$ . Unitary operators are normal ($U U^* = U^* U = I$ ); self-adjoint operators are trivially normal; positive operators are normal. The spectral theorem extends without serious modification to bounded normal operators: there is a projection-valued measure $E$ on the (now possibly complex) spectrum, with $T = \int \lambda \, dE(\lambda)$ , and a continuous functional calculus $f \mapsto f(T)$ for $f \in C(\sigma(T))$ .
The most useful corollary: every unitary operator $U$ on a Hilbert space is unitarily equivalent to multiplication by a function of modulus one on some $L^2$ space (in the simplest, cyclic case, multiplication by $e^{i\theta}$ on $L^2$ of the circle with respect to the spectral measure). Specifically, $U$ has spectrum on the unit circle, and the spectral measure on the circle gives a model. This is the right setting for thinking about Fourier transforms, lattice translations on $\ell^2(\mathbb{Z})$, and time evolution in quantum mechanics: all of these are unitary operators, and all of them are diagonalized by the spectral theorem for normal operators.
Why is this worth flagging? Because the most useful operators in physics are typically either self-adjoint (observables) or unitary (symmetries, time evolution), and both fall under the normal-operator spectral theorem. The non-self-adjoint, non-normal operators are computationally useful (transfer operators, dissipative semigroup generators) but their spectral theory is fundamentally messier — defective eigenvectors, generalized eigenspaces, Jordan structure in infinite dimensions. The normal case is where the spectral theory is at its cleanest.
A small numerical example. Take the unitary operator $U: \ell^2(\mathbb{Z}/N\mathbb{Z}) \to \ell^2(\mathbb{Z}/N\mathbb{Z})$ given by the cyclic shift $(Ux)_n = x_{n-1}$. Conjugating by the discrete Fourier transform turns $U$ into multiplication by $e^{-2\pi i k/N}$, $k = 0, 1, \ldots, N-1$: the Fourier mode $v_k(n) = e^{2\pi i k n/N}$ satisfies $U v_k = e^{-2\pi i k/N} v_k$. The eigenvalues of $U$ are therefore precisely the $N$-th roots of unity. The Fourier transform diagonalizes the shift, and that is the cleanest possible illustration of the spectral theorem for normal operators in finite dimensions. The infinite-dimensional case (continuous Fourier transform, Pontryagin duality) is a vast generalization but the geometric picture is the same.
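The diagonalization of the shift can be verified in a few lines. This is a sketch with $N = 8$ chosen arbitrarily; it builds the unitary DFT matrix by hand, with the sign convention $F_{kn} = e^{-2\pi i k n/N}/\sqrt{N}$ as an assumption, rather than relying on any particular FFT library convention.

```python
import numpy as np

N = 8  # arbitrary illustrative size

# Cyclic shift (U x)_n = x_{n-1}: row n of U picks out coordinate n-1 (mod N).
U = np.roll(np.eye(N), 1, axis=0)

# Unitary DFT matrix with the (assumed) convention F[k, n] = exp(-2*pi*i*k*n/N)/sqrt(N).
idx = np.arange(N)
F = np.exp(-2j * np.pi * np.outer(idx, idx) / N) / np.sqrt(N)

# Conjugating the shift by F should give a diagonal matrix of N-th roots of unity.
D = F @ U @ F.conj().T
expected = np.diag(np.exp(-2j * np.pi * idx / N))

eigs = np.linalg.eigvals(U)
print(np.round(np.diag(D), 6))
```

The diagonal of `D` lists the roots of unity in the order fixed by the DFT convention; a different sign convention would list the same set in the opposite order.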
A Reading-Order Note#
For someone learning this material for the first time, I recommend the following order: (1) the multiplication-operator form of the spectral theorem, because it is concrete; (2) the spectral measure form, as a refinement; (3) the continuous functional calculus, working examples until the algebraic identities become familiar; (4) the spectral mapping theorem, which then becomes a corollary; (5) Riesz projections, only when needed. Reed and Simon (Volume I) handle this in roughly this order. Rudin (Functional Analysis) goes through the spectral theorem in a more abstract form via Gelfand-Naimark, which is conceptually elegant but slower to reach concrete operators. The two routes meet eventually, but it is worth knowing both.
The single insight that took me longest to absorb was that “$T$ has continuous spectrum at $\lambda$” does not mean “$\lambda$ is an eigenvalue of a slightly perturbed operator.” It means something stronger: there are unit vectors $x_n$ with $(T - \lambda) x_n \to 0$ but no convergent subsequence of $x_n$. These approximate eigenvectors, or Weyl sequences, are the right replacement for eigenvectors in the continuous-spectrum case. Multiplication by $x$ on $L^2[0, 1]$ has Weyl sequences at every $\lambda \in [0, 1]$: take $x_n = \sqrt{n} \mathbf{1}_{[\lambda - 1/(2n), \lambda + 1/(2n)] \cap [0, 1]}$ (at the endpoints $\lambda = 0$ and $\lambda = 1$ half of the interval is cut off, so rescale by $\sqrt{2}$). For large $n$ they are unit vectors, $(M - \lambda) x_n \to 0$ in $L^2$, but no subsequence converges. The operator is “almost diagonalized” near $\lambda$ but not actually diagonalized, and that is exactly what continuous spectrum captures.
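The three claims about this Weyl sequence can be checked by direct integration. The computation below is a sketch for an interior point $\lambda \in (0, 1)$ and for $n \le m$ large enough that both intervals lie inside $[0, 1]$; these restrictions are assumptions made to keep the formulas clean.

$$ \|x_n\|_2^2 = n \cdot \frac{1}{n} = 1, \qquad \|(M - \lambda) x_n\|_2^2 = n \int_{-1/(2n)}^{1/(2n)} u^2 \, du = \frac{1}{12 n^2} \to 0. $$

$$ \langle x_n, x_m \rangle = \sqrt{nm} \cdot \frac{1}{m} = \sqrt{\frac{n}{m}}, \qquad \|x_n - x_m\|_2^2 = 2 - 2\sqrt{\frac{n}{m}}. $$

Fixing $n$ and letting $m \to \infty$ sends $\|x_n - x_m\|_2^2$ to $2$, so no subsequence is Cauchy, and hence none converges in $L^2$.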
Counterexample: Why the Definition Cannot Be Weakened#
The article states that for any bounded operator on a complex Banach space, the spectrum is non-empty. The proof relies on Liouville’s theorem applied to the resolvent, which requires the underlying field to be $\mathbb{C}$ . If we weaken the hypothesis to a real Banach space, the theorem collapses completely.
Take rotation by $90^\circ$ on $\mathbb{R}^2$:

$$ J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. $$

For real $\lambda$, the matrix $\lambda I - J$ is

$$ \begin{pmatrix} \lambda & 1 \\ -1 & \lambda \end{pmatrix}, $$

whose determinant $\lambda^2 + 1$ is strictly positive, so it is always invertible:

$$ (\lambda I - J)^{-1} = \frac{1}{\lambda^2 + 1} \begin{pmatrix} \lambda & -1 \\ 1 & \lambda \end{pmatrix}. $$

The operator norm of this resolvent is $\|R(\lambda; J)\| = 1/\sqrt{\lambda^2+1}$, which is bounded and smooth on all of $\mathbb{R}$. There is no real $\lambda$ where invertibility fails. The spectrum of $J$ over $\mathbb{R}$ is the empty set.
This breaks the foundational link between spectral theory and complex analysis. The resolvent $R(\lambda; J)$ is a real-analytic function on all of $\mathbb{R}$ that vanishes at infinity, yet it is not identically zero. Liouville’s theorem has no real-variable counterpart (already $1/(1 + \lambda^2)$ is bounded, real-analytic, and non-constant): $\mathbb{R}$ is not algebraically closed and lacks the Cauchy integral machinery that forces singularities. The moment we complexify the space to $\mathbb{C}^2$, the eigenvalues $\pm i$ appear, the spectrum becomes $\{i, -i\}$, and the resolvent develops poles. The non-emptiness of the spectrum is not a generic linear-algebra fact; it is a complex-analytic constraint. Working over $\mathbb{R}$ strips spectral theory of its predictive power, which is why functional analysis defaults to complex scalars even when the original problem is real-valued.
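A quick numerical check of the counterexample, as a sketch: it samples the resolvent norm of the $90^\circ$ rotation matrix on a real grid (the grid range is an arbitrary choice) and then computes the eigenvalues over $\mathbb{C}$.

```python
import numpy as np

J = np.array([[0.0, -1.0], [1.0, 0.0]])  # rotation by 90 degrees
I2 = np.eye(2)

def resolvent_norm(lam):
    """Operator 2-norm of (lam*I - J)^{-1} for a real scalar lam."""
    return np.linalg.norm(np.linalg.inv(lam * I2 - J), 2)

# Sample real lambda on an arbitrary illustrative grid.
lams = np.linspace(-10.0, 10.0, 401)
norms = np.array([resolvent_norm(lam) for lam in lams])
predicted = 1.0 / np.sqrt(lams**2 + 1.0)

# Over the complex numbers the eigenvalues reappear.
complex_eigs = np.linalg.eigvals(J)

print("largest resolvent norm on the real grid:", norms.max())
print("eigenvalues over C:", complex_eigs)
```

The resolvent norm never exceeds 1 on the real line, so no real $\lambda$ is even close to a spectral value; the obstruction only becomes visible at $\pm i$.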
Why I Care#
I first internalized the distinction between spectrum and resolvent norm during a graduate numerical PDE course. I was implementing a Crank-Nicolson scheme for a 1D advection-diffusion equation with a skewed upwind discretization. The spatial discretization produced a $200 \times 200$ matrix $A$ . I checked the eigenvalues: all had strictly negative real parts, clustered near $-0.5$ . By the spectral mapping theorem, $\sigma(e^{\Delta t A})$ should lie inside the unit disk for any $\Delta t > 0$ . I set $\Delta t = 0.05$ and ran the simulation. The solution blew up after 120 steps.
I assumed a coding error. I rewrote the time-stepper three times. The blow-up persisted. I finally computed the resolvent norm $\|R(\lambda; A)\|$ on a grid in the complex plane. The eigenvalues were safely in the left half-plane, but the resolvent norm formed a massive ridge extending far into the right half-plane, reaching values above $10^4$ near $\lambda = 0.2 + 1.5i$ . The matrix was highly non-normal. The spectral radius predicted asymptotic decay, but the resolvent norm dictated transient growth. The inequality $\|e^{tA}\| \leq e^{t \omega(A)}$ (where $\omega(A)$ is the numerical abscissa) was the actual stability constraint, not the spectral bound. I reduced $\Delta t$ to $0.005$ and switched to a scheme respecting the numerical range. The simulation stabilized immediately.
That numerical disaster killed my habit of equating spectrum with stability. The spectrum tells you what happens at $t = \infty$ . The resolvent tells you what happens at $t = 10$ . In non-normal systems, the gap between the two is where the solution lives or dies. Spectral theory gave me the language to diagnose the transient violence; the resolvent norm was the diagnostic tool.
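The gap between spectral decay and transient growth already shows up in a $2 \times 2$ toy matrix. The sketch below is not the $200 \times 200$ discretization from the story: the matrix `A`, the time grid, and the helper `expm_diagonalizable` are illustrative assumptions. The helper computes $e^{tA}$ by eigendecomposition, which is valid here only because this `A` has distinct eigenvalues.

```python
import numpy as np

# Toy non-normal matrix: eigenvalues -1 and -2, large off-diagonal coupling.
A = np.array([[-1.0, 50.0], [0.0, -2.0]])

# Spectral abscissa: largest real part of an eigenvalue (governs t -> infinity).
spectral_abscissa = np.max(np.linalg.eigvals(A).real)

# Numerical abscissa: largest eigenvalue of the symmetric part (governs t -> 0+).
numerical_abscissa = np.max(np.linalg.eigvalsh((A + A.T) / 2.0))

def expm_diagonalizable(A, t):
    """exp(t*A) via eigendecomposition; assumes A is diagonalizable."""
    vals, vecs = np.linalg.eig(A)
    return (vecs * np.exp(t * vals)) @ np.linalg.inv(vecs)

ts = np.linspace(0.0, 10.0, 1001)
growth = np.array([np.linalg.norm(expm_diagonalizable(A, t), 2) for t in ts])
peak = growth.max()

print("spectral abscissa:", spectral_abscissa)
print("numerical abscissa:", numerical_abscissa)
print("peak of ||exp(tA)||:", peak, "at t =", ts[growth.argmax()])
```

The norm starts at 1, climbs by more than an order of magnitude, and only then decays at the rate the eigenvalues promise. The bound $\|e^{tA}\| \leq e^{t\omega(A)}$ holds throughout, while the spectral abscissa alone says nothing about the hump.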
Common Pitfall#
Beginners routinely assume that if $\lambda \in \sigma(T)$ , then there exists a nonzero vector $x$ satisfying $Tx = \lambda x$ . This collapses the spectrum to the point spectrum and ignores the continuous and residual parts entirely. The multiplication operator $M_x$ on $L^2[0, 1]$ demolishes this belief.
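For $\lambda \notin [0, 1]$ the resolvent is multiplication by $1/(\lambda - x)$, so

$$ \|R(\lambda; M_x)\| = \left\| \frac{1}{x - \lambda} \right\|_{L^\infty[0, 1]} = \frac{1}{\text{dist}(\lambda, [0, 1])}. $$

The norm blows up as $\lambda$ approaches $[0, 1]$, so every point of $[0, 1]$ lies in the spectrum, yet none of them is an eigenvalue. At $\lambda = 0.5$, take $f_n = \sqrt{n} \, \mathbf{1}_{[0.5, 0.5 + 1/n]}$, which has unit norm in $L^2$. Then

$$ \|(M_x - 0.5)f_n\|_2^2 = \int_{0.5}^{0.5+1/n} n (x-0.5)^2 \, dx = \frac{1}{3n^2} \to 0. $$

The operator has approximate eigenvectors with unit norm, but no exact eigenvector. Continuous spectrum is detected by this approximation property, not by kernel nontriviality.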
What’s Next, and Why#
The bounded self-adjoint case is the cleanest scenario, but it leaves out almost everything that matters in physics. Differentiation operators, the Laplacian, the Schrödinger Hamiltonian: these are all unbounded, defined only on dense subdomains of $L^2$. The next article extends spectral theory to unbounded self-adjoint operators, using the closed graph theorem and a careful definition of self-adjointness via the adjoint operator $T^*$ and its domain.
The new technical complications are domains: an unbounded operator is a pair $(T, D(T))$ where $D(T)$ is a dense subspace and $T: D(T) \to H$ is linear. Self-adjointness is no longer a single equation $T = T^*$ but a domain equation $D(T) = D(T^*)$ together with $Tx = T^*x$ on the common domain. Symmetric operators (where $T \subset T^*$ ) need not be self-adjoint, and the Friedrichs extension and von Neumann deficiency-index theory come in to control which symmetric operators have self-adjoint extensions.
Once we have unbounded self-adjoint operators, the spectral theorem extends with minor modifications: there is still a projection-valued measure on the spectrum (now possibly unbounded as a subset of $\mathbb{R}$ ), and the operator is still $\int \lambda \, dE(\lambda)$ , with $D(T) = \{x : \int \lambda^2 \, d\langle E(\lambda) x, x\rangle < \infty\}$ . The functional calculus extends similarly. Everything in this article carries over, with care about domains.
The reward is that we can finally talk about Schrödinger operators, the heat semigroup, momentum and position observables, and the rest of mathematical physics. Domains are a small price. The conceptual lesson of this article — spectrum equals the obstruction to invertibility, and self-adjoint operators are unitarily equivalent to multiplication operators — survives the transition to unbounded operators with only minor edits. Once one has internalized this, the rest of operator-theoretic mathematical physics becomes accessible. The structure of the spectrum encodes the structure of the operator, and the functional calculus turns “applying $f$ to the operator” into a routine computation rather than a conceptual leap. In a sense everything we will do for the next four articles is variations on this theme: extending the calculus to more general operators, using it to write down explicit formulas for evolution equations, and reading off physical and analytic information from spectral data. The unifying viewpoint is that an operator’s spectrum, together with its spectral measure, contains all the structural information one would want; everything else is a specialization or computational consequence.
Specific Questions Ahead#
Bounded spectral theory is structurally complete, but it excludes the operators that actually generate dynamics. Differentiation, the Laplacian, and quantum Hamiltonians are unbounded. They are defined only on dense subspaces, and they have no finite operator norm. The next article extends the entire apparatus to this setting. You are now equipped for the transition because you already know how the resolvent encodes invertibility, how projection-valued measures decompose the space, and how the functional calculus turns scalar functions into operators. The unbounded case reuses these tools almost verbatim; what it adds is domain bookkeeping.
The next article answers four specific questions:
- How do we define the adjoint $T^*$ when $D(T) \subsetneq H$ , and why does $T \subset T^*$ (symmetry) fail to guarantee $T = T^*$ (self-adjointness)?
- What is a closed operator, and why does the closed graph theorem force every everywhere-defined self-adjoint operator to be bounded?
- How do we construct self-adjoint extensions for symmetric operators that are not essentially self-adjoint, and when is the extension unique?
- How does the spectral theorem change when $\sigma(T)$ is unbounded, and how do we define $e^{itT}$ rigorously for unbounded $T$ ?
The technical centerpiece will be the Hellinger-Toeplitz theorem, which proves that a symmetric operator defined on all of $H$ must be bounded. This theorem forces us to accept proper dense domains as a structural necessity, not a technical inconvenience. Once domains are handled correctly, the spectral theorem for unbounded self-adjoint operators follows with minimal modification: the projection-valued measure lives on an unbounded subset of $\mathbb{R}$ , and the domain of $T$ is exactly the set of vectors with finite second spectral moment. The functional calculus extends to unbounded Borel functions, and Stone’s theorem on one-parameter unitary groups emerges as a direct corollary. The payoff is immediate: we gain rigorous control over the Schrödinger equation, heat semigroups, and momentum operators. The bounded theory was the blueprint; the unbounded theory is the building.