
Ordinary Differential Equations (9): Chaos Theory and the Lorenz System
Deterministic yet unpredictable: the Lorenz system, butterfly effect, Lyapunov exponents, strange attractors, and the routes from order to chaos -- with Python simulations throughout.
In 1961, Edward Lorenz restarted a weather simulation from a rounded-off number — 0.506 instead of 0.506127. Within simulated weeks the forecast was unrecognisable. That single accident gave us the butterfly effect and turned chaos from a metaphor into a science. The lesson is profound and sober: equations that are exactly deterministic can still be practically unpredictable.

What You Will Learn#
- The four conditions that together define chaos
- The Lorenz system: paradigm of deterministic chaos
- Butterfly effect, visualised on the attractor itself
- Lyapunov exponents: numerical fingerprint of chaos
- Bifurcation cascades and the period-doubling route to chaos
- Other chaotic systems: Rossler and the double pendulum
- Strange attractors, fractal dimension, stretching-and-folding
- Applications: weather, encryption, controlling chaos, ensemble forecasting
Prerequisites#
- Chapter 8: nonlinear systems, phase portraits, limit cycles
- Chapter 7: stability and bifurcation basics
- Comfort with 3D visualization
What Is Chaos?#
A chaotic system satisfies all four of:
- Deterministic — governed by exact equations, no randomness
- Sensitive to initial conditions — tiny differences grow exponentially
- Bounded — trajectories stay in a finite region
- Aperiodic — they never repeat exactly
| Property | Random Process | Chaotic System |
|---|---|---|
| Equations | Contain noise terms | Completely deterministic |
| Short-term prediction | Statistical only | Precisely predictable |
| Long-term prediction | Statistical regularities | Pointwise unpredictable; only statistics remain predictable |
| Source of complexity | External noise | Intrinsic dynamics |
The deep insight: very simple equations can produce infinitely complex behaviour. Lorenz showed it with three.
The Lorenz System#
$$\dot x = \sigma(y - x), \quad \dot y = x(\rho - z) - y, \quad \dot z = xy - \beta z$$
- $x$: convection intensity
- $y$: horizontal temperature difference
- $z$: vertical temperature deviation
- Classic parameters: $\sigma = 10,\ \rho = 28,\ \beta = 8/3$
Where the three numbers come from. Each parameter is a dimensionless ratio that survives Lorenz’s truncation of the full Boussinesq convection equations:
- $\sigma = \nu/\kappa$ is the Prandtl number: kinematic viscosity over thermal diffusivity. For air it sits near $0.7$, for water near $7$; Lorenz picked $\sigma = 10$ as a convenient round value that exhibits the chaotic regime. It appears only in the $\dot x$ equation, where it is the sole parameter.
- $\rho = \mathrm{Ra}/\mathrm{Ra}_c$ is the normalised Rayleigh number: buoyant driving divided by dissipation, scaled so that convection turns on at $\rho = 1$. At $\rho = 28$ the system is well past the convective threshold and just past the second instability ($\rho \approx 24.74$), where $C_\pm$ lose stability via a subcritical Hopf bifurcation, which is why the chaos is sustained rather than transient.
- $\beta = 8/3$ comes from the geometry of the convection rolls, specifically the horizontal-to-vertical wavenumber ratio of the most unstable Fourier mode in a thin layer. It is forced on you, not chosen.
So the canonical $(\sigma, \rho, \beta) = (10,\ 28,\ 8/3)$ is not arbitrary: $\beta$ is geometry, $\rho$ places you in the chaotic window, and $\sigma$ is a fluid choice that happens to make the window wide. Vary any of them and the topology of the attractor changes — for $\rho \lesssim 24.06$ the system has stable equilibria and transient chaos; far above $\rho = 28$ it eventually re-stabilises onto periodic orbits before chaos returns.
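To make the equations concrete, here is a minimal sketch that integrates the Lorenz system at the canonical parameters. The hand-written RK4 stepper, the step size `dt = 0.01`, and the helper names `lorenz`, `rk4_step` and `integrate` are assumptions chosen so the block runs with the standard library alone; in practice an adaptive integrator such as `scipy.integrate.solve_ivp` would be the usual choice.

```python
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step for an autonomous system."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(f, state, dt, n_steps):
    """Return every state visited, including the initial one."""
    path = [state]
    for _ in range(n_steps):
        state = rk4_step(f, state, dt)
        path.append(state)
    return path


# 50 time units starting from (1, 1, 1)
path = integrate(lorenz, (1.0, 1.0, 1.0), dt=0.01, n_steps=5000)
xs, ys, zs = zip(*path)
print(f"x range: [{min(xs):.1f}, {max(xs):.1f}]")
print(f"z range: [{min(zs):.1f}, {max(zs):.1f}]")
```

The printed ranges show the two properties to look for: the trajectory stays bounded, and $x$ changes sign as the orbit hops between the two lobes.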
The strange attractor#

Three signatures of “strangeness”:
- Fractal structure. The Hausdorff dimension is $\approx 2.06$ — thicker than a surface, thinner than a volume.
- Aperiodic. Infinite trajectory length confined to a finite volume.
- No self-intersection. Uniqueness of ODE solutions forbids two trajectories from passing through the same point of phase space, so the layers of the attractor never cross.
The Butterfly Effect, Visualised#

Two trajectories that start a ten-billionth apart — $[1, 1, 1]$ and $[1 + 10^{-10}, 1, 1]$ — diverge exponentially until the difference is system-scale.

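If an initial error $\varepsilon_0$ grows like $\varepsilon_0 e^{\lambda t}$, it reaches the system scale $L$ after $T \approx \lambda^{-1}\ln(L/\varepsilon_0)$. For the atmosphere $\lambda \approx 1/\text{day}$ and $\ln(L/\varepsilon_0) \approx 15$, giving $T \approx 15$ days. Better models cannot push past this, and better measurements help only logarithmically: cutting $\varepsilon_0$ by a factor of ten adds just $\ln 10/\lambda \approx 2.3$ days.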
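The divergence of the two starts $[1, 1, 1]$ and $[1 + 10^{-10}, 1, 1]$ can be checked numerically. The sketch below is illustrative only: the fixed-step RK4 integrator, the threshold of $1$ for "system scale", and the helper name `first_crossing` are assumptions, and the estimate uses the horizon formula with $L = 1$ and $\lambda \approx 0.9$.

```python
import math


def lorenz(s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, s, dt):
    k1 = f(s)
    k2 = f(tuple(a + 0.5 * dt * k for a, k in zip(s, k1)))
    k3 = f(tuple(a + 0.5 * dt * k for a, k in zip(s, k2)))
    k4 = f(tuple(a + dt * k for a, k in zip(s, k3)))
    return tuple(a + dt / 6.0 * (p + 2 * q + 2 * r + w)
                 for a, p, q, r, w in zip(s, k1, k2, k3, k4))


def first_crossing(eps0=1e-10, threshold=1.0, dt=0.01, t_max=60.0):
    """Time at which two trajectories eps0 apart first differ by `threshold`."""
    a = (1.0, 1.0, 1.0)
    b = (1.0 + eps0, 1.0, 1.0)
    for step in range(1, int(round(t_max / dt)) + 1):
        a = rk4_step(lorenz, a, dt)
        b = rk4_step(lorenz, b, dt)
        if math.dist(a, b) >= threshold:
            return step * dt
    return None  # separation never reached the threshold within t_max


t_cross = first_crossing()
t_est = math.log(1.0 / 1e-10) / 0.9  # lambda^{-1} ln(L / eps0) with L = 1
print(f"simulated crossing time: {t_cross:.1f}")
print(f"horizon estimate:        {t_est:.1f}")
```

The two numbers need not match closely, because the growth rate along a particular orbit fluctuates around the long-time average $\lambda$; the point is that both are a few tens of time units, not thousands.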
Ensemble view#
A single trajectory shows you one possible outcome. An ensemble of slightly perturbed starts shows you the distribution.
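A minimal ensemble sketch, under stated assumptions: 20 members, Gaussian perturbations of size $10^{-4}$ on the $x$ coordinate only, a fixed-step RK4 integrator, and the spread measured as the standard deviation of $x$ across members. None of these choices come from an operational forecasting system; they only illustrate the idea.

```python
import random
import statistics


def lorenz(s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, s, dt):
    k1 = f(s)
    k2 = f(tuple(a + 0.5 * dt * k for a, k in zip(s, k1)))
    k3 = f(tuple(a + 0.5 * dt * k for a, k in zip(s, k2)))
    k4 = f(tuple(a + dt * k for a, k in zip(s, k3)))
    return tuple(a + dt / 6.0 * (p + 2 * q + 2 * r + w)
                 for a, p, q, r, w in zip(s, k1, k2, k3, k4))


def ensemble_spread(n_members=20, noise=1e-4, dt=0.01, t_end=25, seed=0):
    """Std of x across the ensemble, sampled once per time unit."""
    rng = random.Random(seed)
    members = [(1.0 + rng.gauss(0.0, noise), 1.0, 1.0) for _ in range(n_members)]
    steps_per_unit = int(round(1.0 / dt))
    spread = [statistics.pstdev([m[0] for m in members])]
    for _ in range(t_end):
        for _ in range(steps_per_unit):
            members = [rk4_step(lorenz, m, dt) for m in members]
        spread.append(statistics.pstdev([m[0] for m in members]))
    return spread


spread = ensemble_spread()
for t in (0, 5, 10, 15, 20, 25):
    print(f"t = {t:2d}   spread in x = {spread[t]:.3e}")
```

While the spread is small the ensemble mean is a usable forecast; once the spread is comparable to the size of the attractor, only the distribution carries information.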

Lyapunov Exponents: Quantifying Chaos#
$$ \lambda_1 \;=\; \lim_{t\to\infty}\frac{1}{t}\,\ln\frac{|\delta\mathbf{x}(t)|}{|\delta\mathbf{x}(0)|}. $$
| Sign | Behaviour |
|---|---|
| $\lambda_1 > 0$ | Chaos (exponential divergence) |
| $\lambda_1 = 0$ | Periodic or quasi-periodic |
| $\lambda_1 < 0$ | Asymptotically stable |
For Lorenz at the canonical parameters, the spectrum is approximately $\{0.91,\ 0,\ -14.57\}$.

Kaplan-Yorke (Lyapunov) dimension#
$$D_{KY} \;=\; 2 + \frac{\lambda_1 + \lambda_2}{|\lambda_3|} \;\approx\; 2 + \frac{0.91}{14.57} \;\approx\; 2.062.$$
The attractor is almost a surface, but with infinitely many fractal layers stacked together.
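The same arithmetic generalises to any spectrum: keep adding exponents, largest first, while the running sum stays non-negative, then interpolate into the next one. The helper name `kaplan_yorke` below is an assumption for illustration.

```python
def kaplan_yorke(exponents):
    """Kaplan-Yorke dimension from a Lyapunov spectrum (any order)."""
    lam = sorted(exponents, reverse=True)
    partial = 0.0
    for j, value in enumerate(lam):
        if partial + value < 0.0:
            # The first j exponents have a non-negative sum;
            # interpolate linearly into exponent number j + 1.
            return j + partial / abs(value)
        partial += value
    return float(len(lam))  # the sum never turns negative


print(round(kaplan_yorke([0.91, 0.0, -14.57]), 3))  # 2.062
```

For the Lorenz spectrum the loop stops at the third exponent, which reproduces $2 + 0.91/14.57$.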
Equilibria and the Route to Chaos#
Setting $\dot x = \dot y = \dot z = 0$ gives three equilibria:
- Origin $C_0 = (0,0,0)$ — stable for $\rho < 1$ , saddle for $\rho > 1$
- Symmetric pair $C_\pm = (\pm\sqrt{\beta(\rho-1)},\ \pm\sqrt{\beta(\rho-1)},\ \rho - 1)$ — born at $\rho = 1$
| $\rho$ | Behaviour |
|---|---|
| $< 1$ | Origin globally stable |
| $= 1$ | Pitchfork bifurcation: $C_\pm$ appear |
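| $1 < \rho < 13.93$ | $C_\pm$ are stable; trajectories settle onto one of them |
| $13.93 < \rho < 24.06$ | Transient chaos: erratic wandering before settling onto $C_\pm$ |
| $24.06 < \rho < 24.74$ | Strange attractor coexists with the still-stable $C_\pm$ |
| $\approx 24.74$ | Subcritical Hopf: $C_\pm$ lose stability |
| $> 24.74$ (e.g. $\rho = 28$) | Sustained chaos, with periodic windows at much larger $\rho$ |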
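The Hopf value $\rho \approx 24.74$ can be recovered without computing eigenvalues. Linearising at $C_\pm$ gives the characteristic polynomial $\lambda^3 + (\sigma + \beta + 1)\lambda^2 + \beta(\sigma + \rho)\lambda + 2\sigma\beta(\rho - 1)$, and the Routh-Hurwitz condition for a cubic (all coefficients positive and $a_2 a_1 > a_0$) fails exactly at $\rho_H = \sigma(\sigma + \beta + 3)/(\sigma - \beta - 1)$. The function names below are assumptions for illustration.

```python
def c_pm_is_stable(rho, sigma=10.0, beta=8.0 / 3.0):
    """Routh-Hurwitz test for the linearisation at C+ and C-."""
    a2 = sigma + beta + 1.0
    a1 = beta * (sigma + rho)
    a0 = 2.0 * sigma * beta * (rho - 1.0)
    return a2 > 0.0 and a0 > 0.0 and a2 * a1 > a0


def hopf_threshold(sigma=10.0, beta=8.0 / 3.0):
    """Value of rho where a2 * a1 == a0 (requires sigma > beta + 1)."""
    return sigma * (sigma + beta + 3.0) / (sigma - beta - 1.0)


print(f"rho_H = {hopf_threshold():.4f}")
for rho in (10.0, 24.0, 24.7, 24.8, 28.0):
    print(f"rho = {rho:5.1f}   C+/- stable: {c_pm_is_stable(rho)}")
```

With $\sigma = 10$ and $\beta = 8/3$ the threshold is $470/19$, which matches the table.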
The route from order to chaos shows up classically in the logistic map $x_{n+1} = r x_n (1 - x_n)$ :
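A small sketch of the period-doubling cascade in the logistic map: iterate past a transient, then count how many distinct values the orbit keeps visiting. The helper name `distinct_values`, the starting point, the transient length, and the rounding to six decimals are assumptions chosen for illustration.

```python
def distinct_values(r, x0=0.2, transient=5000, samples=64, digits=6):
    """Number of distinct points the logistic orbit visits after a transient."""
    x = x0
    for _ in range(transient):
        x = r * x * (1.0 - x)
    seen = set()
    for _ in range(samples):
        seen.add(round(x, digits))
        x = r * x * (1.0 - x)
    return len(seen)


for r in (2.8, 3.2, 3.5, 3.55, 3.9):
    print(f"r = {r:<5}  distinct values: {distinct_values(r)}")
```

The count doubles from 1 to 2 to 4 to 8 as $r$ increases, and at $r = 3.9$ it is limited only by the number of samples, which is the signature of an aperiodic orbit.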

Other Chaotic Systems#
Rossler system#
$$\dot x = -y - z, \qquad \dot y = x + a y, \qquad \dot z = b + z(x - c)$$
With $a = b = 0.2,\ c = 5.7$ this gives a “folded ribbon” attractor that exposes the stretching-and-folding mechanism more cleanly than Lorenz.
Double pendulum#
Two hinged arms — one of the simplest mechanical systems with chaos.
The double pendulum is the cleanest physical demonstration of chaos — you can build one on a table.
Strange Attractors: Stretching and Folding#
Chaotic attractors have fractal structure: self-similar, with non-integer dimension. The mechanism is geometric:
- Stretch: nearby trajectories pulled apart -> sensitivity.
- Fold: stretched material folded back -> boundedness.
Repeat infinitely and you get an infinitely layered “puff pastry”. Think of a baker kneading dough: stretch, fold, stretch, fold — after $n$ steps, two yeast cells initially $\varepsilon$ apart are $2^n \varepsilon$ apart along the layer direction.
That single mechanism — expansion in some directions, contraction in others, with global folding — is what every strange attractor in nature does.
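The kneading picture can be made exact with the baker's map on the unit square, used here as an assumed stand-in for the dough: each step doubles distances in $x$ (stretch) and halves them in $y$ while stacking the right half on top of the left (fold). The function names are illustrative.

```python
def baker(x, y):
    """One step of the baker's map: stretch in x, cut, and stack in y."""
    if x < 0.5:
        return 2.0 * x, 0.5 * y
    return 2.0 * x - 1.0, 0.5 * y + 0.5


def separation_after(n, x0=0.1234, eps=1e-9):
    """Horizontal distance between two points after n steps."""
    a, b = (x0, 0.3), (x0 + eps, 0.3)
    for _ in range(n):
        a, b = baker(*a), baker(*b)
    return abs(b[0] - a[0])


d0 = separation_after(0)
for n in (0, 5, 10):
    print(f"n = {n:2d}   separation / eps = {separation_after(n) / d0:.1f}")
```

The ratio grows as $2^n$ for as long as the two points land on the same side of the cut, which holds here for the first ten steps; once the separation reaches the size of the square, the fold takes over and keeps everything bounded.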
Applications of Chaos#
Weather prediction limits#
- 1-3 days: highly accurate
- 3-10 days: useful reference
- Beyond two weeks: only statistical trends
Modern centres use ensemble forecasting: run dozens of slightly perturbed initial conditions and report the spread.
Chaotic encryption#
Two parties share the chaotic system parameters as a key. The unpredictability of the output makes it a stream cipher; without the key, the chaotic sequence cannot be reproduced.
Controlling chaos (OGY method, 1990)#
- Locate unstable periodic orbits embedded in the chaotic attractor.
- When the trajectory naturally approaches such an orbit, apply tiny perturbations to keep it there.
- Chaos becomes periodic motion, suppressed with arbitrarily small control.
This has been used in laser physics, chemical reactors, and even cardiac pacing.
Chaos synchronisation#
Two chaotic systems coupled strongly enough can synchronise on a common, still-chaotic trajectory — the mathematical basis of chaotic secure communications.
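One concrete way to see synchronisation is the drive-response construction of Pecora and Carroll, assumed here as the example scheme: a full Lorenz system sends only its $x$ signal to a second copy of the $(y, z)$ equations. The error between the two obeys $\tfrac{d}{dt}(e_y^2 + e_z^2) = -2(e_y^2 + \beta e_z^2)$, so it decays even though the drive is chaotic. The names `drive_response` and `sync_errors` and the initial conditions are illustrative.

```python
import math


def drive_response(s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Lorenz drive (x, y, z) plus a response (yr, zr) that only receives x."""
    x, y, z, yr, zr = s
    return (
        sigma * (y - x),
        x * (rho - z) - y,
        x * y - beta * z,
        x * (rho - zr) - yr,  # response copy of the y equation, driven by x
        x * yr - beta * zr,   # response copy of the z equation, driven by x
    )


def rk4_step(f, s, dt):
    k1 = f(s)
    k2 = f(tuple(a + 0.5 * dt * k for a, k in zip(s, k1)))
    k3 = f(tuple(a + 0.5 * dt * k for a, k in zip(s, k2)))
    k4 = f(tuple(a + dt * k for a, k in zip(s, k3)))
    return tuple(a + dt / 6.0 * (p + 2 * q + 2 * r + w)
                 for a, p, q, r, w in zip(s, k1, k2, k3, k4))


def sync_errors(t_end=20.0, dt=0.01):
    """Distance between (y, z) and (yr, zr) at every step."""
    s = (1.0, 1.0, 1.0, -5.0, 30.0)  # response starts far from the drive
    errors = [math.hypot(s[1] - s[3], s[2] - s[4])]
    for _ in range(int(round(t_end / dt))):
        s = rk4_step(drive_response, s, dt)
        errors.append(math.hypot(s[1] - s[3], s[2] - s[4]))
    return errors


errors = sync_errors()
for t in (0, 5, 10, 20):
    print(f"t = {t:2d}   sync error = {errors[t * 100]:.2e}")
```

The response locks onto the drive while both keep moving chaotically, which is what a receiver in a chaos-based communication scheme relies on.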
Chaos and Philosophy#
Laplace’s demon (1814): “given perfect knowledge of every particle, the future is calculable.”
Chaos’s reply: even in a perfectly deterministic universe, the future is calculable only if measurements are infinitely precise. Errors grow exponentially, so any finite precision is forgotten in finite time.
This does not break causality. It limits predictability. The distinction matters.
Exercises#
Conceptual.
- What is the essential difference between chaos and randomness?
- Why are 2D continuous systems forbidden from chaos, while 3D ones permit it?
- What does a positive Lyapunov exponent mean physically and operationally?
Computational.
- Verify the origin of Lorenz is stable for $\rho < 1$ and a saddle for $\rho > 1$ .
- Prove $\nabla\cdot\mathbf{f} = -(\sigma + 1 + \beta)$ — the Lorenz flow contracts phase-space volume at a constant rate.
- For the Cantor set, prove the box-counting dimension is $\ln 2/\ln 3$ .
Programming.
- Plot the Lorenz attractor for $\rho \in \{10, 28, 100\}$ and compare topology.
- Compute the three Lyapunov exponents numerically; verify $\sum \lambda_i = -(\sigma + 1 + \beta)$ .
- Animate the double pendulum from two nearly-identical starts; visually demonstrate divergence.
- Build the Rossler bifurcation diagram in $c$ and identify the period-doubling route.
Computing Lyapunov Exponents: Benettin in 25 Lines#
The textbook formula $\lambda = \lim_{T\to\infty}\frac1T\ln\frac{\|\delta(T)\|}{\|\delta(0)\|}$ is clean but fails the moment you implement it directly: in the linearised equations $\delta(t)$ grows exponentially and overflows, and if you track two real trajectories the separation saturates at the attractor size. Benettin et al. (1980) renormalise after every short interval and accumulate the logs:
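A minimal sketch of the renormalisation idea, in the two-trajectory form: evolve a reference orbit and a shadow orbit a distance `d0` away, and after every short interval record the log of the stretch and pull the shadow back to distance `d0` along the current separation direction. The RK4 stepper, the parameter defaults, and the function name `largest_lyapunov` are assumptions, not the original listing.

```python
import math


def lorenz(s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, s, dt):
    k1 = f(s)
    k2 = f(tuple(a + 0.5 * dt * k for a, k in zip(s, k1)))
    k3 = f(tuple(a + 0.5 * dt * k for a, k in zip(s, k2)))
    k4 = f(tuple(a + dt * k for a, k in zip(s, k3)))
    return tuple(a + dt / 6.0 * (p + 2 * q + 2 * r + w)
                 for a, p, q, r, w in zip(s, k1, k2, k3, k4))


def largest_lyapunov(f, state, dt=0.01, renorm_every=10, n_renorm=3000,
                     d0=1e-8, burn_in=1000):
    """Estimate the largest Lyapunov exponent by repeated renormalisation."""
    for _ in range(burn_in):          # land on the attractor first
        state = rk4_step(f, state, dt)
    shadow = (state[0] + d0,) + tuple(state[1:])
    log_sum = 0.0
    for _ in range(n_renorm):
        for _ in range(renorm_every):
            state = rk4_step(f, state, dt)
            shadow = rk4_step(f, shadow, dt)
        diff = [b - a for a, b in zip(state, shadow)]
        d = math.sqrt(sum(c * c for c in diff))
        log_sum += math.log(d / d0)   # stretch over this interval
        # Rescale the separation back to d0, keeping its direction.
        shadow = tuple(a + c * d0 / d for a, c in zip(state, diff))
    return log_sum / (n_renorm * renorm_every * dt)


lam = largest_lyapunov(lorenz, (1.0, 1.0, 1.0))
print(f"largest Lyapunov exponent ~ {lam:.2f}")
```

With these defaults the renormalisation interval is $0.1$ time units and the average runs over $300$ time units, so the estimate should land near $0.9$ but will still wobble in the second decimal; longer runs tighten it slowly.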
For the full spectrum, replace $\delta$ with an $n\times n$ orthonormal matrix and QR-renormalise each step (Wolf algorithm). The Lorenz spectrum is roughly $(0.905, 0, -14.57)$. As a consistency check, the sum should equal the divergence $-(\sigma + \beta + 1) \approx -13.67$.
Trap. Here $dt$ means the renormalisation interval, not the integrator step. Too small: the linearisation is accurate per interval but the QR cost dominates. Too large: nonlinearity wins and the renormalisation is meaningless. For Lorenz, $dt \in [0.1, 1.0]$ is the working range.
Numerical Integration of Chaotic Systems: Error Has a Different Meaning#
For convergent systems, RK4 error is $O(h^4)$ — halve the step, errors drop 16x. Chaos breaks that intuition.
Shadowing principle. The numerical trajectory will not stay close to the true trajectory $\gamma(t)$ from your initial condition, because any floating-point error is amplified exponentially. But there is another true trajectory $\gamma^*(t)$ (starting from a slightly different IC) that stays near the numerical orbit for a very long time (forever, in the idealised hyperbolic case). So the Lorenz butterfly you draw is statistically faithful but pointwise meaningless.
Practical consequences:
- Comparing “where are two Lorenz simulations at $t = 50$ ” tells you nothing — exponential divergence drowns the signal.
- Comparing “fractal dimension, Lyapunov spectrum, power spectrum of the two attractors” is the right invariant.
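- Training an ML model to predict a chaotic system with a long-horizon pointwise MSE loss is ill-posed: beyond the predictability horizon the target is effectively noise. Use statistical matching instead: attractor reconstruction, energy spectra.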
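The practical consequences above can be seen by integrating the same initial condition with two step sizes. In this sketch the step sizes, the 100-time-unit run, and the choice of the time-averaged $z$ as the statistic are all assumptions for illustration; any attractor invariant would serve.

```python
def lorenz(s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, s, dt):
    k1 = f(s)
    k2 = f(tuple(a + 0.5 * dt * k for a, k in zip(s, k1)))
    k3 = f(tuple(a + 0.5 * dt * k for a, k in zip(s, k2)))
    k4 = f(tuple(a + dt * k for a, k in zip(s, k3)))
    return tuple(a + dt / 6.0 * (p + 2 * q + 2 * r + w)
                 for a, p, q, r, w in zip(s, k1, k2, k3, k4))


def sample(dt, stride, n_samples):
    """States every dt * stride time units, starting from (1, 1, 1)."""
    s = (1.0, 1.0, 1.0)
    out = [s]
    for _ in range(n_samples):
        for _ in range(stride):
            s = rk4_step(lorenz, s, dt)
        out.append(s)
    return out


coarse = sample(0.01, 1, 10000)   # t in [0, 100], step 0.01
fine = sample(0.005, 2, 10000)    # same sample times, step 0.005
late = range(5000, 10001)         # compare on t in [50, 100]

max_gap = max(abs(coarse[i][0] - fine[i][0]) for i in late)
mean_z_coarse = sum(coarse[i][2] for i in late) / len(late)
mean_z_fine = sum(fine[i][2] for i in late) / len(late)

print(f"largest pointwise gap in x: {max_gap:.1f}")
print(f"mean z (dt = 0.01):  {mean_z_coarse:.2f}")
print(f"mean z (dt = 0.005): {mean_z_fine:.2f}")
```

The pointwise gap is as large as the attractor itself, while the two time averages agree to within sampling noise: the runs disagree about where the state is, but agree about the attractor.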
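Step size rule of thumb. $dt \approx 0.1 / \lambda_{\max}$ is an upper bound. For Lorenz, $\lambda_{\max} \approx 0.9$, so $dt \lesssim 0.1$ resolves the chaotic timescale; the fast contracting direction ($|\lambda_3| \approx 14.6$) calls for a smaller step in practice, and $dt = 0.01$ is a common choice for RK4.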
ML Connection: Reservoir Computing and the Predictability Wall#
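The Lyapunov time $T_\lambda = 1/\lambda_{\max}$ sets the natural unit of prediction: an initial error grows by a factor of $e$ every $T_\lambda$. For Lorenz, that is about 1.1 time units. The usable horizon is $T \approx T_\lambda \ln(L/\varepsilon_0)$, a handful to a few tens of Lyapunov times depending on how precisely the initial condition is known; beyond it, the initial uncertainty has spread to fill the attractor. No model breaks this: it is a physics limit.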
Reservoir Computing (RC) is the cleanest tool of the last decade in this direction: fix a large random nonlinear recurrent network (the reservoir), train only the readout layer. Pathak et al. (2018, PRL) pushed Kuramoto-Sivashinsky prediction to $\sim 8 T_\lambda$ with RC, beating classical data-assimilation pipelines.
Key points:
- RC is not “learning the dynamics” — it is approximately replicating the attractor in a high-dimensional embedding space.
- Training is ridge regression. No BPTT, no vanishing gradients.
- The horizon $T_\lambda \ln(L/\varepsilon_0)$ is a ceiling that better training cannot raise. RC just gets you closer to that ceiling than naive recurrent nets do.
Implication for PINN / Neural ODE. Asking ML to predict the long-term future of a chaotic system is the wrong objective. The right objective is to learn invariants — attractor geometry, spectra, transfer operators. That is also one of the deeper meanings of score matching in the PDE-ML chapter 7.
Summary#
| Concept | Key Point |
|---|---|
| Chaos | Deterministic + sensitive + bounded + aperiodic |
| Lorenz system | The paradigm; butterfly attractor at $\rho=28$ |
| Butterfly effect | $10^{-10}$ initial difference -> system scale in roughly $20$ to $25$ time units |
| Lyapunov exponents | $\lambda_1 > 0$ certifies chaos; magnitude sets prediction horizon |
| Bifurcation cascade | Period-doubling $\to$ chaos with universal Feigenbaum ratio $\delta$ |
| Strange attractor | Fractal dimension via Kaplan-Yorke formula |
| Forecast horizon | $T \approx \lambda^{-1}\ln(L/\varepsilon_0)$ |
| Ensemble forecasting | Standard practice for chaotic systems |
References#
- Lorenz, “Deterministic Nonperiodic Flow,” J. Atmospheric Sciences (1963)
- Strogatz, Nonlinear Dynamics and Chaos, CRC Press (2015)
- Gleick, Chaos: Making a New Science, Viking Press (1987)
- Ott, Chaos in Dynamical Systems, Cambridge (2002)
- Ott, Grebogi & Yorke, “Controlling Chaos,” Physical Review Letters (1990)
- Sparrow, The Lorenz Equations: Bifurcations, Chaos, and Strange Attractors, Springer (1982)
This is Part 9 of the 18-part series on Ordinary Differential Equations.
- Part 1: Origins and Intuition
- Part 2: First-Order Methods
- Part 3: Higher-Order Linear Theory
- Part 4: The Laplace Transform
- Part 5: Power Series and Special Functions
- Part 6: Linear Systems and the Matrix Exponential
- Part 7: Stability Theory
- Part 8: Nonlinear Systems and Phase Portraits
- Part 9: Chaos Theory and the Lorenz System (current)
- Part 10: Bifurcation Theory
- Part 11: Numerical Methods
- Part 12: Boundary Value Problems
- Part 13: Introduction to PDEs
- Part 14: Epidemic Models
- Part 15: Population Dynamics
- Part 16: Fundamentals of Control Theory
- Part 17: Physics and Engineering Applications
- Part 18: Frontiers and Series Finale