Series · ODE Foundations · Chapter 9

Ordinary Differential Equations (9): Chaos Theory and the Lorenz System

Deterministic yet unpredictable: the Lorenz system, butterfly effect, Lyapunov exponents, strange attractors, and the routes from order to chaos -- with Python simulations throughout.

In 1961, Edward Lorenz restarted a weather simulation from a rounded-off number — 0.506 instead of 0.506127. Within simulated weeks the forecast was unrecognisable. That single accident gave us the butterfly effect and turned chaos from a metaphor into a science. The lesson is profound and sober: equations that are exactly deterministic can still be practically unpredictable.

Ordinary Differential Equations (9): Chaos Theory and the Lorenz System — Chapter overview


What You Will Learn#

  • The four conditions that together define chaos
  • The Lorenz system: paradigm of deterministic chaos
  • Butterfly effect, visualised on the attractor itself
  • Lyapunov exponents: numerical fingerprint of chaos
  • Bifurcation cascades and the period-doubling route to chaos
  • Other chaotic systems: Rossler and the double pendulum
  • Strange attractors, fractal dimension, stretching-and-folding
  • Applications: weather, encryption, controlling chaos, ensemble forecasting

Prerequisites#

  • Chapter 8 : nonlinear systems, phase portraits, limit cycles
  • Chapter 7 : stability and bifurcation basics
  • Comfort with 3D visualization

What Is Chaos?#

A chaotic system satisfies all four of:

  1. Deterministic — governed by exact equations, no randomness
  2. Sensitive to initial conditions — tiny differences grow exponentially
  3. Bounded — trajectories stay in a finite region
  4. Aperiodic — they never repeat exactly
PropertyRandom ProcessChaotic System
EquationsContain noise termsCompletely deterministic
Short-term predictionStatistical onlyPrecisely predictable
Long-term predictionStatistical regularitiesCompletely unpredictable
Source of complexityExternal noiseIntrinsic dynamics

The deep insight: very simple equations can produce infinitely complex behaviour. Lorenz showed it with three.


The Lorenz System#

$$\dot x = \sigma(y - x), \quad \dot y = x(\rho - z) - y, \quad \dot z = xy - \beta z$$
  • $x$ : convection intensity
  • $y$ : horizontal temperature difference
  • $z$ : vertical temperature deviation
  • Classic parameters: $\sigma = 10,\ \rho = 28,\ \beta = 8/3$

Where the three numbers come from. Each parameter is a dimensionless ratio that survives Lorenz’s truncation of the full Boussinesq convection equations:

  • $\sigma = \nu/\kappa$ is the Prandtl number — kinematic viscosity over thermal diffusivity. For air it sits near $0.7$ , for water near $7$ ; Lorenz picked $\sigma = 10$ as a convenient round value that exhibits the chaotic regime. It is the only parameter that controls the velocity equation in isolation.
  • $\rho = \mathrm{Ra}/\mathrm{Ra}_c$ is the normalised Rayleigh number — driving (buoyancy) divided by dissipation, scaled so that convection turns on at $\rho = 1$ . At $\rho = 28$ the system is well past the convective threshold and just past the second instability ($\rho \approx 24.74$ ) where $C_\pm$ lose stability via subcritical Hopf, which is why the chaos is sustained rather than transient.
  • $\beta = 8/3$ comes from the geometry of the convection rolls — specifically the horizontal-to-vertical wavenumber ratio of the most unstable Fourier mode in a thin layer. It is forced on you, not chosen.

So the canonical $(\sigma, \rho, \beta) = (10,\ 28,\ 8/3)$ is not arbitrary: $\beta$ is geometry, $\rho$ places you in the chaotic window, and $\sigma$ is a fluid choice that happens to make the window wide. Vary any of them and the topology of the attractor changes — for $\rho \lesssim 24.06$ the system has stable equilibria and transient chaos; far above $\rho = 28$ it eventually re-stabilises onto periodic orbits before chaos returns.

The strange attractor#

Lorenz attractor in 3D, two viewpoints, color-graded by time, showing the two wings.
Lorenz attractor at $\rho = 28$ . Left: classic butterfly view. Right: top-down view, with the two equilibria $C_\pm$ (red X) at the centre of each wing. The trajectory weaves between the wings forever — never repeating, never escaping, never crossing itself.

Three signatures of “strangeness”:

  • Fractal structure. The Hausdorff dimension is $\approx 2.06$ — thicker than a surface, thinner than a volume.
  • Aperiodic. Infinite trajectory length confined to a finite volume.
  • No self-intersection. Uniqueness of ODE solutions forbids crossings at the same time.
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint

def lorenz(state, t, sigma=10, rho=28, beta=8/3):
    x, y, z = state
    return [sigma*(y-x), x*(rho-z)-y, x*y-beta*z]

t = np.linspace(0, 50, 10000)
sol = odeint(lorenz, [1, 1, 1], t)

fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, projection='3d')
ax.plot(sol[:,0], sol[:,1], sol[:,2], lw=0.4, alpha=0.85)
ax.set(xlabel='X', ylabel='Y', zlabel='Z', title='Lorenz Attractor')
plt.tight_layout(); plt.show()

The Butterfly Effect, Visualised#

Ordinary Differential Equations (9): Chaos Theory and the Lorenz System — Chapter summary

Two trajectories that start a ten-billionth apart — $[1, 1, 1]$ and $[1 + 10^{-10}, 1, 1]$ — diverge exponentially until the difference is system-scale.

Butterfly effect: two close trajectories on the attractor; their x-components diverge; separation grows exponentially.
Top: two trajectories starting $10^{-8}$ apart, traced on the attractor (blue and red). Bottom-left: $x(t)$ for both — visually identical at first, then unrecognisable. Bottom-right: separation distance grows exponentially $\sim e^{\lambda t}$ until it saturates at the size of the attractor. The slope of the green dashed line is the largest Lyapunov exponent.

$$ T_{\text{predict}} \;\approx\; \frac{1}{\lambda}\,\ln\!\frac{L}{\varepsilon_0}. $$

For the atmosphere $\lambda \approx 1/\text{day}$ and $\ln(L/\varepsilon_0) \approx 15$ , giving $T \approx 15$ days. No improvement in models can push past this — only better measurements widen the gap inside the logarithm.

Ensemble view#

A single trajectory tells you the worst case. An ensemble tells you the distribution.

Ensemble of 30 Lorenz trajectories starting within 1e-3 of (1,1,1); spread grows exponentially then saturates.
Top: 30 nearby starts produce orbits that are first indistinguishable, then mildly different, then totally incoherent. Bottom: the ensemble standard deviation grows exponentially through the green “predictable” regime and saturates in the red “incoherent” regime once it reaches system scale. This is exactly how operational weather centres quantify forecast confidence.


Lyapunov Exponents: Quantifying Chaos#

$$ \lambda_1 \;=\; \lim_{t\to\infty}\frac{1}{t}\,\ln\frac{|\delta\mathbf{x}(t)|}{|\delta\mathbf{x}(0)|}. $$
SignBehaviour
$\lambda_1 > 0$Chaos (exponential divergence)
$\lambda_1 = 0$Periodic or quasi-periodic
$\lambda_1 < 0$Asymptotically stable

For Lorenz at the canonical parameters, the spectrum is approximately $\{0.91,\ 0,\ -14.57\}$ .

Largest Lyapunov exponent vs rho for Lorenz; bar chart of full spectrum at rho=28.
Left: numerically estimated $\lambda_1(\rho)$ — crosses zero into chaos around $\rho \approx 25$ . Green dots are regular dynamics, red dots are chaotic. Right: full spectrum at $\rho = 28$ . The sum is negative ($\sum \lambda_i = -13.66$ ), confirming volume contraction onto a fractal of Kaplan-Yorke dimension $\approx 2.062$ .

Kaplan-Yorke (Lyapunov) dimension#

$$D_{KY} \;=\; 2 + \frac{\lambda_1 + \lambda_2}{|\lambda_3|} \;\approx\; 2 + \frac{0.91}{14.57} \;\approx\; 2.062.$$

The attractor is almost a surface, but with infinitely many fractal layers stacked together.


Equilibria and the Route to Chaos#

Setting $\dot x = \dot y = \dot z = 0$ gives three equilibria:

  • Origin $C_0 = (0,0,0)$ — stable for $\rho < 1$ , saddle for $\rho > 1$
  • Symmetric pair $C_\pm = (\pm\sqrt{\beta(\rho-1)},\ \pm\sqrt{\beta(\rho-1)},\ \rho - 1)$ — born at $\rho = 1$
$\rho$Behaviour
$< 1$Origin globally stable
$= 1$Pitchfork bifurcation: $C_\pm$ appear
$1 < \rho < 24.74$$C_\pm$ are stable spirals
$\approx 24.74$Subcritical Hopf: $C_\pm$ lose stability
$24.74 < \rho < 28$Transient chaos, periodic windows
$\geq 28$Sustained chaos

The route from order to chaos shows up classically in the logistic map $x_{n+1} = r x_n (1 - x_n)$ :

Logistic map bifurcation diagram (left) and zoom on the chaotic region (right).
Left: full cascade. Period 1 -> 2 -> 4 -> 8 -> … -> chaos at $r \approx 3.5699$ . Right: zoom into the chaotic regime, exposing periodic windows (the prominent gap is the period-3 window at $r \approx 3.83$ ). The geometric ratio between successive doublings is the Feigenbaum constant $\delta \approx 4.6692$ , and it is universal across one-dimensional unimodal maps.


Other Chaotic Systems#

Rossler system#

$$\dot x = -y - z, \qquad \dot y = x + a y, \qquad \dot z = b + z(x - c)$$

With $a = b = 0.2,\ c = 5.7$ this gives a “folded ribbon” attractor that exposes the stretching-and-folding mechanism more cleanly than Lorenz.

Double pendulum#

Two hinged arms — one of the simplest mechanical systems with chaos.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
import numpy as np, matplotlib.pyplot as plt
from scipy.integrate import odeint

def double_pendulum(state, t, L1=1, L2=1, m1=1, m2=1, g=9.8):
    t1, w1, t2, w2 = state
    d = t2 - t1
    den1 = (m1+m2)*L1 - m2*L1*np.cos(d)**2
    den2 = (L2/L1)*den1
    dw1 = (m2*L1*w1**2*np.sin(d)*np.cos(d) + m2*g*np.sin(t2)*np.cos(d) +
           m2*L2*w2**2*np.sin(d) - (m1+m2)*g*np.sin(t1)) / den1
    dw2 = (-m2*L2*w2**2*np.sin(d)*np.cos(d) + (m1+m2)*g*np.sin(t1)*np.cos(d) -
           (m1+m2)*L1*w1**2*np.sin(d) - (m1+m2)*g*np.sin(t2)) / den2
    return [w1, dw1, w2, dw2]

t = np.linspace(0, 20, 2000)
sol1 = odeint(double_pendulum, [np.pi/2, 0, np.pi/2, 0],          t)
sol2 = odeint(double_pendulum, [np.pi/2 + 0.001, 0, np.pi/2, 0], t)

The double pendulum is the cleanest physical demonstration of chaos — you can build one on a table.


Strange Attractors: Stretching and Folding#

Chaotic attractors have fractal structure — self-similar, with non-integer dimension. The mechanism is mechanical:

  • Stretch: nearby trajectories pulled apart -> sensitivity.
  • Fold: stretched material folded back -> boundedness.

Repeat infinitely and you get an infinitely layered “puff pastry”. Think of a baker kneading dough: stretch, fold, stretch, fold — after $n$ steps, two yeast cells initially $\varepsilon$ apart are $2^n \varepsilon$ apart along the layer direction.

That single mechanism — expansion in some directions, contraction in others, with global folding — is what every strange attractor in nature does.


Applications of Chaos#

Weather prediction limits#

  • 1-3 days: highly accurate
  • 3-10 days: useful reference
  • Beyond two weeks: only statistical trends

Modern centres use ensemble forecasting: run dozens of slightly perturbed initial conditions and report the spread.

Chaotic encryption#

Two parties share the chaotic system parameters as a key. The unpredictability of the output makes it a stream cipher; without the key, the chaotic sequence cannot be reproduced.

Controlling chaos (OGY method, 1990)#

  1. Locate unstable periodic orbits embedded in the chaotic attractor.
  2. When the trajectory naturally approaches such an orbit, apply tiny perturbations to keep it there.
  3. Chaos becomes periodic motion, suppressed with arbitrarily small control.

This has been used in laser physics, chemical reactors, and even cardiac pacing.

Chaos synchronisation#

Two chaotic systems coupled strongly enough can synchronise on a common, still-chaotic trajectory — the mathematical basis of chaotic secure communications.


Chaos and Philosophy#

Laplace’s demon (1814): “given perfect knowledge of every particle, the future is calculable.”

Chaos’s reply: even in a perfectly deterministic universe, the future is calculable only if measurements are infinitely precise. Errors grow exponentially, so any finite precision is forgotten in finite time.

This does not break causality. It limits predictability. The distinction matters.


Exercises#

Conceptual.

  1. What is the essential difference between chaos and randomness?
  2. Why are 2D continuous systems forbidden from chaos, while 3D ones permit it?
  3. What does a positive Lyapunov exponent mean physically and operationally?

Computational.

  1. Verify the origin of Lorenz is stable for $\rho < 1$ and a saddle for $\rho > 1$ .
  2. Prove $\nabla\cdot\mathbf{f} = -(\sigma + 1 + \beta)$ — the Lorenz flow contracts phase-space volume at a constant rate.
  3. For the Cantor set, prove the box-counting dimension is $\ln 2/\ln 3$ .

Programming.

  1. Plot the Lorenz attractor for $\rho \in \{10, 28, 100\}$ and compare topology.
  2. Compute the three Lyapunov exponents numerically; verify $\sum \lambda_i = -(\sigma + 1 + \beta)$ .
  3. Animate the double pendulum from two nearly-identical starts; visually demonstrate divergence.
  4. Build the Rossler bifurcation diagram in $c$ and identify the period-doubling route.

Computing Lyapunov Exponents: Benettin in 25 Lines#

The textbook formula $\lambda = \lim_{T\to\infty}\frac1T\ln\frac{\|\delta(T)\|}{\|\delta(0)\|}$ is clean but blows up the moment you implement it directly — $\delta(t)$ grows exponentially and overflows. Benettin (1980) renormalises every short interval and accumulates the log:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, y, sigma=10, rho=28, beta=8/3):
    x, yy, z = y
    return [sigma*(yy-x), x*(rho-z)-yy, x*yy-beta*z]

def benettin_lambda1(y0, T_total=1000, dt=0.5):
    y = np.array(y0, dtype=float)
    delta = np.random.randn(3); delta /= np.linalg.norm(delta)
    log_sum = 0.0
    n_steps = int(T_total/dt)
    for _ in range(n_steps):
        sol  = solve_ivp(lorenz, (0, dt), y,           rtol=1e-10).y[:, -1]
        sol2 = solve_ivp(lorenz, (0, dt), y+1e-8*delta, rtol=1e-10).y[:, -1]
        d_new = (sol2 - sol)/1e-8
        norm = np.linalg.norm(d_new)
        log_sum += np.log(norm)
        delta = d_new / norm
        y = sol
    return log_sum / T_total

print(benettin_lambda1([1, 1, 1]))   # ~ 0.905

For the full spectrum, replace $\delta$ with an $n\times n$ orthonormal matrix and QR-renormalise each step (Wolf algorithm). The Lorenz spectrum is roughly $(0.905, 0, -14.57)$ ; sum equals divergence $-(\sigma + \beta + 1) = -13.67$ — consistency check.

Trap. Too small $dt$ : linearisation is accurate per step but QR cost dominates. Too large: nonlinearity wins and the renormalisation is meaningless. For Lorenz, $dt \in [0.1, 1.0]$ is the working range.

Numerical Integration of Chaotic Systems: Error Has a Different Meaning#

For convergent systems, RK4 error is $O(h^4)$ — halve the step, errors drop 16x. Chaos breaks that intuition.

Shadowing principle. The numerical trajectory will not stay close to the true trajectory $\gamma(t)$ from your initial condition — any floating-point error gets amplified exponentially. But there is another true trajectory $\gamma^*(t)$ (starting from a slightly different IC) that stays near the numerical orbit forever. So the Lorenz butterfly you draw is statistically faithful but pointwise meaningless.

Practical consequences:

  • Comparing “where are two Lorenz simulations at $t = 50$ ” tells you nothing — exponential divergence drowns the signal.
  • Comparing “fractal dimension, Lyapunov spectrum, power spectrum of the two attractors” is the right invariant.
  • Training an ML model to predict a chaotic system with MSE loss is wrong. Use statistical matching: attractor reconstruction, energy spectra.

Step size rule of thumb. $dt \approx 0.1 / \lambda_{\max}$ . For Lorenz, $\lambda_{\max} \approx 0.9$ , so $dt \le 0.1$ to resolve the chaotic timescale.

ML Connection: Reservoir Computing and the Predictability Wall#

The Lyapunov time $T_\lambda = 1/\lambda_{\max}$ sets a hard ceiling on prediction. For Lorenz, about 1.1 time units. Beyond that, the floating-point uncertainty of the initial condition has spread to fill the attractor. No model breaks this — it is a physics limit.

Reservoir Computing (RC) is the cleanest tool of the last decade in this direction: fix a large random nonlinear recurrent network (the reservoir), train only the readout layer. Pathak et al. (2018, PRL) pushed Kuramoto-Sivashinsky prediction to $\sim 8 T_\lambda$ with RC, beating classical data-assimilation pipelines.

Key points:

  1. RC is not “learning the dynamics” — it is approximately replicating the attractor in a high-dimensional embedding space.
  2. Training is ridge regression. No BPTT, no vanishing gradients.
  3. $T_\lambda$ is a ceiling, not a floor. RC just gets you closer to that ceiling than naive recurrent nets do.

Implication for PINN / Neural ODE. Asking ML to predict the long-term future of a chaotic system is the wrong objective. The right objective is to learn invariants — attractor geometry, spectra, transfer operators. That is also one of the deeper meanings of score matching in the PDE-ML chapter 7.

Summary#

ConceptKey Point
ChaosDeterministic + sensitive + bounded + aperiodic
Lorenz systemThe paradigm; butterfly attractor at $\rho=28$
Butterfly effect$10^{-10}$ initial difference -> system scale in $\sim 20$ time units
Lyapunov exponents$\lambda_1 > 0$ certifies chaos; magnitude sets prediction horizon
Bifurcation cascadePeriod-doubling $\to$ chaos with universal Feigenbaum ratio $\delta$
Strange attractorFractal dimension via Kaplan-Yorke formula
Forecast horizon$T \approx \lambda^{-1}\ln(L/\varepsilon_0)$
Ensemble forecastingStandard practice for chaotic systems

References#

  • Lorenz, “Deterministic Nonperiodic Flow,” J. Atmospheric Sciences (1963)
  • Strogatz, Nonlinear Dynamics and Chaos, CRC Press (2015)
  • Gleick, Chaos: Making a New Science, Viking Press (1987)
  • Ott, Chaos in Dynamical Systems, Cambridge (2002)
  • Ott, Grebogi & Yorke, “Controlling Chaos,” Physical Review Letters (1990)
  • Sparrow, The Lorenz Equations: Bifurcations, Chaos, and Strange Attractors, Springer (1982)

This is Part 9 of the 18-part series on Ordinary Differential Equations.

In this series

ODE Foundations 18 parts

  1. 01 Ordinary Differential Equations (1): Origins and Intuition
  2. 02 Ordinary Differential Equations (2): First-Order Methods
  3. 03 Ordinary Differential Equations (3): Higher-Order Linear Theory
  4. 04 Ordinary Differential Equations (4): The Laplace Transform
  5. 05 Ordinary Differential Equations (5): Power Series and Special Functions
  6. 06 Ordinary Differential Equations (6): Linear Systems and the Matrix Exponential
  7. 07 Ordinary Differential Equations (7): Stability Theory
  8. 08 Ordinary Differential Equations (8): Nonlinear Systems and Phase Portraits
  9. 09 Ordinary Differential Equations (9): Chaos Theory and the Lorenz System you are here
  10. 10 Ordinary Differential Equations (10): Bifurcation Theory
  11. 11 Ordinary Differential Equations (11): Numerical Methods
  12. 12 Ordinary Differential Equations (12): Boundary Value Problems
  13. 13 Ordinary Differential Equations (13): Introduction to Partial Differential Equations
  14. 14 Ordinary Differential Equations (14): Epidemic Models and Epidemiology
  15. 15 Ordinary Differential Equations (15): Population Dynamics
  16. 16 Ordinary Differential Equations (16): Fundamentals of Control Theory
  17. 17 Ordinary Differential Equations (17): Physics and Engineering Applications
  18. 18 Ordinary Differential Equations (18): Frontiers and Series Finale

Liked this piece?

Follow on GitHub for the next one — usually one a week.

GitHub