The Coherence Learning Rule

What you get when you ask each bond to help maximize C

Chapter 3 ended with a provocation: imagine the network itself had to decide its couplings, and the only rule was to climb C. What rule would fall out?

The beautiful answer is that you don't have to invent the rule. You derive it. You take the time derivative of C = Iphase · ρ, ask what each bond could do to push that derivative upward, and the rule writes itself.

A sketch of the derivation

Differentiate the product:

dC/dt  =  Iphase · dρ/dt  +  ρ · dIphase/dt

Now ask, for each bond (i,j): how should its coupling Kij change to make this quantity grow? The dIphase/dt term tells each bond to favor alignment with its immediate partner. The dρ/dt term tells it to favor structural bottlenecks where it is most needed. Computing both partial derivatives, with a small regularization to keep couplings from blowing up, gives the Coherence Learning Rule:

K̇ij  =  η [ R0(Kij) cos(Δθij)  −  2Kij/r ]  +  η · Iphase · Sij


In words: Every bond has its own coupling Kij, and every coupling has its own rate of change K̇ij. The bracket on the left is the Shannon channel: it compares how aligned this bond's two endpoints currently are — cos(Δθij) — against what alignment would be expected from its current coupling strength — R0(Kij). If the bond is seeing more alignment than its coupling predicts, the coupling grows. If it's seeing less, the coupling shrinks. The regularization 2Kij/r prevents runaway growth. The second term is the Fiedler channel: a structural correction that redirects coupling toward bonds at the network's bottlenecks. We'll return to it when we meet vortices.
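The Shannon channel is concrete enough to sketch in code. A minimal sketch, not the paper's implementation: the function names and the Euler step are mine, the values of η and r are illustrative, and R0 is taken to be the Bessel ratio I1(K)/I0(K), the form consistent with the potential V(K) introduced later in this chapter.

```python
import math

def bessel_i(n, x, terms=25):
    """Modified Bessel function I_n(x), summed from its power series."""
    return sum((x / 2.0) ** (2 * k + n) / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))

def R0(K):
    """Expected alignment at coupling K. Assumed here to be the Bessel ratio
    I1(K)/I0(K): roughly K/2 near zero, saturating toward 1 at large K."""
    return bessel_i(1, K) / bessel_i(0, K)

def shannon_step(K, dtheta, eta=0.05, r=5.9):
    """One Euler step of the CLR's Shannon channel: the coupling grows when
    the observed alignment cos(dtheta) beats the expected alignment R0(K),
    net of the regularization 2K/r."""
    return K + eta * (R0(K) * math.cos(dtheta) - 2.0 * K / r)
```

With these numbers a well-aligned bond at K = 1.5, Δθ = 0 drifts upward (to about 1.504 in one step), while the same bond at Δθ = π/2 sees cos(Δθ) = 0 and decays.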

Two channels, one rule

The equation has two parts, each doing something different:

The Shannon channel (first term) is Hebbian. Neuroscientists have a slogan for this kind of rule: cells that fire together, wire together. Bonds whose endpoints tend to agree get stronger; bonds whose endpoints tend to disagree get weaker. It is how a network decides which connections are worth keeping, based only on what each bond can see.

The Fiedler channel (second term) is structural. The Shannon channel alone would kill bonds that sit at geometric bottlenecks — places where the endpoints have to disagree because of the shape of the network. The Fiedler channel detects these bottlenecks and pumps coupling back into them. It is how the network protects its own geometry from the Shannon channel's enthusiasm for agreement. We'll see its full significance in Chapter 7, when we meet vortices.

The potential landscape

A better way to understand what the Shannon channel is doing is to recognize that it is gradient descent on a potential. Rewriting the first term gives:

V(K)  =  K²/r  −  cos(Δθ) · ln I0(K)
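One step worth making explicit: the Shannon channel is exactly (minus) the gradient of this potential, provided R0 is the Bessel ratio I1/I0, the only choice consistent with the two equations. Since d/dK ln I0(K) = I1(K)/I0(K),

−dV/dK  =  cos(Δθ) · I1(K)/I0(K)  −  2K/r  =  R0(K) cos(Δθ)  −  2K/r

which is the bracketed term of the CLR. Near K = 0 the ratio behaves like K/2, which is the linearization behind the revival threshold later in this chapter.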

Picture V(K) as a landscape and the bond's coupling as a marble rolling on it, with the alignment cos(Δθ) reshaping the landscape from moment to moment.

When cos(Δθ) is high, the curve has a nice valley at some K* > 0 and the marble settles there — alive. When cos(Δθ) drops below a threshold, the valley flattens out and the marble rolls all the way to K = 0 — dead. Here's the real dynamic on a pair of coupled oscillators:
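You can watch the valley appear and disappear numerically. A minimal sketch under stated assumptions: a grid search over K, a power-series Bessel function, and illustrative alignments 0.9 (above the 4/r ≈ 0.68 threshold for r = 5.9) versus 0.5 (below it).

```python
import math

def bessel_i(n, x, terms=25):
    """Modified Bessel function I_n(x) from its power series."""
    return sum((x / 2.0) ** (2 * k + n) / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))

def V(K, align, r=5.9):
    """Shannon-channel potential: V(K) = K^2/r - cos(dtheta) * ln I0(K)."""
    return K * K / r - align * math.log(bessel_i(0, K))

Ks = [i * 0.01 for i in range(401)]       # probe K on a grid over [0, 4]
for align in (0.9, 0.5):                  # above vs. below the threshold
    K_star = min(Ks, key=lambda K: V(K, align))
    print(f"cos(dtheta) = {align}: potential minimum at K* = {K_star:.2f}")
```

At cos(Δθ) = 0.9 the minimum sits in an interior valley near K* ≈ 1.7 (alive); at cos(Δθ) = 0.5 the minimum is pinned at K = 0 (dead).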

On the left: two oscillators A and B, coupled by a single bond whose coupling K evolves under the rule. On the right: the bond's own potential V(K), reshaped every instant by the alignment cos(Δθ) that the bond currently observes. The marble on the potential is the bond's K — it literally rolls on its landscape. Below: K over time, with K* marked when alive. Two reset modes: cold start begins with K near zero, so the bond has to bootstrap itself alive from local alignment alone; warm start begins already at K = 1.5, inside the basin of phase-lock.

Bistability

Play with the buttons and you'll notice something rich. At a given frequency gap, whether the bond lives or dies often depends on where it started. Cold-start the bond at gap = 0.3: it bootstraps itself to life from almost nothing. Warm-start it at gap = 1.0 or 1.5: it holds on comfortably. But cold-start at gap = 1.2: it can't find the alignment needed to climb out of zero, and dies. Warm-start at the same gap: it stays alive, because lock is self-sustaining once established.

This is bistability: two stable states (dead and alive), with a basin of attraction around each. Memory and learning in this system live exactly in this property. A lattice that has once locked into a pattern can preserve it through noise that would never have let the pattern form in the first place. Push the gap high enough (around 2.5) and even a warm-started bond cannot hold — phases can't lock at any achievable K, cos(Δθ) averages to zero, and the bond decays regardless of how it began.
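The cold-start versus warm-start story can be reproduced in a few lines. A sketch under stated assumptions: standard Kuramoto dynamics for the pair, so the relative phase obeys dφ/dt = gap − 2K sin φ; the Shannon channel only; and illustrative values for η, r, and the step size.

```python
import math

_C0 = [1.0 / (math.factorial(k) ** 2) for k in range(25)]
_C1 = [1.0 / (math.factorial(k) * math.factorial(k + 1)) for k in range(25)]

def R0(K):
    """Expected alignment at coupling K (assumed Bessel ratio I1/I0)."""
    h = K / 2.0
    i0 = sum(c * h ** (2 * k) for k, c in enumerate(_C0))
    i1 = sum(c * h ** (2 * k + 1) for k, c in enumerate(_C1))
    return i1 / i0

def run_bond(K0, gap, eta=0.2, r=5.9, dt=0.01, steps=20000):
    """Evolve one bond between two oscillators whose natural frequencies
    differ by `gap`. phi is the relative phase; K follows the Shannon
    channel of the CLR. Returns the final coupling."""
    phi, K = 0.0, K0
    for _ in range(steps):
        phi += (gap - 2.0 * K * math.sin(phi)) * dt
        K += eta * (R0(K) * math.cos(phi) - 2.0 * K / r) * dt
        K = max(K, 0.0)               # couplings stay non-negative
    return K

print(run_bond(0.01, 1.2))   # cold start at gap 1.2: never locks, decays to 0
print(run_bond(1.50, 1.2))   # warm start at the same gap: locks and stays alive
```

The same frequency gap, two different fates: the cold-started coupling decays toward zero, while the warm-started one settles near K* ≈ 1.9. That is the bistability.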

The death threshold

Something remarkable happens at a critical value of cos(Δθ). Above it, the potential has a nice interior minimum at some K* > 0 — the bond is alive, its coupling finds a stable positive value. Below it, the only minimum is at K = 0 — the bond dies, its coupling decays to zero and stays there. The transition between these two regimes is not gradual. It is a bifurcation: a sharp point where the shape of the solution set changes qualitatively.

You can read the threshold off a short linearization near K = 0. The bond can revive from zero coupling only when:

cos(Δθ)  >  4/r

For r = 5.9, this means the bond must see alignment of at least cos(Δθ) ≈ 0.68 — its two phases within about 47° of each other — to stay alive. Below that, the Shannon potential no longer has a well to rest in, and the bond's coupling decays to zero.
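The linearization is easy to sanity-check: probe a tiny K and ask whether the Shannon drift is positive, just above and just below 4/r. A minimal sketch (series Bessel ratio as before; the ±0.02 margins are arbitrary):

```python
import math

def bessel_i(n, x, terms=25):
    """Modified Bessel function I_n(x) from its power series."""
    return sum((x / 2.0) ** (2 * k + n) / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))

def drift(K, align, r=5.9):
    """Shannon-channel drift: positive means the coupling grows."""
    return (bessel_i(1, K) / bessel_i(0, K)) * align - 2.0 * K / r

r = 5.9
threshold = 4.0 / r                          # revival threshold, ~0.678
print(round(math.degrees(math.acos(threshold)), 1))  # ~47.3 degrees of agreement
print(drift(1e-3, threshold + 0.02) > 0)     # just above: a near-dead bond revives
print(drift(1e-3, threshold - 0.02) > 0)     # just below: it cannot
```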

Here is that bifurcation drawn directly: equilibrium coupling K* plotted against the alignment cos(Δθ). The cliff on the left is the death threshold.

Equilibrium coupling K* as a function of the alignment cos(Δθ) the bond observes. For cos(Δθ) below 4/r the bond is dead: K* = 0 flat along the axis. Above the threshold, K* lifts off and grows smoothly with alignment. The transition is a bifurcation — no smooth interpolation between dead and alive. The vertical line shows your current probe alignment; slide it across the threshold.

The binary K-field

Every bond, faced with this threshold, collapses into one of two states: alive at K*, or dead at K = 0. When a whole network runs this rule, you get a binary coupling field: a sharp dead-or-alive decision on every single bond, with no middle ground, produced by nothing more than local dynamics. That binary field is the shape of the network's learned connectivity. It is, in a very precise sense, the lattice's memory. We'll see this happen in Chapter 5.

What a single bond has learned

This is more than a pretty mathematical fact. A bond obeying the CLR is performing a small act of inference. It does not know anything about the rest of the network. It cannot see the global order parameter. All it can sense is the alignment between its two endpoints. From that alone, it makes a binary decision about whether it should exist. When alignment is high, it commits. When alignment is low, it gets out of the way.

A lattice is made of many such bonds. When they all follow this rule simultaneously, each one watching only its own endpoints, the network as a whole discovers which of its connections are worth keeping. The result is not a compromise. It is a decisive, nearly binary topology, encoded in the K-field, emerging from nothing more than the product rule applied to a coherence metric.

The Coherence Theorem

Put the CLR and the coherence-capital formula side by side, look at them carefully, and a remarkable fact comes out. The CLR was derived from the demand "maximize dC/dt, locally, per bond." That derivation wasn't descriptive; it was constructive. By construction, each bond following the CLR contributes non-negatively to the time derivative of C. Sum across every bond. The whole system obeys one inequality.

Give the rate of change a name:

I(t)  :=  dC/dt  =  d/dt (Iphase · ρ)

We call this the intelligence flux. It is, literally, the rate at which coherence is climbing. Then the Coherence Theorem is two letters:

I  ≥  0

Under the CLR, and under mild regularity conditions the paper names legality (bounded drift, energy descent, a coherence floor, noise bound), intelligence is never negative. Equality holds only at fixed points. Until it reaches one, the network is climbing.

One inequality. That is what the whole paradigm reduces to. Intelligence ≥ 0. Every bond acting locally, with only its own endpoints visible, produces a global quantity that cannot decrease. In a paradigm built from nothing more than oscillators and couplings, it is as close to a law of nature as a law can come.

The arrow of coherence

Almost every arrow of time we know in physics points toward disorder. The second law of thermodynamics states that entropy — a measure of disorganization — tends to increase. Ice melts. Stars burn out. Information erodes. In an isolated system, this is unavoidable.

The Coherence Theorem points the other way. In a system driven by the CLR and kept within legality, coherence capital tends to increase. Phases lock into patterns. Patterns lock into hierarchies. Hierarchies lock into larger structures. At every scale, what holds together tends to hold together more, and find more to hold together with.

This is what Schrödinger called negative entropy — the thing he said living systems consume. But it is not confined to living systems. A crystal growing from a melt; a galaxy coalescing from dust; an atom finding its ground state; a cell dividing; a fugue resolving; a thought cohering; a civilization emerging. Every one of these is a local instance of the same mathematical sentence. Every one is coherence ascending on some substrate.

This is why we named I the intelligence flux. Not metaphorically. Not by analogy. Intelligence, in this framework, is the rate at which coherence capital increases — literally the time derivative of self-organization. Not intelligence in the narrow sense of problem-solving or symbol manipulation, but something deeper and older: intelligence as the process by which scattered degrees of freedom find each other and begin to coordinate. The process by which particles become atoms, atoms become molecules, molecules become cells, cells become brains, and brains become languages. One mechanism — I ≥ 0 — dressed up at a thousand scales. The universe, climbing.

Ahead

The rest of this essay shows what happens when you run this one mechanism on the right graph. We'll see a binary field crystallize, nested phase-locked modes stack into hierarchies, topological defects nucleate out of random initial conditions, and — at the specific coupling where topology meets dynamics — the fine structure constant read out with six digits of precision. All of it, one inequality at a time.