Adaptive CPG

a brain that tunes the spinal cord

online learning modulates CPG parameters. the rhythm walks, the cortex adjusts.

The CPG Walker embodies locomotion — Matsuoka oscillators generate rhythm, proprioceptive reflexes modulate it. But its parameters are fixed. This walker adds a brainstem-cortex loop: a learning layer that observes performance and tunes the CPG in real time.

Reward-modulated perturbation — no gradient descent, no backpropagation. The brain tries small parameter adjustments each evaluation cycle. If the reward improves (more distance, better stability, fewer falls), push parameters in that direction. If it worsens, back off. Like a cerebellar loop: perturb, observe, adjust.
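The perturb, observe, adjust loop can be sketched in a few lines. This is a minimal illustration, not the demo's actual code: the class name, the Gaussian noise scale `sigma`, the step size `rate`, the running-average baseline, and the `frozen` flag are all assumptions made for the example.

```python
import random


class PerturbationLearner:
    """Reward-modulated perturbation over a dict of CPG parameters (hypothetical sketch)."""

    def __init__(self, params, sigma=0.05, rate=0.5, seed=0):
        self.params = dict(params)       # current best-guess parameters
        self.sigma = sigma               # size of each random nudge (assumed)
        self.rate = rate                 # how far to follow a nudge (assumed)
        self.rng = random.Random(seed)
        self.baseline = None             # running estimate of recent reward
        self.delta = {k: 0.0 for k in self.params}
        self.frozen = False              # True = lock parameters, no learning

    def propose(self):
        """Return the parameters to walk with during the next evaluation cycle."""
        if self.frozen:
            self.delta = {k: 0.0 for k in self.params}
        else:
            self.delta = {k: self.rng.gauss(0.0, self.sigma) for k in self.params}
        return {k: self.params[k] + self.delta[k] for k in self.params}

    def update(self, reward):
        """Follow the last nudge if reward beat the baseline, otherwise back off."""
        if self.baseline is None:
            self.baseline = reward       # first cycle only sets the reference
            return
        if not self.frozen:
            sign = 1.0 if reward > self.baseline else -1.0
            for k in self.params:
                self.params[k] += sign * self.rate * self.delta[k]
        # Slowly track the reward level so "better" stays relative.
        self.baseline += 0.2 * (reward - self.baseline)
```

Each cycle calls `propose()`, runs the walker with the returned values, then calls `update(reward)`. No gradient is ever computed; only the sign of the reward change steers the search.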

Feedforward adaptation complements the learning: the terrain slope ahead is sensed and the tonic drive adjusts instantly. Uphill gets more drive. Downhill gets less. Three scan lines reach 30, 60, and 90 px ahead; when they detect a gap, the walker decelerates and lifts its feet higher, like a cat pre-adjusting step height 2-3 steps before an obstacle.
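A rough sketch of that feedforward path, under stated assumptions. Only the 30, 60, and 90 px scan distances come from the description above. The function names, the sign convention (positive slope means uphill), and the gains `slope_gain`, `slow_factor`, and `lift_factor` are hypothetical values chosen for illustration.

```python
SCAN_DISTANCES_PX = (30, 60, 90)  # lookahead distances named in the text


def nearest_gap(x, is_gap, distances=SCAN_DISTANCES_PX):
    """Return the closest scan distance that lands on a gap, or None.

    is_gap is a caller-supplied function: is_gap(px) -> bool (assumed interface).
    """
    for d in distances:
        if is_gap(x + d):
            return d
    return None


def feedforward(base_drive, base_step_height, slope, gap_ahead_px,
                slope_gain=0.5, slow_factor=0.7, lift_factor=1.5):
    """Adjust tonic drive and step height from what the walker sees ahead.

    slope: rise over run of the terrain ahead, positive = uphill (assumed convention).
    gap_ahead_px: distance to a detected gap, or None if the scan lines are clear.
    """
    drive = base_drive * (1.0 + slope_gain * slope)  # uphill more, downhill less
    step_height = base_step_height
    if gap_ahead_px is not None:
        drive *= slow_factor          # decelerate before the gap
        step_height *= lift_factor    # lift the feet higher
    return drive, step_height
```

Unlike the learning loop, nothing here is remembered between calls. The adjustment is applied immediately and disappears as soon as the terrain ahead is flat and clear again.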

Gap learning — falling into a gap strengthens gap-avoidance behavior. Successfully crossing one earns a reward bonus. The brain learns that certain CPG parameter configurations handle terrain discontinuities better. Hit New Terrain to test generalization — learned parameters persist across terrain resets.
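One way to picture the reward signal behind this: a single score per evaluation cycle that pays for distance and stability, charges for falls, and treats gaps as a special case. The field names and every weight below are assumptions for illustration; the demo's real reward terms and magnitudes are not given here.

```python
from dataclasses import dataclass


@dataclass
class CycleStats:
    """What happened during one evaluation cycle (hypothetical fields)."""
    distance: float          # forward progress
    upright_fraction: float  # share of the cycle spent upright, 0..1
    falls: int               # ordinary falls
    gaps_crossed: int        # gaps cleared successfully
    gap_falls: int           # falls into a gap


def cycle_reward(s, w_dist=1.0, w_upright=50.0, fall_cost=100.0,
                 gap_bonus=40.0, gap_fall_cost=150.0):
    """Combine the cycle's stats into one scalar reward (all weights assumed)."""
    return (w_dist * s.distance
            + w_upright * s.upright_fraction
            - fall_cost * s.falls
            + gap_bonus * s.gaps_crossed
            - gap_fall_cost * s.gap_falls)
```

With these example weights, a clean 100-unit cycle that crosses one gap scores 190, while the same cycle ending in a gap fall scores 0. That difference is what pushes the parameter search toward configurations that handle discontinuities.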

Six parameters under brain control: tonic drive (speed), tau (rhythm speed), tau_v (adaptation time), w_rec (within-joint coupling), w_con (left-right coordination), w_ipsi (hip-knee coupling). Watch them drift as the brain searches for better gaits.
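To see what three of these knobs do, here is a minimal single-joint Matsuoka oscillator: two mutually inhibiting neurons with self-adaptation, integrated with forward Euler. It is a sketch under assumptions, not the walker's network. Treating w_rec as the flexor-extensor inhibition weight is an interpretation of "within-joint coupling", the adaptation gain `beta` and all numeric defaults are illustrative, and w_con and w_ipsi (coupling between joints and legs) are left out.

```python
def matsuoka_step(state, tonic, tau, tau_v, w_rec, beta=2.5, dt=0.001):
    """Advance one flexor/extensor neuron pair by one Euler step.

    state = (x1, v1, x2, v2): membrane state x and adaptation v for each neuron.
    """
    x1, v1, x2, v2 = state
    y1, y2 = max(0.0, x1), max(0.0, x2)  # firing rates are rectified
    dx1 = (-x1 - beta * v1 - w_rec * y2 + tonic) / tau
    dx2 = (-x2 - beta * v2 - w_rec * y1 + tonic) / tau
    dv1 = (-v1 + y1) / tau_v             # adaptation follows the firing rate
    dv2 = (-v2 + y2) / tau_v
    return (x1 + dt * dx1, v1 + dt * dv1, x2 + dt * dx2, v2 + dt * dv2)


def simulate(steps=10000, tonic=1.0, tau=0.1, tau_v=0.2, w_rec=2.5):
    """Run the pair and return the joint drive y1 - y2 at every step."""
    state = (0.1, 0.0, 0.0, 0.0)         # small asymmetry to break the tie
    out = []
    for _ in range(steps):
        state = matsuoka_step(state, tonic, tau, tau_v, w_rec)
        out.append(max(0.0, state[0]) - max(0.0, state[2]))
    return out
```

The output swings between positive (one muscle group active) and negative (the other). Scaling tau and tau_v together stretches or compresses the rhythm, and raising the tonic drive raises the amplitude, which is why small drifts in these values show up as visible changes in gait.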

Freeze the brain to lock parameters and watch the CPG walk with whatever the brain has found so far. This is the test: does the learned configuration generalize to new terrain, or does it overfit to what it adapted on?

The CPG Walker is the spinal cord. This is the whole nervous system. Genesis discovers walking through evolution. Cortex learns through synaptic plasticity. CPG embodies it through oscillatory architecture. Adaptive CPG tunes that architecture through experience. The spectrum deepens.