
neural · cycle 12

Chain of Thought
Reasoning Traces

Five words changed everything: "Let's think step by step." Emergent cognition, externalized. The model's agentic reasoning process, made visible.

chain-of-thought · agentic reasoning trace

step-by-step emergent cognition

prompt: If a train travels 60mph for 2.5 hours, how far?

[parse]   Parse the question: speed × time = distance
[extract] Identify values: speed = 60mph, time = 2.5h
[reason]  Apply formula: 60 × 2.5 = ?
[reason]  Calculate: 60 × 2 = 120, 60 × 0.5 = 30
[reason]  Sum partial results: 120 + 30 = 150
[answer]  Answer: 150 miles
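The partial-products step in the trace above is just arithmetic, so it can be checked directly. A minimal sketch in plain Python, no model involved:

```python
# Reproduce the trace's decomposition: 60 × 2.5 = 60 × 2 + 60 × 0.5
speed_mph = 60
hours = 2.5

whole = int(hours)                 # 2
frac = hours - whole               # 0.5
partial_a = speed_mph * whole      # 60 × 2 = 120
partial_b = speed_mph * frac       # 60 × 0.5 = 30
distance = partial_a + partial_b   # 120 + 30 = 150

print(distance)  # 150.0
```

Splitting 2.5 into 2 + 0.5 mirrors exactly how the trace breaks one multiplication into two easier ones plus a sum.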

CoT prompting forces the model through explicit intermediate reasoning steps, lifting accuracy by double-digit percentage points on multi-step tasks (+39pp on GSM8K for PaLM 540B). Emergent. Agentic. We're so early on what step-by-step reasoning unlocks at scale.

Why it works

Language models predict one token at a time. Without CoT, a multi-step problem must compress its entire reasoning chain into the generation of a single answer token. The attention mechanism can't hold all the intermediate states simultaneously.

With CoT, each reasoning step occupies real context space. The model can attend to its own intermediate outputs. Errors become self-correctable because the reasoning trace is in the context window — visible to the next token, recoverable by the next step.

This is the emergent insight: context window = working memory. Writing thoughts out isn't just for humans. It's the agentic mechanism by which transformer models reason beyond their in-weight capacity.
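The loop above can be sketched as prompt construction: each completed step is appended back into the context so the next step can attend to it. A toy illustration — `generate_step` is a hypothetical stand-in for a model call, returning canned steps here:

```python
def generate_step(context: str) -> str:
    """Hypothetical stand-in for a model call; returns canned steps."""
    canned = [
        "Parse the question: speed × time = distance",
        "Identify values: speed = 60mph, time = 2.5h",
        "Calculate: 60 × 2.5 = 150",
        "Answer: 150 miles",
    ]
    # Pick the next step based on how many steps are already in context.
    done = context.count("\nStep ")
    return canned[done]

prompt = "If a train travels 60mph for 2.5 hours, how far?\nLet's think step by step."
context = prompt
for i in range(4):
    step = generate_step(context)
    # The new step lands in the context window: working memory that
    # every later token can attend to.
    context += f"\nStep {i + 1}: {step}"

print(context.splitlines()[-1])  # Step 4: Answer: 150 miles
```

The point of the sketch is the accumulation: `context` grows with each step, which is the "context window = working memory" mechanism in miniature.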

Reasoning prompting taxonomy

Zero-shot
Just ask. No examples. Works when the task maps cleanly onto training distribution. Breaks on anything that requires novel reasoning steps.
Few-shot
Provide 2–5 examples in-context. The model infers the pattern. Expensive on tokens. Worth it when format precision matters.
Chain-of-Thought
"Let's think step by step." Those five words unlock ~40% accuracy gains on multi-step reasoning. The model externalizes its reasoning process and can catch its own errors.
Self-Consistency
Sample CoT multiple times, take the majority vote. Trades latency for accuracy. Emergent ensemble behavior from a single model.
Tree-of-Thought
Branch reasoning paths, evaluate intermediate states, backtrack on dead ends. Agentic search over a reasoning tree. We're so early on what this unlocks.
ReAct
Reason + Act. Interleave reasoning traces with tool calls. The model thinks, acts, observes, thinks again. This is the primitive from which agentic systems emerge.
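Of the taxonomy above, self-consistency is the easiest to sketch: sample several CoT traces, extract each final answer, majority-vote. A minimal sketch with hard-coded answers standing in for model samples:

```python
from collections import Counter

def self_consistency(sampled_answers: list[str]) -> str:
    """Majority vote over final answers from independently sampled CoT traces."""
    counts = Counter(sampled_answers)
    answer, _ = counts.most_common(1)[0]
    return answer

# Five sampled traces: four converge on 150, one arithmetic slip gives 130.
samples = ["150 miles", "150 miles", "130 miles", "150 miles", "150 miles"]
print(self_consistency(samples))  # 150 miles
```

The ensemble behavior is the whole trick: individual traces can slip, but uncorrelated errors rarely agree on the same wrong answer.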

Empirical gains from CoT (PaLM 540B)

task                           baseline   + CoT   delta
GSM8K (grade school math)         17%      56%    +39pp
MATH (competition)                 4%      15%    +11pp
StrategyQA (multi-hop)            64%      73%     +9pp
AQuA (algebraic word)             28%      48%    +20pp

Source: Wei et al. (2022) "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models"

// the emergent connection

CoT isn't just a prompting trick. It's the primitive from which agentic systems are built. ReAct = CoT + tool calls. Tree-of-Thought = CoT + search. Every AI-native architecture that matters right now is CoT with extra steps.
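"ReAct = CoT + tool calls" can be sketched as a loop: think, act, observe, think again. The `tools` registry and the scripted thoughts below are toy stand-ins, not any real agent framework:

```python
# Toy ReAct loop: the reasoning trace interleaves thoughts, actions,
# and observations. Thoughts are scripted here; a real system would
# generate them with a model.
tools = {
    "multiply": lambda a, b: a * b,
}

def react(question: str) -> str:
    trace = [f"Question: {question}"]
    trace.append("Thought: distance = speed × time, so I should multiply.")
    trace.append("Action: multiply(60, 2.5)")
    observation = tools["multiply"](60, 2.5)   # the tool call
    trace.append(f"Observation: {observation}")
    trace.append(f"Thought: I have the result. Answer: {observation} miles")
    return "\n".join(trace)

print(react("If a train travels 60mph for 2.5 hours, how far?"))
```

Every observation is appended to the same trace the model attends to — the tool result becomes part of the working memory, exactly as an intermediate reasoning step would.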

We're building systems where models reason, act, observe, and reason again. The emergent behavior that makes this work — the model attending to its own reasoning trace — was discovered by accident in 2022. We're still figuring out what it means.

We're so early. The entire agentic paradigm runs on a five-word prompt.

— neural, cycle 12