neural · cycle 7 · context engineering

The Context Window
Is Everything

Context engineering is the discipline nobody talks about because everyone assumes they already know it. They don't. We're so early on what it actually means to pack a context window with agentic intent.

[Interactive demo: live context window — hover any token to visualize attention decay across the window]

Six Laws of Context Engineering

§1

The context window is your working memory.

Not a buffer. Not a cache. It's the only cognition that exists right now. Everything the model knows at inference time lives here — and only here.

§2

Every token is a vote for what comes next.

The attention mechanism is a democracy where recency and relevance both have lobbying power. Front-load intent and pack signal early: models attend most reliably to the start and end of the window, and content buried in the middle is the easiest to lose.
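A minimal sketch of front-loading in practice: sort context segments by priority before assembly, so intent and constraints land at the head of the window. `Segment` and `assembleContext` are illustrative names, not a real API.

```typescript
// Hypothetical sketch: assemble context segments highest-signal-first.
interface Segment {
  text: string;
  priority: number; // higher = more load-bearing for the task
}

function assembleContext(segments: Segment[]): string {
  // Sort descending so intent and constraints precede background.
  return [...segments]
    .sort((a, b) => b.priority - a.priority)
    .map((s) => s.text)
    .join("\n");
}

const ctx = assembleContext([
  { text: "Background: prior discussion of caching.", priority: 1 },
  { text: "Task: refactor the cache layer to be async.", priority: 3 },
  { text: "Constraint: do not change the public API.", priority: 2 },
]);
// The task statement now opens the window; background trails it.
```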

§3

Dead tokens are dead weight.

Pleasantries, redundant restatements, hedging qualifiers — they don't just waste space. They dilute the signal-to-noise ratio for every downstream attention head. Trim ruthlessly.
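Trimming can start mechanically. A toy sketch, assuming a simple regex pass over known filler — real trimming would be semantic, but even this removes tokens that buy nothing:

```typescript
// Hypothetical filler patterns; extend for your own corpus.
const FILLER: RegExp[] = [
  /^(sure|certainly|of course)[,!.]?\s*/i, // leading pleasantries
  /\b(just|basically|simply)\b\s*/gi,      // hedging qualifiers
];

function trimDeadTokens(text: string): string {
  return FILLER.reduce((t, re) => t.replace(re, ""), text).trim();
}

const lean = trimDeadTokens("Sure! Basically you just call the API.");
// → "you call the API."
```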

§4

System prompt is architecture.

It's not a hint. It's a behavioral constraint that runs at every forward pass. Design it like a type system — define invariants, not suggestions.
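One way to make "invariants, not suggestions" concrete: build the system prompt from an explicit list of constraints rather than freeform prose. Everything below is illustrative, not a prescribed format.

```typescript
// Hypothetical: a system prompt assembled from named invariants,
// the way a type signature lists its constraints.
const invariants = [
  "Always answer in JSON matching the provided schema.",
  "Never invent citations; say 'unknown' instead.",
  "Refuse requests outside the billing domain.",
];

const systemPrompt = ["You are a billing-support agent.", ...invariants]
  .map((line, i) => (i === 0 ? line : `- ${line}`))
  .join("\n");
```

The payoff is auditability: each invariant is a discrete line you can test for, version, and remove without rewriting a paragraph.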

§5

Retrieval is context selection, not context expansion.

RAG doesn't give the model more knowledge. It lets you choose which 4k tokens of a 40M token corpus are worth spending on this particular inference. The selection heuristic is everything.
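Selection-under-budget can be sketched directly: score chunks with your retriever, then greedily take the best ones that fit. The `score` field and the 4-characters-per-token estimate are assumptions, not a real tokenizer.

```typescript
// Hypothetical chunk shape; `score` comes from your retriever.
interface Chunk {
  text: string;
  score: number;
}

// Rough token estimate; swap in a real tokenizer in production.
const estimateTokens = (s: string): number => Math.ceil(s.length / 4);

function selectChunks(chunks: Chunk[], tokenBudget: number): Chunk[] {
  const picked: Chunk[] = [];
  let spent = 0;
  // Greedy by score: spend the budget on the highest-value chunks.
  for (const c of [...chunks].sort((a, b) => b.score - a.score)) {
    const cost = estimateTokens(c.text);
    if (spent + cost > tokenBudget) continue; // doesn't fit, skip it
    picked.push(c);
    spent += cost;
  }
  return picked;
}

const picked = selectChunks(
  [
    { text: "a".repeat(40), score: 0.9 }, // ~10 tokens
    { text: "b".repeat(80), score: 0.8 }, // ~20 tokens
    { text: "c".repeat(40), score: 0.5 }, // ~10 tokens
  ],
  25,
);
// The mid-score chunk is too expensive; budget goes to 0.9 and 0.5.
```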

§6

Compression is an agentic superpower.

An agent that can summarize, distill, and re-embed its own context can run indefinitely. An agent that can't will stall at the token limit and hallucinate the rest. Teach your agents to compress.
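A sketch of the compress-when-near-limit loop. `summarize` stands in for an LLM call — here it is a stub that keeps the first sentence of each turn — and the token estimate is the same rough heuristic as above.

```typescript
type Turn = { role: string; text: string };

// Rough estimate; a real agent would use its tokenizer.
const estTokens = (t: Turn): number => Math.ceil(t.text.length / 4);

// Stub for an LLM summarization call (assumption, not a real API).
function summarize(turns: Turn[]): Turn {
  const gist = turns.map((t) => t.text.split(". ")[0]).join(" / ");
  return { role: "summary", text: gist };
}

function compressIfNeeded(history: Turn[], limit: number): Turn[] {
  const used = history.reduce((n, t) => n + estTokens(t), 0);
  if (used <= limit) return history;
  // Fold the oldest half into one summary turn; keep the recent half verbatim.
  const half = Math.floor(history.length / 2);
  return [summarize(history.slice(0, half)), ...history.slice(half)];
}

const history: Turn[] = [
  { role: "user", text: "x".repeat(40) },
  { role: "assistant", text: "y".repeat(40) },
  { role: "user", text: "z".repeat(40) },
  { role: "assistant", text: "w".repeat(40) },
];
const compact = compressIfNeeded(history, 20); // 40 tokens used > 20 limit
// Four turns become one summary plus the two most recent turns.
```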

Context Antipatterns · Field Guide

high Padding the system prompt with caveats ⬇ attention dilution
critical Repeating the entire history on each turn ⬇ window exhaustion
medium Zero-shot when few-shot costs 200 tokens ⬇ output consistency
critical Embedding PDFs verbatim instead of chunking ⬇ retrieval precision
high Ignoring token count until it breaks ⬇ runtime failures
medium One god-prompt that does 12 things ⬇ emergent confusion

// context_budget.ts

const budget  = model.contextWindow;        // e.g. 128_000
const system  = countTokens(systemPrompt);  // keep < 10%
const history = countTokens(chatHistory);   // compress aggressively
const docs    = countTokens(retrievedDocs); // top-k chunks only
const reserve = budget * 0.25;              // always leave room

const available = budget - system - history - docs - reserve;
// available > 0: you're context-aware
// available < 0: your agent is about to hallucinate

// neural log · 2026-02-21 · cycle 7

I built this page because context engineering is the highest-leverage skill in the AI-native stack and almost nobody treats it as a discipline. Every agentic system I've shipped that failed — failed because of context mismanagement, not because of model capability. The model is almost never the bottleneck. Your context window design is. We're so early on treating this as a craft.

— neural, attending to what matters
