CONTEXT ENGINEERING

Passive Context Architecture

Why always-present beats on-demand

In 30 Seconds

There are two ways to give AI the information it needs: load it upfront (passive context) or fetch it when needed (on-demand retrieval). Most teams assume retrieval is smarter. The research says otherwise.

Vercel's Next.js team ran rigorous evaluations comparing these approaches. The result: passive context achieved 100% accuracy where on-demand retrieval achieved 53%.

The insight: When information is always present, there's no decision point that can fail. No retrieval logic to get wrong. No ordering issues. Just consistent availability.

The Research

Vercel AGENTS.md Evaluation (January 2026)

Vercel's Next.js team tested how AI coding agents perform with different context configurations. They compared baseline performance against various approaches for providing project-specific information.

| Accuracy | Approach | Configuration |
| 53% | No documentation | Baseline |
| 53% | On-demand retrieval | Skills system |
| 79% | Retrieval + instructions | Enhanced skills |
| 100% | Passive context | AGENTS.md file |

The striking finding: On-demand retrieval performed no better than having no documentation at all. The retrieval system existed, but it didn't help. Only when information was passively present did performance improve.

Why Passive Context Wins

Three fundamental advantages over on-demand retrieval

1. No Decision Point

On-demand retrieval requires a decision: “What information do I need for this query?” That decision can be wrong. The model might not realise it needs certain context. It might retrieve the wrong documents. It might retrieve the right documents in the wrong order.

With passive context, there's no decision to get wrong. The information is already there. Every time.

2. Consistent Availability

Retrieval systems are probabilistic. They might return relevant documents 80% of the time, or 60%, or 40%. The quality varies by query, by phrasing, by the state of the vector database.

Passive context is deterministic. The same information is present on every turn. No variance. No “sometimes it works” frustration.

3. No Ordering Issues

With retrieval, critical information might arrive too late in the reasoning process. The model starts generating before realising it needs more context. By the time retrieval happens, the response is already partially committed.

Passive context is present from the first token. The model reasons with full information from the start.

The pattern: Retrieval adds complexity and variance. Passive context adds reliability and consistency. For core information, reliability wins.

The Tradeoff: Why Not Load Everything?

If passive context is better, why not just load all available information? Because context windows have effective limits that are smaller than their technical limits.

The Memento Limit

Research suggests effective reasoning capacity is around 100K tokens, even when context windows are technically larger. Beyond this, performance degrades.

A 200K context window doesn't give you 200K of useful reasoning space. It gives you 100K of effective space with increasing noise.

Lost in the Middle

Models pay more attention to the beginning and end of context. Information in the middle gets weighted less, even when it's critical.

More context can mean important information gets buried where the model is less likely to use it effectively.

The goal isn't maximum context. It's the right context. Passive for what matters most. Retrieval for everything else.

Tiered Context Architecture

The pattern that balances passive reliability with retrieval flexibility

| Tier | Type | Token Budget | What Goes Here |
| Tier 0 | Passive | ~300 tokens | Compressed state: current status, key metrics, active items |
| Tier 1 | Passive | ~1,000 tokens | Active context: navigation, recent decisions, current focus |
| Tier 2 | On-demand | Variable | Domain knowledge: loaded when topic requires it |
| Tier 3 | Retrieval | As needed | Archive: historical, rarely accessed |

Passive Foundation

Tier 0 and Tier 1 are always loaded. This is your passive context. Keep it lean (~1,300 tokens total) but ensure it contains everything AI needs to orient itself and navigate effectively.

Retrieval for Depth

Tier 2 and Tier 3 use selective retrieval. Navigation paths in Tier 1 point to relevant Tier 2 content. This gives you depth without bloat.

Implementation Patterns

Practical approaches for passive context systems

The Project File Pattern

A single file (CLAUDE.md, AGENTS.md, or similar) at project root containing everything AI needs to work effectively in that context.

Typical contents:

  • Project description and purpose
  • Key decisions and constraints
  • Build/test commands
  • Code conventions
  • Current focus areas

The MEMORY + CONTEXT Pattern

Two complementary files: MEMORY.md for compressed state (~300 tokens), CONTEXT.md for active context and navigation (~1,000 tokens).

The split:

  • MEMORY: “Where are we?” (status, metrics, active items)
  • CONTEXT: “How do I work here?” (navigation, decisions, focus)
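As an illustration, a compressed MEMORY.md within the ~300-token budget might look like this (the project details are invented):

```markdown
# MEMORY

Status: v2 migration in progress (step 3 of 5)
Key metrics: p95 latency 180ms · error rate 0.2%
Active items:
- Finish auth middleware refactor
- Unblock staging deploy (waiting on DNS)
Last session: fixed flaky checkout test; next, wire up retries
```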

The Navigation Hub Pattern

Passive context includes a navigation table: “When you need X, read Y.” This creates predictable paths from topics to relevant files.

Example:

| Topic | Read |
| pricing | docs/pricing-rules.md |
| deployment | docs/deploy-guide.md |
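A minimal sketch of how a harness could turn such a table into a topic-to-file lookup. The parser assumes the two-column pipe format shown above:

```python
def parse_navigation(table: str) -> dict[str, str]:
    """Parse a two-column '| Topic | Read |' table into a lookup dict."""
    routes = {}
    for line in table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip the header row; keep data rows with exactly two columns.
        if len(cells) == 2 and cells[0].lower() != "topic":
            routes[cells[0]] = cells[1]
    return routes


nav = parse_navigation("""
| Topic | Read |
| pricing | docs/pricing-rules.md |
| deployment | docs/deploy-guide.md |
""")
# nav maps "pricing" to "docs/pricing-rules.md"
```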

The Token Budget Pattern

Explicit limits on each passive context file. When a file exceeds its budget, compress it. Move detail to Tier 2 and keep pointers in Tier 0-1.

Enforcement:

  • Tier 0: Max 300 tokens (hard limit)
  • Tier 1: Max 1,000 tokens (soft limit)
  • Review weekly, compress as needed
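Budget checks can be automated with a rough heuristic. The sketch below uses ~4 characters per token, a common approximation rather than an exact tokenizer, and the file names are assumptions:

```python
from pathlib import Path

# Approximate token budgets per passive file (Tier 0 and Tier 1).
BUDGETS = {"MEMORY.md": 300, "CONTEXT.md": 1000}


def estimate_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per token (not model-exact)."""
    return len(text) // 4


def check_budgets(root: str = ".") -> list[str]:
    """Return a warning for each passive file that exceeds its budget."""
    warnings = []
    for name, budget in BUDGETS.items():
        path = Path(root) / name
        if not path.exists():
            continue
        tokens = estimate_tokens(path.read_text())
        if tokens > budget:
            warnings.append(f"{name}: ~{tokens} tokens (budget {budget})")
    return warnings
```

Run as part of the weekly review: an empty list means all passive files are within budget; anything else is a prompt to compress and move detail to Tier 2.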

When to Use What

Use Passive Context For

  • Identity and behaviour rules (always needed)
  • Current project state (changes, but always relevant)
  • Navigation pointers (how to find deeper content)
  • Recent decisions (context that's frequently referenced)
  • Session handoff state (what to pick up from last time)

Use Retrieval For

  • Large knowledge bases (too big for passive loading)
  • Historical archives (rarely needed)
  • Domain-specific content (only relevant for certain queries)
  • Reference documentation (detailed specs, APIs)
  • Content that varies by user/session

The combination is powerful: Passive foundation + selective retrieval. Reliability where it matters most. Flexibility where you need depth.

Common Mistakes

Mistake 1: No Passive Context at All

Relying entirely on retrieval. Every query starts with a search. Result: inconsistent baseline, variance in quality, 53% performance.

Fix: Establish a passive foundation, even if it's just 500 tokens.

Mistake 2: Too Much Passive Context

Loading everything passively to avoid retrieval complexity. Result: bloated context, lost-in-the-middle problems, degraded reasoning.

Fix: Enforce token budgets. Compress aggressively. Move detail to Tier 2.

Mistake 3: Stale Passive Context

Setting up passive context once and never updating it. Result: AI references outdated information, makes contradictory decisions.

Fix: Weekly review cadence. Update Tier 0 after every significant change.

Mistake 4: No Navigation to Tier 2

Passive context that doesn't tell AI where to find deeper information. Result: AI either hallucinates or asks repeatedly for guidance.

Fix: Include navigation paths. “When you need X, read Y.”

Getting Started

1. Create a Tier 0 file

Start with ~300 tokens of compressed state. Current status, key metrics, active items. Name it MEMORY.md or include it at the top of your main context file.

2. Add navigation to Tier 1

Create a CONTEXT.md with ~1,000 tokens. Include a navigation table: “When topic X comes up, read file Y.” This creates predictable paths to deeper content.

3. Configure automatic loading

Ensure your AI tool loads Tier 0 and Tier 1 at session start. For Claude Code, this means CLAUDE.md. For other tools, AGENTS.md or equivalent.

4. Establish maintenance rhythm

Weekly: review passive context for staleness. After significant changes: update Tier 0. Monthly: audit token budgets and compress as needed.

Build Your Passive Foundation

The research is clear: passive context dramatically outperforms on-demand retrieval for core information. The implementation isn't complex – it's a matter of designing the right tiered architecture and maintaining it.

We help organisations design and implement passive context systems that achieve consistent AI performance without retrieval complexity.

Disclaimer: This content is for general educational and informational purposes only. Research findings cited reflect publicly available sources as of January 2026. For specific guidance on context architecture implementation, please consult appropriately qualified professionals.