CCA-F Part 6: Context Management and Reliability (15%)

Context windows, compaction, scratchpad files, subagents for token-heavy reads, bounded retries, idempotency keys, and the production escalation rule.

By Mohamed AL-Kaisi

Part 6 of 7 of Claude Certified Architect — Foundations: a complete self-study tutorial. See all parts.

Domain 5: Context Management & Reliability (15%)

The smallest domain by weight, but a frequent source of subtle wrong answers.

Context windows

The current Claude family (June 2026):

| Model | Context window | Max output |
|---|---|---|
| Claude Opus 4.8 | 1M tokens (200k on Microsoft Foundry) | 128k tokens |
| Claude Sonnet 4.6 | 1M tokens | 64k tokens |
| Claude Haiku 4.5 | 200k tokens | 64k tokens |

Claude Fable 5 and Claude Mythos 5 also have 1M-token windows and were released 9 June 2026.

When a session approaches the context limit, Claude Code automatically compacts prior messages into a summary. Anything from before the compaction point is gone unless it is preserved in a tool call result you still have, a file you saved, or the summary itself.
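To make the idea concrete, here is a toy sketch of compaction. It is not how Claude Code implements it: the `Message` type, the 4-characters-per-token estimate, the `summary` role label, and the placeholder summarizer are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    return max(1, len(text) // 4)


def compact(messages, limit, keep_recent=2, summarize=None):
    """Fold older messages into one summary once the estimate exceeds `limit`."""
    total = sum(estimate_tokens(m.text) for m in messages)
    if total <= limit or len(messages) <= keep_recent:
        return messages  # still fits, nothing to compact
    split = len(messages) - keep_recent
    old, recent = messages[:split], messages[split:]
    if summarize is None:
        # Placeholder summarizer: first 40 characters of each old message.
        # A real system would ask a model to write the summary.
        summarize = lambda msgs: " | ".join(m.text[:40] for m in msgs)
    summary = Message("summary", "Summary of earlier turns: " + summarize(old))
    # Everything in `old` is now represented only by the summary text.
    return [summary] + recent
```

The point to notice is the last line: once `old` is replaced by the summary, any detail the summarizer left out can only be recovered from somewhere outside the conversation.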

Preserving critical information

Three patterns the exam favours:

  1. Scratchpad file. Have the agent write important findings to a file (for example SCRATCH.md or NOTES.md). The file survives compaction, and the agent can reload it as needed.
  2. Subagents for token-heavy reads. The subagent reads 200 files and returns a 500-word summary. The parent's context never holds the 200 files.
  3. Explicit "memory" tool calls. Save and retrieve facts via a tool that writes to durable storage (a JSON file, a vector DB, an MCP server). On critical workflows, do this rather than trust the context window.
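Patterns 1 and 3 can be sketched with nothing but the standard library. This is a minimal illustration, not a specific product's memory API: the function names, the `JsonMemory` class, and the file layout are hypothetical.

```python
import json
from pathlib import Path


def append_finding(path: Path, finding: str) -> None:
    """Append one finding to the scratchpad as a Markdown bullet."""
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- {finding}\n")


def reload_findings(path: Path) -> list[str]:
    """Read findings back, e.g. after compaction dropped them from context."""
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line[2:] for line in lines if line.startswith("- ")]


class JsonMemory:
    """Hypothetical memory-tool backend: key/value facts in a JSON file."""

    def __init__(self, path: Path):
        self.path = path

    def _load(self) -> dict:
        if self.path.exists():
            return json.loads(self.path.read_text(encoding="utf-8"))
        return {}

    def save(self, key: str, value: str) -> None:
        data = self._load()
        data[key] = value
        self.path.write_text(json.dumps(data, indent=2), encoding="utf-8")

    def retrieve(self, key: str, default=None):
        # Always re-read from disk so the answer never depends on context.
        return self._load().get(key, default)
```

In an agent, `save` and `retrieve` would be exposed as tools; because every call goes to disk, the stored facts are unaffected by whatever happens to the conversation history.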

Reliability patterns

The exam expects you to choose the production answer, which usually means:

  • Cap retries (typically 2 or 3), then escalate. Never run an unbounded retry loop.
  • Idempotent operations, especially for tools that write or send. The exam loves "the agent retried after a network timeout but the email was sent twice: what is the fix?" Answer: an idempotency key on the tool call, so a repeated call with the same key is not executed a second time.
  • Multi-pass review. For anything user-facing, draft → validate → optionally rewrite → emit.
  • Graceful partial failure. When a subagent fails, the coordinator should proceed with what it has and flag the confidence drop rather than abandon the entire workflow.
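The first two bullets combine naturally, and a small sketch shows why the idempotency key matters. Everything here is hypothetical: `EmailTool` stands in for a real send tool and simulates a timeout that occurs after delivery, which is exactly the case where a naive retry sends twice.

```python
import uuid


class EscalationRequired(Exception):
    """Raised when the retry budget is exhausted and a human should take over."""


class EmailTool:
    """Hypothetical send tool that deduplicates on an idempotency key."""

    def __init__(self, fail_first_n: int = 0):
        self.sent = {}  # idempotency key -> message actually delivered
        self._failures_left = fail_first_n

    def send(self, key: str, to: str, body: str) -> str:
        if key in self.sent:
            return "duplicate-ignored"  # same logical operation, do nothing
        self.sent[key] = (to, body)  # the side effect happens here
        if self._failures_left > 0:
            # Simulate a timeout AFTER delivery: the caller cannot tell it worked.
            self._failures_left -= 1
            raise TimeoutError("no response from mail server")
        return "sent"


def send_with_retries(tool, to, body, max_attempts=3):
    key = str(uuid.uuid4())  # one key per logical operation, reused on every retry
    last_error = None
    for _ in range(max_attempts):
        try:
            return tool.send(key, to, body)
        except TimeoutError as exc:
            last_error = exc
    # Retry budget spent: stop and hand off instead of looping forever.
    raise EscalationRequired(f"gave up after {max_attempts} attempts") from last_error
```

The key is generated once, outside the loop. Generating a fresh key per attempt would defeat the deduplication and bring back the double send.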

The escalation rule

When confidence is low or the request falls into a sensitive category, hand off rather than guess. The wrong exam answers are usually variants of "make Claude more confident": increase temperature, add more examples, switch to Opus. The right answer is usually "this work shouldn't be done by the model; escalate."
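As a rough sketch, the rule is a routing decision made before the answer is emitted. The category names and the 0.7 threshold below are invented for illustration; the article does not specify either, and how a confidence score is obtained is outside the scope of this example.

```python
from dataclasses import dataclass

# Hypothetical values: real categories and thresholds are a policy decision.
SENSITIVE_CATEGORIES = {"legal", "medical", "refund_dispute"}
CONFIDENCE_FLOOR = 0.7


@dataclass
class Draft:
    category: str
    confidence: float
    answer: str


def route(draft: Draft) -> str:
    """Return "escalate" or "answer" for a drafted response."""
    if draft.category in SENSITIVE_CATEGORIES:
        return "escalate"  # sensitive work is handed off regardless of confidence
    if draft.confidence < CONFIDENCE_FLOOR:
        return "escalate"  # low confidence: hand off rather than guess
    return "answer"
```

Note that the sensitive-category check comes first and ignores confidence entirely, which mirrors the point above: raising the model's confidence does not make sensitive work appropriate for it.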




Written by

Mohamed AL-Kaisi

Editor-in-chief of the Data & AI Hub.