Skip to content
Data AI Starter

TUTORIALS

CCA-F Part 4: Prompt Engineering and Structured Output (20%)

Specificity over wishes, few-shot examples, tool_use schemas, validation-retry loops, anti-hallucination patterns, and when to use the Message Batches API.

By Mohamed AL-Kaisi 3 min read 2 views

CCA-F Part 4: Prompt Engineering and Structured Output (20%)

Part 4 of 7 of Claude Certified Architect — Foundations: a complete self-study tutorial. See all parts.

Domain 3: Prompt Engineering & Structured Output (20%)

This domain tests the basics of getting Claude to do what you mean, with a strong bias towards production patterns (validation loops, batch processing, anti-hallucination).

The four levers

  1. Be specific. "Be careful with dates" is not a rule; it is a wish. "If a date is ambiguous (e.g., 03/04/24), output null and add a note explaining which interpretations are possible" is a rule.
  2. Show examples. Few-shot — 2 to 5 input/output pairs covering the edge cases — beats almost any amount of prose explanation.
  3. Decompose. Hard requests should be broken into sub-prompts. Either chained in sequence (each output feeds the next), or processed in parallel by subagents.
  4. Use structured output. Either an explicit tool_use schema or a request like "Reply only with a JSON object matching the following schema." Validate the JSON before trusting it.

Structured output via tool_use

The most reliable way to extract structured data is to define the schema as a tool and tell Claude to call it:

{
  "name": "extract_invoice",
  "description": "Save the extracted invoice fields.",
  "input_schema": {
    "type": "object",
    "properties": {
      "invoice_number": {"type": "string"},
      "total_amount": {"type": ["number", "null"]},
      "currency": {"type": ["string", "null"], "enum": ["GBP","USD","EUR", null]},
      "date": {"type": ["string", "null"], "format": "date"}
    },
    "required": ["invoice_number", "total_amount", "currency", "date"]
  }
}

Then in the system prompt: "Call extract_invoice exactly once with every required field. Use null for any field you cannot find verbatim in the document. Do not infer or estimate."

Anti-hallucination patterns

Claude hallucinates structured fields most often when it feels "obliged" to fill something in. Three reliable fixes:

  1. Nullable fields with explicit instruction: "Output null if the field is not present verbatim."
  2. Few-shot examples of the null case: show a document that does not contain a date, and the correct output containing "date": null.
  3. Validation-retry loop: validate the JSON against your schema; if invalid, send back a tool_result containing the validation errors and ask for a corrected call. Cap at 2 retries.

Batch processing

For large extraction jobs (hundreds to millions of documents), use the Message Batches API:

  • Submit a batch of up to 100,000 requests per batch, up to 256 MB total.
  • Results land within 24 hours — usually within minutes.
  • Cost is 50% cheaper than the synchronous API.
  • Use for any job where you do not need real-time response.

Anti-pattern: looping the synchronous API with a for loop and sleep(). The exam tests this.


<a id="domain-4"></a>


← Part 3 of 7: Claude Code Configuration and Workflows (20%) · All parts · Part 5 of 7: Tool Design and MCP Integration (18%) →

Tags LLMs Prompt Engineering
M

Written by

Mohamed AL-Kaisi

Editor-in-chief of the Data & AI Hub.