The heart of prompt engineering: the 9 chapters of Anthropic's tutorial, the practical version.

Basic prompt structure

A call to Claude is made of three building blocks, only one of which is mandatory:

System prompt (optional): who Claude is, its role, its rules. The permanent set.
User / Assistant messages: the conversation, alternating. The User message is mandatory.
Claude's answer completes the last turn.

The founding principle behind everything else: recent models follow explicit instructions, they do not guess. If you want a format, a length, a tone, a language: say it. Do not expect the model to infer your intent.

Bad: "tell me about Sept-Tools tools". Better: "List the 5 Sept-Tools product families, one sentence each, in English, no introduction."

Key points

3 blocks: system (optional), User/Assistant messages, answer
Recent models execute the explicit, they do not guess
State the format, length, tone, language: always

Being clear and direct

Treat Claude like a brilliant new hire with no context: capable, but who knows neither your company, nor your conventions, nor what you have in mind. Everything you don't say, it has to guess, and it will sometimes guess wrong.

Three concrete levers:

Give the task context: what the result is for, who will read it, in what setting.
Be sequential: if order matters, number the steps.
Say what you want, not what you don't want when possible: "answer in 3 sentences" beats "don't be too long".

Unbeatable mental test: show your prompt to a colleague. If they need to ask a question to execute, so does the model. Fill the gap.

Key points

Claude = a brilliant hire with zero implicit context
Give the why, the reader, the setting
The colleague test: if they would ask a question, your prompt has a gap

Assigning a role (system prompt)

Giving Claude a role through the system prompt changes the baseline quality and tone. "You are an application security expert auditing PHP WooCommerce code" steers all the vocabulary, level of detail and priorities of the answers that follow.

Why it works: you move the model toward the region of language where the good experts live. A well-chosen role is often worth several paragraphs of instructions.

Best practices:

Put the role in the system prompt, not buried in the user message.
Be specific: "technical writer for a non-technical safety audience" beats "write well".
Combine role + mission + constraints: "You are X. Your mission is Y. Rules: Z."

This is exactly what Pierre's Frankenstein mode does, but taken to the extreme: a composed system prompt that assembles several expert personalities. We will dissect it in module 6.

Key points

A role steers tone, vocabulary and priorities
Role in the system prompt, specific, combined with mission and constraints
A good role replaces paragraphs of instructions

Separating data from instructions (XML)

When your prompt mixes your instructions and the data to process, the model confuses the two: it takes a piece of text to analyze for an instruction, or the reverse. The canonical fix with Claude: XML tags.

Claude is specifically trained to respect tags like <document>, <instructions>, <example>. They create clean boundaries.

<instructions>
Summarize the document below in 3 bullets, for a busy executive.
</instructions>

<document>
{ ... the long text to summarize, pasted here ... }
</document>

Benefits: zero data/instruction confusion, more reliable answers, and you can reference the tags ("in the <document>, find..."). This is THE reflex for any prompt that contains a big block of text, a transcript, code, or pasted data.

Key points

Mixing instructions and data = model confusion
XML tags (Claude is trained on them): clean boundaries
Reflex whenever there is a big block of data to process

Formatting output and speaking for Claude (prefill)

Two techniques to control the exact shape of the answer.

1. Specify the output format. Ask explicitly: JSON, markdown table, list, tags. For robust JSON, describe the expected schema and say "return only the JSON, no surrounding text".

2. Prefill (speaking for Claude). In the API, you can start the Assistant message yourself. Claude will continue from there. It is the most reliable way to force a format.

User: Give me the 3 risks, as JSON.
Assistant: {          <-- you prefill this brace

By starting the answer with { or [, you guarantee Claude does not add "Sure, here is..." before the JSON. Prefilling <analysis> forces the answer to start inside that tag. Heavily used for automated extraction and to short-circuit preambles.

Key points

Specify the output format (JSON, table, tags) explicitly
Prefill: seed the Assistant message to force the shape
Prefilling { or [ kills 'Sure, here is...' preambles

Precognition: thinking step by step

On a task that requires reasoning (math, logic, a multi-criteria decision, debugging), asking for the direct answer often yields a wrong one. Asking Claude to reason first, conclude after markedly improves accuracy. This is chain-of-thought.

Why: the model produces text sequentially. Letting it "lay down its reasoning" literally gives it intermediate steps to lean on. Without that room, it has to guess the answer in one shot.

Three ways to trigger it:

Direct: "Think step by step before answering."
Structured: "First, in <reasoning>, analyze. Then, in <answer>, conclude."
Extended thinking: recent models have an extended thinking mode you can trigger (a "thinking" budget before the answer).

Practical key: if you then want to parse the answer, isolate the reasoning in a tag so you keep only the conclusion.

Key points

Reasoning before concluding improves accuracy (chain-of-thought)
Written reasoning acts as scaffolding for the model
Isolate the reasoning in a tag if you must parse the conclusion

Using examples (few-shot)

Showing beats explaining. Giving 2 to 5 examples of input -> output (few-shot) is often the fastest way to lock a format, a tone, a level of detail, a naming convention.

The model imitates the pattern. If your examples are terse minimal JSON, the answers follow. If your examples ramble, the answers ramble.

Rules:

Pick representative examples, including an edge case if the pitfall is known.
Be consistent: same format in every example, otherwise you teach inconsistency.
Wrap them in <example> tags to distinguish them from the real task.

Few-shot + role + output format = the combo that solves most extraction and transformation tasks with no other trick.

Key points

2 to 5 input->output examples lock format and tone
The model imitates the pattern: terse examples => terse answers
Absolute consistency between examples; an edge case if useful

Avoiding hallucinations

Hallucination is not a rare bug, it is a consequence of how it works (lesson 1.1). You don't eliminate it, you constrain it. Toolbox:

Give an escape hatch: "If the information is not in the document, answer exactly: Unknown." Without it, the model fills the gap by inventing.
Ground in sources: provide the reference text and ask it to quote the exact passage that justifies each claim.
Ask for evidence first: "Before answering, extract the relevant quotes, then answer based only on them."
Search the web for any fact at or after the cutoff, rather than trusting training memory.
Lower the temperature for factual tasks.

The golden rule from Pierre's practice: for a factual question about the present, searching the web beats the model's memory. It is written into his Frankenstein prompt.

Key points

You don't remove hallucination, you constrain it
Always offer an 'Unknown' exit to avoid invention
Ground in sources + quote + search the web for facts

Composing complex prompts (real cases)

Real production prompts stack the previous techniques in a stable order. A skeleton that works for most serious cases:

1. ROLE         -> "You are a ..."
2. CONTEXT      -> what the result is for, who reads it
3. DATA         -> <document> ... </document> (tags)
4. INSTRUCTIONS -> the task, step by step
5. EXAMPLES     -> <example> input -> output </example>
6. THINKING     -> "reason inside <analysis> first"
7. FORMAT       -> exact structure of the output
8. GUARDRAILS   -> "if X is missing, answer Unknown"

You don't need all 8 every time. Anthropic's official advice: only add an advanced technique when it solves a concrete problem. The best prompt is neither the longest nor the most sophisticated, it is the clearest for the task.

Working method: start simple, watch what fails, and add the technique that repairs that precise failure. Iterate. A production prompt is something you refine, not something you guess right on the first try.

Key points

Stable order: role, context, data, instructions, examples, thinking, format, guardrails
Add a technique only when it repairs a concrete failure
The best prompt = the clearest, not the longest

Extended thinking vs chain of thought

When you ask Claude a hard question, you can influence how much reasoning it shows you. The two main modes are chain of thought (CoT) and extended thinking.

Chain of thought means asking Claude to reason step by step inside its reply. You trigger it with a simple instruction: "Think step by step before answering." Claude writes its reasoning as visible text, then gives the answer. This works in Claude.ai chat and in any API call without special flags.

Extended thinking is a model-level feature available on Opus and Sonnet. When activated, the model runs an internal scratchpad (called a thinking block) before producing the final reply. You see a summary of that scratchpad. Extended thinking uses more tokens (and costs more) but tackles problems where surface-level CoT is not enough, such as multi-step math, code debugging, or legal analysis.

Chain of thought: you prompt it, free, works everywhere, reasoning is part of the reply.
Extended thinking (API): set thinking: { type: "enabled", budget_tokens: N } in the API call; only on claude-opus-4-8 and claude-sonnet-4-6.
Extended thinking (Claude.ai): toggle the "Extended thinking" switch before sending your message; not available on all plans.
When to use CoT: explanations, planning, writing outlines, any everyday task.
When to use extended thinking: proofs, complex debugging, strategy with many trade-offs, anywhere you need the model to explore dead ends privately before committing.

Key points

Chain of thought: prompt Claude to reason step by step in its reply.
Extended thinking: internal scratchpad activated at the model level, not just by prompt.
Extended thinking costs more tokens; use it for genuinely complex problems.
Both modes improve accuracy, but extended thinking runs deeper exploration before answering.

Getting clean JSON out

JSON (JavaScript Object Notation) is a text format for structured data: keys and values, lists, nested objects. Many tools, apps, and automations expect JSON rather than plain prose. Claude can produce it reliably if you ask precisely.

The three levers are: explicit schema (tell Claude the exact fields and types you want), prefill (start Claude's reply for it, so it is forced to continue in that format), and validation (check the output before trusting it). Using all three together makes structured output nearly bulletproof.

Key techniques:

Schema in the prompt: paste or describe the exact JSON structure you need, including field names, value types, and required vs optional fields.
Prefill: in the API, put {"role":"assistant","content":"{"} as the last message so Claude must continue with valid JSON.
Strict instruction: add "Output raw JSON only. No prose before or after. No markdown code fences." to the system prompt.
Validation: parse the output with JSON.parse() or a schema validator like Zod or Pydantic before using it downstream.

In Claude.ai chat you cannot use the API prefill trick, but a clear schema plus the strict instruction still works well. For automation or production use, the API with prefill is the safest path.

Key points

Give Claude the exact field names and types in the prompt
Use prefill in the API to force JSON continuation
Instruct: raw JSON only, no prose, no code fences
Always validate with JSON.parse or a schema library before using the output

Delimiters beyond XML

XML tags like <context> are one way to separate parts of a prompt, but they are not the only way. Delimiters are any markers you use to tell Claude where one section ends and another begins. Choosing the right delimiter makes your prompt easier to read and harder for Claude to misinterpret.

Three families of delimiters work well in practice:

Markdown headings (##, ###): good for labeling named sections such as "Background", "Task", or "Constraints". Claude is trained on vast amounts of Markdown, so headings feel natural to it.
Fenced code blocks (triple backticks, written as three grave-accent characters): ideal when you paste code, JSON, CSV, or any text that must be treated literally, not interpreted as instructions.
Explicit marker lines like --- START OF CONTRACT --- and --- END OF CONTRACT ---: useful when you need clear, human-readable boundaries around a long document you are inserting into the prompt.

The key principle is consistency: pick one delimiter style per role (headings for sections, fences for raw data, markers for pasted documents) and use it throughout the prompt. Mixing styles at random confuses both Claude and you.

Key points

Delimiters signal where each prompt section starts and ends
Markdown headings label named sections cleanly
Fenced code blocks protect literal content from being read as instructions
Explicit marker lines wrap long pasted documents

Say what to do, not only what to avoid

When you tell Claude what not to do, you leave the space of allowed actions wide open. Claude has to guess what you actually want, and it often guesses wrong. A positive instruction (telling Claude exactly what to do) gives it a clear target and produces far better results.

Think of it like giving directions. "Don't go the wrong way" is useless. "Turn left on Main Street, then right on Oak Avenue" is actionable. The same logic applies to prompts.

Common patterns where positive instructions win:

Tone: Instead of "don't be too formal," write "use a friendly, conversational tone."
Length: Instead of "don't write too much," write "keep your answer under 100 words."
Format: Instead of "don't use bullet points," write "write in flowing prose, one paragraph."
Scope: Instead of "don't go off-topic," write "focus only on pricing and payment options."

You can still use negatives when a specific exclusion genuinely matters, for example "do not include any code examples." But lead with what you want, and add exclusions only when they prevent a known mistake.

Key points

Positive instructions give Claude a clear target
Negative-only prompts leave too much room for guessing
Replace 'don't X' with the behavior you actually want
Use exclusions sparingly, after stating the main goal

The refinement loop

A refinement loop means treating a conversation as a series of small steering corrections rather than a single perfect request. Instead of crafting one long prompt and starting over when the result is wrong, you send a short follow-up that targets exactly what needs to change.

This works because Claude holds the full conversation in its context window (the memory it has of everything said so far). Each follow-up inherits all prior context, so you only need to describe the delta, the specific change you want, not repeat the whole brief.

Effective steering messages share a few traits:

Pinpoint the problem: "The third paragraph is too formal" beats "rewrite it."
State the direction: "Make it shorter" or "Add a concrete example here."
Confirm what to keep: "Keep the bullet list, only change the introduction."
Ask for one change at a time: stacking many corrections in one message risks losing some of them.

When a thread drifts too far off track, it is fine to start a new conversation. But exhaust the refinement loop first: most problems are one or two targeted corrections away from a good result.

Key points

Refinement loop: iterate with small follow-ups, do not restart every time
Context window carries all prior turns so you only describe what changes
Pinpoint one problem per correction for the most reliable result
Start fresh only when the whole direction has changed

Reusable prompt templates

A prompt template is a prompt with named placeholders that you swap out each time you reuse it. Instead of rewriting the same instruction from scratch, you write it once, mark the parts that change with a label like [TOPIC] or {{LANGUAGE}}, and fill those slots in whenever you need it.

Templates save time and keep quality consistent. If you craft a prompt that produces a great result, a template locks in that quality so every future run starts from the same strong base. They are especially useful for tasks you repeat weekly, such as summarising meeting notes, translating product descriptions, or drafting reply emails.

A good template has three parts:

Fixed instructions: the context and rules that never change (for example, "You are a concise business writer").
Placeholders: clearly labelled slots for the variable parts, written in brackets or braces so they stand out visually.
Output format: tells Claude exactly how to structure the answer (bullet list, numbered steps, a table, and so on).

You can store templates as plain text files, notes, or even inside a spreadsheet column. When you need one, copy it, fill in the slots, and paste the completed prompt into Claude.ai, Claude Code, or any other interface.

Key points

A template fixes the instructions and marks the variable parts as placeholders.
Use consistent placeholder notation such as [BRACKETS] or {{BRACES}} so slots are easy to spot.
Templates lock in prompt quality and cut repetition for recurring tasks.
Always specify the output format inside the template to get predictable results.

Question answering over a long document

When you hand Claude a long document (a contract, a report, a manual), the order of your message matters. Put the document first, then your question. Claude reads top-to-bottom, so the context is fully loaded before it processes your request. Burying the document after a long preamble forces Claude to re-interpret the question against text it has not yet seen.

Ask precise questions. Vague prompts like "summarize this" return vague answers. Narrow questions like "What is the termination notice period in Section 12?" return specific, verifiable answers. Precision also makes it easier to catch a wrong answer.

Ask Claude to quote before answering. This technique (sometimes called "cite then conclude") forces the model to anchor its answer in the actual text rather than paraphrase from memory. If the quote does not exist in the document, that is a signal the answer may be hallucinated (invented with false confidence).

Paste the document at the top of your message.
Follow it with a clear, narrow question.
Add the instruction: "Quote the relevant sentence(s) first, then answer."
Cross-check the quote against the source before acting on the answer.

Key points

Document first, question second
Precise questions get precise answers
Quote before answering reduces hallucination
Always verify the quoted passage in the source

Prompting across languages

Claude reads and writes in over 50 languages without any setup. You can write your prompt in French, receive the answer in German, then switch to Spanish mid-conversation. The model detects your language and mirrors it by default, so the simplest rule is: write in the language you want the answer in.

The challenge is not fluency but terminology consistency: the same technical term should appear in the same form every time. If you call a product a "ponceuse orbitale" in one message and "sander orbital" in the next, Claude may treat them as different things and produce inconsistent output. Pin your vocabulary early and repeat it exactly.

Three techniques help you stay consistent across languages:

Glossary block: open your first message with a short list of fixed terms, for example Glossaire: ponceuse orbitale = orbital sander = Orbitalschleifer, then reference those labels throughout.
Language instruction: add an explicit instruction such as Always answer in English, even if I write in French. Claude follows it reliably.
Output language flag in Claude Code: when running the CLI (the command-line tool you use in a terminal), you can prime the session with a system prompt via
```
claude --system "Reply only in Spanish, keep all code comments in English."
```
This separates the human language from the code language.

Mixed-language projects, for example a French-speaking team building an English codebase, benefit most from that last technique. Keep code and comments in English (the universal standard for source code) while all explanations arrive in the team language.

Key points

Claude mirrors the language of your prompt by default
Terminology consistency matters more than grammar
A glossary block anchors vocabulary across the whole conversation
In Claude Code, use --system to set a language rule for the session

Summaries that keep what matters

There are two fundamentally different ways to summarize a text. Extractive summarization lifts exact sentences or phrases from the original and stitches them together. Abstractive summarization rephrases the ideas in new words, the way a human would explain a document to a colleague. Claude does abstractive summarization by default, which means it can condense, reorder, and clarify rather than just copy.

Without guidance, Claude picks its own length and focus. A short article might get three sentences; a long legal document might get three paragraphs. To take control, you need to be explicit about two things: length (word count, bullet count, or number of sentences) and focus (which topics, audiences, or use cases matter most).

Common focus controls you can combine in a single prompt:

Audience: "for a non-technical executive" vs "for an engineer reviewing the API"
Output format: "three bullet points", "one sentence", "a tweet-length caption"
Angle: "emphasizing risks only", "action items only", "financial impact only"
Preservation rule: "keep all numbers and dates verbatim" (extractive anchor inside an abstractive summary)

Mixing extractive anchors with abstractive prose is the most reliable technique for high-stakes summaries: Claude rewrites the narrative but quotes exact figures, names, and dates so nothing is accidentally paraphrased into an error.

Key points

Extractive: copies exact text; abstractive: rephrases in new words
State length explicitly (word count, sentence count, bullet count)
State focus explicitly (audience, angle, format)
Use extractive anchors for numbers and dates to prevent paraphrase errors

Talking to Claude

Basic prompt structure

Being clear and direct

Assigning a role (system prompt)

Separating data from instructions (XML)

Formatting output and speaking for Claude (prefill)

Precognition: thinking step by step

Using examples (few-shot)

Avoiding hallucinations

Composing complex prompts (real cases)

Extended thinking vs chain of thought

Getting clean JSON out

Delimiters beyond XML

Say what to do, not only what to avoid

The refinement loop

Reusable prompt templates

Question answering over a long document

Prompting across languages

Summaries that keep what matters

Need this level of execution on your project?

Inspired by 0xloucash