The Claude Bible
Home / Talking to Claude
Level: Beginner · 18 lessons

Talking to Claude

The heart of prompt engineering: the 9 chapters of Anthropic's tutorial, the practical version.

Open the interactive course212 lessons, quizzes, exercises, 3 languages, free.

Basic prompt structure

A call to Claude is made of three building blocks, only one of which is mandatory:

The founding principle behind everything else: recent models follow explicit instructions, they do not guess. If you want a format, a length, a tone, a language: say it. Do not expect the model to infer your intent.

Bad: "tell me about Sept-Tools tools". Better: "List the 5 Sept-Tools product families, one sentence each, in English, no introduction."

Key points
  • 3 blocks: system (optional), User/Assistant messages, answer
  • Recent models execute the explicit, they do not guess
  • State the format, length, tone, language: always

Being clear and direct

Treat Claude like a brilliant new hire with no context: capable, but who knows neither your company, nor your conventions, nor what you have in mind. Everything you don't say, it has to guess, and it will sometimes guess wrong.

Three concrete levers:

Unbeatable mental test: show your prompt to a colleague. If they need to ask a question to execute, so does the model. Fill the gap.

Key points
  • Claude = a brilliant hire with zero implicit context
  • Give the why, the reader, the setting
  • The colleague test: if they would ask a question, your prompt has a gap

Assigning a role (system prompt)

Giving Claude a role through the system prompt changes the baseline quality and tone. "You are an application security expert auditing PHP WooCommerce code" steers all the vocabulary, level of detail and priorities of the answers that follow.

Why it works: you move the model toward the region of language where the good experts live. A well-chosen role is often worth several paragraphs of instructions.

Best practices:

This is exactly what Pierre's Frankenstein mode does, but taken to the extreme: a composed system prompt that assembles several expert personalities. We will dissect it in module 6.

Key points
  • A role steers tone, vocabulary and priorities
  • Role in the system prompt, specific, combined with mission and constraints
  • A good role replaces paragraphs of instructions

Separating data from instructions (XML)

When your prompt mixes your instructions and the data to process, the model confuses the two: it takes a piece of text to analyze for an instruction, or the reverse. The canonical fix with Claude: XML tags.

Claude is specifically trained to respect tags like <document>, <instructions>, <example>. They create clean boundaries.

<instructions>
Summarize the document below in 3 bullets, for a busy executive.
</instructions>

<document>
{ ... the long text to summarize, pasted here ... }
</document>

Benefits: zero data/instruction confusion, more reliable answers, and you can reference the tags ("in the <document>, find..."). This is THE reflex for any prompt that contains a big block of text, a transcript, code, or pasted data.

Key points
  • Mixing instructions and data = model confusion
  • XML tags (Claude is trained on them): clean boundaries
  • Reflex whenever there is a big block of data to process

Formatting output and speaking for Claude (prefill)

Two techniques to control the exact shape of the answer.

1. Specify the output format. Ask explicitly: JSON, markdown table, list, tags. For robust JSON, describe the expected schema and say "return only the JSON, no surrounding text".

2. Prefill (speaking for Claude). In the API, you can start the Assistant message yourself. Claude will continue from there. It is the most reliable way to force a format.

User: Give me the 3 risks, as JSON.
Assistant: {          <-- you prefill this brace

By starting the answer with { or [, you guarantee Claude does not add "Sure, here is..." before the JSON. Prefilling <analysis> forces the answer to start inside that tag. Heavily used for automated extraction and to short-circuit preambles.

Key points
  • Specify the output format (JSON, table, tags) explicitly
  • Prefill: seed the Assistant message to force the shape
  • Prefilling { or [ kills 'Sure, here is...' preambles

Precognition: thinking step by step

On a task that requires reasoning (math, logic, a multi-criteria decision, debugging), asking for the direct answer often yields a wrong one. Asking Claude to reason first, conclude after markedly improves accuracy. This is chain-of-thought.

Why: the model produces text sequentially. Letting it "lay down its reasoning" literally gives it intermediate steps to lean on. Without that room, it has to guess the answer in one shot.

Three ways to trigger it:

Practical key: if you then want to parse the answer, isolate the reasoning in a tag so you keep only the conclusion.

Key points
  • Reasoning before concluding improves accuracy (chain-of-thought)
  • Written reasoning acts as scaffolding for the model
  • Isolate the reasoning in a tag if you must parse the conclusion

Using examples (few-shot)

Showing beats explaining. Giving 2 to 5 examples of input -> output (few-shot) is often the fastest way to lock a format, a tone, a level of detail, a naming convention.

The model imitates the pattern. If your examples are terse minimal JSON, the answers follow. If your examples ramble, the answers ramble.

Rules:

Few-shot + role + output format = the combo that solves most extraction and transformation tasks with no other trick.

Key points
  • 2 to 5 input->output examples lock format and tone
  • The model imitates the pattern: terse examples => terse answers
  • Absolute consistency between examples; an edge case if useful

Avoiding hallucinations

Hallucination is not a rare bug, it is a consequence of how it works (lesson 1.1). You don't eliminate it, you constrain it. Toolbox:

The golden rule from Pierre's practice: for a factual question about the present, searching the web beats the model's memory. It is written into his Frankenstein prompt.

Key points
  • You don't remove hallucination, you constrain it
  • Always offer an 'Unknown' exit to avoid invention
  • Ground in sources + quote + search the web for facts

Composing complex prompts (real cases)

Real production prompts stack the previous techniques in a stable order. A skeleton that works for most serious cases:

1. ROLE         -> "You are a ..."
2. CONTEXT      -> what the result is for, who reads it
3. DATA         -> <document> ... </document> (tags)
4. INSTRUCTIONS -> the task, step by step
5. EXAMPLES     -> <example> input -> output </example>
6. THINKING     -> "reason inside <analysis> first"
7. FORMAT       -> exact structure of the output
8. GUARDRAILS   -> "if X is missing, answer Unknown"

You don't need all 8 every time. Anthropic's official advice: only add an advanced technique when it solves a concrete problem. The best prompt is neither the longest nor the most sophisticated, it is the clearest for the task.

Working method: start simple, watch what fails, and add the technique that repairs that precise failure. Iterate. A production prompt is something you refine, not something you guess right on the first try.

Key points
  • Stable order: role, context, data, instructions, examples, thinking, format, guardrails
  • Add a technique only when it repairs a concrete failure
  • The best prompt = the clearest, not the longest

Extended thinking vs chain of thought

When you ask Claude a hard question, you can influence how much reasoning it shows you. The two main modes are chain of thought (CoT) and extended thinking.

Chain of thought means asking Claude to reason step by step inside its reply. You trigger it with a simple instruction: "Think step by step before answering." Claude writes its reasoning as visible text, then gives the answer. This works in Claude.ai chat and in any API call without special flags.

Extended thinking is a model-level feature available on Opus and Sonnet. When activated, the model runs an internal scratchpad (called a thinking block) before producing the final reply. You see a summary of that scratchpad. Extended thinking uses more tokens (and costs more) but tackles problems where surface-level CoT is not enough, such as multi-step math, code debugging, or legal analysis.

Key points
  • Chain of thought: prompt Claude to reason step by step in its reply.
  • Extended thinking: internal scratchpad activated at the model level, not just by prompt.
  • Extended thinking costs more tokens; use it for genuinely complex problems.
  • Both modes improve accuracy, but extended thinking runs deeper exploration before answering.

Getting clean JSON out

JSON (JavaScript Object Notation) is a text format for structured data: keys and values, lists, nested objects. Many tools, apps, and automations expect JSON rather than plain prose. Claude can produce it reliably if you ask precisely.

The three levers are: explicit schema (tell Claude the exact fields and types you want), prefill (start Claude's reply for it, so it is forced to continue in that format), and validation (check the output before trusting it). Using all three together makes structured output nearly bulletproof.

Key techniques:

In Claude.ai chat you cannot use the API prefill trick, but a clear schema plus the strict instruction still works well. For automation or production use, the API with prefill is the safest path.

Key points
  • Give Claude the exact field names and types in the prompt
  • Use prefill in the API to force JSON continuation
  • Instruct: raw JSON only, no prose, no code fences
  • Always validate with JSON.parse or a schema library before using the output

Delimiters beyond XML

XML tags like <context> are one way to separate parts of a prompt, but they are not the only way. Delimiters are any markers you use to tell Claude where one section ends and another begins. Choosing the right delimiter makes your prompt easier to read and harder for Claude to misinterpret.

Three families of delimiters work well in practice:

The key principle is consistency: pick one delimiter style per role (headings for sections, fences for raw data, markers for pasted documents) and use it throughout the prompt. Mixing styles at random confuses both Claude and you.

Key points
  • Delimiters signal where each prompt section starts and ends
  • Markdown headings label named sections cleanly
  • Fenced code blocks protect literal content from being read as instructions
  • Explicit marker lines wrap long pasted documents

Say what to do, not only what to avoid

When you tell Claude what not to do, you leave the space of allowed actions wide open. Claude has to guess what you actually want, and it often guesses wrong. A positive instruction (telling Claude exactly what to do) gives it a clear target and produces far better results.

Think of it like giving directions. "Don't go the wrong way" is useless. "Turn left on Main Street, then right on Oak Avenue" is actionable. The same logic applies to prompts.

Common patterns where positive instructions win:

You can still use negatives when a specific exclusion genuinely matters, for example "do not include any code examples." But lead with what you want, and add exclusions only when they prevent a known mistake.

Key points
  • Positive instructions give Claude a clear target
  • Negative-only prompts leave too much room for guessing
  • Replace 'don't X' with the behavior you actually want
  • Use exclusions sparingly, after stating the main goal

The refinement loop

A refinement loop means treating a conversation as a series of small steering corrections rather than a single perfect request. Instead of crafting one long prompt and starting over when the result is wrong, you send a short follow-up that targets exactly what needs to change.

This works because Claude holds the full conversation in its context window (the memory it has of everything said so far). Each follow-up inherits all prior context, so you only need to describe the delta, the specific change you want, not repeat the whole brief.

Effective steering messages share a few traits:

When a thread drifts too far off track, it is fine to start a new conversation. But exhaust the refinement loop first: most problems are one or two targeted corrections away from a good result.

Key points
  • Refinement loop: iterate with small follow-ups, do not restart every time
  • Context window carries all prior turns so you only describe what changes
  • Pinpoint one problem per correction for the most reliable result
  • Start fresh only when the whole direction has changed

Reusable prompt templates

A prompt template is a prompt with named placeholders that you swap out each time you reuse it. Instead of rewriting the same instruction from scratch, you write it once, mark the parts that change with a label like [TOPIC] or {{LANGUAGE}}, and fill those slots in whenever you need it.

Templates save time and keep quality consistent. If you craft a prompt that produces a great result, a template locks in that quality so every future run starts from the same strong base. They are especially useful for tasks you repeat weekly, such as summarising meeting notes, translating product descriptions, or drafting reply emails.

A good template has three parts:

You can store templates as plain text files, notes, or even inside a spreadsheet column. When you need one, copy it, fill in the slots, and paste the completed prompt into Claude.ai, Claude Code, or any other interface.

Key points
  • A template fixes the instructions and marks the variable parts as placeholders.
  • Use consistent placeholder notation such as [BRACKETS] or {{BRACES}} so slots are easy to spot.
  • Templates lock in prompt quality and cut repetition for recurring tasks.
  • Always specify the output format inside the template to get predictable results.

Question answering over a long document

When you hand Claude a long document (a contract, a report, a manual), the order of your message matters. Put the document first, then your question. Claude reads top-to-bottom, so the context is fully loaded before it processes your request. Burying the document after a long preamble forces Claude to re-interpret the question against text it has not yet seen.

Ask precise questions. Vague prompts like "summarize this" return vague answers. Narrow questions like "What is the termination notice period in Section 12?" return specific, verifiable answers. Precision also makes it easier to catch a wrong answer.

Ask Claude to quote before answering. This technique (sometimes called "cite then conclude") forces the model to anchor its answer in the actual text rather than paraphrase from memory. If the quote does not exist in the document, that is a signal the answer may be hallucinated (invented with false confidence).

Key points
  • Document first, question second
  • Precise questions get precise answers
  • Quote before answering reduces hallucination
  • Always verify the quoted passage in the source

Prompting across languages

Claude reads and writes in over 50 languages without any setup. You can write your prompt in French, receive the answer in German, then switch to Spanish mid-conversation. The model detects your language and mirrors it by default, so the simplest rule is: write in the language you want the answer in.

The challenge is not fluency but terminology consistency: the same technical term should appear in the same form every time. If you call a product a "ponceuse orbitale" in one message and "sander orbital" in the next, Claude may treat them as different things and produce inconsistent output. Pin your vocabulary early and repeat it exactly.

Three techniques help you stay consistent across languages:

Mixed-language projects, for example a French-speaking team building an English codebase, benefit most from that last technique. Keep code and comments in English (the universal standard for source code) while all explanations arrive in the team language.

Key points
  • Claude mirrors the language of your prompt by default
  • Terminology consistency matters more than grammar
  • A glossary block anchors vocabulary across the whole conversation
  • In Claude Code, use --system to set a language rule for the session

Summaries that keep what matters

There are two fundamentally different ways to summarize a text. Extractive summarization lifts exact sentences or phrases from the original and stitches them together. Abstractive summarization rephrases the ideas in new words, the way a human would explain a document to a colleague. Claude does abstractive summarization by default, which means it can condense, reorder, and clarify rather than just copy.

Without guidance, Claude picks its own length and focus. A short article might get three sentences; a long legal document might get three paragraphs. To take control, you need to be explicit about two things: length (word count, bullet count, or number of sentences) and focus (which topics, audiences, or use cases matter most).

Common focus controls you can combine in a single prompt:

Mixing extractive anchors with abstractive prose is the most reliable technique for high-stakes summaries: Claude rewrites the narrative but quotes exact figures, names, and dates so nothing is accidentally paraphrased into an error.

Key points
  • Extractive: copies exact text; abstractive: rephrases in new words
  • State length explicitly (word count, sentence count, bullet count)
  • State focus explicitly (audience, angle, format)
  • Use extractive anchors for numbers and dates to prevent paraphrase errors
Work with me

Master Claude, Claude Code and LLMs, from your first prompt to multi-agent orchestration.

Like this course? I built it end to end. Need a web app, mobile app, AI automation or SEO/GEO? Let us talk.

Contact me on LinkedInSee a site I built