Chain-of-Thought: Make AI Reason Step by Step
You ask Claude or ChatGPT to solve a complex problem and the answer arrives in two seconds, but it's completely wrong. Why? Often because the AI jumped straight to an answer without working through the intermediate steps. Chain-of-thought (CoT) addresses this by asking the AI to break down its thinking step by step, just like you would on a scratch pad. Published research, notably from Google Research, reports substantial accuracy gains on multi-step reasoning tasks, although the size of the gain varies with the model and the problem. You'll learn to apply it hands-on, even if you've never heard of prompt engineering before.
What is chain-of-thought in prompt engineering?
Chain-of-thought is a technique that explicitly asks the AI to show its reasoning step by step before giving its final answer. Instead of getting a direct response, you receive the complete logical path that leads to that answer.
A simple example: you ask the AI to calculate how much 3 items at €24.99 cost with a 15% discount. Without CoT, it may return a single figure with nothing to check it against. With CoT, it details:
- Unit price: €24.99
- Total price before discount: €24.99 × 3 = €74.97
- Discount amount: €74.97 × 0.15 = €11.25
- Final price: €74.97 - €11.25 = €63.72
This breakdown lets you spot a calculation or logic error immediately. Google researchers published the foundational 2022 study "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models" demonstrating that this approach drastically improves performance on reasoning tasks.
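The arithmetic in the example above can be double-checked in a few lines of Python. This is a minimal sketch: the function name `discounted_total` is illustrative, and rounding the discount to the nearest cent (half up) is an assumption chosen because it matches the €11.25 figure in the breakdown.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def discounted_total(unit_price: str, quantity: int, discount_rate: str):
    """Return (subtotal, discount, final price), mirroring the steps above."""
    subtotal = Decimal(unit_price) * quantity
    # Assumption: the discount is rounded to the nearest cent, half up.
    discount = (subtotal * Decimal(discount_rate)).quantize(CENT, rounding=ROUND_HALF_UP)
    return subtotal, discount, subtotal - discount


subtotal, discount, final = discounted_total("24.99", 3, "0.15")
print(subtotal, discount, final)
```

Using `Decimal` rather than floats keeps the cents exact, which matters when you compare your own result with the figures the AI reports at each step.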
Chain-of-thought works especially well for:
- Mathematical calculations
- Logic problems
- Complex scenario analysis
- Decisions requiring multiple criteria
- Code debugging
How to apply chain-of-thought in your prompts?
To activate chain-of-thought, simply add an explicit instruction asking the AI to break down its reasoning, like "Think step by step" or "Show your reasoning." This short instruction often improves answer quality noticeably on multi-step problems.
Here are three formulations that work consistently:
Formulation 1: Direct instruction
"Think step by step to solve this problem: [your problem]"
Formulation 2: Imposed structure
"Before answering, detail your reasoning by following these steps:
- Problem analysis
- Identifying relevant data
- Calculations or deductions
- Verification
- Final answer"
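If you reuse this structure often, it can be generated rather than retyped. The sketch below is one possible way to do it; `structured_cot_prompt` and the `Problem:` label are illustrative choices, not part of any library or standard.

```python
from typing import Sequence

# The five steps from Formulation 2, in order.
DEFAULT_STEPS = (
    "Problem analysis",
    "Identifying relevant data",
    "Calculations or deductions",
    "Verification",
    "Final answer",
)


def structured_cot_prompt(problem: str, steps: Sequence[str] = DEFAULT_STEPS) -> str:
    """Build a prompt that imposes an explicit list of reasoning steps."""
    lines = ["Before answering, detail your reasoning by following these steps:"]
    lines.extend(f"- {step}" for step in steps)
    lines.append("")
    lines.append(f"Problem: {problem.strip()}")  # "Problem:" label is an assumption
    return "\n".join(lines)


prompt = structured_cot_prompt("How much do 3 items at €24.99 cost with a 15% discount?")
print(prompt)
```

Passing a different `steps` sequence lets you adapt the structure to a given task without rewriting the whole prompt.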
Formulation 3: Few-shot CoT
You provide an example of reasoning before asking your question. This is the most powerful method.
"Example: If a train travels at 120 km/h for 2.5 hours, what distance does it cover? Reasoning:
- Speed: 120 km/h
- Duration: 2h30 = 2.5 hours
- Distance = speed × time
- Distance = 120 × 2.5 = 300 km
Now solve this problem using the same logic: [your problem]"
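For a recurring task, the worked example and the new question can be assembled programmatically. This is a rough sketch under assumptions: `WorkedExample` and `few_shot_cot_prompt` are hypothetical names, and the cyclist question is invented purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkedExample:
    question: str
    steps: List[str]


def few_shot_cot_prompt(examples: List[WorkedExample], problem: str) -> str:
    """Put one or more worked examples before the new problem."""
    parts = []
    for example in examples:
        parts.append(f"Example: {example.question}")
        parts.append("Reasoning:")
        parts.extend(f"- {step}" for step in example.steps)
        parts.append("")  # blank line between examples
    parts.append(f"Now solve this problem using the same logic: {problem.strip()}")
    return "\n".join(parts)


train = WorkedExample(
    question="If a train travels at 120 km/h for 2.5 hours, what distance does it cover?",
    steps=[
        "Speed: 120 km/h",
        "Duration: 2h30 = 2.5 hours",
        "Distance = speed × time",
        "Distance = 120 × 2.5 = 300 km",
    ],
)

# Hypothetical follow-up question, not from the article.
prompt = few_shot_cot_prompt(
    [train], "A cyclist rides at 18 km/h for 1.5 hours. What distance does she cover?"
)
print(prompt)
```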
A practical tip: start with Formulation 1 for daily use. Move to Formulation 2 for truly complex problems. Reserve Formulation 3 for when you need maximum precision on a recurring type of task.
What types of problems does chain-of-thought solve?
Chain-of-thought excels at tasks requiring multiple steps of logical reasoning, particularly mathematics, programming, text analysis, and multi-criteria decision-making. It transforms approximate answers into verifiable solutions.
Mathematics and calculations
AIs regularly make basic calculation errors without CoT. With CoT, results improve markedly on GSM8K (a dataset of about 8,500 grade-school math word problems), one of the benchmarks used in the original chain-of-thought study. The exact gain depends on the model.
Concrete example:
Without CoT: "I have 15 apples. I give 40% to Marie and 1/3 of the rest to Paul. How many do I have left?" Typical answer: "You have 6 apples left" (no working shown, so you can't tell whether to trust it)
With CoT: "Think step by step..." Detailed reasoning:
- Starting point: 15 apples
- 40% for Marie: 15 × 0.4 = 6 apples
- Remaining after Marie: 15 - 6 = 9 apples
- 1/3 of remainder for Paul: 9 ÷ 3 = 3 apples
- Final remaining: 9 - 3 = 6 apples
The answer is the same but the reasoning is verifiable. You immediately see where to look if something's off.
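Because every intermediate value is written out, each one can be checked independently. A minimal sketch using exact fractions, so that 40% and 1/3 introduce no rounding (the variable names are illustrative):

```python
from fractions import Fraction

start = 15
to_marie = start * Fraction(40, 100)    # 40% for Marie
after_marie = start - to_marie          # remaining after Marie
to_paul = after_marie * Fraction(1, 3)  # 1/3 of the remainder for Paul
remaining = after_marie - to_paul       # final remaining

print(to_marie, after_marie, to_paul, remaining)
```

If the AI's reasoning disagreed with any of these four values, you would know exactly which step to question.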
Programming and debugging
Chain-of-thought helps the AI break down a code problem into logical sub-problems. This is particularly useful in coding assistants such as Claude Code or Cursor.
Example: "My Python code shows a 'list index out of range' error. Analyze step by step why."
The AI will:
- Identify where the error occurs
- Examine the list size at that moment
- Check the indices being used
- Trace the execution flow
- Propose a fix
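To make those steps concrete, here is a small, made-up function that triggers this error, followed by the kind of fix a step-by-step analysis would lead to. Both function names and the sample data are hypothetical.

```python
def pairwise_differences_buggy(values):
    # Where the error occurs: values[i + 1] on the last iteration.
    # List size at that moment: len(values); valid indices stop at len(values) - 1.
    # Indices used: i runs up to len(values) - 1, so i + 1 reaches len(values).
    return [values[i + 1] - values[i] for i in range(len(values))]


def pairwise_differences_fixed(values):
    # Proposed fix: stop one element early so i + 1 is always a valid index.
    return [values[i + 1] - values[i] for i in range(len(values) - 1)]


readings = [3, 7, 12]

try:
    pairwise_differences_buggy(readings)
except IndexError as err:
    error_message = str(err)
    print("Buggy version failed:", error_message)

fixed = pairwise_differences_fixed(readings)
print("Fixed version:", fixed)
```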
Text analysis and comprehension
To extract information from long text or analyze arguments, CoT forces the AI to justify its conclusions.
"Read this contract and identify problematic clauses. Explain your reasoning for each clause."
The AI will detail why a clause is ambiguous, what law it might violate, and what risks it presents.
Decision-making
CoT transforms the AI into a transparent decision-support tool.
"I need to choose between 3 laptops. Analyze step by step which one best fits my needs: [criteria]."
The AI will compare the laptops on each criterion, assign scores, and justify its choices instead of pulling a name out of thin air.
Few-shot-CoT vs zero-shot-CoT: what's the difference?
Few-shot-CoT requires providing reasoning examples to the AI, while zero-shot-CoT simply means adding "Think step by step" without prior examples. Zero-shot is faster; few-shot is more precise.
Zero-shot-CoT (introduced in 2022 by researchers from the University of Tokyo and Google) is the minimalist version: you just add "Let's think step by step" or "Think step by step" at the end of your question. No examples needed. It works surprisingly well.
Zero-shot-CoT advantages:
- Quick to implement
- Works on almost any problem
- No need to think up examples
Downside: less precise than few-shot on specific tasks.
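In code, zero-shot-CoT amounts to appending one sentence to the question. A minimal sketch, where `zero_shot_cot` is an illustrative helper name:

```python
ZERO_SHOT_TRIGGER = "Let's think step by step."


def zero_shot_cot(question: str) -> str:
    """Append the zero-shot trigger phrase to the end of a question."""
    return f"{question.strip()}\n{ZERO_SHOT_TRIGGER}"


prompt = zero_shot_cot(
    "I have 15 apples. I give 40% to Marie and 1/3 of the rest to Paul. "
    "How many do I have left?"
)
print(prompt)
```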
Few-shot-CoT requires providing 1 to 3 reasoning examples before your question. You show the AI exactly how you want it to reason.
Few-shot-CoT advantages:
- Maximum precision
- Full control over response format
- Ideal for repetitive tasks
Downside: takes more time to prepare.
A comparative Stanford study (2023) shows few-shot-CoT outperforms zero-shot by 12 to 18 percentage points on math benchmarks. But zero-shot is usually sufficient for everyday use.
Practical rule: start zero-shot. If results don't satisfy you, move to few-shot with 1 or 2 well-chosen examples.
Common mistakes and how to avoid them with chain-of-thought
The most frequent CoT mistakes are asking for too many unnecessary steps, not verifying the reasoning you get, and using it on questions too simple to need it. CoT isn't a universal magic formula.
Mistake 1: Overusing CoT on simple questions
For "What's the capital of France?", CoT is counterproductive. You'll get a paragraph when one word suffices. Reserve CoT for problems truly requiring multiple reasoning steps.
Mistake 2: Not verifying the reasoning
CoT makes errors visible, but you have to look for them. The AI can produce reasoning that seems logical but contains a subtle error at step 3 out of 7.
Solution: read the reasoning, not just the conclusion. Verify at least the critical calculations.
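Part of this check can be automated when the reasoning contains simple arithmetic. The sketch below scans text for steps of the form "a op b = c" and flags the ones that don't add up. It is deliberately limited: it assumes the ×, ÷, + and - symbols used in this article, handles only two-operand steps, and accepts a one-cent tolerance for rounding. The sample reasoning contains a planted error for demonstration.

```python
import re
from decimal import Decimal

NUMBER = r"€?(\d+(?:\.\d+)?)"
STEP = re.compile(NUMBER + r"[ \t]*([+\-×÷])[ \t]*" + NUMBER + r"[ \t]*=[ \t]*" + NUMBER)

OPERATIONS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "×": lambda a, b: a * b,
    "÷": lambda a, b: a / b,  # a zero divisor would raise; not handled in this sketch
}


def check_steps(reasoning: str, tolerance: Decimal = Decimal("0.01")):
    """Return (matched text, is_consistent) for each 'a op b = c' step found."""
    results = []
    for match in STEP.finditer(reasoning):
        a, op, b, claimed = match.groups()
        actual = OPERATIONS[op](Decimal(a), Decimal(b))
        results.append((match.group(0), abs(actual - Decimal(claimed)) <= tolerance))
    return results


# The third step is deliberately wrong (9 ÷ 3 is 3, not 4).
reasoning = """\
- 40% for Marie: 15 × 0.4 = 6 apples
- Remaining after Marie: 15 - 6 = 9 apples
- 1/3 of remainder for Paul: 9 ÷ 3 = 4 apples
- Final remaining: 9 - 3 = 6 apples
"""

results = check_steps(reasoning)
for text, ok in results:
    print("OK   " if ok else "CHECK", text)
```

A script like this only catches arithmetic slips. Logic errors, such as applying a percentage to the wrong quantity, still need a human read.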
Mistake 3: Asking for too much detail
"Explain every micro-step of your reasoning with maximum detail level" produces unreadable walls of text. Ask for a detail level matching your need.
- For a simple calculation: 3-4 steps suffice
- For a complex problem: 6-8 steps maximum
Beyond that, you drown in details.
Mistake 4: Forgetting to structure the request
Compare:
Vague: "Think step by step about this complex corporate finance problem..."
Structured: "Analyze this problem following these steps: 1) Identify cash flows, 2) Calculate discount rate, 3) Net present value, 4) Investment decision. Think step by step."
The structured version guides the AI toward the reasoning you want.
Mistake 5: Mixing multiple questions
"Calculate this budget AND explain why this project is profitable AND list the risks" in CoT mode produces confused reasoning. Split into 3 separate prompts, each with its own CoT.
How to combine chain-of-thought with other prompting techniques?
Chain-of-thought combines well with role prompting, format constraints, and validation through counter-examples. Each addition targets a different weakness: the role shapes the kind of reasoning, the format makes the output usable, and the counter-examples test the conclusion.
CoT + Role prompting
Give the AI an expert role before asking for step-by-step reasoning.
"You're a certified accountant with 15 years of experience. Analyze step by step whether this expense is tax-deductible: [context]."
The role orients the type of reasoning; CoT makes that reasoning visible.
CoT + Format constraints
Impose a precise output structure while keeping detailed reasoning.
"Think step by step, then present your answer as a table with 3 columns: Criterion | Analysis | Score out of 10."
You get traceable reasoning in a format you can use directly.
CoT + Validation through counter-examples
Ask the AI to verify its own reasoning by finding cases where it would fail.
"Solve this problem step by step, then identify edge cases where your reasoning might be wrong."
This double-check detects logical flaws.
CoT + Few-shot learning
The most powerful combination: you provide 2-3 examples of similar problems solved step by step, then ask your question.
"Example 1: [problem + CoT reasoning + solution] Example 2: [problem + CoT reasoning + solution]
Now solve this new problem using the same logic: [your problem]"
This is the setup used in the original chain-of-thought study, which reported large gains over answer-only examples on reasoning benchmarks.
One last tip: test these combinations progressively. Start with CoT alone, then add one technique at a time. You'll quickly find your optimal formula based on your needs.
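One way to add techniques one at a time is a small helper with optional arguments, so each combination is a single extra parameter. This is only a sketch: `compose_prompt` and its parameter names are hypothetical, and the wording of each added sentence is adapted from the examples above.

```python
from typing import Optional


def compose_prompt(
    task: str,
    role: Optional[str] = None,
    output_format: Optional[str] = None,
    self_check: bool = False,
) -> str:
    """Start from plain CoT and layer on role, format, and self-check as needed."""
    parts = []
    if role:
        parts.append(f"You're {role}.")
    parts.append(f"Think step by step. {task.strip()}")
    if output_format:
        parts.append(f"Then present your answer as {output_format}.")
    if self_check:
        parts.append("Finally, identify edge cases where your reasoning might be wrong.")
    return "\n".join(parts)


task = "Analyze whether this expense is tax-deductible: [context]"

baseline = compose_prompt(task)  # CoT alone
with_role = compose_prompt(task, role="a certified accountant with 15 years of experience")
full = compose_prompt(
    task,
    role="a certified accountant with 15 years of experience",
    output_format="a table with 3 columns: Criterion | Analysis | Score out of 10",
    self_check=True,
)
print(full)
```

Comparing the answers you get from `baseline`, `with_role`, and `full` on the same task shows which addition actually helps for your use case.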
Conclusion
Chain-of-thought transforms the AI into an assistant that shows its work instead of pulling answers out of thin air. Start by adding "Think step by step" to your complex prompts. Verify the reasoning you get. Refine with examples if needed. This simple technique makes answers on calculations, logic, code, and decisions far easier to check and, in many cases, more reliable. You've just acquired a tool that many AI users still overlook.