Prompting Techniques
Chain-of-Thought Prompting
Improve reasoning accuracy on multi-step problems by asking the model to show its work step by step.
What is Chain-of-Thought Prompting?
Chain-of-Thought (CoT) prompting encourages the model to break down complex reasoning into intermediate steps before giving a final answer. This often improves accuracy substantially on math, logic, and multi-step reasoning tasks.
Why It Works
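Language models generate text token by token, with each token conditioned on everything written so far. When the model writes out its reasoning, each step becomes context for the next, so the final answer is conditioned on the intermediate results rather than produced in a single jump.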
The Magic Phrase
Adding "Let's think step by step" (or similar phrasing) to your prompt, without any worked examples, often improves reasoning performance noticeably. This is called zero-shot chain-of-thought.
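As a minimal sketch, zero-shot CoT can be treated as a simple prompt transformation. The function name `add_zero_shot_cot` and its default trigger string are illustrative assumptions, and no model API is called here: the function only builds the prompt text.

```python
def add_zero_shot_cot(question: str, trigger: str = "Let's think step by step.") -> str:
    """Return the question with a zero-shot CoT trigger appended.

    Hypothetical helper: it only builds the prompt string. Sending the
    prompt to a model is left to whichever client you use.
    """
    return f"{question.strip()}\n\n{trigger}"


question = (
    "If a train travels 120 miles in 2 hours, and then 60 miles in 1.5 hours, "
    "what is its average speed for the entire journey?"
)
prompt = add_zero_shot_cot(question)
print(prompt)
```

Keeping the trigger as a parameter makes it easy to try alternative phrasings and compare results on your own tasks.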
When to Use CoT
Chain-of-thought is most valuable for:
- Mathematical problems
- Logic puzzles
- Multi-step reasoning
- Code debugging
- Complex analysis tasks
- Decision-making with tradeoffs
Limitations
CoT adds little for simple factual questions. It increases token usage (and therefore cost and latency). The model can also produce a confident-sounding but wrong reasoning chain, so a step-by-step answer still needs checking.
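Because a reasoning chain can read well and still be wrong, one cheap safeguard is to pull out the final answer and compare it with an independent calculation where one exists. The sketch below assumes the model ends its output with a line of the form `Answer: <number>`. `extract_final_number` is a hypothetical helper, not part of any library, and the sample output is hand-written rather than taken from a real model call.

```python
import re
from typing import Optional


def extract_final_number(model_output: str) -> Optional[float]:
    """Return the first number on the last 'Answer:' line, or None if absent."""
    answer_lines = [
        line for line in model_output.splitlines()
        if line.strip().lower().startswith("answer:")
    ]
    if not answer_lines:
        return None
    match = re.search(r"-?\d+(?:\.\d+)?", answer_lines[-1])
    return float(match.group()) if match else None


# Sample output in the assumed format (hand-written for illustration).
sample_output = """Step 1: Total distance = 120 + 60 = 180 miles
Step 2: Total time = 2 + 1.5 = 3.5 hours
Step 3: Average speed = 180 / 3.5 ≈ 51.4 mph
Answer: ~51.4 mph"""

expected = (120 + 60) / (2 + 1.5)  # independent calculation of the same quantity
claimed = extract_final_number(sample_output)
is_consistent = claimed is not None and abs(claimed - expected) < 0.1
print(claimed, is_consistent)
```

This kind of check only works when the task has a verifiable answer. For open-ended analysis, the reasoning steps themselves still need human review.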
Example
// Without CoT (often wrong for complex math)
"If a train travels 120 miles in 2 hours, and then
60 miles in 1.5 hours, what is its average speed
for the entire journey?"
// With CoT - "think step by step"
"If a train travels 120 miles in 2 hours, and then
60 miles in 1.5 hours, what is its average speed
for the entire journey?
Think through this step by step before giving
your final answer."
// Model output with CoT:
// Step 1: Total distance = 120 + 60 = 180 miles
// Step 2: Total time = 2 + 1.5 = 3.5 hours
// Step 3: Average speed = 180 / 3.5 ≈ 51.4 mph
// Answer: ~51.4 mph
// Few-shot CoT (show reasoning in examples)
"Q: A store has 200 apples. They sell 45 in the
morning and receive a delivery of 80 more.
How many do they have now?
A: Let me work through this:
- Start: 200 apples
- After selling 45: 200 - 45 = 155 apples
- After delivery of 80: 155 + 80 = 235 apples
Answer: 235 apples
Q: A developer has 8 hours. Each feature takes
2.5 hours to build and 1 hour to test. How many
complete features can they ship?
A:"
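The few-shot pattern above can also be assembled programmatically. This is a minimal sketch, assuming worked examples are stored as (question, reasoning steps, answer) tuples. The `build_few_shot_cot_prompt` name and the tuple layout are illustrative choices, and no model is called.

```python
from typing import List, Tuple

# (question, reasoning steps, final answer): this layout is an assumption of the sketch
Example = Tuple[str, List[str], str]


def build_few_shot_cot_prompt(examples: List[Example], new_question: str) -> str:
    """Format worked examples with their reasoning, then the new question with an open 'A:'."""
    blocks = []
    for question, steps, answer in examples:
        step_lines = "\n".join(f"- {step}" for step in steps)
        blocks.append(
            f"Q: {question}\nA: Let me work through this:\n{step_lines}\nAnswer: {answer}"
        )
    # Leave the final answer open so the model continues in the same style.
    blocks.append(f"Q: {new_question}\nA:")
    return "\n\n".join(blocks)


examples = [(
    "A store has 200 apples. They sell 45 in the morning and receive a "
    "delivery of 80 more. How many do they have now?",
    ["Start: 200 apples",
     "After selling 45: 200 - 45 = 155 apples",
     "After delivery of 80: 155 + 80 = 235 apples"],
    "235 apples",
)]
prompt = build_few_shot_cot_prompt(
    examples,
    "A developer has 8 hours. Each feature takes 2.5 hours to build and "
    "1 hour to test. How many complete features can they ship?",
)
print(prompt)
```

For reference, the expected answer to the new question is 2 complete features: each feature takes 2.5 + 1 = 3.5 hours, two features fit into 8 hours (7 hours used), and a third would need 10.5 hours.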