Generate multiple reasoning paths, score them, prune to the best branches.
Idea: A person shows puzzling behavior. The agent explores multiple psychological explanations in parallel, scores them, and expands only the strongest hypothesis into the final analysis.

          [Behavior: "Person leaves secure job without a clear plan"]
                                     |
                      [Phase 1: Branch - 4 hypotheses]
                      /          |          |          \
          [Avoidance]    [Burnout]    [Growth]    [External pressure]
               |             |            |               |
                           [Phase 2: Score]
               6             9            7               4
                                     |
                     [Phase 3: Prune - losers removed]
                                     |
                   [Phase 4: Conclusion from ONLY Burnout]

The example is deliberately built to show both ToT's strength (structured exploration) and its weakness (information loss after pruning).
| Principle | What happens in code |
|---|---|
| Branching | `developHypothesis()` creates one hypothesis per lens |
| Evaluation | `scoreHypothesis()` scores each hypothesis |
| Pruning | `pruneHypotheses()` discards all non-winners |
At the end, the console prints what got discarded.
Those branches may contain corrective insights, but once pruned they no longer influence the final conclusion.
That is the core limitation of strict ToT pruning.
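
The three phases and the discarded-branch printout can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names come from the table above, but their signatures, the `Lens` and `Hypothesis` types, and the `STUB_SCORES` lookup are assumptions. The scores are the ones shown in the diagram; a real scorer would presumably ask a model instead.

```typescript
// Minimal Tree of Thought sketch: branch, score, prune, conclude.
// Signatures, types, and STUB_SCORES are assumptions made for illustration.

type Lens = "Avoidance" | "Burnout" | "Growth" | "External pressure";

interface Hypothesis {
  lens: Lens;
  explanation: string;
  score: number; // 0 until scoreHypothesis() fills it in
}

const BEHAVIOR = "Person leaves secure job without a clear plan";
const LENSES: Lens[] = ["Avoidance", "Burnout", "Growth", "External pressure"];

// Stand-in scores copied from the diagram (assumed to replace a model call).
const STUB_SCORES: Record<Lens, number> = {
  "Avoidance": 6,
  "Burnout": 9,
  "Growth": 7,
  "External pressure": 4,
};

// Phase 1: create one hypothesis per lens.
function developHypothesis(behavior: string, lens: Lens): Hypothesis {
  return { lens, explanation: `${lens} as an explanation for: ${behavior}`, score: 0 };
}

// Phase 2: attach a score to a hypothesis.
function scoreHypothesis(h: Hypothesis): Hypothesis {
  return { ...h, score: STUB_SCORES[h.lens] };
}

// Phase 3: keep the top-scoring hypothesis and return the rest as discarded.
function pruneHypotheses(
  hypotheses: Hypothesis[],
): { winner: Hypothesis; discarded: Hypothesis[] } {
  if (hypotheses.length === 0) {
    throw new Error("nothing to prune");
  }
  const ranked = [...hypotheses].sort((a, b) => b.score - a.score);
  return { winner: ranked[0], discarded: ranked.slice(1) };
}

const scored = LENSES.map((lens) => scoreHypothesis(developHypothesis(BEHAVIOR, lens)));
const { winner, discarded } = pruneHypotheses(scored);

// Phase 4: only the winner reaches the conclusion; the rest are just logged.
console.log(`Conclusion built only from: ${winner.lens} (score ${winner.score})`);
for (const h of discarded) {
  console.log(`Discarded: ${h.lens} (score ${h.score})`);
}
```

In this sketch the discarded hypotheses are returned rather than dropped silently, which is what makes it possible to print them. Nothing downstream reads them, so the information loss described above still applies.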
Use Tree of Thought when you need a clear winner and a simple decision path.
You get a production alert: API latency jumped from 200 ms to 2 s after a release.
Why ToT fits: incident response often needs one fast, auditable decision path instead of maintaining many parallel remediation tracks.
You need to speed up a slow endpoint before a launch.
Why ToT fits: when a deadline is near, teams usually need a single winning implementation, not an experimental blend of several approaches.
You are building an autonomous coding agent that must choose a fix strategy.
Why ToT fits: you get predictable behavior, lower token/tool cost, and easier postmortems because the agent follows one explicit plan.
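
One way to picture the cost and postmortem argument is a decision record: every candidate is scored cheaply, but only the winner goes through the expensive planning step. The sketch below is purely hypothetical. `FixStrategy`, `Decision`, `expandPlan`, `decide`, the strategy labels, and the scores are all invented for illustration and do not come from any real agent.

```typescript
// Hypothetical sketch: choose one fix strategy and keep an auditable record.
// All names, labels, and scores here are assumptions made for illustration.

interface FixStrategy {
  label: string;
  score: number;
}

interface Decision {
  chosen: FixStrategy;
  rejected: FixStrategy[];
  plan: string[];
}

// Counts how often the expensive step runs (model calls, tool runs, ...).
let expansionCalls = 0;

// Stand-in for the expensive step: turn a strategy into concrete plan steps.
function expandPlan(strategy: FixStrategy): string[] {
  expansionCalls += 1;
  return ["reproduce the failure", `apply: ${strategy.label}`, "run the test suite"];
}

// Rank all candidates, expand only the best one, and record the rest.
function decide(candidates: FixStrategy[]): Decision {
  if (candidates.length === 0) {
    throw new Error("no candidates to choose from");
  }
  const ranked = [...candidates].sort((a, b) => b.score - a.score);
  const chosen = ranked[0];
  return { chosen, rejected: ranked.slice(1), plan: expandPlan(chosen) };
}

const fixCandidates: FixStrategy[] = [
  { label: "pin the dependency", score: 6 },
  { label: "patch the null check", score: 8 },
  { label: "rewrite the parser", score: 5 },
];

const decision = decide(fixCandidates);

// The serialized record is what a postmortem would read.
console.log(JSON.stringify(decision, null, 2));
console.log(`Expensive expansions performed: ${expansionCalls}`);
```

Because `expandPlan` runs once regardless of how many candidates were scored, the expensive work stays constant as the candidate list grows, and the record shows what was rejected and with which score.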
ToT is strongest when: