The Hop Count: How I Got My Knowledge Base to Answer Questions in 2 Hops Instead of 11


I just asked Claude: "What's our competitive positioning against scanner vendors?"

Most repos: Claude reads 11 files, burns 35k tokens, takes 45 seconds, gives you a frankensteined answer from contradictory sources.

My repo: Claude reads 1 synthesis doc, follows 1 link to battle cards, 4k tokens, 8 seconds, perfect answer.

The difference? I measure and optimize for "Hop Count" - the number of steps required to get an accurate answer.

What is a "Hop Count"?

A "hop" is any action a user (human or AI) must take to find an answer:

  • Opening a new folder
  • Reading a document
  • Performing a new search

Your goal should be to answer 90% of common business questions in 2 hops or fewer.

  • Good (2 Hops): Query → Master Synthesis Doc → Answer
  • Bad (4+ Hops): Query → Folder → Sub-Folder → Read 5 different docs → Manually synthesize answer

A high hop count is a tax on your entire organization. It's friction, wasted time, and slower decisions.
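The metric is simple enough to express as code. A minimal sketch, assuming hypothetical file names; each entry in a path is one action (open a folder, read a document, run a search):

```python
# Minimal sketch of the hop-count metric. Each hop is one action:
# opening a folder, reading a document, or running a search.
# All file and folder names here are hypothetical.

def hop_count(path: list[str]) -> int:
    """Number of actions taken to reach an answer."""
    return len(path)

def meets_goal(path: list[str], budget: int = 2) -> bool:
    """True if the answer was reached within the hop budget."""
    return hop_count(path) <= budget

good_path = ["README.md", "competitive_intel_master.md"]    # 2 hops
bad_path = ["03_competitive/", "competitors/", "vendor_x/",
            "doc_a.md", "doc_b.md"]                         # 5 hops
```

Logging these paths per query is what makes the diagnostic later in this post measurable rather than a gut feeling.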

The Root of the Problem: Synthesis vs. Detail

Most company knowledge bases are graveyards - digital junk drawers where information goes to die. The problem isn't storage; it's synthesis.

Most knowledge bases fail because they are just "Detail Layers" - collections of raw, unprocessed information.

A successful knowledge system has a strong "Synthesis Layer": a small number of canonical, master documents that roll up 90% of the insights from the Detail Layer.

Your team shouldn't have to read 47 customer call transcripts to find pain points. They should read the single Customer_Pain_Point_Summary.md document that synthesizes all 47 calls.

The 3-Layer Pattern

Layer 1: Synthesis (answers 95% of questions)

  • competitive_intel_master.md (800 lines, canonical)
  • Updated when patterns emerge from detail layer

Layer 2: Detail Docs (when synthesis insufficient)

  • competitors/vendor_x/customer_mentions.md
  • Auto-updated when calls mention that competitor
  • 20+ mentions → cascades insight to synthesis

Layer 3: Raw Data (audit trail + proof points)

  • Call transcripts, industry reports
  • Extracted automatically, rarely read directly

The trick: Most questions never leave Layer 1.
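One way to picture the pattern: a lookup walks the layers in order and stops at the first document that answers, so most lookups cost a single read past navigation. A hypothetical sketch, where the file names and the `can_answer` check are stand-ins for a human or AI actually reading the doc:

```python
# Hypothetical sketch of the 3-layer escalation. The reader (human or
# AI) is modeled by can_answer; file names are made up.
LAYERS = [
    ("synthesis", ["competitive_intel_master.md"]),
    ("detail", ["competitors/vendor_x/customer_mentions.md"]),
    ("raw", ["transcripts/call_034.txt"]),
]

def lookup(query, can_answer):
    """Walk the layers in order; return (layer, hops) for the first doc
    that answers the query, or (None, hops) if nothing does."""
    hops = 0
    for layer_name, docs in LAYERS:
        for doc in docs:
            hops += 1
            if can_answer(query, doc):
                return layer_name, hops
    return None, hops
```

When the synthesis layer covers the question, the walk ends after one read; only synthesis gaps pay for the deeper hops.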

How This Actually Works

Every folder has a README that teaches Claude the shortest path:

```markdown
# 03_competitive/README.md

## Quick Answers (Start here)
Positioning → `../00_foundation/competitive_intel_master.md`

## Deep Research (Only if synthesis insufficient)
Specific competitor → `competitors/[name]/overview.md`
Customer quotes → `competitors/[name]/customer_mentions.md`

## Raw Data (Audit trail only)
Industry reports → `reports/`
Call transcripts → `../01_customer/transcripts/`
```

Claude hits the README, knows exactly where to go, 2 hops max.

Real Example From Today

Question: "Show me proof that developers ignore scanner findings"

Without this structure:

  • Reads 8 customer call transcripts (22k tokens)
  • Searches through 3 research reports (11k tokens)
  • Misses the best quote buried in call #34
  • 33k tokens, incomplete answer

With this structure:

  • Reads customer_intel_synthesis.md Section 2.1 "Scanner Fatigue"
  • Finds table with 8 proof points, confidence ratings, source links
  • 2k tokens, perfect answer with sources
  • Can hop to raw transcript if needs exact quote

What Makes This Fast

1. READMEs are navigation maps

Every domain folder teaches Claude the path:

  • "Need positioning? Read synthesis."
  • "Need proof? Read detail docs."
  • "Need exact quote? Here's the link."

2. Synthesis docs have structure Claude can scan

```markdown
## Section 2.1: Scanner Fatigue

| Pain Point | Companies (Count) | Confidence |
|------------|-------------------|------------|
| "Findings get ignored" | 8 companies | ⭐⭐⭐ HIGH |

Proof: AutoZone: "Fortify findings generally get ignored"
Detail: → competitors/fortify/customer_feedback.md
```

Claude sees the table, gets the answer, only hops if you ask for details.

3. Automation keeps layers in sync

  • Call uploaded → extract competitor mentions → update detail doc
  • 5+ mentions of same claim → flag synthesis for update
  • Synthesis changes → cascade to derived docs (battle cards, playbooks)

No manual file hunting. No stale docs. No contradictions.
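The threshold rule in that list is the core of the cascade. A hypothetical sketch: the threshold of 5 comes from the bullet above, while the function names and the keyword matching (a real pipeline might use an LLM extractor) are simplifications:

```python
from collections import Counter

SYNTHESIS_THRESHOLD = 5  # "5+ mentions of same claim -> flag synthesis"

# Claims tracked across calls. Hypothetical; a real pipeline would
# extract claims with an LLM rather than keyword matching.
TRACKED_CLAIMS = ("findings get ignored", "too many false positives")

def ingest_call(transcript: str, mentions: Counter) -> list[str]:
    """Record claim mentions from one transcript; return any claims
    that have now crossed the synthesis-update threshold."""
    for claim in TRACKED_CLAIMS:
        if claim in transcript.lower():
            mentions[claim] += 1
    return [c for c, n in mentions.items() if n >= SYNTHESIS_THRESHOLD]
```

The returned list is the "flag synthesis for update" signal; everything below the threshold stays in the detail layer.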

Real Query Performance Data

| Query | Hop Count | Path | Efficiency |
|-------|-----------|------|------------|
| "How do we position vs. Competitor A?" | 2 | Navigation → Competitive Intel Master | ⭐⭐⭐ Excellent |
| "ROI story for financial services?" | 3 | Navigation → Personas → Customer Synthesis | ⭐⭐ Good |
| "Top 3 buyer objections?" | 2 | Navigation → Customer Pain Points | ⭐⭐⭐ Excellent |
| "Positioning vs. new market entrant?" | 2 | Navigation → Competitive Intel Master | ⭐⭐⭐ Excellent |
| "Developer pain points?" | 2 | Navigation → Customer Pain Points | ⭐⭐⭐ Excellent |
| "SEO strategy for target keyword?" | 3 | Navigation → Content README → SEO Strategy | ⭐⭐ Good |
| "Brand voice for technical content?" | 2 | Navigation → Brand Voice Guide | ⭐⭐⭐ Excellent |

Pattern: 78% of queries answered in 2 hops or less. Average: 2.3 hops.

Compare to typical folder structure: 8-11 file reads per query.
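The sample stats are trivial to recompute from the seven rows in the table; a quick sketch:

```python
# Hop counts from the seven queries in the performance table above.
hop_counts = [2, 3, 2, 2, 2, 3, 2]

within_goal = sum(1 for h in hop_counts if h <= 2)
average = sum(hop_counts) / len(hop_counts)

print(f"{within_goal}/{len(hop_counts)} queries in 2 hops; avg {average:.1f}")
```

For this sample that prints `5/7 queries in 2 hops; avg 2.3`; keeping a running log like this is how the percentage stays honest as the query set grows.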

The Token Difference

Old way (folder hierarchy):

  • Average question: 35k tokens
  • 8-11 files read
  • 45 seconds
  • Answer quality: 6/10 (contradictions, missed context)

New way (synthesis + smart navigation):

  • Average question: 4k tokens
  • 1-2 files read
  • 8 seconds
  • Answer quality: 9/10 (consistent, sourced, complete)

That's 8.7x fewer tokens for better answers.

The 10-Question "Hop Count" Diagnostic

Run this test on your own knowledge base quarterly. How many hops does it take to get a complete, accurate answer?

| # | Critical Business Query | Your Hop Count | Ideal Path |
|---|-------------------------|----------------|------------|
| 1 | "Positioning vs. [Competitor A]?" | ? | 2 Hops: KB Home → Competitive_Master_Doc.md |
| 2 | "ROI story for [Key Vertical]?" | ? | 2 Hops: KB Home → Sales_Playbook.md |
| 3 | "Top 3 [Key Persona] objections?" | ? | 2 Hops: KB Home → Objection_Handling_Guide.md |
| 4 | "Quarterly event strategy & budget?" | ? | 2 Hops: KB Home → Events_Strategy.md |
| 5 | "Positioning vs. [Big Tech Competitor]?" | ? | 2 Hops: KB Home → Competitive_Master_Doc.md |
| 6 | "[Key Persona] pain points?" | ? | 2 Hops: KB Home → Customer_Pain_Summary.md |
| 7 | "SEO strategy for [Top Keyword]?" | ? | 3 Hops: KB Home → Marketing_Hub.md → SEO_Strategy.md |
| 8 | "Cost per lead for [Key Tactic]?" | ? | 2 Hops: KB Home → Marketing_Metrics.md |
| 9 | "What is our brand voice?" | ? | 2 Hops: KB Home → Brand_Voice_Guide.md |
| 10 | "What is our current pricing?" | ? | 2 Hops: KB Home → Pricing_and_Packaging.md |

If your hop counts are 4, 5, or 6+, your synthesis layer is broken. Your team is wasting hours hunting for information. Your AI assistants are burning tokens reading useless files.
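Running the diagnostic quarterly is easy to script. A hypothetical sketch using three of the ten questions, with ideal hop counts taken from the table (the function name and data shapes are assumptions):

```python
# Ideal hop counts for a few of the diagnostic questions above.
IDEAL_HOPS = {
    "Positioning vs. [Competitor A]?": 2,
    "SEO strategy for [Top Keyword]?": 3,
    "What is our current pricing?": 2,
}

def broken_questions(measured: dict[str, int]) -> list[str]:
    """Questions whose measured hop count exceeds the ideal path,
    i.e. where the synthesis layer needs work. Questions you never
    measured count as broken, too."""
    return [q for q, ideal in IDEAL_HOPS.items()
            if measured.get(q, float("inf")) > ideal]
```

Anything this returns is a missing or stale synthesis doc waiting to be written.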

How to Fix a High Hop Count

The solution: stop just collecting information and start synthesizing it.

Week 1: Pick your messiest folder. Create ONE synthesis doc. Consolidate everything into it.

Week 2: Add a README that tells Claude where to look first.

Week 3: Set up automation to extract details from raw data into structured docs.

Week 4: Add cascade rules (when detail doc hits threshold, update synthesis).

You don't need to do this all at once. Start with one domain. The token savings compound fast.

The 4-Step Fix

1. Identify Your "High Hop" Questions

Find the critical questions that are slow to answer.

2. Create Missing Synthesis Docs

For every "high hop" question, you're missing a master document. Go create it.

3. Appoint a Knowledge "Synthesizer"

Make it someone's job to regularly update these master docs with new insights from the detail layer.

4. Ruthlessly Archive

Move raw, unprocessed detail files into an "Archive" or "Source Data" folder so they don't clog up navigation.

What I Learned Building This

The goal isn't "organize files."

The goal is: minimize hops from question to answer.

After the navigation README: synthesis docs = hop 1. Detail docs = hop 2. Raw data = hop 3 (rarely needed).

Most questions should never get past hop 1.

Stop building a knowledge graveyard. Start building an operating system.

Measure your "Hop Count" and declare war on friction.