Hierarchical Context Loading: Why Progressive Disclosure Beats Monolithic Prompts

A three-tier architecture that loads AI context progressively instead of all at once. Using /create-skill as a real example of how navigation → skill → execution prevents token waste and enables session persistence.

Cyberpunk anime: A figure navigates hierarchical tiers of floating knowledge - navigation, agents, and skills - pulling only needed context from the void

Your AI assistant can do more than you think. It just needs better scaffolding.

Every session starts from zero. You explain context, re-establish workflows, reload templates. The AI has capability—it just lacks architecture to use it consistently across sessions.

That's the problem hierarchical context loading solves. Instead of dumping 8,000 lines of context into every session, you load three tiers progressively: navigation (what exists), skills (how to execute), and execution context (what's needed right now).

As Anthropic's research on building effective agents suggests, the solution is better orchestration, not bigger models.

The Context Overload Problem

Most AI frameworks do this:

User: "Create a new skill"
     ↓
AI loads EVERYTHING:
     ├── All agents (800+ lines)
     ├── All skills (5,000+ lines)
     ├── All templates (2,000+ lines)
     ├── All workflows (3,000+ lines)
     └── Total: 10,000+ lines loaded

Problems:

  • Wastes tokens on irrelevant context
  • Slow session startup
  • Hard to maintain (update one thing, breaks everywhere)
  • Can't fit large reference materials in context
  • AI gets lost in noise

The Solution: Three-Tier Progressive Loading

Instead of loading everything, load three tiers on-demand:

┌─────────────────────────────────────────────────────────┐
│  Tier 1: CLAUDE.md (~315 lines)                         │
│  "Where do I go? What exists?"                          │
│                                                          │
│  ├─ Agent registry (security, writer, engineer)         │
│  ├─ Skill directory (what skills exist)                 │
│  ├─ Routing rules (which agent handles what)            │
│  └─ Global preferences                                  │
└─────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────┐
│  Tier 2: skills/[skill-name]/SKILL.md (~300-500 lines)  │
│  "How do I execute this?"                               │
│                                                          │
│  ├─ Skill identity and purpose                          │
│  ├─ 5-phase workflow structure                          │
│  ├─ Tool and script inventory                           │
│  ├─ Pointers to supporting docs                         │
│  └─ Success criteria                                    │
└─────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────┐
│  Tier 3: Execution Context (on-demand)                  │
│  "What do I need right now?"                            │
│                                                          │
│  Phase 1 → requirements-questions.md (~200 lines)       │
│  Phase 2 → naming-conventions.md                        │
│  Phase 3 → templates/ (loaded one at a time)            │
│  Phase 4 → quality-checklist.md                         │
│  Phase 5 → (no extra docs needed)                       │
└─────────────────────────────────────────────────────────┘

Result: roughly a tenfold reduction in loaded context while maintaining full capability.
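The tier-by-tier loading order can be sketched as a small loader. This is an illustration only, not framework code: the file paths and the `load` helper are assumptions based on the layout shown above.

```python
from pathlib import Path

def load(path: str) -> str:
    """Read one context file; return an empty string if it does not exist."""
    p = Path(path)
    return p.read_text() if p.exists() else ""

def build_context(skill: str, phase_doc: str = "") -> list[str]:
    """Assemble context tier by tier instead of loading everything at once."""
    context = [load("CLAUDE.md")]                     # Tier 1: navigation
    context.append(load(f"skills/{skill}/SKILL.md"))  # Tier 2: execution workflow
    if phase_doc:                                     # Tier 3: only the doc this phase needs
        context.append(load(f"skills/{skill}/docs/{phase_doc}"))
    return [c for c in context if c]
```

The key property is that Tier 3 is a parameter: each phase passes in the one supporting doc it needs, and nothing else enters the context.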

Real Example: The /create-skill Workflow

Let me show you how this works using the actual /create-skill command—the skill that creates new skills.

When you type /create-skill, here's what loads:

Tier 1: Navigation (CLAUDE.md)

File: CLAUDE.md (315 lines)
Purpose: "What skills exist and where do I find them?"

## Available Skills
- create-skill → Interactive skill scaffolding wizard

## Agent Routing
- Skill creation → engineer agent (for complex workflows)
- Simple templates → Base Claude (no agent needed)

That's it. Just a pointer to the skill. No workflows, no templates, no methodology.

Tier 2: Skill Execution (SKILL.md)

File: skills/create-skill/SKILL.md (458 lines)
Purpose: "How do I create a skill? What's the process?"

Now the skill loads its own context:

# Create-Skill Workflow

## 5-Phase Process:
1. DISCOVER - Gather requirements
2. DESIGN - Plan structure
3. GENERATE - Create files from templates
4. VALIDATE - Run quality checks
5. HANDOFF - Guide user to customize

## Templates Used:
- skills/create-skill/templates/SKILL-TEMPLATE.md
- skills/create-skill/templates/README-TEMPLATE.md
- skills/create-skill/templates/VERIFY-TEMPLATE.md
- skills/create-skill/templates/phases/PHASE-TEMPLATE.md

## Supporting Docs (load on-demand):
- docs/requirements-questions.md
- docs/naming-conventions.md
- docs/skill-structure-standards.md

Notice: The SKILL.md doesn't contain the templates themselves. It just points to them.

Tier 3: On-Demand Reference

Only when needed, the skill loads specific supporting docs:

Phase 1 (DISCOVER) → Load requirements-questions.md
Phase 2 (DESIGN) → Load naming-conventions.md
Phase 3 (GENERATE) → Load templates one at a time
Phase 4 (VALIDATE) → Load quality checklist
Phase 5 (HANDOFF) → Nothing extra needed
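The phase-to-document mapping above is just a lookup table. The file names below mirror the list and are illustrative, not confirmed paths:

```python
# Each phase loads at most one supporting doc; None means nothing extra.
PHASE_DOCS = {
    1: "docs/requirements-questions.md",  # DISCOVER
    2: "docs/naming-conventions.md",      # DESIGN
    3: None,  # GENERATE: templates are loaded one at a time instead
    4: "docs/quality-checklist.md",       # VALIDATE
    5: None,  # HANDOFF: nothing extra needed
}

def doc_for_phase(phase: int):
    """Return the supporting doc to load for a phase, or None."""
    return PHASE_DOCS.get(phase)
```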

Total loaded during Phase 1:

  • CLAUDE.md: 315 lines
  • SKILL.md: 458 lines
  • requirements-questions.md: ~200 lines
  • Total: ~973 lines

Traditional approach would load: 10,000+ lines (everything)

That's an order of magnitude reduction while maintaining full capability.

How This Enables Session Persistence

Here's where it gets powerful.

Traditional approach:

Session 1: User starts creating skill
Session 2 (next day): AI has no memory
           User re-explains everything
           AI reloads all context

Hierarchical approach with checkpointing:

┌──────────────────────────────────────────────────────────┐
│  SESSION 1: Initial Work                                 │
└──────────────────────────────────────────────────────────┘
    User: "Create new skill for API testing"
      ↓
    AI loads context:
      ├─ CLAUDE.md (315 lines)
      ├─ skills/create-skill/SKILL.md (458 lines)
      └─ requirements-questions.md (Phase 1)
      ↓
    Work completes Phase 2 (DESIGN)
      ↓
    AI creates checkpoint:
      sessions/2026-01-29-api-testing-skill.md
      ├─ Skill name: api-testing
      ├─ Phase completed: 2 (DESIGN)
      ├─ Decisions: REST focus, TypeScript, OAuth support
      └─ Files created: SKILL.md draft, README.md draft

┌──────────────────────────────────────────────────────────┐
│  SESSION 2: Resume Work (next day)                       │
└──────────────────────────────────────────────────────────┘
    User: "Continue the API testing skill"
      ↓
    AI loads same context:
      ├─ CLAUDE.md (315 lines)
      ├─ skills/create-skill/SKILL.md (458 lines)
      └─ sessions/2026-01-29-api-testing-skill.md (checkpoint)
      ↓
    AI reads checkpoint and knows:
      ├─ Current phase: 3 (GENERATE)
      ├─ Context: REST API testing, TypeScript, OAuth
      └─ Next action: Load templates, generate files
      ↓
    Continues exactly where it left off

The checkpoint file stores:

  • Which skill was being created
  • Which phase we completed
  • Decisions already made
  • Files already created

Combined with hierarchical loading, the AI can resume work across sessions because it knows:

  1. Where it is (checkpoint file)
  2. What to do (SKILL.md workflow)
  3. How to do it (on-demand templates)
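A checkpoint only needs to round-trip a handful of fields. The sketch below uses plain `key: value` lines; the real checkpoint files are markdown, so the exact format here is an assumption.

```python
from pathlib import Path

def write_checkpoint(path, skill, phase, decisions, files):
    """Persist session state as simple 'key: value' lines (illustrative format)."""
    lines = [
        f"skill: {skill}",
        f"phase_completed: {phase}",
        f"decisions: {', '.join(decisions)}",
        f"files_created: {', '.join(files)}",
    ]
    Path(path).write_text("\n".join(lines) + "\n")

def read_checkpoint(path):
    """Parse the checkpoint back into a dict so the next session can resume."""
    state = {}
    for line in Path(path).read_text().splitlines():
        key, _, value = line.partition(": ")
        state[key] = value
    return state
```

On resume, the AI reads the dict, sees `phase_completed: 2`, and knows the next action is Phase 3 with the design decisions already in hand.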

The Three Context Tiers Explained

Tier 1: Organization (CLAUDE.md)

Size: ~315 lines
Purpose: Navigation layer

Contains:

  • Agent registry (security, writer, engineer, advisor, legal)
  • Skill directory (what skills exist)
  • Routing rules (which agent handles what)
  • Critical global requirements

Does NOT contain:

  • Workflows
  • Methodologies
  • Templates
  • Tool documentation
  • Implementation details

Update when:

  • New skill added
  • New agent created
  • Routing rules change

Tier 2: Skills (skills/*/SKILL.md)

Size: 300-500 lines per skill
Purpose: Complete skill context with progressive loading

Contains:

  • Skill identity and purpose
  • 5-phase workflow structure
  • Tool and script inventory
  • Pointers to supporting docs
  • Success criteria
  • Output specifications

Progressive pattern:

SKILL.md acts as navigation to:
├── docs/ - Reference documentation
├── templates/ - Output templates
├── workflows/ - Detailed procedures
└── phases/ - Step-by-step execution

Update when:

  • Workflow changes
  • New tools added
  • Methodology refined

Tier 3: Agents (agents/*.md)

Size: ~170-190 lines per agent
Purpose: Agent identity and specialized behavior

Contains:

  • Agent role definition
  • Communication style
  • Context loading instructions
  • Skill routing (which skills this agent uses)

Example:

# Engineer Agent

## Core Identity
Implementation specialist. Infrastructure, remediation, deployment.

## Skills Used
- create-skill (skill scaffolding)
- infrastructure-ops (deployment automation)
- remediation (fix security findings)

## Context Loading
1. Read CLAUDE.md
2. Load skill based on user request
3. Read session checkpoint (if exists)
4. Execute skill workflow

Update when:

  • Agent behavior changes
  • New skills become available
  • Communication style evolves

Catalog-Based Discovery

To prevent even the navigation layer from becoming bloated, the framework uses catalog files:

library/catalogs/COMMANDS.md:

Complete list of all slash commands
├── /create-skill → create-skill skill (public)
├── /pentest → security skill (private)
├── /career → career skill (public)
└── [47 total commands]

library/catalogs/TOOL-CATALOG.md:

Complete API client and utility inventory
├── Ghost CMS client (authenticated)
├── OpenAI client (authenticated)
└── [Tool status and authentication requirements]

Why catalogs?

  • CLAUDE.md stays concise (points to catalog)
  • Catalog can grow without bloating navigation
  • Easy to search and reference
  • Automated filtering for public/private split
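The public/private filtering can be as simple as a visibility tag on each catalog entry. The tuple format below is an assumption based on the catalog excerpt, not the framework's actual data model:

```python
# (command, target skill, visibility) — mirrors the COMMANDS.md entries above.
ENTRIES = [
    ("/create-skill", "create-skill skill", "public"),
    ("/pentest", "security skill", "private"),
    ("/career", "career skill", "public"),
]

def public_commands(entries):
    """Keep only the entries safe to publish in the open-source split."""
    return [cmd for cmd, _, vis in entries if vis == "public"]
```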

Enforcing the Architecture

Pre-commit hooks enforce size limits:

# Validation before every commit
hooks/validate-context-sizes.sh

Checks:
✓ CLAUDE.md < 400 lines (currently 315)
✓ agents/*.md < 200 lines (currently ~186)
✓ skills/*/SKILL.md < 500 lines (currently ~300-458)

If violated → Commit blocked

Why strict limits?

Without enforcement, entropy wins. Files bloat. Context becomes noise. The architecture degrades.

Hard limits force:

  • Separation of concerns
  • Progressive disclosure
  • Catalog-based discovery
  • On-demand loading
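The size check itself is a few lines of code. The actual hook is a shell script; this is a Python approximation using the limits stated above, with glob patterns that are assumptions about the repo layout:

```python
from pathlib import Path

# File pattern → maximum allowed line count (limits from the hook above).
LIMITS = {
    "CLAUDE.md": 400,
    "agents/*.md": 200,
    "skills/*/SKILL.md": 500,
}

def check_sizes(root="."):
    """Return a list of (file, lines, limit) violations; empty means commit OK."""
    violations = []
    for pattern, limit in LIMITS.items():
        for f in Path(root).glob(pattern):
            n = len(f.read_text().splitlines())
            if n > limit:
                violations.append((str(f), n, limit))
    return violations
```

Wired into a pre-commit hook, a non-empty result blocks the commit, which is what keeps the navigation layer from silently bloating over time.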

Benefits in Practice

Token efficiency:

  • Load only what's needed for current task
  • Avoid dumping entire framework into every session
  • Enables larger reference materials by not front-loading them

Maintainability:

  • Update one skill file, not monolithic config
  • Clear separation of what goes where
  • Easy to find and modify specific workflows

Scalability:

  • Add new skills without bloating navigation
  • Skills self-contained (input/, output/, scripts/)
  • Catalog grows independently of core files

Session persistence:

  • Checkpoint files store session state
  • Hierarchical loading brings back exact context
  • Resume multi-day work exactly where you left off

Try It Yourself

The Intelligence Adjacent framework implements this architecture:

# Clone
git clone https://github.com/notchrisgroves/ia-framework.git

# Install (creates ~/.claude symlink)
./setup/install.sh

# Try creating a skill
/create-skill

# Observe hierarchical loading
# 1. CLAUDE.md loads first (navigation)
# 2. create-skill/SKILL.md loads (execution)
# 3. Supporting docs load on-demand (templates)

Watch the context load progressively. Notice:

  • Fast startup (small nav file)
  • Clear workflow (SKILL.md)
  • On-demand templates (only when needed)

The Philosophy

Intelligence Adjacent means AI working alongside human intelligence—not replacing it.

Hierarchical context loading is scaffolding. It gives AI:

  • Context without overload
  • Methodology without rigidity
  • Memory across sessions
  • Capability without complexity

The AI handles execution. You handle judgment.

Orchestration over intelligence. Architecture over scale. Augmentation over automation.

That's hierarchical context loading in the IA framework.


Stay Updated

Subscribe for free to get weekly posts on AI systems, framework architecture, and building capability without gatekeeping.
