Hard Lessons from a Failed Framework Redesign
I spent 8 days rebuilding the IA Framework from the ground up. It failed spectacularly. Here's what I learned about engineering reality, AI behavior, and the value of boring solutions.
Over the first two weeks of January 2026, I undertook an ambitious redesign of the Intelligence Adjacent framework. The goal was elegant: unify the architecture under a "module-driven" approach with TypeScript hooks, YAML frontmatter, and enhanced AI-readability features. The result was less elegant: I archived the entire attempt and restored the original V1 structure.
This is the story of that failure, what I learned, and why the boring solution won.
What I Tried: The V2 Vision
Between January 7 and 9, I committed to a complete framework restructure. The V2 approach looked clean on paper:
Module-Driven Architecture:
- Replace "skills" terminology with "modules" for consistency
- Unified module contract: index.ts entry point, YAML frontmatter in README.md
- TypeScript hook loader for dynamic dispatch
- Enhanced session schema with workflow state tracking
AI-Readability Enhancements:
- "FOR AI AGENTS" headers in all documentation
- YAML frontmatter with structured metadata
- Module tier system (foundation, capability, integration)
- Enforcement hooks for workflow validation
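To make the frontmatter idea concrete, here's roughly what a V2 module header looked like. The field names are my reconstruction for illustration, not the actual V2 schema:

```yaml
# Hypothetical V2 module frontmatter -- field names are illustrative
---
module: ghost
tier: capability          # foundation | capability | integration
entry: index.ts
workflow_states: [research, draft, qa, publish]
for_ai_agents: true
---
```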
System Organization:
- New root docs: INSTALL.md, PLATFORM.md, SECURITY.md
- Architecture docs moved to system/ directory
- Comprehensive MODULE-TEMPLATE for consistency
- Python validators ported to TypeScript (TypeScript-first principle)
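For context, here's a minimal sketch of the hook-loader idea: modules register TypeScript hooks, and a dispatcher invokes them by workflow phase. Names and signatures (ModuleHook, dispatch) are my reconstruction for illustration, not the actual V2 code:

```typescript
// Sketch of the V2 hook-loader concept: dynamic dispatch of
// module hooks by workflow phase, threading in-memory state.
// Names here are illustrative, not the real V2 API.

type Phase = "research" | "draft" | "publish";

interface ModuleHook {
  phase: Phase;
  run: (state: Record<string, unknown>) => Record<string, unknown>;
}

const hooks: ModuleHook[] = [];

function register(hook: ModuleHook): void {
  hooks.push(hook);
}

// Run every hook registered for the given phase, passing the
// accumulated state along -- the in-memory state that, in practice,
// vanished between Claude invocations.
function dispatch(
  phase: Phase,
  state: Record<string, unknown>
): Record<string, unknown> {
  return hooks
    .filter((h) => h.phase === phase)
    .reduce((s, h) => h.run(s), state);
}

register({ phase: "draft", run: (s) => ({ ...s, drafted: true }) });
const result = dispatch("draft", { topic: "v2-postmortem" });
console.log(result.drafted); // true
```

The pattern itself is unremarkable; the failure was in what it assumed about the runtime, not in the dispatch logic.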
The commits tell the story: "Phase 1: Root documentation and system organization," "Phase 2: Module hook loader infrastructure," "Phase 3: Ghost module migration (reference implementation)."
It looked ambitious but achievable. I even documented it as "V2 restructure - Phases 1-3 complete."
What Failed: Engineering Reality Hits
By January 12, reality caught up. The commit message says it plainly: "Consolidate skills architecture: archive V2, restore V1 base."
The fundamental problem: Claude doesn't reliably call TypeScript functions back.
I built a beautiful hook loader system. Dynamic dispatch. Workflow state tracking. Module contracts with frontmatter validation. All of it assumed Claude would execute TypeScript reliably during workflow phases.
It didn't.
What actually happened:
- Hook calls succeeded inconsistently
- Function execution wasn't guaranteed between context switches
- Abstract state (in-memory objects) disappeared between agent invocations
- File-based state (markdown, YAML) persisted; everything else was ephemeral
The V2 architecture relied on TypeScript execution guarantees that Claude Code doesn't provide. I was designing for the world I wanted, not the world that exists.
The Recovery: Back to V1 with Lessons Applied
On January 12, I made the call: archive V2, restore V1, consolidate what worked.
The consolidation commit:
- Moved V2 to archive/ia-framework-v2/ (preserved for reference)
- Restored V1 structure as main framework base
- Created consolidated skills from both versions:
  - ghost/ - Blog publishing with V2 content improvements
  - security/ - Pentesting with 7 methodologies
  - advisory/ - ISO + NIST frameworks
  - health/ - V1 + V2 tiered research combined
  - content/ - Writer + diagrams
  - research/ - OSINT + QA
  - git/ - V2 push/public modules
What V1 got right:
- File-based context (markdown, YAML, templates)
- Hierarchical loading (CLAUDE.md → skills/*/SKILL.md → agents/*.md)
- Self-contained skills (each has input/, output/, scripts/, templates/)
- Phase-based workflows with file gates (not function callbacks)
What V2 contributed:
- Better voice and tone (first person, helpful not dictatorial)
- Content redesign patterns (AI-first framing, visual hierarchy)
- Workflow diagrams (ASCII art in markdown)
- Transparency about constraints
The result: V1 structure with V2 lessons applied. Boring, reliable, file-based.
What I Learned: Five Hard Truths
1. Design for How AI Actually Behaves
The mistake: I designed for reliable TypeScript execution.
The reality: Claude's execution model is opportunistic. File reads/writes persist. Function calls don't guarantee callbacks. State resets between invocations.
The lesson: If you can't checkpoint it to a file, don't depend on it. File-based workflows are boring but they survive context switches.
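A minimal sketch of that checkpoint-to-file pattern, assuming a hypothetical session-state file (the file name and schema here are illustrative, not the framework's actual format):

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// File-based checkpointing sketch: workflow state lives in a
// markdown file with YAML frontmatter, not in memory, so it
// survives context switches. Schema is illustrative.

interface SessionState {
  phase: string;
  completed: string[];
}

function checkpoint(file: string, state: SessionState): void {
  const doc = [
    "---",
    `phase: ${state.phase}`,
    `completed: [${state.completed.join(", ")}]`,
    "---",
    "",
    "# Session checkpoint",
  ].join("\n");
  fs.writeFileSync(file, doc, "utf8");
}

function restore(file: string): SessionState {
  const text = fs.readFileSync(file, "utf8");
  const phase = /phase: (.+)/.exec(text)![1];
  const completed = /completed: \[(.*)\]/.exec(text)![1]
    .split(", ")
    .filter(Boolean);
  return { phase, completed };
}

const file = path.join(os.tmpdir(), "session-state.md");
checkpoint(file, { phase: "qa", completed: ["research", "draft"] });
const restored = restore(file);
console.log(restored.phase); // "qa"
```

The point is not the parsing (a real YAML library would be better); it's that every piece of state an agent needs later gets written to disk before the invocation ends.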
2. Markdown + Hooks Beat Abstraction
The mistake: I built a hook loader with dynamic dispatch and workflow state tracking.
The reality: Claude loads markdown files reliably. It reads YAML frontmatter consistently. It follows phase-based workflows defined in text files.
The lesson: The simplest architecture that works is better than the elegant architecture that doesn't. Markdown documentation with validation hooks beat TypeScript abstractions.
3. Hierarchical Context Loading Solves Token Bloat
This one worked. Before V2, CLAUDE.md was 800+ lines with agent methodologies, tool lists, and workflows mixed in.
The fix:
- Level 1: CLAUDE.md (<150 lines) - Navigation only
- Level 2: skills/*/SKILL.md (<500 lines each) - Complete skill context
- Level 3: agents/*.md (<200 lines each) - Agent identity and routing
Token efficiency: 69% reduction for simple tasks. Maintenance: update one file instead of a monolithic CLAUDE.md.
This survived the consolidation. It's now the standard pattern.
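As a sketch, the navigation layer can be as thin as routing pointers. The paths and skill names here are illustrative:

```markdown
<!-- CLAUDE.md: Level 1, navigation only -->
## Skill Routing
- Blog publishing → skills/ghost/SKILL.md
- Security testing → skills/security/SKILL.md
- Advisory work → skills/advisory/SKILL.md

Load the matching SKILL.md on demand; don't inline methodology here.
```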
4. Self-Contained Skills Are the Right Unit
V2 tried "modules" for consistency. The terminology change added nothing.
What matters: Each skill is self-contained:
```
skills/ghost/
├── SKILL.md      # Complete documentation
├── workflows/    # Phase-based execution
├── docs/         # Domain knowledge
├── scripts/      # Automation tools
├── templates/    # Output formats
├── input/        # User-provided resources
└── output/       # Generated deliverables
```
No dependencies on global state. No shared abstractions. Each skill loads its own context and executes independently.
This pattern survived because it matches how Claude actually loads context.
5. Transparency About Failure Builds Trust
From the voice guide I wrote during recovery: "Readers learn more from failures than successes. Transparent failure analysis builds confidence that decisions are informed, not accidental."
I spent 8 days on V2. It failed. The commit message doesn't hide it: "Archive V2, restore V1 base."
The lesson: Document what didn't work and why. It prevents repeating mistakes. It shows decision-making is evidence-based, not vibes-based.
The New Architecture: What Survived
After consolidation, the framework stabilized around these principles:
File-Based Context:
- CLAUDE.md for navigation (<150 lines)
- skills/*/SKILL.md for complete skill documentation (<500 lines)
- Workflows defined in markdown with phase gates
- Templates in files, not code
- State checkpointed to markdown files
Hierarchical Loading:
- Claude loads CLAUDE.md on startup
- Skills load on-demand via routing
- Agents read skill context progressively
- No monolithic files, no global state
Self-Contained Skills:
- Each skill manages its own input/output/scripts/templates
- No cross-skill dependencies
- Workflows execute via markdown documentation
- Validation via hooks, not TypeScript contracts
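To show what "validation via hooks" can mean in practice, here's a small check over the markdown files themselves. The rules below are example checks I'm using for illustration, not the framework's actual validators:

```typescript
// Illustrative validation hook: lint a SKILL.md's structure.
// These rules are examples, not the framework's real checks.

function validateSkillDoc(text: string): string[] {
  const errors: string[] = [];
  if (!text.startsWith("---")) {
    errors.push("missing YAML frontmatter");
  }
  if (!/^# /m.test(text)) {
    errors.push("missing top-level heading");
  }
  if (text.split("\n").length > 500) {
    errors.push("exceeds 500-line budget");
  }
  return errors;
}

const ok = "---\nskill: ghost\n---\n# Ghost Publishing\n";
const bad = "# No frontmatter here\n";
console.log(validateSkillDoc(ok).length); // 0
console.log(validateSkillDoc(bad));       // ["missing YAML frontmatter"]
```

Because the hook only reads files, it makes no assumptions about execution order or in-memory state; it works whenever it happens to run.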
Human-Readable Constraints:
- First person voice (I, not we)
- Helpful tone (recommend, not mandate)
- Transparent about what doesn't work
- Evidence-based decisions
The redesign sessions from January 18 show the refinement: AI-first framing, workflow diagrams, progressive disclosure. But the underlying structure is V1. Because V1 works.
What's Next: Building on Boring
The framework is stable. Ghost blog publishing is operational with a 5-phase workflow (Research → Draft → QA → Visuals → Publish). Security testing has 7 methodologies. Advisory includes ISO/NIST frameworks. Career analysis, wellness research, CliftonStrengths coaching—all working.
The roadmap focuses on:
- Expanding public skills (currently 6 ready for release)
- Content creation workflows (this post is dogfooding the system)
- Integration testing (Cal.com, Stripe, n8n planned)
- Documentation refinement (voice guide enforcement)
No grand architectural rewrites. No module contracts or hook loaders. Just file-based workflows, hierarchical context, and self-contained skills.
Because boring solutions that work beat elegant solutions that don't.
Lessons for AI-Assisted Development
If you're building frameworks for AI agents:
Design for persistence, not execution:
- File-based state survives context switches
- Function calls don't guarantee callbacks
- Markdown documentation works; TypeScript contracts are fragile
Optimize for token efficiency:
- Hierarchical loading reduces context bloat
- Navigation layer + skill-specific context
- Load what you need, when you need it
Self-contained is better than shared:
- Independent skills with no cross-dependencies
- Each skill manages its own resources
- No global state, no shared abstractions
Be transparent about constraints:
- Document what doesn't work
- Explain why you made each decision
- Show the evidence, not just the conclusion
Test on AI behavior, not assumptions:
- How does Claude actually load context?
- What execution guarantees exist?
- What persists between invocations?
The V2 attempt taught me these lessons the hard way. The framework is better for it. And this blog post exists because I documented what failed, not just what worked.
That's the Intelligence Adjacent philosophy: build systems that work alongside human intelligence, not replace it. Sometimes that means learning the hard way that your elegant solution doesn't match reality.
And sometimes that means archiving 8 days of work and restoring the boring solution that works.