Hard Lessons from a Failed Framework Redesign
I spent 8 days rebuilding the IA Framework from the ground up. It failed spectacularly. Here's what I learned about engineering reality, AI behavior, and the value of boring solutions.
Over the first two weeks of January 2026, I undertook an ambitious redesign of the Intelligence Adjacent framework. The goal was elegant: unify the architecture under a "module-driven" approach with TypeScript hooks, YAML frontmatter, and enhanced AI-readability features. The result was less elegant: I archived the entire attempt and restored the original V1 structure.
This is the story of that failure, what I learned, and why the boring solution won.
What I Tried: The V2 Vision
Between January 7 and 9, I committed to a complete framework restructure. The V2 approach looked clean on paper:
Module-Driven Architecture:
- Replace "skills" terminology with "modules" for consistency
- Unified module contract: index.ts entry point, YAML frontmatter in README.md
- TypeScript hook loader for dynamic dispatch
- Enhanced session schema with workflow state tracking
AI-Readability Enhancements:
- "FOR AI AGENTS" headers in all documentation
- YAML frontmatter with structured metadata
- Module tier system (foundation, capability, integration)
- Enforcement hooks for workflow validation
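To make the frontmatter idea concrete, here's roughly what a V2 module header looked like. The field names are my reconstruction for illustration, not the actual V2 schema:

```yaml
# Hypothetical V2 module frontmatter -- field names are illustrative
---
module: ghost
tier: capability          # foundation | capability | integration
entry: index.ts
workflow_states: [research, draft, qa, publish]
for_ai_agents: true
---
```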
System Organization:
- New root docs: INSTALL.md, PLATFORM.md, SECURITY.md
- Architecture docs moved to system/ directory
- Comprehensive MODULE-TEMPLATE for consistency
- Python validators ported to TypeScript (TypeScript-first principle)
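For context, here's a minimal sketch of the hook-loader idea: modules register TypeScript hooks, and a dispatcher invokes them by workflow phase. Names and signatures (ModuleHook, dispatch) are my reconstruction for illustration, not the actual V2 code:

```typescript
// Sketch of the V2 hook-loader concept: dynamic dispatch of
// module hooks by workflow phase, threading in-memory state.
// Names here are illustrative, not the real V2 API.

type Phase = "research" | "draft" | "publish";

interface ModuleHook {
  phase: Phase;
  run: (state: Record<string, unknown>) => Record<string, unknown>;
}

const hooks: ModuleHook[] = [];

function register(hook: ModuleHook): void {
  hooks.push(hook);
}

// Run every hook registered for the given phase, passing the
// accumulated state along -- the in-memory state that, in practice,
// vanished between Claude invocations.
function dispatch(
  phase: Phase,
  state: Record<string, unknown>
): Record<string, unknown> {
  return hooks
    .filter((h) => h.phase === phase)
    .reduce((s, h) => h.run(s), state);
}

register({ phase: "draft", run: (s) => ({ ...s, drafted: true }) });
const result = dispatch("draft", { topic: "v2-postmortem" });
console.log(result.drafted); // true
```

The pattern itself is unremarkable; the failure was in what it assumed about the runtime, not in the dispatch logic.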
The commits tell the story: "Phase 1: Root documentation and system organization," "Phase 2: Module hook loader infrastructure," "Phase 3: Ghost module migration (reference implementation)."
It looked ambitious but achievable. I even documented it as "V2 restructure - Phases 1-3 complete."
What Failed: Engineering Reality Hits
By January 12, reality caught up. The commit message says it plainly: "Consolidate skills architecture: archive V2, restore V1 base."
The fundamental problem: Claude doesn't reliably call TypeScript functions back.
I built a beautiful hook loader system. Dynamic dispatch. Workflow state tracking. Module contracts with frontmatter validation. All of it assumed Claude would execute TypeScript reliably during workflow phases.
It didn't.
What actually happened:
- Hook calls succeeded inconsistently
- Function execution wasn't guaranteed between context switches
- Abstract state (in-memory objects) disappeared between agent invocations
- File-based state (markdown, YAML) persisted; everything else was ephemeral
The V2 architecture relied on TypeScript execution guarantees that Claude Code doesn't provide. I was designing for the world I wanted, not the world that exists.
The Recovery: Back to V1 with Lessons Applied
On January 12, I made the call: archive V2, restore V1, consolidate what worked.
The consolidation commit:
- Moved V2 to archive/ia-framework-v2/ (preserved for reference)
- Restored V1 structure as main framework base
- Created consolidated skills from both versions:
  - ghost/ - Blog publishing with V2 content improvements
  - security/ - Pentesting with 7 methodologies
  - advisory/ - ISO + NIST frameworks
  - health/ - V1 + V2 tiered research combined
  - content/ - Writer + diagrams
  - research/ - OSINT + QA
  - git/ - V2 push/public modules
What V1 got right:
- File-based context (markdown, YAML, templates)
- Hierarchical loading (CLAUDE.md → skills/*/SKILL.md → agents/*.md)
- Self-contained skills (each has input/, output/, scripts/, templates/)
- Phase-based workflows with file gates (not function callbacks)
What V2 contributed:
- Better voice and tone (first person, helpful not dictatorial)
- Content redesign patterns (AI-first framing, visual hierarchy)
- Workflow diagrams (ASCII art in markdown)
- Transparency about constraints
The result: V1 structure with V2 lessons applied. Boring, reliable, file-based.
What I Learned: Five Hard Truths
1. Design for How AI Actually Behaves
The mistake: I designed for reliable TypeScript execution.
The reality: Claude's execution model is opportunistic. File reads/writes persist. Function calls don't guarantee callbacks. State resets between invocations.
The lesson: If you can't checkpoint it to a file, don't depend on it. File-based workflows are boring but they survive context switches.
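A minimal sketch of that checkpoint-to-file pattern, assuming a hypothetical session-state file (the file name and schema here are illustrative, not the framework's actual format):

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// File-based checkpointing sketch: workflow state lives in a
// markdown file with YAML frontmatter, not in memory, so it
// survives context switches. Schema is illustrative.

interface SessionState {
  phase: string;
  completed: string[];
}

function checkpoint(file: string, state: SessionState): void {
  const doc = [
    "---",
    `phase: ${state.phase}`,
    `completed: [${state.completed.join(", ")}]`,
    "---",
    "",
    "# Session checkpoint",
  ].join("\n");
  fs.writeFileSync(file, doc, "utf8");
}

function restore(file: string): SessionState {
  const text = fs.readFileSync(file, "utf8");
  const phase = /phase: (.+)/.exec(text)![1];
  const completed = /completed: \[(.*)\]/.exec(text)![1]
    .split(", ")
    .filter(Boolean);
  return { phase, completed };
}

const file = path.join(os.tmpdir(), "session-state.md");
checkpoint(file, { phase: "qa", completed: ["research", "draft"] });
const restored = restore(file);
console.log(restored.phase); // "qa"
```

The point is not the parsing (a real YAML library would be better); it's that every piece of state an agent needs later gets written to disk before the invocation ends.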
2. Markdown + Hooks Beat Abstraction
The mistake: I built a hook loader with dynamic dispatch and workflow state tracking.
The reality: Claude loads markdown files reliably. It reads YAML frontmatter consistently. It follows phase-based workflows defined in text files.
The lesson: The simplest architecture that works is better than the elegant architecture that doesn't. Markdown documentation with validation hooks beat TypeScript abstractions.
3. Hierarchical Context Loading Solves Token Bloat
This one worked. Before V2, CLAUDE.md was 800+ lines with agent methodologies, tool lists, and workflows mixed in.
The fix:
- Level 1: CLAUDE.md (<150 lines) - Navigation only
- Level 2: skills/*/SKILL.md (<500 lines each) - Complete skill context
- Level 3: agents/*.md (<200 lines each) - Agent identity and routing
Token efficiency: 69% reduction for simple tasks. Maintenance: update one file instead of a monolithic CLAUDE.md.
This survived the consolidation. It's now the standard pattern.
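As a sketch, the navigation layer can be as thin as routing pointers. The paths and skill names here are illustrative:

```markdown
<!-- CLAUDE.md: Level 1, navigation only -->
## Skill Routing
- Blog publishing → skills/ghost/SKILL.md
- Security testing → skills/security/SKILL.md
- Advisory work → skills/advisory/SKILL.md

Load the matching SKILL.md on demand; don't inline methodology here.
```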
4. Self-Contained Skills Are the Right Unit
V2 tried "modules" for consistency. The terminology change added nothing.
What matters: Each skill is self-contained:
```
skills/ghost/
├── SKILL.md      # Complete documentation
├── workflows/    # Phase-based execution
├── docs/         # Domain knowledge
├── scripts/      # Automation tools
├── templates/    # Output formats
├── input/        # User-provided resources
└── output/       # Generated deliverables
```
No dependencies on global state. No shared abstractions. Each skill loads its own context and executes independently.
This pattern survived because it matches how Claude actually loads context.
5. Transparency About Failure Builds Trust
From the voice guide I wrote during recovery: "Readers learn more from failures than successes. Transparent failure analysis builds confidence that decisions are informed, not accidental."
I spent 8 days on V2. It failed. The commit message doesn't hide it: "Archive V2, restore V1 base."
The lesson: Document what didn't work and why. It prevents repeating mistakes. It shows decision-making is evidence-based, not vibes-based.
The New Architecture: What Survived
After consolidation, the framework stabilized around these principles:
File-Based Context:
- CLAUDE.md for navigation (<150 lines)
- skills/*/SKILL.md for complete skill documentation (<500 lines)
- Workflows defined in markdown with phase gates
- Templates in files, not code
- State checkpointed to markdown files
Hierarchical Loading:
- Claude loads CLAUDE.md on startup
- Skills load on-demand via routing
- Agents read skill context progressively
- No monolithic files, no global state
Self-Contained Skills:
- Each skill manages its own input/output/scripts/templates
- No cross-skill dependencies
- Workflows execute via markdown documentation
- Validation via hooks, not TypeScript contracts
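To show what "validation via hooks" can mean in practice, here's a small check over the markdown files themselves. The rules below are example checks I'm using for illustration, not the framework's actual validators:

```typescript
// Illustrative validation hook: lint a SKILL.md's structure.
// These rules are examples, not the framework's real checks.

function validateSkillDoc(text: string): string[] {
  const errors: string[] = [];
  if (!text.startsWith("---")) {
    errors.push("missing YAML frontmatter");
  }
  if (!/^# /m.test(text)) {
    errors.push("missing top-level heading");
  }
  if (text.split("\n").length > 500) {
    errors.push("exceeds 500-line budget");
  }
  return errors;
}

const ok = "---\nskill: ghost\n---\n# Ghost Publishing\n";
const bad = "# No frontmatter here\n";
console.log(validateSkillDoc(ok).length); // 0
console.log(validateSkillDoc(bad));       // ["missing YAML frontmatter"]
```

Because the hook only reads files, it makes no assumptions about execution order or in-memory state; it works whenever it happens to run.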
Human-Readable Constraints:
- First person voice (I, not we)
- Helpful tone (recommend, not mandate)
- Transparent about what doesn't work
- Evidence-based decisions
The redesign sessions from January 18 show the refinement: AI-first framing, workflow diagrams, progressive disclosure. But the underlying structure is V1. Because V1 works.
What's Next: Building on Boring
The framework is stable. Ghost blog publishing is operational with a 5-phase workflow (Research → Draft → QA → Visuals → Publish). Security testing has 7 methodologies. Advisory includes ISO/NIST frameworks. Career analysis, wellness research, CliftonStrengths coaching—all working.
The roadmap focuses on:
- Expanding public skills (currently 6 ready for release)
- Content creation workflows (this post is dogfooding the system)
- Integration testing (Cal.com, Stripe, n8n planned)
- Documentation refinement (voice guide enforcement)
No grand architectural rewrites. No module contracts or hook loaders. Just file-based workflows, hierarchical context, and self-contained skills.
Because boring solutions that work beat elegant solutions that don't.
Lessons for AI-Assisted Development
If you're building frameworks for AI agents:
Design for persistence, not execution:
- File-based state survives context switches
- Function calls don't guarantee callbacks
- Markdown documentation works; TypeScript contracts are fragile
Optimize for token efficiency:
- Hierarchical loading reduces context bloat
- Navigation layer + skill-specific context
- Load what you need, when you need it
Self-contained is better than shared:
- Independent skills with no cross-dependencies
- Each skill manages its own resources
- No global state, no shared abstractions
Be transparent about constraints:
- Document what doesn't work
- Explain why you made each decision
- Show the evidence, not just the conclusion
Test on AI behavior, not assumptions:
- How does Claude actually load context?
- What execution guarantees exist?
- What persists between invocations?
The V2 attempt taught me these lessons the hard way. The framework is better for it. And this blog post exists because I documented what failed, not just what worked.
That's the Intelligence Adjacent philosophy: build systems that work alongside human intelligence, not replace it. Sometimes that means learning the hard way that your elegant solution doesn't match reality.
And sometimes that means archiving 8 days of work and restoring the boring solution that works.