Why AI Forgets Everything — 3 Open-Source Solutions to the Memory Crisis
karpathy-skills, claude-mem, Cognee — comparing 3 approaches to solving the AI memory problem.

Between an AI that needs constant reminders to "use formal Korean" and one that remembers forever after being told once — which would you choose?
The AI Memory Problem
I spent 10 hours working on a project with Claude Code. Fixed bugs, made architectural decisions, established rules like "we're doing it this way." Then the session ends and I return the next day — the AI has forgotten everything.
This isn't just an inconvenience. It's a structural loss of productivity.
Every session, the same things repeat:
- "Write Korean posts in formal style"
- "Follow this pattern for Sanity upload scripts"
- "This project's slug convention is no suffix for Korean, -en for English"
- "I started building that feature two days ago... where did I leave off?"
LLMs have the world's best contextual understanding, but the memory of a goldfish. When a session ends, everything evaporates.
Three open-source projects tackling this problem are exploding on GitHub right now. Each takes a different approach, and comparing them reveals the true nature of the "AI memory" problem.
1. andrej-karpathy-skills — One CLAUDE.md File to Rule Them All
GitHub: forrestchang/andrej-karpathy-skills
Stars: 48,900+ (7,900 in one day)
What is it
In January 2026, Andrej Karpathy documented recurring failure patterns he had observed in LLM coding agents and distilled them into a single CLAUDE.md file. Place this file in your project root and Claude Code automatically reads and follows its rules.
Karpathy's 4 LLM Failure Patterns
| Pattern | Description |
|---|---|
| Silent Assumptions | Makes assumptions and runs with them without verification |
| Overengineering | Turns 100-line solutions into 1,000 lines |
| Scope Creep | Touches code you never asked for |
| Lack of Judgment | Syntax is correct but judgment is absent |
The 4 Rules
Rule 1: Think Before Coding
Explicitly state assumptions before coding. If uncertain, ask. When multiple interpretations are possible, present options — don't silently pick one.
Rule 2: Simplicity First
Implement only what's requested. No future flexibility, no configurability, no error handling for impossible scenarios. If a 200-line solution can be 50 lines, rewrite it.
Rule 3: Surgical Changes
When editing existing code, don't "improve" adjacent code. Follow existing style even if you don't like it. If you find unrelated dead code, mention it but don't delete it.
Rule 4: Goal-Driven Execution
Convert every task into verifiable goals:
- "Add validation" → "Write tests for invalid inputs and make them pass"
- "Fix bug" → "Write a test that reproduces the bug and make it pass"
Relationship to Memory Problem
karpathy-skills is the most primitive solution to the memory problem: "Let's write down the rules AI keeps forgetting." Since CLAUDE.md auto-loads at session start, these become rules the AI "cannot forget."
The limitations are clear:
- Manual. Humans must write the rules
- Static. Can't capture context that accumulates as projects evolve
- Size-limited. Not recommended beyond 200 lines
Yet it earned 48K stars because this alone raised coding accuracy from 65-70% to 91-94%, according to community reports. Simple but powerful.
2. claude-mem — Automatic Cross-Session Context Memory
GitHub: thedotmack/claude-mem
Stars: 59,300+ (peaked at 62.7K)
What is it
claude-mem is a persistent memory plugin that automatically captures all Claude Code activity, compresses it with AI, and auto-injects it into the next session. If karpathy-skills is "write down the rules," claude-mem is "record everything automatically."
How it works
```
Session Start → Auto-inject recent work history (800–3,000 tokens)
      ↓
During Work   → Background AI compression of all tool call results
      ↓
File Read     → Auto-inject past memories related to that file
      ↓
Session End   → Generate and save session summary
```

The core is 5 lifecycle hooks:

| Hook | Trigger | Action |
|---|---|---|
| SessionStart | Session start | Inject 50 recent observations + 10 session summaries |
| UserPromptSubmit | Prompt input | Session logging |
| PostToolUse | After tool execution | Send results to background worker for AI compression |
| Stop | Interrupt/idle | Generate session-level summary |
| SessionEnd | Session end | Finalize metadata |
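As a rough illustration of the PostToolUse flow, here is a self-contained Python sketch of a background worker that turns tool results into short observations. The `compress` function is a hypothetical stand-in for claude-mem's AI compression, and nothing below is claude-mem's actual API:

```python
import queue
import threading

# Hypothetical stand-in for claude-mem's AI compressor: instead of
# calling a model, we just truncate and tag the tool output.
def compress(tool_name: str, output: str, max_chars: int = 80) -> str:
    summary = output.strip().replace("\n", " ")[:max_chars]
    return f"[{tool_name}] {summary}"

observations: list[str] = []
work: queue.Queue = queue.Queue()

def worker() -> None:
    # Drain tool results in the background, as a PostToolUse hook might.
    while True:
        item = work.get()
        if item is None:  # sentinel: shut down the worker
            break
        observations.append(compress(*item))

t = threading.Thread(target=worker)
t.start()
work.put(("Read", "def add(a, b):\n    return a + b\n"))
work.put(("Bash", "pytest: 12 passed in 0.34s"))
work.put(None)
t.join()
print(observations)
```

The key design idea this mimics is that compression happens off the critical path: the agent keeps working while summaries accumulate for the next session.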
Most Innovative Feature: PreToolUse:Read
When Claude reads a file, it auto-injects past observations about that file. In other words, memory follows the agent's gaze. Even the official docs call this feature "genuinely novel."
3-Layer Progressive Disclosure
For token efficiency, it doesn't load all memories at once:
- Search (Layer 1): Return only observation IDs (~50-100 tokens)
- Timeline (Layer 2): Provide chronological context
- Detail (Layer 3): Full load only selected observations (~500-1,000 tokens)
Result: 5,250 tokens used where 25,000 would otherwise be needed (roughly 80% savings).
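A toy Python sketch of the three layers, assuming a simple in-memory store (claude-mem actually uses SQLite + ChromaDB); the function names and data are illustrative:

```python
# Each observation carries a file, text, and timestamp.
MEMORY = {
    1: {"file": "upload.py", "text": "Sanity upload script uses env token.", "ts": "2026-01-10"},
    2: {"file": "slug.py", "text": "Korean slugs get no suffix; English get -en.", "ts": "2026-01-11"},
    3: {"file": "upload.py", "text": "Retry logic capped at 3 attempts.", "ts": "2026-01-12"},
}

def search(keyword: str) -> list[int]:
    # Layer 1: cheap, return only matching observation IDs.
    return [i for i, m in MEMORY.items()
            if keyword in m["text"] or keyword in m["file"]]

def timeline(ids: list[int]) -> list[int]:
    # Layer 2: order the IDs chronologically for context.
    return sorted(ids, key=lambda i: MEMORY[i]["ts"])

def detail(ids: list[int]) -> list[str]:
    # Layer 3: full load, only for the selected observations.
    return [MEMORY[i]["text"] for i in ids]

hits = search("upload")
print(detail(timeline(hits)))
```

Only the handful of observations that survive layers 1 and 2 ever pay the full token cost of layer 3, which is where the savings come from.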
Installation
```
npx claude-mem install
```

This single line completes SQLite + ChromaDB setup, hook registration, and worker service startup.
Difference from karpathy-skills
| | karpathy-skills | claude-mem |
|---|---|---|
| Method | Manual rule writing | Auto-capture + AI compression |
| Scope | Immutable rules | All work history |
| Size | ~200 lines | Unlimited (DB) |
| Updates | Manual | Automatic |
| Search | None (full load) | Semantic + keyword |
| Cost | Free | $0.002-0.01 per compression |
Limitations
Weaknesses identified in source code reviews:
- Zero knowledge integrity verification
- Zero quality/trust scoring
- Zero append-only protection
Powerful, but insufficient for enterprise-grade security environments.
3. cognee — Learning Memory Engine for AI Agents
GitHub: topoteretes/cognee
Stars: 16,400+
What is it
cognee is an AI memory engine that transforms unstructured data into a learning, evolving knowledge system. If claude-mem is "personal memory," cognee is closer to "organizational knowledge."
AI Memory in 6 Lines
```python
import asyncio
import cognee

async def main():
    await cognee.remember("Cognee transforms documents into AI memory.")
    results = await cognee.recall("What does Cognee do?")
    for result in results:
        print(result)

asyncio.run(main())
```

Behind this simplicity lies a sophisticated pipeline:
- `remember`: Data → embeddings + graph nodes → persistent storage
- `recall`: Auto-route queries to the optimal search strategy (vector/graph hybrid)
- `forget`: Selective deletion, including relationship cleanup
- `improve`: Update knowledge structure via feedback-based learning
RAG vs. cognee
| | Traditional RAG | cognee |
|---|---|---|
| Memory Type | Static documents | Learning, evolving knowledge graph |
| Search | Vector similarity only | Vector + graph traversal hybrid |
| Context | Single session | Cross-session, cross-agent |
| Learning | None | Continuous improvement via improve() |
| Relationships | None | Explicit concept connections + ontology |
| Multi-agent | Isolated | Tenant isolation + shared knowledge |
Core Differentiator: Learning Memory
RAG retrieves. cognee learns and reasons. It doesn't just find similar documents — it understands how concepts connect and adapts based on outcomes.
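To illustrate the hybrid idea, here is a self-contained Python toy that combines a similarity ranking (keyword overlap standing in for vector similarity) with one-hop traversal over explicit concept links. It is a sketch of the concept, not cognee's implementation:

```python
DOCS = {
    "d1": "refund policy for damaged items",
    "d2": "shipping delays during holidays",
    "d3": "warranty claims for electronics",
}
# Explicit concept connections, like edges in a knowledge graph.
GRAPH = {"d1": ["d3"], "d2": [], "d3": ["d1"]}

def overlap(query: str, text: str) -> int:
    # Crude stand-in for vector similarity: shared-word count.
    return len(set(query.split()) & set(text.split()))

def hybrid_recall(query: str, top_k: int = 1) -> list[str]:
    # Similarity step: rank documents by term overlap with the query.
    ranked = sorted(DOCS, key=lambda d: overlap(query, DOCS[d]), reverse=True)
    seeds = ranked[:top_k]
    # Graph step: also pull in explicitly connected neighbors.
    expanded = set(seeds)
    for d in seeds:
        expanded.update(GRAPH[d])
    return sorted(expanded)

print(hybrid_recall("refund for damaged items"))
```

Pure vector search would stop at the refund document; the graph hop also surfaces the linked warranty document, which is the behavior the table above contrasts with traditional RAG.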
Use Cases
- Customer Support Agents: Connect past consultation histories in graphs, auto-retrieve validated solutions from similar cases
- Expert Knowledge Transfer: Capture senior analyst SQL patterns, auto-provide to juniors
- Multi-agent Research: Share knowledge between agents, prevent duplicate research
Comparing the 3 Tools
These three tools solve different layers of the same problem:
```
karpathy-skills (Rules)
└── "Do it this way" — Immutable principles
        ↓
claude-mem (Memory)
└── "Yesterday I did this" — Personal history
        ↓
cognee (Knowledge)
└── "This connects to that like this" — Organizational knowledge
```

| Dimension | karpathy-skills | claude-mem | cognee |
|---|---|---|---|
| Metaphor | Post-it note | Diary | Encyclopedia |
| Target | Individual developers | Individual/team | Team/organization |
| Automation | Manual | Fully automatic | API-based |
| Memory Structure | Flat text | Time-series observations | Knowledge graph |
| Learning | None | Pattern recognition | Feedback-driven evolution |
| Token Efficiency | Full load | Progressive Disclosure | Auto-routing |
| Stars | 48.9K | 59.3K | 16.4K |
| Setup Difficulty | Copy file | One npx line | pip install |
Which Should You Use?
- Want to improve Claude Code quality right now? → karpathy-skills. One file copy and done
- Work on the same project daily and hate losing context? → claude-mem. 5-minute install, positive ROI after 3-5 sessions
- Running AI agents team-wide and need accumulating knowledge? → cognee. Learning memory is the next step beyond RAG
Real-World Combination Recommendation
The most powerful approach is using all three together:
- Put karpathy-skills' 4 rules in `CLAUDE.md` to secure baseline quality
- Use claude-mem to automatically maintain cross-session history
- Use cognee to build team/org-level knowledge and inject it into agents
```
CLAUDE.md    (Rules layer)
+ claude-mem (Memory layer)
+ cognee     (Knowledge layer)
= AI that doesn't forget
```

Closing: Toward AI that Remembers
If 2024's question was "Can AI write code?", 2026's question is "Can AI remember what it did yesterday?"
The 48K stars on karpathy-skills and 59K on claude-mem show how urgent this problem is. Developers no longer ask "Does AI write code well?" They ask: "Does AI maintain context?"
None of these tools are perfect. karpathy-skills is manual, claude-mem lacks enterprise-grade security, and cognee has setup complexity. But they all point in the same direction: The next leap for LLMs isn't bigger models — it's better memory.
*Projects featured in this post:*
- andrej-karpathy-skills — CLAUDE.md coding rules (48.9K stars)
- claude-mem — Persistent memory plugin (59.3K stars)
- cognee — AI memory engine (16.4K stars)