
Why AI Forgets Everything — 3 Open-Source Solutions to the Memory Crisis

karpathy-skills, claude-mem, Cognee — comparing 3 approaches to solving the AI memory problem.


Between an AI that needs constant reminders to "use formal Korean" and one that remembers forever after being told once — which would you choose?

The AI Memory Problem

I spent 10 hours working on a project with Claude Code: fixing bugs, making architectural decisions, establishing rules like "we're doing it this way." Then the session ended, I came back the next day, and the AI had forgotten everything.

This isn't just an inconvenience. It's a structural loss of productivity.

Every session, the same things repeat:

  • "Write Korean posts in formal style"
  • "Follow this pattern for Sanity upload scripts"
  • "This project's slug convention is no suffix for Korean, -en for English"
  • "I started building that feature two days ago... where did I leave off?"

LLMs have the world's best contextual understanding, but memory like a goldfish. When a session ends, everything evaporates.

Three open-source projects tackling this problem are exploding on GitHub right now. Each takes a different approach, and comparing them reveals the true nature of the "AI memory" problem.

1. andrej-karpathy-skills — One CLAUDE.md File to Rule Them All

GitHub: forrestchang/andrej-karpathy-skills

Stars: 48,900+ (7,900 in one day)

What is it

In January 2026, Andrej Karpathy observed and documented recurring failure patterns in LLM coding agents, distilled into a single CLAUDE.md file. Place this file in your project root and Claude Code automatically reads and follows these rules.

Karpathy's 4 LLM Failure Patterns

| Pattern | Description |
|---|---|
| Silent Assumptions | Makes assumptions and runs with them without verification |
| Overengineering | Turns 100-line solutions into 1,000 lines |
| Scope Creep | Touches code you never asked for |
| Lack of Judgment | Syntax is correct but judgment is absent |

The 4 Rules

Rule 1: Think Before Coding

Explicitly state assumptions before coding. If uncertain, ask. When multiple interpretations are possible, present options — don't silently pick one.

Rule 2: Simplicity First

Implement only what's requested. No future flexibility, no configurability, no error handling for impossible scenarios. If a 200-line solution can be 50 lines, rewrite it.

Rule 3: Surgical Changes

When editing existing code, don't "improve" adjacent code. Follow existing style even if you don't like it. If you find unrelated dead code, mention it but don't delete it.

Rule 4: Goal-Driven Execution

Convert every task into verifiable goals:

  • "Add validation" → "Write tests for invalid inputs and make them pass"
  • "Fix bug" → "Write a test that reproduces the bug and make it pass"
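The conversion above can be made concrete with a test-first sketch. Everything here is illustrative — `normalize_slug` is a hypothetical stand-in for any project function, using the slug convention mentioned earlier as the example goal:

```python
# Hypothetical example: "Fix bug" restated as tests that must pass.
# `normalize_slug` is an illustrative stand-in, not code from any of these repos.

def normalize_slug(title: str, lang: str) -> str:
    """Build a slug: no suffix for Korean, '-en' appended for English."""
    slug = title.lower().replace(" ", "-")
    return slug if lang == "ko" else f"{slug}-en"

def test_korean_slug_has_no_suffix():
    # The verifiable goal: this assertion passing is what "done" means.
    assert normalize_slug("메모리 문제", "ko") == "메모리-문제"

def test_english_slug_gets_en_suffix():
    assert normalize_slug("Memory Problem", "en") == "memory-problem-en"
```

The point of the rule is that "add validation" is unfalsifiable, while "make these two tests pass" is not.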

Relationship to Memory Problem

karpathy-skills is the most primitive solution to the memory problem: "Let's write down the rules AI keeps forgetting." Since CLAUDE.md auto-loads at session start, these become rules the AI "cannot forget."
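A minimal CLAUDE.md encoding these "cannot forget" rules might look like this — the contents are an illustrative condensation of the four rules above, not the actual file from the repository:

```markdown
# CLAUDE.md — project rules (auto-loaded at session start)

## Before coding
- State assumptions explicitly; if uncertain, ask.
- When multiple interpretations exist, present options.

## Scope
- Implement only what is requested; no speculative flexibility.
- Do not "improve" adjacent code; follow the existing style.

## Verification
- Convert every task into a test that must pass before it counts as done.
```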

The limitations are clear:

  • Manual. Humans must write the rules
  • Static. Can't capture context that accumulates as projects evolve
  • Size-limited. Not recommended beyond 200 lines

Yet it earned 48K stars because this alone raised coding accuracy from 65-70% to 91-94%, according to community reports. Simple but powerful.

2. claude-mem — Automatic Cross-Session Context Memory

GitHub: thedotmack/claude-mem

Stars: 59,300+ (peaked at 62.7K)

What is it

claude-mem is a persistent memory plugin that automatically captures all Claude Code activity, compresses it with AI, and auto-injects it into the next session. If karpathy-skills is "write down the rules," claude-mem is "record everything automatically."

How it works

Session Start → Auto-inject recent work history (800~3,000 tokens)
  ↓
During Work → Background AI compression of all tool call results
  ↓
File Read → Auto-inject past memories related to that file
  ↓
Session End → Generate and save session summary

The core is 5 lifecycle hooks:

| Hook | Trigger | Action |
|---|---|---|
| SessionStart | Session start | Inject 50 recent observations + 10 session summaries |
| UserPromptSubmit | Prompt input | Session logging |
| PostToolUse | After tool execution | Send results to background worker for AI compression |
| Stop | Interrupt/idle | Generate session-level summary |
| SessionEnd | Session end | Finalize metadata |

Most Innovative Feature: PreToolUse:Read

When Claude reads a file, it auto-injects past observations about that file. In other words, memory follows the agent's gaze. Even the official docs call this feature "genuinely novel."
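The idea can be modeled in a few lines of Python. This is a simplified illustration of file-scoped memory injection, not claude-mem's actual implementation — the class and data structures are invented for this sketch:

```python
# Illustrative model of "memory follows the agent's gaze" — NOT claude-mem's
# real code. Observations are indexed by file path; when the agent reads a
# file, past observations about that path are returned for context injection.

from collections import defaultdict

class FileScopedMemory:
    def __init__(self):
        self._by_path = defaultdict(list)  # file path -> list of past notes

    def observe(self, path: str, note: str) -> None:
        # Called as work happens, e.g. from a post-tool-use hook.
        self._by_path[path].append(note)

    def on_read(self, path: str) -> list[str]:
        # Called before the agent's Read tool runs; result is injected.
        return list(self._by_path.get(path, []))

mem = FileScopedMemory()
mem.observe("upload.py", "Slug convention: no suffix for Korean, -en for English")
print(mem.on_read("upload.py"))  # past notes about this file resurface
```

The hook-driven version differs mainly in plumbing (stdin/stdout JSON, a database instead of a dict), but the lookup-by-gaze shape is the same.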

3-Layer Progressive Disclosure

For token efficiency, it doesn't load all memories at once:

  1. Search (Layer 1): Return only observation IDs (~50-100 tokens)
  2. Timeline (Layer 2): Provide chronological context
  3. Detail (Layer 3): Full load only selected observations (~500-1,000 tokens)

Result: 5,250 tokens used where 25,000 would be needed (80% savings).
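The three layers amount to "return IDs cheaply, pay for full text only on demand." A minimal sketch of that pattern (store, matching logic, and token counts are illustrative, not claude-mem's internals):

```python
# Sketch of 3-layer progressive disclosure — illustrative only.
# Layer 1 returns IDs; Layer 2 orders them; Layer 3 loads full text.

OBSERVATIONS = {
    "obs-1": {"ts": 1, "text": "Fixed slug bug in upload script"},
    "obs-2": {"ts": 2, "text": "Decided: formal style for Korean posts"},
    "obs-3": {"ts": 3, "text": "Refactored Sanity client wrapper"},
}

def search(query: str) -> list[str]:
    # Layer 1: cheap — matching IDs only (tens of tokens, not full text).
    return [oid for oid, o in OBSERVATIONS.items() if query in o["text"].lower()]

def timeline(ids: list[str]) -> list[str]:
    # Layer 2: chronological ordering for context.
    return sorted(ids, key=lambda oid: OBSERVATIONS[oid]["ts"])

def detail(ids: list[str]) -> list[str]:
    # Layer 3: expensive — full text, only for the observations selected.
    return [OBSERVATIONS[oid]["text"] for oid in ids]

hits = timeline(search("slug"))
print(detail(hits))
```

Because most queries stop at Layer 1 or 2, the expensive full-text load happens only for the handful of observations that actually matter — which is where the claimed ~80% token savings comes from.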

Installation

```bash
npx claude-mem install
```

This single line completes SQLite + ChromaDB setup, hook registration, and worker service startup.

Difference from karpathy-skills

| | karpathy-skills | claude-mem |
|---|---|---|
| Method | Manual rule writing | Auto-capture + AI compression |
| Scope | Immutable rules | All work history |
| Size | ~200 lines | Unlimited (DB) |
| Updates | Manual | Automatic |
| Search | None (full load) | Semantic + keyword |
| Cost | Free | $0.002-0.01 per compression |

Limitations

Weaknesses identified in source code reviews:

  • Zero knowledge integrity verification
  • Zero quality/trust scoring
  • Zero append-only protection

Powerful, but insufficient for enterprise-grade security environments.

3. cognee — Learning Memory Engine for AI Agents

GitHub: topoteretes/cognee

Stars: 16,400+

What is it

cognee is an AI memory engine that transforms unstructured data into a learning, evolving knowledge system. If claude-mem is "personal memory," cognee is closer to "organizational knowledge."

AI Memory in 6 Lines

```python
import cognee, asyncio

async def main():
    await cognee.remember("Cognee transforms documents into AI memory.")
    results = await cognee.recall("What does Cognee do?")
    for result in results:
        print(result)

asyncio.run(main())
```

Behind this simplicity lies a sophisticated pipeline:

  • remember: Data → embeddings + graph nodes → persistent storage
  • recall: Auto-route queries to optimal search strategy (vector/graph hybrid)
  • forget: Selective deletion including relationship cleanup
  • improve: Update knowledge structure via feedback-based learning

RAG vs. cognee

| | Traditional RAG | cognee |
|---|---|---|
| Memory Type | Static documents | Learning, evolving knowledge graph |
| Search | Vector similarity only | Vector + graph traversal hybrid |
| Context | Single session | Cross-session, cross-agent |
| Learning | None | Continuous improvement via improve() |
| Relationships | None | Explicit concept connections + ontology |
| Multi-agent | Isolated | Tenant isolation + shared knowledge |

Core Differentiator: Learning Memory

RAG retrieves. cognee learns and reasons. It doesn't just find similar documents — it understands how concepts connect and adapts based on outcomes.
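The difference is easiest to see in a toy: similarity search finds one node, then graph traversal pulls in explicitly connected concepts that pure vector search would miss. This is a minimal sketch of the hybrid idea, not cognee's implementation — the nodes and edges are invented for illustration:

```python
# Toy hybrid retrieval: a vector hit (stand-in) expanded via explicit edges.
# Illustrative only — not cognee's actual data model or API.

EDGES = {
    ("cognee", "knowledge graph"),
    ("knowledge graph", "embeddings"),
}

def neighbors(node: str) -> set[str]:
    # Edges are treated as undirected for this sketch.
    return {b for a, b in EDGES if a == node} | {a for a, b in EDGES if b == node}

def hybrid_recall(seed: str, hops: int = 1) -> set[str]:
    # Step 1 (vector stand-in): assume similarity search returned `seed`.
    # Step 2 (graph): expand along explicit relationships, hop by hop.
    found = {seed}
    for _ in range(hops):
        found |= {n for node in found for n in neighbors(node)}
    return found

print(sorted(hybrid_recall("cognee", hops=2)))
```

A vector-only search for "cognee" would return only the closest documents; the graph hop also surfaces "embeddings," which is related through an explicit chain of relationships rather than surface similarity.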

Use Cases

  • Customer Support Agents: Connect past consultation histories in graphs, auto-retrieve validated solutions from similar cases
  • Expert Knowledge Transfer: Capture senior analyst SQL patterns, auto-provide to juniors
  • Multi-agent Research: Share knowledge between agents, prevent duplicate research

Comparing the 3 Tools

These three tools solve different layers of the same problem:

karpathy-skills (Rules)
  └── "Do it this way" — Immutable principles
        ↓
claude-mem (Memory)
  └── "Yesterday I did this" — Personal history
        ↓
cognee (Knowledge)
  └── "This connects to that like this" — Organizational knowledge

| Dimension | karpathy-skills | claude-mem | cognee |
|---|---|---|---|
| Metaphor | Post-it note | Diary | Encyclopedia |
| Target | Individual developers | Individual/team | Team/organization |
| Automation | Manual | Fully automatic | API-based |
| Memory Structure | Flat text | Time-series observations | Knowledge graph |
| Learning | None | Pattern recognition | Feedback-driven evolution |
| Token Efficiency | Full load | Progressive Disclosure | Auto-routing |
| Stars | 48.9K | 59.3K | 16.4K |
| Setup Difficulty | Copy file | One npx line | pip install |

Which Should You Use?

  • Want to improve Claude Code quality right now? → karpathy-skills. One file copy and done
  • Work on the same project daily and hate losing context? → claude-mem. 5-minute install, positive ROI after 3-5 sessions
  • Running AI agents team-wide and need accumulating knowledge? → cognee. Learning memory is the next step beyond RAG

Real-World Combination Recommendation

The most powerful approach is using all three together:

  1. Put karpathy-skills' 4 rules in CLAUDE.md to secure baseline quality
  2. Use claude-mem to automatically maintain cross-session history
  3. Use cognee to build team/org-level knowledge and inject into agents

CLAUDE.md (Rules layer)
  + claude-mem (Memory layer)
    + cognee (Knowledge layer)
= AI that doesn't forget

Closing: Toward AI that Remembers

If 2024's question was "Can AI write code?", 2026's question is "Can AI remember what it did yesterday?"

The 48K stars on karpathy-skills and 59K on claude-mem show how urgent this problem is. Developers no longer ask "Does AI write code well?" They ask: "Does AI maintain context?"

None of these tools are perfect. karpathy-skills is manual, claude-mem lacks enterprise-grade security, and cognee has setup complexity. But they all point in the same direction: The next leap for LLMs isn't bigger models — it's better memory.
