GTM AI Podcast & Newsletter


Under the Hood

Claude Memory

J Moss
Mar 23, 2026

You’ve been in the middle of a project for three weeks. Claude knows your ICP, your company’s positioning, your writing voice, the competitive context, the particular angle you’ve been developing for the past month. The work is going well. Then you open a new conversation and type “Hi” — and it’s gone. All of it. You’re back to a blank slate.

Most people accept this as the cost of the tool. It’s not. It’s a design problem, and it has a design solution.

Claude’s memory isn’t missing — it’s distributed across five different mechanisms with completely different persistence characteristics. Once you understand which layer does what, you stop losing context and start building memory that compounds. This guide is the map.


Step 1: How Claude’s Memory Actually Works — The Five Layers

Most people interact with exactly one layer of Claude’s memory: in-conversation context. The rest are either unknown or underused, which means most users leave the bulk of the memory infrastructure untouched.

Here’s the full picture.

Layer 1: In-Conversation Context

Everything in your current session. The model processes everything in the context window — every message, every file you’ve pasted, every response it’s generated — and uses it when responding. This is why Claude can refer back to something you said 40 messages ago in the same conversation.

The catch: when the conversation ends, this layer is gone. Not archived somewhere retrievable. Gone. Claude cannot access a previous conversation’s content when you start a new one. This is the layer most people treat as the only layer, which is why they spend the first five minutes of every conversation re-briefing an AI that has no idea who they are.

Layer 2: Projects Memory (Cowork)

Available on Claude.ai Pro and Team plans. Projects give you a persistent context layer that loads automatically at the start of every conversation inside that project. You write Project Instructions once — who you are, what you’re working on, what constraints apply — and every subsequent conversation inherits that context.

The important nuance: this is not transcript memory. Claude doesn’t read your previous conversations before responding in a new one. It has your instructions and your uploaded files, not a running log of what you’ve discussed. This is a crucial distinction when you’re deciding what to put there.

Layer 3: CLAUDE.md / Instruction Files

The most reliable memory mechanism available if you’re using Claude Code. CLAUDE.md is a markdown file in your project root that Claude Code reads at every session start. It’s not a prompt — it’s persistent instruction infrastructure. Changes you make to CLAUDE.md are available in every future session without any setup.

This is where system-level behavior lives: how you’ve structured your project, what agents you’re using and why, how work should be routed, what constraints apply across the entire environment. In a well-configured Claude Code setup, CLAUDE.md is the brain that orients every session.

Layer 4: Memory Files

Explicitly written markdown files Claude reads as part of its session orientation. The MEMORY.md pattern — writing structured context into a file that gets surfaced at session start — is the closest thing to persistent episodic memory Claude Code has. You write down what Claude needs to remember: key decisions made, current state of active projects, important context that would take 10 minutes to re-establish from scratch.

The difference between Layer 3 and Layer 4: CLAUDE.md holds behavioral instructions (how to work), MEMORY.md holds factual context (what’s been done, what’s true). Both load at session start. Both persist across sessions. Together, they close the gap between “AI that starts fresh every time” and “AI that picks up where you left off.”
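A minimal MEMORY.md following this pattern might look like the sketch below. The project names, dates, and entries are illustrative, not prescriptive — the point is the structure: current state, decisions, and durable context, updated at the end of each session.

```markdown
# MEMORY.md — session context (update at end of each session)

## Current state
- Newsletter relaunch: outline approved, drafting issue 3 of 5
- Competitor battlecard refresh: blocked on updated pricing from sales

## Key decisions
- 2026-03-10: Standardized on second-person voice for all newsletter issues
- 2026-03-17: Dropped the weekly tools roundup; engagement was flat

## Context worth keeping
- ICP calls consistently surface onboarding cost as the top objection
```

Kept short, a file like this replaces the ten minutes of re-briefing that would otherwise open every session.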

Layer 5: MCP-Based Memory

External memory stores connected to Claude via the Model Context Protocol. Vector databases, knowledge graphs, retrieval systems — any structured store that can receive a query and return relevant context. This layer enables semantic memory: Claude can search your accumulated knowledge by meaning, not just by keyword, and retrieve relevant context dynamically instead of dumping everything into the context window at once.

This is the most powerful layer and the most complex to set up. It’s the right tool when your memory store has grown beyond what fits in a context window, when you need semantic retrieval across hundreds of notes or documents, or when you’re building a system where accumulated insights need to compound across a team.
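As one concrete example of this layer, the Model Context Protocol project publishes a reference knowledge-graph memory server that can be wired into a client config. The snippet below is a sketch of that wiring; the exact package name, config file location, and client support vary by setup, so treat it as a starting point rather than a recipe.

```json
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-memory"]
    }
  }
}
```

Once connected, Claude can read and write entities and relations to that store across sessions — memory that lives outside the context window entirely.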

The summary: Layer 1 handles your current session. Layers 2–4 handle what persists across sessions through structured files. Layer 5 handles what scales beyond files. Most people only use Layer 1. A good setup uses all five intentionally.


Step 2: Projects Memory — Setting Up Persistent Context in Cowork

The first thing to understand about Projects is what they’re not: they’re not a memory system that learns from your conversations. They’re a context injection system that ensures every conversation starts from an informed baseline. The distinction matters because it changes what you put there.

Project Instructions should contain context that is true and stable across many conversations: who you are, what you’re working on, your ICP, your voice, your constraints, your company’s competitive position. Not your current active tasks. Not the status of a specific deal. Not what you discussed with a prospect last Tuesday. Stable context, not dynamic state.

Here’s the test: if the information would still be true six months from now, it belongs in Project Instructions. If it’s the current status of something that changes week to week, it belongs in a specific conversation.

What to write in Project Instructions:

Start with four blocks in this order.

First block — who you are and what this project is for:

You are a [role] assistant for [Name], [Title] at [Company].
[Company] is [one-sentence description].
This project is for [specific type of work — writing, GTM strategy, client work, etc.].

Second block — domain knowledge Claude needs to give specific rather than generic advice:

ICP: [Specific description — industry, size, role, trigger events, not just "SMB"]
Primary competitors: [Names and the one key differentiator against each]
Value props ranked by ICP priority: [Specific outcomes, not category claims]
Sales motion: [How you actually sell — PLG, outbound, channel, hybrid]

Third block — output requirements:

Voice: [Specific constraints — first person, short paragraphs, no passive voice]
Format: [What a good response looks like — length, structure, when to use headers]
When to push back: [Where you want Claude to challenge you, not just comply]

Fourth block — what NOT to do:

Do not:
- Re-summarize what I just said
- Add caveats I didn't ask for
- Give generic advice when specific context is available in uploaded files
- Use passive voice, jargon, or hedged language

That fourth block is the one most people skip. Instructions that only tell Claude what to do don’t prevent the default behaviors that frustrate you. Constraints are what make instructions actually change behavior.

What to upload to a Project:

Upload your reference materials — documents Claude would otherwise need you to paste every time. Brand guide. ICP definition. Competitor battlecards. Past content samples (especially important for writing projects — three to five of your best pieces teach voice better than any description). Case studies. Pricing structure. Product documentation.

The upload is permanent context. The paste is per-conversation context. Anything you find yourself pasting more than three times, move to a file and upload it.


Step 3: CLAUDE.md — The Most Reliable Memory Mechanism in Claude Code

If you’re using Claude Code and you don’t have a CLAUDE.md, you’re running the most powerful version of the tool without its most fundamental memory infrastructure.

CLAUDE.md is read at the start of every Claude Code session. Not sometimes. Every time. It’s the one memory mechanism with zero maintenance overhead — write it once, and it loads in perpetuity. This makes it uniquely reliable compared to every other layer.

What belongs in CLAUDE.md:

The behavioral architecture of your entire working environment. Not task lists. Not current status. The stable structure of how you work.

Three categories to cover:

How work gets routed. If you’ve built specialized agents, CLAUDE.md is where you define the routing rules. Which agent handles which task type, how to classify ambiguous requests, what the tiers of routing complexity look like. Without this, Claude defaults to doing everything itself, which means it ignores the specialized agents you built.

How the vault and file system are organized. Where notes live. Where inbox items go. Where output goes. What the processing pipeline looks like (inbox → distill → notes, not directly to notes). If the architecture is documented in CLAUDE.md, Claude can navigate and maintain it correctly without you re-explaining it.

What the constraints and guardrails are. Things that should never happen regardless of what’s requested: don’t write directly to notes without processing, don’t commit sensitive files, don’t skip quality gates. Hard constraints belong in CLAUDE.md because they apply to every session.

What doesn’t belong: anything that changes frequently. The current status of a project. Your active tasks. What you worked on last session. That’s what memory files are for.

The principle: CLAUDE.md holds behavior. Memory files hold state. The distinction keeps CLAUDE.md clean and stable while allowing your actual working context to evolve without cluttering your behavioral instructions.
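Putting the three categories together, a CLAUDE.md along these lines covers routing, file organization, and guardrails in one place. The agent names and directory paths below are illustrative stand-ins for whatever your own setup uses.

```markdown
# CLAUDE.md

## Routing
- Research and summarization requests → research agent
- Drafting and editing requests → writing agent
- Ambiguous requests: ask one clarifying question before routing

## Vault layout
- inbox/  — raw captures; never edit in place
- notes/  — processed, permanent notes
- output/ — drafts and deliverables
- Pipeline: inbox → distill → notes (never write directly to notes/)

## Constraints
- Never commit .env files or anything in secrets/
- Never skip the review step before moving a draft out of output/
```

Notice what’s absent: no task lists, no project status, nothing that changes week to week. That state belongs in memory files.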


© 2026 Coach K and J Moss · Privacy ∙ Terms ∙ Collection notice