Memory Crystal stores what your AI knows across two distinct layers: short-term memory (STM) and long-term memory (LTM). Each layer serves a different purpose, and both are searched together on every turn.

Short-term memory

Short-term memory holds the raw, verbatim text of every message in a conversation. Nothing is summarized or interpreted — STM is an exact record of what was said. When you ask your AI to recall something from earlier in a session, STM is what surfaces that context. It’s also the source used by crystal_search_messages and crystal_recent. STM has a rolling retention window based on your plan:
| Plan  | STM retention |
| ----- | ------------- |
| Free  | 30 days       |
| Pro   | 60 days       |
| Ultra | 90 days       |
Messages older than the retention window are automatically purged. They are not promoted to LTM — STM and LTM are populated independently.
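The rolling purge can be sketched as a simple timestamp filter. This is an illustrative model only: `purge_stm`, the `(timestamp, text)` record shape, and the `RETENTION_DAYS` mapping are hypothetical stand-ins for however STM stores messages internally, with the retention values taken from the table above.

```python
from datetime import datetime, timedelta, timezone

# STM retention windows per plan, in days (from the table above).
RETENTION_DAYS = {"free": 30, "pro": 60, "ultra": 90}

def purge_stm(messages, plan, now=None):
    """Return only messages still inside the plan's rolling window.

    `messages` is a list of (timestamp, text) pairs -- a hypothetical
    stand-in for STM's internal record format.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=RETENTION_DAYS[plan])
    return [(ts, text) for ts, text in messages if ts >= cutoff]
```

On a Free plan, a 45-day-old message falls outside the 30-day window and is dropped; on Pro, the same message survives because the window is 60 days.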

Long-term memory

Long-term memory holds distilled facts extracted from your conversations. After each turn, Memory Crystal runs an extraction pass and pulls out what’s worth keeping — decisions, lessons, people, rules, goals, and more. Those extractions are embedded as vectors and stored permanently. LTM is what lets your AI remember something you said three months ago in a completely different session. It’s not searching raw text — it’s searching a structured knowledge base built from everything you’ve discussed. LTM memories persist indefinitely unless you explicitly archive or delete them.
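The embed-and-search flow can be sketched as follows. Everything here is a toy: `embed` is a bag-of-words counter standing in for the real learned embedding model, and `store_fact` / `search_ltm` are hypothetical names, not Memory Crystal APIs. Only the shape of the flow (extract a fact, embed it, rank by vector similarity) mirrors the description above.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'. The real system uses a learned
    embedding model; this only illustrates the shape of the pipeline."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

ltm = []  # list of (vector, fact) pairs

def store_fact(fact):
    """Embed a distilled fact and store it permanently."""
    ltm.append((embed(fact), fact))

def search_ltm(query, k=2):
    """Rank stored facts by semantic similarity to the query."""
    ranked = sorted(ltm, key=lambda p: cosine(embed(query), p[0]), reverse=True)
    return [fact for _, fact in ranked[:k]]
```

Note that the query never has to repeat the original wording: it matches against the distilled fact, not the raw transcript, which is what makes cross-session recall possible.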

Comparison

|            | Short-term memory (STM)             | Long-term memory (LTM)                             |
| ---------- | ----------------------------------- | -------------------------------------------------- |
| Content    | Raw messages, verbatim              | Extracted facts, decisions, lessons, people, rules |
| Retention  | Rolling window (30–90 days by plan) | Forever                                            |
| Search     | Hybrid BM25 + vector                | Vector-indexed semantic search                     |
| Written by | Capture hook, automatically         | LLM extraction after each turn                     |
| Purpose    | Recent context and continuity       | Persistent knowledge across sessions               |

How they work together

Before every response, the Context Engine searches both layers simultaneously. STM provides recent conversational continuity — what you’ve been discussing right now. LTM provides durable knowledge — what your AI has learned over time.
You don’t choose which layer to search. The Context Engine queries both and merges the results before ranking them. The most relevant memories from either layer are injected into the model context automatically.
The result is an AI that maintains the thread of your current conversation (STM) while also drawing on everything it has learned over your entire history together (LTM).
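The merge-then-rank step can be sketched like this. `merged_context` and the `(score, text)` hit shape are hypothetical; in particular, the assumption that STM and LTM scores are directly comparable is a simplification of whatever cross-layer ranking the Context Engine actually does.

```python
def merged_context(stm_hits, ltm_hits, limit=3):
    """Merge scored hits from both layers and keep the top results.

    Each hit is a (score, text) pair. Assuming scores share a scale,
    a single sort ranks results from either layer together.
    """
    merged = sorted(stm_hits + ltm_hits, key=lambda h: h[0], reverse=True)
    return [text for _, text in merged[:limit]]
```

Because ranking happens after the merge, a strong LTM match can outrank a weak STM match even though the STM text is more recent.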

Memory lifecycle

1. Message arrives: Your message is captured and written to STM as a verbatim record.
2. AI responds: The AI generates a response using context injected by the Context Engine.
3. Extraction runs: After the response, Memory Crystal passes the conversation turn to an LLM that extracts up to three durable memories and writes them to LTM.
4. Graph enrichment: An async background job connects the new memories to related memories already in LTM, building the knowledge graph.

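The four steps above can be sketched as one pipeline function. `handle_turn` is hypothetical, and `respond`, `extract`, and `enrich` are stand-in callables for the model, the extraction LLM, and the graph-enrichment job; the sketch also runs enrichment inline, whereas the real job is asynchronous.

```python
def handle_turn(user_message, stm, ltm, respond, extract, enrich):
    """Sketch of the documented turn lifecycle, with hypothetical
    callables standing in for each component."""
    stm.append(user_message)                      # 1. capture verbatim to STM
    reply = respond(user_message)                 # 2. AI responds with injected context
    memories = extract(user_message, reply)[:3]   # 3. extract up to three durable memories
    ltm.extend(memories)
    enrich(memories)                              # 4. graph enrichment (async in practice)
    return reply
```

The cap of three memories per turn is applied at extraction time, so a chatty turn cannot flood LTM.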
You can store memories manually at any time using crystal_remember. This is useful for saving a decision or fact that you know is important, without waiting for automatic extraction.