What is persistent AI memory?

Persistent AI memory is a storage layer that lets an AI assistant remember facts, preferences and decisions across separate conversations, instead of starting from zero each session. Unlike a single model’s built-in memory, a persistent layer can be exported, edited and reused across different AI tools.

How is it different from a context window?

A context window is the working memory of one conversation: everything the model can "see" right now. It is large but temporary, and it resets when the conversation ends. Persistent memory sits outside the conversation. It captures the durable facts worth keeping and re-injects only the relevant ones into each new session, so the model behaves as if it remembers you without re-reading the entire history every time.
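To make the re-injection step concrete, here is a minimal sketch. It is not Alma's implementation: the names `MemoryItem`, `select_relevant` and `build_prompt` are hypothetical, and plain word overlap stands in for whatever relevance model a real memory layer would use.

```python
import re
from dataclasses import dataclass


@dataclass
class MemoryItem:
    text: str  # one durable fact, preference or decision


def tokenize(text: str) -> set[str]:
    """Lowercase word set; an assumed stand-in for a real relevance model."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_relevant(store: list[MemoryItem], message: str, limit: int = 2) -> list[MemoryItem]:
    """Rank stored items by word overlap with the new message and drop non-matches."""
    query = tokenize(message)
    scored = [(len(query & tokenize(item.text)), item) for item in store]
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in matches[:limit]]


def build_prompt(store: list[MemoryItem], message: str) -> str:
    """Prefix a new session's first message with only the relevant memories."""
    relevant = select_relevant(store, message)
    lines = ["Known about the user:"] + [f"- {m.text}" for m in relevant]
    return "\n".join(lines + ["", f"User: {message}"])


# The store lives outside any single conversation (hypothetical contents).
store = [
    MemoryItem("Prefers TypeScript over Java"),
    MemoryItem("Wants concise answers"),
    MemoryItem("Chose Postgres over MySQL for the billing service"),
]

prompt = build_prompt(store, "Which index should I add in Postgres for the billing table?")
print(prompt)
```

Only the Postgres decision is injected here. The other two items stay in the store but never enter the context window, which is what keeps each new session small.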

How is it different from RAG?

Retrieval-augmented generation (RAG) retrieves passages from documents you supply, such as a knowledge base or a set of files, and grounds answers in them. Persistent memory is about you: it captures and structures what you tell the AI over time (preferences, decisions, ongoing projects) rather than indexing a document corpus. The two are complementary, and many systems use both.

What does it actually remember?

It remembers facts ("I use TypeScript, not Java"), preferences ("answer concisely"), decisions ("we chose Postgres over MySQL") and recurring patterns. In Alma this is organised into three layers: memories (discrete facts), episodes (conversation summaries) and procedures (learned workflows). Each entry is scored and retrieved by relevance so the right context surfaces at the right time.
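As an illustration only, the three layers could be modelled as three record types ranked by one shared relevance function. Alma's real schema and scoring are not described here, so every name below (`Memory`, `Episode`, `Procedure`, `relevance`, `retrieve`) is hypothetical, and Jaccard word overlap is an assumed placeholder for the actual scoring.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str  # a discrete fact


@dataclass
class Episode:
    summary: str  # a conversation summary


@dataclass
class Procedure:
    name: str  # a learned workflow
    steps: list[str]


def searchable_text(entry) -> str:
    """Flatten any layer's record into text that can be scored."""
    if isinstance(entry, Memory):
        return entry.text
    if isinstance(entry, Episode):
        return entry.summary
    return entry.name + " " + " ".join(entry.steps)


def relevance(query: str, entry) -> float:
    """Jaccard overlap of word sets; an assumed placeholder score in [0, 1]."""
    q = set(query.lower().split())
    e = set(searchable_text(entry).lower().split())
    union = q | e
    return len(q & e) / len(union) if union else 0.0


def retrieve(query: str, entries: list, top_k: int = 2) -> list:
    """Return the top_k entries across all layers, skipping zero-score ones."""
    ranked = sorted(entries, key=lambda e: relevance(query, e), reverse=True)
    return [e for e in ranked[:top_k] if relevance(query, e) > 0]


# Hypothetical contents, one record per layer.
entries = [
    Memory("we chose postgres over mysql"),
    Episode("discussed database migration plan for postgres"),
    Procedure("release checklist", ["run tests", "tag version", "deploy"]),
]

for hit in retrieve("postgres migration plan", entries):
    print(type(hit).__name__, "->", searchable_text(hit))
```

For this query the episode ranks first and the fact second, while the unrelated release procedure is left out. Scoring all layers with one function is a design choice of the sketch; a real system might weight each layer differently.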

Where Alma fits

Alma is a persistent memory layer wrapped in a full workspace: you chat, it remembers, and the same memory is reachable from Claude Desktop, Cursor and VS Code over MCP (Model Context Protocol). You can export it at any time. It is the memory your AI was missing, and it is not locked inside one provider.
