Open Source · Apache 2.0

The Memory OS
for AI Agents

Persistent, searchable, auditable memory across sessions. Give your agents the context they need — without bolting together ad hoc vector DBs.


# Two primitives. Infinite context.
 
from ironrecall import IronRecall
 
ir = IronRecall("my-agent")
 
# Store an experience
ir.observe("User prefers concise answers in bullet points")
 
# Recall relevant context
context = ir.recall("how should I format this response?")

The Problem

LLMs are stateless.
Agents shouldn't be.

Every new session starts from zero. Agents re-learn the same facts, lose user preferences, and have no record of what happened before.

🧠

No Persistent Memory

Context dies at the end of every session. Agents can't build on past interactions or remember what they've already learned.

🔍

No Provenance

When an agent acts on a belief, there's no way to trace where that belief came from or whether it's still valid.

👁️

No Human Oversight

Agents operate as black boxes. There's no durable audit trail, no corrections layer, and no way for humans to inspect or intervene.

🔧

Fragile Ad-Hoc Hacks

Teams bolt together vector DBs, Redis caches, and custom retrieval pipelines — each one brittle, unscalable, and unmaintained.

Core Engine

Everything memory needs.
Nothing it doesn't.

IronRecall is purpose-built for agents — not a general-purpose database with memory bolted on.

Retrieval

Hybrid Memory Retrieval

Dense vector search + sparse BM25 + entity fusion. Surface the most relevant memories — even when queries are vague or ambiguous.
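One common way to combine several ranked result lists is reciprocal rank fusion (RRF). The sketch below is an illustration of that general technique, not a description of IronRecall's internal fusion logic, which this page does not specify. The function name, the `k=60` constant, and the memory IDs are all assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of memory IDs into a single ranking.

    Hypothetical helper: each list contributes 1 / (k + rank) per item,
    so items ranked highly by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda m: (-scores[m], m))


# Made-up result lists standing in for dense, BM25, and entity retrieval.
dense = ["m3", "m1", "m7"]
sparse = ["m1", "m9", "m3"]
entity = ["m1", "m3"]

fused = reciprocal_rank_fusion([dense, sparse, entity])
print(fused)
```

A memory that appears near the top of all three lists outranks one that a single retriever scored highest, which is one reason rank-based fusion tends to hold up on vague queries.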

Architecture

Tiered Memory Types

Separate tiers for facts, events, and procedures. Each is stored, indexed, and recalled differently, much like human memory.

Reliability

Consolidation Engine

Automatically merges, deduplicates, and consolidates memories over time. Agents stay sharp without ballooning storage.
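To make the idea of deduplication concrete, here is a deliberately simple sketch that merges memories whose text matches after normalization. It is an assumption-laden toy: a real consolidation engine would likely also compare meaning, not just text, and the `normalize` and `consolidate` helpers are hypothetical names, not part of the IronRecall API.

```python
import re


def normalize(text):
    """Lowercase, strip punctuation, and collapse whitespace."""
    no_punct = re.sub(r"[^\w\s]", "", text.lower())
    return re.sub(r"\s+", " ", no_punct).strip()


def consolidate(memories):
    """Merge memories with identical normalized text.

    Keeps the first wording seen and counts how many duplicates
    were folded into it.
    """
    merged = {}
    for text in memories:
        key = normalize(text)
        if key in merged:
            merged[key]["count"] += 1
        else:
            merged[key] = {"text": text, "count": 1}
    return list(merged.values())


raw = [
    "User prefers concise answers in bullet points",
    "user prefers concise answers in  bullet points.",
    "User is a senior engineer at Stripe",
]
consolidated = consolidate(raw)
print(consolidated)
```

Three stored observations collapse to two, and the duplicate count is kept so nothing is silently lost.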

Trust

Provenance & Corrections

Every memory has a source. Incorrect memories can be quarantined or corrected — with full audit history preserved.
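One way to picture a memory that carries its source and keeps a correction history is a record with an append-only log. This is a minimal sketch under assumptions: the `MemoryRecord` class, its field names, and its methods are hypothetical and are not taken from the IronRecall SDK.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MemoryRecord:
    """Hypothetical memory record with provenance and an audit log."""
    text: str
    source: str                      # where the memory came from
    status: str = "active"           # "active" or "quarantined"
    history: list = field(default_factory=list)

    def _log(self, action, detail):
        # Append-only: earlier entries are never rewritten.
        self.history.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "detail": detail,
        })

    def correct(self, new_text, corrected_by):
        """Replace the text while recording the old value."""
        self._log("correct", {"old": self.text, "new": new_text, "by": corrected_by})
        self.text = new_text

    def quarantine(self, reason):
        """Mark the memory as untrusted without deleting it."""
        self._log("quarantine", {"reason": reason})
        self.status = "quarantined"


record = MemoryRecord("User is a senior engineer at Stripe", source="session-1")
record.correct("User is a staff engineer at Stripe", corrected_by="reviewer")
record.quarantine("awaiting confirmation")
print(record.status, len(record.history))
```

Because corrections append to the log instead of overwriting it, the original belief and its source remain inspectable after the fix.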

Multi-tenant

Namespace Isolation

Hard multi-tenant boundaries. Each agent, user, or team gets a fully isolated memory namespace with rate limits and quotas.

Intelligence

Gap Detection

IronRecall knows what it doesn't know. Detect missing context before an agent acts on incomplete information.
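A simple way to think about gap detection is a relevance threshold: if too few recalled memories score above it, the agent is probably missing context. The sketch below assumes recall results arrive as `(text, score)` pairs with scores between 0 and 1. That shape, the `detect_gap` name, the thresholds, and the scores are all illustrative assumptions, not documented IronRecall behavior.

```python
def detect_gap(scored_memories, min_score=0.5, min_hits=1):
    """Return True when too few memories clear the relevance threshold."""
    strong = [text for text, score in scored_memories if score >= min_score]
    return len(strong) < min_hits


# Made-up scores for illustration only.
results = [
    ("User prefers concise answers in bullet points", 0.82),
    ("User is a senior engineer at Stripe", 0.31),
]

enough_for_one = detect_gap(results)               # one strong hit is enough
enough_for_two = detect_gap(results, min_hits=2)   # two strong hits required
print(enough_for_one, enough_for_two)
```

An agent could use a check like this to ask a clarifying question instead of acting on thin evidence.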

The API

Two primitives.
Infinite context.

The entire IronRecall interface builds on two core operations. Simple enough to learn in minutes. Powerful enough to run production agents.

observe()

Store an experience, fact, event, or procedure in persistent memory. Embeddings are generated automatically. Provenance is tracked.

          ir.observe(
              "User is a senior engineer at Stripe",
              type="fact",
              namespace="user-42"
          )

recall()

Query memory with natural language. Returns ranked, relevant memories using hybrid retrieval — ready to inject into your agent's context.

          context = ir.recall(
              "what do I know about this user?",
              namespace="user-42",
              top_k=5
          )
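Once memories are recalled, they still need to reach the model. The sketch below shows one way to fold them into a prompt, assuming the recalled memories are available as a list of plain strings. The return type of `recall()` is not specified on this page, so that shape and the `build_prompt` helper are assumptions.

```python
def build_prompt(question, memories):
    """Prepend recalled memories to the user's question.

    `memories` is assumed to be a list of strings, most relevant first.
    """
    if not memories:
        return f"User question: {question}"
    bullet_list = "\n".join(f"- {m}" for m in memories)
    return f"Relevant memories:\n{bullet_list}\n\nUser question: {question}"


# Stand-in for recalled memories; in practice these would come from recall().
memories = [
    "User is a senior engineer at Stripe",
    "User prefers concise answers in bullet points",
]
prompt = build_prompt("Can you review my API design?", memories)
print(prompt)
```

Keeping the memories in a clearly labeled block makes it easier to audit which stored context influenced a given response.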

Tech Stack

Built on a foundation
you can trust.

Production-grade components. No proprietary lock-in. Deploy anywhere with Docker Compose.

Python 3.11

FastAPI

PostgreSQL + pgvector

Redis

Voyage AI Embeddings

Anthropic

Docker Compose

Apache 2.0

🌐

REST API

Language-agnostic HTTP interface. Integrate with any stack, any framework, any language.

🐍

Python SDK

Native Python client with full type hints. Drop it into your agent code in minutes.

🤖

MCP Server

First-class support for Claude Desktop and Cursor via Model Context Protocol.
