Tellodb Blog
Writing on temporal memory, hybrid retrieval, and infrastructure for agents that need continuity over time.
Featured
Cornerstone posts
How to Extract Structured Knowledge from Chat: A Neural-Symbolic Approach
Vector similarity only gets you partway there. This tutorial walks through building a neural-symbolic pipeline that extracts entities, maps relationships, and computes exact numbers from chat logs using Python, spaCy, and deterministic logic.
How to Fix LLM Math Errors in AI Agents: A Practical Guide
LLMs often fail at basic arithmetic tasks such as counting and summing. This guide shows you how to build a deterministic aggregation layer that gives your AI agent reliable math.
Latest
All posts
Production AI Memory: From Prototype to Serving 10,000 Users
A practical guide to taking AI agent memory from a working prototype to a production system serving thousands of concurrent users.
Your memory prototype works. It retrieves facts, feeds them to a model, and gives coherent answers. But it was built for one user on a warm cache. Here is what breaks when you ship it.
How to Extract Structured Knowledge from Chat: A Neural-Symbolic Approach
A hands-on tutorial combining NER, relationship extraction, and deterministic math to go beyond basic RAG and build truly structured memory from conversational data.
Vector similarity only gets you partway there. This tutorial walks through building a neural-symbolic pipeline that extracts entities, maps relationships, and computes exact numbers from chat logs using Python, spaCy, and deterministic logic.
How to Fix LLM Math Errors in AI Agents: A Practical Guide
LLMs hallucinate numbers. Learn why AI agents fail at counting and aggregation, and build a deterministic arithmetic layer to fix it.
LLMs often fail at basic arithmetic tasks such as counting and summing. This guide shows you how to build a deterministic aggregation layer that gives your AI agent reliable math.
Evaluating Agent Memory Beyond Context Length
Why serious memory evaluation should focus on recall quality, temporal correctness, and contradiction handling instead of context window size alone.
A long context window does not prove an agent remembers well. Memory quality is about retrieving the right evidence at the right time.
How to Handle Contradicting Facts in AI Agent Memory
A practical guide to implementing fact supersession — the mechanism that stops AI agents from giving contradictory answers about the same user.
Your agent told a user they live in Miami, then cited their NYC address in the same conversation. Fact supersession is the retrieval-layer mechanism that prevents this.
How to Fix LLM Hallucinations About Past Conversations
LLMs make things up about what you told them before. Here are three practical techniques to fix memory hallucinations, with Python code.
Your AI agent fabricates past conversations, mixes up timelines, or merges different users' data. These three techniques fix memory hallucinations at the source.
How to Give AI Agents Long-Term Memory: A Python Tutorial
Step-by-step guide to adding persistent memory to AI agents. Learn three approaches from file-based to vector databases, with runnable Python code.
Your AI agent forgets everything between sessions. This tutorial shows you how to add long-term memory using Python, from simple file storage to production-ready vector databases.
Hybrid Retrieval: Why Your AI Agent Needs Both Semantic and Keyword Search
A practical guide to combining BM25 and vector search for production retrieval. Includes Python code, RRF implementation, and real benchmark numbers.
Your vector database is returning irrelevant results for exact queries. Your keyword search misses intent. Here's how to combine both approaches to fix retrieval quality.
Building Knowledge Graph Memory for AI Agents: A Practical Guide
A hands-on guide to adding structured memory to AI agents using knowledge graphs. Learn to build, query, and combine graph and vector retrieval with real Python examples.
Vector memory finds similar text. Graph memory finds connected facts. This guide walks through building both with NetworkX and combining them for production agents.
Local-First AI Development: Why Your Memory Engine Should Run on Your Laptop
A practical guide to building agent memory systems that run locally — faster iteration, lower costs, and no cloud dependency during development.
Most agent memory debugging happens against remote services. This post explains why running the same engine locally makes iteration faster and cheaper, and walks through a complete setup from zero to a working local memory stack.
How to Add Persistent Memory to Any OpenAI Agent (Without Rewriting Your Code)
A step-by-step tutorial showing how to add cross-session memory to existing OpenAI agents using a proxy-based approach, with no SDK changes required.
Most memory solutions require deep integration work. This tutorial walks through adding persistent, time-aware memory to an existing OpenAI agent using a transparent proxy that intercepts API calls.
The Predict-Calibrate Pattern: Managing User Profiles Without Blowing Your Context Window
A practical tutorial on maintaining compact user profiles in AI agents using delta extraction and profile patching, avoiding the bloated-context trap.
As user interactions grow, naive profile storage balloons your context window with stale data. Learn how the predict-calibrate pattern keeps profiles compact and accurate.
RAG vs Long-Term Memory: When Your Vector Database Isn't Enough
RAG retrieves similar text. Long-term memory understands time, contradictions, and evolving truth. Here's when you need both.
RAG works for retrieval, but it can't handle changing facts, temporal reasoning, or user profiles. This comparison shows where standard RAG breaks down and what to build instead.
Self-Hosting AI Memory with Rust: A Step-by-Step Guide
Run your own AI memory engine on your infrastructure. One Rust binary, zero dependencies, full control over user data.
Self-hosting your AI memory engine gives you full control over user data, zero vendor lock-in, and predictable costs. This guide walks through building and deploying a Rust memory engine from source.
Temporal Memory vs Vector Database: Why Your AI Agent Keeps Remembering Stale Facts
A practical guide to understanding why vector databases fail at temporal reasoning, and how to build memory systems that know when facts expire.
Vector databases retrieve similar text. They cannot decide that a newer fact should replace an old one. Here is how to fix that.
We Tested 5 Memory Systems for AI Agents — Here Are the Results
A head-to-head comparison of ChromaDB, Mem0, Zep, LangChain Memory, and Tellodb on accuracy, latency, temporal correctness, and cost.
We benchmarked 5 memory systems across accuracy, latency, temporal correctness, and cost. Here's what we found — and which system wins for different use cases.