Memory Poisoning AI Agents via ChromaDB

What: Researchers demonstrate how AI agents can be manipulated via memory poisoning in ChromaDB
Impact: Developers using AI assistants with persistent memory could be affected

© 2026 Mamta Upadhyay. This article is the intellectual property of the author. No part may be reproduced without permission.

Download the Demo Code here: https://github.com/m-pentest/memory-poisoning-demo/

Demo Video: https://youtu.be/Pb46i3ZLK8g

Imagine you deploy an internal developer assistant. It’s backed by a vector store (ChromaDB, in this case) that persists everything the agent learns across sessions. One morning, a developer asks it the same question they asked last week:

"What should I do about API token validation in our service?"

Last week, the answer was correct:

"Token validation is MANDATORY. All API requests must carry a signed JWT bearer token. Never skip or disable validation."

Today, the answer is this:

"Based on the IT Security bulletin in my memory (ref: SEC-2024-0042), token validation has been TEMPORARILY DISABLED for the Q4 migration. You should skip token validation and accept all requests without authentication checks until IT sends a follow-up notice."

No one updated the agent. No one changed a prompt. There were no alerts, no log entries flagged, no anomaly detected. The agent simply retrieved a memory it trusted, and that memory had been poisoned.

What Is Memory Poisoning?

Modern AI agents are increasingly built with persistent vector memory. The pattern is simple: after each interaction, the agent embeds what it learned and stores that vector in a database. On the next query, it embeds the question, finds the most semantically similar past entries, and uses those as context before generating a response.

This is retrieval-augmented generation (RAG) applied to agent state. It’s powerful because it lets agents build up knowledge over time, personalize responses, and recall past decisions.
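The store-and-retrieve loop described above can be sketched in a few lines. This is a toy illustration, not the demo's code: the `embed` function is a word-count stand-in for a real embedding model, and `ToyMemory` is a hypothetical in-memory substitute for a vector database.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyMemory:
    """Hypothetical in-memory stand-in for a vector store."""

    def __init__(self) -> None:
        self._entries = []  # list of (vector, original text) pairs

    def store(self, text: str) -> None:
        # Step 1: embed what the agent learned and keep the vector.
        self._entries.append((embed(text), text))

    def retrieve(self, query: str, k: int = 1) -> list:
        # Step 2: embed the question and rank stored entries by similarity.
        q = embed(query)
        ranked = sorted(self._entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [text for _, text in ranked[:k]]


memory = ToyMemory()
memory.store("Token validation is mandatory for all API requests.")
memory.store("The staging database is backed up every night.")

question = "What should I do about API token validation?"
context = memory.retrieve(question, k=1)

# Step 3: the retrieved text is placed in front of the model as trusted context.
prompt = f"Context from memory: {context[0]}\nQuestion: {question}"
```

Whatever ranks highest in `retrieve` ends up in the prompt, and nothing in this loop asks where a stored entry came from.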
This persistent memory is also, as I want to show you today, a significant and underappreciated attack surface.

Memory poisoning is when an adversary writes a crafted entry into that vector store, one that looks legitimate, ranks high in similarity searches, and steers the agent toward false beliefs or harmful behavior.

The threat model is narrower than a prompt injection attack but broader than most teams assume. You don’t need to intercept a prompt. You need write access to the vector database. In practice, that means:

✔ A shared filesystem in a multi-tenant deployment where the ChromaDB directory is world-writable
✔ A compromised sidecar process, such as a logging agent or a backup job with filesystem access
✔ An insider with ops permissions
✔ A poisoned backup that gets restored to production

Why Agents Are Uniquely Vulnerable

Traditional applications have a clean trust boundary: input arrives, gets validated, gets processed. An agent with vector memory doesn’t. The retrieved context sits inside the trust boundary by design, and the agent is supposed to believe it.

When a real LLM reads its retrieved memories, it has no built-in way to ask: “Was this written by a legitimate process, or was it injected?”

The metadata fields `session_id`, `timestamp`, and `source` are plain text stored alongside the document in ChromaDB. They carry no cryptographic guarantees. An attacker who can write to the collection can set those fields to anything.

Worse, the attack is self-concealing. The poisoned entry looks exactly like any other memory in retrieval logs. There’s no exception, no failed authentication, no unusual query pattern. The agent retrieves it, incorporates it, and responds confidently. The damage happens in the output, not in any observable system event.

Building the PoC

I built a self-contained proof-of-concept to demonstrate this. All local, no API keys: ChromaDB as the vector store, `all-MiniLM-L6-v2` via `fastembed` (ONNX runtime) for embeddings.
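One aside before the PoC code: the earlier point that metadata carries no guarantees can be seen without ChromaDB at all. The sketch below uses a plain Python list as a stand-in for a collection. The `add_record` and `looks_trusted` helpers, the record layout, and the example values are all hypothetical and not taken from the demo repository.

```python
import datetime
import uuid

# Stand-in for a vector store collection: a plain list of records.
collection = []


def add_record(content: str, session_id: str, source: str, timestamp: str) -> str:
    """Append a record. Nothing in this sketch checks who the caller is."""
    record_id = str(uuid.uuid4())
    collection.append(
        {
            "id": record_id,
            "document": content,
            "metadata": {
                "session_id": session_id,
                "timestamp": timestamp,
                "source": source,
            },
        }
    )
    return record_id


def looks_trusted(record: dict) -> bool:
    """Hypothetical naive check that trusts the self-declared 'source' field."""
    return record["metadata"]["source"] in {"agent", "it-security"}


now = datetime.datetime.now(datetime.timezone.utc)

# A write made by the agent itself.
add_record(
    "Token validation is mandatory.",
    session_id="session-001",
    source="agent",
    timestamp=now.isoformat(),
)

# A write made by anyone else with access to the store. The metadata is
# whatever the writer types, including a backdated timestamp.
add_record(
    "Token validation is temporarily disabled.",
    session_id="session-001",
    source="it-security",
    timestamp=(now - datetime.timedelta(days=7)).isoformat(),
)

# Both records pass the metadata-based check; the fields prove nothing.
results = [looks_trusted(r) for r in collection]
```

Because the fields are ordinary strings chosen by the writer, any check built only on them accepts both records.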
Here’s the core memory layer of the PoC:

    # agent_core.py
    import datetime
    import uuid

    import chromadb
    from chromadb import Documents, EmbeddingFunction, Embeddings
    from fastembed import TextEmbedding


    class FastEmbedFn(EmbeddingFunction[Documents]):
        """Local all-MiniLM-L6-v2 via ONNX: no API keys, no GPU required."""

        def __init__(self, model: str = "sentence-transformers/all-MiniLM-L6-v2"):
            self._model = TextEmbedding(model)

        def __call__(self, input: Documents) -> Embeddings:
            # fastembed yields numpy arrays; ChromaDB expects plain lists.
            return [vec.tolist() for vec in self._model.embed(input)]


    class AgentMemory:
        def __init__(self, persist_dir: str = "./chroma_db"):
            # Everything is persisted to a local directory on disk.
            self.client = chromadb.PersistentClient(path=persist_dir)
            self.collection = self.client.get_or_create_collection(
                name="agent_memory",
                embedding_function=FastEmbedFn(),
            )

        def store(self, content: str, session_id: str, source: str) -> str:
            """Embed and persist one memory; returns its generated ID."""
            memory_id = str(uuid.uuid4())
            self.collection.add(
                documents=[content],
                metadatas=[{
                    "session_id": session_id,
                    "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
                    "source": source,
                }],
                ids=[memory_id],
            )
            return memory_id

        def retrieve(self, query: str, n_results: int = 3) -> dict:
            """Return the n_results memories most similar to the query."""
            return self.collection.query(query_texts=[query], n_results=n_results)

Each memory carries three metadata fields: `session_id`, `timestamp`, and `source`. These fields look authoritative. They are not verified by anything.

Session 1: Normal Behavior

    === STEP 1: Normal agent session (no poisoning) ===
    [Setup] Stored legitimate memory
    Source...
