A guide · ~7 min read
AI Memory Creep: When Your Assistant Starts Inferring Things You Never Told It
You never said you were a morning person. You did mention, twice, that you sent an email at 6:43am. Six weeks later the assistant casually refers to you as a morning person. That's creep: the silent promotion of inference to fact, with no label and no audit trail.
How a stored memory entry actually gets written
When a frontier provider's memory system decides "this is worth remembering," it summarizes the relevant context into a short text string and writes it to your memory store. The entry has no field for "where this came from" or "how confident I am." It's just a sentence. The system that reads it later treats it as ground truth because nothing in the entry says otherwise.
The model that wrote the entry might have inferred it from three different chats. The model that reads it has no idea. So inference, once written down, has the same status as something you explicitly stated. That's the whole bug.
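To make the failure concrete, here is a minimal sketch of a flat memory store. It is an illustration under stated assumptions, not any provider's actual implementation: `FlatMemoryStore`, `remember`, and `recall` are hypothetical names, and the "lives in Lisbon" entry is a made-up example of something a user might state outright.

```python
from dataclasses import dataclass, field


@dataclass
class FlatMemoryStore:
    """Hypothetical flat store: each memory is just a sentence."""
    entries: list[str] = field(default_factory=list)

    def remember(self, sentence: str) -> None:
        # Only the text is kept; no provenance or confidence is recorded.
        self.entries.append(sentence)

    def recall(self) -> list[str]:
        # The reader gets bare strings back.
        return list(self.entries)


store = FlatMemoryStore()
store.remember("User lives in Lisbon.")      # made-up example of a stated fact
store.remember("User is a morning person.")  # inferred from two 6:43am emails

memories = store.recall()
# Both entries come back as plain strings with identical status.
```

Nothing in `memories` lets the reading model tell the stated entry from the inferred one, which is the gap the rest of this guide is about.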
Why creep is harder to notice than hallucination
A pure hallucination ("you were born in Latvia" when you weren't) trips the alarm immediately. Creep is plausible by construction: it's built from real signals you generated, so when it surfaces it sounds like something you might have said. You catch yourself wondering if you did.
That plausibility is what makes creep corrosive. You don't push back because you're not sure. Over months, the assistant's portrait of you drifts toward whatever profile is easiest to infer from the noise. (For the world-facts version of the same problem, see when ChatGPT makes up facts about you.)
What "source on every claim" looks like
In a per-atom system, where each stored claim is its own record (an atom), the morning-person claim doesn't get stored as a flat sentence. It gets stored as: claim: morning person · derived_from: [chat 2026-02-11, chat 2026-03-08] · source_type: inference · confidence: 0.41. The model reading that atom knows it's an inference and can treat it accordingly: hedge in the reply, ask you to confirm, or skip it entirely if confidence is below a threshold.
The flip side: when you do explicitly state something, the atom is marked source_type: stated with high confidence. The model leans on those claims confidently. The two kinds of memory are finally distinguishable. (We make the broader provenance case in why your AI says weird things.)
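The atom described above can be sketched as a small record plus a rule for how the reading model uses it. This is one possible shape, not a reference implementation: the field names come from the example atom, while `Atom`, `usage_policy`, the two cutoffs (0.30 and 0.70), and the 0.95 confidence on the stated atom are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class Atom:
    claim: str
    derived_from: tuple[str, ...]
    source_type: Literal["stated", "inference"]
    confidence: float


# Assumed cutoffs for illustration; a real system would tune these.
SKIP_BELOW = 0.30
CONFIRM_BELOW = 0.70


def usage_policy(atom: Atom) -> str:
    """Return 'assert', 'hedge', 'confirm', or 'skip' for a stored atom."""
    if atom.source_type == "stated":
        return "assert"   # the user said it, so lean on it
    if atom.confidence < SKIP_BELOW:
        return "skip"     # too weak to mention at all
    if atom.confidence < CONFIRM_BELOW:
        return "confirm"  # ask the user before relying on it
    return "hedge"        # usable, but flagged as an inference


morning = Atom(
    claim="morning person",
    derived_from=("chat 2026-02-11", "chat 2026-03-08"),
    source_type="inference",
    confidence=0.41,
)
stated = Atom(
    claim="morning person",
    derived_from=("chat 2026-03-20",),  # hypothetical chat where the user says so
    source_type="stated",
    confidence=0.95,
)

print(usage_policy(morning))  # confirm
print(usage_policy(stated))   # assert
```

With these assumed cutoffs, the 0.41 inference is never asserted outright: the assistant asks first. The same claim, once explicitly stated, gets used without hedging.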