The pricing page was updated three weeks ago. The AI is still quoting the old price. The output looks perfect. The facts are wrong.

RAG systems are only as fresh as their indexed documents. Pricing pages drift within weeks. Compliance documents expire. Policies change. But the system that indexed your docs at launch keeps answering confidently from information that’s months out of date. And since RAG outputs still look coherent and well-formatted, there’s no obvious signal that the underlying facts have gone wrong. [CONFIRMED] One analysis found that RAG systems lose roughly a third of their effective accuracy within 90 days, purely due to knowledge staleness. [SOURCE: Nebula]

The invisible rot: Your knowledge base decays silently. The AI keeps answering confidently. The users keep trusting the answers. Until someone checks the price quote against the website and realizes it’s wrong.

The Five Causes of Knowledge Base Decay

1. Ranking Conflicts

Vector databases prioritize semantic closeness over chronological recency. Without strict deprecation rules or time-weighted metadata, older documents can mathematically outrank newly added ones. [CONFIRMED] A query about “current pricing” retrieves the 2024 pricing sheet because its vector embedding is closer to the query than the 2026 update. [SOURCE: Kajito]

Early warning: Answers that cite old dates. User reports of “outdated” responses that your logs trace back to documents still sitting in the index.

The fix: Apply time-weighted metadata to your vector database. Add explicit prompt instructions telling the LLM to favor recent dates. [SOURCE: Nebula]
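One way to picture time-weighting is as a recency decay multiplied into the similarity score before ranking. The sketch below is a minimal illustration, not any specific vector database's API: the `hits` structure, the `half_life_days` value of 180, and the example scores are all assumptions chosen for the demo.

```python
from datetime import date

def time_weighted_score(similarity, doc_date, today, half_life_days=180.0):
    """Scale a similarity score by an exponential recency decay.

    half_life_days is an assumed tuning knob: a document this old
    keeps half of its original similarity score.
    """
    age_days = max((today - doc_date).days, 0)
    decay = 0.5 ** (age_days / half_life_days)
    return similarity * decay

def rerank(hits, today):
    """Sort hits (dicts with 'id', 'similarity', 'date') by decayed score."""
    return sorted(
        hits,
        key=lambda h: time_weighted_score(h["similarity"], h["date"], today),
        reverse=True,
    )

# Made-up example: the older sheet is semantically closer to the query.
hits = [
    {"id": "pricing-2024", "similarity": 0.91, "date": date(2024, 1, 15)},
    {"id": "pricing-2026", "similarity": 0.88, "date": date(2026, 1, 15)},
]
ranked = rerank(hits, today=date(2026, 3, 1))
print([h["id"] for h in ranked])
```

With raw similarity alone the 2024 sheet would win. With the decay applied, the 2026 update ranks first. The half-life should probably differ by content type, since pricing ages much faster than architecture docs.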

2. Static Indexing Delays

Relying on massive scheduled batch jobs to reindex an entire knowledge base leaves answers stale between update cycles. [CONFIRMED] A weekly reindex job means an answer can be up to 7 days out of date. For pricing, compliance, or policy questions, that’s unacceptable. [SOURCE: Kajito]

The fix: Switch to retrieval-on-demand. Each time someone asks a question, fetch the relevant documents directly from cloud storage or the knowledge base at that moment. Document retrieval is fast. The tradeoff is that documents must be accessible in a queryable format, but that’s a smaller infrastructure problem than maintaining a fresh index. [SOURCE: Nebula]

3. Caching Overrides

Semantic or API caching layers intercept queries and serve previously generated, obsolete responses before the retriever even searches for new documents. [CONFIRMED] The cache hit saves 50ms. It also serves a 6-month-old answer. [SOURCE: Kajito]

The fix: Cache invalidation tied to document updates. Any change to source documents should trigger a cache flush. If you can’t track that, don’t cache RAG answers at all.
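Tying invalidation to document updates means the cache has to remember which documents each answer came from. Here is a rough sketch under that assumption. `SourceAwareCache` and its methods are hypothetical names, and the in-memory dicts stand in for whatever cache layer you actually run.

```python
from collections import defaultdict

class SourceAwareCache:
    """Answer cache that tracks which source documents fed each answer."""

    def __init__(self):
        self._answers = {}                        # query -> cached answer
        self._queries_by_doc = defaultdict(set)   # doc_id -> queries that used it

    def put(self, query, answer, source_doc_ids):
        self._answers[query] = answer
        for doc_id in source_doc_ids:
            self._queries_by_doc[doc_id].add(query)

    def get(self, query):
        return self._answers.get(query)

    def invalidate_document(self, doc_id):
        """Evict every answer built from doc_id; returns the eviction count."""
        stale_queries = self._queries_by_doc.pop(doc_id, set())
        evicted = 0
        for query in stale_queries:
            if self._answers.pop(query, None) is not None:
                evicted += 1
        return evicted

    def flush(self):
        """Drop everything: the blunt option for any source change."""
        self._answers.clear()
        self._queries_by_doc.clear()

# Made-up example values.
cache = SourceAwareCache()
cache.put("what is the price?", "old quote", ["pricing-2024"])
cache.put("what is the refund policy?", "30 days", ["refund-policy"])
evicted = cache.invalidate_document("pricing-2024")
print(evicted, cache.get("what is the price?"))
```

Targeted eviction only catches answers that cited the changed document. A brand-new document that should change an answer will not trigger it, so the full `flush()` on any source change remains the safer default.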

4. Silent Ingestion Failures

New data is uploaded to the system but fails to become searchable due to asynchronous indexing delays, poor chunking strategies, or background parsing errors. [CONFIRMED] The document is in the upload bucket. The vector database never received it. No error was surfaced. [SOURCE: Kajito]

Early warning: Intercept the raw output of your top-K retrieved chunks before they reach the LLM. This immediately clarifies whether you have a retrieval failure (the new doc wasn’t found) or a generation error (the LLM ignored it). [SOURCE: Nebula]

The fix: Add a retrieval audit log. Every query should log which source IDs were retrieved, their dates, and their relevance scores. [SOURCE: Kajito]
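A retrieval audit log can be as simple as one JSON line per query. The sketch below assumes each retrieved hit is a dict with `id`, `date`, and `score` keys. Those field names, the `log_retrieval` helper, and the in-memory sink are illustrative choices, not a prescribed schema.

```python
import io
import json
from datetime import datetime, timezone

def build_audit_record(query, hits):
    """Capture which sources were retrieved, with their dates and scores."""
    return {
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "query": query,
        "sources": [
            {
                "source_id": h["id"],
                "doc_date": h["date"],
                "score": round(h["score"], 4),
            }
            for h in hits
        ],
    }

def log_retrieval(query, hits, sink):
    """Append one JSON line to any file-like sink and return the record."""
    record = build_audit_record(query, hits)
    sink.write(json.dumps(record) + "\n")
    return record

# In production the sink would be a log file or log shipper;
# StringIO keeps the example self-contained.
sink = io.StringIO()
record = log_retrieval(
    "current pricing",
    [{"id": "pricing-2024", "date": "2024-01-15", "score": 0.91}],
    sink,
)
print(sink.getvalue())
```

If a newly uploaded document never appears in `sources` for queries it should obviously match, that points at ingestion rather than generation.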

5. Context Window Limitations

Even when fresh data is retrieved successfully, it might get truncated if it exceeds the LLM’s context window. Without explicit instructions to prioritize recent dates, the model falls back on its outdated pre-trained memory. [CONFIRMED] The 2026 policy document is in the retrieved chunks — at position 47, beyond the context window cutoff. [SOURCE: Kajito]

The fix: Cap chunk injection at the top 5. Twenty chunks fill the context window with noise. Score-gate retrieval and discard low-relevance chunks. [SOURCE: Nebula]
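Score-gating and capping fit in a few lines. In this sketch the `min_score` threshold of 0.75 is an assumed value that would need tuning per embedding model, and the hit dicts are made-up examples.

```python
def select_chunks(hits, min_score=0.75, max_chunks=5):
    """Drop low-relevance chunks, then keep at most max_chunks of the best."""
    kept = [h for h in hits if h["score"] >= min_score]
    kept.sort(key=lambda h: h["score"], reverse=True)
    return kept[:max_chunks]

# Made-up scores for eight retrieved chunks.
scores = [0.92, 0.40, 0.88, 0.81, 0.79, 0.77, 0.76, 0.30]
hits = [{"id": f"chunk-{i}", "score": s} for i, s in enumerate(scores)]

selected = select_chunks(hits)
print([h["id"] for h in selected])
```

The gate runs before the cap, so a query with only two relevant chunks sends two, not five. This works best together with recency-aware ranking. Otherwise a fresh document with a slightly lower score can still be the one that gets cut.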

The “Document Age” Metric

Treat “document age” as a first-class reliability metric alongside retrieval latency and answer quality. [CONFIRMED] Production-grade teams audit document freshness every 60-90 days. Any document older than its expected shelf life — days for pricing, months for policy, a year for stable architecture content — needs verification before it’s surfaced. [SOURCE: Nebula]

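| Content Type       | Shelf Life | Audit Cadence |
|--------------------|------------|---------------|
| Pricing            | Days       | Weekly        |
| Compliance / Legal | Months     | Monthly       |
| Product specs      | Months     | Quarterly     |
| Architecture docs  | Year       | Quarterly     |
| HR policies        | Months     | Quarterly     |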
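The shelf-life idea above translates into a simple check: compare each document's age against a limit for its content type. The table only gives orders of magnitude (days, months, a year), so the exact day counts below are assumptions. The `type` and `last_verified` fields are hypothetical metadata you would need to store per document.

```python
from datetime import date

# Assumed thresholds: the source only says "days", "months", "a year".
SHELF_LIFE_DAYS = {
    "pricing": 7,
    "compliance": 90,
    "product_specs": 90,
    "architecture": 365,
    "hr_policies": 90,
}

def needs_verification(doc, today):
    """True when a document has outlived the shelf life for its type."""
    age_days = (today - doc["last_verified"]).days
    return age_days > SHELF_LIFE_DAYS[doc["type"]]

def stale_documents(docs, today):
    """Return the IDs of documents that should be re-verified."""
    return [d["id"] for d in docs if needs_verification(d, today)]

# Made-up inventory.
docs = [
    {"id": "pricing-page", "type": "pricing", "last_verified": date(2026, 2, 1)},
    {"id": "system-design", "type": "architecture", "last_verified": date(2025, 9, 1)},
]
stale = stale_documents(docs, today=date(2026, 3, 1))
print(stale)
```

Here the pricing page is flagged after four weeks, while the six-month-old architecture doc is still inside its window. The same age figure can feed a dashboard metric alongside latency and accuracy.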

The Recovery Playbook

  1. Audit document freshness every 60-90 days. Any document exceeding its shelf life requires verification.
  2. Track document age as a metric. Add it to your monitoring dashboard alongside latency and accuracy.
  3. Implement dynamic retrieval. Fetch fresh documents on-demand rather than relying on static indexes.
  4. Log source IDs for every answer. Not just what was returned — which documents fed it, their dates, and their relevance scores.
  5. Add time-weighted metadata. Recent documents should rank higher in retrieval.

The Cost Transparency Angle

Knowledge base decay is invisible until it causes a customer-facing error. A wrong price quote. An outdated compliance policy. A deprecated API endpoint. Each mistake costs money — but the 90-day rot period means the cost accumulates silently. [OBSERVED] The $150K law firm project that cited a non-existent case? The knowledge base hadn’t been updated in 8 months. [SOURCE: Boundev]