The Shift Around Feat(learning-engine): Q-Table & RL

by Jule

The LearningEngine stores Q-values, configuration, and reinforcement learning state entirely in memory, so every trained model, eligibility trace, and actor weight vanishes with a single MCP restart. What users may not realize is that hooks_learning_config settings disappear too, resetting epsilon values and forcing a full re-bootstrap every session.

The root cause runs deep: export() fails to capture eligibilityTraces and actorWeights, import() ignores the stored Maps, and there is no atomic persistence file. On top of that, the default intelligence.json bloats with unbounded reward history, making recovery impractical.
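The Map problem is worth spelling out: JavaScript Maps serialize to `{}` under plain JSON.stringify, so any export that does not convert them to entry arrays silently drops them. Here is a minimal sketch of a round-trip-safe export/import. The field names (qTable, eligibilityTraces, actorWeights) are assumptions for illustration, not confirmed names from the LearningEngine source:

```typescript
// Hypothetical field names, assumed for illustration only.
type SerializedState = {
  qTable: [string, number][];
  eligibilityTraces: [string, number][];
  actorWeights: [string, number[]][];
};

class LearningEngine {
  qTable = new Map<string, number>();
  eligibilityTraces = new Map<string, number>();
  actorWeights = new Map<string, number[]>();

  // Maps are not JSON-serializable directly; convert to entry arrays.
  exportState(): SerializedState {
    return {
      qTable: [...this.qTable.entries()],
      eligibilityTraces: [...this.eligibilityTraces.entries()],
      actorWeights: [...this.actorWeights.entries()],
    };
  }

  // Restore with safe defaults when a field is missing (e.g. older files).
  importState(s: Partial<SerializedState>): void {
    this.qTable = new Map(s.qTable ?? []);
    this.eligibilityTraces = new Map(s.eligibilityTraces ?? []);
    this.actorWeights = new Map(s.actorWeights ?? []);
  }
}
```

Because every Map goes through an entry array, the exported object survives JSON.stringify and a later import rebuilds identical Maps.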

Here is the deal: every feature hinges on reliable state retention. Without persistent Q-tables and full RL context, even the best training feels like starting over daily.

  • Complete export() to serialize all internal Maps, not just core values
  • Make import() restore the full state with safe defaults
  • Save to .ruvector/learning-state.json using atomic tmp → rename for safety
  • Auto-load state on MCP startup to eliminate manual re-boot
  • Cap rewardHistory at 500 to prevent unbounded memory bloat
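The atomic-save, auto-load, and capping steps above can be sketched together. This is a minimal Node sketch under stated assumptions: the state shape here is simplified to just rewardHistory, and the helper names (saveState, loadState) are hypothetical; only the .ruvector/learning-state.json path and the 500-entry cap come from the proposal itself:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

const STATE_PATH = ".ruvector/learning-state.json"; // path from the proposal
const REWARD_HISTORY_CAP = 500;                     // cap from the proposal

// Atomic persistence: write to a temp file, then rename over the target.
// On the same filesystem, rename() is atomic, so a crash mid-write can
// never leave a half-written learning-state.json behind.
function saveState(state: { rewardHistory: number[] }): void {
  // Trim rewardHistory before persisting to prevent unbounded bloat.
  state.rewardHistory = state.rewardHistory.slice(-REWARD_HISTORY_CAP);
  fs.mkdirSync(path.dirname(STATE_PATH), { recursive: true });
  const tmp = STATE_PATH + ".tmp";
  fs.writeFileSync(tmp, JSON.stringify(state));
  fs.renameSync(tmp, STATE_PATH); // atomic swap into place
}

// Auto-load on MCP startup: return persisted state, or null on first run
// so the engine can fall back to a fresh bootstrap.
function loadState(): { rewardHistory: number[] } | null {
  try {
    return JSON.parse(fs.readFileSync(STATE_PATH, "utf8"));
  } catch {
    return null;
  }
}
```

The tmp-then-rename pattern means readers only ever observe either the old complete file or the new complete file, which is exactly the safety property the bullet list asks for.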

Looking ahead, RVF’s POLICY_KERNEL must hold Q-tables as first-class state - extending Thompson Sampling into a full cognitive container for advanced RL. This aligns with emerging standards like ADR-029 (RVF Canonical Format) and ADR-036 (AGI Cognitive Container), turning fragmented state into modular, persistent intelligence.

Users face a quiet crisis: every training gain evaporates on restart. The question isn’t if persistence is needed - it’s when the system will finally catch up.