The Shift Around Feat(learning-engine): Q-Table & RL
The LearningEngine stores Q-values, configuration, and reinforcement-learning state entirely in memory - meaning every trained model, eligibility trace, and actor weight vanishes with a single MCP restart. Less obvious: hooks_learning_config settings disappear too, resetting epsilon values and forcing a full re-bootstrap each session.
The root cause runs deep: export() fails to capture eligibilityTraces and actorWeights, import() ignores the stored Maps, and there is no atomic persistence file. Meanwhile, the default intelligence.json bloats with unbounded reward history, making recovery impractical.
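The export/import gap can be sketched as follows. This is a minimal illustration, not the actual LearningEngine API: the class shape and the `SerializedState` type are assumptions based on the field names in the description.

```typescript
// Hypothetical sketch: Maps are not JSON-serializable directly, so export()
// must convert each one to an entries array, and import() must rebuild them.
type SerializedState = {
  qTable: [string, number][];
  eligibilityTraces: [string, number][];
  actorWeights: [string, number][];
};

class LearningEngine {
  qTable = new Map<string, number>();
  eligibilityTraces = new Map<string, number>();
  actorWeights = new Map<string, number>();

  // Serialize ALL internal Maps, not just the core Q-values.
  export(): SerializedState {
    return {
      qTable: [...this.qTable.entries()],
      eligibilityTraces: [...this.eligibilityTraces.entries()],
      actorWeights: [...this.actorWeights.entries()],
    };
  }

  // Rebuild every Map, falling back to empty state for missing fields.
  import(state: Partial<SerializedState>): void {
    this.qTable = new Map(state.qTable ?? []);
    this.eligibilityTraces = new Map(state.eligibilityTraces ?? []);
    this.actorWeights = new Map(state.actorWeights ?? []);
  }
}
```

With this shape, round-tripping through `JSON.parse(JSON.stringify(engine.export()))` preserves every Map instead of silently dropping traces and weights.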
Here is the deal: every feature hinges on reliable state retention. Without persistent Q-tables and full RL context, even the best training feels like starting over daily.
- Complete `export()` to serialize all internal Maps, not just core values
- Make `import()` fully restore the full state with safe defaults
- Save to `.ruvector/learning-state.json` using atomic tmp → rename for safety
- Auto-load state on MCP startup to eliminate manual re-bootstrapping
- Cap `rewardHistory` at 500 entries to prevent unbounded memory bloat
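The atomic-save and history-cap items can be sketched in Node-flavored TypeScript. `saveStateAtomically` and `pushReward` are hypothetical helper names for illustration, not the project's actual API:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Write to a temp file in the same directory, then rename over the target.
// rename() within one filesystem is atomic on POSIX, so a crash mid-write
// can never leave readers with a half-written learning-state.json.
function saveStateAtomically(filePath: string, state: unknown): void {
  const dir = path.dirname(filePath);
  fs.mkdirSync(dir, { recursive: true });
  const tmp = path.join(dir, `.${path.basename(filePath)}.tmp`);
  fs.writeFileSync(tmp, JSON.stringify(state, null, 2));
  fs.renameSync(tmp, filePath); // atomic replace
}

// Append a reward, keeping only the newest `cap` entries (default 500).
function pushReward(history: number[], reward: number, cap = 500): number[] {
  history.push(reward);
  return history.length > cap ? history.slice(-cap) : history;
}
```

On startup, the engine would read the same path back (if present) and feed it through `import()`, which is what eliminates the manual re-bootstrap.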
Looking ahead, RVF’s POLICY_KERNEL must hold Q-tables as first-class state - extending Thompson Sampling into a full cognitive container for advanced RL. This aligns with emerging standards like ADR-029 (RVF Canonical Format) and ADR-036 (AGI Cognitive Container), turning fragmented state into modular, persistent intelligence.
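As a rough illustration of Thompson Sampling carrying its posterior as first-class, serializable state, here is a minimal Beta-Bernoulli bandit. The `ThompsonBandit` class is purely illustrative and not part of RVF or POLICY_KERNEL; the Beta sampler uses the order-statistic identity (the a-th smallest of a+b-1 uniforms is Beta(a, b)-distributed), which holds for integer parameters.

```typescript
// Sample Beta(a, b) for integer a, b via the a-th order statistic
// of a + b - 1 independent uniforms.
function sampleBeta(a: number, b: number): number {
  const u = Array.from({ length: a + b - 1 }, () => Math.random())
    .sort((x, y) => x - y);
  return u[a - 1];
}

class ThompsonBandit {
  // Per-arm success/failure counts: plain arrays, trivially JSON-persistable,
  // the way a cognitive container could hold policy state across restarts.
  constructor(public wins: number[], public losses: number[]) {}

  // Draw one posterior sample per arm and play the arm with the best draw.
  selectArm(): number {
    let best = 0;
    let bestSample = -1;
    for (let i = 0; i < this.wins.length; i++) {
      const s = sampleBeta(this.wins[i] + 1, this.losses[i] + 1);
      if (s > bestSample) {
        bestSample = s;
        best = i;
      }
    }
    return best;
  }

  update(arm: number, reward: boolean): void {
    if (reward) this.wins[arm]++;
    else this.losses[arm]++;
  }
}
```

Because the posterior is just two count arrays, persisting it is the same tmp → rename write as any other state, which is the point of promoting policy state to first-class data.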
Users face a quiet crisis: every training gain evaporates on restart. The question isn’t if persistence is needed - it’s when the system will finally catch up.