The Shift Around Feat(learning-engine): Q-Table & RL

by Jule

The LearningEngine stores Q-values, configuration, and reinforcement learning state entirely in memory, so every trained model, eligibility trace, and actor weight vanishes with a single MCP restart. What users may not realize is that hooks_learning_config settings disappear too, resetting epsilon values and forcing a full re-bootstrap every session.

The root cause runs deep: export() fails to capture eligibilityTraces and actorWeights, import() ignores the stored Maps, and there is no atomic persistence file. On top of that, the default intelligence.json bloats with unbounded reward history, making recovery impractical.
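The Map problem is worth spelling out: JavaScript Maps serialize to `{}` under plain JSON.stringify, so any export that does not convert them to entry arrays silently drops them. Here is a minimal sketch of a round-trip-safe export/import. The field names (qTable, eligibilityTraces, actorWeights) are assumptions for illustration, not confirmed names from the LearningEngine source:

```typescript
// Hypothetical field names, assumed for illustration only.
type SerializedState = {
  qTable: [string, number][];
  eligibilityTraces: [string, number][];
  actorWeights: [string, number[]][];
};

class LearningEngine {
  qTable = new Map<string, number>();
  eligibilityTraces = new Map<string, number>();
  actorWeights = new Map<string, number[]>();

  // Maps are not JSON-serializable directly; convert to entry arrays.
  exportState(): SerializedState {
    return {
      qTable: [...this.qTable.entries()],
      eligibilityTraces: [...this.eligibilityTraces.entries()],
      actorWeights: [...this.actorWeights.entries()],
    };
  }

  // Restore with safe defaults when a field is missing (e.g. older files).
  importState(s: Partial<SerializedState>): void {
    this.qTable = new Map(s.qTable ?? []);
    this.eligibilityTraces = new Map(s.eligibilityTraces ?? []);
    this.actorWeights = new Map(s.actorWeights ?? []);
  }
}
```

Because every Map goes through an entry array, the exported object survives JSON.stringify and a later import rebuilds identical Maps.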

Here is the deal: every feature hinges on reliable state retention. Without persistent Q-tables and full RL context, even the best training feels like starting over daily.

  • Complete export() to serialize all internal Maps, not just core values
  • Make import() restore the full state with safe defaults
  • Save to .ruvector/learning-state.json using atomic tmp → rename for safety
  • Auto-load state on MCP startup to eliminate manual re-boot
  • Cap rewardHistory at 500 to prevent unbounded memory bloat
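The atomic-save, auto-load, and capping steps above can be sketched together. This is a minimal Node sketch under stated assumptions: the state shape here is simplified to just rewardHistory, and the helper names (saveState, loadState) are hypothetical; only the .ruvector/learning-state.json path and the 500-entry cap come from the proposal itself:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

const STATE_PATH = ".ruvector/learning-state.json"; // path from the proposal
const REWARD_HISTORY_CAP = 500;                     // cap from the proposal

// Atomic persistence: write to a temp file, then rename over the target.
// On the same filesystem, rename() is atomic, so a crash mid-write can
// never leave a half-written learning-state.json behind.
function saveState(state: { rewardHistory: number[] }): void {
  // Trim rewardHistory before persisting to prevent unbounded bloat.
  state.rewardHistory = state.rewardHistory.slice(-REWARD_HISTORY_CAP);
  fs.mkdirSync(path.dirname(STATE_PATH), { recursive: true });
  const tmp = STATE_PATH + ".tmp";
  fs.writeFileSync(tmp, JSON.stringify(state));
  fs.renameSync(tmp, STATE_PATH); // atomic swap into place
}

// Auto-load on MCP startup: return persisted state, or null on first run
// so the engine can fall back to a fresh bootstrap.
function loadState(): { rewardHistory: number[] } | null {
  try {
    return JSON.parse(fs.readFileSync(STATE_PATH, "utf8"));
  } catch {
    return null;
  }
}
```

The tmp-then-rename pattern means readers only ever observe either the old complete file or the new complete file, which is exactly the safety property the bullet list asks for.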

Looking ahead, RVF’s POLICY_KERNEL must hold Q-tables as first-class state - extending Thompson Sampling into a full cognitive container for advanced RL. This aligns with emerging standards like ADR-029 (RVF Canonical Format) and ADR-036 (AGI Cognitive Container), turning fragmented state into modular, persistent intelligence.

Users face a quiet crisis: every training gain evaporates on restart. The question isn’t if persistence is needed - it’s when the system will finally catch up.