Core Engine¶
The Core Engine is the heart of RLM Code. It implements the Recursive Language Model paradigm from the 2025 research paper, providing a complete runtime for context-as-variable reasoning, iterative code execution, reward-driven optimization, and multi-paradigm orchestration.
Architecture Overview¶
The Core Engine follows a context -> action proposal -> sandbox execution -> observation -> reward -> memory update loop. Unlike traditional coding agents that load full context into the LLM's token window, RLM Code stores context as REPL variables and exposes only metadata (type, length, preview) to the LLM. The LLM then accesses the data programmatically through code execution.
                   +------------------+
                   |    RLMRunner     |
                   |  (Orchestrator)  |
                   +--------+---------+
                            |
        +-------------------+-------------------+
        |                   |                   |
+-------v------+    +-------v------+    +-------v--------+
| Pure RLM Env |    |   DSPy Env   |    |  Generic Env   |
| (Paper-exact)|    | (DSPy-aware) |    | (General use)  |
+--------------+    +--------------+    +----------------+
        |                   |                   |
        +-------------------+-------------------+
                            |
                   +--------v---------+
                   |    Event Bus     |
                   | (27+ event types)|
                   +------------------+
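The loop described above can be sketched as follows. This is purely illustrative: the LLM, sandbox, and reward logic are stubbed out, and none of the names below belong to the actual RLMRunner API.
def fake_llm(prompt: str) -> str:
    # Stand-in for the model: here it always proposes code that terminates immediately.
    return "FINAL('no TODOs found')"

def sandbox_exec(code: str, namespace: dict) -> str:
    # Stand-in for sandboxed execution: evaluate the proposed code, capture an observation.
    try:
        return str(eval(code, {"FINAL": lambda answer: f"FINAL:{answer}"}, namespace))
    except Exception as exc:
        return f"error: {exc}"

namespace = {"ctx": "TODO: fix parser\n" * 1_000}                # context-as-variable
memory: list[str] = []

for step in range(5):                                            # max_steps
    meta = f"ctx: str, {len(namespace['ctx'])} chars, preview={namespace['ctx'][:40]!r}"
    code = fake_llm(f"Task: list TODOs. Variables: {meta}. History: {memory[-3:]}")
    observation = sandbox_exec(code, namespace)
    reward = 0.5 if not observation.startswith("error") else -0.5  # toy reward in [-1, 1]
    memory.append(f"{code} -> {observation[:60]} (r={reward:+.1f})")
    if observation.startswith("FINAL:"):                         # termination pattern
        break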
Subsystems¶
The Core Engine is composed of several tightly integrated subsystems:
| Subsystem | Module | Purpose |
|---|---|---|
| Runner | rlm_code.rlm.runner | Multi-paradigm orchestrator with trajectory persistence |
| Environments | rlm_code.rlm.environments, rlm_code.rlm.pure_rlm_environment | Execution environments with reward profiles |
| Execution Patterns | rlm_code.rlm.runner, rlm_code.harness.runner | Pure recursive, harness, and direct-baseline execution modes |
| Event System | rlm_code.rlm.events | Pub-sub event bus for observability and UI |
| Termination | rlm_code.rlm.termination | FINAL/FINAL_VAR termination patterns |
| Memory Compaction | rlm_code.rlm.memory_compaction | Context window management via summarization |
| REPL Types | rlm_code.rlm.repl_types | Foundation types for context-as-variable paradigm |
| Trajectory | rlm_code.rlm.trajectory | JSONL trajectory logging and visualization |
| Paradigm Comparison | rlm_code.rlm.comparison | Side-by-side paradigm benchmarking |
Key Concepts¶
Context-as-Variable¶
The central innovation of RLM: instead of injecting full context into the LLM prompt (consuming tokens), the context is stored as a Python variable in the REPL namespace. The LLM receives only lightweight metadata -- the variable name, type, character count, and a short preview -- and accesses the data through code.
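As a concrete (purely illustrative) contrast, the two prompting styles look roughly like this; the variable name and wording are made up for the example:
log_text = "INFO boot ok\n" * 50_000 + "ERROR disk full\n"   # stand-in for a large context

# Traditional agent: the whole string is pasted into the prompt (token-expensive).
full_prompt = f"Find the first error in:\n{log_text}"

# Context-as-variable: the string stays in the REPL namespace; the prompt carries only metadata.
metadata_prompt = (
    f"Variable `log_text`: {type(log_text).__name__}, {len(log_text)} chars, "
    f"preview={log_text[:80]!r}. Write Python that inspects `log_text` to answer."
)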
Reward-Driven Optimization¶
Every action produces a scalar reward in the range [-1.0, 1.0]. The RLMRewardProfile provides 25+ configurable knobs for tuning reward signals across different action types, including code execution success/failure, DSPy pattern matching, verifier suites, and warning penalties.
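The sketch below shows the general shape of such a profile. The field names are hypothetical, not the actual RLMRewardProfile attributes; the point is that each action type maps to a configurable scalar and the total stays clamped to [-1.0, 1.0].
from dataclasses import dataclass

@dataclass
class RewardKnobs:                        # illustrative stand-in, not RLMRewardProfile
    exec_success: float = 0.3             # code executed without errors
    exec_failure: float = -0.4            # code raised or failed to run
    warning_penalty: float = -0.05        # per warning emitted during execution
    verifier_pass: float = 0.5            # a configured verifier suite passed

def score(knobs: RewardKnobs, succeeded: bool, warnings: int, verified: bool) -> float:
    reward = knobs.exec_success if succeeded else knobs.exec_failure
    reward += warnings * knobs.warning_penalty
    reward += knobs.verifier_pass if verified else 0.0
    return max(-1.0, min(1.0, reward))    # rewards always land in [-1.0, 1.0]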
Recursive LLM Calls¶
Within the REPL, the LLM can call llm_query() to issue a single sub-LLM query and llm_query_batched() to fan out many sub-queries concurrently. This enables recursive decomposition of complex tasks -- the hallmark of the RLM paradigm.
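For example, a step inside the REPL might fan a large context out to sub-LLMs and merge the results. llm_query() and llm_query_batched() are injected by the environment; the exact signatures shown (prompt in, text out) are an assumption for this sketch, and ctx is a previously stored REPL variable.
# Runs inside the REPL, where ctx, llm_query, and llm_query_batched already exist.
chunks = [ctx[i:i + 20_000] for i in range(0, len(ctx), 20_000)]

# Fan out: one concurrent sub-LLM query per chunk.
partials = llm_query_batched([f"List the TODO comments in:\n{chunk}" for chunk in chunks])

# Fan in: a single recursive query merges the partial answers.
answer = llm_query("Merge these TODO lists, removing duplicates:\n" + "\n".join(partials))
print(answer)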
Event-Driven Architecture¶
The runner publishes 27+ event types through the RLMEventBus, enabling real-time UI updates, observability sinks, and execution tracing without coupling the core engine to any specific consumer.
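A minimal pub-sub sketch of the idea (not the actual RLMEventBus API or event names): the runner publishes typed events, and consumers register callbacks without the engine knowing who is listening.
from collections import defaultdict
from typing import Callable

class ToyEventBus:                        # illustrative only; see rlm_code.rlm.events for the real bus
    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        for handler in self._handlers[event_type]:
            handler(payload)

bus = ToyEventBus()
bus.subscribe("step_completed", lambda event: print(f"step {event['step']} reward={event['reward']}"))
bus.publish("step_completed", {"step": 1, "reward": 0.5})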
Quick Start¶
from pathlib import Path

from rlm_code.rlm.runner import RLMRunner

runner = RLMRunner(
    llm_connector=my_connector,
    execution_engine=my_engine,
    workdir=Path("/my/project"),
)

result = runner.run_task(
    task="Analyze the codebase and find all TODO comments",
    environment="pure_rlm",
    max_steps=5,
)

print(f"Completed: {result.completed}")
print(f"Answer: {result.final_response}")
print(f"Steps: {result.steps}, Reward: {result.total_reward}")
Module Map¶
rlm_code/rlm/
    __init__.py
    runner.py               # RLMRunner orchestrator
    environments.py         # RLMEnvironment protocol, Generic, DSPy environments
    pure_rlm_environment.py # Paper-compliant Pure RLM environment
    events.py               # Event bus and event types
    termination.py          # FINAL/FINAL_VAR patterns
    memory_compaction.py    # Context window management
    repl_types.py           # REPLVariable, REPLHistory, REPLEntry, REPLResult
    trajectory.py           # JSONL logging and visualization
    comparison.py           # Paradigm comparison framework
    benchmarks.py           # Benchmark case definitions
    config_schema.py        # Configuration schema
    observability.py        # Observability hooks
    context_store.py        # Lazy file context loading
    visualizer.py           # Run visualization