Core Engine

The Core Engine is the heart of RLM Code. It implements the Recursive Language Model paradigm from the 2025 research paper, providing a complete runtime for context-as-variable reasoning, iterative code execution, reward-driven optimization, and multi-paradigm orchestration.


Architecture Overview

The Core Engine follows a context -> action proposal -> sandbox execution -> observation -> reward -> memory update loop. Unlike traditional coding agents that load full context into the LLM's token window, RLM Code stores context as REPL variables and exposes only metadata (type, length, preview) to the LLM. The LLM then accesses the data programmatically through code execution.

                    +-------------------+
                    |     RLMRunner     |
                    |  (Orchestrator)   |
                    +---------+---------+
                              |
          +-------------------+-------------------+
          |                   |                   |
  +-------v-------+   +-------v-------+   +-------v-------+
  | Pure RLM Env  |   |   DSPy Env    |   |  Generic Env  |
  | (Paper-exact) |   | (DSPy-aware)  |   | (General use) |
  +-------+-------+   +-------+-------+   +-------+-------+
          |                   |                   |
          +-------------------+-------------------+
                              |
                    +---------v---------+
                    |     Event Bus     |
                    | (27+ event types) |
                    +-------------------+
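
The loop above can be sketched in a few lines of Python. Everything here (the propose/execute/score stubs and `run_loop` helper) is illustrative scaffolding to show the control flow, not the actual RLMRunner internals:

```python
# Minimal sketch of the RLM loop: propose -> execute -> observe -> reward -> remember.
# All function names here are illustrative stubs, not the real RLMRunner API.

def propose_action(task: str, memory: list[str]) -> str:
    """Stand-in for the LLM proposing a code action from the task + memory."""
    return f"print(len({task!r}))"

def execute_in_sandbox(code: str) -> str:
    """Stand-in for sandboxed REPL execution returning an observation."""
    return f"executed: {code}"

def score(observation: str) -> float:
    """Stand-in for the reward signal; real rewards lie in [-1.0, 1.0]."""
    return 1.0 if observation.startswith("executed") else -1.0

def run_loop(task: str, max_steps: int = 3) -> tuple[list[str], float]:
    memory: list[str] = []
    total_reward = 0.0
    for _ in range(max_steps):
        action = propose_action(task, memory)          # context -> action proposal
        observation = execute_in_sandbox(action)       # sandbox execution -> observation
        total_reward += score(observation)             # observation -> reward
        memory.append(observation)                     # memory update feeds the next step
    return memory, total_reward

memory, reward = run_loop("find TODOs", max_steps=2)
```

The point of the sketch is that the LLM only ever sees what the loop feeds it: each iteration's observation enters memory, and the next proposal is made from that memory rather than from the raw context.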

Subsystems

The Core Engine is composed of several tightly integrated subsystems:

Subsystem            Module                                                          Purpose
Runner               rlm_code.rlm.runner                                             Multi-paradigm orchestrator with trajectory persistence
Environments         rlm_code.rlm.environments, rlm_code.rlm.pure_rlm_environment    Execution environments with reward profiles
Execution Patterns   rlm_code.rlm.runner, rlm_code.harness.runner                    How to run pure recursive mode vs harness vs direct baseline
Event System         rlm_code.rlm.events                                             Pub-sub event bus for observability and UI
Termination          rlm_code.rlm.termination                                        FINAL/FINAL_VAR termination patterns
Memory Compaction    rlm_code.rlm.memory_compaction                                  Context window management via summarization
REPL Types           rlm_code.rlm.repl_types                                         Foundation types for context-as-variable paradigm
Trajectory           rlm_code.rlm.trajectory                                         JSONL trajectory logging and visualization
Paradigm Comparison  rlm_code.rlm.comparison                                         Side-by-side paradigm benchmarking

Key Concepts

Context-as-Variable

The central innovation of RLM: instead of injecting full context into the LLM prompt (consuming tokens), the context is stored as a Python variable in the REPL namespace. The LLM receives only lightweight metadata -- the variable name, type, character count, and a short preview -- and accesses the data through code.
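
As a rough illustration of the idea (not the library's actual types from rlm_code.rlm.repl_types), the metadata exposed to the LLM can be reduced to something like this, where `describe_variable` is a hypothetical helper:

```python
# Illustrative sketch of context-as-variable: the REPL namespace holds the full
# data, while the LLM prompt only ever carries lightweight metadata about it.

def describe_variable(name: str, value: str, preview_chars: int = 40) -> dict:
    """Build the small description shown to the LLM instead of the raw content."""
    return {
        "name": name,
        "type": type(value).__name__,
        "length": len(value),
        "preview": value[:preview_chars],
    }

# The REPL namespace stores the full context (here, ~210k characters)...
namespace = {"codebase_dump": "def main():\n    pass\n" * 10_000}

# ...but the prompt only carries this few-line description of it.
meta = describe_variable("codebase_dump", namespace["codebase_dump"])
```

The LLM then writes code against `codebase_dump` by name (searching it, slicing it, counting matches), so the full content never has to pass through the token window.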

Reward-Driven Optimization

Every action produces a scalar reward in the range [-1.0, 1.0]. The RLMRewardProfile provides 25+ configurable knobs for tuning reward signals across different action types, including code execution success/failure, DSPy pattern matching, verifier suites, and warning penalties.
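
The shape of such a profile can be pictured as a bundle of weighted signals summed and clamped into [-1.0, 1.0]. The class and field names below are hypothetical stand-ins, not the actual RLMRewardProfile knobs:

```python
from dataclasses import dataclass

# Hypothetical sketch of a reward profile: a few weighted signals are summed
# and the result is clamped into [-1.0, 1.0]. Field names are illustrative.

@dataclass
class RewardProfileSketch:
    exec_success: float = 0.5      # reward for a successful code execution
    exec_failure: float = -0.6     # penalty for a failed execution
    warning_penalty: float = -0.1  # per-warning penalty on top of the base signal

    def score(self, succeeded: bool, warnings: int = 0) -> float:
        raw = self.exec_success if succeeded else self.exec_failure
        raw += warnings * self.warning_penalty
        return max(-1.0, min(1.0, raw))  # rewards always stay in [-1.0, 1.0]

profile = RewardProfileSketch()
```

Making each signal a named field is what turns the reward function into a set of tunable knobs: different environments can ship different defaults without changing the scoring code.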

Recursive LLM Calls

Within the REPL, the LLM can call llm_query() for a single sub-LLM query and llm_query_batched() for multiple concurrent queries. This enables recursive decomposition of complex tasks -- the hallmark of the RLM paradigm.
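
The decomposition pattern can be mimicked end to end with a stubbed sub-LLM. Here `llm_query` is a stand-in returning canned text, and `llm_query_batched_sketch` is a hypothetical helper that uses a thread pool the way a batched variant might:

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch of recursive decomposition: split a task into sub-questions, answer
# them concurrently, then combine. llm_query is a stub, not a real LLM call.

def llm_query(prompt: str) -> str:
    return f"answer to: {prompt}"

def llm_query_batched_sketch(prompts: list[str]) -> list[str]:
    """Run several sub-queries concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(llm_query, prompts))

# Decompose one task into sub-questions, answer them in parallel, then combine.
subtasks = [f"summarize chunk {i}" for i in range(3)]
partials = llm_query_batched_sketch(subtasks)
final = llm_query("combine: " + "; ".join(partials))
```

Because each sub-query is itself an LLM call, the same split/answer/combine step can recurse inside a sub-query, which is where the "recursive" in Recursive Language Model comes from.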

Event-Driven Architecture

The runner publishes 27+ event types through the RLMEventBus, enabling real-time UI updates, observability sinks, and execution tracing without coupling the core engine to any specific consumer.
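
The decoupling works like any pub-sub bus: consumers subscribe by event type and the runner publishes without knowing who is listening. A minimal sketch of the idea (the class and method names are illustrative, not the actual RLMEventBus API):

```python
from collections import defaultdict
from typing import Callable

# Minimal pub-sub sketch: handlers register per event type; publishers fire
# events without any reference to their consumers. Names are illustrative.

class EventBusSketch:
    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        for handler in self._subscribers[event_type]:
            handler(payload)

bus = EventBusSketch()
seen: list[dict] = []
bus.subscribe("step_completed", seen.append)
bus.publish("step_completed", {"step": 1, "reward": 0.5})
bus.publish("unrelated_event", {"ignored": True})  # no subscriber, no effect
```

A UI, a log sink, and a tracer can all subscribe to the same events independently, which is what keeps the core engine free of consumer-specific code.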


Quick Start

from pathlib import Path

from rlm_code.rlm.runner import RLMRunner

runner = RLMRunner(
    llm_connector=my_connector,
    execution_engine=my_engine,
    workdir=Path("/my/project"),
)

result = runner.run_task(
    task="Analyze the codebase and find all TODO comments",
    environment="pure_rlm",
    max_steps=5,
)

print(f"Completed: {result.completed}")
print(f"Answer: {result.final_response}")
print(f"Steps: {result.steps}, Reward: {result.total_reward}")

Module Map

rlm_code/rlm/
  __init__.py
  runner.py              # RLMRunner orchestrator
  environments.py        # RLMEnvironment protocol, Generic, DSPy environments
  pure_rlm_environment.py # Paper-compliant Pure RLM environment
  events.py              # Event bus and event types
  termination.py         # FINAL/FINAL_VAR patterns
  memory_compaction.py   # Context window management
  repl_types.py          # REPLVariable, REPLHistory, REPLEntry, REPLResult
  trajectory.py          # JSONL logging and visualization
  comparison.py          # Paradigm comparison framework
  benchmarks.py          # Benchmark case definitions
  config_schema.py       # Configuration schema
  observability.py       # Observability hooks
  context_store.py       # Lazy file context loading
  visualizer.py          # Run visualization