
🔬 Research Tab

The Research tab is the 5th tab in the RLM Code TUI, accessible via Ctrl+5 / F6. It provides a dedicated space for experiment tracking, trajectory viewing, benchmarks, session replay, and live event streaming, all wired to real data from the RLM runner.


๐Ÿ“ How to Access

Method Action
โŒจ๏ธ Keyboard Ctrl+5 or F6
๐Ÿ–ฑ๏ธ Click Click ๐Ÿ”ฌ Research in the focus bar
๐Ÿ’ฌ Command /view research

๐Ÿ—‚๏ธ Sub-Tabs

The Research tab organizes its data into 5 sub-tabs, shown as a row of buttons at the top of the pane. Click a button to switch views.

┌─────────────────────────────────────────────────────────┐
│ 🔬 Research                                             │
│ [Dashboard] [Trajectory] [Benchmarks] [Replay] [Events] │
│                                                         │
│  ┌─ Content area (changes per sub-tab) ──────────────┐  │
│  │                                                   │  │
│  │                                                   │  │
│  └───────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────┘

📊 Dashboard

The default sub-tab. Shows a high-level summary of the most recent RLM run.

Widgets

| Widget | What It Shows |
| --- | --- |
| 🏷️ MetricsPanel | Run ID, status (color-coded), cumulative reward, step count, tokens, cost, duration |
| 📈 SparklineChart | ASCII reward curve using Unicode block characters (▁▂▃▄▅▆▇█) |
| 📝 Summary | One-line result summary of the run |

How It Populates

  1. Run /rlm run "your task" or /rlm bench preset=dspy_quick in the RLM tab
  2. When the run completes, the Dashboard auto-populates via build_run_visualization()
  3. The MetricsPanel updates its reactive properties (run_id, status, reward, steps, etc.)
  4. The SparklineChart fills with cumulative reward values from the trajectory
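The last step can be illustrated with a minimal sketch of how per-step rewards might become a cumulative curve drawn with block characters. This is not the SparklineChart implementation: `render_sparkline` and the sample rewards are hypothetical, and the real widget may scale or color values differently.

```python
from itertools import accumulate

BLOCKS = "▁▂▃▄▅▆▇█"

def render_sparkline(values):
    """Map each value onto one of eight block characters (hypothetical helper)."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no range to scale; draw it at mid-height.
        return BLOCKS[3] * len(values)
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - lo) / span * top)] for v in values)

# Hypothetical per-step rewards; the chart plots their running total.
step_rewards = [0.15, 0.10, -0.05, 0.20, 0.32]
cumulative = list(accumulate(step_rewards))
print(render_sparkline(cumulative))
```

Scaling against the series minimum and maximum keeps the curve readable for runs with small rewards, at the cost of hiding absolute magnitude.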

📊 Live Updates

During an active run, the SparklineChart updates in real time as each iteration completes and emits an ITERATION_END event.

Data Source

from rlm_code.rlm.visualizer import build_run_visualization

viz = build_run_visualization(run_path=run_path, run_dir=run_path.parent)
# viz["run_id"], viz["status"], viz["total_reward"],
# viz["step_count"], viz["reward_curve"], viz["timeline"]

📈 Trajectory

Step-by-step timeline of the RLM run, showing what the agent did at each step.

Table Columns

| Column | Description |
| --- | --- |
| 🔢 Step | Step number |
| ⚡ Action | Action type (e.g., code_generation, validation) |
| 🏆 Reward | Step reward (color-coded: 🟢 positive, 🔴 negative) |
| 🔤 Tokens | Tokens consumed in this step |
| ✅ Success | Whether the step succeeded |

How It Populates

After any /rlm run or /rlm bench command, the trajectory is extracted from build_run_visualization()["timeline"] and rendered as a Rich table.
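As a rough illustration of the rendering step, the sketch below formats a timeline into aligned rows using only the standard library. The entry keys (`step`, `action`, `reward`, `tokens`, `success`) are assumptions chosen to match the columns above; the actual shape of the `timeline` entries is not documented here, and the TUI itself uses a Rich table rather than plain text.

```python
# Hypothetical timeline entries; the real key names in viz["timeline"] may differ.
timeline = [
    {"step": 1, "action": "code_generation", "reward": 0.15, "tokens": 450, "success": True},
    {"step": 2, "action": "validation", "reward": -0.05, "tokens": 120, "success": False},
]

def format_trajectory(entries):
    """Return the timeline as aligned plain-text rows, one per step."""
    header = f"{'Step':>4}  {'Action':<16}  {'Reward':>7}  {'Tokens':>6}  Success"
    lines = [header]
    for e in entries:
        lines.append(
            f"{e['step']:>4}  {e['action']:<16}  {e['reward']:>+7.2f}  "
            f"{e['tokens']:>6}  {'yes' if e['success'] else 'no'}"
        )
    return "\n".join(lines)

print(format_trajectory(timeline))
```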


๐Ÿ† Benchmarks

Displays the leaderboard table from benchmark runs.

What You See

A Rich table ranked by reward, showing:

| Column | Description |
| --- | --- |
| 🏅 Rank | Position on the leaderboard |
| 🏷️ ID | Run identifier |
| 🌍 Environment | pure_rlm, codeact, or generic |
| 🤖 Model | Model used |
| 🏆 Reward | Average reward (color-coded) |
| 📊 Completion | Completion rate |
| 🔢 Steps | Step count |
| 🔤 Tokens | Total tokens |

How It Populates

Run /rlm bench preset=<name> then switch to Research → Benchmarks. The data comes from:

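from pathlib import Path

from rlm_code.rlm.leaderboard import Leaderboard

# Load saved benchmark results from the project's .rlm_code directory
lb = Leaderboard(workdir=Path.cwd() / ".rlm_code", auto_load=True)
# Render the top 15 entries as a Rich table
table = lb.format_rich_table(limit=15)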
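To make "ranked by reward" concrete, here is a small sketch of the ordering such a table implies. The `Entry` record and `rank` function are hypothetical and do not reflect the Leaderboard class's internals; tie-breaking in particular is left to the stable sort and may not match the real behavior.

```python
from dataclasses import dataclass

@dataclass
class Entry:  # hypothetical record, not the real Leaderboard entry type
    run_id: str
    environment: str
    avg_reward: float

def rank(entries, limit=15):
    """Sort by average reward, highest first, and attach 1-based ranks."""
    ordered = sorted(entries, key=lambda e: e.avg_reward, reverse=True)
    return [(pos, e) for pos, e in enumerate(ordered[:limit], start=1)]

entries = [
    Entry("run_a", "pure_rlm", 0.61),
    Entry("run_b", "codeact", 0.72),
    Entry("run_c", "generic", 0.40),
]
for pos, e in rank(entries):
    print(pos, e.run_id, e.environment, f"{e.avg_reward:.2f}")
```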

โช Replay

Step-through controls for time-travel debugging of any RLM run.

Controls

| Button | Action |
| --- | --- |
| \|< ⏮️ | Jump to first step |
| < ◀️ | Step backward |
| > ▶️ | Step forward |
| >\| ⏭️ | Jump to last step |

What You See

  • Step position: Step 3/8 indicator
  • Step detail: Action code with syntax highlighting, output, reward, cumulative reward
  • Reward curve: SparklineChart showing the full reward trajectory with current position

How It Populates

Run /rlm status to get a run id, then /rlm replay <run_id>. The TUI automatically switches to Research → Replay and loads the session:

from rlm_code.rlm.session_replay import SessionReplayer

replayer = SessionReplayer.from_jsonl(run_path)
replayer.step_forward()    # advance one step
replayer.step_backward()   # go back one step
replayer.goto_step(n)      # jump to step n
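The four replay buttons map naturally onto a bounded cursor. The sketch below shows one way that position tracking could work; `ReplayCursor` is a hypothetical stand-in, not SessionReplayer itself. Clamping at both ends and the zero-based internal index are assumptions, and whether `goto_step(n)` is zero- or one-based is not stated here.

```python
class ReplayCursor:
    """Hypothetical stand-in for a replayer's position tracking."""

    def __init__(self, total_steps):
        if total_steps < 1:
            raise ValueError("a replay needs at least one step")
        self.total = total_steps
        self.index = 0  # zero-based; shown to the user as index + 1

    def goto(self, n):
        # Clamp so that stepping past either end is a no-op (an assumption).
        self.index = max(0, min(self.total - 1, n))
        return self.index

    def first(self):
        return self.goto(0)

    def back(self):
        return self.goto(self.index - 1)

    def forward(self):
        return self.goto(self.index + 1)

    def last(self):
        return self.goto(self.total - 1)

    def label(self):
        return f"Step {self.index + 1}/{self.total}"

cursor = ReplayCursor(total_steps=8)
cursor.forward()
cursor.forward()
print(cursor.label())
```

Routing every move through `goto` keeps the bounds check in one place, so the button handlers cannot push the position out of range.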

Reward Color Coding

| Reward | Color |
| --- | --- |
| >= 0.8 | 🟢 Bright green |
| >= 0.5 | 🟢 Green |
| >= 0.3 | 🟡 Yellow |
| >= 0.0 | 🟠 Orange |
| < 0.0 | 🔴 Red |
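The thresholds above amount to a first-match lookup from highest to lowest. A minimal sketch follows; the color strings are placeholder names for illustration, since the TUI's actual style identifiers are not shown in this document.

```python
# Thresholds checked from highest to lowest; the first match wins.
REWARD_COLORS = [
    (0.8, "bright_green"),
    (0.5, "green"),
    (0.3, "yellow"),
    (0.0, "orange"),  # placeholder name; the TUI's real style names are an unknown
]

def reward_color(reward):
    """Return the color band for a reward, falling back to red below zero."""
    for threshold, color in REWARD_COLORS:
        if reward >= threshold:
            return color
    return "red"

for value in (0.85, 0.5, 0.31, 0.0, -0.2):
    print(value, reward_color(value))
```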

📡 Events

Live event stream from the RLM event bus, showing real-time progress during active runs.

What You See

A RichLog widget that streams formatted events with timestamps:

[14:23:01] 🟢 RUN_START - Starting run abc123 (pure_rlm)
[14:23:02] 🔵 ITERATION_START - Step 1/8
[14:23:04] 🟡 LLM_CALL - Calling claude-sonnet-4-20250514 (450 tokens)
[14:23:06] 🟢 ITERATION_END - Step 1 complete (reward: +0.15)
[14:23:08] 🟢 RUN_END - Run complete (total reward: 0.72)
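A line in this format could be produced by a formatter along the following lines. This is a sketch only: `format_event`, the `MARKERS` mapping, and the fallback marker are hypothetical, inferred from the sample log rather than taken from the event bus code.

```python
from datetime import datetime

# Hypothetical marker-per-event mapping, inferred from the sample log.
MARKERS = {
    "RUN_START": "🟢",
    "ITERATION_START": "🔵",
    "LLM_CALL": "🟡",
    "ITERATION_END": "🟢",
    "RUN_END": "🟢",
}

def format_event(when, event_type, message):
    """Build one log line: [HH:MM:SS] marker EVENT_TYPE - message."""
    marker = MARKERS.get(event_type, "⚪")  # assumed fallback for unmapped types
    return f"[{when:%H:%M:%S}] {marker} {event_type} - {message}"

print(format_event(datetime(2025, 1, 1, 14, 23, 2), "ITERATION_START", "Step 1/8"))
```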

Event Types

The event bus supports 27+ event types including:

| Event | Description |
| --- | --- |
| RUN_START / RUN_END | 🏁 Run lifecycle |
| ITERATION_START / ITERATION_END | 🔄 Step lifecycle |
| LLM_CALL / LLM_RESPONSE | 🤖 Model interactions |
| SANDBOX_EXEC | 📦 Code execution |
| REWARD_COMPUTED | 🏆 Reward calculation |
| MEMORY_COMPACTED | 🧹 Memory compaction |
| APPROVAL_REQUESTED | 🔒 HITL gates |

Thread Safety

Events flow from the RLM runner (which runs in a worker thread) to the UI via call_from_thread() for thread-safe rendering:

def _on_raw_rlm_event(self, event):
    self.call_from_thread(self._on_rlm_event, event)
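The same handoff pattern can be shown without Textual. The sketch below swaps `call_from_thread()` for a standard-library `queue.Queue`: the worker only enqueues events, and a single consumer thread owns all rendered state. The event tuples and the `runner` function are hypothetical; this illustrates the pattern and is not how the TUI is implemented.

```python
import queue
import threading

events = queue.Queue()

def runner(total_steps):
    """Worker thread: emits events but never touches UI state directly."""
    events.put(("RUN_START", None))
    for step in range(1, total_steps + 1):
        events.put(("ITERATION_END", step))
    events.put(("RUN_END", None))

rendered = []  # stands in for the event log widget

worker = threading.Thread(target=runner, args=(3,))
worker.start()

# "UI" thread: the only place that mutates rendered state.
while True:
    event_type, payload = events.get()
    rendered.append(event_type if payload is None else f"{event_type} {payload}")
    if event_type == "RUN_END":
        break
worker.join()
print(rendered)
```

Either way, the point is the same: the runner thread hands events over and the UI thread alone applies them, so widgets are never mutated concurrently.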

🔗 Integration with Slash Commands

The Research tab auto-updates when you run RLM commands in the RLM tab:

| Command | What Updates |
| --- | --- |
| /rlm run "..." | 📊 Dashboard + 📈 Trajectory + 📡 Events |
| /rlm bench preset=... | 📊 Dashboard + 📈 Trajectory + 🏆 Benchmarks + 📡 Events |
| /rlm replay | ⏪ Replay (auto-switches to Replay sub-tab) |
| /rlm bench compare ... | 🏆 Benchmarks + compare summary |

🎨 Visual Design

The Research tab inherits the TUI's purple-accented dark theme with additional styling for research-specific elements:

| Element | Style |
| --- | --- |
| Sub-tab buttons | Active = primary variant, Inactive = default |
| Metrics panel | Titled Rich Panel with color-coded status |
| Sparkline | Unicode block chars with reward-based colors |
| Event log | Black background, light text, markup-enabled |
| Replay controls | Compact button row with step position indicator |

📊 Widgets Used

| Widget | Module | Purpose |
| --- | --- | --- |
| MetricsPanel | rlm_code.rlm.research_tui.widgets.panels | Run dashboard metrics |
| SparklineChart | rlm_code.rlm.research_tui.widgets.animated | Reward curve visualization |
| RichLog | textual.widgets | Event stream display |
| Static | textual.widgets | Trajectory table, summary, replay detail |
| Button | textual.widgets | Sub-tab buttons, replay controls |