Research Tab
The Research tab is the 5th tab in the RLM Code TUI, accessible via Ctrl+5 / F6. It provides a dedicated space for experiment tracking, trajectory viewing, benchmarks, session replay, and live event streaming, all wired to real data from the RLM runner.
How to Access
| Method | Action |
|---|---|
| Keyboard | Ctrl+5 or F6 |
| Click | Click Research in the focus bar |
| Command | /view research |
Sub-Tabs
The Research tab organizes data across 5 sub-tabs, each shown as a button in the bar at the top of the pane. Click a button to switch views.
┌─────────────────────────────────────────────────────────┐
│ Research                                                │
│ [Dashboard] [Trajectory] [Benchmarks] [Replay] [Events] │
│                                                         │
│ ┌─ Content area (changes per sub-tab) ────────────────┐ │
│ │                                                     │ │
│ │                                                     │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
Dashboard
The default sub-tab. Shows a high-level summary of the most recent RLM run.
Widgets
| Widget | What It Shows |
|---|---|
| MetricsPanel | Run ID, status (color-coded), cumulative reward, step count, tokens, cost, duration |
| SparklineChart | Text reward curve drawn with Unicode block characters (▁▂▃▄▅▆▇█) |
| Summary | One-line result summary of the run |
How It Populates
- Run /rlm run "your task" or /rlm bench preset=dspy_quick in the RLM tab
- When the run completes, the Dashboard auto-populates via build_run_visualization()
- The MetricsPanel updates its reactive properties (run_id, status, reward, steps, etc.)
- The SparklineChart fills with cumulative reward values from the trajectory
Live Updates
During an active run, the SparklineChart updates in real-time as each iteration completes and emits an ITERATION_END event.
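The mapping from a reward series to block characters can be sketched as follows. This is a minimal illustration, not the SparklineChart implementation: the function name `render_sparkline` and the min-max scaling are assumptions made for the example.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # eight Unicode block characters, lowest to highest


def render_sparkline(values: list[float]) -> str:
    """Map each value to a block character using min-max scaling (hypothetical helper)."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no range to scale, so draw it at the lowest level.
        return BLOCKS[0] * len(values)
    chars = []
    for v in values:
        # Scale into 0..7 and round to the nearest block index.
        idx = round((v - lo) / span * (len(BLOCKS) - 1))
        chars.append(BLOCKS[idx])
    return "".join(chars)
```

Calling `render_sparkline` again with the extended series after each completed iteration would redraw the curve with the new point included.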
Data Source
from rlm_code.rlm.visualizer import build_run_visualization
viz = build_run_visualization(run_path=run_path, run_dir=run_path.parent)
# viz["run_id"], viz["status"], viz["total_reward"],
# viz["step_count"], viz["reward_curve"], viz["timeline"]
Trajectory
Step-by-step timeline of the RLM run, showing what the agent did at each step.
Table Columns
| Column | Description |
|---|---|
| Step | Step number |
| Action | Action type (e.g., code_generation, validation) |
| Reward | Step reward (color-coded: 🟢 positive, 🔴 negative) |
| Tokens | Tokens consumed in this step |
| Success | Whether the step succeeded |
How It Populates
After any /rlm run or /rlm bench command, the trajectory is extracted from build_run_visualization()["timeline"] and rendered as a Rich table.
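As a rough sketch of that extraction step, the snippet below turns timeline entries into plain-text rows and a running reward total. The entry keys (`step`, `action`, `reward`, `tokens`, `success`) simply mirror the column names above and are assumptions; the actual shape of the timeline entries is not documented here, and the real view uses a Rich table rather than formatted strings.

```python
from itertools import accumulate

# Hypothetical timeline entries; the real keys returned by
# build_run_visualization()["timeline"] may differ.
timeline = [
    {"step": 1, "action": "code_generation", "reward": 0.15, "tokens": 450, "success": True},
    {"step": 2, "action": "validation", "reward": -0.05, "tokens": 120, "success": False},
    {"step": 3, "action": "code_generation", "reward": 0.40, "tokens": 510, "success": True},
]


def format_rows(entries: list[dict]) -> list[str]:
    """Format one plain-text row per step, with signed rewards."""
    rows = []
    for e in entries:
        mark = "yes" if e["success"] else "no"
        rows.append(
            f"{e['step']:>4}  {e['action']:<16}  {e['reward']:+.2f}  {e['tokens']:>6}  {mark}"
        )
    return rows


def cumulative_rewards(entries: list[dict]) -> list[float]:
    """Running total of step rewards, the series a reward curve would plot."""
    return list(accumulate(e["reward"] for e in entries))
```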
Benchmarks
Displays the leaderboard table from benchmark runs.
What You See
A Rich table ranked by reward, showing:
| Column | Description |
|---|---|
| Rank | Position on the leaderboard |
| ID | Run identifier |
| Environment | pure_rlm, codeact, or generic |
| Model | Model used |
| Reward | Average reward (color-coded) |
| Completion | Completion rate |
| Steps | Step count |
| Tokens | Total tokens |
How It Populates
Run /rlm bench preset=<name> then switch to Research → Benchmarks. The data comes from:
from pathlib import Path

from rlm_code.rlm.leaderboard import Leaderboard
lb = Leaderboard(workdir=Path.cwd() / ".rlm_code", auto_load=True)
table = lb.format_rich_table(limit=15)
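The ranking idea behind the table (order by average reward, keep the top entries) can be sketched independently of the Leaderboard class. `BenchResult` and `rank_results` are hypothetical names introduced for this example, and the fields are only a subset of the columns listed above.

```python
from dataclasses import dataclass


@dataclass
class BenchResult:
    """Hypothetical record with a subset of the leaderboard columns."""
    run_id: str
    environment: str
    avg_reward: float
    completion_rate: float


def rank_results(results: list[BenchResult], limit: int = 15) -> list[tuple[int, BenchResult]]:
    """Sort by average reward (highest first) and attach 1-based ranks."""
    ordered = sorted(results, key=lambda r: r.avg_reward, reverse=True)
    return list(enumerate(ordered[:limit], start=1))
```

The default `limit=15` mirrors the `limit=15` argument in the call above; whether the real leaderboard breaks ties in any particular way is not stated, so this sketch leaves tied entries in their input order.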
Replay
Step-through controls for time-travel debugging of any RLM run.
Controls
| Button | Action |
|---|---|
| \|< | Jump to first step |
| < | Step backward |
| > | Step forward |
| >\| | Jump to last step |
What You See
- Step position: Step 3/8 indicator
- Step detail: Action code with syntax highlighting, output, reward, cumulative reward
- Reward curve: SparklineChart showing the full reward trajectory with the current position
How It Populates
Run /rlm status to get a run ID, then /rlm replay <run_id>. The TUI automatically switches to Research → Replay and loads the session:
from rlm_code.rlm.session_replay import SessionReplayer
replayer = SessionReplayer.from_jsonl(run_path)
replayer.step_forward() # advance one step
replayer.step_backward() # go back one step
replayer.goto_step(n) # jump to step n
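To illustrate how the four replay buttons relate to a step position, here is a minimal cursor sketch. It is not the SessionReplayer implementation: `StepCursor` is a hypothetical class, and both the clamping at the ends and the 0-based index are assumptions (the real replayer may signal out-of-range moves differently).

```python
class StepCursor:
    """Hypothetical stand-in for a replayer's position tracking (0-based index)."""

    def __init__(self, total_steps: int) -> None:
        if total_steps < 1:
            raise ValueError("a replay needs at least one step")
        self.total = total_steps
        self.index = 0

    def step_forward(self) -> int:
        # Clamp at the last step instead of running past the end.
        self.index = min(self.index + 1, self.total - 1)
        return self.index

    def step_backward(self) -> int:
        # Clamp at the first step.
        self.index = max(self.index - 1, 0)
        return self.index

    def goto_step(self, n: int) -> int:
        # Clamp out-of-range targets into the valid range.
        self.index = max(0, min(n, self.total - 1))
        return self.index

    def label(self) -> str:
        # 1-based label in the style of the "Step 3/8" indicator.
        return f"Step {self.index + 1}/{self.total}"
```

In this sketch the first and last buttons correspond to `goto_step(0)` and `goto_step(total - 1)`.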
Reward Color Coding
| Reward | Color |
|---|---|
| >= 0.8 | 🟢 Bright green |
| >= 0.5 | 🟢 Green |
| >= 0.3 | 🟡 Yellow |
| >= 0.0 | 🟠 Orange |
| < 0.0 | 🔴 Red |
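The thresholds in this table translate directly into a top-down check. The function name `reward_color` and the returned color names are illustrative only; the TUI's actual style strings are not shown in this document.

```python
def reward_color(reward: float) -> str:
    """Return the color name for a reward, checking thresholds from highest to lowest."""
    if reward >= 0.8:
        return "bright green"
    if reward >= 0.5:
        return "green"
    if reward >= 0.3:
        return "yellow"
    if reward >= 0.0:
        return "orange"
    return "red"
```

Checking from the highest threshold down means each value matches exactly one band, so a reward of 0.5 is green rather than yellow.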
Events
Live event stream from the RLM event bus, showing real-time progress during active runs.
What You See
A RichLog widget that streams formatted events with timestamps:
[14:23:01] 🟢 RUN_START - Starting run abc123 (pure_rlm)
[14:23:02] 🔵 ITERATION_START - Step 1/8
[14:23:04] 🟡 LLM_CALL - Calling claude-sonnet-4-20250514 (450 tokens)
[14:23:06] 🟢 ITERATION_END - Step 1 complete (reward: +0.15)
[14:23:08] 🟢 RUN_END - Run complete (total reward: 0.72)
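The line layout above (timestamp in brackets, event type, a dash, then the message) can be reproduced with a small formatter. `format_event` is a hypothetical helper, not part of the event bus API, and it omits the leading status marker for simplicity.

```python
from datetime import datetime


def format_event(event_type: str, message: str, when: datetime) -> str:
    """Build a '[HH:MM:SS] TYPE - message' line (hypothetical helper, no status marker)."""
    return f"[{when:%H:%M:%S}] {event_type} - {message}"
```

For example, `format_event("ITERATION_START", "Step 1/8", datetime.now())` yields a line stamped with the current local time.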
Event Types
The event bus supports 27+ event types including:
| Event | Description |
|---|---|
| RUN_START / RUN_END | Run lifecycle |
| ITERATION_START / ITERATION_END | Step lifecycle |
| LLM_CALL / LLM_RESPONSE | Model interactions |
| SANDBOX_EXEC | Code execution |
| REWARD_COMPUTED | Reward calculation |
| MEMORY_COMPACTED | Memory compaction |
| APPROVAL_REQUESTED | Human-in-the-loop (HITL) gates |
Thread Safety
Events flow from the RLM runner (which runs in a worker thread) to the UI via call_from_thread(), so widgets are only ever updated from the UI thread.
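In Textual, `App.call_from_thread()` lets a worker thread schedule a callback on the app's own thread. The sketch below shows the same hand-off pattern using only the standard library, so it runs without a TUI: a worker thread puts events on a queue and a single consumer thread does all the "rendering". The names `runner`, `drain`, and `run_demo` are hypothetical, and the queue is a stand-in for Textual's mechanism, not a description of how the RLM runner is wired.

```python
import queue
import threading


def runner(events: queue.Queue) -> None:
    """Worker thread: emit events, then a None sentinel to signal completion."""
    for name in ("RUN_START", "ITERATION_START", "ITERATION_END", "RUN_END"):
        events.put(name)
    events.put(None)


def drain(events: queue.Queue) -> list[str]:
    """UI-side loop: only this thread touches the rendered log."""
    rendered: list[str] = []
    while True:
        item = events.get()  # blocks until the worker hands something over
        if item is None:
            break
        rendered.append(item)
    return rendered


def run_demo() -> list[str]:
    """Start the worker, consume its events on the calling thread, and return them."""
    events: queue.Queue = queue.Queue()
    worker = threading.Thread(target=runner, args=(events,))
    worker.start()
    rendered = drain(events)
    worker.join()
    return rendered
```

The design point is the same in both cases: the worker never mutates UI state directly, it only hands work to the thread that owns the widgets.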
Integration with Slash Commands
The Research tab auto-updates when you run RLM commands in the RLM tab:
| Command | What Updates |
|---|---|
| /rlm run "..." | Dashboard + Trajectory + Events |
| /rlm bench preset=... | Dashboard + Trajectory + Benchmarks + Events |
| /rlm replay | Replay (auto-switches to Replay sub-tab) |
| /rlm bench compare ... | Benchmarks + compare summary |
Visual Design
The Research tab inherits the TUI's purple-accented dark theme with additional styling for research-specific elements:
| Element | Style |
|---|---|
| Sub-tab buttons | Active = primary variant, Inactive = default |
| Metrics panel | Titled Rich Panel with color-coded status |
| Sparkline | Unicode block chars with reward-based colors |
| Event log | Black background, light text, markup-enabled |
| Replay controls | Compact button row with step position indicator |
Widgets Used
| Widget | Module | Purpose |
|---|---|---|
| MetricsPanel | rlm_code.rlm.research_tui.widgets.panels | Run dashboard metrics |
| SparklineChart | rlm_code.rlm.research_tui.widgets.animated | Reward curve visualization |
| RichLog | textual.widgets | Event stream display |
| Static | textual.widgets | Trajectory table, summary, replay detail |
| Button | textual.widgets | Sub-tab buttons, replay controls |
Button | textual.widgets | Sub-tab buttons, replay controls |