# LangSmith Integration

The `LangSmithSink` sends RLM run traces to [LangSmith](https://smith.langchain.com), LangChain's observability platform for debugging, testing, and monitoring LLM applications.
## Overview

| Property | Value |
|---|---|
| Class | `rlm_code.rlm.observability_sinks.LangSmithSink` |
| Sink name | `langsmith` |
| Activation | `DSPY_RLM_LANGSMITH_ENABLED=true` |
| Primary env var | `LANGCHAIN_API_KEY` |
| Optional dependency | `pip install langsmith` |
## Activation

```bash
export DSPY_RLM_LANGSMITH_ENABLED=true
export LANGCHAIN_API_KEY=ls-your-api-key-here
export LANGCHAIN_PROJECT=rlm-code    # optional, default: rlm-code
export LANGCHAIN_TRACING_V2=true     # auto-set by the sink if not present
```
> **Note (`LANGCHAIN_TRACING_V2`):** The sink automatically calls `os.environ.setdefault("LANGCHAIN_TRACING_V2", "true")` during initialization. You do not need to set this variable manually unless you want it in place before any other LangChain code runs.
## Features

### Run Tracing

Each RLM run is represented as a root `RunTree` in LangSmith:

- **Name:** `rlm-run-<first 8 chars of run_id>`
- **Run type:** `chain`
- **Project:** configurable via `LANGCHAIN_PROJECT` (default: `rlm-code`)
- **Inputs:** task text, environment name, and the full parameters dict
- **Metadata:** `run_id` and `environment`
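For orientation, the sketch below shows roughly what this corresponds to in the `langsmith` SDK's `RunTree` API. It is an illustrative approximation, not the sink's actual implementation; the task and parameter values are placeholders.

```python
from langsmith.run_trees import RunTree

# Root run for one RLM run. The name is derived from the first
# 8 chars of the run_id; inputs and metadata mirror the list above.
root = RunTree(
    name="rlm-run-abc12345",
    run_type="chain",
    project_name="rlm-code",
    inputs={"task": "Build a DSPy module", "environment": "dspy", "params": {}},
    extra={"metadata": {"run_id": "abc12345", "environment": "dspy"}},
)
root.post()  # send the root run to LangSmith
```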
### Step-Level Child Runs

Each step is created as a child run under the root:

| Field | Source |
|---|---|
| Name | `step-<n>` |
| Run type | `tool` |
| Inputs | Action type and code (first 500 chars) |
| Outputs | Success flag, output (first 500 chars), reward, cumulative reward |
| Error | Set if the step did not succeed |
### Run Completion

At run end, the root `RunTree` is updated with outputs:

| Output Field | Description |
|---|---|
| `completed` | Whether the run completed successfully |
| `steps` | Total number of steps taken |
| `total_reward` | Final cumulative reward |
| `final_answer` | The final answer text (first 500 chars) |

If the run did not complete, an error message is attached.
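Putting the two together, a step child run and the final root update might look like this in the same `RunTree` sketch (again an approximation of the sink's behavior, with placeholder values):

```python
from langsmith.run_trees import RunTree

# Root run, created as in the Run Tracing sketch above.
root = RunTree(
    name="rlm-run-abc12345",
    run_type="chain",
    project_name="rlm-code",
    inputs={"task": "Build a DSPy module", "environment": "dspy", "params": {}},
)
root.post()

# One child run per step; code and output are truncated to 500 chars.
code = "import dspy\n# ... step code ..."
step = root.create_child(
    name="step-1",
    run_type="tool",
    inputs={"action": "run_python", "code": code[:500]},
)
step.post()

output = "execution output ..."
step.end(outputs={"success": True, "output": output[:500],
                  "reward": 0.5, "cumulative_reward": 0.5})
step.patch()

# At run end, attach the final outputs to the root run.
root.end(outputs={"completed": True, "steps": 1,
                  "total_reward": 0.5, "final_answer": "Done."})
root.patch()
```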
### Feedback Collection

LangSmith supports feedback/evaluation annotations on runs. The sink does not create feedback automatically, but you can add it via the LangSmith SDK using the `run_id` logged in the trace, as sketched below.
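A minimal sketch using the standard `Client.create_feedback` call; the run UUID and the feedback key `task_success` are placeholders, not values the sink defines:

```python
from langsmith import Client

client = Client()

# Attach a feedback score to a traced RLM run. The run ID here is
# hypothetical; look it up in the LangSmith UI or via client.list_runs().
client.create_feedback(
    "00000000-0000-0000-0000-000000000000",  # LangSmith run UUID (placeholder)
    key="task_success",                      # arbitrary feedback key for this example
    score=1.0,
    comment="RLM completed the task correctly",
)
```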
### Dataset Creation

You can use LangSmith's dataset features to build evaluation datasets from RLM benchmark results: export runs from the LangSmith UI, or use the SDK to query by project name, as sketched below.
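For example, with the standard `langsmith` `Client` methods (`list_runs`, `create_dataset`, `create_example`); the dataset name and the completed-run filter are illustrative choices, not part of the sink:

```python
from langsmith import Client

client = Client()

# Query traced RLM runs by project name.
runs = client.list_runs(project_name="rlm-code")

# Collect completed runs into an evaluation dataset.
dataset = client.create_dataset(dataset_name="rlm-eval")  # placeholder name
for run in runs:
    if run.outputs and run.outputs.get("completed"):
        client.create_example(
            inputs=run.inputs,
            outputs=run.outputs,
            dataset_id=dataset.id,
        )
```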
## Setup Guide

### 1. Create a LangSmith Account

Sign up at [smith.langchain.com](https://smith.langchain.com) and obtain an API key.
### 2. Install the SDK
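```bash
pip install langsmith
```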
### 3. Configure Environment

```bash
export DSPY_RLM_LANGSMITH_ENABLED=true
export LANGCHAIN_API_KEY=ls-your-api-key-here
export LANGCHAIN_PROJECT=rlm-code
```
### 4. Run a Task

Run any RLM task as you normally would. With the sink enabled via the environment variables above, traces are emitted automatically; no code changes are required.
### 5. View Traces

Open [smith.langchain.com](https://smith.langchain.com), navigate to your project (`rlm-code`), and view the run traces. Each trace shows:
- The root run with inputs and outputs
- Child runs for each step with timing
- Input/output data for debugging
- Error details for failed steps
## Configuration Options

| Parameter | Type | Default | Env Var | Description |
|---|---|---|---|---|
| `enabled` | `bool` | `False` | `DSPY_RLM_LANGSMITH_ENABLED` | Enable/disable the sink |
| `project` | `str` | `"rlm-code"` | `LANGCHAIN_PROJECT` | LangSmith project name |
## Programmatic Usage

```python
from rlm_code.rlm.observability_sinks import LangSmithSink

sink = LangSmithSink(
    enabled=True,
    project="my-rlm-project",
)

print(sink.status())
# {'name': 'langsmith', 'enabled': True, 'available': True,
#  'detail': 'project: my-rlm-project', 'project': 'my-rlm-project'}
```
## Factory Function

```python
from rlm_code.rlm.observability_sinks import create_langsmith_sink_from_env

# Reads DSPY_RLM_LANGSMITH_ENABLED and LANGCHAIN_PROJECT
sink = create_langsmith_sink_from_env()
```
## Trace Structure

A typical RLM run creates this hierarchy in LangSmith:

```text
rlm-run-abc12345 (chain)
|-- Inputs:   { task: "...", environment: "dspy", params: {...} }
|-- Metadata: { run_id: "abc12345", environment: "dspy" }
|
+-- step-1 (tool)
|   |-- Inputs:  { action: "run_python", code: "import dspy..." }
|   |-- Outputs: { success: true, output: "...", reward: 0.5 }
|
+-- step-2 (tool)
|   |-- Inputs:  { action: "run_python", code: "class Module..." }
|   |-- Outputs: { success: true, output: "...", reward: 0.5 }
|
+-- step-3 (tool)
|   |-- Inputs:  { action: "submit", code: "" }
|   |-- Outputs: { success: true, reward: 0.5 }
|
|-- Outputs: { completed: true, steps: 3, total_reward: 1.5 }
```
## Connection Validation

During initialization, the sink tests the connection to LangSmith by calling `self._client.list_projects(limit=1)`. If this call fails (invalid API key, network error, etc.), the sink sets `_available=False` and records the error:

```python
try:
    self._client = Client()
    self._client.list_projects(limit=1)
    self._available = True
    self._detail = f"project: {self.project}"
except Exception as exc:
    self._available = False
    self._detail = f"connection failed: {exc}"
```
> **Check Status:** After initialization, call `sink.status()` to verify the connection. The `available` field tells you whether the sink is live.
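For example, using only the `status()` fields shown in Programmatic Usage above:

```python
from rlm_code.rlm.observability_sinks import create_langsmith_sink_from_env

sink = create_langsmith_sink_from_env()
status = sink.status()

if status["enabled"] and not status["available"]:
    # detail holds the failure reason, e.g. "connection failed: ..."
    print(f"LangSmith sink is enabled but offline: {status['detail']}")
```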
## Troubleshooting

| Symptom | Cause | Solution |
|---|---|---|
| `available: False`, detail mentions `ImportError` | `langsmith` package not installed | `pip install langsmith` |
| `available: False`, detail mentions `connection failed` | Invalid API key or network issue | Verify `LANGCHAIN_API_KEY` and network connectivity |
| Traces not appearing in UI | Sink not enabled | Ensure `DSPY_RLM_LANGSMITH_ENABLED=true` |
| Traces in wrong project | `LANGCHAIN_PROJECT` mismatch | Set `LANGCHAIN_PROJECT` to the correct project name |
| Missing step details | Step truncation | Code and output are truncated to 500 chars in LangSmith |