Behavioral Spec Example
A complete example of using CodeOptiX to evaluate agent behavior.
Overview
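This example evaluates an agent for security issues. It runs the "insecure-code" behavior, saves the results, runs a reflection step on them, and prints a per-behavior summary of the scores.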
Complete Code
import os

from codeoptix.adapters.factory import create_adapter
from codeoptix.evaluation import EvaluationEngine
from codeoptix.reflection import ReflectionEngine
from codeoptix.artifacts import ArtifactManager
from codeoptix.utils.llm import create_llm_client, LLMProvider

# 1. Setup
adapter = create_adapter("codex", {
    "llm_config": {
        "provider": "openai",
        "api_key": os.getenv("OPENAI_API_KEY"),
    }
})
llm_client = create_llm_client(LLMProvider.OPENAI)
artifact_manager = ArtifactManager()

# 2. Evaluate
eval_engine = EvaluationEngine(adapter, llm_client)
results = eval_engine.evaluate_behaviors(
    behavior_names=["insecure-code"]
)

# 3. Save results
artifact_manager.save_results(results)

# 4. Reflect
reflection_engine = ReflectionEngine(artifact_manager)
reflection = reflection_engine.reflect(results, save=True)

# 5. Print summary
print(f"Overall Score: {results['overall_score']:.2f}")
for behavior_name, behavior_data in results['behaviors'].items():
    status = "✅ PASSED" if behavior_data['passed'] else "❌ FAILED"
    print(f"{behavior_name}: {status} (Score: {behavior_data['score']:.2f})")
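The summary step can be tried without an API key or a real evaluation run. The sketch below assumes only the result shape that the code above reads: an "overall_score" key plus a "behaviors" mapping whose entries carry "passed" and "score". The summarize helper and the sample values are hypothetical, made up for illustration; they are not part of CodeOptiX and not real evaluation output.

```python
from typing import Any, Dict, List


def summarize(results: Dict[str, Any]) -> List[str]:
    """Build summary lines from a results dict shaped like the one used above."""
    lines = [f"Overall Score: {results['overall_score']:.2f}"]
    for behavior_name, behavior_data in results["behaviors"].items():
        status = "PASSED" if behavior_data["passed"] else "FAILED"
        lines.append(
            f"{behavior_name}: {status} (Score: {behavior_data['score']:.2f})"
        )
    return lines


# Hypothetical sample data, for illustration only (not real evaluation output).
sample_results = {
    "overall_score": 0.5,
    "behaviors": {
        "insecure-code": {"passed": False, "score": 0.5},
    },
}

summary_lines = summarize(sample_results)
print("\n".join(summary_lines))
```

Returning the lines instead of printing them directly keeps the formatting logic easy to check in a unit test or to reuse in a log message.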
Running the Example
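The script reads the API key with os.getenv("OPENAI_API_KEY"), which returns None when the variable is unset. A small guard like the one below may make that failure easier to spot before any request is sent. The require_env helper is a hypothetical addition for illustration, not a CodeOptiX function.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage sketch (hypothetical): call this before create_adapter(...) in the script.
# api_key = require_env("OPENAI_API_KEY")
```

Failing early with a named variable is usually clearer than an authentication error surfacing later from the provider.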
Expected Output
Next Steps
- GEPA Demo - GEPA evolution example
- Adapter Usage - Adapter examples
- Python API Guide - More examples