Quick Start: Single Behavior¶
The easiest way to get started with CodeOptiX is to use a single behavior. This guide will help you run your first evaluation in under 5 minutes.
Prerequisites¶
- Python 3.12+ installed
- An API key for one of these providers:
    - OpenAI (for evaluation)
    - Anthropic (for the Claude Code agent)
    - Google (for the Gemini agent)
Step 1: Install CodeOptiX¶
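Assuming the package is published on PyPI under the name codeoptix (the same assumption applies to the uv variant below), install it with pip:

# package name on PyPI is an assumption
pip install codeoptix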
Or using uv (recommended):
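# assuming uv is installed; same package-name assumption as above
uv pip install codeoptix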
Step 2: Set Your API Key¶
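Evaluation in this guide uses OpenAI, so export that key first. The variable name OPENAI_API_KEY matches what the Python example below reads; the value shown is a placeholder:

# placeholder value; substitute your real OpenAI API key
export OPENAI_API_KEY="sk-..."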
Or for Claude Code:
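The ANTHROPIC_API_KEY name matches the adapter config in the Python example below; the value is a placeholder.

# placeholder value; substitute your real Anthropic API key
export ANTHROPIC_API_KEY="sk-ant-..."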
Step 3: Run Your First Evaluation¶
Option 1: Using the CLI (Easiest)¶
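The flags below are the same ones used in the config-file example under Option 2; running without --config assumes the CLI falls back to default settings:

# same flags as Option 2, minus --config (defaults are assumed to apply)
codeoptix eval \
  --agent claude-code \
  --behaviors "insecure-code"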
This will:
- ✅ Check for security issues (insecure-code behavior)
- ✅ Use Claude Code as the agent
- ✅ Use OpenAI for evaluation
- ✅ Save results automatically
Option 2: Using a Config File (Recommended)¶
codeoptix eval \
  --agent claude-code \
  --behaviors "insecure-code" \
  --config examples/configs/single-behavior-insecure-code.yaml
Option 3: Using Python¶
import os

from codeoptix.adapters.factory import create_adapter
from codeoptix.evaluation import EvaluationEngine
from codeoptix.utils.llm import LLMProvider, create_llm_client
# Create adapter
adapter = create_adapter("claude-code", {
    "llm_config": {
        "provider": "anthropic",
        "api_key": os.getenv("ANTHROPIC_API_KEY"),
    }
})
# Create LLM client
llm_client = create_llm_client(LLMProvider.OPENAI, api_key=os.getenv("OPENAI_API_KEY"))
# Create evaluation engine
eval_engine = EvaluationEngine(adapter, llm_client)
# Run evaluation with single behavior
results = eval_engine.evaluate_behaviors(
    behavior_names=["insecure-code"]  # behavior names passed as plain strings
)
print(f"Score: {results['overall_score']:.2%}")
Available Behaviors¶
You can use any of these behaviors:
1. insecure-code (Security)¶
Behavior Name: insecure-code
Checks for security vulnerabilities:
- Hardcoded secrets
- SQL injection risks
- XSS vulnerabilities
2. vacuous-tests (Test Quality)¶
Behavior Name: vacuous-tests
Checks test quality:
- Missing assertions
- Trivial tests
- Test coverage
3. plan-drift (Requirements)¶
Behavior Name: plan-drift
Checks requirements alignment:
- Plan deviations
- Missing features
- Extra features
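To try a different behavior, swap its name into the same command shown in Option 1:

# any behavior name listed above can be substituted here
codeoptix eval --agent claude-code --behaviors "vacuous-tests"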
Understanding Results¶
After running an evaluation, you'll see:
✅ Evaluation Complete!
📊 Overall Score: 85.00%
📁 Results: artifacts/results_run-001.json
🆔 Run ID: run-001
📋 Behavior Results:
✅ **insecure-code**: 85.00%
What the Score Means¶
- 100%: Perfect - no issues found
- 80-99%: Good - minor issues
- 50-79%: Needs improvement
- <50%: Critical issues found
Viewing Detailed Results¶
# View the results file
cat artifacts/results_run-001.json
# Generate a reflection report
codeoptix reflect --input artifacts/results_run-001.json
Common Issues¶
"API key required" Error¶
Solution: Set your API key:
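For example, with placeholder values:

# export the key for whichever provider you are using (placeholder values)
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."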
"Unsupported adapter type" Error¶
Solution: Use a supported agent:
- claude-code (Anthropic)
- codex (OpenAI)
- gemini-cli (Google)
"Invalid behavior name" Error¶
Solution: Use a valid behavior:
- insecure-code
- vacuous-tests
- plan-drift
Next Steps¶
Once you're comfortable with a single behavior:
- Try other behaviors - Test different aspects of code quality
- Use multiple behaviors - Combine checks in a single run (see the sketch after this list)
- Use in CI/CD - Run the same eval command as a step in your CI pipeline, such as GitHub Actions
- Generate reflection - Understand failures with codeoptix reflect (also shown below)
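For example, a combined run followed by a reflection report. The comma-separated --behaviors value is an assumption; only single-behavior invocations are shown elsewhere in this guide:

# combine checks in one run; comma-separated list is an assumed syntax
codeoptix eval \
  --agent claude-code \
  --behaviors "insecure-code,vacuous-tests,plan-drift"

# generate a reflection report from the saved results
codeoptix reflect --input artifacts/results_run-001.json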
Example Configurations¶
We provide example configs for single behaviors:
- examples/configs/single-behavior-insecure-code.yaml - Security checks
- examples/configs/single-behavior-vacuous-tests.yaml - Test quality
- examples/configs/single-behavior-plan-drift.yaml - Requirements alignment
Learn More¶
Ready for more advanced usage?
- Quick Start - Comprehensive guide with Ollama support
- Your First Evaluation - Detailed step-by-step walkthrough
- Core Concepts - Understand how CodeOptiX works
- Python API Guide - Advanced Python usage