Your First DSPy Program
Build a complete DSPy program from scratch in this detailed tutorial.
What You'll Build
A text classifier that sorts customer feedback into three categories: bug report, feature request, or question.
Time: 20 minutes
Prerequisites
- DSPy Code installed
- Basic Python knowledge
- (Optional) Ollama or OpenAI API key
Step 1: Start DSPy Code
Open your terminal:
You'll see the welcome screen with your DSPy version.
Step 2: Initialize Your Project
Create a new project:
What happens:
- Creates dspy_config.yaml
- Sets up project directories
- Indexes your DSPy installation
- Shows entertaining messages!
Project Structure Created
Step 3: Connect to a Model
For better results, connect to a model:
Without a Model?
You can still use DSPy Code with templates. The code will be more generic but functional.
Step 4: Describe Your Program
Use natural language to describe what you want:
Create a text classifier that takes customer feedback and categorizes it as bug report, feature request, or question
DSPy Code will generate complete code!
Step 5: Understanding the Generated Code
Let's break down what was created:
The Signature
import dspy


class FeedbackClassifier(dspy.Signature):
    """Classify customer feedback into categories."""

    feedback: str = dspy.InputField(
        desc="Customer feedback text"
    )
    category: str = dspy.OutputField(
        desc="bug_report, feature_request, or question"
    )
What this does:
- InputField: Defines what goes into the model
- OutputField: Defines what comes out
- Descriptions: Help the LLM understand the task
The Module
class FeedbackClassifierModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.classifier = dspy.ChainOfThought(FeedbackClassifier)

    def forward(self, feedback):
        result = self.classifier(feedback=feedback)
        return result
What this does:
- dspy.Module: Base class for all DSPy programs
- ChainOfThought: Uses reasoning to improve accuracy
- forward(): The main execution method
Configuration and Usage
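# Configure DSPy
import dspy

# dspy.LM replaces the older dspy.OpenAI client, which recent DSPy releases
# (including those that ship GEPA) no longer provide.
# OpenAI reasoning-model families such as gpt-5 are expected to need
# temperature=1.0 and a large max_tokens; adjust for your DSPy version.
dspy.configure(
    lm=dspy.LM("openai/gpt-5-nano", temperature=1.0, max_tokens=20000)
)

# Create and use the classifier
classifier = FeedbackClassifierModule()

# Test it (comments show the expected category; actual model output can vary)
result = classifier(feedback="The app crashes when I click save")
print(result.category)  # Expected: bug_report

result = classifier(feedback="Can you add dark mode?")
print(result.category)  # Expected: feature_request

result = classifier(feedback="How do I export my data?")
print(result.category)  # Expected: question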
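Because category is free text produced by the model, it can come back as "Bug Report" or "feature-request." rather than exactly bug_report. The following is a minimal sketch, using only the standard library, of normalizing such output before comparing or storing it. The helper name normalize_category and the "unknown" fallback are hypothetical and not part of DSPy or DSPy Code.

```python
import re

# Labels taken from the OutputField description in the signature above.
ALLOWED_CATEGORIES = ("bug_report", "feature_request", "question")


def normalize_category(raw: str, fallback: str = "unknown") -> str:
    """Map free-form model output onto one of the allowed labels."""
    # Lowercase, then collapse every run of non-alphanumeric characters to "_".
    cleaned = re.sub(r"[^a-z0-9]+", "_", raw.lower()).strip("_")
    if cleaned in ALLOWED_CATEGORIES:
        return cleaned
    # Tolerate extra words around the label, e.g. "This is a question".
    for label in ALLOWED_CATEGORIES:
        if label in cleaned:
            return label
    return fallback


print(normalize_category("Bug Report"))
print(normalize_category(" feature-request. "))
```

You could apply it as normalize_category(result.category) before comparing predictions against your labels.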
Step 6: Save Your Program
Save the generated code:
File saved to generated/feedback_classifier.py
Step 7: Validate the Code
Check for issues:
The validator checks:
- Signature structure
- InputField and OutputField usage
- Module inheritance
- forward() method implementation
- Best practices
Validation Passed
Step 8: Test Your Program
Run it in the sandbox:
DSPy Code executes your code safely and shows results.
Step 9: Create Test Data
Generate test examples:
Generated examples:
{"feedback": "App crashes on startup", "category": "bug_report"}
{"feedback": "Please add export to PDF", "category": "feature_request"}
{"feedback": "How do I reset my password?", "category": "question"}
...
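Generated examples are worth checking before you rely on them. The sketch below, which assumes the JSONL layout shown above (one object per line with feedback and category keys), parses lines, rejects malformed or mislabelled records, and counts labels so you can spot an unbalanced dataset. The function name parse_examples is hypothetical.

```python
import json
from collections import Counter

ALLOWED_CATEGORIES = {"bug_report", "feature_request", "question"}


def parse_examples(lines):
    """Parse JSONL lines; return (valid records, rejected (line number, reason) pairs)."""
    valid, rejected = [], []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            rejected.append((number, "invalid JSON"))
            continue
        if not isinstance(record, dict):
            rejected.append((number, "not a JSON object"))
            continue
        feedback = record.get("feedback")
        category = record.get("category")
        if not isinstance(feedback, str) or not feedback.strip():
            rejected.append((number, "missing feedback text"))
        elif not isinstance(category, str) or category not in ALLOWED_CATEGORIES:
            rejected.append((number, "unknown category"))
        else:
            valid.append(record)
    return valid, rejected


sample = [
    '{"feedback": "App crashes on startup", "category": "bug_report"}',
    '{"feedback": "Please add export to PDF", "category": "feature_request"}',
    '{"feedback": "How do I reset my password?", "category": "question"}',
    '{"feedback": "Great app!", "category": "praise"}',
]
valid, rejected = parse_examples(sample)
label_counts = Counter(record["category"] for record in valid)
print(label_counts)
print(rejected)
```

To check a file instead of a list, pass the open file object, since iterating over it yields lines.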
Save the data:
Step 10: Evaluate Your Classifier
Create an evaluation script:
Generated evaluation code:
import json

import dspy
from dspy.evaluate import Evaluate

# FeedbackClassifierModule must be defined or imported in this script,
# and a language model must be configured before evaluating.


# Load test data
def load_test_data():
    examples = []
    with open('data/feedback_examples.jsonl') as f:
        for line in f:
            data = json.loads(line)
            # Mark 'feedback' as the input; 'category' remains the label
            examples.append(dspy.Example(**data).with_inputs('feedback'))
    return examples


# Define metric: exact match between expected and predicted category
def accuracy_metric(example, prediction, trace=None):
    return example.category == prediction.category


# Evaluate
evaluator = Evaluate(
    devset=load_test_data(),
    metric=accuracy_metric,
    num_threads=4
)

classifier = FeedbackClassifierModule()
score = evaluator(classifier)
print(f"Accuracy: {score}")
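Conceptually, the evaluation applies the metric to every example and averages the results. The plain-Python sketch below shows that calculation without DSPy. The keyword_predict function is a hypothetical stand-in for the classifier, used only so the example runs without a model; the exact form of the value DSPy's evaluator returns may differ between DSPy versions.

```python
def exact_match(expected: str, predicted: str) -> bool:
    """Same rule as accuracy_metric: the labels must be identical."""
    return expected == predicted


def accuracy(examples, predict) -> float:
    """Fraction of examples whose predicted category equals the expected one."""
    if not examples:
        return 0.0
    hits = sum(
        exact_match(ex["category"], predict(ex["feedback"])) for ex in examples
    )
    return hits / len(examples)


def keyword_predict(feedback: str) -> str:
    """Hypothetical stand-in for the classifier, based on simple keywords."""
    text = feedback.lower()
    if "crash" in text or "error" in text:
        return "bug_report"
    if text.rstrip().endswith("?"):
        return "question"
    return "feature_request"


examples = [
    {"feedback": "App crashes on startup", "category": "bug_report"},
    {"feedback": "Please add export to PDF", "category": "feature_request"},
    {"feedback": "How do I reset my password?", "category": "question"},
    {"feedback": "Can you add dark mode?", "category": "feature_request"},
]
print(f"Accuracy: {accuracy(examples, keyword_predict):.2f}")  # Accuracy: 0.75
```

The stand-in mislabels "Can you add dark mode?" as a question, which is the kind of error a reasoning-based classifier is meant to avoid.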
Save it:
Step 11: Optimize with GEPA
Generate GEPA optimization code:
Optimization on Cloud Models
GEPA optimization can make a large number of LLM calls. If you're connected to a cloud provider (OpenAI, Anthropic, Gemini), be sure you understand the potential API cost and have your quotas/billing configured before running /optimize.
This creates a complete GEPA optimization script!
What's included:
- Metric with feedback for GEPA
- Training data loader
- GEPA configuration
- Optimization execution
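A metric "with feedback" returns an explanation alongside the score, which GEPA's reflection step can use to revise instructions. Here is a minimal plain-Python sketch of the idea. The helper score_with_feedback is hypothetical; in a real GEPA script the metric receives the example and prediction objects and would typically wrap these two values in a dspy.Prediction(score=..., feedback=...), which you should verify against your DSPy version.

```python
ALLOWED = ("bug_report", "feature_request", "question")


def score_with_feedback(expected: str, predicted: str) -> dict:
    """Return a numeric score plus text explaining what went wrong."""
    if predicted == expected:
        return {"score": 1.0, "feedback": f"Correct: '{expected}'."}
    if predicted not in ALLOWED:
        # The output is not a valid label at all, so say how to format it.
        return {
            "score": 0.0,
            "feedback": (
                f"'{predicted}' is not a valid label. Answer with exactly one of: "
                f"{', '.join(ALLOWED)}. The expected label was '{expected}'."
            ),
        }
    # A valid label, but the wrong one.
    return {
        "score": 0.0,
        "feedback": f"Predicted '{predicted}' but the expected label was '{expected}'.",
    }


print(score_with_feedback("bug_report", "Bug!")["feedback"])
```

Specific feedback, such as naming the valid labels, gives the optimizer more to work with than a bare 0 or 1.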
Save it:
Step 12: Run Optimization
Exit DSPy Code and run:
GEPA will:
- Load training examples
- Evaluate current performance
- Use reflection to improve prompts
- Evolve better instructions
- Save optimized version
Optimization Complete
Step 13: Deploy Your Classifier
Your classifier is now ready for production!
Package it:
This creates a distributable package with:
- Your optimized code
- Configuration
- Dependencies
- README
- Tests
What You Learned
Congratulations! You now know how to:
- Create DSPy Signatures
- Build DSPy Modules
- Use ChainOfThought for reasoning
- Validate code quality
- Generate test data
- Evaluate performance
- Optimize with GEPA
- Package for deployment
Next Steps
Try Different Predictors
Modify your classifier to use different reasoning patterns:
ReAct (Reasoning + Acting):
ProgramOfThought:
MultiChainComparison:
Add More Categories
Expand your classifier:
Add Confidence Scores
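One way to use a confidence score, assuming your signature is extended with an extra output that the model fills in, is to send low-confidence predictions to human review. The sketch below covers only that post-processing step in plain Python. The names parse_confidence and route and the 0.7 threshold are hypothetical, and model-reported confidence is not guaranteed to be well calibrated.

```python
def parse_confidence(raw) -> float:
    """Convert a model-reported confidence ("0.9", "85%") to a float in [0, 1]."""
    text = str(raw).strip()
    is_percent = text.endswith("%")
    try:
        value = float(text.rstrip("%"))
    except ValueError:
        return 0.0  # unparseable values are treated as no confidence
    if is_percent:
        value /= 100.0
    # Clamp to the valid range.
    return min(1.0, max(0.0, value))


def route(category: str, raw_confidence, threshold: float = 0.7) -> str:
    """Keep the predicted category only when confidence reaches the threshold."""
    if parse_confidence(raw_confidence) >= threshold:
        return category
    return "needs_review"


print(route("bug_report", "85%"))
print(route("question", "0.4"))
```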
Multi-Label Classification
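For multi-label output, one simple approach is to have the model return a comma-separated list of labels and parse it afterwards. The sketch below assumes that output format, which is an illustration rather than something DSPy Code prescribes, and adds a Jaccard overlap score as one possible partial-credit metric. Both function names are hypothetical.

```python
ALLOWED = ("bug_report", "feature_request", "question")


def parse_labels(raw: str) -> list:
    """Split a comma-separated answer into unique, valid labels, keeping order."""
    labels = []
    for part in raw.split(","):
        label = part.strip().lower().replace(" ", "_").replace("-", "_")
        if label in ALLOWED and label not in labels:
            labels.append(label)
    return labels


def jaccard(expected, predicted) -> float:
    """Overlap between two label sets: size of intersection over size of union."""
    a, b = set(expected), set(predicted)
    if not a and not b:
        return 1.0  # both empty counts as a perfect match
    return len(a & b) / len(a | b)


predicted = parse_labels("Bug Report, question, bug_report")
print(predicted)
print(jaccard(["bug_report", "question"], ["bug_report"]))
```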
Common Issues
Model Not Responding
Check your model connection:
Reconnect if needed:
Low Accuracy
Try these improvements:
- Better examples: Generate more diverse training data
- Better predictor: Use ChainOfThought or ReAct
- Optimization: Run GEPA optimization
- Better prompts: Add more detailed descriptions
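Before choosing among these improvements, it helps to see which categories are being confused. The following plain-Python sketch counts misclassified (expected, predicted) pairs; the pairs list is made-up illustrative data, not output from a real run.

```python
from collections import Counter


def confusion_counts(pairs):
    """Count (expected, predicted) pairs for misclassified examples only."""
    return Counter((expected, predicted) for expected, predicted in pairs if expected != predicted)


# Illustrative (expected, predicted) pairs; replace with your evaluation results.
pairs = [
    ("bug_report", "bug_report"),
    ("feature_request", "question"),
    ("feature_request", "question"),
    ("question", "question"),
    ("bug_report", "feature_request"),
]

for (expected, predicted), count in confusion_counts(pairs).most_common():
    print(f"{expected} -> {predicted}: {count}")
```

If one pair dominates, as feature_request mistaken for question does here, adding examples or sharper field descriptions for that distinction is a reasonable first step.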
Validation Errors
Read error messages carefully:
Common fixes:
- Add missing InputField/OutputField
- Inherit from dspy.Module
- Implement forward() method
- Add type hints
Summary
You built a complete DSPy program with:
- Signature definition
- Module implementation
- Validation
- Testing
- Evaluation
- Optimization
- Deployment package
Time to build something more complex!