Reflective Engine

The reflective engine is the maintained optimizer inside CodexOpt, inspired by SkillOpt and GEPA.

Use it when you want CodexOpt to improve SKILL.md or AGENTS.md from task feedback instead of only applying static cleanup rules.

Start with a safe preview:

codexopt improve

Run the live Codex loop:

codexopt improve --live

Apply only after you review the preview:

codexopt improve --live --apply

What Happens

For each target file, CodexOpt:

  1. builds or loads task evidence
  2. splits tasks into train and validation sets
  3. evaluates the current instruction file
  4. collects textual feedback from the verifier, the judge, or static analysis
  5. proposes a focused rewrite
  6. rejects rewrites that exceed the edit budget
  7. keeps a rewrite only when it improves the validation score
  8. writes artifacts and a report you can review
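The steps above can be sketched as a small loop. This is a minimal illustration, not CodexOpt's actual code: `split_tasks`, `improve`, and the four callables (`evaluate`, `collect_feedback`, `propose`, `edit_cost`) are hypothetical names, and artifact writing (step 8) is left out.

```python
import random
from typing import Callable, List, Sequence, Tuple


def split_tasks(tasks: Sequence[str], valset_ratio: float, seed: int) -> Tuple[List[str], List[str]]:
    """Shuffle deterministically, then hold out a validation slice (assumed strategy)."""
    shuffled = list(tasks)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * valset_ratio))
    return shuffled[n_val:], shuffled[:n_val]


def improve(
    text: str,
    tasks: Sequence[str],
    evaluate: Callable[[str, Sequence[str]], float],
    collect_feedback: Callable[[str, Sequence[str]], str],
    propose: Callable[[str, str], str],
    edit_cost: Callable[[str, str], int],
    *,
    valset_ratio: float = 0.34,
    max_iterations: int = 6,
    edit_budget: int = 12,
    seed: int = 0,
) -> Tuple[str, float]:
    """Hypothetical reflective loop: propose, budget-check, validate, keep or discard."""
    train, val = split_tasks(tasks, valset_ratio, seed)
    best, best_score = text, evaluate(text, val)  # baseline for the current file
    for _ in range(max_iterations):
        feedback = collect_feedback(best, train)  # textual feedback from training tasks
        candidate = propose(best, feedback)       # focused rewrite
        if edit_cost(best, candidate) > edit_budget:
            continue                              # reject rewrites over the edit budget
        score = evaluate(candidate, val)
        if score > best_score:                    # validation gate: strict improvement only
            best, best_score = candidate, score
    return best, best_score
```

Passing the scoring and proposal steps in as callables keeps the control flow visible without assuming how CodexOpt implements them.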

Reward Tiers

CodexOpt uses the strongest signal available:

  1. verifier: run a deterministic command or assertion in a temporary repo copy
  2. judge: run Codex and ask a judge model to score the trajectory
  3. static: use CodexOpt quality checks when no rollout signal is available
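As a rough sketch of the tiers above: the first function picks the strongest available signal, and the second shows one way a verifier could run a command in a temporary repo copy. Both `pick_reward_tier` and `run_verifier` are hypothetical names, and the 1.0/0.0 scoring by exit code is an assumption rather than CodexOpt's documented behavior.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Sequence


def pick_reward_tier(has_verifier: bool, has_judge: bool) -> str:
    """Prefer verifier over judge over static (assumed ordering, as listed above)."""
    if has_verifier:
        return "verifier"
    if has_judge:
        return "judge"
    return "static"


def run_verifier(repo_dir: str, command: Sequence[str], timeout: float = 60.0) -> float:
    """Copy the repo to a temp dir, run the command there, score by exit code."""
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp) / "repo"
        shutil.copytree(repo_dir, work)  # the original repo is never touched
        try:
            result = subprocess.run(
                list(command), cwd=work, capture_output=True, timeout=timeout
            )
        except subprocess.TimeoutExpired:
            return 0.0  # assumption: a timeout counts as failure
    return 1.0 if result.returncode == 0 else 0.0
```

Running the command in a copy means a verifier that writes files cannot leak changes into the working tree.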

Default runs stay offline. Use --live or configure models when you want Codex or API calls.

Configuration

reflective:
  optimizer_model: null
  judge_model: null
  reward_mode: "tiered"
  minibatch_size: 3
  max_iterations: 6
  edit_budget: 12
  valset_ratio: 0.34
  max_rollouts: 60
  seed: 0
  codex_binary: "codex"
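To make the shape of this block concrete, here is a hypothetical Python mirror of the settings with the same default values. `ReflectiveConfig` and `from_mapping` are illustrative names, and the range checks are assumptions about sensible values, not rules CodexOpt is known to enforce.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ReflectiveConfig:
    """Hypothetical mirror of the `reflective` section; defaults copied from above."""

    optimizer_model: Optional[str] = None
    judge_model: Optional[str] = None
    reward_mode: str = "tiered"
    minibatch_size: int = 3
    max_iterations: int = 6
    edit_budget: int = 12
    valset_ratio: float = 0.34
    max_rollouts: int = 60
    seed: int = 0
    codex_binary: str = "codex"

    def __post_init__(self) -> None:
        # Assumed sanity checks: the ratio is a proper fraction, counts are positive.
        if not 0.0 < self.valset_ratio < 1.0:
            raise ValueError("valset_ratio must be between 0 and 1")
        for name in ("minibatch_size", "max_iterations", "edit_budget", "max_rollouts"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be at least 1")

    @classmethod
    def from_mapping(cls, config: Mapping[str, Any]) -> "ReflectiveConfig":
        """Build from an already-parsed config mapping; unknown keys raise TypeError."""
        return cls(**dict(config.get("reflective") or {}))
```

For example, `ReflectiveConfig.from_mapping({"reflective": {"judge_model": "codex"}})` keeps every other default unchanged.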

Useful model values:

  • null: disable this role
  • codex: use codex exec
  • openai/<model>: use an OpenAI-compatible model
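The three value forms can be told apart with a small parser. This is only a sketch: `parse_model_value` and the returned `(backend, model)` tuple are hypothetical, and `example-model` in the usage note is a placeholder, not a real model name.

```python
from typing import Optional, Tuple


def parse_model_value(value: Optional[str]) -> Tuple[str, Optional[str]]:
    """Map a configured model value to a (backend, model name) pair."""
    if value is None:
        return ("disabled", None)  # null: the role is turned off
    if value == "codex":
        return ("codex", None)     # codex: delegate to `codex exec`
    prefix = "openai/"
    if value.startswith(prefix) and len(value) > len(prefix):
        return ("openai", value[len(prefix):])  # openai/<model>
    raise ValueError(f"unrecognized model value: {value!r}")
```

With this sketch, `parse_model_value("openai/example-model")` returns `("openai", "example-model")`, and any other string raises `ValueError`.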

Safety Defaults

  • codexopt improve previews by default.
  • --apply writes files through the backup path.
  • max_rollouts caps live Codex and verifier executions.
  • edit_budget limits how much a single mutation can change.
  • The validation gate keeps the original file when the candidate does not improve the validation score.
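Two of these limits are easy to picture in code. The sketch below assumes that `edit_budget` counts changed lines, which this page does not state, and uses hypothetical names (`changed_lines`, `within_edit_budget`, `RolloutBudget`) throughout.

```python
import difflib


def changed_lines(original: str, candidate: str) -> int:
    """Count lines touched by a rewrite (assumed unit for edit_budget)."""
    a, b = original.splitlines(), candidate.splitlines()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    total = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A replaced region counts as the larger of its two sides.
            total += max(i2 - i1, j2 - j1)
    return total


def within_edit_budget(original: str, candidate: str, edit_budget: int) -> bool:
    """A rewrite passes when it does not exceed the budget."""
    return changed_lines(original, candidate) <= edit_budget


class RolloutBudget:
    """Counter that refuses further executions once max_rollouts is spent."""

    def __init__(self, max_rollouts: int) -> None:
        self.remaining = max_rollouts

    def spend(self) -> bool:
        if self.remaining <= 0:
            return False
        self.remaining -= 1
        return True
```

A caller would check `budget.spend()` before each live Codex or verifier execution and stop the loop once it returns `False`.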

Relation To SkillOpt And GEPA

CodexOpt implements the practical pieces Codex users need:

  • SkillOpt-style train and validation discipline
  • bounded edits
  • held-out validation acceptance
  • GEPA-style textual feedback
  • reflective mutation from rollout feedback
  • Pareto-style parent selection across validation tasks
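As an illustration of the last item, the sketch below picks a parent by counting how many validation tasks each candidate wins or ties, then sampling in proportion to that count. It is a simplified reading of Pareto-style selection, not CodexOpt's or GEPA's exact procedure, and `pareto_parent_weights` and `select_parent` are hypothetical names.

```python
import random
from typing import Dict, List


def pareto_parent_weights(scores: Dict[str, List[float]]) -> Dict[str, int]:
    """For each validation task, credit every candidate that achieves the best score."""
    names = list(scores)
    n_tasks = len(next(iter(scores.values())))
    wins = {name: 0 for name in names}
    for t in range(n_tasks):
        best = max(scores[name][t] for name in names)
        for name in names:
            if scores[name][t] == best:
                wins[name] += 1
    # Candidates that are never best on any task are dropped from the pool.
    return {name: count for name, count in wins.items() if count > 0}


def select_parent(scores: Dict[str, List[float]], rng: random.Random) -> str:
    """Sample a parent, weighted by how many tasks it wins."""
    weights = pareto_parent_weights(scores)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
```

Compared with always mutating the candidate with the highest average score, this keeps specialists that win on a few tasks in the pool, so their strengths can still be built on.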

The implementation is in-house and does not depend on the gepa package.