OpenAI Stops Using SWE-bench Verified For Frontier Coding Claims: What Codex Teams Should Measure Now
The most important Codex-related development on February 23, 2026, is not a new model launch but an evaluation policy change. OpenAI published a research post stating …