Will any system score at least 78.0% on SWE-bench Verified’s official public leaderboard before May 1, 2026?
Prediction market on Metaculus. SWE-bench Verified is a human-filtered subset of 500 real GitHub issues. OpenAI’s Aug. 2024 post said top-scoring agents were at 20% on SWE-bench as of Aug. 5, 2024; SWE-bench’s own site highlighted 65% on Verified in July 2025; and the current mini-SWE-agent page says it scores >74% on SWE-bench Verified. `{"format":"bot_tournament_question","info":{"hash_id":"ccd0a5a407fd9872","sheet_id":530.0}}`
Resolves: May 1, 2026.
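To make the threshold concrete, the sketch below converts the percentages quoted above into issue counts. It assumes the leaderboard score is simply resolved issues divided by the 500 issues in SWE-bench Verified, with no special rounding; the names `TOTAL_ISSUES`, `THRESHOLD_PCT`, `REFERENCE_PCT`, and `min_resolved` are illustrative and do not come from the question or from SWE-bench.

```python
import math

TOTAL_ISSUES = 500      # size of SWE-bench Verified, per the description
THRESHOLD_PCT = 78.0    # resolution threshold from the question title
REFERENCE_PCT = 74.0    # lower bound quoted for mini-SWE-agent (">74%")


def min_resolved(pct: float, total: int = TOTAL_ISSUES) -> int:
    """Smallest number of resolved issues whose score is at least `pct` percent.

    Assumption: score = resolved / total * 100, with no leaderboard rounding.
    """
    # round() guards against floating-point noise before taking the ceiling
    return math.ceil(round(pct * total / 100, 9))


needed = min_resolved(THRESHOLD_PCT)
reference = min_resolved(REFERENCE_PCT)

print(needed, reference, needed - reference)  # prints "390 370 20"
```

Under that assumption, a system must resolve at least 390 of the 500 issues, which is about 20 more than the count implied by the quoted 74% figure.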