Prediction market on Manifold. METR's time horizon evaluation: https://metr.org/time-horizons/ Some existing multi-agent systems: GPT-5.2 Pro, Grok 4 Heavy, Gemini 3 Deep Think. This market does not count "regular" models that are able to spawn subagents. For example, if the reported evaluated model is Claude Opus 4.6, but the evaluation was run within Claude Code, where Claude Opus 4.6 could spawn Claude Sonnet 4.6 subagents, that does not count for the purposes of this market.
24h Volume: $300. Liquidity: $1,000. Resolves: 7/31/2026.