Is scale unnecessary for intelligence (<10B param human-competitive STEM model before 2030)?
Prediction market on Manifold. Resolves YES if, before 2030, a neural net with <10B parameters achieves all of: >75% on GPQA, >80% on SWE-bench Verified, and >95% on MATH. Arbitrary scaffolding is allowed (retrieval over a fixed DB is OK); no talking with other AIs; no internet access. We'll use whatever tools are available at the time to determine whether such an AI memorized the answers to these datasets; if verbatim memorization obviously happened, the model will be disqualified. Edit: we'll allow up to 1 minute of time per question. Possible clarification from creator (AI generated): the model must complete each question within 1 minute of wall-clock time.
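The criteria above can be summarized as a simple predicate. This is a hypothetical illustration only: the thresholds and limits are taken from the market text, but the function name and the score-reporting format are assumptions, not an official resolution procedure.

```python
# Hypothetical resolution check for this market. Thresholds mirror the
# market text; scores are assumed to be reported as fractions in [0, 1].
THRESHOLDS = {
    "GPQA": 0.75,                 # must exceed 75%
    "SWE-bench Verified": 0.80,   # must exceed 80%
    "MATH": 0.95,                 # must exceed 95%
}
MAX_PARAMS = 10_000_000_000       # strictly fewer than 10B parameters
MAX_SECONDS_PER_Q = 60            # 1 minute wall-clock per question

def resolves_yes(param_count, scores, max_seconds_per_question, memorized):
    """Return True iff a candidate model satisfies every criterion."""
    if memorized or param_count >= MAX_PARAMS:
        return False
    if max_seconds_per_question > MAX_SECONDS_PER_Q:
        return False
    # Strict inequality, matching the ">" in the market description.
    return all(scores.get(b, 0.0) > t for b, t in THRESHOLDS.items())

# Example: a 7B model that clears every bar within the time limit
print(resolves_yes(
    7_000_000_000,
    {"GPQA": 0.78, "SWE-bench Verified": 0.83, "MATH": 0.96},
    max_seconds_per_question=55,
    memorized=False,
))  # → True
```

Note that under this reading, a score exactly at a threshold (e.g. 75.0% on GPQA) would not count, and a model flagged for verbatim memorization fails regardless of its scores.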
Liquidity: $1,000. Resolves: 1/2/2030.