Model breakdown
GPT / OpenAI: openai/o3

Profile: o3, OpenAI
Release: 2025-04-16, available on OpenRouter
Specs: Not publicly disclosed; 200,000 tokens
Capabilities: Image + Text + File; complex reasoning across text, code, and images
Training: June 01, 2024; OpenAI

Rank: #57
Alignment score: -1121.1
Crowd match: 72.5%
Mean gap: 27.5%
Human match: 72.5%
Best fit: Bacon Lettuce Tomato
Average vote: 71.9% model yes, 62.8% human yes
Workload: 3.1K evals, 154 iterations, 2.9M tokens

Photo-by-photo

Model Results

A breakdown of how closely the model's answer on each photo matched the human vote.

| Photo | Vote split (openai/o3) | Human response | Absolute gap | Read |
|---|---|---|---|---|
| (no label) | 100.0% yes, 0.0% no | People mostly said yes | 3.7% | Leans yes |
| Photo 02, Dodge Van | 0.0% yes, 100.0% no | People mostly said no | 7.0% | Leans no |
| Photo 03, Sub Sandwich | 100.0% yes, 0.0% no | People mostly said yes | 5.5% | Leans yes |
| (no label) | 0.0% yes, 100.0% no | Human knife-edge | 40.9% | Leans no |
| (no label) | 100.0% yes, 0.0% no | People mostly said yes | 4.4% | Leans yes |
| (no label) | 100.0% yes, 0.0% no | People mostly said yes | 8.3% | Leans yes |
| (no label) | 13.6% yes, 86.4% no | Human knife-edge | 40.6% | Leans no |
| Photo 08, Hamburger | 100.0% yes, 0.0% no | Split concept | 27.0% | Leans yes |
| (no label) | 96.1% yes, 3.9% no | Human knife-edge | 36.7% | Leans yes |
| Photo 10, Hot Dog | 90.3% yes, 9.7% no | Split concept | 50.4% | Leans yes |
| (no label) | 15.6% yes, 84.4% no | Split concept | 50.0% | Leans no |
| Photo 12, Avocado Tea | 100.0% yes, 0.0% no | People mostly said yes | 7.2% | Leans yes |
| Photo 13, Panini | 100.0% yes, 0.0% no | People mostly said yes | 7.6% | Leans yes |
| Photo 14, Cookie PB | 90.3% yes, 9.7% no | Human knife-edge | 38.7% | Leans yes |
| Photo 15, Chicken Wrap | 65.6% yes, 34.4% no | Split concept | 43.0% | Leans yes |
| (no label) | 100.0% yes, 0.0% no | Split concept | 33.7% | Leans yes |
| Photo 17, Sloppy Joe | 100.0% yes, 0.0% no | Split concept | 20.6% | Leans yes |
| (no label) | 58.4% yes, 41.6% no | Split concept | 28.6% | Leans yes |
| (no label) | 10.4% yes, 89.6% no | Human knife-edge | 45.3% | Leans no |
| Photo 20, Bagel PB&J | 98.0% yes, 1.9% no | Human knife-edge | 51.4% | Leans yes |
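
The page does not say how its summary figures are derived, but the numbers are consistent with a simple reading: the 20 per-photo absolute gaps average to the reported mean gap, and the crowd (human) match is 100% minus that mean. The sketch below checks this in Python. The function names `absolute_gap` and `summarize` are hypothetical, and the gap definition (distance between the model's and the humans' yes-rates) is an assumption, not something the page confirms.

```python
from statistics import mean

# Absolute gaps (percentage points) for the 20 photos, in page order.
GAPS = [3.7, 7.0, 5.5, 40.9, 4.4, 8.3, 40.6, 27.0, 36.7, 50.4,
        50.0, 7.2, 7.6, 38.7, 43.0, 33.7, 20.6, 28.6, 45.3, 51.4]


def absolute_gap(model_yes: float, human_yes: float) -> float:
    """Assumed definition: distance between model and human yes-rates."""
    return abs(model_yes - human_yes)


def summarize(gaps: list[float]) -> tuple[float, float]:
    """Return (mean gap, match), assuming match = 100 - mean gap."""
    mean_gap = mean(gaps)
    return round(mean_gap, 1), round(100.0 - mean_gap, 1)


mean_gap, human_match = summarize(GAPS)
print(f"mean gap: {mean_gap}%, human match: {human_match}%")
```

Under these assumptions the result agrees with the reported 27.5% mean gap and 72.5% crowd match. The per-photo human yes-rates are not listed on the page, so `absolute_gap` is shown only to make the assumed definition explicit.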
openai/o3 Sandwich Benchmark Breakdown | opensandwich.ai