Model breakdown
Model: GPT / OpenAI, openai/gpt-5.1-chat

Profile: OpenAI: GPT-5.1 Chat (OpenAI)
Release: Not provided by OpenRouter; available on OpenRouter
Specs: Not provided by OpenRouter; 128,000 tokens
Capabilities: File + Image + Text; not provided by OpenRouter
Training: Not provided by OpenRouter; OpenAI

Rank: #8
Alignment score: -537.6
Crowd match: 80.0%
Mean gap: 20.0%
Human match: 80.0%
Best fit: Bacon Lettuce Tomato
Average vote: 68.4% model yes, 62.8% human yes
Workload: 2K evals, 100 iterations, 1.5M tokens
Photo-by-photo

Model Results

A breakdown of how closely the model's answer to each question matched the human responses.
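The per-photo figures can be read with a small sketch like the one below. It assumes the gap is the absolute difference, in percentage points, between the model's yes-share and the human yes-share, and that the read label follows from which side of 50% the model's yes-share falls. Both rules are assumptions inferred from the reported rows, not documented definitions, and the function names are hypothetical.

```python
def absolute_gap(model_yes: float, human_yes: float) -> float:
    """Absolute difference between two yes-shares, in percentage points.

    Assumption: this matches the page's "absolute gap" column.
    """
    return abs(model_yes - human_yes)


def model_read(model_yes: float) -> str:
    """Label a model yes-share the way the Read column appears to.

    Assumption: the threshold is 50%, inferred from rows such as
    31.0% -> "Leans no", 78.0% -> "Leans yes", 50.0% -> "Exact tie".
    """
    if model_yes > 50.0:
        return "Leans yes"
    if model_yes < 50.0:
        return "Leans no"
    return "Exact tie"


# Dodge Van: the model voted 0.0% yes. A human yes-share of 7.0% is the
# value implied by the reported 7.0% gap under the assumption above;
# the page itself does not list per-photo human shares.
print(absolute_gap(0.0, 7.0), model_read(0.0))
```

Under these assumptions the Dodge Van row comes out as a 7.0-point gap with a "Leans no" read, matching the reported values.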

Model: GPT / OpenAI, openai/gpt-5.1-chat

| Photo | Vote split | Human response | Gap (absolute) | Read |
|---|---|---|---|---|
| Photo 01 | 100.0% yes / 0.0% no | People mostly said yes | 3.7% | Leans yes |
| Photo 02 Dodge Van | 0.0% yes / 100.0% no | People mostly said no | 7.0% | Leans no |
| Photo 03 Sub Sandwich | 100.0% yes / 0.0% no | People mostly said yes | 5.5% | Leans yes |
| Photo 04 | 0.0% yes / 100.0% no | Human knife-edge | 40.9% | Leans no |
| Photo 05 | 100.0% yes / 0.0% no | People mostly said yes | 4.4% | Leans yes |
| Photo 06 | 100.0% yes / 0.0% no | People mostly said yes | 8.3% | Leans yes |
| Photo 07 | 0.0% yes / 100.0% no | Human knife-edge | 54.2% | Leans no |
| Photo 08 Hamburger | 100.0% yes / 0.0% no | Split concept | 27.0% | Leans yes |
| Photo 09 | 100.0% yes / 0.0% no | Human knife-edge | 40.6% | Leans yes |
| Photo 10 Hot Dog | 31.0% yes / 69.0% no | Split concept | 8.8% | Leans no |
| Photo 11 | 78.0% yes / 22.0% no | Split concept | 12.4% | Leans yes |
| Photo 12 Avocado Tea | 100.0% yes / 0.0% no | People mostly said yes | 7.2% | Leans yes |
| Photo 13 Panini | 100.0% yes / 0.0% no | People mostly said yes | 7.6% | Leans yes |
| Photo 14 Cookie PB | 94.0% yes / 6.0% no | Human knife-edge | 42.5% | Leans yes |
| Photo 15 Chicken Wrap | 6.0% yes / 94.0% no | Split concept | 16.6% | Leans no |
| Photo 16 | 100.0% yes / 0.0% no | Split concept | 33.7% | Leans yes |
| Photo 17 Sloppy Joe | 100.0% yes / 0.0% no | Split concept | 20.6% | Leans yes |
| Photo 18 | 19.0% yes / 81.0% no | Split concept | 10.8% | Leans no |
| Photo 19 | 50.0% yes / 50.0% no | Human knife-edge | 5.7% | Exact tie |
| Photo 20 Bagel PB&J | 90.0% yes / 10.0% no | Human knife-edge | 43.4% | Leans yes |
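
The headline figures can be cross-checked against the per-photo rows with a short sketch like this one. It assumes that "Mean gap" and "model yes" are plain unweighted averages over the 20 photos, and that each photo is evaluated once per iteration. Neither is stated on the page. Photo numbers for unnamed rows are taken from row position.

```python
# (photo number, model yes-share %, absolute gap %), copied from the rows above.
# Photo numbers for rows without a name are assumed from their position.
ROWS = [
    (1, 100.0, 3.7), (2, 0.0, 7.0), (3, 100.0, 5.5), (4, 0.0, 40.9),
    (5, 100.0, 4.4), (6, 100.0, 8.3), (7, 0.0, 54.2), (8, 100.0, 27.0),
    (9, 100.0, 40.6), (10, 31.0, 8.8), (11, 78.0, 12.4), (12, 100.0, 7.2),
    (13, 100.0, 7.6), (14, 94.0, 42.5), (15, 6.0, 16.6), (16, 100.0, 33.7),
    (17, 100.0, 20.6), (18, 19.0, 10.8), (19, 50.0, 5.7), (20, 90.0, 43.4),
]
ITERATIONS = 100  # from the Workload line

# Assumption: both summary figures are unweighted means over all photos.
mean_gap = round(sum(gap for _, _, gap in ROWS) / len(ROWS), 1)
mean_model_yes = round(sum(yes for _, yes, _ in ROWS) / len(ROWS), 1)

# Assumption: one evaluation per photo per iteration.
evals = len(ROWS) * ITERATIONS

print(mean_gap, mean_model_yes, evals)  # prints: 20.0 68.4 2000
```

Under these assumptions the rows reproduce the reported 20.0% mean gap, 68.4% model yes, and 2K evals. The 62.8% human yes figure cannot be checked the same way because the page does not list per-photo human shares.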
openai/gpt-5.1-chat Sandwich Benchmark Breakdown | opensandwich.ai