Rank: #3
Alignment score: -234.5
Crowd match: 84.1%
A breakdown of how closely the model's answer to each question matched the human responses.

| Photo | Vote split (meta-llama/llama-3.2-11b-vision-instruct) | Absolute gap | Read |
|---|---|---|---|
| Photo 01: Bacon Lettuce Tomato | 90.0% yes, 10.0% no | 6.3% | Leans yes; people mostly said yes |
| Photo 02: Dodge Van | 11.0% yes, 89.0% no | 4.0% | Leans no; people mostly said no |
| Photo 03: Sub Sandwich | 85.0% yes, 15.0% no | 9.5% | Leans yes; people mostly said yes |
| Photo 04: Sandwich Costume | 38.0% yes, 62.0% no | 2.9% | Leans no; human knife-edge |
| Photo 05: Grilled Cheese | 89.0% yes, 11.0% no | 6.6% | Leans yes; people mostly said yes |
| Photo 06: Grilled Cheese Pineapple | 79.0% yes, 21.0% no | 12.7% | Leans yes; people mostly said yes |
| Photo 07: Kitten in Bread | 27.0% yes, 73.0% no | 27.2% | Leans no; human knife-edge |
| Photo 08: Hamburger | 85.0% yes, 15.0% no | 12.0% | Leans yes; split concept |
| Photo 09: Hashbrown Sandwich | 79.0% yes, 21.0% no | 19.6% | Leans yes; human knife-edge |
| Photo 10: Hot Dog | 42.0% yes, 58.0% no | 2.2% | Leans no; split concept |
| Photo 11: Pickle Sandwich | 95.0% yes, 5.0% no | 29.4% | Leans yes; split concept |
| Photo 12: Avocado Tea | 95.0% yes, 5.0% no | 2.2% | Leans yes; people mostly said yes |
| Photo 13: Panini | 78.0% yes, 22.0% no | 14.4% | Leans yes; people mostly said yes |
| Photo 14: Cookie PB | 7.0% yes, 93.0% no | 44.5% | Leans no; human knife-edge |
| Photo 15: Chicken Wrap | 52.0% yes, 48.0% no | 29.4% | Leans yes; split concept |
| Photo 16: Waffle Ice Cream | 29.0% yes, 71.0% no | 37.3% | Leans no; split concept |
| Photo 17: Sloppy Joe | 77.0% yes, 23.0% no | 2.4% | Leans yes; split concept |
| Photo 18: Cigarette Sandwich | 25.0% yes, 75.0% no | 4.8% | Leans no; split concept |
| Photo 19: KFC Double Down | 47.0% yes, 53.0% no | 8.7% | Leans no; human knife-edge |
| Photo 20: Bagel PB&J | 89.0% yes, 11.0% no | 42.4% | Leans yes; human knife-edge |
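
The page does not say how the crowd match figure is derived from the per-photo gaps. As a rough cross-check, the sketch below averages the 20 absolute gaps listed in the table and subtracts that mean from 100. Treating crowd match as "100 minus the mean absolute gap" is an assumption made here for illustration, and the function names are hypothetical; under that assumption the result happens to line up with the 84.1% reported at the top.

```python
# Absolute gaps in percentage points, copied from the table (Photo 01 to Photo 20).
gaps = [
    6.3, 4.0, 9.5, 2.9, 6.6, 12.7, 27.2, 12.0, 19.6, 2.2,
    29.4, 2.2, 14.4, 44.5, 29.4, 37.3, 2.4, 4.8, 8.7, 42.4,
]


def mean_absolute_gap(values):
    """Average of the per-photo absolute gaps, in percentage points."""
    return sum(values) / len(values)


def crowd_match_estimate(values):
    """ASSUMPTION (not stated on the page): crowd match = 100 - mean absolute gap."""
    return 100.0 - mean_absolute_gap(values)


if __name__ == "__main__":
    print(f"mean absolute gap: {mean_absolute_gap(gaps):.1f}")
    print(f"estimated crowd match: {crowd_match_estimate(gaps):.1f}%")
```

The same averaging makes it easy to see which photos pull the score down: Cookie PB (44.5%), Bagel PB&J (42.4%), and Waffle Ice Cream (37.3%) account for well over a third of the total gap.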