Chollet's ARC-AGI-3: Humans Ace It, GPT-5.5 Scores 0.43%
François Chollet left Google to build ARC-AGI-3, a reasoning benchmark on which humans score 100% while frontier LLMs such as GPT-5.5 manage just 0.43%. The result exposes the gap between pattern matching and true intelligence.