What AI can actually do.
Not what the pitch says.
Objective benchmarks, first-hand experiments, and honest analysis across all major AI systems. Built for the technically curious — and for anyone making real decisions about AI.
Latest
No posts yet
Articles and experiments will appear here once published.
Recent Scores
GPT-4 Turbo: MMLU 86.5% ✓
Claude 3.5 Sonnet: MMLU 88.7% ✓
Claude 3 Opus: MMLU 86.8% ✓
Gemini 1.5 Pro: MMLU 85.9% ✓
GPT-4o: MMLU 88.7% ✓