Independent AI Analysis

What AI can actually do.
Not what the pitch says.

Objective benchmarks, first-hand experiments, and honest analysis across all major AI systems. Built for the technically curious — and for anyone making real decisions about AI.

Recent Scores

Model              Benchmark  Score
GPT-4 Turbo        MMLU       86.5%
Claude 3.5 Sonnet  MMLU       88.7%
Claude 3 Opus      MMLU       86.8%
Gemini 1.5 Pro     MMLU       85.9%
GPT-4o             MMLU       88.7%

About this project

The gap between AI hype and AI reality depends heavily on where you start. For someone outside tech, demos can feel like going from 0→100. For engineers, it's often 10→15.

This site exists to quantify that gap — with real numbers and documented experiments anyone can scrutinize.
