"claude" articles — LLMTest Blog

Claude Fable 5 review: 5-3 over Opus 4.8, GPT-5.5 timed out

Claude Fable 5 review with real benchmark data: 5-3 over Opus 4.8, 3-0 vs GPT-5.5 on 12 coding and reasoning prompts. Includes subscription break-even math.

Jun 10, 2026 · 7 min read · claude, hot, benchmarks

Claude Opus 4.8 review: 8-0 over GPT-5.5, near-split with Opus 4.7

We ran 12 coding, math, and data tasks through Opus 4.8, Opus 4.7, and GPT-5.5 via LLMTest. Opus 4.8 swept GPT-5.5 but split with its predecessor.

May 29, 2026 · 8 min read · hot, claude, benchmarks

Claude Sonnet 4.5 vs GPT-5 in 2026: 8/15 wins, 1.7x faster

We ran 20 real prompts through Claude Sonnet 4.5 and GPT-5. Claude won 8 of 15 comparisons, ran 1.7x faster, and GPT-5 timed out on 5 of 20.

May 6, 2026 · 6 min read · h2h, claude, openai

Claude Opus 4.7 vs GPT-5.5 for coding in 2026: Claude wins

We ran 15 real coding tasks through Claude Opus 4.7 and GPT-5.5 via LLMTest. Claude won 10, GPT-5.5 won 2, 3 ties. Full outputs and verdict inside.

May 4, 2026 · 8 min read · h2h, benchmarks, claude

Claude Opus 4.7: genuine coding gains, hidden cost sting

Opus 4.7 scores higher on coding benchmarks and adds 3.75MP vision, but its new tokenizer inflates real cost by up to 35%. Here's what changed.

Apr 21, 2026 · 5 min read · model-release, claude, cost