Insights

Deep dives and practical guides on LLM performance, pricing changes, and new model comparisons.(51 posts)

Stay in the loop

Reviewed LLM analysis when a new edition is ready. No spam.

Anthropic's Fable 5 shutdown exposes AI governance's next fault line

A critical look at the U.S. directive suspending Claude Fable 5 and Mythos 5, and what it reveals about export control, national security, and corporate control of frontier AI.

Jun 13, 2026anthropic, claude-fable-5, ai-governance, export-control, ai-policy

Which LLM for low-latency real-time applications in June 2026?

A prescriptive guide to choosing LLMs for real-time workloads where inference latency and tokens per second dominate the user experience.

Jun 12, 2026low-latency, real-time, inference, model-selection

Which LLM for coding in June 2026?

A prescriptive guide to picking a coding LLM in June 2026, comparing GPT-5.3-Codex, Qwen3.7 Max, and Claude Opus 4.8 on cost, speed, and quality.

Jun 12, 2026coding, llm-comparison, developer-tools

Claude Fable 5 sits at 64.9 quality and $20/M. Is the top score worth double the price?

Claude Fable 5 leads quality at 64.9 but costs $20/M tokens. I break down when that premium pays off and when Opus 4.8 or Gemini 3.1 Pro win.

Jun 10, 2026claude-fable-5, model-comparison, pricing-analysis, anthropic

GPT-5.5 launches at 60.2 quality but Opus 4.8 keeps the crown

OpenAI's GPT-5.5 lands second on quality while costing 12% more than Claude Opus 4.8. Gemini 3.1 Pro still wins on price-per-quality.

Jun 8, 2026gpt-5.5, claude-opus-4.8, gemini-3.1-pro, llm-pricing, model-comparison

Claude Opus 4.7 versus Gemini 3.5 Flash: paying triple for 2.5 quality points

Claude Opus 4.7 costs $10/M and scores 57.3. Gemini 3.5 Flash medium costs $3.38/M and scores 54.8. I worked out when the gap is worth it.

Jun 4, 2026model-comparison, cost-analysis, anthropic, google

Claude Opus 4.8 takes the quality lead as Gemini 3.1 Pro undercuts it by 55%

Claude Opus 4.8 tops quality at 61.4 but costs $10/M. Gemini 3.1 Pro hits 57.2 at $4.50. Here's where the price-performance line actually sits this week.

Jun 1, 2026claude-opus-4-8, gemini-3-1-pro, gpt-5-5, weekly-briefing

Which LLM for long-context document processing in May 2026?

A prescriptive guide to picking an LLM for 100K+ token document workloads, weighing throughput, quality, and price per million tokens.

Jun 1, 2026long-context, document-processing, llm-selection, inference

GPT-5.4 is the model OpenAI doesn't want you to notice

GPT-5.4 scores 56.8 quality at $5.63/M tokens and 90 tok/s, quietly outperforming its pricier siblings on value.

May 27, 2026gpt-5-4, openai, value-analysis, model-comparison

Kimi K2.6 and Grok 4.3 undercut the field below $1.60 while GPT-5.5 stays expensive at the top

Weekly LLM briefing: Kimi K2.6 hits 53.9 quality at $1.42/M tokens, Grok 4.3 delivers 133 tok/s at $1.56, and the budget tier closes in on mid-range models.

May 25, 2026weekly-briefing, kimi-k2-6, grok-4-3, qwen3-7-max, budget-llms

Which LLM for coding and software development in May 2026?

Practical guide to choosing the best LLM for coding workloads in May 2026, with benchmarks, pricing, and clear recommendations by use case.

May 23, 2026coding, software-development, llm-comparison, guide

The $3.38 model that outputs at 227 tokens per second and still scores 55.3

Gemini 3.5 Flash delivers 227 tok/s at $3.38/M tokens with a 55.3 quality score. I compared it against GPT-5.5 variants and MiMo-V2.5-Pro.

May 23, 2026gemini-3-5-flash, gpt-5-5, cost-efficiency, inference-speed, model-comparison