We Compare AI
Live — updated daily

What's Changed in AI

Every AI model release, pricing update, benchmark result, and score change we track — in chronological order. Monitored by our autonomous agents and verified by our team.

Types:PricingNew ModelBenchmarkFeatureScoreImpact:highmediumlow
3 updates
New ModelGPT-4.1 & GPT-4.1 Mini· OpenAI
high impact

GPT-4.1 and GPT-4.1 Mini launched with 1M context window

OpenAI released GPT-4.1 with a 1M token context window, improved instruction following, and new pricing: $2.00/1M input, $8.00/1M output. GPT-4.1 Mini at $0.40/$1.60.

BenchmarkGemini 2.5 Pro· Google
high impact

Gemini 2.5 Pro tops coding benchmarks — scores updated

Gemini 2.5 Pro achieved new highs on HumanEval and SWE-bench. Our performance score updated from 8.5 → 8.8.

PricingGemini 2.5 Flash· Google
medium impact

Gemini 2.5 Flash pricing confirmed at $0.075/1M input tokens

Google confirmed stable pricing for Gemini 2.5 Flash — the cheapest frontier model per token. No changes from March pricing.

1 update
New ModelGrok 3· xAI
high impact

Grok 3 released with real-time web access

xAI launched Grok 3 with native web search, improved reasoning, and a 131K context window. API pricing: $3.00/1M input, $15.00/1M output.

1 update
ScoreDeepSeek V3· DeepSeek
medium impact

DeepSeek V3 reliability score revised after outage data

Following reported availability issues in March, we revised DeepSeek V3's reliability score from 7.0 → 6.5. Value score remains 9.5/10.

1 update
PricingClaude Sonnet 4.6· Anthropic
low impact

Claude API pricing stable — no Q2 changes announced

Anthropic confirmed no pricing changes for Q2 2026. Claude Sonnet 4.6 remains at $3.00/1M input, $15.00/1M output.

1 update
FeatureChatGPT· OpenAI
medium impact

ChatGPT memory expanded to all Plus users globally

OpenAI rolled out persistent memory for ChatGPT Plus users worldwide. Memory can be managed and cleared from settings.

1 update
New ModelMistral Large 2· Mistral AI
low impact

Mistral Large 2 update improves multilingual performance

Mistral released an update to Mistral Large 2 with improved French, German, Spanish, and Italian output quality. API pricing unchanged.

1 update
Pricingo3-mini· OpenAI
high impact

o3-mini pricing reduced — now $1.10/1M input tokens

OpenAI reduced o3-mini pricing by ~40% from $1.85 to $1.10/1M input tokens. Output reduced from $7.40 to $4.40/1M. Our value score updated accordingly.

1 update
BenchmarkClaude Opus 4· Anthropic
medium impact

Claude Opus 4 achieves new high on MMLU Pro benchmark

Anthropic published updated benchmark results showing Claude Opus 4 at 91.2% on MMLU Pro, ahead of GPT-4o (89.5%) and Gemini 2.5 Pro (90.1%).

Stay up to date

Subscribe to our RSS feed or bookmark this page.