What's Changed in AI
Every AI model release, pricing update, benchmark result, and score change we track — in chronological order. Monitored by our autonomous agents and verified by our team.
GPT-4.1 and GPT-4.1 Mini launched with 1M context window
OpenAI released GPT-4.1 with a 1M token context window, improved instruction following, and new pricing: $2.00/1M input, $8.00/1M output. GPT-4.1 Mini is priced at $0.40/1M input, $1.60/1M output.
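To put the per-token prices in concrete terms, here is a minimal sketch that estimates the cost of a single request. The prices are the ones listed in this entry; the model keys and the `request_cost` helper are hypothetical names chosen for illustration, not identifiers from any official API.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the entry above.
# The dictionary keys are illustrative labels, not official model IDs.
PRICES = {
    "gpt-4.1": (Decimal("2.00"), Decimal("8.00")),
    "gpt-4.1-mini": (Decimal("0.40"), Decimal("1.60")),
}

MILLION = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request at the listed prices."""
    input_price, output_price = PRICES[model]
    return (input_price * input_tokens + output_price * output_tokens) / MILLION


# Example: a request with 10,000 input tokens and 1,000 output tokens.
full = request_cost("gpt-4.1", 10_000, 1_000)
mini = request_cost("gpt-4.1-mini", 10_000, 1_000)
print(f"GPT-4.1: ${full}  GPT-4.1 Mini: ${mini}")
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when summed over many requests.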
Gemini 2.5 Pro tops coding benchmarks — scores updated
Gemini 2.5 Pro achieved new highs on HumanEval and SWE-bench. We raised our performance score from 8.5 to 8.8.
Gemini 2.5 Flash pricing confirmed at $0.075/1M input tokens
Google confirmed stable pricing for Gemini 2.5 Flash — the cheapest frontier model per token. No changes from March pricing.
Grok 3 released with real-time web access
xAI launched Grok 3 with native web search, improved reasoning, and a 131K context window. API pricing: $3.00/1M input, $15.00/1M output.
DeepSeek V3 reliability score revised after outage data
Following reported availability issues in March, we revised DeepSeek V3's reliability score from 7.0 to 6.5. Its value score remains 9.5/10.
Claude API pricing stable — no Q2 changes announced
Anthropic confirmed no pricing changes for Q2 2026. Claude Sonnet 4.6 remains at $3.00/1M input, $15.00/1M output.
ChatGPT memory expanded to all Plus users globally
OpenAI rolled out persistent memory for ChatGPT Plus users worldwide. Memory can be managed and cleared from settings.
Mistral Large 2 update improves multilingual performance
Mistral released an update to Mistral Large 2 with improved French, German, Spanish, and Italian output quality. API pricing unchanged.
o3-mini pricing reduced — now $1.10/1M input tokens
OpenAI reduced o3-mini pricing by ~40%: input dropped from $1.85 to $1.10/1M tokens, and output dropped from $7.40 to $4.40/1M tokens. Our value score was updated accordingly.
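The ~40% figure can be checked directly from the old and new prices in this entry. The sketch below is only that arithmetic; `percent_reduction` is a hypothetical helper name.

```python
def percent_reduction(old: float, new: float) -> float:
    """Return the price cut as a percentage of the old price."""
    return (old - new) / old * 100


# Old and new o3-mini prices in USD per 1M tokens, from the entry above.
input_cut = percent_reduction(1.85, 1.10)
output_cut = percent_reduction(7.40, 4.40)

print(f"input: {input_cut:.1f}%  output: {output_cut:.1f}%")
```

Both cuts come out to about 40.5%, which matches the rounded ~40% quoted for the change.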
Claude Opus 4 achieves new high on MMLU Pro benchmark
Anthropic published updated benchmark results showing Claude Opus 4 at 91.2% on MMLU Pro, ahead of GPT-4o (89.5%) and Gemini 2.5 Pro (90.1%).