Anthropic's newest flagship delivers unprecedented multi-step reasoning and native computer-use capabilities that outperform prior models by wide margins.
The latest reasoning model sets new records across GPQA, SWE-bench, and mathematical olympiads while slashing latency.
A new family of agents trained end-to-end for real-world physical tasks demonstrates breakthrough generalization across robot morphologies.
A system deployed by a national lab autonomously designed, executed, and interpreted experiments that led to a new high-density cathode material.
The community-driven agent runtime reaches a major milestone with production-grade reliability, observability, and multi-model routing.
A landmark paper from Stanford and FAIR shows that agents which iteratively refine their own strategies through online experience close the gap to frontier models faster.
The company releases training details, weights, and a full stack for the base model — the most powerful openly available model yet.
A new study shows structured agent debate and critique ensembles achieve state-of-the-art results on frontier benchmarks without increasing model size.
The new Llama release targets the enterprise with enormous context windows and first-class support for calling hundreds of tools in parallel.