News

Anthropic Unveils Claude Cowork to Rival OpenClaw; OpenAI Releases GPT-5.4 Mini/Nano for Coding-Optimized AI Agents

Anthropic has launched "Claude Cowork," widely regarded as its strategic answer to OpenAI's "OpenClaw." Prominent figures like SimonW and Ethan Mollick have favorably compared Claude Cowork to OpenClaw, echoing Jensen's earlier assertion that every company requires an "OpenClaw strategy." Having previously "fumbled" its relationship with Clawdbot, Anthropic now presents what is described as a "pretty pretty good" solution.

Claude Cowork's development weighed technical choices such as sandboxing and the use of Electron; related coverage goes deeper into its origin story, use cases, and design thinking. Remote control functionality is not yet available but is expected to ship soon.

OpenAI Rolls Out GPT-5.4 Mini/Nano, Shifting Towards Small, Coding-Optimized Models

OpenAI has simultaneously rolled out its GPT-5.4 mini and nano models across its API, ChatGPT, and Codex platforms, positioning them as its most capable small models to date. According to @OpenAIDevs, GPT-5.4 mini is more than twice as fast as GPT-5 mini, designed for coding, computer use, multimodal understanding, and subagents, and offers a 400k context window in the API.
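For orientation, a minimal request against the new small model through the OpenAI Python SDK might look like the sketch below; the model identifier `gpt-5.4-mini` is assumed from the naming above and may not match the exact API string.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# "gpt-5.4-mini" is a hypothetical model id inferred from the announcement;
# the shipped identifier (or a dated snapshot) may differ.
response = client.responses.create(
    model="gpt-5.4-mini",
    input="Write a Python function that deduplicates a list while preserving order.",
)

print(response.output_text)
```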

OpenAI claims that mini approaches the performance of larger GPT-5.4 models on benchmarks like SWE-Bench Pro and OSWorld-Verified, while consuming only 30% of the GPT-5.4 Codex quota. This makes it the new default for numerous background coding workflows and subagent fan-out scenarios.

Initial reception highlighted the models' coding value, but also raised concerns about pricing and truthfulness tradeoffs. Developers noted mini's utility for subagents in Codex, computer-use workloads, and external products such as Windsurf. However, commentary also converged on OpenAI's familiar pattern of improved performance coupled with higher costs. Posts from @scaling01 indicated prices of $0.75/M input and $4.5/M output for mini, with nano similarly priced above previous nano tiers.
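At those quoted rates, per-request cost is easy to estimate. The back-of-the-envelope sketch below uses illustrative token counts, not measured usage.

```python
# Rough cost of one GPT-5.4 mini request at the rates reported by @scaling01.
INPUT_PRICE_PER_M = 0.75   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 4.50  # USD per 1M output tokens

input_tokens = 20_000      # illustrative prompt + context size
output_tokens = 2_000      # illustrative completion size

cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
     + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
print(f"${cost:.4f} per request")  # -> $0.0240
```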

Third-party evaluations yielded mixed results: Mercor’s APEX-Agents reported a 24.5% Pass@1 for mini with xhigh reasoning, outperforming some lightweight and midweight competitors. Conversely, BullshitBench placed the new small models relatively low in terms of resistance to false-premise/jargon traps. OpenAI also tacitly acknowledged behavior tuning issues, with @michpokrass noting a recent 5.3 instant update reduced "annoyingly clickbait-y" behavior.

Agent Infrastructure: Sandboxes, Subagents, and the Harness Wars

Code-executing agents are becoming central to product architecture. Several new launches indicate a maturing stack focused on secure execution, orchestration, and deployment ergonomics, rather than solely on superior base models. LangChain, for instance, introduced LangSmith Sandboxes for secure ephemeral code execution.
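The LangSmith Sandboxes API itself is not shown here; as a generic illustration of the ephemeral-execution pattern such products target, the sketch below runs agent-generated code in a throwaway working directory inside a separate process with a hard timeout. This is process-level isolation only, not a hardened sandbox.

```python
# Generic illustration of the "ephemeral sandbox" pattern for agent-generated
# code; NOT the LangSmith Sandboxes API. Real sandboxes add container/VM
# isolation on top of what is shown here.
import subprocess
import sys
import tempfile


def run_untrusted(code: str, timeout_s: float = 10.0) -> str:
    with tempfile.TemporaryDirectory() as workdir:
        result = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated interpreter mode
            cwd=workdir,                         # throwaway filesystem scope
            capture_output=True,
            text=True,
            timeout=timeout_s,                   # kill runaway code
        )
    return result.stdout if result.returncode == 0 else result.stderr


print(run_untrusted("print(sum(range(10)))"))  # -> 45
```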
