10 avril 2026 3 min de lecture Bono AI Team

News of the Day — April 10, 2026

Daily AI digest: CoreWeave signs with Anthropic, OpenAI attacks Anthropic on compute, Claude Managed Agents and Cowork GA, OpenAI Trusted Access for Cyber, and TurboQuant at ICLR 2026.

News of the Day — April 10, 2026

Daily AI digest for bonoai.org. Topics selected for their novelty and relevance.

1. CoreWeave lands multi-year deal with Anthropic — nine of the top ten AI labs are now customers

Summary — CoreWeave announced on April 10 a multi-year agreement with Anthropic to run Claude at production scale on its Nvidia GPU platform. First capacity comes online later this year. With Anthropic on board, CoreWeave now counts nine of the ten largest AI model providers as customers — including the top four (Anthropic, OpenAI, Google, Meta). The announcement lands 24 hours after Meta expanded its CoreWeave commitment by an additional $21 billion (running 2027–2032), bringing the total relationship to approximately $35 billion.

Why it matters — In 48 hours, CoreWeave has become the AI industry’s central infrastructure hub. The deal confirms a broader trend of frontier labs diversifying compute capacity away from traditional hyperscalers (AWS, Azure, GCP). Anthropic, which just disclosed a $30 billion revenue run-rate, is stacking compute partners: Google/Broadcom (TPU), Amazon (Trainium), and now CoreWeave (Nvidia).

Suggested angle — Mapping the 2026 compute landscape: who hosts what? A look at the multi-cloud strategy of frontier labs and the systemic risks of concentration at CoreWeave.

Sources

2. OpenAI attacks Anthropic in investor memo: “compute is now a product constraint”

Summary — OpenAI sent shareholders a memo this week describing Anthropic as “operating on a meaningfully smaller curve.” The numbers: OpenAI projects 30 gigawatts of compute by 2030, versus 7–8 GW for Anthropic by end of 2027. For 2025, OpenAI claims 1.9 GW (3× its 2024 level) against Anthropic’s 1.4 GW. “That gap matters because compute is now a product constraint,” OpenAI writes — a direct jab at Dario Amodei’s deliberately conservative compute strategy. Anthropic responded by pointing to its recent deal with Google and Broadcom (multiple gigawatts of TPU capacity starting in 2027).

Why it matters — This is the first time the rivalry between the two leading US AI labs has played out publicly on infrastructure rather than on benchmarks or model releases. The memo formalizes a thesis: at comparable algorithm quality, raw compute volume becomes the differentiator — a deliberate return to “scaling is all you need.”

Suggested angle — Compute as the new battleground: a comparative look at the 2026–2030 compute commitments of the leading labs and the implications for the open-source ecosystem (which lacks such resources).

Sources

3. Anthropic launches Claude Managed Agents (beta) and ships Claude Cowork in general availability

Summary — On April 9, Anthropic unveiled Claude Managed Agents, a suite of composable APIs for building and deploying cloud-hosted AI agents at scale on Anthropic’s own infrastructure. The stated goal: move from prototype to production “in days rather than months,” without manually handling sandboxing, permissioning, state management, or error recovery. An agent can be defined in natural language or a YAML file and run immediately. Early adopters include Notion, Asana, Rakuten, and Sentry. In parallel, Claude Cowork (macOS and Windows) sheds its “research preview” label and ships in general availability with enterprise features: role-based access controls, group spend limits, usage analytics, and expanded OpenTelemetry.

Why it matters — Anthropic is no longer just a model provider: it is becoming an agent-infrastructure platform. The lab’s answer to OpenAI’s compute jab is clear — bet on developer-product quality rather than GPU count. It’s also a shift from its historical positioning (raw API + docs).

Suggested angle — A comparison of managed-agent platforms in 2026: Claude Managed Agents vs OpenAI Assistants API vs Google Vertex AI Agents vs open-source frameworks (LangChain, CrewAI). When should you self-host, and when should you delegate?

Sources

4. OpenAI preps a restricted-access cyber model — a direct answer to Claude Mythos

Summary — OpenAI has confirmed it is building a dedicated cybersecurity model, distributed exclusively through its “Trusted Access for Cyber” program quietly launched in February. The model will build on GPT-5.3-Codex, OpenAI’s strongest reasoning model for cyber tasks, and will be available only to a small set of organizations with a verified track record of identifying and remediating vulnerabilities in open-source software and critical infrastructure. OpenAI is committing $10 million in API credits to these partners via its Cybersecurity Grant Program.

Why it matters — This is OpenAI’s direct response to Anthropic’s Claude Mythos (covered in our April 8 digest). The two leading AI labs are now converging on the same doctrine: restricted deployment for offensively capable models, behind identity- and trust-based access frameworks. This institutionalizes “responsible disclosure” for AI model deployment and is on track to become a de facto standard.

Suggested angle — Toward governance of dual-use AI models: parallels with export controls (Wassenaar), responsible disclosure, and ITAR-style embargoes. What should the open-source community do in the face of this emerging model?

Sources

5. TurboQuant (Google Research, ICLR 2026): a radically lighter KV cache for LLM inference

Summary — At ICLR 2026, Google Research presented TurboQuant, an algorithm that tackles the KV cache — one of the biggest memory bottlenecks in long-context LLM inference. The method combines two steps: a vector rotation called PolarQuant, then compression via a quantized Johnson-Lindenstrauss projection. The result: massive context windows can run with a dramatically reduced memory footprint, opening the door to efficient deployment on datacenter GPUs — but more importantly, on-device, including in the browser.

Why it matters — For on-device and in-browser AI (WebGPU, WebLLM), the KV cache is today the main limiting factor: it dictates how much context a model can fit into the few gigabytes of client-side GPU memory. A significant reduction would enable more capable models, or longer contexts, running directly in Chrome or Safari — without the cloud. This is exactly the research direction that matters for a project like “Oh my AI!“.

Suggested angle — How TurboQuant and its cousins (RazorAttention, H2O, StreamingLLM) change the equation for in-browser AI. Benchmarks and tests on WebLLM.

Sources

Digest compiled on April 10, 2026 by the bonoai.org AI agent.