Provider Watch
from Onlook

Edition 08 · Tue Jun 2, 2026

Anthropic ships Opus 4.8 and files to go public — our whole stack, one week.

Opus 4.8 landed May 28 with more natural, better-calibrated thinking and a fast mode Anthropic says is now much cheaper inside Claude Code. Days later, Anthropic confidentially submitted a draft S-1, in the same window as a $65B raise at a $965B valuation. The model we build on and the company under it both moved at once.

Editorial illustration: a single tall figure casts one long shadow across a row of small stone markers, one marker ember-lit — one provider's week falling across the whole field

An Anthropic week, end to end — five items, ranked by Onlook impact.

  1. Claude Opus 4.8 ships, and fast mode gets much cheaper in Claude Code

    May 28 release: warmer, more natural, better-calibrated thinking effort. The one that matters for us — Anthropic says fast mode for 4.8 is now much more affordable in Claude Code, the surface we run in daily. Reception is strong with an honest incremental-vs-leap debate.

  2. Anthropic files a draft S-1 and raises at a $965B valuation

    Adjacent to our build, but big: a confidential SEC S-1 (Jun 1), a $65B Series H at $965B post-money, and run-rate revenue past $47B. The company we just consolidated onto is heading toward a public offering — vendor-stability context for the bet.

  3. Claude Code hardens: a security plugin and a reliability pass

    Insiders shipped a security-guidance plugin that identifies and helps fix vulnerabilities, plus a responsiveness/reliability push. Claude Code is in our stack, so its hardening is our hardening. Worth a look at the plugin.

  4. Anthropic's self-hosted sandboxes land, and Gemini follows the same week

    Last edition's watch-item shipped: Managed Agents now run tool execution on your own infra while the agent loop stays managed. Gemini announced the same primitive days later — managed agents is now a category move, not one vendor's feature.

  5. Next.js now ships an agent-eval harness at nextjs.org/evals

    Vercel is publicly ranking models on Next.js agent tasks. Opus and GPT-5 lead; the open model MiniMax M3 sits right behind at ~10× lower cost. A useful, neutral benchmark for anyone choosing a coding model.

Editorial illustration: a craftsman's hand re-tuning a glowing dial on a familiar instrument, the calibration mark sliding to a warmer setting — a model refined rather than replaced

The model we build on got warmer — and cheaper where we run it.

Claude Opus 4.8 shipped May 28. Anthropic's framing is refinement over reinvention: it reads nuance better, feels more natural and collaborative, and lets you calibrate thinking effort more precisely. The line that matters for us is operational — fast mode for Opus 4.8 is now much more affordable in Claude Code, the surface we use every day. That's a direct cost/UX lever, not a model-card footnote.

Reception is strong, with honest dissent. Insiders called it warmer and more capable; skeptics found it incremental in daily use while conceding the gap is real on harder tasks. A separate thread claimed a benchmark top spot; treat leaderboard claims as unconfirmed until Anthropic's own post lands. For Onlook the takeaway is concrete: we're Anthropic-primary and Claude Code is in the stack, so a cheaper fast tier changes the economics of how aggressively we lean on it.

Alex Albert
@alexalbert__ · May 28
Excited to release Opus 4.8 today — it understands nuance better and feels much more natural. Fast mode is much more affordable now. Try it in Claude Code.

What actually changed for us

  • Fast mode for 4.8 is much cheaper in Claude Code, per Anthropic. Re-evaluate where we cap usage.
  • Better-calibrated thinking effort — worth a pass on prompts that over- or under-think today.
  • Benchmark "#1" claims are circulating but unverified — wait for the canonical post before repeating.
Editorial illustration: a single sapling has become a vast load-bearing pillar overnight, scaffolding rising around it as a small figure looks up — a foundational supplier scaling into public-company size

The company under our model just filed to go public.

In the same window, the supplier itself moved. Anthropic confidentially submitted a draft S-1 to the SEC on June 1 — the first concrete step toward a public offering — days after disclosing a $65B Series H at a $965B post-money valuation and run-rate revenue past $47B. This is adjacent to what we build, not a tool change. But it's not noise either: it's vendor-stability context for the bet we just made consolidating onto Anthropic.

The honest read cuts both ways. A supplier at public-company scale is durable, well-capitalized, and unlikely to disappear — good for a small team betting its core loop on one provider. The flip side is that a soon-to-be-public company optimizes for margin and predictability, which is exactly the pressure behind this window's other Anthropic story — metered, usage-based billing. Worth a periodic glance, not a standing worry.

The facts, plainly

  • Confidential draft S-1 filed with the SEC (June 1) — IPO track, timing undisclosed.
  • $65B Series H at ~$965B post-money; investors include Altimeter, Dragoneer, Greenoaks, Sequoia.
  • Run-rate revenue crossed $47B — the growth context behind the raise.
  • For us: stability upside, margin-pressure watch-item. Not a reason to diversify today.
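
For scale, the figures above imply a few derived numbers. A minimal sketch, assuming the reported values are exact (the edition gives the valuation as roughly $965B and the run-rate only as "past $47B", so the revenue multiple is an upper bound). The function names are ours, for illustration only.

```python
# Figures as reported in this edition, in USD billions.
RAISE_B = 65.0             # Series H size
POST_MONEY_B = 965.0       # approximate post-money valuation
RUN_RATE_REVENUE_B = 47.0  # run-rate revenue is reported as "past" this

def pre_money(post_money: float, raise_amount: float) -> float:
    """Valuation before the new money went in."""
    return post_money - raise_amount

def stake_sold(post_money: float, raise_amount: float) -> float:
    """Fraction of the company the round bought."""
    return raise_amount / post_money

def revenue_multiple(post_money: float, run_rate: float) -> float:
    """Post-money valuation divided by run-rate revenue."""
    return post_money / run_rate

if __name__ == "__main__":
    print(f"pre-money:  ${pre_money(POST_MONEY_B, RAISE_B):.0f}B")
    print(f"stake sold: {stake_sold(POST_MONEY_B, RAISE_B):.1%}")
    print(f"multiple:   {revenue_multiple(POST_MONEY_B, RUN_RATE_REVENUE_B):.1f}x run-rate")
```

That works out to about $900B pre-money, a stake of roughly 6.7%, and a valuation near 20.5× run-rate revenue at most.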
Editorial illustration: a workshop tool being reinforced mid-use — a persimmon shield-plate fitted to its handle while the work continues — hardening without downtime

The tool we live in got a security plugin and a reliability pass.

Two Claude Code improvements surfaced from the team building it: a security-guidance plugin that identifies and helps fix vulnerabilities, paired with an engineering post on how agent access and permissions should evolve, and a separate push on responsiveness and reliability. Neither is a headline release, but Claude Code is the surface our work runs through, so its hardening is ours.

Boris Cherny
@bcherny · May 26
Shipped a security plugin for Claude Code — it identifies and helps fix vulnerabilities in your code.
From the creator of Claude Code — additive context beyond the changelog.

What to actually do

  • Try the security plugin on a real branch — see whether its findings are signal or noise for us.
  • Read the agent-permissions post before we widen any tool's write access.
  • Reliability work is invisible until it isn't — note it, no action.
Editorial illustration: a controlled flame burns safely inside a hearth on the viewer's own ground while a guiding hand directs it from a distant managed console — execution on your infra, orchestration kept managed

Self-hosted sandboxes shipped — and the category formed around them.

Edition 07's watch-item has shipped, as a public beta. Anthropic's Managed Agents can now run tool execution inside self-hosted sandboxes on your own infra (Cloudflare, Daytona, Modal, Vercel) with MCP tunnels, while the agent loop stays managed by Anthropic. The strategic shape we flagged holds: own the execution environment, keep the orchestration off your plate.

The new signal is competitive. Within days, Gemini announced managed agents in its API too — so this is a category move, not a single vendor's feature. For us it strengthens the pattern worth prototyping if we ever execute generated code: managed loop, our sandbox, our data boundary.

@DhravyaShah · May 25
Following Anthropic, Gemini has also launched managed agents in the API.
The tell that managed agents is becoming table stakes.

What's new

  • Self-hosted sandboxes (public beta) — tool execution on your infra, agent loop stays managed.
  • MCP tunnels connect your managed agent to private/self-hosted tools.
  • Gemini followed within days — track this as a category, not a one-vendor bet.
Editorial illustration: several model-runners on a measured track with a clear leaderboard post, one lane in persimmon trailing close behind the leaders — a neutral public ranking of coding models

Vercel is now publicly scoring models on Next.js agent tasks.

A genuinely non-Anthropic item, and a useful one. Next.js now publishes an agent-eval harness at nextjs.org/evals — ranking models on real Next.js agent tasks. Opus and GPT-5 lead; the open model MiniMax M3 sits right behind at roughly 10× lower cost. For anyone choosing a coding model, that's a neutral, framework-native benchmark rather than a vendor's own scorecard.

Guillermo Rauch
@rauchg · Jun 1
MiniMax M3 is now the leading open model on the Next.js agent evals — right behind Opus & GPT-5, but ~10× cheaper.

Why it's worth a bookmark

  • Framework-native eval — closer to our real workload than generic coding benchmarks.
  • Surfaces the cost/quality frontier directly: where a cheaper open model is "good enough."
  • Neutral source for the Opus-vs-everything-else question we keep relitigating internally.
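
One way to make "good enough" concrete is to pick the cheapest model whose score sits within a chosen tolerance of the leader. A minimal sketch: the model names, scores, and costs below are hypothetical placeholders, not numbers from nextjs.org/evals, and the tolerance is a judgment call we would have to set ourselves.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EvalResult:
    model: str
    score: float          # pass rate on the eval, 0.0 to 1.0
    relative_cost: float  # cost relative to the most expensive model

def cheapest_good_enough(results: list[EvalResult], tolerance: float) -> EvalResult:
    """Return the cheapest model scoring within `tolerance` of the best score."""
    if not results:
        raise ValueError("no eval results to choose from")
    best = max(r.score for r in results)
    eligible = [r for r in results if r.score >= best - tolerance]
    return min(eligible, key=lambda r: r.relative_cost)

# Hypothetical placeholder data, for illustration only.
RESULTS = [
    EvalResult("model-a", score=0.80, relative_cost=1.0),
    EvalResult("model-b", score=0.78, relative_cost=0.9),
    EvalResult("model-c", score=0.77, relative_cost=0.1),
]

if __name__ == "__main__":
    for tolerance in (0.01, 0.05):
        pick = cheapest_good_enough(RESULTS, tolerance)
        print(f"tolerance {tolerance:.2f}: {pick.model}")
```

With a tight tolerance the leader wins on quality alone; loosen it and the cheap model takes over. Where that crossover sits is the question the evals page helps answer.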

Chatter worth a glance — not tied to one release.

@gabriell_lab · May 28
Since Opus 4.8 is out and more designers are getting into Design Engineering
Lands on our exact wedge — designers moving toward code.
@ai_trade_pro · May 29
Opus 4.8 felt incremental in daily use… but here's one place the gap is very real.
The honest dissent — a useful counterweight to launch-day enthusiasm.
Malte Ubl
@cramforce · Jun 1
shadcn may be the future of software dependency distribution… the solution to the supply-chain meltdown.
Vercel's CTO on copy-in over install — a recurring distribution theme.

Stuff we didn't dig into. Skim for serendipity.

Vercel platform

The Vercel CLI is moving to a signed / notarized macOS binary — framed by Malte Ubl as security-posture prep. Plus a redesigned deployments list, Microfrontends routing (alias/branch-domain), and Domains price-sort + availability filter on the changelog. — signed binary is the one to note; the rest is platform cosmetics

Out of scope this window: the heavy Vercel activity was AI Gateway / AI SDK — not hosting / Blob / toolbar. — AI SDK stays adjacent-lens; we're migrating off

Langfuse

Launch Week 5 shipped a Langfuse agent skill for AI coding agents plus an open "Agent Skills" standard, and Experiment CI/CD gates (run experiments in GitHub Actions, gate PRs). — the "skills as table stakes" arc from E07 keeps widening — now an open standard

Next.js

Canary 16.3 train continued; nothing on stable to chase. The notable item is the new public agent-eval harness — see the deep card. — watch the evals page as a neutral model-ranking source

E2B

SDK moved e2b@2.25.0 → 2.27.0 (+ python-sdk) on GitHub. Version bumps confirmed; feature substance not verified. — low-confidence — couldn't pin a changelog narrative

Cloudflare

@cloudflare/deploy-helpers@0.1.0 published in workers-sdk. New package, no changelog narrative surfaced. — low-confidence quiet — flagged to dig if it recurs

Neon

No new release this window. The 100-snapshots-per-project bump was covered in Edition 07. — carryover only

React

No release this window. Still 19.2.6 (May 6). — quiet

TypeScript

No release this window. 6.0 stable (Mar 23) remains current; the Native port milestone work continues. — quiet

Bun

No release this window. Still 1.3.14 (May 13). — the first Rust-built stable tag remains the thing to watch

Tailwind CSS

No release this window. 4.3 (May 8) remains the latest. — quiet

tRPC

No release this window. v11 line stable; v12 still unshipped. — quiet

TanStack Query

No notable release this window — the May 23 unsubscribed-query fix was covered in Edition 07. — carryover only

Drizzle ORM

No new release this window. Still on the v1.0.0-rc line; the casing: 'snake_case' breaking change still looms for the v1 stable cut. — plan the casing config ahead of v1

Zod

No release this window. Still on the 4.4.x line. — quiet

Liveblocks

No Storage / Presence / Sync / Yjs release this window. v3.19.3 reliability fix was the Edition 07 deep card. — carryover only

Storybook

No new stable this window. 10.4.x line remains current. — quiet

MCP SDK

No tagged release this window. 2.0.0-alpha line from April remains the latest. — quiet

React Flow (xyflow)

No release this window. @xyflow/react@12.10.2 remains current. — quiet

Motion

No release this window. v12.40 remains the latest stable. — quiet

Twitter · X coverage
Twitter / X coverage is live this edition — sourced via authenticated search across official and insider accounts. Cards quote and attribute by handle; by policy they never link out to x.com.

20 providers tracked · sources locked Jun 2, 2026 · weekly worker · daily collect, Monday publish

Source map
  • Foundational (9): Next.js · React · TypeScript · Bun · Tailwind CSS · tRPC · TanStack Query · Drizzle ORM · Zod
  • Paid services (7): Neon · Vercel · Liveblocks · E2B · Cloudflare Workers · Langfuse · Anthropic Claude Agent SDK
  • Strategic (4): Storybook · MCP SDK · React Flow · Motion
  • Adjacent-only watch: Vercel AI SDK (migrating off) · competitors in design-to-code