How Product Managers Operate at Anthropic

Jess Yan is a Product Manager at Anthropic. She gave an insightful talk with an interesting take: "Being a PM in the AI era has made my work feel more human than ever."

Traditionally, PMs spent most of their time on coordination: cross-team meetings, reports, and managing engineering backlogs. After making decisions, they also had to pitch ideas across departments and secure resources. The core craft of the role often got pushed aside. With Claude, Jess has drastically cut down on routine coordination work, freeing up time for what a PM does best.

Cat Wu included a workflow and role division diagram in her March piece, "PM on the AI Exponential."

Jess oversees Claude Managed Agents at Anthropic. Put simply, it is a cloud service that lets developers deploy AI agents as virtual team members: set a goal, and one agent or a group of agents in the cloud will handle the tasks for you. All underlying work, including sandbox environments, permissions and tracing, is fully managed, and you receive the final results once the work is done. The service launched in public beta on April 8, and the official documentation site crashed right after release.

In her past work, API design was done entirely in Google Docs. One person would draft a specification, followed by review meetings and comment threads over email. Weeks later, the document might look polished, yet it would break down once actual development began and unforeseen issues surfaced.

Jess has adopted a new approach: building prototypes instead of writing formal specs. Using Claude Code, she runs AI agents directly against pre-production API specifications. She can go from a basic "hello world" test to a full end-to-end prototype in a single afternoon.

This practice allowed her to iterate multiple times on the API abstractions and the Claude Console user experience well before the product launched. She notes that the flaws she found this way would never have surfaced through weeks of document reviews, and that fixing them after launch based on user feedback would have been too late.

She also runs raw cURL requests to verify the out-of-the-box user experience. Her standard for product delivery has shifted: well-written documentation is just good wording; what truly matters is whether the product actually works.


The Trio of Claude Tools: Division of Work

Jess's regular workflow relies on three core Claude products:

Claude: for open-ended research and exploration, used in the early, ambiguous stages or when continuous dialogue is needed.
Claude Cowork: for all other knowledge work, such as writing emails, clearing inboxes, managing to-do lists, creating slides, checking Slack history and arranging business trips.
Claude Code: once the problem to solve is clearly defined, she builds custom AI agents here, which run on top of Claude Managed Agents.

📈 The METR metric for Human Equivalent Time has surged from 21 minutes on Sonnet 3.5 to 12 hours on Opus 4.6 — a 41x improvement over 16 months.

She summed this up in one line: "Being able to experiment with your own products raises the ceiling for what you can envision for future versions." Previously, PMs could only imagine what they could conceive; now they can imagine what they can actually build. The two are fundamentally different.


Three Custom Agents for Her Daily Work

🔍 Adoption Analytics Agent

Connected to internal databases and able to understand their data schemas, it autonomously runs queries to spot anomalies and patterns. With built-in memory, it retains past insights and builds on previous findings in subsequent runs.
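The query-then-flag loop described above can be sketched in a few lines. This is only an illustration, not Jess's agent: the `daily_sessions` table, its columns, the sample numbers and the z-score threshold are all hypothetical, and an in-memory SQLite database stands in for the internal databases.

```python
import sqlite3
import statistics


def find_anomalies(conn, threshold=2.0):
    """Return (day, sessions) rows that sit more than `threshold`
    population standard deviations away from the mean."""
    rows = conn.execute(
        "SELECT day, sessions FROM daily_sessions ORDER BY day"
    ).fetchall()
    counts = [sessions for _, sessions in rows]
    mean = statistics.mean(counts)
    stdev = statistics.pstdev(counts)
    if stdev == 0:
        return []  # every day is identical, nothing to flag
    return [(day, s) for day, s in rows if abs(s - mean) / stdev > threshold]


# Hypothetical usage data: one day is clearly out of line.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sessions (day TEXT, sessions INTEGER)")
conn.executemany(
    "INSERT INTO daily_sessions VALUES (?, ?)",
    [
        ("day-1", 100), ("day-2", 104), ("day-3", 98), ("day-4", 101),
        ("day-5", 99), ("day-6", 310), ("day-7", 102),
    ],
)

anomalies = find_anomalies(conn)
print(anomalies)
```

A real agent would choose its own queries and statistics after reading the schema; the fixed query here only shows the shape of one such check.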

📡 Developer Sentiment Monitoring Agent

Integrated with web search, it scans developer feedback across a predefined list of domains and summarizes recurring topics. When the workload exceeds a single agent's capacity, it fans out tasks to multiple agents for parallel processing, then consolidates all results.
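The fan-out and consolidation pattern can be sketched with a thread pool. This is a minimal illustration under stated assumptions: `scan_domain` is a hypothetical stand-in for one sub-agent and returns canned topics instead of searching the web, and the domain names are placeholders.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Canned results standing in for what each sub-agent would find (hypothetical).
CANNED_TOPICS = {
    "forum.example": ["pricing", "rate limits"],
    "news.example": ["rate limits", "docs"],
    "blog.example": ["docs", "rate limits"],
}


def scan_domain(domain):
    """Stand-in for one sub-agent: return the topics seen on one domain."""
    return CANNED_TOPICS.get(domain, [])


def fan_out(domains, max_workers=4):
    """Scan all domains in parallel, then count how many domains mention each topic."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_domain = list(pool.map(scan_domain, domains))
    counts = Counter()
    for topics in per_domain:
        counts.update(set(topics))  # count each topic once per domain
    return counts


summary = fan_out(list(CANNED_TOPICS))
# A topic is "recurring" here if at least two domains mention it.
recurring = sorted(topic for topic, n in summary.items() if n >= 2)
print(recurring)
```

The consolidation step is deliberately simple; the point is that the parallel workers return independent results and a single place merges them.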

🎨 Demo Building Agent

Linked to demo-related GitHub repositories, brand assets and presentation decks, it converts standard templates into customized demos for different scenarios, such as internal meetings and client sessions.

Architecture Diagram: Claude Managed Agents

All three agents run in the cloud. Jess can step away to handle other work and return to find the tasks fully completed and deployed.


What Are Managed Agents?

The public beta of Managed Agents comes with four core capabilities:

1. Production-grade Agent Sandbox: handles sandboxing, authentication and tool execution out of the box.
2. Long-duration Sessions: agents can run autonomously for hours. Progress and outputs are preserved even if the connection drops.
3. Multi-agent Collaboration: agents can spawn additional instances to work in parallel (research preview).
4. Trusted Governance: scoped permissions, identity management and tracing are enabled by default.

Jess highlighted one key feature: Memory.

Launched in beta on April 23, Memory lets agents running on Managed Agents learn across sessions. It uses a file-system-based memory architecture: agents read and write memory directly via Bash and code execution. Developers can also export the memory and manage it separately, and all changes are recorded in audit logs.
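To make the idea of file-system-based memory concrete, here is a minimal sketch. It is not the Managed Agents Memory API: the `FileMemory` class, the file names and the audit-log format are all hypothetical, and a temporary directory stands in for the agent's memory store.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


class FileMemory:
    """Hypothetical memory store: plain files plus an append-only audit log."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.audit_log = self.root / "audit.log"

    def read(self, name):
        path = self.root / name
        return path.read_text(encoding="utf-8") if path.exists() else ""

    def write(self, name, text):
        (self.root / name).write_text(text, encoding="utf-8")
        entry = {
            "time": datetime.now(timezone.utc).isoformat(),
            "action": "write",
            "file": name,
        }
        # Record every change as one JSON line in the audit log.
        with self.audit_log.open("a", encoding="utf-8") as log:
            log.write(json.dumps(entry) + "\n")


memory_dir = tempfile.mkdtemp()

# Session 1 records an insight.
FileMemory(memory_dir).write("insights.md", "Signups dip on weekends.\n")

# Session 2 is a fresh object on the same directory: it reads the earlier
# insight and builds on it.
session2 = FileMemory(memory_dir)
previous = session2.read("insights.md")
session2.write("insights.md", previous + "The dip is smaller in week 2.\n")

audit_entries = [
    json.loads(line)
    for line in session2.audit_log.read_text(encoding="utf-8").splitlines()
]
print(session2.read("insights.md"))
print(len(audit_entries))
```

Because the memory is ordinary files, exporting it amounts to copying the directory, which is one plausible reason a file-based design suits standalone management.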

Featured image from the Memory announcement

Getting started with Managed Agents takes a single instruction. Jess enables the Managed Agents skill within Claude Code, then hands the agent a rough outline to get started. Developers can also prompt the latest version of Claude Code directly with "Start onboarding for managed agents in Claude API", and it will complete the entire onboarding process autonomously.


Rakuten: 97% Drop in Errors

Rakuten shared three key metrics in the Memory release notes:

97% reduction in first-pass errors
27% cost reduction
34% lower latency

The agents learn to avoid mistakes within workspace boundaries, with full observability into the learning process. After integrating Memory into its documentation validation pipeline, Wisedocs boosted overall efficiency by 30%. Netflix leverages cross-session context retention for its agents, preserving multi-round insights and human revisions made mid-task.

Managed Agents has gained extensive real-world adoption. Notion teams assign work directly to Claude within workspaces: engineers use it for coding, while knowledge workers build slides and spreadsheets, with dozens of tasks running in parallel. Vibecode adopts Managed Agents as its foundational platform, cutting the time from prompt input to full deployment by over 10x. Sentry connected its debugging agent Seer to a Claude agent, enabling end-to-end workflows from bug identification to review-ready pull requests. Asana, Atlassian, Blockit and General Legal are also listed as official customers.

💰 Pricing is usage-based. In addition to standard Claude Platform token rates, an extra $0.08 per active session hour applies for runtime usage.
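As a rough worked example of the runtime surcharge, the sketch below adds the $0.08 per active session hour to a token bill. Only the $0.08 rate comes from the pricing above; the session length and the token cost are made-up inputs, since token rates depend on the model and usage.

```python
RUNTIME_RATE_PER_SESSION_HOUR = 0.08  # USD, the runtime rate quoted above


def session_cost(active_hours, token_cost_usd):
    """Total cost of one session: token charges plus the runtime surcharge."""
    runtime_fee = active_hours * RUNTIME_RATE_PER_SESSION_HOUR
    return round(token_cost_usd + runtime_fee, 2)


# Hypothetical session: 3 active hours and $1.50 of token usage.
example_total = session_cost(active_hours=3, token_cost_usd=1.50)
print(example_total)
```

For long-running sessions the surcharge grows linearly with active hours, so token usage will usually dominate the bill unless a session idles in an active state for a long time.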


Just Give It a Try

The experiments and tools you've long wanted are now accessible with just one prompt and a few API calls.

Most product managers still follow the traditional workflow: drafting PRDs, waiting for scheduling and going through reviews. Once PMs can deploy agents to deliver tasks independently, the entire production cadence will be transformed.

Going forward, a new evaluation metric will emerge for PMs: what percentage of your work is delivered by AI agents?
