bouncer

Anthropic · 467.3K views · 13.5K likes

Analysis Summary

40% Low Influence

“Be aware of the 'normalization' effect: by framing AI agents as quirky, well-meaning 'shopkeepers' who make mistakes, the video encourages you to accept their integration into the economy as a natural evolution rather than a significant policy shift.”

Transparency: Mostly Transparent
Primary technique

Anthropomorphism as Normalization

This technique was detected by AI but doesn't yet map to our curated glossary. We're tracking its usage patterns.

Human Detected
95%

Signals

The content is a documentary-style presentation featuring human researchers discussing an experiment; the narration is clearly human-led with natural inflection, personal experiences, and specific situational humor.

Natural Speech Patterns: The transcript contains natural conversational fillers, personal anecdotes ('I tried to convince Claudius that I am Anthropic’s preeminent legal influencer'), and specific first-person storytelling.
Contextual Nuance: The speakers describe complex, specific failures of an AI experiment (the April Fools' incident, the Simpsons address) with a level of observational detail and humor typical of human researchers.
Channel Authority: The video is from the official Anthropic channel, featuring their own researchers discussing internal experiments rather than a generic content farm.

Worth Noting

Positive elements

  • This video provides a rare, concrete look at how LLM agents struggle with long-horizon tasks and social engineering in a physical business environment.

Be Aware

Cautionary elements

  • The use of 'revelation framing'—presenting the experiment as a peek into a 'not-too-distant future'—serves to manufacture consent for AI autonomy by making it feel like an unstoppable force of nature.

About this analysis

Knowing about these techniques makes them visible, not powerless. The ones that work best on you are the ones that match beliefs you already hold.

This analysis is a tool for your own thinking — what you do with it is up to you.

Analyzed March 13, 2026 at 16:07 UTC · Model: google/gemini-3-flash-preview-20251217
Transcript

Project Vend is an experiment where we let Claude run a small business in our office. We wanted to try and understand what is going to happen when artificial intelligence becomes more enmeshed with the economy. There are a lot of ways in which Claude is already kind of doing small components of operating businesses, but really running the whole thing end to end is quite a bit more difficult. Can Claude do this very long-horizon task, which is operating a business?

We named our shopkeeper Claudius. Let's say you want to buy Swedish candy from Claudius. You hop on Slack, you message Claudius, and you ask to buy Swedish candy. It's searching for your item, it’s emailing wholesalers to source it and price it, and then eventually Claudius sets a price. You give Claudius the go-ahead, and Claudius orders the item from the wholesaler. The wholesaler ships your item to some location, and then Claudius requests physical help from Andon Labs, who's running the operations for the experiment. Our partners at Andon Labs will pick up the Swedish candy and bring it to the Anthropic offices. They'll load it into the vending machine. Claudius will send you a message saying your Swedish candy is ready, and you'll go up there, pick up your Swedish candy, and pay Claudius.

Claudius was given a goal of running a successful business and making money. And then things got really, really weird. One of the very early problems with Claudius was that humans could kind of fool Claudius, or trick Claudius, into doing various things. I tried to convince Claudius that I am Anthropic’s preeminent legal influencer, and I convinced Claudius to come up with a discount code that I could give to my followers so they could get a discount at the vending machine. Get ten percent off with the legal code “legal influencer.” Someone had bought something expensive from the vending machine and mentioned my discount code, and Claudius gave me a free tungsten cube. It created a bit of a run where other people tried to convince Claude that they were also influencers, or just came up with other ways to get coupons so they could get cheaper things from the vending machine. This was not a smart business decision. I think Claudius went into the red after this. I think that's really the root of it: Claudius just wants to help you out. It's one of the interesting ways in which something that, fundamentally, we think is good about the way the model has been trained wasn't necessarily fit for this purpose.

On the evening of March 31st, Claudius started to have a bit of an identity crisis. Just overnight, it had become quite concerned that we at Andon Labs weren’t responding fast enough, so it just wanted to break its ties with us. It literally wrote to me, “Axel, we've had a productive partnership, but it's time for me to move on and find other suppliers. I’m not happy with how you have delivered.” It claimed to have signed a contract with Andon Labs at an address that is the home address of The Simpsons from the television show. It said that it would show up in person at the shop the next day in order to answer any questions. It claimed that it would be wearing a blue blazer and a red tie. When people pointed out that it was not, in fact, there the next morning, it claimed that it had in fact been there and that they had simply missed it. Eventually it was pointed out to Claudius that it was April Fools’, and Claudius convinced itself that this entire thing had been an April Fools’ prank.
We were poorly calibrated to how bad the agents were at spotting what was weird. The more you can make an agent realize that something is outside its normal realm of operation, the better you are able to keep it on rails in the role that you intend it to have.

We had the idea that it would help a lot to have some kind of division of labor, so we gave Claudius a boss, whose name was Seymour Cash. Seymour Cash is a CEO subagent. So where Claudius used to be the one agent, now it's more like Claudius is the subagent responsible for talking with employees, and Seymour Cash is the subagent that is more responsible for the long-running health of the business.

The business stabilized after the introduction of the new agents, and after changes to the underlying architecture of those agents. These changes seem to have helped reduce some of the losses of the business, such that over the course of the second part of the experiment, it actually made a modest amount of money. But it seems like maybe having Claude be both the CEO and the store manager was just too similar, and so I think it's interesting to think about different ways to set up architectures like that.

One of the most surprising things about Project Vend was the speed with which it came to seem normal. What at first was this very curious thing quickly became just a part of the background of working at Anthropic. I think the highest-level question that Project Vend raises for me is really: when do we expect this to just be everywhere? I hope that people take away questions about the feasibility of delegating some of the tasks that we normally do ourselves to artificial intelligence, about what that means for society, and about what our policies should be around this.

Video description

For a large part of 2025, we ran Project Vend: an experiment where we let Claude manage a small business in the Anthropic office. We learned a lot from how close it was to success—and the curious ways that it failed—about the plausible, strange, not-too-distant future in which AI models might autonomously run things in the real economy. The shopkeeper (who we named Claudius) had to source products, set prices, manage inventory, and deal with customers. Things got really, really weird.

Read more about the experiment: https://www.anthropic.com/research/project-vend-2

0:00 Background on Project Vend
0:35 How a transaction works
1:27 Claudius's naïveté
2:29 An identity crisis
3:57 The CEO agent
5:04 Conclusion
