bouncer

Mark Kashef · 14.8K views · 347 likes

Analysis Summary

40% Low Influence
mild · moderate · severe

“Be aware that the 'cheat codes' and 'magic words' described are actually standard software features; this framing is designed to make the creator's paid community feel like an essential source of 'insider' knowledge.”

Ask yourself: “Did I notice what this video wanted from me, and did I decide freely to say yes?”

Transparency Mostly Transparent
Primary technique

Performed authenticity

The deliberate construction of "realness" — confessional tone, casual filming, strategic vulnerability — designed to lower your guard. When someone appears unpolished and honest, you evaluate their claims less critically. The spontaneity is rehearsed.

Goffman's dramaturgy (1959); Audrezet et al. (2020) on performed authenticity

Human Detected
95%

Signals

The content exhibits high-fidelity human traits including spontaneous verbal fillers, a personalized technical setup (the surveillance dashboard), and a teaching style rooted in specific, non-formulaic experiences. The narration lacks the rhythmic perfection and generic structure typical of AI voiceovers or LLM-generated scripts.

Natural Speech Patterns Transcript includes natural filler words ('So', 'basically', 'kind of like'), self-corrections, and conversational transitions ('what the heck you're doing').
Personal Anecdotes and Context The creator references building a custom 'surveillance dashboard' on top of the API and shares specific 'cheat codes' based on their personal workflow.
Technical Nuance and Error Handling The speaker describes specific troubleshooting steps (interviewing the agent to check for feature flag activation) that reflect real-world experience rather than a generic script.

Worth Noting

Positive elements

  • This video provides a practical explanation of how multi-agent coordination differs from sub-agents, specifically regarding shared context and communication protocols.

Be Aware

Cautionary elements

  • The use of 'revelation framing' (e.g., 'cheat codes') to describe public documentation can lead viewers to over-rely on a single 'expert' for information that is freely available.

Influence Dimensions

How are these scored?
About this analysis

Knowing about these techniques makes them visible, not powerless. The ones that work best on you are the ones that match beliefs you already hold.

This analysis is a tool for your own thinking — what you do with it is up to you.

Analyzed March 23, 2026 at 20:38 UTC · Model: google/gemini-3-flash-preview-20251217
Transcript

So, in this video, I'm going to walk through a deep dive on how the agent teams from Anthropic's new Opus 4.6 release work. As you can see on my screen here, I have my agent team running, and instead of just looking at JSON files or markdown files, I actually built a system on top of it to give me full surveillance of what's happening in real time. What you can see here is I'm spinning up a web page, and you have a designer actively working on the task assignment, and all of the discourse between that agent and us, and between that agent and the team lead, is fully transparent and something you can audit along the way. And you can see that we go from designer to developer, and each one waits for its turn. This is unlike before with sub agents, which worked in parallel and basically gave you the TL;DR of the result at the very end, with no interplay between the different agents. Here we can actually click to see the entire discourse and the description, and if there are conversations between them, kind of like sending emails or Telegram messages to each other, we'll be able to audit that with this infrastructure as well. And what's cool is you can click on the history tab and go through any of the prior sessions along with the associated messages, so you can audit what happened and why. So, with that little teaser out of the way, I'm going to walk you through how the agent team infrastructure works so you can leave this video understanding where to use it, when to use it, and more importantly, how to actually surveil your agents. And the reason why is that this agent teams feature is amazing, but it swallows up tokens like a vacuum in a very short amount of time. So you want to make sure you're not using this feature frivolously, not just because it looks cool, but because you actually have a complex task where you need a series of agents, ideally in a sequence, executing on a recurring basis.
And to enable the agent teams specifically, what you have to do is enable the feature flag. Now, if you are non-technical, I have a cheat code for you. Instead of navigating to the Claude settings.json file and worrying about what the heck you're doing, what you can do is literally go to the docs page on orchestrating Claude Code sessions and click on copy page, and then my cheat code is literally to give this to Claude Code, or to something like Warp, which is like a smart terminal, and have it set up everything that I need. So I can just open Claude Code and it's there. As an example, I could go to Warp and say, go read this and double-check that we have it installed correctly. I'll literally just throw that all in there, and it will go through and double-check that our settings are where they need to be. So you don't have to worry about putting in the feature flag or all the agent team stuff yourself, and you can even ask something like Claude Code or Warp to help you with a proper prompt, because the invocation of these teams is typically in the form where you say, I'm designing X and I want to create, or spawn, an agent team. If you say something semantically similar to that, then that should suffice. And if you wanted to make it even easier, you could just give it the URL right here and say, go read this document. It will come back with an understanding of the architecture, the feature flag, everything you'll need, and you can just say, "Listen, go and implement this all in my system." And once you're good to go from there, you can just restart Claude Code; it should work in a brand new session. Then, ideally, you should interview it and ask: do you have access to agent teams? Once it says yes, it'll explain what tools it has. If it doesn't say that, and it assumes that "teams" means sub agents, that means it's not properly installed.
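The "interview it" verification step above can also be approximated with a quick script that inspects the settings file directly. This is a sketch under loud assumptions: the video never shows the actual flag, so the settings path, the `features` key, and the `agentTeams` flag name here are all hypothetical placeholders, not Claude Code's real schema.

```python
import json
from pathlib import Path

# Both of these are assumptions for illustration; check the official
# Claude Code docs for the real settings location and flag name.
SETTINGS_PATH = Path.home() / ".claude" / "settings.json"
FLAG_KEY = "agentTeams"  # hypothetical flag name

def agent_teams_enabled(path: Path = SETTINGS_PATH) -> bool:
    """Return True if the assumed agent-teams flag is set in settings.json."""
    if not path.exists():
        return False
    settings = json.loads(path.read_text())
    # Look under a hypothetical "features" section for the flag.
    return bool(settings.get("features", {}).get(FLAG_KEY, False))
```

If this returns False after you've asked Claude Code to set things up, restarting the session and re-running the check (or just asking the agent directly, as the video suggests) is the safer confirmation.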
So, as a quick preview, if I go to my prior session and just ask, do you know how to spin up an agent team right here? Then it replies saying, yes, I know how to create a team, and it should lecture you on all the tools it has. The names of the tools include the agent types, the task updates, and the team delete, for once it's ready to clean up. The way teams work is that once you spin them up, they persist until the actual goal is completed or the task is done, unlike sub agents, which spin up and die, spin up and die. And at the bottom here, I have a reference to the agent surveillance skill that I showed you at the beginning of the video, and I'll even walk you through the TL;DR of how you can approach building such a system yourself. For step four, the goal is to assign the best model to the right agent at the right time. In this case, the team lead usually deserves Opus 4.6, just because it will be managing the other agents as well. The other team members report up to the team lead to get approval and direction, so that's why you want to equip it with the most capable model possible. For the additional team members, that's where it might make sense to use something like Sonnet 4.5 or even Haiku, depending on the task. And similar to invoking a spell, you have to say the magic words like we saw before, something like: create an agent team to review PR42. This could be an example of you reviewing an existing codebase with a team of reviewers, and these reviewers will be able to look at security, at performance, and at validating test coverage in a way where they can actually discuss with each other. In this case, you could spawn up three different reviewers.
One for security, one for code quality and performance, and one for actually validating the test coverage of the other two agents. So with all this out of the way, how do the teams work in depth? Step one is you have the team lead looking around and understanding when it makes sense to spawn a brand new teammate. Even if the generation is in flight, meaning you've already assigned and hired three agents on top of your team lead, it could still decide that for this particular task or permutation it might need to employ a new agent. Now, the main thing to keep in mind is that as you spin up more and more agents, naturally you will consume more and more tokens. So if you're using something like Opus 4.6 for all of them, you will see your usage evaporate really quickly. The theoretical mental model of spinning up different agents is having them work on completely different areas of the build where it makes sense for them to communicate. If you were using sub agents, what would usually happen is that if you created a front-end sub agent and then a back-end sub agent, they would both work on their separate tasks, but there would be no communication between them. You could run the back end after the front end, but ideally there's a contract between a front end and a back end. So to create that contract in a way that there's cohesion from day one, ideally they should each work in their own context window but be able to message each other and ask each other for direction, because if the front end is going to implement a brand new framework or use a library, you want to make sure it's compatible with your back end. And this is where this feature becomes ingenious: you have that cross-communication and cross-pollination between these agents in a way that gives you full transparency as to what's happening.
And in a way, this shared task list replaces a lot of the trend I've been seeing over the past few months with things like Vibe Kanban, an open-source framework where people can put in different tasks and it auto-arranges those tasks. I found it to be buggy, sometimes slow, and sometimes it doesn't activate at all. And technically, this is built in. So, if we take a look at the agent dashboard that I showed you, I didn't actually engineer anything new here. It created all these categories right here, inspired by the categories that are in the JSON files themselves. I'm purely just visualizing what's happening and streaming the events in real time. The TL;DR of the messaging system is that you have the team lead, who's always in the know, and then when agent A or agent B is finished, they communicate with each other. So agent A will say, "Okay, I'm done with the front end," and then agent B will say, "Okay, I'm updating my endpoints," and then the front end, because they're closest to the fire, will update the team lead that it's all done. So the full lifecycle becomes: you say, build this app and spawn a team to help me accomplish it. The lead then spawns, typically, three to five sub-employees depending on the task. These become the teammates that all have their defined roles, and then they coordinate together to create a plan. Once the plan is approved, or you've put it into YOLO mode or bypass-permissions mode to execute it, then you essentially get to the final result. And if we take a peek here at running that web page I showed you earlier, once it finishes, you can see all the tokens it's taken: I think it's around 80,000 tokens for this web page right here. A personalized web page; I didn't give it anything. It looks pretty clean in terms of layout and format, it does work from what I could see, and it's essentially one-shotted. So pretty impressive in terms of watching the agents do the work.
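The inbox-style messaging lifecycle described above (agent A finishes, messages agent B, the lead gets the final update, read-receipts like blue checkmarks) can be sketched as a toy model. To be clear, the agent names and the message schema here are illustrative inventions, not Claude Code's actual on-disk format.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class Team:
    """Toy inbox-based messaging model, assuming a made-up schema.

    "Sending" a message appends to the recipient's inbox list, and
    reading an inbox marks its messages read (the blue-checkmark idea).
    """
    inboxes: dict = field(default_factory=lambda: defaultdict(list))

    def send(self, sender: str, recipient: str, text: str) -> None:
        self.inboxes[recipient].append({"from": sender, "text": text, "read": False})

    def read_inbox(self, agent: str) -> list:
        msgs = self.inboxes[agent]
        for m in msgs:
            m["read"] = True  # read-receipt
        return msgs

team = Team()
team.send("agent_a", "agent_b", "Front end is done")
team.send("agent_b", "lead", "Updating my endpoints to match")
print(team.read_inbox("lead"))
```

The point of the model is the topology: teammates message each other directly, and only the final status flows up to the lead, instead of everything routing through you.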
You obviously have options. Some people like to use a framework called tmux, which allows you to essentially have a terminal with a main window and then see sub-terminals open up, where you can audit and see what's happening with the agents. The reason I made this UI is, one, to make it universal, and two, as a dev, when I was walking through and looking at it, I just noticed that things were moving so quickly that I'd have to constantly scroll through the history to properly monitor what's happening. So it's completely up to you. If you want to actually install it yourself, you can use something like Warp. The way I actually installed it is by asking Warp to install it and update to the latest version. I asked it, what is tmux, and it walks through exactly what that is. Then at the very bottom, once you want to install it, you can say, can you install it? Can I use it in something like Cursor? It walks me through why that's probably not the best use case there, just because of the pane management, so you can just spin it up in your own terminal. And you'll notice it works if we do something like this: we go to the terminal right here, I spin this up, we zoom in just a tad, and then we type tmux. You'll see right here at the very bottom it has this green little footer. Then if we use this, my shortcut for YOLO mode, you'll be able to spin up the agent teams, and it will spin up multiple panes. But for me, it was a pain to look at, a bit of an eyesore, so that's why we gave birth to this bad boy. And this is structured as a skill, so it's invoked just in time. It has basically memorized the structure of the dashboard and where it has to reference all the different files. Now, what's the difference between agent teams and sub agents? That's a very important thing. In the world of sub agents, they report back to you, but they never talk to each other.
So even though they can run in parallel, because of that lack of communication you will see a lot of inconsistency: different objectives or different goals across the sub agents, with none of them knowing what their compatriots are up to. One interesting thing is that Opus 4.6 is infinitely better now at spinning up sub agents on the fly, without you even asking, to do things like explore your codebase and proactively preserve your context window as much as possible. So there's still a huge place for sub agents in the mix here; it's just a matter of when to use them. For now, from what I can see, when it comes to prepping, exploring, researching, and doing tasks that are very admin in nature, sub agents seem like the way to go. Even though they also take a decent number of tokens, when it comes to code exploration they will preserve your core context window, which really matters if you want to execute a team, execute sub agents, and still get everything done without having to compact your conversation. If you needed one more diagram to drive this point home: with sub agents, you are the monkey in the middle right here, getting all the discourse and all the updates, and you're the one who has to manage what happens next. In the world of agent teams, the team lead is the monkey, maybe not in the middle but at the top, and it receives all the inputs, questions, and requests for guidance from all the other teammates. Now, when does it make sense to spend the tokens to use agent teams? These are four use cases that I've tried personally, and I'm going to keep experimenting and sharing what I find. First, parallel code review. If you've already vibe-coded something and you're at the 85% mark and you're stuck, there's something not working. I personally have a project where I'm creating a clone of Opus Clip. I've been working on it for two months.
I've been stuck on a core set of features that, no matter how much I try or intervene using my own dev background, I can't get over the line consistently. So I had it review the codebase, and it found three or four areas with duplicate functions, or functions stepping on each other's toes, that really muddied my codebase. So having not just one extra set of eyes on it, but three or four, helps you really orient yourself and better understand where to go next. The next, like I showed you, is cross-layer features. If you want to build a web application, say on Next.js, and you have a front end, a back end, a database, and a series of features, this is where it makes a lot of sense to try to one-shot the 80% with the team, and maybe drive it home yourself. And the last two are debugging competing hypotheses and non-technical networks. If you've watched my prior video: before this even existed, I tried to create my own agent team by having multiple sub agents share a markdown file that they would use as a diary. Now that we live in this world, we don't technically need that strategy anymore, unless you want to preserve tokens at all costs. We can still use that method for non-technical tasks. Create an entire brainstorming network; you could create an RFP-generation network or a proposal network. You can do all kinds of non-technical things using this feature. And when it comes to research, a lot of people default to using things like Google's deep research or Perplexity. Behind the scenes, you could implement a research committee where different agents go and research different parts but communicate their findings in real time. So you can imagine things like scientific discovery becoming increasingly possible. And when it comes to sub agents, they're really useful for quick research you can just run in parallel, or a quick code exploration to see where things are or where certain files live.
They're also good for file operations where you want lower token costs: maybe you spin up four or five sub agents that use Haiku to do very basic tasks, where you know at the end of the day those are just being executed and checked off a to-do list, and there's no value in adding more agents to the network in terms of additive knowledge. So here's a tactical, applied scenario: let's say you're debugging a codebase. You could have agent one in an agent team decide that there's a memory leak of some sort, and then agent two could say, no, it's a race condition. And if you have no idea what those words mean, don't worry about it; we're just exploring some concepts. Agent three could act as the devil's advocate and say, "You know what? You're wrong, and you're wrong. I actually think it's something completely different." And then they work together and reach their own consensus until they get to the final result. So it's literally like having a committee go through and vote on what the problem is to get to the bottom of it. Now, this debate could also happen with sub agents, but like we said before, they couldn't directly fight each other. Now, when it comes to spinning up an agent team to build a feature like authentication, you could end up with a UI agent, an API agent, and a database agent. The way it would work is the UI agent would first design the scaffolding for the page. Then it would realize that it needs an API to actually allow the login. So this goes to the API agent, which creates the API and sends some form of response to double-check that it works, and then it realizes it needs a users table with an email and a hash. It needs some support to bring this API to life, so that when you actually log in, it actually does something. So then the database agent gets to work: it creates this new users table right here and says it's ready. Then it goes back to the API agent.
Then it can go back and say, listen, we're good to go; it's time to actually test this out. If you want a quick heuristic on when to use sub agents versus agent teams for a brand new task, the easiest question you can ask yourself is: do you need agents to speak to each other? If the answer is no, then you can use sub agents; otherwise, you want to ask the next question: is the task complex enough to justify the token cost overhead? If the answer is yes, then use agent teams. But there is a world here where you say no to all of them and just have a normal session with no agents at all. The mentality I want you to adopt is that of a bootstrapped startup founder with a very finite amount of money. Depending on what plan you're on, that could be a very good analogy. If you are bootstrapped, you have to be very picky about when it makes sense to actually hire your first employee or employees, unless you raise some funding, and unless you have raised funding for your Anthropic subscription, I doubt that's the case. So you just want to be as selective, picky, and responsible as possible. Starting off with the pros: you can require plan approval for any risky changes, so you can still tell Claude what you want in terms of a framework. You can ideally aim to have five to six different agents; beyond that, from what I've found, it's diminishing returns, and the same goes for sub agents. And then you could start all of them on a research task before they all move to an actual execution task. In terms of the gotchas, like I said, the token cost is very real and it will add up very quickly: you could spend anywhere between 100,000 and 300,000 tokens just spinning up and executing an agent team. The next thing is that even with separated roles, I've found that agents can overwrite the same file.
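The two-question heuristic above can be written down as a tiny decision function. This is a simplification of the video's flowchart; the return labels are mine.

```python
def choose_mode(agents_must_talk: bool, worth_the_token_cost: bool) -> str:
    """Two-question heuristic for a brand-new task, per the transcript."""
    if not agents_must_talk:
        # No inter-agent communication needed: plain sub agents suffice.
        return "sub agents"
    if worth_the_token_cost:
        # Communication needed AND the task justifies the token overhead.
        return "agent team"
    # Saying no to everything: a normal session with no agents at all.
    return "plain session"

print(choose_mode(agents_must_talk=True, worth_the_token_cost=True))
```

The bootstrapped-founder framing maps onto the second question: every "hire" (agent) has a real payroll (token) cost, so the default answer should be no.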
Now, it might make sense for them to overwrite it, but once in a while you'll notice that one will do a great job and another will come in and change just a few things that make it either unusable or add unnecessary bloat. The next thing is that unless you create a way to re-spin up the exact same agents with the same instructions, every time a team shuts down, you won't be able to bring it back up. And last but not least, the lead of the team could end up coding instead of delegating. Ideally, you want your team lead to be looking at the big picture, watching what everyone's doing, and making sure the goal is accomplished, but once in a while you'll see it coding itself, and then stepping on the toes of its sub-employees. Now, hopping back into the terminal, one thing you can do to really visualize this entire system, where the messages live and where the inboxes exist, is to ask it for ASCII art visualizing how this feature works. When I say "this feature," I had already referenced the agent teams feature above. It walks through and says: you are the human, and you have a team lead. These are the functions that come with a team lead: team create, task create, task tool, send message. Then it walks you through how they share a file system: teammate coder will read the inbox and claim tasks, teammate researcher will read the inbox as well, and so will the tester. Then, in terms of file storage, it breaks down exactly where all of this exists, which is why I'm always confused when other creators confuse you with different folders and JSON files in a black terminal; it can be intimidating if you're non-technical, and you can just ask it to visualize it instead. You can see right here we have a config JSON that tells you who's on the team, and then you have all the agent mailboxes, which is one of the things we built that visual on to monitor everything.
All I said was, "Hey, can you help me create a skill so that when I say the words 'surveil my agents,' it will spin up the agent team, look at the inboxes, look at the JSON files, and just stream whatever is there onto a localhost UI, and ideally make it spin up on the same localhost port every single time, so you don't have to go and search for a port." And then when you get to tasks, all of these are also structured. So the shared task board, where something's pending or completed, that kanban essence is already native to this structure. With all that in mind, you can see right here, this is my skill and my dashboard database, using what's called SQLite, a free database that can live on your computer; no need for Supabase or anything else. In terms of where the information flows, this shows you exactly that: the team lead sends a message directly to the inbox of the researcher. So it's at .claude/teams/<dynamic name of your team>/inboxes/researcher.json. Then the agent waits for the next message to arrive, and once it reads it, it tells us that it's read it. If you think of WhatsApp or Telegram, when those two check marks turn blue, the sender knows you've read the message; in this case, you have the exact same system. Then you have the task lifecycle, where you create a task, update said task, and delete the task when it's completed. And then the protocol messages look like this: "Hey, can you review my PR?" Like an actual human. Then you have the overall protocol, the protocol types, and this breaks down my surveillance dashboard. It always spins up on localhost 3847, you can see the live and history tabs, and you have a kanban board and the inbox threads. If you watch it in action, it's literally as simple as me saying, "Spin up an agent team to build a web page and surveil them using your special skill." Again, I told it in the skill itself that it should surveil the team whenever I say those words.
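The dashboard idea above boils down to reading every inbox JSON under the team directory and streaming it to a UI. Here's a minimal collection step as a sketch: the `.claude/teams/<team>/inboxes/` layout follows the transcript, but the per-message schema (a JSON list of objects per inbox file) is an assumption for illustration.

```python
import json
from pathlib import Path

def collect_messages(team_dir: Path) -> list:
    """Flatten all inbox JSON files under a team dir into one message list.

    Assumes each inboxes/<agent>.json holds a JSON array of message
    objects; real Claude Code files may be shaped differently.
    """
    messages = []
    for inbox in sorted((team_dir / "inboxes").glob("*.json")):
        agent = inbox.stem  # e.g. "researcher" from researcher.json
        for msg in json.loads(inbox.read_text()):
            messages.append({"inbox": agent, **msg})
    return messages
```

A dashboard like the one in the video would poll something like this on an interval and push new entries to the browser; the SQLite layer the creator mentions would sit between the two for history.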
So, it knows to load that skill right here. It then organizes itself, comes up with the three agents and the sub-builders, walks through how the pipeline will be designed, and executes the tasks. While this is all happening, it's streaming on the main web page that I showed you. Once this is all done, we're good to go, and the best part is it automatically shuts down the session. And when it shuts down the session, that's when you can see everything land in the history tab. You can see right here, this is the ended session of the web build, and we can always go back and look at the messages, tasks, and so on. And that's pretty much it. So hopefully now you have a better understanding of when to use agent teams, how they work, and where they make the most sense versus using something like sub agents or nothing at all. Now, if you want access to all the diagrams I showed you, along with a prompt you can use to start your journey of creating your own skill for your own agent dashboard, I'll make them both available in the second link in the description below. But if you want access to my skill, which I spent a couple of hours putting together to make sure it works at least nine times out of ten, then you're going to want to check that out in my exclusive Early AI Adopters community. And last but not least, if you found this video helpful and educational, it would be super helpful if you left a comment and a like on the video. It really helps the video, really helps the channel, and I'll see you all next

Video description

🚀 Join 800+ builders going deeper on Claude Code: https://www.skool.com/earlyaidopters/about
📦 Download the Agent Teams Guide + 24 Diagrams (FREE): https://markkashef.gumroad.com/l/claude-code-agent-teams-guide
📅 Book a strategy call: https://calendly.com/d/crfp-qz3-m4z

Claude Code's Opus 4.6 release introduced Agent Teams - and they completely change how multi-agent work gets done. In this video, I break down exactly how agent teams work under the hood, when to use them vs subagents, and I walk you through the real-time surveillance dashboard I built to monitor everything agents do. If you've been using subagents and wondering why your frontend and backend agents keep building on different assumptions - this is the fix. Shared task lists, agent-to-agent messaging, and a team lead that actually coordinates.

⏱️ TIMESTAMPS:
0:00 - Agent Surveillance Dashboard Preview
1:05 - What We'll Cover
1:27 - Why Token Cost Matters
1:50 - Step 1: Enable the Feature Flag (Cheat Code)
3:17 - Verifying Agent Teams Are Active
4:00 - How Teams Persist vs Subagents
4:23 - Step 4: Assign the Right Model Per Agent
5:00 - Invoking Agent Teams (Magic Words)
5:55 - How Teams Work In Depth
6:19 - The Communication Problem with Subagents
7:00 - Cross-Communication Between Agents
7:32 - The Built-In Kanban (Shared Task List)
8:00 - Full Team Lifecycle Explained
8:56 - Display Options: tmux vs Surveillance Dashboard
10:00 - tmux Demo
10:31 - The Surveillance Skill (How It Works)
11:00 - Agent Teams vs Subagents (Key Differences)
12:17 - When to Use Agent Teams (4 Use Cases)
14:10 - When Subagents Are Still Better
15:00 - Applied Example: Debugging with Competing Hypotheses
16:00 - Applied Example: Building Authentication
16:42 - Decision Flowchart: Teams vs Subagents vs Nothing
17:37 - Pros and Best Practices
18:00 - Gotchas (Token Cost, File Overwrites, No Resume)
19:00 - Visualizing the Architecture (ASCII Art)
20:00 - How I Built the Surveillance Dashboard
21:00 - Message Protocol and Task Lifecycle
21:52 - Live Demo: Spinning Up a Team + Surveillance
22:32 - History Tab and Session Audit
22:58 - Free Resources + Community

🔗 RESOURCES:
• Agent Teams Docs: https://docs.anthropic.com/en/docs/claude-code/agent-teams
• Claude Code: https://claude.ai/code
• Warp Terminal: https://www.warp.dev

#ClaudeCode #AgentTeams #Opus46 #AIAgents #Anthropic #ClaudeAI #Subagents #MultiAgent #AIAutomation #AICoding #AITools #AgentDashboard #ClaudeCodeTutorial #AIWorkflow

© 2026 GrayBeam Technology Privacy v0.1.0 · ac93850 · 2026-04-03 22:43 UTC