Reliable Multi-Agent Orchestration Framework
This framework continues the original vision of Arsenii Shatokhin (aka VRSEN): simplify the creation of AI agencies by thinking about automation in terms of real-world organizational structures, making it intuitive for both agents and users. If you hit environment issues, see the Installation guide.

In Agency Swarm, communication flows are directional: the operator defines the allowed initiations, where the agent on the left can initiate a chat with the agent on the right. Tools can also be converted from OpenAPI schemas.

On first run, Agency Swarm sets up the terminal app automatically, shows a short setup message, and reuses it on later runs. See the terminal workflow guide: https://agency-swarm.ai/core-framework/agencies/agent-swarm-cli

Need a synchronous call? agency.get_response_sync(...) exists, but async is recommended.

The recommended agent folder structure gives each agent a dedicated space with all the files it needs to start working on its specific tasks. The tools.py can be customized to include tools and functionality specific to the agent's role.

For details on how to contribute to Agency Swarm, please refer to the Contributing Guide. Agency Swarm is open-source and licensed under MIT.
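The directional-flow rule above can be illustrated with a minimal, self-contained sketch. Note this is a conceptual model, not the actual Agency Swarm API: the `FlowRegistry` class and `can_initiate` method are hypothetical names used only for illustration.

```python
# Conceptual sketch of directional communication flows:
# a flow (a, b) means agent `a` may initiate a chat with agent `b`,
# but not the reverse unless (b, a) is also declared.

class FlowRegistry:
    def __init__(self, flows):
        # Store allowed initiations as a set of (initiator, recipient) pairs.
        self.allowed = set(flows)

    def can_initiate(self, initiator, recipient):
        # Only explicitly declared left-to-right pairs are permitted.
        return (initiator, recipient) in self.allowed

# "ceo" can open a chat with "dev" and "va"; neither can open one with "ceo".
registry = FlowRegistry([("ceo", "dev"), ("ceo", "va")])
assert registry.can_initiate("ceo", "dev")      # left -> right: allowed
assert not registry.can_initiate("dev", "ceo")  # reverse: not declared
```

The point of the asymmetry is the same as in a real organization: subordinate agents respond when contacted, but only designated agents decide when a conversation starts.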
GitHub: 4,130 stars · 1,008 forks · 648 followers · 32 repos · 1 npm package · 6 HuggingFace models
Karpathy just said "the human is the bottleneck" and "once agents fail, you blame yourself" — I built a system that fixes both problems
In the No Priors podcast posted 3 days ago, Karpathy described a feeling I know too well: he's spending 16 hours a day "expressing intent to agents," running parallel sessions, optimizing agents.md files — and still feeling like he's not keeping up.

I've been in that exact loop. But I think the real problem isn't what Karpathy described. The real problem is one layer deeper: you stop understanding what your agents are doing, but everything keeps working — until it doesn't.

Here's what happened to me. I was building an AI coding team with Claude Code. I approved architecture proposals I didn't understand. I pressed Enter on outputs I couldn't evaluate. Tests passed, so I assumed everything was fine. Then I gave the agent a direction that contradicted its own architecture — because I didn't know the architecture. We spent days on rework. I wasn't lazy. I was structurally unable to judge my agents' output. And no amount of "running more agents in parallel" fixes that.

The problem no one is solving

I surveyed the top 20 AI coding projects on star-history in March 2026 — GStack (Garry Tan's project, 16k+ stars), agency-agents, OpenCrew, OpenClaw, etc. Every single one stops at the same layer: they give you a powerful agent team, then assume you know who to call, when to call them, and how to evaluate their output. You're still the dispatcher. You went from manually prompting one agent to manually dispatching six. The cognitive load didn't decrease — it shifted.

I mapped out 6 layers of what I call "decision caching" in AI-assisted development:

| Layer | What gets cached | You no longer need to... |
|---|---|---|
| 0. Raw Prompt | Nothing | — |
| 1. Skill | Single task execution | Prompt step by step |
| 2. Pipeline | Task dependencies | Manually orchestrate skills |
| 3. Agent | Runtime decisions | Choose which path to take |
| 4. Agent Team | Specialization | Decide who does what |
| 5. Secretary | User intent | Know who to call or how |
| + Education | Understanding | Worry about falling behind |

Every project I found stops at Layer 4.
Nobody is building Layer 5.

What I built: Secretary Agent + Education System

Secretary Agent — a routing layer that sits between you and a 6-agent team (Architect, Governor, Researcher, Developer, Tester + the Secretary itself). The key innovation is ABCDL classification — it doesn't classify what you're talking about, it classifies what you're doing:

- A = Thinking/exploring → routes to Architect for analysis
- B = Ready to execute → routes to Developer pipeline
- C = Asking a fact → Secretary answers directly
- D = Continuing previous work → resumes pipeline state
- L = Wants to learn → routes to education system

Why this matters: "I think we should redesign Phase 3" and "Redesign Phase 3" are the same topic but completely different actions. Every existing triage/router system (including OpenAI Swarm) treats them identically. Mine doesn't. The first goes to research, the second goes to execution. When ambiguous, default to A: overthinking is correctable; premature execution might not be.

Before dispatching, the Secretary does its homework — reads files, checks governance docs, reviews history — then constructs a high-density briefing and shows it to you before sending, because intent translation is where miscommunication happens most.

The education system: the exam IS the course

When you send a message that touches a knowledge domain you haven't been assessed on, the system asks:

> Before routing this to the Architect, I notice you haven't reviewed how the team pipeline works. This isn't a test you can fail — it's 8 minutes of real scenarios that show you how the system actually operates. A) Learn now (~8 min) B) Skip C) 30-second overview

If you choose A, you get 3 scenario-based questions — not definitions, but real situations. You answer, and the system reveals the correct answer with reasoning. This is the testing effect (retrieval practice): cognitive science shows that testing itself produces better retention than re-reading. I just engineered it into the workflow.
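The ABCDL idea described above can be sketched in a few lines. This is a deliberately simplified keyword heuristic for illustration: the real Secretary presumably uses an LLM to classify intent, and `classify_action` is a hypothetical name, not part of any published system.

```python
def classify_action(message: str) -> str:
    """Classify what the user is *doing*, not what they're talking about.

    A = thinking/exploring, B = ready to execute, C = asking a fact,
    D = continuing previous work, L = wants to learn.
    """
    text = message.lower().strip()
    if text.endswith("?") or text.startswith(("what", "how many", "where")):
        return "C"                      # factual question: answer directly
    if text.startswith(("continue", "resume")):
        return "D"                      # resume saved pipeline state
    if "teach me" in text or "explain" in text:
        return "L"                      # learning request: education system
    if any(h in text for h in ("i think", "should we", "maybe", "what if")):
        return "A"                      # exploring: route to Architect
    if text.split()[0] in ("redesign", "implement", "fix", "refactor", "build"):
        return "B"                      # imperative: Developer pipeline
    return "A"  # ambiguous: default to analysis (overthinking is correctable)

# Same topic, different actions:
assert classify_action("I think we should redesign Phase 3") == "A"
assert classify_action("Redesign Phase 3") == "B"
```

Even this toy version captures the core design choice: hedged phrasing routes to analysis, bare imperatives route to execution, and anything ambiguous falls back to A rather than risking premature execution.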
The anti-gaming design: every "shortcut" leads to learning.

- Read all answers in advance? You just studied.
- Skip everything? The system records it and reminds you more frequently.
- Self-assess as "understood" but got 3 wrong? The diagnostic score is tracked separately, and advisory frequency auto-adjusts.

It is impossible to game this system into "learning nothing." That's by design.

Other things worth mentioning

- Agents can say no to you. Tell the Secretary to skip the preview gate and it pushes back: "Preview gating is mandatory. Skipping may cause routing errors. Override?" You can force it — you always can — but the override gets logged and the system learns.
- Cross-model adversarial review. The Architect proposes a solution, then attacks its own proposal using a second AI model (Gemini). Only proposals that survive cross-model scrutiny get through.
- Constitutional governance. 9 Architecture Decision Records protected by governance rules. You can't unilaterally change them.
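One way the "skips shorten the reminder interval" behavior could work is sketched below. The thresholds, the function name `advisory_interval`, and the specific formula are all invented for illustration; the post does not document the actual tuning.

```python
def advisory_interval(diagnostic_score: float, skips: int) -> int:
    """Return how many interactions to wait before the next learning reminder.

    A low diagnostic score (self-assessed "understood" but scored poorly)
    and a high skip count both shorten the interval, so every path away
    from learning increases reminder pressure instead of removing it.
    """
    base = 20                            # default: remind every 20 interactions
    if diagnostic_score < 0.5:           # self-assessment contradicted by results
        base //= 2                       # double the reminder frequency
    return max(1, base - 3 * skips)      # each recorded skip brings reminders closer

assert advisory_interval(0.9, 0) == 20   # solid score, no skips: relaxed cadence
assert advisory_interval(0.3, 0) == 10   # poor diagnostic: twice as frequent
assert advisory_interval(0.3, 3) == 1    # repeated skipping: remind constantly
```

The monotonic design is the point: there is no input combination that drives the interval to infinity, which is what makes the system impossible to game into "learning nothing."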