Fleet in GitHub Copilot CLI: Realities, Limitations, and What Actually Works

The official GitHub Blog post about /fleet in Copilot CLI paints an exciting picture: run multiple agents in parallel, decompose big tasks, and watch Copilot orchestrate your work like a team lead. But does it really work as described? Here’s a fact-checked, experience-based look at what Fleet can (and can’t) do today.

What the Blog Claims

/fleet splits your prompt into independent work items and runs them in parallel as sub-agents.
The orchestrator manages dependencies, dispatches agents, and merges results.
You can reference custom agents, set boundaries, and declare dependencies in your prompt.
It’s as simple as writing a structured prompt and letting Copilot handle the rest.

What Actually Happens

1. Parallelism Is Limited

While /fleet does attempt to break down your prompt, true parallel execution is often limited by:

The orchestrator’s ability to correctly decompose your task (which is hit-or-miss for anything non-trivial).
File system conflicts: If two agents touch the same file, the last write wins, with no warning.
Sub-agents don’t communicate, so coordination is only as good as your prompt.
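
Because overlapping writes fail silently, it can help to check your intended split for shared files before you hand it to /fleet. The following is a minimal sketch, not part of Copilot CLI: the function name `find_file_conflicts`, the work item names, and the file paths are all hypothetical, and the plan is one you write down yourself.

```python
from collections import defaultdict


def find_file_conflicts(work_items: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return {file: [work item names]} for every file claimed by more than one item."""
    owners = defaultdict(list)
    for item, files in work_items.items():
        for path in set(files):  # ignore duplicates inside a single work item
            owners[path].append(item)
    return {path: sorted(names) for path, names in owners.items() if len(names) > 1}


# Hypothetical plan: each key is a work item you intend to give to one sub-agent.
plan = {
    "api-docs": ["docs/api.md", "README.md"],
    "cli-docs": ["docs/cli.md", "README.md"],
    "tests": ["tests/test_api.py"],
}

conflicts = find_file_conflicts(plan)
for path, items in conflicts.items():
    print(f"{path} is claimed by: {', '.join(items)}")
```

Any file this reports should be assigned to a single work item, or handled in a separate step after the parallel work finishes.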

2. Prompt Quality Is Critical

If your prompt isn’t extremely explicit about file boundaries and dependencies, Copilot will often fall back to sequential execution. Vague or high-level prompts rarely result in real parallelism.
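
Before writing the prompt, it is worth checking how much parallelism the task can support at all. The sketch below groups work items into "waves" that could run side by side, using Python's standard `graphlib` module. It says nothing about how Copilot's orchestrator schedules work internally; the function `parallel_waves` and the task names are assumptions made for illustration.

```python
from graphlib import TopologicalSorter


def parallel_waves(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; every task in a wave has all its dependencies done."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical task graph: each task maps to the tasks it depends on.
deps = {
    "schema": set(),
    "api": {"schema"},
    "cli": {"schema"},
    "docs": {"api", "cli"},
}

waves = parallel_waves(deps)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

If most waves contain a single task, the work is sequential by nature, and no amount of prompt detail will make it run in parallel.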

3. Custom Agents Are Experimental

Referencing custom agents in .github/agents/ is supported, but:

The feature is still evolving and may not work as expected in all environments.
Model selection and tool assignment are not always honored.

4. Error Handling and Merging Are Manual

If two agents produce conflicting changes, you must resolve them manually.
There is no built-in merge or conflict resolution; a later write silently overwrites an earlier one.
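
Since nothing warns you about overwrites, one hedged workaround is to snapshot file hashes before a run and compare afterwards. This sketch cannot tell which agent wrote a file. It only flags files that changed outside the set you expected. The helper names (`snapshot`, `changed_files`, `unexpected_changes`) are hypothetical, and the demo uses a temporary directory instead of a real repository.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file under root (skipping .git) to the SHA-256 of its contents."""
    digests = {}
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        if path.is_file() and ".git" not in relative.parts:
            digests[relative.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(before: dict[str, str], after: dict[str, str]) -> set[str]:
    """Files that were added, removed, or modified between two snapshots."""
    return {p for p in before.keys() | after.keys() if before.get(p) != after.get(p)}


def unexpected_changes(before: dict[str, str], after: dict[str, str],
                       allowed: set[str]) -> set[str]:
    """Changed files that were not in the set the run was supposed to touch."""
    return changed_files(before, after) - allowed


# Demo in a throwaway directory; in practice root would be your repository.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.txt").write_text("one")
    (root / "b.txt").write_text("two")
    before = snapshot(root)

    # Simulate a run that edits a.txt (expected) and creates c.txt (not expected).
    (root / "a.txt").write_text("changed")
    (root / "c.txt").write_text("new")
    after = snapshot(root)

    changed = changed_files(before, after)
    unexpected = unexpected_changes(before, after, allowed={"a.txt"})
    print("changed:", sorted(changed))
    print("unexpected:", sorted(unexpected))
```

In a Git repository, committing before the run and reviewing the diff afterwards gives you the same information with less code.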

5. Verification Is on You

The orchestrator does not guarantee that validation steps (tests, lint, type checks) are run unless you explicitly ask for them.
You must review the plan and outputs carefully.
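
A small script can make the post-run validation repeatable instead of relying on the orchestrator. The sketch below runs a set of commands and records which ones passed. The `run_checks` helper is hypothetical, and the demo commands are placeholders that only call the Python interpreter; swap in whatever test, lint, and type-check commands your project actually uses.

```python
import subprocess
import sys


def run_checks(checks: dict[str, list[str]]) -> dict[str, bool]:
    """Run each command and return {check name: True if it exited with code 0}."""
    results = {}
    for name, command in checks.items():
        completed = subprocess.run(command, capture_output=True, text=True)
        results[name] = completed.returncode == 0
        if not results[name]:
            # Show the output of failing checks so the cause is visible.
            print(f"[FAIL] {name}\n{completed.stdout}{completed.stderr}")
    return results


# Placeholder commands for the demo; replace them with your project's real checks.
demo_checks = {
    "passes": [sys.executable, "-c", "print('ok')"],
    "fails": [sys.executable, "-c", "raise SystemExit(1)"],
}

demo_results = run_checks(demo_checks)
print(demo_results)
```

Running every check even after one fails gives you the full picture of what a Fleet run broke, which is usually more useful than stopping at the first failure.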

Best Practices for Using Fleet

Be explicit: List every file, module, and dependency in your prompt.
Partition work: Never let two agents touch the same file.
Review plans: Always check Copilot’s decomposition before letting it run.
Test everything: Run your own validation after Fleet completes.

When Fleet Shines (and When It Doesn’t)

Fleet is best for:

Generating boilerplate across many files.
Updating docs or tests in parallel (with clear boundaries).
Large refactors where work can be cleanly split.

It’s less effective for:

Tasks with complex interdependencies.
Anything requiring coordination between agents.
Work on a single file or tightly coupled modules.

Conclusion

Fleet in Copilot CLI is a promising feature, but it’s not magic. For now, treat it as an experimental tool: great for parallelizing well-partitioned tasks, but not a replacement for careful planning and review. Always verify the results, and don’t trust the orchestrator to catch everything.

This post fact-checks and clarifies claims from the official GitHub Blog post. The limitations and best practices described here come from hands-on experience and testing.
