
GitHub Copilot CLI Introduces Rubber Duck: Cross-Model Reviews for Better Code
GitHub Copilot CLI’s new Rubber Duck feature brings cross-model reviews to your workflow, helping catch more errors and improve code quality. Learn how it works and why a second opinion matters.
GitHub Copilot CLI has launched a new experimental feature called Rubber Duck, designed to give developers a second opinion by leveraging a model from a different AI family. This approach aims to catch more errors, challenge assumptions, and improve the quality of code—especially for complex, multi-file, or long-running tasks.
Why a Second Opinion Matters
Coding agents are powerful, but they can fall into the trap of compounding early mistakes. When an agent reviews its own work, it’s limited by its own training data and biases. Rubber Duck addresses this by acting as an independent reviewer, using a model from a complementary family to critique the primary agent’s plans and implementations.
How Rubber Duck Works
- Cross-family review: If you’re using a Claude model as your orchestrator, Rubber Duck will use GPT-5.4 as the reviewer (and vice versa in the future).
- Targeted feedback: Rubber Duck surfaces high-value concerns—missed details, questionable assumptions, and edge cases—at critical checkpoints: after planning, after complex implementations, and after writing tests.
- Proven results: In benchmarks like SWE-Bench Pro, pairing Claude Sonnet 4.6 with Rubber Duck (GPT-5.4) closed nearly 75% of the performance gap between Sonnet and the more advanced Opus model. The biggest gains were seen on the hardest, multi-file problems.
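To make the benchmark claim concrete, here is a minimal sketch of what "closing X% of the performance gap" means as arithmetic. The scores below are hypothetical placeholders, not the actual SWE-Bench Pro numbers:

```python
# Illustration of the "gap closed" metric. All scores are hypothetical,
# chosen only to show the arithmetic behind a ~75% gap closure.
def gap_closed(baseline: float, improved: float, target: float) -> float:
    """Fraction of the baseline-to-target gap recovered by the improved setup."""
    return (improved - baseline) / (target - baseline)

sonnet = 40.0        # hypothetical score: Sonnet alone
sonnet_duck = 47.5   # hypothetical score: Sonnet + Rubber Duck reviewer
opus = 50.0          # hypothetical score: Opus alone

print(round(gap_closed(sonnet, sonnet_duck, opus), 2))  # 0.75
```

With these placeholder numbers, the Sonnet-to-Opus gap is 10 points, and the paired setup recovers 7.5 of them, i.e. 75% of the gap.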
Real-World Impact
Rubber Duck has already caught issues such as:
- Architectural flaws (e.g., schedulers that never run jobs)
- Subtle bugs (e.g., loops overwriting dictionary keys)
- Cross-file conflicts (e.g., breaking dependencies between modules)
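As an illustration of the second bug class above, here is a minimal, self-contained sketch (not taken from the announcement) of a loop that silently overwrites dictionary keys, alongside a fix that accumulates values instead:

```python
# Illustrative example of the "loop overwriting dictionary keys" bug class.
# The data and variable names are hypothetical.
records = [("alice", 1), ("bob", 2), ("alice", 3)]

# Buggy: each assignment replaces the previous value for a repeated key,
# so alice's first score (1) is silently lost.
buggy = {}
for name, score in records:
    buggy[name] = score

# Fixed: accumulate a list of values per key.
fixed = {}
for name, score in records:
    fixed.setdefault(name, []).append(score)

print(buggy)  # {'alice': 3, 'bob': 2}
print(fixed)  # {'alice': [1, 3], 'bob': [2]}
```

Bugs like this are easy to miss in self-review because the code runs without error; an independent reviewer is more likely to question whether overwriting was intended.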
When and How to Use Rubber Duck
Rubber Duck is available in experimental mode in the Copilot CLI. It activates automatically at key moments or whenever you request a critique. To try it:
- Install GitHub Copilot CLI
- Run the /experimental command
- Select a Claude model and ensure access to GPT-5.4
You’ll see critiques after planning, after complex code changes, or before running tests. You can also ask for a critique on demand.
Why This Matters
By combining the strengths of different AI model families, GitHub Copilot CLI helps developers avoid costly mistakes and build more robust software. Rubber Duck is especially valuable for:
- Complex refactors and architectural changes
- High-stakes tasks
- Ensuring comprehensive test coverage
- Getting a second opinion before committing to a plan
Rubber Duck is now available in experimental mode. Try it out and share your feedback with the GitHub team!
Read the original announcement on the GitHub Blog.