Playwright for AI Agents
Run real coding agents against your API. See where they fail. Get fixes. All from one command.
Real Agent Testing
Not simulations. Real agents.
We spin up actual coding agents (Cursor, Claude, Copilot) in real IDEs and have them try to use your API. You see exactly what they see, what they do, and where they fail.
What you get
- Real IDE environment (VS Code, Cursor)
- Multiple agent support (Claude, GPT-4, etc.)
- Live streamed runs you can watch
- Full transcript of agent reasoning
- Actual code generation attempts
Failure Reports
Root causes, not just logs
Get detailed breakdowns of where agents fail: auth confusion, parameter errors, missing examples, unclear schemas. We tell you why, not just what.
What you get
- Categorized failure types
- Root cause analysis
- Affected endpoints list
- Severity scoring
- Shareable report links
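To make that concrete, here is an illustrative sketch of what one categorized failure record might contain. The field names, values, and endpoint below are assumptions for illustration, not OpenCanary's actual report schema.

```python
# Illustrative only: a hypothetical shape for one categorized failure record.
# Field names ("category", "root_cause", "severity", ...) are assumptions,
# not OpenCanary's actual report format.
failure = {
    "category": "auth_confusion",
    "endpoint": "POST /v1/payments",
    "root_cause": (
        "Docs show an 'Authorization: Bearer' header, but the API expects "
        "'X-Api-Key'; agents retried with the wrong scheme."
    ),
    "severity": "high",  # e.g. low / medium / high
    "affected_agents": ["claude", "gpt-4"],
    "suggested_fix": "Add an authenticated request example to the quickstart.",
}

# One-line summary, the kind of thing a shareable report might surface first.
print(f"[{failure['severity'].upper()}] {failure['endpoint']}: {failure['category']}")
```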
Concrete Fixes
Copy-paste solutions
Every failure comes with suggested fixes: doc rewrites, example code, schema improvements. Not generic advice — specific changes for your API.
What you get
- Markdown diff suggestions
- Example code generation
- OpenAPI spec patches
- Error message rewrites
- Before/after comparisons
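As an illustration, a fix for an auth-confusion failure might add a copy-paste request example like the one below to the endpoint's docs. The URL, header name, and request fields here are hypothetical placeholders, not a real API.

```python
# Illustrative only: the kind of copy-paste request example a suggested fix
# might add to an endpoint's docs. Endpoint, header name, and fields are
# hypothetical placeholders.
import json
import urllib.request

req = urllib.request.Request(
    "https://api.example.com/v1/payments",          # placeholder endpoint
    data=json.dumps({"amount_cents": 1999, "currency": "usd"}).encode(),
    headers={
        "X-Api-Key": "YOUR_API_KEY",                # exact auth header, spelled out
        "Content-Type": "application/json",
    },
    method="POST",
)

# Send the request and print the parsed JSON response.
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))
```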
Agent Usability Score
Track over time
Get a 0-100 score measuring how well agents can use your API. Track improvements, compare releases, share with stakeholders.
What you get
- Overall compatibility score
- Category breakdowns
- Historical tracking
- Release comparisons
- Industry benchmarks
CI Integration
Coming Soon: Test on every release
Run agent compatibility tests in your CI pipeline. Block releases on regressions. Get notified when scores drop.
What you get
- GitHub Actions support
- PR comments with results
- Release gating rules
- Slack/email notifications
- Score trend alerts
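The general pattern is easy to sketch. The snippet below is a minimal, hypothetical release gate that assumes a JSON report (a `score.json` file with a 0-100 `score` field) produced by an earlier pipeline step; OpenCanary's actual CI integration and report format may differ.

```python
# Illustrative only: a minimal CI release gate, assuming a hypothetical
# score.json report (0-100 "score" field) written by an earlier step.
# OpenCanary's real CI integration and output format may differ.
import json
import sys

THRESHOLD = 75  # hypothetical minimum agent-usability score to allow a release

with open("score.json") as f:
    score = json.load(f)["score"]

print(f"Agent usability score: {score} (threshold {THRESHOLD})")
if score < THRESHOLD:
    # Non-zero exit fails the pipeline step and blocks the release.
    sys.exit(f"Blocking release: score {score} is below {THRESHOLD}")
```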
How it compares
OpenCanary vs. traditional API testing
Traditional API Testing
- Tests if endpoints return correct responses
- Validates schemas and contracts
- Checks authentication flows
- Measures performance and latency
- Doesn't test whether an AI agent can figure out how to use it
OpenCanary
- Tests if agents can understand your docs
- Validates that agent-generated code works
- Identifies confusing parameter names
- Finds missing examples that cause failures
- Shows exactly where agents get stuck
Traditional testing asks: “Does it work?”
OpenCanary asks: “Can an AI agent figure out how to use it?”