How OpenCanary works

Playwright for AI Agents

Run real coding agents against your API. See where they fail. Get fixes. All from one command.

$ opencanary run openapi.yml
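The command takes a standard OpenAPI 3 document as input. A minimal spec like the following is enough to start a run (the API title and endpoint here are illustrative, not requirements):

```yaml
# Minimal openapi.yml — a spec this small is enough for a first run.
openapi: "3.0.3"
info:
  title: Example API
  version: "1.0.0"
paths:
  /widgets:
    get:
      summary: List widgets
      responses:
        "200":
          description: A JSON array of widgets
```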

Real Agent Testing

Not simulations. Real agents.

We spin up actual coding agents (Cursor, Claude, Copilot) in real IDEs and have them try to use your API. You see exactly what they see, what they do, and where they fail.

What you get

  • Real IDE environment (VS Code, Cursor)
  • Multiple agent support (Claude, GPT-4, etc.)
  • Live streamed runs you can watch
  • Full transcript of agent reasoning
  • Actual code generation attempts

Failure Reports

Root causes, not just logs

Get detailed breakdowns of where agents fail: auth confusion, parameter errors, missing examples, unclear schemas. We tell you why, not just what.

What you get

  • Categorized failure types
  • Root cause analysis
  • Affected endpoints list
  • Severity scoring
  • Shareable report links

Concrete Fixes

Copy-paste solutions

Every failure comes with suggested fixes: doc rewrites, example code, schema improvements. Not generic advice — specific changes for your API.

What you get

  • Markdown diff suggestions
  • Example code generation
  • OpenAPI spec patches
  • Error message rewrites
  • Before/after comparisons
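To illustrate the kind of spec patch a report might suggest — the endpoint, parameter name, and failure category below are hypothetical examples, not output from a real run — a fix for a "missing example" failure could be a small before/after change to one parameter definition:

```yaml
# Hypothetical before/after patch for a "missing example" failure.
# Before: agents have to guess the format of `since`, and often
# pass a Unix timestamp where a date string is expected.
parameters:
  - name: since
    in: query
    schema:
      type: string

# After: an explicit format and example remove the guesswork.
parameters:
  - name: since
    in: query
    description: Only return items created after this time (RFC 3339).
    schema:
      type: string
      format: date-time
      example: "2024-01-15T00:00:00Z"
```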

Agent Usability Score

Track over time

Get a 0-100 score measuring how well agents can use your API. Track improvements, compare releases, share with stakeholders.

What you get

  • Overall compatibility score
  • Category breakdowns
  • Historical tracking
  • Release comparisons
  • Benchmarks against industry peers

CI Integration

Coming Soon

Test on every release

Run agent compatibility tests in your CI pipeline. Block releases on regressions. Get notified when scores drop.

What you get

  • GitHub Actions support
  • PR comments with results
  • Release gating rules
  • Slack/email notifications
  • Score trend alerts
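Since CI support is still marked coming soon, any workflow is speculative. Assuming a published GitHub Action named `opencanary/action` with `spec` and `min-score` inputs (all three are assumptions, not a shipped interface), a release-gating job might look like:

```yaml
# Hypothetical GitHub Actions job — the action name, inputs, and
# gating behavior below are assumptions, not a published interface.
name: agent-compat
on: [pull_request]
jobs:
  opencanary:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: opencanary/action@v1    # hypothetical action
        with:
          spec: openapi.yml
          min-score: 80               # fail the check if the score regresses below 80
```

The `min-score` threshold is how "block releases on regressions" would plug into branch protection: a failing check on the PR prevents the merge.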

How it compares

OpenCanary vs. traditional API testing

Traditional API Testing

  • ✓ Tests if endpoints return correct responses
  • ✓ Validates schemas and contracts
  • ✓ Checks authentication flows
  • ✓ Measures performance and latency
  • ✗ Tests if AI can figure out how to use it

OpenCanary

  • ✓ Tests if agents can understand your docs
  • ✓ Validates agent-generated code works
  • ✓ Identifies confusing parameter names
  • ✓ Finds missing examples that cause failures
  • ✓ Shows exactly where agents get stuck

Traditional testing asks: “Does it work?”
OpenCanary asks: “Can an AI agent figure out how to use it?”

Ready to test your API?

Start free. No signup required. Results in 30 seconds.

Try free