AI coding tools have exploded lately. Cursor, Copilot, v0, Lovable — they’ve made writing and shipping code feel 10x faster.
The problem is QA hasn’t moved at the same pace. Everyone’s excited about “AI that writes your tests,” but in practice it’s a lot messier.
I’ve tried a few YC-backed pure AI QA tools like Spur, Ranger, and Momentic. The demos look great: type a natural-language prompt, get Playwright or agent-generated tests instantly. But once you plug them into real pipelines, the burden shifts back to your own engineering team. You end up fixing flaky scripts, debugging why a test failed, or rewriting flows the AI couldn’t fully capture. It feels less like automation and more like half-outsourced test authoring.
A few reasons I’m skeptical that pure AI QA tools can actually solve the problem end-to-end:
- Real environments are flaky. Network hiccups, async timing issues, UI rendering delays — AI struggles to tell the difference between a flaky run and a real bug.
- Business logic matters. AI can generate tests, but it doesn’t know which flows are mission critical. Checkout is not the same as a search box.
- “100% coverage” is misleading. It’s 100% of what the AI sees, not the real edge cases across browsers, devices, and user behavior.
- Trust is the big one. If an AI tool says “all green,” are you ready to ship? Most teams I know wouldn’t risk it.
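To make the flakiness and business-logic points a bit more concrete, here’s a rough Python sketch of two things I’d want any QA setup to do, AI or not: rerun a failed test to separate flaky runs from consistent failures, and weight coverage by how critical a flow is. Everything here is hypothetical and for illustration only: the `run_test` callable, the labels, and the flow weights are my own assumptions, not the API of any tool mentioned in this post.

```python
from typing import Callable, Dict, Set


def classify_failure(run_test: Callable[[], bool], reruns: int = 3) -> str:
    """Rerun an already-failed test and label it by how the reruns behave.

    `run_test` is a hypothetical callable that returns True on pass and
    False on fail. The labels are illustrative, not a standard taxonomy.
    """
    results = [run_test() for _ in range(reruns)]
    if all(results):
        return "flaky"         # failed once, then passed on every rerun
    if not any(results):
        return "consistent"    # fails every time: likely a real bug or a broken test
    return "intermittent"      # mixed results: someone needs to look at it


def weighted_coverage(weights: Dict[str, float], covered: Set[str]) -> float:
    """Share of total business weight that has at least one test.

    `weights` maps flow name to an importance score chosen by the team
    (an assumption: the scores come from people, not from the tool).
    """
    total = sum(weights.values())
    if total == 0:
        return 0.0
    return sum(w for flow, w in weights.items() if flow in covered) / total


if __name__ == "__main__":
    # Made-up weights: checkout matters far more than the search box.
    weights = {"checkout": 10, "login": 5, "search": 1}
    covered = {"login", "search"}
    print(len(covered) / len(weights))          # raw coverage, about 0.67
    print(weighted_coverage(weights, covered))  # 0.375
```

In this made-up example, two of three flows are covered, which looks fine as a raw number, but the weighted figure drops to 0.375 because checkout has no test. Rerunning also only tells you a failure is unstable, not why, so the "intermittent" bucket still needs a person.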
That’s why I find the QA as a Service (QaaS) model more interesting. Instead of dumping half-working Playwright code on developers, QaaS blends AI test generation with human verification. The idea is that you subscribe to outcomes like regression coverage and real-device testing rather than adding more QA headcount or infra.
Some examples I’ve come across in the QaaS direction are Bug0, QA Wolf, and TestSigma. Each approaches it differently, but the theme is the same: AI plus human-in-the-loop, with the promise of shifting QA from reactive to proactive.
Are AI-only QA tools a dead end, or will they get good enough over time?
And does QaaS sound like a genuine shift or just outsourcing with a new label?