Developer Tools

TestPilot

AI testing copilot that watches you code and automatically generates comprehensive test suites — unit, integration, and E2E.

Kent C. Dodds
March 14, 2026 · 947 views · 9 min read

Write the feature. We’ll write the tests. And they’ll actually catch bugs.


The Problem

Every engineering team agrees: testing is essential. Yet most codebases are dramatically under-tested — not because developers don’t care, but because writing good tests is as time-consuming and tedious as building the feature itself.

Here’s how testing really works on most teams:

  • Time pressure kills testing first
    When a sprint is running behind, what gets cut? Tests. Always tests. “We’ll add tests later” is the most common lie in software engineering.

  • Coverage ≠ quality
    Chasing 80–90% coverage often leads to shallow, brittle tests that assert implementation details instead of behavior. You can hit your coverage target and still miss the bugs that matter.

  • Tests are tedious to write
    Good tests require understanding the component’s contract, mocking dependencies correctly, handling edge cases, and writing meaningful assertions. For a complex React component, this can take longer than building the component.

  • Tests rot quickly
    Refactor a component and half your tests break — not because behavior changed, but because the tests were coupled to implementation details. Teams burn hours fixing tests that should never have broken.

  • AI code generation makes it worse
    As AI writes more code faster, the testing gap widens. AI can generate features in minutes, but the test suite that validates those features doesn’t grow proportionally. Teams ship faster but with less confidence.

The economics are brutal: bugs found in production cost 15–30x more to fix than bugs caught during development. Yet the industry’s best advice is still “write more tests” — which nobody has time to do.


Our Solution

TestPilot is an AI testing copilot that watches you code in real time and automatically generates comprehensive, meaningful test suites — unit, integration, and end-to-end — that catch real bugs instead of just padding coverage numbers.

You write the feature. TestPilot writes the tests.


How It Works

1. Deep IDE Integration

TestPilot runs as a VS Code extension (JetBrains support coming soon). It:

  • Monitors your code changes in real time
  • Understands your project structure and dependencies
  • Learns your existing test patterns and conventions

It behaves like a senior engineer sitting beside you, constantly asking: “What could break here — and how do we prove it won’t?”

2. Intent Understanding

Most generators look only at code structure. TestPilot focuses on intent — what the code is supposed to do.

It:

  • Analyzes function signatures, JSDoc comments, and variable names to infer behavior
  • Studies your existing tests to learn your team’s assertion style and conventions
  • Examines git history to see what kinds of changes historically introduce bugs
  • Reads call sites and consumers to understand the true public contract of a function or component

This lets TestPilot generate tests that validate business logic, not just lines of code.

3. Intelligent Test Generation

For each meaningful code change, TestPilot proposes a suite of tests across three layers.

Unit Tests

  • Happy paths that reflect the primary purpose of the function or component
  • Edge cases: null/undefined, empty arrays, boundary values, type coercion, unusual inputs
  • Error handling: what happens when dependencies fail, timeouts occur, or invalid data appears
  • Regression tests targeting patterns that historically cause bugs in similar code (e.g., off-by-one errors, floating point issues, race conditions)
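
To make the categories above concrete, here is a minimal TypeScript sketch of a happy path, several edge cases, and one dependency-failure case for a single function. Everything in it is hypothetical and invented for illustration: the `resolvePort` function, its default of 3000, and the plain list of named checks. It is not TestPilot output, and a real suite would use your framework's `describe`/`it` blocks rather than a hand-rolled list.

```typescript
// Hypothetical function under test: reads a port from an injected env reader.
type EnvReader = (key: string) => string | undefined;

const DEFAULT_PORT = 3000; // assumed default, for illustration only

function resolvePort(readEnv: EnvReader): number {
  let raw: string | undefined;
  try {
    raw = readEnv("PORT");
  } catch {
    // Dependency failure: fall back to the default instead of crashing.
    return DEFAULT_PORT;
  }
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_PORT;
  }
  const port = Number(raw);
  const isWholeNumber = Math.floor(port) === port; // false for NaN and fractions
  if (!isWholeNumber || port < 1 || port > 65535) {
    throw new RangeError("Invalid PORT: " + raw);
  }
  return port;
}

// Returns true only when fn throws a RangeError.
function throwsRangeError(fn: () => unknown): boolean {
  try {
    fn();
    return false;
  } catch (err) {
    return err instanceof RangeError;
  }
}

// Builds an env reader that always returns the given value.
function envOf(value: string | undefined): EnvReader {
  return () => value;
}

interface PortCase {
  name: string;
  passed: boolean;
}

const portCases: PortCase[] = [
  // Happy path
  { name: "reads a valid port", passed: resolvePort(envOf("8080")) === 8080 },
  // Edge cases: missing, blank, and boundary values
  { name: "missing value uses default", passed: resolvePort(envOf(undefined)) === DEFAULT_PORT },
  { name: "blank value uses default", passed: resolvePort(envOf("   ")) === DEFAULT_PORT },
  { name: "accepts upper boundary", passed: resolvePort(envOf("65535")) === 65535 },
  { name: "rejects 65536", passed: throwsRangeError(() => resolvePort(envOf("65536"))) },
  { name: "rejects 0", passed: throwsRangeError(() => resolvePort(envOf("0"))) },
  { name: "rejects non-numeric input", passed: throwsRangeError(() => resolvePort(envOf("abc"))) },
  // Error handling: the dependency itself fails
  {
    name: "falls back when the reader throws",
    passed: resolvePort(() => { throw new Error("env unavailable"); }) === DEFAULT_PORT,
  },
];

const failedPortCases = portCases.filter((c) => !c.passed).map((c) => c.name);
console.log(failedPortCases.length === 0 ? "all cases passed" : "failed: " + failedPortCases.join(", "));
```

Injecting the reader as a parameter is a design choice for the sketch: it lets the failure case be simulated without touching real environment variables.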

Integration Tests

  • API endpoint tests with realistic request/response scenarios
  • Database interaction tests with proper setup/teardown and fixtures
  • Service-to-service tests that validate real-world flows across modules and microservices
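
The setup/teardown and fixture pattern mentioned above can be sketched without a real database. The example below is an assumption-heavy illustration, not TestPilot output: `InMemoryInventory`, `InMemoryOrders`, `OrderService`, and `withFixtures` are hypothetical names, and the in-memory classes stand in for whatever stores your services actually use.

```typescript
// Hypothetical in-memory stand-ins for two stores and the service that spans them.
interface StockLevels {
  [sku: string]: number;
}

class InMemoryInventory {
  private stock: StockLevels = {};
  seed(levels: StockLevels): void {
    this.stock = { ...levels };
  }
  available(sku: string): number {
    return this.stock[sku] || 0;
  }
  reserve(sku: string, qty: number): void {
    if (qty > this.available(sku)) {
      throw new Error("Insufficient stock for " + sku);
    }
    this.stock[sku] = this.available(sku) - qty;
  }
  clear(): void {
    this.stock = {};
  }
}

interface Order {
  id: number;
  sku: string;
  qty: number;
}

class InMemoryOrders {
  private rows: Order[] = [];
  insert(sku: string, qty: number): Order {
    const order: Order = { id: this.rows.length + 1, sku: sku, qty: qty };
    this.rows.push(order);
    return order;
  }
  count(): number {
    return this.rows.length;
  }
  clear(): void {
    this.rows = [];
  }
}

class OrderService {
  private inventory: InMemoryInventory;
  private orders: InMemoryOrders;
  constructor(inventory: InMemoryInventory, orders: InMemoryOrders) {
    this.inventory = inventory;
    this.orders = orders;
  }
  placeOrder(sku: string, qty: number): Order {
    this.inventory.reserve(sku, qty); // throws before any order row is written
    return this.orders.insert(sku, qty);
  }
}

type Scenario = (svc: OrderService, inv: InMemoryInventory, orders: InMemoryOrders) => boolean;

// Runs one scenario against fresh fixtures and always tears them down afterwards.
function withFixtures(scenario: Scenario): boolean {
  const inventory = new InMemoryInventory();
  const orders = new InMemoryOrders();
  inventory.seed({ "sku-1": 5 }); // setup: known starting state
  try {
    return scenario(new OrderService(inventory, orders), inventory, orders);
  } finally {
    inventory.clear(); // teardown: runs even if the scenario throws
    orders.clear();
  }
}

const placesOrderAndReservesStock = withFixtures((svc, inv, orders) => {
  const order = svc.placeOrder("sku-1", 2);
  return order.id === 1 && inv.available("sku-1") === 3 && orders.count() === 1;
});

const rejectsOversellWithoutSideEffects = withFixtures((svc, inv, orders) => {
  try {
    svc.placeOrder("sku-1", 6);
    return false;
  } catch {
    return inv.available("sku-1") === 5 && orders.count() === 0;
  }
});

console.log({ placesOrderAndReservesStock, rejectsOversellWithoutSideEffects });
```

Each scenario gets its own fixtures, so the two checks cannot leak state into each other regardless of execution order.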

End-to-End (E2E) Tests for UI

  • User flow tests using Playwright or Cypress for critical journeys
  • Accessibility checks: keyboard navigation, ARIA roles, focus management
  • Visual regression snapshots for high-value UI surfaces

You can accept, edit, or discard any suggestion. TestPilot learns from every decision.

4. Framework-Native Output

TestPilot writes tests in your stack, your style:

  • Unit/Integration: Jest, Vitest, Mocha
  • E2E: Playwright, Cypress
  • Component: React Testing Library, Vue Test Utils

It automatically:

  • Follows your file naming and folder structure
  • Mirrors your describe/it organization
  • Uses your custom matchers and helpers
  • Respects your linting and formatting rules

Generated tests look like a teammate wrote them — not a generic AI.

5. Continuous Learning & Refinement

TestPilot improves with your feedback:

  • Mark a test as “not useful” → TestPilot avoids similar patterns
  • Accept a test as-is → That pattern is reinforced
  • When a bug reaches production, TestPilot can retroactively generate the test that would have caught it, and learn from that gap

Over time, your test suite — and TestPilot itself — becomes more aligned with your product, your risk profile, and your team’s standards.


What Makes TestPilot Different

Behavior-First, Not Coverage-First

TestPilot’s core question for every test is: “Would this catch a bug that matters?”

Instead of blindly chasing coverage, it:

  • Prioritizes business-critical behavior and edge cases
  • Targets historically fragile areas of your codebase
  • Avoids brittle tests that overfit implementation details

Coverage becomes a byproduct of meaningful tests, not the goal.
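
The difference between a behavior-focused test and an implementation-coupled one is easiest to see across a refactor. The sketch below is illustrative only: `ArrayCart`, `RecordCart`, and the two check functions are hypothetical names, and the scenario is simplified. Both cart classes expose the same public behavior but store their data differently.

```typescript
interface Cart {
  add(sku: string, priceCents: number): void;
  totalCents(): number;
}

// Original implementation: stores every line item in an array.
class ArrayCart implements Cart {
  private items: { sku: string; priceCents: number }[] = [];
  add(sku: string, priceCents: number): void {
    this.items.push({ sku: sku, priceCents: priceCents });
  }
  totalCents(): number {
    return this.items.reduce((sum, item) => sum + item.priceCents, 0);
  }
}

// Refactored implementation: keeps a running total per SKU instead.
class RecordCart implements Cart {
  private totalsBySku: Record<string, number> = {};
  add(sku: string, priceCents: number): void {
    this.totalsBySku[sku] = (this.totalsBySku[sku] || 0) + priceCents;
  }
  totalCents(): number {
    return Object.keys(this.totalsBySku).reduce(
      (sum, sku) => sum + (this.totalsBySku[sku] || 0),
      0
    );
  }
}

// Behavior-focused: asserts only on the public contract.
function checkTotalBehavior(makeCart: () => Cart): boolean {
  const cart = makeCart();
  cart.add("a", 500);
  cart.add("b", 250);
  return cart.totalCents() === 750;
}

// Implementation-coupled: peeks at a private field named `items`.
function checkInternalItems(makeCart: () => Cart): boolean {
  const cart = makeCart();
  cart.add("a", 500);
  cart.add("b", 250);
  const internals = cart as unknown as { items?: unknown };
  return Array.isArray(internals.items) && internals.items.length === 2;
}

const cartResults = {
  behaviorOnArrayCart: checkTotalBehavior(() => new ArrayCart()), // true
  behaviorOnRecordCart: checkTotalBehavior(() => new RecordCart()), // true
  coupledOnArrayCart: checkInternalItems(() => new ArrayCart()), // true
  coupledOnRecordCart: checkInternalItems(() => new RecordCart()), // false
};
console.log(cartResults);
```

The coupled check fails on `RecordCart` even though no behavior changed, which is the kind of breakage a behavior-first approach tries to avoid.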

Intent-Aware

Where other tools see a function that “returns a number,” TestPilot sees a function that calculates a discount, and knows to:

  • Test pricing boundaries (0%, 100%, negative, and extreme values)
  • Validate rounding rules and currency behavior
  • Check interactions with promotions, coupons, and user tiers

This intent-awareness produces tests that mirror real-world usage and real-world failures.
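
As a rough sketch of what boundary and rounding tests for a discount function might look like, consider the TypeScript below. The `applyDiscount` function, its use of integer cents, and its half-up rounding rule are assumptions made for this example; your own pricing rules may differ, and this is not actual TestPilot output.

```typescript
// Hypothetical discount function. Prices are integer cents; the result is
// rounded to the nearest cent with Math.round (an assumed rounding rule).
function applyDiscount(priceCents: number, percent: number): number {
  if (!isFinite(percent) || percent < 0 || percent > 100) {
    throw new RangeError("percent must be between 0 and 100");
  }
  return Math.round((priceCents * (100 - percent)) / 100);
}

// Returns true only when the call is rejected with a RangeError.
function rejectsPercent(percent: number): boolean {
  try {
    applyDiscount(1000, percent);
    return false;
  } catch (err) {
    return err instanceof RangeError;
  }
}

interface DiscountCase {
  name: string;
  passed: boolean;
}

const discountCases: DiscountCase[] = [
  // Typical use
  { name: "25% off 20.00 is 15.00", passed: applyDiscount(2000, 25) === 1500 },
  // Pricing boundaries
  { name: "0% leaves the price unchanged", passed: applyDiscount(1999, 0) === 1999 },
  { name: "100% makes the item free", passed: applyDiscount(1999, 100) === 0 },
  { name: "negative percent is rejected", passed: rejectsPercent(-5) },
  { name: "percent just above 100 is rejected", passed: rejectsPercent(100.01) },
  { name: "extreme percent is rejected", passed: rejectsPercent(1e9) },
  { name: "NaN percent is rejected", passed: rejectsPercent(NaN) },
  // Rounding rules
  { name: "half a cent rounds up", passed: applyDiscount(1005, 50) === 503 },
  { name: "fraction below half rounds down", passed: applyDiscount(999, 15) === 849 },
  { name: "fractional percent is supported", passed: applyDiscount(1000, 12.5) === 875 },
];

const failedDiscountCases = discountCases.filter((c) => !c.passed).map((c) => c.name);
console.log(
  failedDiscountCases.length === 0
    ? "all discount cases passed"
    : "failed: " + failedDiscountCases.join(", ")
);
```

Interactions with promotions, coupons, and user tiers are left out of the sketch because they depend on business rules that only the real codebase defines.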

Context-Aware

TestPilot has codebase-wide context:

  • Understands how modules interact
  • Sees how components are actually used
  • Knows which APIs are critical to revenue or reliability

This enables realistic integration and E2E tests that reflect actual usage patterns, not contrived examples.

Team-Adaptive

Every team has its own testing culture. TestPilot:

  • Learns your style, from naming to matcher preferences
  • Adapts to your risk tolerance (e.g., more exhaustive tests for payments than for marketing pages)
  • Produces tests that fit seamlessly into your existing suite and review process

The result: developers keep and trust the tests it generates.


Results

Across 500+ open-source projects and 200+ commercial teams, TestPilot delivers measurable impact:

  • 89% of generated tests catch real regressions when run against historical bug-introducing commits
  • Average 4.2 meaningful tests per function (vs. 1.8 from competing tools)
  • 73% test acceptance rate — nearly 3 out of 4 generated tests are kept without modification
  • Teams ship 40% fewer bugs to production over 6 months
  • Developers save 2.5 hours per day on average — time reclaimed from writing and maintaining tests and redirected to building features

TestPilot doesn’t just increase coverage; it increases confidence.


Traction

TestPilot is already proving its value in the wild:

  • 12,000+ developers across 800+ organizations
  • $2.1M ARR, growing 35% month-over-month
  • Enterprise customers include:
    • A top-3 global fintech company
    • Two FAANG-tier technology companies
    • A healthcare platform processing 10M+ transactions/day
  • VS Code Marketplace: 4.8/5 stars with 45,000+ installs
  • Design partner program with the Jest and Playwright teams for deep, first-class framework integration

The demand signal is clear: teams want to ship faster without sacrificing quality.


Business Model

We charge per developer, with tiers that scale with team size and requirements.

  • Free

    • 50 test generations per month
    • Basic test types
    • Single framework support
  • Pro — $19/dev/month

    • Unlimited test generation
    • All test types: unit, integration, E2E
    • All supported frameworks
    • Team-aware learning
    • Priority generation
  • Team — $39/dev/month

    • Everything in Pro
    • Shared team testing conventions
    • CI integration (PR comments, gating on test quality)
    • Test quality dashboards and analytics
    • Custom rules and policies
  • Enterprise — $79/dev/month

    • SSO and advanced access controls
    • Self-hosted / VPC deployment options
    • Custom model fine-tuning on proprietary codebases
    • Compliance reporting (SOC 2, HIPAA-ready workflows)
    • Dedicated support and SLAs

This structure lets individual developers start for free, small teams upgrade for productivity, and large enterprises standardize testing at scale.
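
For a quick sense of what the per-developer prices above mean for a team budget, the arithmetic can be sketched in TypeScript. The function names are hypothetical, and the calculation assumes simple monthly per-seat billing with no volume or annual discounts, which this page does not specify.

```typescript
type Tier = "free" | "pro" | "team" | "enterprise";

// List prices in USD per developer per month, taken from the tiers above.
const PRICE_PER_DEV_USD: Record<Tier, number> = {
  free: 0,
  pro: 19,
  team: 39,
  enterprise: 79,
};

// Monthly cost assuming every developer has a seat on the same tier.
function monthlyCostUsd(tier: Tier, developers: number): number {
  if (Math.floor(developers) !== developers || developers < 0) {
    throw new RangeError("developers must be a non-negative whole number");
  }
  return PRICE_PER_DEV_USD[tier] * developers;
}

// Annual cost as twelve monthly payments (no annual discount assumed).
function annualCostUsd(tier: Tier, developers: number): number {
  return monthlyCostUsd(tier, developers) * 12;
}

console.log(monthlyCostUsd("team", 25)); // 975
console.log(annualCostUsd("team", 25)); // 11700
console.log(monthlyCostUsd("enterprise", 200)); // 15800
```

So a 25-developer team on the Team tier would pay $975 per month at list price, under the assumptions stated above.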


Market Opportunity

The software testing market is worth $45B and is growing 14% annually.

Within it, AI-assisted testing tools are the fastest-growing segment, growing at a 38% CAGR and projected to reach $8B by 2028.

TestPilot sits at the intersection of two powerful shifts:

  1. AI-Accelerated Development
    As AI generates more code, faster, the bottleneck moves from writing code to validating it. Every AI-generated feature needs tests; humans don’t scale to that pace.

  2. Shift-Left Testing
    The industry is converging on the idea that testing must happen earlier and continuously in the development lifecycle. TestPilot operationalizes shift-left by embedding high-quality test generation directly into the coding workflow.

Every company that writes software needs better tests. Every team using AI to write code needs TestPilot even more, because the gap between “code generated” and “code validated” grows with every commit.


Team

We’ve built testing systems, AI models, and engineering cultures at some of the most demanding software organizations in the world.

  • CEO — Former Google Testing Infrastructure lead

    • Built the internal test generation system used across 50,000+ engineers
    • Deep experience with large-scale, automated testing in complex monorepos
  • CTO — AI researcher from Microsoft Research

    • Specializes in program analysis and automated reasoning about code behavior
    • Published work on static and dynamic analysis, symbolic execution, and ML for code
  • Head of Product — Former Stripe engineering manager

    • Led teams responsible for Stripe’s renowned testing culture
    • Experience achieving 99.9%+ test reliability in mission-critical payment systems

We’ve:

  • Tested code at global scale
  • Researched the cutting edge of AI program analysis
  • Built testing cultures that actually work in high-growth environments

TestPilot is the synthesis of all three perspectives.


Why Now

  • AI is writing more of your code every quarter.
  • Software complexity is compounding across services, platforms, and devices.
  • Customers expect reliability; regulators increasingly demand it.

Manual testing practices can’t keep up. TestPilot turns testing from a tax into a background process — always on, always learning, always focused on the bugs that matter.


Vision

TestPilot’s long-term vision is to become the quality layer for AI-era software development:

  • Every code change is automatically accompanied by a high-quality test suite
  • Every regression is traced back to the missing test — and that gap is closed permanently
  • Every team, from startup to enterprise, can ship at AI speed with enterprise-grade reliability

TestPilot: Write the feature. We’ll write the tests. And they’ll actually catch bugs.

Discussion

Tanner Linsley · 17d ago

E2E test generation is where this really shines. Unit tests are easy, E2E is the pain point.

Kent C. Dodds · 17d ago

AI-generated tests that actually catch bugs? I'll believe it when I see the coverage reports.

Lee Robinson · 17d ago

Watching you code and generating tests in real time — that's the copilot we actually need.