Agentbrisk

How to Use Claude Code to Add Tests to an Untested Codebase

March 11, 2026 · Editorial Team · 6 min read · claude-code, testing, vitest

Most codebases have at least one module that everyone quietly avoids touching because nobody wrote tests for it. You know the one. It works, mostly, and the last person who understood it left six months ago. Adding tests to that module by hand is tedious enough that it keeps getting pushed back. Claude Code makes it fast enough to actually do.

The approach here isn't to ask Claude Code to write tests blindly and hope for the best. It's a structured process: set your conventions first, feed it one module at a time, and iterate on the output until the coverage is meaningful. Here's what that looks like in practice.


Start with a CLAUDE.md file

Before you run a single command, create a CLAUDE.md file at the root of your project if you don't already have one. Claude Code reads this file at the start of every session and uses it as standing instructions. For a testing run, you want it to contain your test framework, your conventions, and any constraints.

A minimal example for a TypeScript project using Vitest:

# Testing conventions
- Framework: Vitest with @testing-library/react for component tests.
- Test files live next to source files: `foo.ts` gets `foo.test.ts`.
- Use `describe` blocks to group related cases. Use `it` for individual tests.
- Mock external HTTP calls with `vi.mock`. Never make real network requests in tests.
- Coverage target: aim for all exported functions, all conditional branches.
- Do not import `jest`; this project uses Vitest. The API is largely Jest-compatible, but mock helpers come from `vi` (imported from `vitest`), not from a `jest` global.

That last point about Vitest vs Jest is worth calling out explicitly because Claude Code has seen so much Jest code in training that it occasionally writes import { jest } by reflex. Putting it in CLAUDE.md corrects that before it becomes a pattern.

For a Python project using pytest, the equivalent section would specify pytest, conftest.py conventions, and whether you're using unittest.mock or pytest-mock.


Run Claude Code and point it at a specific module

Launch Claude Code in your project root:

claude

Then give it a concrete, bounded task. Don't say "add tests to the project." Say:

Add tests to src/utils/pricing.ts. Test every exported function. Include edge cases for zero values, negative inputs, and currency rounding. Use the conventions in CLAUDE.md.

The specificity matters. A bounded request on a single module produces a diff you can review in five minutes. A broad request produces a diff you'll spend an hour second-guessing.

Claude Code will read pricing.ts, understand the function signatures and logic, and generate a pricing.test.ts file. It typically gets the happy-path tests right immediately. The edge cases are where it needs guidance.


Iterate on coverage with follow-up prompts

After the initial file is generated, run your coverage tool:

npx vitest run --coverage
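The flag works with Vitest's defaults once a coverage provider is installed. If you would rather pin the settings so every run reports the same way, they can live in vitest.config.ts. A minimal sketch, assuming the @vitest/coverage-v8 package is installed and that source files live under src/ (the include glob is an assumption about project layout, not something this project requires):

```typescript
// vitest.config.ts
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      // Use the V8-based provider from @vitest/coverage-v8.
      provider: "v8",
      // "text" prints a table in the terminal; "html" writes a browsable report.
      reporter: ["text", "html"],
      // Assumed layout: only measure files under src/.
      include: ["src/**/*.ts"],
    },
  },
});
```

The text reporter is the one to paste back into Claude Code, since it lists uncovered line numbers per file.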

Vitest's coverage report (with @vitest/coverage-v8) will show you which branches are uncovered. Take that output back to Claude Code:

The coverage report shows the `applyDiscount` function has an uncovered branch on line 47. 
That branch handles discount codes with a null expiry date. Add a test for that case.

This back-and-forth is faster than trying to write perfect tests in one pass, and it's more accurate than asking Claude Code to "cover all edge cases" up front because you're giving it concrete information about what's actually missing.
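To make that concrete, here is a sketch of the kind of branch such a report might flag and the input that exercises it. Everything in it is hypothetical: the article does not show pricing.ts, so the DiscountCode shape, the function body, and the cents-based amounts are assumptions for illustration.

```typescript
// Hypothetical shape; the real pricing.ts is not shown in this article.
interface DiscountCode {
  percentOff: number; // e.g. 20 means 20% off
  expiresAt: Date | null; // null means the code never expires
}

// Hypothetical implementation with a null-expiry branch.
function applyDiscount(priceCents: number, code: DiscountCode, now: Date): number {
  const discountedCents = Math.round(priceCents * (1 - code.percentOff / 100));
  if (code.expiresAt === null) {
    // The branch happy-path tests tend to miss: no expiry, always valid.
    return discountedCents;
  }
  if (code.expiresAt.getTime() < now.getTime()) {
    // Expired codes leave the price unchanged.
    return priceCents;
  }
  return discountedCents;
}

// The input a follow-up test needs in order to reach the null-expiry branch.
const neverExpires: DiscountCode = { percentOff: 20, expiresAt: null };
const discounted = applyDiscount(1000, neverExpires, new Date("2026-03-11T00:00:00Z"));
// In a Vitest file this would be asserted as: expect(discounted).toBe(800)
```

Naming the exact input in the prompt (a code whose expiry is null) is what lets Claude Code write the one missing test instead of guessing.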

In practice, three or four iterations get most modules to 85-90% branch coverage. The remaining gaps are usually in error handling for truly exceptional conditions, which you can decide to test or not based on risk.


Handling complex modules with dependencies

Some modules are hard to test because they depend on a database, an external API, or a complex object graph. Claude Code handles this reasonably well if you give it the context upfront.

For a module that calls a database:

Add tests to src/services/userService.ts. The module uses a Prisma client passed via dependency injection.
Build a mock Prisma client from vi.fn() stubs and pass it in. The mock should cover the findUnique, create, and update methods.
Do not connect to a real database.

Claude Code will generate the mock setup. The mocks are sometimes too permissive (returning undefined where your type says User), so check that the mock return values match your TypeScript types. A mock that returns the wrong shape will make tests pass while hiding real bugs.
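One way to catch a wrong-shaped mock before the tests even run is to type the fake against the interface the service depends on. The sketch below uses a hand-written typed fake rather than Vitest's mocking helpers, and every name in it (User, UserStore, getUserEmail, makeFakeStore) is hypothetical. It stands in for the injected Prisma client and is not Prisma's actual API.

```typescript
// Hypothetical model type; a real project would use the types Prisma generates.
interface User {
  id: number;
  email: string;
  name: string | null;
}

// Narrow interface covering only what the service calls (an assumption
// for illustration, not the real Prisma client surface).
interface UserStore {
  findUnique(id: number): Promise<User | null>;
}

// Hypothetical service function that receives its store via dependency injection.
async function getUserEmail(store: UserStore, id: number): Promise<string> {
  const user = await store.findUnique(id);
  if (user === null) {
    throw new Error(`user ${id} not found`);
  }
  return user.email;
}

// Typed fake: with strict mode on, returning undefined or a partial object
// here is a compile error instead of a silently passing test.
function makeFakeStore(users: User[]): UserStore {
  return {
    async findUnique(id: number): Promise<User | null> {
      return users.find((u) => u.id === id) ?? null;
    },
  };
}

const fakeStore = makeFakeStore([{ id: 1, email: "ada@example.com", name: null }]);
```

The same idea carries over to generated mocks: if the stub's return value is annotated with the real type, the compiler does the shape check for you.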

For Python projects, the same pattern applies: tell Claude Code what to mock and how. If you're using pytest fixtures, mention that:

Add tests to app/services/email_service.py. Use pytest fixtures for setup. 
Mock the SendGrid client using unittest.mock.patch. 
Fixtures should go in conftest.py if they're reusable.

A step-by-step walkthrough for a real module

Here's a concrete sequence I ran on a legacy Node.js utility module called dateHelpers.ts that had zero tests:

  1. Added testing conventions to CLAUDE.md (framework, file placement, mock rules).
  2. Opened Claude Code: claude
  3. Prompt: Add tests to src/utils/dateHelpers.ts. Test all exports. Edge cases: DST transitions, leap years, invalid date strings, timezone offsets.
  4. Claude Code generated dateHelpers.test.ts with 18 test cases. Ran npx vitest run src/utils/dateHelpers.test.ts.
  5. Three tests failed: two because the function behavior was actually wrong (Claude Code found real bugs), one because the mock for Intl.DateTimeFormat was incomplete.
  6. For the bugs: I asked Claude Code to look at the failures and explain them. It correctly identified that the formatRelativeDate function didn't handle dates more than a year in the past. I fixed the source, not just the test.
  7. For the incomplete mock, I prompted: The Intl.DateTimeFormat mock in dateHelpers.test.ts doesn't cover the resolvedOptions method. Fix the mock.
  8. Final run: all 18 passing, coverage at 91%.

Total time: about 35 minutes. Writing those tests by hand would probably have taken me around two hours.
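Edge cases like the ones in step 3 are easier to review when they are written as a table of inputs and expected results. A sketch with two hypothetical helpers: isLeapYear and parseIsoDate are illustrative names, since the contents of dateHelpers.ts are not shown here.

```typescript
// Hypothetical helper: Gregorian leap-year rule.
function isLeapYear(year: number): boolean {
  return (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
}

// Hypothetical helper: parses YYYY-MM-DD as a UTC date, or returns null if invalid.
function parseIsoDate(input: string): Date | null {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(input);
  if (match === null) return null;
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC rolls invalid days over (Feb 30 becomes Mar 1), so check the round trip.
  const valid =
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day;
  return valid ? date : null;
}

// Each row is one candidate test case: [input, expected result].
const leapYearCases: Array<[number, boolean]> = [
  [2024, true], // divisible by 4
  [1900, false], // century year not divisible by 400
  [2000, true], // divisible by 400
  [2023, false],
];

// [input, whether parsing should succeed]
const parseCases: Array<[string, boolean]> = [
  ["2024-02-29", true], // valid leap day
  ["2023-02-29", false], // Feb 29 in a non-leap year
  ["2024-13-01", false], // month out of range
  ["not a date", false],
];
```

In a Vitest file, tables like these map naturally onto it.each, which keeps each edge case to a single reviewable line.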


When Claude Code misreads the code

Claude Code occasionally misunderstands what a function is supposed to do, especially when the original code is unclear or relies on implicit conventions. The test will pass but test the wrong thing.

The tell is a test like expect(result).toBe(undefined) for a function that should always return a value. That's Claude Code being conservative about a function it couldn't fully reason about.

When this happens, add a comment to the source function explaining the expected behavior, then ask Claude Code to regenerate the test. Something like:

// Returns the user's display name. Falls back to email if name is not set. Never returns null.

That context, even as a simple comment, is usually enough to get the test right.
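A sketch of what that looks like once the comment is in place. Only the comment comes from the example above; the Account type and the function body are hypothetical, and treating a whitespace-only name as "not set" is an assumption made for the illustration.

```typescript
// Hypothetical input type for illustration.
interface Account {
  email: string;
  name?: string | null;
}

// Returns the user's display name. Falls back to email if name is not set. Never returns null.
function getDisplayName(account: Account): string {
  // Assumption: a blank or whitespace-only name counts as "not set".
  const trimmed = account.name?.trim();
  return trimmed ? trimmed : account.email;
}

// With the intent documented, a regenerated test can assert concrete values
// instead of expect(result).toBe(undefined).
const withName = getDisplayName({ email: "ada@example.com", name: "Ada" });
const withoutName = getDisplayName({ email: "ada@example.com", name: null });
```

Here withName is "Ada" and withoutName is "ada@example.com", which are the two behaviors the comment promises and the two a useful test should pin down.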


Adding tests to a Python project with pytest

The workflow is the same, with minor differences. In your CLAUDE.md:

# Testing conventions (Python)
- Framework: pytest
- Test files: tests/ directory, named test_<module>.py
- Use fixtures in conftest.py for shared setup
- Mock external calls with pytest-mock (mocker fixture), not unittest.mock directly
- Aim for full branch coverage on all public functions

Then in Claude Code:

Add tests to app/utils/validation.py. Use pytest. 
Edge cases: empty strings, Unicode input, values exceeding max length.

One thing I've noticed with Python: Claude Code tends to write assertions like assert isinstance(result, dict) rather than checking specific values. Push back on that:

Replace the isinstance assertions in test_validation.py with specific value checks. 
Test what the function actually returns, not just the type.

That makes the tests catch real regressions rather than just confirming the function ran.


Setting up a repeatable process

Once you've done this a few times, the process becomes fast enough to run as part of your PR workflow. The pattern:

  • Keep CLAUDE.md updated with any new conventions as the project evolves.
  • Run Claude Code on any new module before it gets to review.
  • Use coverage reports as feedback to drive follow-up prompts.
  • Commit the generated tests and review them the same way you'd review any code.

The generated tests aren't sacred. They're a starting point. You'll catch things that are technically correct but test the wrong behavior. That's fine; you still end up with tests faster than writing from scratch, and the process of reviewing them teaches you things about the code you might not have known.
