TypeScript Best Practices for Building AI Agents in 2026

May 10, 2026 · Editorial Team · 9 min read · typescript ai-agents developer-tools

TypeScript is a good fit for building AI agents, and not just because type safety is generally good. The specific patterns that make TypeScript powerful (discriminated unions, exhaustiveness checking, runtime validation with Zod) map well onto the specific challenges of agent development: structured tool definitions, validated model outputs, and typed state machines.

The community has also converged on a few frameworks and patterns that have proven themselves in production. Here's what the TypeScript AI agent stack looks like in May 2026 and the patterns that actually matter.

Framework choices in 2026

You have three realistic options for TypeScript agent development:

Vercel AI SDK (ai package, version 4.x): The most popular TypeScript AI SDK by install count, with tight Next.js integration, streaming support that works well in browser and server contexts, and the clearest abstractions for the common patterns. The generateText and streamText functions handle the model API calls; the tool helper handles tool definitions with full type inference.

OpenAI SDK (openai package, version 4.x): The official OpenAI TypeScript SDK, which also works with any OpenAI-compatible API endpoint. It's lower-level than the Vercel AI SDK but more explicit. Some teams prefer it for non-Next.js backends where the Vercel SDK's abstractions feel like overhead.

Anthropic SDK (@anthropic-ai/sdk): Anthropic's official TypeScript SDK. Necessary if you're specifically using Claude models and want full access to extended thinking, prompt caching, and other Anthropic-specific features.

For most new projects in 2026, the Vercel AI SDK is the starting point. It provides an abstraction layer across model providers, so switching between Claude 4 Sonnet and GPT-4o means changing the value passed as model (for example, anthropic(...) to openai(...)) rather than rewriting call sites, and its streaming and tool-use abstractions are production-ready.

Type-safe tool definitions with the Vercel AI SDK

Tool definitions are where TypeScript's value is highest. A tool definition in an AI agent has three parts that the model sees: the name (string), the description (string), and the parameters schema. Without proper typing, the arguments your handler receives are effectively any, and you lose the benefits of TypeScript at exactly the point where untrusted model output enters your code.

The Vercel AI SDK's tool helper with Zod gives you full type inference:

import { tool, generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';

const searchTool = tool({
  description: 'Search for recent news articles about a topic',
  parameters: z.object({
    query: z.string().describe('The search query'),
    maxResults: z.number().min(1).max(20).default(5),
    dateRange: z.enum(['past_hour', 'past_day', 'past_week']).default('past_day')
  }),
  execute: async ({ query, maxResults, dateRange }) => {
    // TypeScript knows: query is string, maxResults is number, dateRange is union
    // performSearch is your own search implementation (not part of the SDK)
    const results = await performSearch(query, { maxResults, dateRange });
    return results;
  }
});

The execute function gets typed parameters automatically from the Zod schema. query is typed as string, maxResults as number, dateRange as 'past_hour' | 'past_day' | 'past_week'. TypeScript catches it at compile time if you try to use them as any other type.

The return type of execute also gets inferred: if performSearch returns Promise<SearchResult[]>, then searchTool has a return type of SearchResult[] and the model's tool result will be typed accordingly.
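
To see why this inference works, here is a minimal, dependency-free sketch of the mechanism. Schema, ToolDef, defineTool, searchParams, and callTool are hypothetical names invented for illustration; this is not how Zod or the AI SDK is implemented, only the generic pattern that makes the parameter and return types flow through.

```typescript
// Hypothetical stand-in for a Zod schema: a parser that carries its output type.
interface Schema<T> {
  parse(input: unknown): T;
}

// Hypothetical tool shape: parameter and result types are both generic.
interface ToolDef<TParams, TResult> {
  description: string;
  parameters: Schema<TParams>;
  execute: (params: TParams) => Promise<TResult>;
}

// Identity function whose only job is to let TypeScript infer TParams from
// `parameters` and TResult from the return type of `execute`.
function defineTool<TParams, TResult>(
  def: ToolDef<TParams, TResult>
): ToolDef<TParams, TResult> {
  return def;
}

// A tiny hand-written schema for { query: string; maxResults: number }.
const searchParams: Schema<{ query: string; maxResults: number }> = {
  parse(input) {
    if (typeof input !== 'object' || input === null) {
      throw new Error('Expected an object');
    }
    // maxResults falls back to 5 when the caller omits it
    const { query, maxResults = 5 } = input as {
      query?: unknown;
      maxResults?: unknown;
    };
    if (typeof query !== 'string') throw new Error('query must be a string');
    if (typeof maxResults !== 'number') throw new Error('maxResults must be a number');
    return { query, maxResults };
  },
};

const echoSearch = defineTool({
  description: 'Echo the search request (illustration only)',
  parameters: searchParams,
  // `query` and `maxResults` are inferred from the schema; no annotations needed.
  execute: async ({ query, maxResults }) => ({
    summary: `${maxResults} results for "${query}"`,
  }),
});

// Validate untrusted input first, then call execute with the typed value.
async function callTool(rawInput: unknown): Promise<{ summary: string }> {
  const params = echoSearch.parameters.parse(rawInput);
  return echoSearch.execute(params);
}
```

The design point is that the schema is the single source of truth: the runtime check and the compile-time type come from the same object, so they can't drift apart.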

Runtime validation for model outputs

Models don't always produce output in the exact format you specified. Even with precise system prompts and structured output settings, production agents encounter model outputs that need validation and recovery logic.

The generateObject function in the Vercel AI SDK handles structured output with Zod validation:

import { generateObject } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';

const ResearchSummary = z.object({
  title: z.string().max(100),
  keyFindings: z.array(z.string()).min(1).max(10),
  confidence: z.number().min(0).max(1),
  sources: z.array(z.object({
    url: z.string().url(),
    title: z.string(),
    relevanceScore: z.number().min(0).max(1)
  }))
});

type ResearchSummary = z.infer<typeof ResearchSummary>;

async function summarizeResearch(query: string): Promise<ResearchSummary> {
  const { object } = await generateObject({
    model: anthropic('claude-4-sonnet-20260301'),
    schema: ResearchSummary,
    prompt: `Research and summarize: ${query}`
  });
  
  // object is typed as ResearchSummary - no type assertion needed
  return object;
}

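generateObject uses the Zod schema to instruct the model to produce structured output, and validates the response against the schema at runtime. If the model returns something that doesn't match the schema (missing required field, wrong type), generateObject throws rather than handing you a malformed object; in the 4.x releases the error is a NoObjectGeneratedError that carries the raw text and the underlying cause. As far as the 4.x documentation describes, the maxRetries setting covers failed API calls rather than schema mismatches, so re-prompting the model with the validation error is something you add yourself.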

This is significantly better than parsing JSON from a freeform string response, which is what many agent implementations still do.
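
Whatever the SDK does on a mismatch, it can help to see what a validate-and-re-prompt loop looks like in isolation. The sketch below is dependency-free and makes several assumptions: ModelFn, Validation, validateSummary, and generateValidated are hypothetical names, the validator is hand-written where you would normally call a Zod schema's safeParse, and the model is any function from prompt to raw text.

```typescript
// Hypothetical model function: takes a prompt, returns raw text.
type ModelFn = (prompt: string) => Promise<string>;

// Either a validated value or a human-readable reason for the failure.
type Validation<T> = { ok: true; value: T } | { ok: false; reason: string };

interface Summary {
  title: string;
  confidence: number;
}

// Hand-written validator standing in for a Zod schema's safeParse.
function validateSummary(raw: string): Validation<Summary> {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, reason: 'Output was not valid JSON' };
  }
  if (typeof data !== 'object' || data === null) {
    return { ok: false, reason: 'Expected a JSON object' };
  }
  const { title, confidence } = data as Record<string, unknown>;
  if (typeof title !== 'string') {
    return { ok: false, reason: 'title must be a string' };
  }
  if (typeof confidence !== 'number' || confidence < 0 || confidence > 1) {
    return { ok: false, reason: 'confidence must be a number between 0 and 1' };
  }
  return { ok: true, value: { title, confidence } };
}

// Ask the model, validate, and on failure re-prompt with the reason appended.
async function generateValidated<T>(
  model: ModelFn,
  prompt: string,
  validate: (raw: string) => Validation<T>,
  maxAttempts = 3
): Promise<T> {
  let currentPrompt = prompt;
  let lastReason = 'no attempts made';
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = validate(await model(currentPrompt));
    if (result.ok) return result.value;
    lastReason = result.reason;
    currentPrompt =
      `${prompt}\n\nYour previous answer was rejected: ${result.reason}. ` +
      'Reply with JSON only.';
  }
  throw new Error(`Validation failed after ${maxAttempts} attempts: ${lastReason}`);
}
```

Feeding the rejection reason back into the prompt is a judgment call; it tends to help with format errors and does little when the model lacks the information the schema asks for.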

Typed state management for multi-step agents

Multi-step agents maintain state across turns. TypeScript's discriminated unions are perfect for modeling the states an agent can be in:

type AgentState =
  | { status: 'idle' }
  | { status: 'planning'; task: string }
  | { status: 'executing'; plan: string[]; currentStep: number }
  | { status: 'waiting_for_tool'; toolName: string; toolCallId: string }
  | { status: 'complete'; result: string }
  | { status: 'failed'; error: string; lastStep: number };

function processState(state: AgentState): string {
  switch (state.status) {
    case 'idle':
      return 'Waiting for task';
    case 'planning':
      return `Planning: ${state.task}`;
    case 'executing':
      return `Step ${state.currentStep} of ${state.plan.length}`;
    case 'waiting_for_tool':
      return `Waiting for: ${state.toolName}`;
    case 'complete':
      return `Done: ${state.result}`;
    case 'failed':
      return `Failed at step ${state.lastStep}: ${state.error}`;
    default: {
      // Exhaustiveness check: this assignment fails to compile if a case is missing
      const unhandled: never = state;
      return unhandled;
    }
  }
}

The exhaustiveness check means if you add a new state to the discriminated union and forget to handle it in a switch, TypeScript will tell you at compile time. This prevents an entire class of runtime bugs that come from adding new states to a running agent.
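
The same idea extends from describing states to moving between them. Below is a sketch of a pure transition function over a trimmed-down version of the state union; AgentEvent, assertNever, and transition are hypothetical names for illustration, and the rule that out-of-place events are ignored is an assumption, not something the pattern requires.

```typescript
type AgentState =
  | { status: 'idle' }
  | { status: 'planning'; task: string }
  | { status: 'executing'; plan: string[]; currentStep: number }
  | { status: 'complete'; result: string }
  | { status: 'failed'; error: string; lastStep: number };

// Hypothetical events that drive the agent forward.
type AgentEvent =
  | { type: 'start'; task: string }
  | { type: 'plan_ready'; plan: string[] }
  | { type: 'step_done' }
  | { type: 'error'; message: string };

// Compile-time guard: only type-checks when every variant has been handled.
function assertNever(value: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(value)}`);
}

// Pure transition function: (state, event) -> next state.
// Events that make no sense in the current state leave it unchanged.
function transition(state: AgentState, event: AgentEvent): AgentState {
  switch (event.type) {
    case 'start':
      return state.status === 'idle'
        ? { status: 'planning', task: event.task }
        : state;
    case 'plan_ready':
      return state.status === 'planning'
        ? { status: 'executing', plan: event.plan, currentStep: 0 }
        : state;
    case 'step_done': {
      if (state.status !== 'executing') return state;
      const next = state.currentStep + 1;
      return next >= state.plan.length
        ? { status: 'complete', result: `Finished ${state.plan.length} steps` }
        : { ...state, currentStep: next };
    }
    case 'error':
      return {
        status: 'failed',
        error: event.message,
        lastStep: state.status === 'executing' ? state.currentStep : 0,
      };
    default:
      return assertNever(event);
  }
}
```

Because transition is pure, you can replay a recorded list of events through it with reduce to reconstruct or debug an agent's state history.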

Error handling with typed errors

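A common pattern in TypeScript agent code is using typed errors rather than generic Error instances. It borrows the spirit of Rust's Result<T, E> (failures carry structured, typed data), although these are still thrown exceptions rather than returned values, and it pairs well with Zod's ZodError:

import { z } from 'zod';

class ToolExecutionError extends Error {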
  readonly toolName: string;
  readonly toolArgs: unknown;
  readonly originalError: unknown;
  
  constructor(toolName: string, toolArgs: unknown, originalError: unknown) {
    const message = originalError instanceof Error 
      ? originalError.message 
      : 'Unknown error';
    super(`Tool ${toolName} failed: ${message}`);
    this.name = 'ToolExecutionError';
    this.toolName = toolName;
    this.toolArgs = toolArgs;
    this.originalError = originalError;
  }
}

class ModelValidationError extends Error {
  readonly schema: z.ZodSchema;
  readonly invalidOutput: unknown;
  
  constructor(schema: z.ZodSchema, invalidOutput: unknown, zodError: z.ZodError) {
    super(`Model output failed validation: ${zodError.message}`);
    this.name = 'ModelValidationError';
    this.schema = schema;
    this.invalidOutput = invalidOutput;
  }
}

// Usage with explicit error handling
async function runWithRetry<T>(
  fn: () => Promise<T>,
  maxRetries: number = 3
): Promise<T> {
  let lastError: unknown;
  
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      
      if (error instanceof ModelValidationError) {
        // Log structured info for debugging
        console.error('Validation failure:', {
          attempt,
          invalidOutput: error.invalidOutput
        });
        continue; // retry
      }
      
      if (error instanceof ToolExecutionError) {
        console.error('Tool failure:', {
          tool: error.toolName,
          args: error.toolArgs
        });
        // Don't retry tool execution errors - they're likely not transient
        throw error;
      }
      
      throw error; // Unknown errors: don't retry
    }
  }
  
  throw lastError;
}

Typed errors let you build smart retry logic that knows which errors are retryable. Model validation errors often are (the model may produce valid output on a retry). Tool execution errors usually aren't. Generic Error objects don't carry enough information to make this distinction.
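
If you want to go further toward Rust's model, errors can be returned as values instead of thrown. The following is a sketch of that alternative, not a replacement for the classes above; Result, AgentError, isRetryable, and retryResult are hypothetical names introduced for illustration.

```typescript
// A Rust-style Result: failures are returned as values instead of thrown.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Hypothetical error variants, discriminated by `kind`.
type AgentError =
  | { kind: 'validation'; issues: string[] }
  | { kind: 'tool'; toolName: string; message: string };

// Retry policy expressed over the error union: validation failures are
// treated as retryable, tool failures are not (mirroring the text above).
function isRetryable(error: AgentError): boolean {
  switch (error.kind) {
    case 'validation':
      return true;
    case 'tool':
      return false;
  }
}

// Run `step` until it succeeds, hits a non-retryable error, or runs out of attempts.
async function retryResult<T>(
  step: (attempt: number) => Promise<Result<T, AgentError>>,
  maxAttempts = 3
): Promise<Result<T, AgentError>> {
  let last: Result<T, AgentError> = {
    ok: false,
    error: { kind: 'validation', issues: ['step was never run'] },
  };
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    last = await step(attempt);
    if (last.ok || !isRetryable(last.error)) return last;
  }
  return last;
}
```

The trade-off is that the compiler now forces callers to check ok before touching value, at the cost of more explicit plumbing than try/catch.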

Tool result types and the full conversation loop

A complete agent loop in TypeScript needs to handle tool results correctly. The Vercel AI SDK handles the loop for you, but if you're building a custom loop (for more control over error handling, logging, or multi-agent orchestration), here's a typed implementation:

import Anthropic from '@anthropic-ai/sdk';
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

type Tool<TParams extends z.ZodSchema, TResult> = {
  name: string;
  description: string;
  schema: TParams;
  execute: (params: z.infer<TParams>) => Promise<TResult>;
};

async function runAgentLoop(
  client: Anthropic,
  tools: Tool<z.ZodSchema, unknown>[],
  initialMessage: string
): Promise<string> {
  const messages: Anthropic.MessageParam[] = [
    { role: 'user', content: initialMessage }
  ];
  
  const toolDefinitions: Anthropic.Tool[] = tools.map(t => ({
    name: t.name,
    description: t.description,
    // zod-to-json-schema returns a general JSON Schema type, so cast to the SDK's shape
    input_schema: zodToJsonSchema(t.schema) as Anthropic.Tool.InputSchema
  }));
  
  while (true) {
    const response = await client.messages.create({
      model: 'claude-4-sonnet-20260301',
      max_tokens: 4096,
      tools: toolDefinitions,
      messages
    });
    
    if (response.stop_reason === 'end_turn') {
      // The type predicate narrows the block so .text is accessible
      const textBlock = response.content.find(
        (b): b is Anthropic.TextBlock => b.type === 'text'
      );
      return textBlock?.text ?? '';
    }
    
    if (response.stop_reason === 'tool_use') {
      const toolUseBlocks = response.content.filter(b => b.type === 'tool_use');
      
      // Add assistant's response to conversation
      messages.push({ role: 'assistant', content: response.content });
      
      // Execute tools and collect results
      const toolResults: Anthropic.ToolResultBlockParam[] = [];
      
      for (const block of toolUseBlocks) {
        if (block.type !== 'tool_use') continue;
        
        const tool = tools.find(t => t.name === block.name);
        if (!tool) {
          toolResults.push({
            type: 'tool_result',
            tool_use_id: block.id,
            is_error: true,
            content: `Unknown tool: ${block.name}`
          });
          continue;
        }
        
        try {
          const params = tool.schema.parse(block.input);
          const result = await tool.execute(params);
          toolResults.push({
            type: 'tool_result',
            tool_use_id: block.id,
            content: JSON.stringify(result)
          });
        } catch (error) {
          toolResults.push({
            type: 'tool_result',
            tool_use_id: block.id,
            is_error: true,
            content: error instanceof Error ? error.message : 'Tool execution failed'
          });
        }
      }
      
      messages.push({ role: 'user', content: toolResults });
      continue;
    }
    
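    // Any other stop reason (e.g. max_tokens) means there is no complete answer;
    // fail loudly instead of returning an empty string
    throw new Error(`Unexpected stop reason: ${response.stop_reason}`);
  }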
}

This is verbose compared to using the Vercel AI SDK's generateText with tools, but it gives you full control over the loop, error handling, and logging. For production agents where observability matters, explicit loops are easier to instrument.
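
As one illustration of what "easier to instrument" can mean, here is a sketch of a wrapper that records a structured entry for every tool call in an explicit loop. ToolCallRecord, ToolFn, and instrument are hypothetical names, and the sink is assumed to be whatever logger or metrics client you already use.

```typescript
// Hypothetical structured record of one tool invocation.
interface ToolCallRecord {
  toolName: string;
  durationMs: number;
  ok: boolean;
  error?: string;
}

type ToolFn<TArgs, TResult> = (args: TArgs) => Promise<TResult>;

// Wrap a tool function so every call emits a record to `sink`,
// whether it succeeds or throws. The original error is rethrown unchanged.
function instrument<TArgs, TResult>(
  toolName: string,
  fn: ToolFn<TArgs, TResult>,
  sink: (record: ToolCallRecord) => void
): ToolFn<TArgs, TResult> {
  return async (args) => {
    const startedAt = Date.now();
    try {
      const result = await fn(args);
      sink({ toolName, durationMs: Date.now() - startedAt, ok: true });
      return result;
    } catch (error) {
      sink({
        toolName,
        durationMs: Date.now() - startedAt,
        ok: false,
        error: error instanceof Error ? error.message : String(error),
      });
      throw error;
    }
  };
}
```

In the loop above, you could wrap each tool's execute once at startup, so the loop body itself stays unchanged.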

Streaming with type safety

Streaming is important for user-facing agents where you want to display output as it generates. The Vercel AI SDK's streamObject handles typed streaming:

import { streamObject } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';

const ArticleSchema = z.object({
  title: z.string(),
  sections: z.array(z.object({
    heading: z.string(),
    content: z.string()
  })),
  summary: z.string()
});

async function streamArticle(topic: string) {
  const stream = streamObject({
    model: anthropic('claude-4-sonnet-20260301'),
    schema: ArticleSchema,
    prompt: `Write an article about: ${topic}`
  });
  
  // Partial updates as the model streams
  for await (const partialObject of stream.partialObjectStream) {
    // partialObject is typed as DeepPartial<z.infer<typeof ArticleSchema>>
    // Fields may be undefined until the model completes that part
    if (partialObject.title) {
      updateUI({ title: partialObject.title });
    }
  }
  
  const finalObject = await stream.object;
  // finalObject is typed as z.infer<typeof ArticleSchema> - fully defined
  return finalObject;
}

DeepPartial makes every field optional at every level of nesting, which correctly models the state where the model is partway through generating the object. You can update the UI as fields arrive rather than waiting for the complete response, as long as the rendering code checks each field before using it.
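
To make the shape of a partial object concrete, here is a simplified sketch. The DeepPartial type below is a hypothetical, cut-down definition written for illustration (the SDK ships its own, which may differ in details such as how array elements are treated), and Article and renderPartial are illustrative names.

```typescript
// Simplified DeepPartial for illustration only.
type DeepPartial<T> = T extends readonly (infer U)[]
  ? DeepPartial<U>[]
  : T extends object
    ? { [K in keyof T]?: DeepPartial<T[K]> }
    : T;

interface Article {
  title: string;
  sections: { heading: string; content: string }[];
  summary: string;
}

// Render whatever has arrived so far, skipping fields that are still missing.
function renderPartial(partial: DeepPartial<Article>): string {
  const lines: string[] = [];
  if (partial.title) lines.push(`# ${partial.title}`);
  for (const section of partial.sections ?? []) {
    // A section can exist before its heading or content has streamed in.
    if (section.heading) lines.push(`## ${section.heading}`);
    if (section.content) lines.push(section.content);
  }
  if (partial.summary) lines.push(`Summary: ${partial.summary}`);
  return lines.join('\n');
}
```

Every access is guarded because the compiler treats each field as possibly undefined; removing any of the checks that feed a string directly into the output would still compile here, so it is worth keeping the guards even where TypeScript doesn't insist.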

Testing agents with TypeScript

Testing agents is harder than testing regular functions because model calls are non-deterministic and slow. The approach that works is mocking the model at the SDK boundary rather than at the network level.

The Vercel AI SDK has a MockLanguageModelV1 for testing:

import { MockLanguageModelV1 } from 'ai/test';
import { generateText, tool } from 'ai';
import { z } from 'zod';

test('agent handles tool call correctly', async () => {
  const mockModel = new MockLanguageModelV1({
    doGenerate: async () => ({
      rawCall: { rawPrompt: [], rawSettings: {} },
      finishReason: 'tool-calls',
      usage: { promptTokens: 10, completionTokens: 5 },
      toolCalls: [{
        toolCallType: 'function',
        toolCallId: 'call_1',
        toolName: 'searchTool',
        args: JSON.stringify({ query: 'test query', maxResults: 3 })
      }]
    })
  });
  
  let toolWasCalled = false;
  
  const result = await generateText({
    model: mockModel,
    tools: {
      searchTool: tool({
        parameters: z.object({ query: z.string(), maxResults: z.number() }),
        execute: async ({ query }) => {
          toolWasCalled = true;
          return [{ title: 'Test Result', url: 'https://example.com' }];
        }
      })
    },
    prompt: 'Search for test query'
  });
  
  expect(toolWasCalled).toBe(true);
});

This tests the agent's tool-use logic without making real API calls. The mock model is configured to always return a specific tool call, so you can test that the tool execution and result handling work correctly.

For integration tests where you want to use a real model, set up a separate test suite with real API calls that runs less frequently (daily in CI rather than on every PR), using the smallest/cheapest model that can complete the test cases.

TypeScript's type system and the ecosystem of tools around it (Zod, the Vercel AI SDK, typed error patterns) make it genuinely well-suited for building production AI agents. The patterns in this article aren't theoretical: they're what teams building production agents in TypeScript have converged on in 2026. Start with the Vercel AI SDK, use Zod for all your schemas, and build your error handling types before you need them.