Tags: developer-tools, open-source, api. Status: active

E2B

Secure cloud sandboxes for running AI-generated code safely in any language

E2B provides secure, isolated cloud sandbox environments for running AI-generated code. Developers and AI agent builders use E2B to give language models a safe place to execute Python, JavaScript, and other code without touching the host machine. Each sandbox is a fresh VM that spins up in under 500ms and cleans up automatically. The open-source core is on GitHub under e2b-dev/e2b with a hosted cloud service for production workloads.

E2B launched in 2023 with a narrow focus: give developers a safe, fast, isolated environment to run code generated by AI models. The problem it solves is real. When an LLM generates code and your application needs to execute it, you have two bad options. You can run it directly on your server and accept the security risk of executing arbitrary model-generated code. Or you can sandbox it yourself, which means building and maintaining VM infrastructure that has nothing to do with your product.

E2B is the third option. You call their API, send the code, get back the output. The sandbox runs in an isolated VM, cleans up when it's done, and never touches your host machine.

What E2B actually does

The core product is a sandbox API. You make a request to spin up a sandbox, you get back an ID, you send code to execute inside that sandbox, you get back stdout and stderr. When you're done, the sandbox terminates and all state is wiped.

Each sandbox is a fresh virtual machine. Nothing carries over from previous runs unless you explicitly preserve it. This isolation is the whole point. Model-generated code might try to read files it shouldn't, make network requests to unexpected places, or consume unbounded resources. Inside a sandbox, those behaviors are contained. The worst case is a failed sandbox that gets cleaned up, not a compromised server.

The SDKs for Python and JavaScript make this straightforward to integrate. A basic workflow looks like spawning a sandbox, running code, streaming the output back, and letting the sandbox time out. More complex workflows can upload files to the sandbox filesystem, install packages mid-session, and run multiple code blocks in sequence while preserving state.
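The lifecycle described above (spin up, run code, read stdout and stderr, tear down) can be sketched locally. This is a rough stand-in, not the E2B SDK: the class name LocalStandInSandbox, the RunResult type, and the run_code method are hypothetical names chosen for illustration, and a local subprocess provides none of the isolation a real sandbox VM does. It only shows the shape of the workflow, including files that persist between code blocks within one session and disappear on teardown.

```python
import shutil
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class RunResult:
    stdout: str
    stderr: str
    exit_code: int


class LocalStandInSandbox:
    """Hypothetical stand-in that mimics the lifecycle only.

    NOT isolated and NOT the E2B API: code runs in a local child process.
    """

    def __init__(self, timeout_seconds: float = 10.0) -> None:
        self.timeout_seconds = timeout_seconds
        self.workdir: Optional[Path] = None

    def __enter__(self) -> "LocalStandInSandbox":
        # "Spin up": every sandbox starts from a fresh, empty directory.
        self.workdir = Path(tempfile.mkdtemp(prefix="standin-sandbox-"))
        return self

    def run_code(self, code: str) -> RunResult:
        # Run the code in a separate interpreter and capture both streams.
        proc = subprocess.run(
            [sys.executable, "-c", code],
            cwd=self.workdir,
            capture_output=True,
            text=True,
            timeout=self.timeout_seconds,
        )
        return RunResult(proc.stdout, proc.stderr, proc.returncode)

    def __exit__(self, exc_type, exc, tb) -> bool:
        # "Terminate": wipe everything the session left on disk.
        shutil.rmtree(self.workdir, ignore_errors=True)
        return False


with LocalStandInSandbox() as sandbox:
    # Two code blocks in sequence; the file written by the first is
    # visible to the second because they share one session directory.
    sandbox.run_code(
        "from pathlib import Path; Path('notes.txt').write_text('step one done')"
    )
    result = sandbox.run_code(
        "from pathlib import Path; print(Path('notes.txt').read_text())"
    )
    print(result.stdout.strip(), result.exit_code)
```

In this stand-in only files carry over between code blocks, since each call starts a new interpreter. Whether in-memory variables also carry over in a real sandbox depends on the SDK and template in use.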

Custom sandbox templates

The default templates come with Python and JavaScript environments already configured. For most AI coding tasks, these work without modification. But if your application needs specific packages pre-installed, specific system libraries, or a different runtime, you can build a custom template from a Dockerfile.

Custom templates are built once and cached. Subsequent sandboxes using that template start from the cached image rather than rebuilding. This keeps startup times fast even for complex environments. A data science template with pandas, numpy, matplotlib, and scipy pre-installed will start in roughly the same time as the bare Python template once it's cached.

The Dockerfile-based configuration is standard. If your team already knows Docker, the workflow is familiar. If not, there's a learning curve, but it's the same learning curve as any Docker-based infrastructure.

Streaming output

One of the more practically useful features is streaming stdout and stderr as code runs. For short scripts that complete in under a second, this doesn't matter much. For longer computations, data processing tasks, or anything that prints intermediate results, streaming means your application can show progress to the user in real time rather than waiting for the sandbox to finish.

This is particularly relevant for AI agents that do multi-step tasks. If an agent is processing a large dataset, writing the intermediate results to a file, and then generating a summary, streaming lets you show each step as it completes. The user sees progress. The experience feels responsive even when the underlying computation takes ten or twenty seconds.
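The idea of forwarding output while code is still running can be illustrated with a local child process. This is a minimal sketch under stated assumptions: run_and_stream and the on_line callback are hypothetical names, the process is a plain local interpreter rather than a sandbox, and stderr is merged into stdout to keep the example short.

```python
import subprocess
import sys
from typing import Callable


def run_and_stream(code: str, on_line: Callable[[str], None]) -> int:
    """Run code in a child interpreter and forward each output line as it arrives."""
    proc = subprocess.Popen(
        [sys.executable, "-u", "-c", code],  # -u disables output buffering in the child
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merged into stdout for brevity
        text=True,
        bufsize=1,  # line-buffered reads on our side
    )
    assert proc.stdout is not None
    for line in proc.stdout:
        # Called once per line while the child is still running.
        on_line(line.rstrip("\n"))
    return proc.wait()


script = """
import time
for step in ("load", "transform", "summarize"):
    print(f"finished {step}")
    time.sleep(0.2)
"""

received = []
exit_code = run_and_stream(script, received.append)
print(received, exit_code)
```

Passing a callback keeps the caller in control of what "showing progress" means: appending to a list here, but it could just as well push each line to a websocket or a UI log.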

File system access

Sandboxes have a writable filesystem. You can upload files from your application to the sandbox before running code, and download files from the sandbox after the run completes. This enables use cases like uploading a CSV, running Python to analyze and transform it, and downloading the resulting processed file.

The filesystem is ephemeral by default. When the sandbox terminates, everything on disk is gone. For workflows that need persistence across multiple sandbox sessions, you'd manage that at the application level by uploading the relevant state at the start of each session. E2B doesn't provide native persistent storage.
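Managing persistence at the application level amounts to an upload, run, download cycle around each session. The sketch below imitates that cycle with a temporary directory standing in for the sandbox filesystem. The function name run_session and the dictionary-of-bytes state format are assumptions made for illustration, not part of any E2B API, and only top-level files are handled.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Dict, Tuple


def run_session(code: str, files: Dict[str, bytes]) -> Tuple[str, Dict[str, bytes]]:
    """Upload files, run code, download whatever is on disk, then discard the session."""
    with tempfile.TemporaryDirectory(prefix="standin-session-") as tmp:
        root = Path(tmp)
        # "Upload": restore the state the application saved earlier.
        for name, data in files.items():
            (root / name).write_bytes(data)
        proc = subprocess.run(
            [sys.executable, "-c", code],
            cwd=root,
            capture_output=True,
            text=True,
            timeout=30,
        )
        # "Download": collect files before the directory is deleted.
        downloaded = {p.name: p.read_bytes() for p in root.iterdir() if p.is_file()}
    return proc.stdout, downloaded


# Session 1 starts empty and leaves a counter file behind.
_, saved = run_session(
    "from pathlib import Path; Path('count.txt').write_text('1')", {}
)

# Session 2 only sees the counter because the application re-uploads it.
out, saved = run_session(
    "from pathlib import Path\n"
    "n = int(Path('count.txt').read_text()) + 1\n"
    "Path('count.txt').write_text(str(n))\n"
    "print(n)",
    saved,
)
print(out.strip(), saved["count.txt"])
```

The application, not the sandbox, owns the durable copy of the state. Where that copy lives between sessions (object storage, a database, local disk) is a separate design decision.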

Open source and self-hosting

The sandbox runtime that powers E2B is open source at github.com/e2b-dev/e2b. The repository includes the orchestration logic, the API server, and the tooling for building custom templates. The code is Apache 2.0 licensed.

In practice, most teams use the hosted cloud service. Running the sandbox infrastructure yourself means managing the underlying VM compute, handling scaling, and dealing with the operational complexity of isolated code execution at scale. The open-source core is more useful as an auditing and trust tool than as a self-hosting path for most teams.

For teams with strict data residency requirements or that need to run code in environments that can't make outbound network requests, self-hosting is technically feasible. The operational cost is real though.

Where E2B fits in a stack

E2B is a component, not a complete product. You don't use E2B as an end user. You use it as a developer to add code execution capability to an application you're building.

Common patterns include AI coding assistants that verify generated code before showing it to users, data analysis tools where users can write and run Python against their own data, educational platforms where students run code exercises in isolated environments, and LLM agent frameworks that need a safe execution environment as one step in a multi-step reasoning loop.

The E2B SDK integrates naturally with popular agent frameworks. There are documented integrations with LangChain, CrewAI, and the OpenAI Assistants API. The combination of E2B sandboxes with a code-generating model is a common pattern for building agents that can do real computation.
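The "generate, execute, feed the error back" loop behind these patterns can be sketched independently of any framework. Everything named here is hypothetical: generate_and_verify, the Execution type, and the two injected callables stand in for a real model call and a real sandbox call, which this example deliberately does not make.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Execution:
    stdout: str
    stderr: str
    exit_code: int


def generate_and_verify(
    task: str,
    generate_code: Callable[[str, Optional[str]], str],
    execute: Callable[[str], Execution],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first generated snippet that runs cleanly, or None if none does."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        code = generate_code(task, feedback)
        result = execute(code)  # in a real system this would run in a sandbox
        if result.exit_code == 0:
            return code
        # Hand the error text to the model for the next attempt.
        feedback = result.stderr
    return None


# Stubs for illustration only: a fake "model" and a fake executor.
def fake_model(task: str, feedback: Optional[str]) -> str:
    return "print(1/0)" if feedback is None else "print(1)"


def fake_execute(code: str) -> Execution:
    if "1/0" in code:
        return Execution("", "ZeroDivisionError: division by zero", 1)
    return Execution("1\n", "", 0)


verified = generate_and_verify("print the number one", fake_model, fake_execute)
print(verified)
```

Injecting the model and the executor as callables keeps the loop testable with stubs and lets the executor be swapped for a sandbox-backed function without touching the retry logic.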

Pricing in practice

The free tier of 100 hours per month is generous for development. A typical development workflow where you run dozens of sandboxes per day will stay well within the free tier during the build phase.

For production applications, the math depends on usage patterns. If each user request triggers one sandbox that runs for an average of five seconds, you're paying about $0.001125 per request at the $0.000225 per-second rate. At 100,000 requests per month, that's $112.50 in sandbox costs before counting any free-tier hours. This is reasonable for most production applications, though high-volume use cases may need to optimize sandbox duration or negotiate enterprise pricing.
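The arithmetic above is easy to re-run for other usage patterns. This small calculator uses the per-second rate and the 100-hour free tier quoted in this article. The function names are illustrative, and treating free-tier hours as a straight deduction from billable seconds is an assumption, not something the pricing description here confirms.

```python
from decimal import Decimal

RATE_PER_SECOND = Decimal("0.000225")  # per sandbox second, as quoted in this article
FREE_TIER_SECONDS = 100 * 3600         # 100 hours per month, as quoted in this article


def cost_per_request(seconds_per_request) -> Decimal:
    """Cost of a single sandbox run of the given duration."""
    return RATE_PER_SECOND * Decimal(str(seconds_per_request))


def monthly_cost(requests: int, seconds_per_request, free_seconds: int = 0) -> Decimal:
    """Monthly cost; free_seconds is deducted first (an assumption, see lead-in)."""
    total_seconds = Decimal(requests) * Decimal(str(seconds_per_request))
    billable = max(total_seconds - Decimal(free_seconds), Decimal(0))
    return billable * RATE_PER_SECOND


per_request = cost_per_request(5)
per_hour = RATE_PER_SECOND * 3600
without_free_tier = monthly_cost(100_000, 5)
with_free_tier = monthly_cost(100_000, 5, FREE_TIER_SECONDS)

print(f"per request:        ${per_request}")
print(f"per sandbox hour:   ${per_hour:.2f}")
print(f"100k requests:      ${without_free_tier:.2f}")
print(f"100k, free tier off: ${with_free_tier:.2f}")
```

Decimal is used instead of float so that small per-second amounts multiply out exactly rather than picking up binary rounding noise.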

The Pro plan at $25/month includes higher rate limits and priority support. For production deployments where sandbox availability and response time matter, the rate limit increases are often worth more than the dollar amount suggests.

E2B vs. Modal Labs

Both E2B and Modal Labs provide cloud compute for AI applications, but they're aimed at different things. Modal is a general-purpose serverless compute platform for running functions and models in the cloud. E2B is specifically focused on code execution sandboxes for AI agents.

If you need to run untrusted AI-generated code in an isolated environment with fast startup times and per-second billing, E2B is the more focused tool. If you need to run your own AI inference, host model endpoints, or run arbitrary Python functions in the cloud, Modal has broader capabilities. Many applications that use E2B for sandboxing use something like Modal for model inference, with each serving its specific role.

Getting started

The quickest path to running a sandbox is the Python SDK. Install e2b, get an API key from e2b.dev, and you can run your first sandbox in a few lines of code. The documentation covers the basics well and includes examples for common patterns like file upload, package installation, and streaming output.

For production use, the custom template documentation is worth reading early. Knowing what packages you need pre-installed and building a template before your application goes live avoids the startup latency of installing packages at runtime.

The GitHub repository is worth a look for understanding how the sandboxes work under the hood. The architecture is not complex, and reading the source code clarifies what isolation guarantees the product actually provides.

Key features

Isolated sandbox environments: each run gets a fresh VM with no shared state
Code execution in Python, JavaScript, TypeScript, Bash, and more
Streaming output: see stdout and stderr in real time as code runs
File system access inside sandboxes for reading and writing files and installing packages
Custom sandbox templates built from Dockerfiles for specialized environments
SDK for Python and JavaScript to spin up sandboxes from your application
REST API for language-agnostic integration
Sandbox lifecycle management with timeouts and cleanup

Pros and cons

Pros

+ Open-source core: the sandbox runtime is on GitHub and auditable
+ Sandbox startup times under 500ms make synchronous code execution practical
+ Free tier of 100 hours per month is enough to prototype and test
+ Custom Docker templates let you preinstall packages and configure the environment
+ SDKs for Python and JS are clean and well-documented
+ Streaming output makes it possible to show real-time results to end users

Cons

− Free tier compute hours go fast if you're running many sandboxes per request
− Custom templates require Docker knowledge to build correctly
− No built-in UI for browsing sandbox history or debugging failed runs

Who is E2B for?

AI coding agents that need to run and verify generated code
Data analysis tools where users can execute Python against uploaded files
Educational platforms that give learners a safe code execution environment
LLM applications that need a writable filesystem that lasts for the duration of a multi-step session

Alternatives to E2B

If E2B isn't quite the right fit, the closest alternatives are modal-labs and replit-agent. See our full E2B alternatives page for side-by-side comparisons.

Frequently Asked Questions

What is E2B used for?

E2B is used to give AI agents and LLM-powered applications a safe place to run code. Instead of executing model-generated code directly on your server, you send it to an E2B sandbox where it runs in an isolated VM. Your host machine never executes the untrusted code. This is particularly common in AI coding assistants, data analysis agents, and anything where an AI might generate and then execute Python or JavaScript as part of its workflow.

Is E2B open source?

Yes. The E2B sandbox runtime is open source under the Apache 2.0 license at github.com/e2b-dev/e2b. The cloud service that hosts the sandboxes is a separate commercial product, but you can review the core code, contribute, and in principle run the sandbox infrastructure yourself. Most teams use the hosted cloud service for convenience and reliability.

How fast do E2B sandboxes start?

E2B targets sub-500ms cold start times for standard sandboxes. In practice, spin-up time depends on the sandbox template. The default Python and JavaScript templates start quickly. Custom templates based on larger Docker images take longer the first time but warm up with caching. For interactive agent workflows where the user is waiting on a response, the startup latency is usually imperceptible.

How does E2B pricing work?

The free tier covers 100 sandbox compute hours per month, which is more than enough for development and light production use. Beyond that, usage is billed at $0.000225 per sandbox second, which works out to about $0.81 per sandbox hour. Pro at $25/month includes higher rate limits and priority support. Enterprise pricing is custom with SLAs. If your application runs many short sandboxes, the per-second billing model is fairly efficient.

What languages does E2B support?

E2B supports any language you can run in a Docker container. Out of the box, the pre-built templates cover Python and JavaScript/TypeScript. Custom templates can include Go, Rust, Ruby, Java, or any other language by building from a Dockerfile. Python is by far the most common use case because most AI data analysis and code generation tasks involve Python.
