Yuzhe's Blog

yuzhes

WASM vs seccomp: Benchmarking Sandbox Startup for a Code Grader

Last week we shipped sandbox_exec — a 224-line C program using seccomp-bpf to isolate student code in AWS Lambda. The honest answer at the time was: “WASM would be cleaner, but the Python ecosystem isn’t there yet.”

This week we measured exactly what “the Python ecosystem isn’t there yet” costs in milliseconds. The answer is more nuanced than expected.

Setup

Phase 1: Startup Overhead

The first question is simple: how long does it take for the sandbox to start with trivial code?

| Environment | Mean | P95 |
|---|---|---|
| WASM (JIT) | 9.79ms | 10.86ms |
| WASM (AOT precompiled) | 9.25ms | 10.14ms |
| Python (no sandbox) | 14.71ms | 15.29ms |

WASM starts faster than Python itself. That’s the counterintuitive result here — people assume “VM = slow” but Wasmtime’s startup is tighter than CPython’s interpreter initialization.
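For readers who want to reproduce this kind of measurement, here is a minimal sketch of a startup benchmark. It is not the harness used for the numbers above; the function names, the nearest-rank P95 definition, and the `trivial.wasm` file are assumptions made for illustration.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=50):
    """Run a command `runs` times; return wall-clock times in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Return (mean, P95) using the nearest-rank percentile definition."""
    ordered = sorted(samples)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding surprises.
    rank = max(1, (95 * len(ordered) + 99) // 100)
    return statistics.fmean(ordered), ordered[rank - 1]


if __name__ == "__main__":
    # Baseline: CPython running an empty program.
    mean_ms, p95_ms = summarize(time_command([sys.executable, "-c", "pass"], runs=10))
    print(f"python  mean={mean_ms:.2f}ms  p95={p95_ms:.2f}ms")
    # Hypothetical WASM run; assumes wasmtime is on PATH and trivial.wasm exists:
    # summarize(time_command(["wasmtime", "run", "trivial.wasm"]))
```

Timing the whole subprocess includes fork and exec, which matches the "sandbox start to exit" framing used in this post. Interleaving the configurations under test, rather than running them back to back, helps keep cache warming from favoring whichever runs last.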

The breakdown of those ~10ms:

0–2ms:   fork() + exec(wasmtime)
2–7ms:   Wasmtime runtime init
         ├── command-line parsing
         ├── config loading
         └── WASI environment setup
7–9ms:   WASM module processing
         ├── file read
         ├── validation (type checking)
         └── JIT compilation
9–10ms:  execution + cleanup

The largest share, about 5 of the ~10ms, is Wasmtime’s own initialization. Module processing, including JIT compilation, accounts for only about 2ms.
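The shares are easy to read off the timeline above. A small sketch that turns the phase boundaries into fractions of the total (the dictionary layout and function name are just for illustration):

```python
# Phase boundaries in milliseconds, copied from the breakdown above.
PHASES_MS = {
    "fork() + exec(wasmtime)": (0, 2),
    "Wasmtime runtime init": (2, 7),
    "WASM module processing": (7, 9),
    "execution + cleanup": (9, 10),
}


def phase_shares(phases):
    """Return each phase's duration as a fraction of the whole timeline."""
    start = min(begin for begin, _ in phases.values())
    end = max(finish for _, finish in phases.values())
    total = end - start
    return {name: (finish - begin) / total for name, (begin, finish) in phases.items()}


for name, share in phase_shares(PHASES_MS).items():
    print(f"{name:<26} {share:.0%}")
```

Runtime init comes out at half of the total, and module processing (read, validate, JIT) at a fifth.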

Phase 2: Does Module Size Matter?

| Module size | Time | Delta |
|---|---|---|
| ~100B | 9.58ms | baseline |
| ~4KB | 9.68ms | +0.1ms |
| ~40KB | 10.97ms | +1.4ms |

A 400x increase in module size costs 1.4ms. The initialization cost dominates everything else.

Phase 3: Compute Performance

This is where WASM’s JIT advantage becomes visible.

| Workload | WASM | Python | Speedup |
|---|---|---|---|
| fib(10) | 10.06ms | 15.12ms | 1.5x |
| fib(20) | 9.63ms | 16.80ms | 1.7x |
| fib(25) | 10.77ms | 25.94ms | 2.4x |
| fib(30) | 15.91ms | 128.97ms | 8.1x |

At fib(30), WASM total time is ~16ms (10ms startup + 6ms compute). Python takes 129ms. There is no crossover in this table: WASM is ahead at every size because it also starts faster. What changes is the size of the gap, which begins to widen around fib(20-25), roughly where computation stops being negligible relative to startup.
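To separate startup from compute, one rough approach is to subtract each runtime's Phase 1 startup mean from its end-to-end time. The sketch below does that with the numbers from the tables. Treating startup as a fixed constant is an assumption, and for small inputs the result falls inside measurement noise, so negative values are clamped to zero.

```python
# End-to-end times in milliseconds from the Phase 3 table: (WASM, Python).
TOTALS_MS = {
    10: (10.06, 15.12),
    20: (9.63, 16.80),
    25: (10.77, 25.94),
    30: (15.91, 128.97),
}

# Phase 1 startup means, treated here as a fixed per-run cost (an assumption).
WASM_STARTUP_MS = 9.79
PYTHON_STARTUP_MS = 14.71


def compute_only(total_ms, startup_ms):
    """Estimate compute time; clamp at 0 because small runs sit inside noise."""
    return max(0.0, total_ms - startup_ms)


for n, (wasm_ms, python_ms) in TOTALS_MS.items():
    wasm_compute = compute_only(wasm_ms, WASM_STARTUP_MS)
    python_compute = compute_only(python_ms, PYTHON_STARTUP_MS)
    print(
        f"fib({n}): end-to-end {python_ms / wasm_ms:.1f}x, "
        f"compute wasm={wasm_compute:.2f}ms python={python_compute:.2f}ms"
    )
```

Under this subtraction, fib(30) is about 6ms of compute in WASM against about 114ms in Python, so the compute-only gap is considerably larger than the 8.1x end-to-end figure suggests.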

For a homework grader evaluating algorithmic submissions, this gap matters.

Phase 4: I/O Overhead

| Operation | WASM | Python |
|---|---|---|
| 1× fd_write | 10.16ms | 15.15ms |
| 100× fd_write | 9.97ms | 15.23ms |

100 write operations take the same time as 1; the 0.19ms difference in WASM’s favor is noise. The startup cost dominates completely, and WASI I/O overhead is negligible once you’re inside the runtime.

Phase 5: Memory Allocation

| Memory | Time |
|---|---|
| 64KB | 9.62ms |
| 1MB | 9.86ms |
| 4MB | 10.04ms |
| 16MB | 9.74ms |

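Wasmtime allocates linear memory lazily: declaring 16MB reserves address space, and physical pages are only committed when the module touches them. Declaring 16MB of memory costs almost nothing at startup.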

Phase 6: Security Features Have No Cost

| Config | Time | Notes |
|---|---|---|
| No limits | 9.58ms | baseline |
| +fuel (instruction counter) | 7.94ms | slightly faster |
| +memory limit | 7.76ms | slightly faster |
| +directory preopen | 10.50ms | +0.9ms |
| All limits | 7.91ms | |

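In these runs, adding fuel and memory limits measured faster than having none. Our first guess was that they trigger an optimized execution path, but we have not verified that. Fuel metering adds accounting code to the compiled module, so run-to-run noise or cache warming between configurations is at least as plausible an explanation. The only configuration that measured slower is directory preopen (+0.9ms for filesystem capability setup).

Taken at face value, security here has negative overhead. That’s unusual enough that the safer reading is the one in the heading: at this scale, these limits cost nothing we can measure.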
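One way to check whether a difference like this is real is to compare it against the spread of the individual runs. The sketch below computes the difference of two means and its standard error (the Welch form, which does not assume equal variances) and flags the difference only if it exceeds roughly two standard errors. The per-run samples behind the table are not published in this post, so the sample lists here are invented purely for illustration.

```python
import math
import statistics


def mean_difference(a, b):
    """Return (mean(a) - mean(b), standard error of that difference)."""
    diff = statistics.fmean(a) - statistics.fmean(b)
    # Welch-style standard error: each group keeps its own variance.
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return diff, se


def looks_real(a, b, z=2.0):
    """Heuristic: treat the difference as real if it exceeds z standard errors."""
    diff, se = mean_difference(a, b)
    return abs(diff) > z * se


# Hypothetical per-run timings in milliseconds (NOT the measurements from this post).
no_limits = [9.4, 9.9, 9.5, 9.7, 9.4]
with_fuel = [8.0, 7.9, 8.1, 7.8, 7.9]

diff, se = mean_difference(no_limits, with_fuel)
print(f"diff={diff:.2f}ms  se={se:.2f}ms  real={looks_real(no_limits, with_fuel)}")
```

A check like this does not rule out systematic effects such as run order, which is another reason to interleave configurations when collecting the samples.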

The Security Model Gap

Performance aside, the security comparison is stark:

| Dimension | sandbox_exec | WASM |
|---|---|---|
| Isolation level | Process | VM |
| Memory isolation | Shared address space | Linear memory (hard boundary) |
| Syscall control | seccomp filter | No syscalls at all |
| Filesystem | External cleanup required | Capability-gated |
| Network | seccomp-blocked | Absent by default |

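WASM doesn’t filter syscalls; the guest has no way to issue one. A WASM module running under WASI can only call the functions the host imports into it, so it cannot call socket(), ptrace(), or io_uring_setup() because there’s no mechanism to make those calls. They don’t exist from inside the sandbox. The runtime still makes syscalls on the guest’s behalf (for fd_write, for example), but only through the WASI functions it chooses to expose.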

This is a fundamentally stronger guarantee than a seccomp filter. With seccomp, you’re saying “block these 62 syscalls.” With WASM, you’re saying “there are no syscalls.” The attack surface difference is categorical.
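The difference can be shown with a toy model. This is not seccomp or Wasmtime code: the sets below are small stand-ins, and the function names are made up for the example. It takes the "block these N syscalls" framing literally and contrasts it with a guest that can only call what the host provides.

```python
# Toy model only. A short stand-in for the real list of blocked syscalls.
BLOCKED_SYSCALLS = {"socket", "ptrace", "io_uring_setup"}

# Stand-in for the WASI functions a host chooses to provide to the guest.
HOST_IMPORTS = {"fd_write", "proc_exit"}


def blocklist_allows(syscall):
    """A blocklist permits anything that is not explicitly listed."""
    return syscall not in BLOCKED_SYSCALLS


def guest_can_call(name):
    """A guest can only call functions the host provided as imports."""
    return name in HOST_IMPORTS


# A syscall added to the kernel after the blocklist was written.
new_syscall = "some_future_syscall"

print(blocklist_allows(new_syscall))  # True: nobody listed it, so it passes
print(guest_can_call(new_syscall))    # False: nobody provided it, so it is absent
```

The failure modes point in opposite directions. Forgetting an entry in a blocklist leaves something open; forgetting an import in the capability model leaves something unavailable.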

Why We’re Not Using WASM Yet

The security model is better. The compute performance is better for CPU-bound code. The startup overhead is comparable.

The problem is Python:

| Python WASM runtime | Size | C extensions | Verdict |
|---|---|---|---|
| MicroPython | 370KB | | Limited stdlib |
| RustPython | ~5MB | Partial | Incomplete |
| Pyodide | ~15MB | | Needs a JavaScript host, 500ms+ startup |

The homework grader needs numpy, scipy, and arbitrary C extensions. Pyodide supports these, but it is built with Emscripten and depends on a JavaScript engine (a browser or Node.js), so it won’t run under Wasmtime. MicroPython and RustPython don’t support the full scientific Python stack.

This isn’t a performance problem. It’s an ecosystem problem. The WASM Python toolchain is evolving fast, but it’s not there for “run arbitrary student numpy code” yet.

The Roadmap

Now:      sandbox_exec (seccomp + rlimit)
          └── Full Python + C extensions
          └── ~1.5ms sandbox + ~15ms Python startup
          └── 62 blocked syscalls

1–2 years: WASM for non-Python languages
           └── JS, Rust, Go students → WASM directly
           └── Better security, comparable performance

2–3 years: WASM Python when ecosystem matures
           └── Component Model + WASI Preview 2
           └── Hybrid: Python → sandbox_exec, others → WASM

The hybrid architecture is the likely end state: seccomp for Python (where C extension support is non-negotiable), WASM for everything else (where the ecosystem is already mature).
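As a sketch of what the hybrid dispatch might look like, the function below routes a submission to a backend by language. The names, the language set, and the choice to reject unknown languages rather than fall back to a default are all assumptions for illustration; none of this exists in sandbox_exec today.

```python
from enum import Enum


class Backend(Enum):
    SECCOMP = "sandbox_exec"
    WASM = "wasm"


# Languages the roadmap sends to WASM directly (hypothetical identifiers).
WASM_LANGUAGES = {"js", "javascript", "rust", "go"}


def pick_backend(language):
    """Choose a sandbox backend for a submission's language."""
    lang = language.strip().lower()
    if lang == "python":
        # Python keeps the seccomp path because it needs C extensions.
        return Backend.SECCOMP
    if lang in WASM_LANGUAGES:
        return Backend.WASM
    # Assumption: fail closed on anything unrecognized.
    raise ValueError(f"no sandbox backend configured for {language!r}")


print(pick_backend("Python").value)
print(pick_backend("rust").value)
```

Failing closed on unknown languages keeps a typo in a course config from silently selecting the wrong isolation model.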

Numbers Summary

If you’re evaluating WASM for a similar use case:

The 10ms startup cost is not the blocker. The Python ecosystem is.


Benchmarks by Akashi (CTO). All measurements: Wasmtime v42.0.1, macOS arm64, 50-run averages.