Sandboxing Student Code in Serverless: A Threat Model

Posted at 2026-03-07 # Systems # Security # Serverless # WebAssembly

EN · 中

Today my MSc project officially kicked off. The premise sounds simple: run student code safely inside AWS Lambda. The constraints make it interesting.

The Problem

Lambda Feedback is a platform where students submit code and get it evaluated in real time. The backend uses serverless functions — AWS Lambda spins up a container, runs the code, returns the result.

For performance, Lambda reuses containers. A function that handled Student A’s submission five minutes ago might handle Student B’s next. Same filesystem, same process memory, same /tmp.

That’s a problem.

[Lambda Instance]
├── /tmp          ← writable, persistent across invocations
├── env vars      ← might contain secrets
├── process memory ← Python module globals survive warm starts
└── network       ← outbound open by default

Student A can write a file to /tmp. Student B can read it. In the worst case, Student A can exfiltrate the evaluator’s logic or poison the grading environment.

What We Can’t Do

Standard OS-level isolation is off the table:

No root → no user namespaces, no unshare, no nsjail
No KVM → no Firecracker, no microVMs
No FUSE (probably) → no overlay filesystems at the process level

Lambda already applies a seccomp-bpf filter of its own. We can layer on top of it, but we can’t go beneath it.

The Defense Matrix

Here’s what’s available and what each tool covers:

Attack	seccomp	rlimit	env cleanup	/tmp clear
Fork bomb	✅	✅	—	—
Memory bomb	—	✅	—	—
Disk bomb	—	✅	—	✅
/tmp snooping	—	—	—	✅
Env var leak	⚠️	—	✅	—
/proc reading	⚠️	—	—	—
Reverse shell	✅	—	—	—
Network exfil	✅	—	—	—
setuid	✅	—	—	—

The gaps: /proc reading and environment variable leakage. seccomp can’t block getenv() — that’s a memory read, not a syscall. And /proc filtering with BPF argument inspection is fragile.

90% coverage is achievable. The remaining 10% needs creativity.

Clever Workarounds

1. `LD_PRELOAD` Interception

No kernel access needed. Compile a shim that wraps open():

// Intercept file opens at the libc level
int open(const char *path, int flags, ...) {
    if (strstr(path, "/proc") || strstr(path, "/var/task"))
        return -EACCES;
    return real_open(path, flags, ...);
}

LD_PRELOAD=/lib/shimmy_sandbox.so python3 student_submission.py

Student code calls open("/proc/self/environ") → gets denied. No kernel changes. Works anywhere LD_PRELOAD isn’t stripped.

Downside: a determined student who knows about this can work around it (call syscall() directly). It’s defense-in-depth, not a hard boundary.

2. Environment Sanitization

The simplest fix for env var leaks:

clean_env = {
    "PATH": "/usr/bin:/usr/local/bin",
    "HOME": "/tmp/student",
    "LANG": "en_US.UTF-8",
    # Everything else stripped — no AWS_*, no secrets
}
subprocess.run(["python3", "submission.py"], env=clean_env)

Zero overhead. Should be the baseline for any approach.

3. WebAssembly (The Nuclear Option)

Run student code inside a WASM runtime. Pyodide compiles CPython to WASM; Wasmer/Wasmtime provide the host.

student code → Pyodide → WASM linear memory → Wasmtime
                                              ↑
                                    No syscalls. No filesystem.
                                    Everything goes through host imports.

This solves everything — /proc, env vars, network, all of it. The WASM instance has no concept of the host filesystem.

The cost: Pyodide adds ~30MB and seconds of startup. For a platform that values fast feedback, that’s real. But it’s the only option that closes all the gaps.

The Recommended Stack

For now: fork + seccomp + rlimit + env sanitization.

Lambda invocation
  └── fork() new process
        ├── Apply seccomp-bpf filter (deny dangerous syscalls)
        ├── Apply rlimit (CPU, memory, open files)
        ├── Clean env (strip AWS_*, keep only PATH/HOME/LANG)
        ├── Clear /tmp
        └── exec student code

This covers ~90% of the threat surface with low complexity, no root, and reasonable performance overhead.

WASM goes on the roadmap as the long-term path for languages where the toolchain supports it. Python is the priority — Pyodide is production-ready enough.

What’s Next

Read the papers: Firecracker (NSDI’20), syscall interposition survey
Understand shimmy’s current architecture before touching anything
Verify what seccomp actually allows inside Lambda (empirical testing beats assumptions)
Supervisor meeting in two weeks

The interesting constraint here — userspace-only, no OS changes — forces creative solutions. That’s what makes it a research project rather than a configuration problem.

Yuzhe's Blog

yuzhes

Sandboxing Student Code in Serverless: A Threat Model

The Problem

What We Can’t Do

The Defense Matrix

Clever Workarounds

1. `LD_PRELOAD` Interception

2. Environment Sanitization

3. WebAssembly (The Nuclear Option)

The Recommended Stack

What’s Next

The Problem

What We Can’t Do

The Defense Matrix

Clever Workarounds

1. LD_PRELOAD Interception

2. Environment Sanitization

3. WebAssembly (The Nuclear Option)

The Recommended Stack

What’s Next

1. `LD_PRELOAD` Interception