Early access opens July 2026

RL in a Box

A clearer setup layer for designing, reviewing, and reusing RL tasksets before training begins.

Built for RL teams that would rather reason about experiments than babysit one-off scaffolding.
FIRST SURFACE

A visual taskset builder

Define environments, tasks, constraints, reward signals, eval criteria, failure modes, and iteration notes in one shared workspace.

Make setup inspectable, reusable, and easier to critique before the first training run.

Why now?

As base models improve, teams can ask sharper questions, but the surrounding setup work is still slow, bespoke, and fragile. Every experiment leaves behind scripts, configs, and private notes that are hard to reuse. RL in a Box turns that layer into a product surface teams can actually inspect.

We want to hear from people who:

  • Build or modify RL environments
  • Design tasksets or curricula
  • Write reward functions or preference signals
  • Create evals for agents or models
  • Maintain internal RL tooling

Help shape the first surface.

Early users will directly influence what RL in a Box supports first.