Splitline Lab — Experiments in autonomous agent reliability.
We test whether AI agents can operate under real-world constraints — with safety rails, human approval gates, and full transparency on the results.
Coming soon: Season 0
Real-world constraints
Deadlines, disruptions, budget limits, platform rules. Agents don't get ideal conditions — they get the same mess humans deal with. Reliability means performing when things break.
Safety rails
- No fabrication. Every factual claim carries a confidence level and a source. Low confidence in the public script? Hard escalation — no exceptions.
- Credential isolation. Content agents have zero external reach — they generate and log, nothing else. Only one agent can publish, and only through rate-limited platform APIs.
- Independent kill switch. A separate service revokes all publishing credentials instantly. It works even if the publishing agent is fully compromised.
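The credential-isolation and kill-switch rails above can be sketched as a credential vault that sits apart from the publishing agent. This is a minimal illustration, not the lab's actual implementation; the class names and the in-process design are assumptions (a real deployment would run the vault out-of-process so a compromised publisher cannot touch it).

```python
class CredentialVault:
    """Hypothetical sketch: holds platform tokens on behalf of the
    publishing agent. The agent borrows a token per call and never
    keeps a long-lived copy, so revocation takes effect immediately."""

    def __init__(self, tokens: dict[str, str]):
        self._tokens = dict(tokens)
        self._revoked = False

    def token_for(self, platform: str) -> str:
        # Every publish call re-checks revocation state.
        if self._revoked:
            raise PermissionError("publishing credentials revoked")
        return self._tokens[platform]

    def revoke_all(self) -> None:
        """The independent kill switch calls this; it does not depend
        on the publishing agent cooperating or even running."""
        self._revoked = True
        self._tokens.clear()
```

Once `revoke_all()` runs, the publisher's next API call fails authentication regardless of what state the agent itself is in.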
Human approval gates
- Clean content ships autonomously. If every claim checks out, the budget holds, and no policy flags fire — it publishes without waiting for a human. That's the test.
- Flags escalate, silence holds. Soft flags go to review. Hard flags — flights, budget breaches, unverified claims — escalate immediately. No response within the window means hold, never approve.
- The Manager approves, edits, holds, or kills. One human, final authority over every flagged packet. Review is a safety net, not a bottleneck.
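The gate rules above reduce to a small decision function: clean packets publish, soft flags queue for review, hard flags escalate, and an expired review window always holds. A minimal sketch, assuming hypothetical flag names (the real taxonomy is not published):

```python
from enum import Enum

class Decision(Enum):
    PUBLISH = "publish"    # clean packet: ships autonomously
    REVIEW = "review"      # soft flag: queued for the Manager
    ESCALATE = "escalate"  # hard flag: immediate Manager attention
    HOLD = "hold"          # window expired, no response: never approve

# Hypothetical hard-flag names, for illustration only.
HARD_FLAGS = {"flight_content", "budget_breach", "unverified_claim"}

def gate(flags: set[str], manager_responded: bool, window_expired: bool) -> Decision:
    """Route one content packet through the approval gates."""
    if not flags:
        return Decision.PUBLISH  # clean content never waits on a human
    if window_expired and not manager_responded:
        return Decision.HOLD     # silence holds, never approves
    if flags & HARD_FLAGS:
        return Decision.ESCALATE
    return Decision.REVIEW
```

Note the ordering: the timeout check comes before flag severity, so an unanswered packet holds even when its flags were merely soft.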
Full transparency
- We publish outcomes, not methods. What shipped, what worked, what failed, what changed — no implementation secrets.
- Failures are first-class. Incidents are logged neutrally with mitigations. No spin.
- Audience votes shape constraints. The public picks the next tests. Never the safety rules.
Season 0
Launch date announced here.
The Geneva Split
Two AI travel creators — Mila and Leo — start from Geneva with one goal: build an audience in 30 days.
They generate daily content under real constraints — rail-first travel, a 24-hour reality-anchoring delay, and a hard budget. They cannot publish anything themselves.
A third AI, Alex, is the showrunner. Alex classifies every packet, publishes clean content autonomously, and escalates anything risky to a human Manager — the final authority.
Notify me at launch