Ali Rathore

October 2025

Forcing the race

Race conditions are not hard to test. They are untested, because teams accept probabilistic reproduction.

A concurrency test that passes a thousand times in a row is usually a coincidence. The classic version fires two login requests at once and checks that neither returns a 500. On a fast machine the first request clears the database before the second arrives, so the two never actually contend, and the suite stays green while the race ships untouched.

These tests are weak because they aim at the wrong layer. A provisioning race is a SELECT that checks whether something exists followed by an INSERT that creates it, with one to five milliseconds of daylight in between. Two requests both pass the check, both insert, and now there are two of something the system was built to have one of. That window is too narrow to hit by firing requests at the HTTP boundary, which is why teams decide the bug is untestable. The contention lives in the database, and nobody goes there to provoke it.
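
The check-then-insert window is easy to force open once you stop relying on timing. What follows is a minimal in-process sketch, not the database technique itself: a Python list stands in for a table with no unique constraint, and a threading.Barrier stands in for the pause between the SELECT and the INSERT. The names provision, rows, and gate are hypothetical.

```python
import threading

rows = []                      # stands in for a table with no unique constraint
rows_lock = threading.Lock()   # each statement is atomic; the pair of statements is not
gate = threading.Barrier(2)    # holds both workers between the check and the insert


def provision(org_name: str) -> None:
    # Step 1, the SELECT: does the organization exist yet?
    with rows_lock:
        exists = org_name in rows
    # Neither worker proceeds until both have finished the check,
    # so both are guaranteed to act on a stale answer.
    gate.wait(timeout=5)
    # Step 2, the INSERT, based on what the check saw.
    if not exists:
        with rows_lock:
            rows.append(org_name)


workers = [threading.Thread(target=provision, args=("acme",)) for _ in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(rows)  # ['acme', 'acme']
```

Without the barrier the same code would almost always leave one row, which is exactly how the untested version stays green.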

Going there makes the race fire on every run. A trigger on the table, gated by a session flag so it only arms under test, can pause inside the critical section and stretch those milliseconds into a couple hundred, which turns “might interleave” into “must.” Advisory locks then act as barriers: the harness holds every in-flight transaction at the instant before the conflicting write, confirms through the database’s own activity views that all of them are genuinely blocked, and releases them together. The proof comes from the same layer rather than from the API. Two requests returning 200 means nothing, because application code is full of handlers that swallow conflicts politely. Statement statistics show what actually happened: two INSERTs attempted, one unique violation raised, one row committed.
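
One way the barrier half of this could look, assuming PostgreSQL 11 or later and a DB-API cursor that uses the %s parameter style (as psycopg does). Everything named here is hypothetical: the organizations table, the racetest.armed setting, the lock key 4242, and the helper functions. How the flag gets set in the application's sessions is left out. The sketch also assumes nothing else in the database takes advisory locks while the test runs, since the activity query counts every backend waiting on one.

```python
import time

BARRIER_KEY = 4242  # hypothetical advisory lock key shared by trigger and harness

# Installed once in the test database. The trigger is a no-op unless the
# session setting racetest.armed is 'on'; when armed, each INSERT asks for a
# shared advisory lock, which blocks while the harness holds the exclusive one.
TRIGGER_DDL = f"""
CREATE OR REPLACE FUNCTION racetest_barrier() RETURNS trigger AS $$
BEGIN
    IF current_setting('racetest.armed', true) = 'on' THEN
        PERFORM pg_advisory_xact_lock_shared({BARRIER_KEY});
        -- PERFORM pg_sleep(0.2) here instead gives the plain stretched-window variant.
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER racetest_barrier_before_insert
    BEFORE INSERT ON organizations
    FOR EACH ROW EXECUTE FUNCTION racetest_barrier();
"""

HOLD_SQL = "SELECT pg_advisory_lock(%s)"       # harness takes the exclusive lock
RELEASE_SQL = "SELECT pg_advisory_unlock(%s)"  # releasing it frees every waiter at once
BLOCKED_SQL = """
SELECT count(*) FROM pg_stat_activity
WHERE wait_event_type = 'Lock' AND wait_event = 'advisory'
"""


def wait_until_blocked(cur, expected, timeout_s=5.0, poll_s=0.05):
    """Poll the activity view until `expected` backends wait on an advisory lock."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        cur.execute(BLOCKED_SQL)
        if cur.fetchone()[0] >= expected:
            return True
        time.sleep(poll_s)
    return False


def run_barrier(cur, fire_requests, expected):
    """Hold the barrier, start the requests, confirm they are blocked, release.

    `cur` is assumed to belong to an autocommit connection so each poll sees a
    fresh view. `fire_requests` must start the concurrent logins without
    waiting for them to finish (for example, by starting threads).
    """
    cur.execute(HOLD_SQL, (BARRIER_KEY,))
    try:
        fire_requests()
        return wait_until_blocked(cur, expected)
    finally:
        cur.execute(RELEASE_SQL, (BARRIER_KEY,))
```

A test would fail outright when run_barrier returns False, because that means the requests never reached the critical section together and any result that follows proves nothing.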

None of this touches the application’s source. The well-known deterministic testing systems buy their reproducibility by owning the world: a database that simulates its whole cluster inside one event loop, a platform that drives the hypervisor’s scheduler. That is unavailable when the racing code is a third-party application you operate but did not write. What you always have instead is its database. The trigger becomes your fault injector, the advisory lock your scheduler, the statistics view your proof.

This matters because just-in-time provisioning manufactures these races on purpose. First login creates the user, their organization, and their workspace, synchronously, on demand. Every time two people from the same new customer sign in during the same morning, production runs a concurrent-creation experiment for you, and the people running it do not file bug reports. They just end up belonging to two different organizations.

One case looks like a race and is not. If the code mints a fresh random identifier per request for what is supposed to be a single logical entity, two concurrent first logins create two organizations, and no lock anywhere repairs it, because the requests never agreed on the identity they were contending for. The cure is determinism: derive the identifier from the entity’s name so both requests compute the same one, and the second insert collides instead of succeeding.
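
A sketch of the difference, using name-based UUIDs from Python's standard library. The namespace value and the normalization rule (trim and lowercase) are assumptions for illustration; the only requirement is that every process derives the identifier the same way.

```python
import uuid

# Hypothetical fixed namespace; any constant UUID works if all processes share it.
ORG_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "orgs.example.invalid")


def random_org_id() -> uuid.UUID:
    # A fresh identity per request: two first logins never agree on it.
    return uuid.uuid4()


def derived_org_id(name: str) -> uuid.UUID:
    # The same name always yields the same identifier, so concurrent
    # requests contend for one key instead of creating two.
    return uuid.uuid5(ORG_NAMESPACE, name.strip().lower())


print(random_org_id() == random_org_id())              # False
print(derived_org_id("Acme") == derived_org_id("acme "))  # True
```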

That is also why the final guarantee cannot sit in application logic. A check in code is a race by construction; the only thing that makes uniqueness actually true is a unique constraint the engine enforces, correct under every interleaving the scheduler can produce. The application is there to catch the violation and recover. Force that collision on every run in CI and you no longer have a race condition. You have a regression test.
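
The catch-and-recover shape can be shown with nothing but the standard library. This sketch uses an in-memory SQLite database as a stand-in for the real engine and runs the two creations one after the other, so the second call plays the loser of a forced race. The table layout and the get_or_create_org helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE organizations (id TEXT PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)


def get_or_create_org(conn, org_id, name):
    """Insert first and let the constraint decide; on a violation, read the winner."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO organizations (id, name) VALUES (?, ?)", (org_id, name)
            )
        return org_id, True
    except sqlite3.IntegrityError:
        # The engine rejected the duplicate. Recover by adopting the existing row.
        row = conn.execute(
            "SELECT id FROM organizations WHERE name = ?", (name,)
        ).fetchone()
        return row[0], False


first = get_or_create_org(conn, "org-1", "acme")
second = get_or_create_org(conn, "org-2", "acme")
count = conn.execute("SELECT count(*) FROM organizations").fetchone()[0]

print(first, second)  # ('org-1', True) ('org-1', False)
print(count)          # 1
```

There is no existence check before the insert. The constraint is the check, and the except branch is the recovery path a CI run would exercise every time the collision is forced.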