title: | Post-mortem: A Backward Incompatible Database Migration |
---|---|
description: | Val runs failed due to a database migration that was not backward compatible |
pubDate: | 2025-04-02T00:00:00.000Z |
author: | Sophie Houser |
Today at 10:11am we experienced a 12-minute outage, which caused HTTP vals to return 503 errors and other types of vals to fail. In the end, the root cause was a deployment timing issue where database migrations were deployed successfully, but our application code deployment hung for several minutes. The new database migrations were incompatible with the old application code and crashed the process.
We aim to make all database migrations maintain backward compatibility, but in this case, we only discovered through the delayed deployment feedback that the new migrations were not compatible with previous versions.
Reliability is important to us and we’ve taken steps to make sure this doesn’t happen again. We’ve added a test to ensure database migrations are backward compatible, which we’ll run before we deploy any new code that includes database migrations.