Database Migrations

Moving and Upgrading Databases Safely


Your application code lives in Git. Every change is tracked, reviewable, and reversible. Your database schema should work the same way. Database migrations are the mechanism that makes this possible: versioned, ordered, trackable changes to your database structure that deploy alongside your application code.

The concept is simple. The execution, across 50+ production databases over two decades, is where the complexity lives. A migration that works on your local machine with 200 rows will lock a production table with 15 million rows for 40 minutes. A column rename that looks safe will break every query referencing the old name the moment it runs. A rollback that should undo the last deploy will silently drop a column that has already been populated with new data.

These are not hypothetical risks. We have encountered every one of them across the Laravel applications we build and maintain.


Why Database Migrations Fail in Production

Most tutorials teach database migrations as a straightforward pattern: write an "up" method to apply the change, write a "down" method to reverse it, run the command. This works in development. It fails in production for three reasons.

State is permanent

Code deployments replace the old version entirely. Schema changes modify a live, stateful system. A dropped column is gone. A type change may truncate data silently. There is no "git checkout" for a production database.

Scale changes the physics

Adding an index to a table with 500 rows takes milliseconds. Adding an index to a table with 50 million rows takes minutes, and the table is locked for writes the entire time. Every migration has a different risk profile depending on the data volume behind it.

Concurrency creates conflicts

During a rolling deployment, old code and new code run simultaneously. If the new code expects a column that does not exist yet, or the old code expects a column that has just been dropped, requests fail. The deployment window is not atomic.

The gap between tutorial-level migrations and production-grade database migration strategy is where most teams get bitten. The patterns that follow close that gap.


Safe Operations Versus Dangerous Operations

Not all schema migrations carry the same risk. The distinction matters because it determines your deployment strategy.

Generally safe (additive)
  • Adding a new table
  • Adding a nullable column (no default required, no table rewrite)
  • Adding an index (watch table size; see large table section below)
  • Adding a new enum value at the end of the list

Dangerous (require coordination)
  • Removing a column (requires multi-phase deploy)
  • Renaming a column or table (use expand-contract)
  • Changing a column type (safe only if lossless)
  • Dropping a table (irreversible once committed)

The rule is straightforward: anything additive is generally safe. Anything that modifies or removes existing structures needs a multi-step approach with code changes deployed before and after the schema change.
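The safe/dangerous distinction above can be automated as a review gate. Below is a minimal sketch, not our actual tooling: it scans a migration's SQL for the dangerous patterns listed in this section and classifies everything else as additive. The pattern list and function names are illustrative assumptions.

```python
import re

# Patterns for the operations this section classifies as dangerous
# (destructive changes or in-place modifications). Illustrative only.
DANGEROUS = [
    r"\bdrop\s+table\b",
    r"\bdrop\s+column\b",
    r"\brename\s+(to|column)\b",
    r"\balter\s+column\b.*\btype\b",
    r"\bmodify\s+column\b",
]

def classify(sql: str) -> str:
    """Return 'dangerous' if the statement matches a risky pattern,
    otherwise 'additive' (generally safe to deploy on its own)."""
    lowered = sql.lower()
    if any(re.search(p, lowered) for p in DANGEROUS):
        return "dangerous"
    return "additive"
```

A check like this cannot catch everything (a lossless type change looks the same as a lossy one), but it is enough to force a human review on anything destructive.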


Zero-Downtime Migration Patterns

For any application that cannot tolerate downtime during deploys, database migrations must be backward-compatible. Old code and new code must both function correctly against the schema at every intermediate state.

The expand-contract pattern

This is the workhorse pattern for any non-trivial schema change. It works in three phases.

Phase 1: Expand
Add the new structure alongside the old. Add the new column, create the new table, add the new index. Do not remove or modify anything existing. Deploy this migration and the code that writes to both old and new structures.
Phase 2: Migrate data
Backfill the new structure from the old. For small tables, a single migration query works. For large tables, batch the operation. Verify data integrity.
Phase 3: Contract
Once all code reads from the new structure exclusively, remove the old column, table, or index. This is a separate deploy, sometimes days or weeks after the expand phase.
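The three phases can be sketched end-to-end against an in-memory SQLite database. This is a compressed illustration, not a Laravel migration: in production each phase is a separate deploy, and the `users`/`name` schema here is a hypothetical example. The contract step uses `DROP COLUMN`, which requires SQLite 3.35+.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Starting point: a users table with a legacy full-name column.
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("INSERT INTO users (name) VALUES ('Ada Lovelace'), ('Alan Turing')")

# Phase 1 - Expand: add new structure alongside the old; remove nothing.
cur.execute("ALTER TABLE users ADD COLUMN first_name TEXT")
cur.execute("ALTER TABLE users ADD COLUMN last_name TEXT")

# Phase 2 - Migrate data: backfill the new columns from the old one.
cur.execute("""
    UPDATE users SET
        first_name = substr(name, 1, instr(name, ' ') - 1),
        last_name  = substr(name, instr(name, ' ') + 1)
""")

# Phase 3 - Contract: once no code reads `name`, drop it.
# In production this is a later, separate deploy (SQLite 3.35+ for DROP COLUMN).
cur.execute("ALTER TABLE users DROP COLUMN name")

rows = cur.execute("SELECT first_name, last_name FROM users ORDER BY id").fetchall()
```

At every intermediate state, code reading either the old or the new columns keeps working; that is the property the pattern exists to guarantee.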

Deployment ordering

The sequence matters. For adding a column: deploy the migration first, then deploy the code that uses it. For removing a column: deploy the code that stops using it first, then deploy the migration that drops it. Getting this backwards is one of the most common causes of production incidents during deploys.

Change Type | Safe Order | Reason
Add column/table | Migration first | New code expects the structure to exist
Remove column/table | Code first | Old code must stop referencing before removal
Add constraint | Code first (usually) | Code should already satisfy the constraint
Rename | Expand-contract | Neither order is safe alone; use three-phase approach

In our CI/CD pipelines, we enforce this ordering through deployment checklists and migration review gates.


Large Table Migrations

When a table exceeds one million rows, standard ALTER TABLE statements become risky. The database engine may need to rewrite the entire table, locking it for the duration. On a busy table, this means blocked writes, queued requests, and eventually timeouts.

Row count risk thresholds

Row Count | Risk Level | Approach
Under 100,000 | Low | Standard migrations work fine
100,000 to 1 million | Moderate | Test timing on production-size data first
1 to 10 million | High | Plan carefully, consider online schema change tools
Over 10 million | Critical | Explicit strategy per change; use gh-ost or pt-online-schema-change

Online schema change tools

These tools modify large tables without holding long-running locks.

gh-ost (GitHub)

Creates a ghost table with the new schema, copies rows incrementally, tails the binlog for ongoing changes, then performs an atomic swap. No triggers required, and the operation can be paused or throttled.

pt-online-schema-change (Percona)

Uses a similar approach with triggers to capture ongoing writes. Well-tested and battle-proven, but triggers add overhead on write-heavy tables.

For PostgreSQL, operations like adding a nullable column or creating an index concurrently (CREATE INDEX CONCURRENTLY) avoid table locks entirely. PostgreSQL's ALTER TABLE ... ADD COLUMN with a default value was rewritten in version 11 to avoid table rewrites, making it significantly safer for large tables. We select the tool based on the database engine, table size, write volume, and acceptable risk window.


Data Migrations Versus Schema Migrations

Schema migrations change the structure of your database: columns, indexes, constraints, tables. Data migrations change the content: backfilling a new column, transforming existing values, merging duplicate records. Mixing them in a single migration file is a common mistake. They should be separate because they differ in critical ways.

Aspect | Schema Migrations | Data Migrations
Speed | Usually fast (except on large tables) | Can run for hours depending on volume
Reversibility | Clean reversals (drop what you added) | May not be reversible without backups
Transaction scope | Typically transactional | Should batch rows and commit incrementally
Failure mode | Usually rolls back cleanly | May partially complete, leaving inconsistent state

In Laravel's migration system, we structure these as separate migration files with clear naming conventions: add_status_column_to_orders for schema, backfill_order_status_from_legacy_field for data. The ordering ensures the schema change runs first.

For data migrations that touch millions of rows, we use background jobs with progress tracking. The migration file dispatches the job; the job batches the data transformation. This keeps the deploy fast and the data migration observable.
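The batching logic behind such a job can be sketched as follows. This is an illustrative Python sketch, not our job code: the `orders`/`status`/`legacy_status` names follow the hypothetical example above, and SQLite stands in for the production database. The key design points are a progress cursor (so the job is resumable) and a commit per batch (so no single transaction holds millions of row locks).

```python
import sqlite3

def backfill_in_batches(conn, batch_size=1000):
    """Backfill orders.status from legacy_status in id-ordered batches,
    committing after each batch so a failure leaves resumable progress."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id FROM orders WHERE id > ? AND status IS NULL "
            "ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        ids = [r[0] for r in rows]
        conn.execute(
            f"UPDATE orders SET status = legacy_status "
            f"WHERE id IN ({','.join('?' * len(ids))})",
            ids,
        )
        conn.commit()          # incremental commit: bounded transaction size
        last_id = ids[-1]      # progress cursor makes the job resumable

# Demo: 2,500 rows backfilled in batches of 1,000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, legacy_status TEXT, status TEXT)")
conn.executemany("INSERT INTO orders (legacy_status) VALUES (?)", [("paid",)] * 2500)
backfill_in_batches(conn, batch_size=1000)
migrated = conn.execute("SELECT COUNT(*) FROM orders WHERE status = 'paid'").fetchone()[0]
```

Because the job filters on `status IS NULL`, re-running it after a crash picks up where it left off instead of redoing completed work.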


Multi-Tenant Migration Patterns

Running database migrations across multiple tenants introduces coordination challenges that single-database applications do not face. In the multi-tenant Laravel systems we build, we have settled on patterns that handle both shared-database and database-per-tenant architectures.

Shared database (tenant column)

All tenants share one database with a tenant_id column on every table. Migrations run once and affect all tenants simultaneously. The key discipline: test migrations against production-scale data, not just a single tenant's worth of rows. A table that holds 10,000 rows per tenant and serves 500 tenants has 5 million rows.

Database per tenant

Each tenant has their own database. Migrations must run against every database individually. The failure mode: 400 of 500 tenant databases migrated successfully, but 100 failed. Our migration runner iterates through tenant databases, records success or failure per tenant, and provides a dashboard showing status across the fleet.
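The core of such a fleet runner is small. The sketch below, with hypothetical tenant names and a stand-in migration function, shows the essential behaviour: record the outcome per tenant rather than aborting the whole fleet on the first failure, so the dashboard can show exactly which databases need attention.

```python
def migrate_fleet(tenant_dbs, run_migrations):
    """Run migrations against every tenant database, recording the outcome
    per tenant instead of aborting the whole fleet on the first failure."""
    results = {}
    for tenant, dsn in tenant_dbs.items():
        try:
            run_migrations(dsn)
            results[tenant] = "ok"
        except Exception as exc:
            results[tenant] = f"failed: {exc}"
    return results

# Stand-in for a real migration runner; one tenant's database is broken.
def fake_runner(dsn):
    if dsn == "bad-dsn":
        raise RuntimeError("lock timeout")

status = migrate_fleet({"acme": "ok-dsn", "globex": "bad-dsn"}, fake_runner)
```

A production version would add retries, ordering, and persistence of `results`, but the invariant is the same: one tenant's failure never leaves you guessing about the other 499.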

This is one area where the data model design has long-term consequences. Choosing shared versus per-tenant databases in the early stages of the project determines your migration complexity for the life of the application.


Testing Migrations in CI/CD Pipelines

Migrations that are not tested before production will eventually fail in production. The testing strategy depends on what can go wrong.

Structural testing

Every migration should pass an up-down-up cycle: apply it, reverse it, apply it again. This verifies that the "down" method correctly undoes the "up" method and that the migration is re-runnable. In our CI pipelines, this runs automatically on every pull request that includes a migration file.
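The up-down-up cycle is straightforward to harness. Here is a minimal sketch using SQLite and plain callables in place of a framework's migration classes (the `orders`/`status` schema is a hypothetical example; `DROP COLUMN` requires SQLite 3.35+): any step that fails raises, which is exactly how a "down" that does not truly undo its "up" gets caught.

```python
import sqlite3

# A migration as a pair of callables, mirroring up/down methods.
migration = {
    "up":   lambda c: c.execute("ALTER TABLE orders ADD COLUMN status TEXT"),
    "down": lambda c: c.execute("ALTER TABLE orders DROP COLUMN status"),
}

def up_down_up(conn, m):
    """Apply, reverse, and re-apply a migration. If 'down' does not fully
    undo 'up', the second 'up' fails (e.g. duplicate column)."""
    m["up"](conn)
    m["down"](conn)
    m["up"](conn)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
up_down_up(conn, migration)
columns = [row[1] for row in conn.execute("PRAGMA table_info(orders)")]
```

In a Laravel pipeline the same cycle is typically driven by the framework's migrate and rollback commands against a throwaway database.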

Data-volume testing

A migration that completes in 200 milliseconds against a test database with 50 rows may take 20 minutes against production data. We maintain a staging environment with production-scale data (anonymised) specifically for testing migration timing. Any migration touching a table with more than 500,000 rows requires a timing test before approval.

Dependency testing

Migrations have implicit dependencies on application code. A migration that adds a non-nullable column without a default value will fail if the application has not been updated to provide that value. CI should run the application's test suite against the migrated schema to catch these mismatches.

Migration review checklist

Before any migration reaches production, we verify six conditions.

  • Deployment order documented: does it require migration before code, or code before migration?
  • Table size assessed: does it touch a table with more than 100,000 rows?
  • Destructive changes flagged: does it modify or remove an existing column?
  • Rollback tested: has the "down" method been verified on a test database?
  • Data migration separated: does it include a data migration that should be its own file?
  • Production-scale timing measured: has it been tested against realistic data volumes?

This checklist catches the majority of migration incidents before they reach production. We enforce it through pull request templates and automated CI checks.


Common Failure Modes

Over 20 years and 50+ Laravel applications, we have seen most of the ways database migrations fail. These are the patterns that cause the most damage.

Lock timeout on a busy table

The migration requests a table lock while the table is under heavy write load. The lock request queues behind existing transactions, blocking the queries queued behind it, and the migration times out. Schedule migrations on high-traffic tables during low-traffic windows, or use online schema change tools that avoid long-running locks.

Rollback that destroys data

The "down" method drops a column added in the "up" method. Between deploy and rollback, the application wrote thousands of records to that column. The rollback drops it all. Treat rollbacks as a safety net. For columns that accumulate data, rename rather than drop.

Partial migration in a non-transactional database

MySQL does not support transactional DDL: each DDL statement commits implicitly, so a multi-statement migration cannot be rolled back as a unit. A migration that creates a table and then adds a foreign key can fail on the foreign key, leaving an orphaned table behind. Keep migrations small, with one structural change per file. PostgreSQL handles this better with transactional DDL, which is one reason we favour it when the infrastructure choice is open.

Migration ordering conflicts in team development

Two developers create migrations simultaneously for the same table. Both work individually. When merged, the ordering may cause conflicts. Use timestamps for migration ordering (Laravel's default). Review migration files in every pull request. Squash old migrations periodically.


How IGC Approaches Database Migrations

Database migrations are one of those areas where the difference between "it works" and "it works in production" is significant. Every pattern described above comes from real incidents, real production systems, and real fixes applied under pressure.

We treat migrations as first-class code. They go through code review, they have tests, they have a deployment strategy. For systems with strict audit requirements, migrations are logged and tied to deploy records.

Zero-downtime deploys: Every migration goes through the expand-contract pattern for applications that cannot afford downtime.
Multi-tenant observability: Our migration runner provides per-tenant status for database-per-tenant architectures.
Production-scale testing: Large table migrations are timed on production-size data before every deploy.

The result is that schema changes become routine. Changes that once required weekend maintenance windows deploy during business hours without anyone noticing. That is the goal: database versioning that is as boring and reliable as the rest of your deployment pipeline.


Get Your Migrations Right

If you are dealing with migration complexity across multiple databases, large tables, or multi-tenant systems, we can help. The first conversation is free and comes with no obligation. We have likely seen your specific problem before.

Book a discovery call →