Migrations (Safe Schema Changes)
Migrations: The Most Dangerous Deployment Step
Database migrations change the shape of your system. Application code rolls out instance by instance, but a schema change takes effect for every running instance at once. Poor migration discipline causes downtime, partial deployments, and irreversible data loss.
Migrations influence:
- Deployment safety
- Backward compatibility
- Data integrity
- Rollback capability
Real Production Failure: Rolling Deploy Breakage
A team deployed new application code that required a new column, and the same migration dropped the old column immediately. Halfway through the rollout, the pods still running the old version crashed when the column they queried disappeared.
Root cause: non-backward-compatible schema change during rolling deployment.
Core Rule: Migrations Must Be Backward-Compatible
In rolling deployments:
- Old code and new code run simultaneously
- Schema must support both versions
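One way to make this rule checkable is to compare the columns each running application version depends on against the schema a migration would leave behind. The sketch below is illustrative only: `AppVersion`, `MissingColumns`, and the `full_name` / `display_name` columns are hypothetical names, not part of any migration tool.

```go
package main

import (
	"fmt"
	"sort"
)

// AppVersion lists the columns one application build reads or writes.
// Hypothetical type used only for this sketch.
type AppVersion struct {
	Name    string
	Columns []string
}

// MissingColumns returns, per app version, the columns that version needs
// but the proposed schema does not provide. An empty map means every
// version can run against the schema.
func MissingColumns(schema []string, versions []AppVersion) map[string][]string {
	present := make(map[string]bool, len(schema))
	for _, c := range schema {
		present[c] = true
	}
	missing := make(map[string][]string)
	for _, v := range versions {
		for _, c := range v.Columns {
			if !present[c] {
				missing[v.Name] = append(missing[v.Name], c)
			}
		}
		sort.Strings(missing[v.Name])
	}
	return missing
}

func main() {
	versions := []AppVersion{
		{Name: "v1", Columns: []string{"id", "full_name"}},
		{Name: "v2", Columns: []string{"id", "display_name"}},
	}
	// Schema after a migration that adds display_name and drops full_name at once.
	unsafe := []string{"id", "display_name"}
	// Schema after an additive migration that keeps both columns.
	safe := []string{"id", "full_name", "display_name"}

	fmt.Println(MissingColumns(unsafe, versions)) // v1 is missing full_name
	fmt.Println(MissingColumns(safe, versions))   // nothing is missing
}
```

A check like this could run in CI against the set of versions that may be live during a rollout, which is usually the current release plus the one being deployed.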
Safe Pattern
- Add new column (nullable)
- Deploy code that writes both old and new fields
- Backfill data
- Switch reads to new column
- Stop writing the old column, then remove it in a later migration once no running code references it
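The steps above can be written down as an explicit plan in which every phase is its own migration or release. This is a minimal sketch under assumptions: `Step` and `RenamePlan` are hypothetical names, the table and columns are invented, and the plan makes "stop writing the old column" an explicit phase before the drop.

```go
package main

import "fmt"

// Step is one independently deployable phase of replacing a column.
type Step struct {
	Kind   string // "migration" (schema or data change) or "deploy" (application release)
	Action string
}

// RenamePlan lays out the phases for replacing oldCol with newCol on table.
// Each phase keeps the schema usable by the code that is live at that point.
func RenamePlan(table, oldCol, newCol, colType string) []Step {
	return []Step{
		{"migration", fmt.Sprintf("ALTER TABLE %s ADD COLUMN %s %s NULL;", table, newCol, colType)},
		{"deploy", fmt.Sprintf("write %s and %s; keep reading %s", oldCol, newCol, oldCol)},
		{"migration", fmt.Sprintf("backfill %s from %s in batches", newCol, oldCol)},
		{"deploy", fmt.Sprintf("read %s; keep writing both columns", newCol)},
		{"deploy", fmt.Sprintf("stop writing %s", oldCol)},
		{"migration", fmt.Sprintf("ALTER TABLE %s DROP COLUMN %s;", table, oldCol)},
	}
}

func main() {
	for i, s := range RenamePlan("users", "full_name", "display_name", "TEXT") {
		fmt.Printf("%d. [%s] %s\n", i+1, s.Kind, s.Action)
	}
}
```

Keeping the drop as the last, separate migration means a rollback of any earlier release still finds the column it expects.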
Forward-Only Migrations
In production systems, down migrations are rarely safe. Prefer forward-only migrations and controlled rollbacks of the application.
Reason: a down migration can re-create a dropped column, but it cannot restore the data that DROP COLUMN destroyed.
Transactional vs Non-Transactional DDL
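Some databases, such as PostgreSQL, support transactional DDL: schema changes made inside a transaction are rolled back together if any statement fails.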
BEGIN; ALTER TABLE users ADD COLUMN age INT; COMMIT;
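Others do not. MySQL implicitly commits the current transaction around most DDL statements, so a multi-statement migration that fails midway stays partially applied.
Know how your database behaves before you rely on rollback.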
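On a database with transactional DDL, a migration runner can wrap all statements in one transaction and roll back on the first failure. The sketch below assumes Go's standard `database/sql` package; `txLike`, `ApplyInTx`, `fakeTx`, and `demo` are hypothetical names, and the fake exists only so the example runs without a database.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
)

// txLike is the subset of *sql.Tx this sketch needs; *sql.Tx satisfies it.
type txLike interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
	Commit() error
	Rollback() error
}

// ApplyInTx runs every statement in one transaction. If any statement fails,
// it rolls back so that (on a database with transactional DDL) none of the
// earlier statements take effect.
func ApplyInTx(ctx context.Context, tx txLike, stmts []string) error {
	for i, stmt := range stmts {
		if _, err := tx.ExecContext(ctx, stmt); err != nil {
			if rbErr := tx.Rollback(); rbErr != nil {
				return fmt.Errorf("statement %d failed: %w (rollback failed: %v)", i+1, err, rbErr)
			}
			return fmt.Errorf("statement %d failed, transaction rolled back: %w", i+1, err)
		}
	}
	return tx.Commit()
}

// fakeTx records calls so the sketch can run without a real database.
type fakeTx struct {
	failOn string
	calls  []string
}

func (f *fakeTx) ExecContext(_ context.Context, query string, _ ...any) (sql.Result, error) {
	if query == f.failOn {
		return nil, errors.New("simulated DDL failure")
	}
	f.calls = append(f.calls, query)
	return nil, nil
}
func (f *fakeTx) Commit() error   { f.calls = append(f.calls, "COMMIT"); return nil }
func (f *fakeTx) Rollback() error { f.calls = append(f.calls, "ROLLBACK"); return nil }

// demo applies two statements against the fake and returns what it recorded.
func demo(failOn string) ([]string, error) {
	tx := &fakeTx{failOn: failOn}
	err := ApplyInTx(context.Background(), tx, []string{
		"ALTER TABLE users ADD COLUMN age INT",
		"ALTER TABLE users ADD COLUMN nickname TEXT",
	})
	return tx.calls, err
}

func main() {
	fmt.Println(demo(""))                                           // both statements, then COMMIT
	fmt.Println(demo("ALTER TABLE users ADD COLUMN nickname TEXT")) // first statement, then ROLLBACK
}
```

Against a real database the transaction would come from `db.BeginTx(ctx, nil)`. On a database that auto-commits DDL, the rollback in this sketch would not undo statements that already ran, which is why the behavior has to be known in advance.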
Zero-Downtime Schema Changes
Adding Columns
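Safe when the column is nullable, or when it has a default the database can apply without rewriting or heavily locking the table (PostgreSQL 11 and later, for example, add a column with a constant default without a table rewrite).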
Dropping Columns
Unsafe during a rolling deploy unless every instance that still references the column has been retired first.
Renaming Columns
Avoid renaming in place. Add the new column, migrate the data, then drop the old column in a later migration.
Long-Running ALTER Risks
ALTER TABLE on a large table can hold a lock that blocks writes for as long as the statement runs, which shows up as downtime.
Mitigation:
- Use an online schema change tool (for MySQL, gh-ost or pt-online-schema-change)
- Add the column without a default, then backfill in batches
- Test migration duration in staging with realistic data
Idempotent Migration Design
A migration should be safe to re-run after it has been partially applied.
ALTER TABLE users ADD COLUMN IF NOT EXISTS age INT;
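Use guards such as IF NOT EXISTS where the database supports them. PostgreSQL and MariaDB accept the statement above; MySQL does not, so the migration has to check for the column itself.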
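Where the database has no built-in guard, the same effect can be approximated by checking the catalog before altering. The sketch below uses an in-memory `schema` map as a stand-in for that catalog so it can run on its own; `schema`, `hasColumn`, `addColumn`, and `EnsureColumn` are all hypothetical names. In a real migration the check would typically be a query against `information_schema.columns`.

```go
package main

import "fmt"

// schema is an in-memory stand-in for the database catalog: table -> column set.
type schema map[string]map[string]bool

// hasColumn stands in for a lookup in information_schema.columns.
func (s schema) hasColumn(table, column string) bool {
	return s[table][column]
}

// addColumn stands in for ALTER TABLE ... ADD COLUMN on a database without
// IF NOT EXISTS: it fails if the column is already there.
func (s schema) addColumn(table, column string) error {
	if s[table][column] {
		return fmt.Errorf("column %q already exists on %q", column, table)
	}
	if s[table] == nil {
		s[table] = map[string]bool{}
	}
	s[table][column] = true
	return nil
}

// EnsureColumn adds the column only when it is missing and reports whether it
// changed anything, so running it twice gives the same end state.
func EnsureColumn(s schema, table, column string) (bool, error) {
	if s.hasColumn(table, column) {
		return false, nil
	}
	if err := s.addColumn(table, column); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	s := schema{"users": {"id": true}}
	fmt.Println(EnsureColumn(s, "users", "age")) // first run adds the column
	fmt.Println(EnsureColumn(s, "users", "age")) // second run is a no-op, not an error
}
```

Check-then-alter is not atomic, so this only helps when a single migration runner executes at a time.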
Tooling Options
golang-migrate
- Widely used
- Supports SQL files
- Versioned migrations
goose
- SQL and Go-based migrations
- Simple integration
Regardless of tool, the discipline matters more than tooling.
Migration Execution Strategy
- Run migrations as a separate step before rolling out the new version
- Or run them as a dedicated job inside the deployment pipeline
- Never auto-run destructive migrations at application startup without safeguards
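One possible safeguard is to scan pending statements and refuse to run destructive ones unless an operator has opted in explicitly. This is a rough heuristic sketch, not a SQL parser: `IsDestructive` and `GuardStartup` are hypothetical names, and the keyword list is an assumption that will miss some forms (for example `ALTER TABLE t DROP col` without the `COLUMN` keyword).

```go
package main

import (
	"fmt"
	"regexp"
)

// destructive matches a few statement shapes that discard schema or data.
// The list is illustrative and deliberately conservative.
var destructive = regexp.MustCompile(
	`(?i)\b(?:DROP\s+(?:TABLE|COLUMN|INDEX|SCHEMA|DATABASE)|TRUNCATE|DELETE\s+FROM)\b`)

// IsDestructive reports whether a statement looks like it removes schema or data.
func IsDestructive(stmt string) bool {
	return destructive.MatchString(stmt)
}

// GuardStartup returns an error for the first destructive statement unless
// allowDestructive has been set, for example by an explicit operator flag.
func GuardStartup(stmts []string, allowDestructive bool) error {
	for _, s := range stmts {
		if IsDestructive(s) && !allowDestructive {
			return fmt.Errorf("refusing to auto-run destructive statement: %s", s)
		}
	}
	return nil
}

func main() {
	pending := []string{
		"ALTER TABLE users ADD COLUMN age INT",
		"ALTER TABLE users DROP COLUMN legacy_age",
	}
	fmt.Println(GuardStartup(pending, false)) // reports the DROP COLUMN statement
	fmt.Println(GuardStartup(pending, true))  // no error: operator opted in
}
```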
Feature Flags + Schema Evolution
Combine feature flags with migrations:
- Deploy schema first
- Enable feature flag after verification
- Remove legacy paths later
This decouples schema change from feature activation.
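In application code, this decoupling often looks like a read path that only uses the new column when a flag is on, and falls back to the legacy column otherwise. The sketch below is a minimal illustration; `User`, `Flags`, `NameFor`, and the two name columns are hypothetical.

```go
package main

import "fmt"

// User mirrors a row that carries both the legacy and the new column.
type User struct {
	FullName    string  // legacy column
	DisplayName *string // new nullable column; nil until backfilled
}

// Flags holds feature flags; in practice these would come from a flag service
// or configuration rather than a literal.
type Flags struct {
	ReadDisplayName bool
}

// NameFor reads the new column only when the flag is on and the row has been
// backfilled; otherwise it uses the legacy path.
func NameFor(u User, f Flags) string {
	if f.ReadDisplayName && u.DisplayName != nil {
		return *u.DisplayName
	}
	return u.FullName
}

func main() {
	dn := "Ada"
	u := User{FullName: "Ada Lovelace", DisplayName: &dn}

	fmt.Println(NameFor(u, Flags{ReadDisplayName: false})) // legacy path
	fmt.Println(NameFor(u, Flags{ReadDisplayName: true}))  // new column
}
```

Turning the flag off again reverts the read path without touching the schema, and the legacy branch can be deleted once the flag has been on long enough.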
Data Backfills
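Backfills on large tables must be batched so that no single statement touches every row.
UPDATE users SET age = 0 WHERE age IS NULL LIMIT 1000;
UPDATE ... LIMIT is MySQL syntax; in PostgreSQL, select the batch by primary key in a subquery instead. Repeat the statement until it affects no rows. Small batches keep each transaction short, so locks are held only briefly.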
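The "run repeatedly" part is usually a small loop around the batch statement. The sketch below keeps the database out of the picture so it can run as-is: `BatchFunc`, `Backfill`, and `simulate` are hypothetical names, and the pause between batches is an assumed way to leave room for normal traffic.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// BatchFunc runs one bounded UPDATE and reports how many rows it changed.
type BatchFunc func(ctx context.Context) (int64, error)

// Backfill calls runBatch until a batch changes no rows, pausing between
// batches. It returns the total number of rows changed.
func Backfill(ctx context.Context, runBatch BatchFunc, pause time.Duration) (int64, error) {
	var total int64
	for {
		n, err := runBatch(ctx)
		if err != nil {
			return total, fmt.Errorf("backfill stopped after %d rows: %w", total, err)
		}
		total += n
		if n == 0 {
			return total, nil
		}
		select {
		case <-ctx.Done():
			return total, ctx.Err()
		case <-time.After(pause):
		}
	}
}

// simulate runs Backfill against an in-memory counter standing in for the table.
func simulate(rows, batchSize int64) (total int64, batches int, err error) {
	remaining := rows
	runBatch := func(ctx context.Context) (int64, error) {
		batches++
		n := batchSize
		if remaining < n {
			n = remaining
		}
		remaining -= n
		return n, nil
	}
	total, err = Backfill(context.Background(), runBatch, time.Millisecond)
	return total, batches, err
}

func main() {
	// 2500 rows in batches of 1000: three batches with work, then one empty batch.
	fmt.Println(simulate(2500, 1000))
}
```

With `database/sql`, a real `BatchFunc` would call `db.ExecContext` with the batched UPDATE and return `RowsAffected()` from the result. Because each batch only touches rows that still need the change, the loop can be restarted safely after an interruption.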
Rollback Strategy
Instead of down migration:
- Revert application version
- Keep schema compatible
- Plan corrective forward migration if needed
Testing Migrations
- Run migrations on staging with production-sized data
- Measure execution time
- Verify that no lock is held longer than your SLA allows
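A staging run can enforce a time budget automatically instead of relying on someone reading logs. The sketch below measures wall-clock duration only, which is an upper bound on, not a measurement of, lock time; `TimeMigration` and `simulatedMigration` are hypothetical names and the budgets are arbitrary.

```go
package main

import (
	"fmt"
	"time"
)

// TimeMigration runs the migration and fails if it takes longer than budget.
// It always returns the measured duration so it can be recorded.
func TimeMigration(run func() error, budget time.Duration) (time.Duration, error) {
	start := time.Now()
	if err := run(); err != nil {
		return time.Since(start), err
	}
	elapsed := time.Since(start)
	if elapsed > budget {
		return elapsed, fmt.Errorf("migration took %v, over budget of %v", elapsed, budget)
	}
	return elapsed, nil
}

// simulatedMigration stands in for applying a migration to a staging database.
func simulatedMigration() error {
	time.Sleep(20 * time.Millisecond)
	return nil
}

func main() {
	elapsed, err := TimeMigration(simulatedMigration, 5*time.Second)
	fmt.Println(elapsed, err) // within budget, so err is nil

	_, err = TimeMigration(simulatedMigration, 10*time.Millisecond)
	fmt.Println(err) // over budget, so err is non-nil
}
```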
Common Anti-Patterns
- Dropping columns during a rolling deploy
- Running a long, blocking ALTER on a large table
- Running migrations automatically without review
- Combining a schema change and a large data rewrite in one migration without measuring its impact
- Assuming a down migration is always safe
Operational Checklist
- Migrations backward-compatible
- Schema supports mixed app versions
- Long-running ALTER evaluated
- Backfills batched
- Feature flags used when necessary
- Migration tested with realistic data
- Rollback plan documented
Final Perspective
Migrations are not just SQL files. They are deployment contracts. Safe schema evolution requires coordination between application code, database behavior, and deployment strategy. In production systems, discipline in migrations prevents outages more effectively than any retry logic or scaling strategy.