Backup Storage Considerations
Backup & Storage Strategy in Production
Backups are not about storing data.
Backups are about recovery guarantees.
The only question that matters:
Can you restore under pressure?
This lesson focuses on production-safe backup design, consistency, isolation, and restore validation.
Production Scenario 1 — Ransomware or Accidental Deletion
Symptoms
- Critical data deleted
- Files encrypted
- Application unusable
Real Question
Do you have:
- Recent backup?
- Off-host backup?
- Immutable backup?
- Tested restore procedure?
Backup Types
File-Level (rsync, tar)
rsync -aHAX --delete /data/ /backup/data/
Pros:
- Simple
- Flexible
Cons:
- May produce inconsistent state during writes
- No atomic snapshot
Filesystem Snapshots (LVM)
lvcreate --size 5G --snapshot --name data_snap /dev/vg/data
Pros:
- Point-in-time consistency
- Fast
Cons:
- Requires LVM
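A snapshot only becomes a backup once its contents are copied elsewhere and the snapshot is removed again, since it consumes space in the volume group while the origin keeps changing. The cycle could be sketched roughly as follows, assuming the /dev/vg/data volume from the command above. The mount point, the destination path, and the helper names run and snapshot_backup are hypothetical. Setting DRY_RUN=1 prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch: back up from an LVM snapshot instead of the live volume.
# Paths and function names are illustrative assumptions.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# usage: snapshot_backup ORIGIN_LV SNAP_NAME MOUNT_POINT DEST_DIR
snapshot_backup() {
    origin=$1
    snap_name=$2
    mount_point=$3
    dest=$4
    snap_dev="$(dirname "$origin")/$snap_name"

    run lvcreate --size 5G --snapshot --name "$snap_name" "$origin" || return 1
    run mkdir -p "$mount_point"

    status=1
    # XFS snapshots typically need "-o ro,nouuid" because the UUID matches the origin.
    if run mount -o ro "$snap_dev" "$mount_point"; then
        run rsync -aHAX --delete "$mount_point/" "$dest/"
        status=$?
        run umount "$mount_point"
    fi

    # Always drop the snapshot so it cannot fill up and become invalid.
    run lvremove --yes "$snap_dev"
    return "$status"
}
```

Called as `DRY_RUN=1 snapshot_backup /dev/vg/data data_snap /mnt/data_snap /backup/data`, it prints the six commands in order, which is a cheap way to review the sequence before running it as root.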
Cloud Snapshots
- Block-level
- Instant
- Usually crash-consistent (equivalent to the disk state after a sudden power loss)
Backup Consistency (Critical Concept)
Filesystem Consistency
The data blocks are consistent on disk, but the application's state may not be: in-flight transactions and unflushed buffers are not captured.
Application-Aware Backup
- Database dump (mysqldump, pg_dump)
- Flush and freeze operations
- Pre-backup hooks
Never rely only on file copy for active databases.
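One failure mode of dump-based backups is a half-written dump file that looks like a valid backup. The sketch below shows one way to guard against that: write the dump to a temporary file and rename it into place only when the dump command exits successfully. The function name dump_atomically, the database name appdb, and the output paths are hypothetical.

```shell
#!/bin/sh
# usage: dump_atomically OUTPUT_FILE COMMAND [ARGS...]
# Runs COMMAND, captures its stdout, and publishes the file only on success.
dump_atomically() {
    out=$1
    shift
    tmp="$out.partial.$$"

    if "$@" > "$tmp"; then
        # A rename within the same directory is atomic on POSIX filesystems.
        mv "$tmp" "$out"
    else
        rc=$?
        rm -f "$tmp"
        echo "dump failed with exit code $rc: $*" >&2
        return "$rc"
    fi
}

# Example invocations (not executed here; they need a reachable database):
#   dump_atomically /backup/db/appdb.dump pg_dump --format=custom appdb
#   dump_atomically /backup/db/appdb.sql  mysqldump --single-transaction appdb
```

With this pattern, any file at the final path came from a dump command that exited successfully, so a monitoring job can treat a missing file as a failed backup.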
Scenario 2 — rsync During Active Writes
If rsync runs while the application is writing:
- Partial file copies
- Corrupted state
Solution:
- Stop the app briefly
- Use a snapshot
- Or use database-native backup tools
Incremental vs Full Backup
Full
- Complete copy
- Large storage cost
Incremental
- Only changed files
- Efficient
- More complex restore chain
Production strategy often combines:
- Weekly full
- Daily incremental
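As one possible way to implement the weekly-full plus daily-incremental pattern, GNU tar can track what changed through a snapshot metadata file passed via --listed-incremental. The sketch below assumes GNU tar. The function names and the state.snar file name are hypothetical.

```shell
#!/bin/sh
# usage: full_backup SOURCE_DIR BACKUP_DIR
# Starts a new chain: discards the old snapshot metadata and archives everything.
full_backup() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    rm -f "$dest/state.snar"
    tar --create --gzip \
        --listed-incremental="$dest/state.snar" \
        --file="$dest/full-$(date +%Y%m%d-%H%M%S).tar.gz" \
        -C "$src" .
}

# usage: incremental_backup SOURCE_DIR BACKUP_DIR
# Archives only what changed since the previous run recorded in state.snar.
incremental_backup() {
    src=$1
    dest=$2
    [ -f "$dest/state.snar" ] || { echo "no full backup yet in $dest" >&2; return 1; }
    tar --create --gzip \
        --listed-incremental="$dest/state.snar" \
        --file="$dest/incr-$(date +%Y%m%d-%H%M%S).tar.gz" \
        -C "$src" .
}
```

Restoring such a chain means extracting the full archive first and then every incremental in order, each with `tar --extract --listed-incremental=/dev/null --file=ARCHIVE`. This is the "more complex restore chain" trade-off: losing one archive in the middle breaks everything after it.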
Retention Strategy
Common policy:
- Daily: 7 days
- Weekly: 4 weeks
- Monthly: 12 months
Retention must balance:
- Storage cost
- Compliance
- Recovery window
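A retention policy like the one above only holds if something enforces it. Here is a minimal pruning sketch, assuming backups are stored as *.tar.gz files in separate daily, weekly, and monthly directories under one root (a hypothetical layout) and that find supports -maxdepth and -delete, as GNU and BSD find do.

```shell
#!/bin/sh
# usage: prune_older_than DIR DAYS
# Deletes archives in DIR whose modification time is more than DAYS days old.
# find rounds file age down to whole days, so "-mtime +7" matches files
# that are at least 8 days old.
prune_older_than() {
    dir=$1
    days=$2
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    find "$dir" -maxdepth 1 -type f -name '*.tar.gz' -mtime +"$days" -print -delete
}

# usage: apply_retention BACKUP_ROOT
# Policy from the text: 7 daily, 4 weekly, 12 monthly (approximated in days).
apply_retention() {
    root=$1
    prune_older_than "$root/daily" 7
    prune_older_than "$root/weekly" 28
    prune_older_than "$root/monthly" 365
}
```

A call such as `apply_retention /backup` prints each file it removes, which gives the job log an audit trail. Pruning by modification time assumes the archives are never touched after they are written.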
Ransomware Protection
- Backup must be off-host
- Use object storage with versioning
- Enable immutability (WORM mode)
- Separate credentials
If an attacker can delete the backups, the backups are useless.
Scenario 3 — Backup Exists but Restore Fails
This is common.
Reasons:
- Incomplete backup
- Permission issues
- Missing dependencies
- Restore never tested
Restore Testing (Most Important Step)
Simulate recovery:
rsync -aHAX /backup/data/ /restore-test/
Start the application against the restore-test environment.
Validate:
- Data integrity
- Permissions
- Application startup
Backup without restore testing = false security.
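The data integrity and permission checks can be partly automated. The sketch below compares a manifest of permission bits and SHA-256 checksums between the backup and the restored tree. It assumes GNU coreutils (stat -c, sha256sum). The function names manifest and verify_restore are hypothetical. It does not cover file ownership or application startup, which still need their own checks.

```shell
#!/bin/sh
# usage: manifest DIR
# Prints permission bits and SHA-256 checksums for every file under DIR,
# using paths relative to DIR so two trees can be compared line by line.
manifest() {
    (
        cd "$1" || exit 1
        find . -type f -exec stat -c '%a %n' {} + | sort
        find . -type f -exec sha256sum {} + | sort -k 2
    )
}

# usage: verify_restore BACKUP_DIR RESTORED_DIR
# Returns 0 when both trees have identical files, contents, and modes.
verify_restore() {
    expected=$(manifest "$1") || return 2
    actual=$(manifest "$2") || return 2
    if [ "$expected" = "$actual" ]; then
        echo "restore verified: $2 matches $1"
    else
        echo "restore MISMATCH between $1 and $2" >&2
        return 1
    fi
}
```

After the rsync into /restore-test/, running `verify_restore /backup/data /restore-test` gives a pass/fail signal that a scheduled restore drill can act on.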
Backup Monitoring
- Alert on failed backup jobs
- Check backup file size anomalies
- Track duration changes
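Two of these checks are easy to script: a size comparison against the previous backup, and a freshness check that catches jobs which silently stopped running. This is a rough sketch with hypothetical function names and thresholds. The -mmin test is a GNU and BSD find extension.

```shell
#!/bin/sh
# usage: size_ok PREVIOUS_FILE CURRENT_FILE [MIN_PERCENT]
# Succeeds when CURRENT_FILE is at least MIN_PERCENT (default 50) of the
# size of PREVIOUS_FILE; a sudden shrink often means a truncated backup.
size_ok() {
    prev=$(wc -c < "$1") || return 2
    curr=$(wc -c < "$2") || return 2
    min_pct=${3:-50}
    [ $((curr * 100)) -ge $((prev * min_pct)) ]
}

# usage: fresh_enough FILE MAX_AGE_HOURS
# Succeeds when FILE exists and was modified within the last MAX_AGE_HOURS hours.
fresh_enough() {
    [ -n "$(find "$1" -mmin -"$(( $2 * 60 ))" 2>/dev/null)" ]
}
```

Both functions return a plain exit status, so a cron wrapper can chain them, for example `size_ok "$prev" "$curr" || notify`, where notify stands in for whatever alerting command is in use.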
Mental Model
- Backup is a recovery contract
- Consistency matters more than frequency
- Off-host is mandatory
- Immutable storage protects against ransomware
- Restore testing is non-optional
Common Production Mistakes
- Keeping backups on same server
- Not testing restore
- Backing up corrupted data
- Ignoring database-specific backup needs
- No retention policy
- No monitoring of backup jobs
Production Checklist
- Use consistent backup method (snapshot or app-aware)
- Keep backups off-host
- Enable immutability/versioning
- Implement retention strategy
- Monitor backup success
- Test restore quarterly
- Document recovery procedure