Backup Storage Considerations

Snapshots vs file backups, retention, integrity checks, and restore testing.

Backup & Storage Strategy in Production

Backups are not about storing data.

Backups are about recovery guarantees.

The only question that matters:

Can you restore under pressure?

This lesson focuses on production-safe backup design, consistency, isolation, and restore validation.


Production Scenario 1 — Ransomware or Accidental Deletion

Symptoms

  • Critical data deleted
  • Files encrypted
  • Application unusable

Real Question

Do you have:

  • Recent backup?
  • Off-host backup?
  • Immutable backup?
  • Tested restore procedure?

Backup Types

File-Level (rsync, tar)

rsync -aHAX --delete /data/ /backup/data/
Pros:
  • Simple
  • Flexible
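Cons:
  • May capture an inconsistent state if files change during the copy
  • No atomic snapshot
  • --delete mirrors deletions from the source, so a single mirror keeps no history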

Filesystem Snapshots (LVM)

lvcreate --size 5G --snapshot --name data_snap /dev/vg/data
Pros:
  • Point-in-time consistency
  • Fast
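Cons:
  • Requires LVM
  • Snapshot lives in the same volume group as the origin, so it is not an off-host backup
  • Snapshot becomes invalid if changes to the origin exceed its allocated size (5G above)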
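
A snapshot only becomes a backup once its contents are copied somewhere else. The following is a minimal sketch of one full cycle, assuming the `vg` volume group and `data` volume from the command above. The mount point `/mnt/data_snap` and the destination `/backup/data/` are hypothetical, and the script defaults to a dry run that only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch: snapshot -> mount read-only -> copy -> clean up.
# VG, LV and SNAP follow the lesson's lvcreate example.
# SNAP_MNT and DEST are hypothetical paths chosen for illustration.
set -eu

VG=vg
LV=data
SNAP=data_snap
SNAP_MNT=/mnt/data_snap
DEST=/backup/data/

# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

snapshot_backup() {
    run lvcreate --size 5G --snapshot --name "$SNAP" "/dev/$VG/$LV"
    run mkdir -p "$SNAP_MNT"
    # XFS snapshots typically need "-o ro,nouuid" instead of "-o ro".
    run mount -o ro "/dev/$VG/$SNAP" "$SNAP_MNT"
    run rsync -aHAX --delete "$SNAP_MNT/" "$DEST"
    run umount "$SNAP_MNT"
    run lvremove -y "/dev/$VG/$SNAP"
}

snapshot_backup
```

Running it with `DRY_RUN=0` as root would execute the steps for real. The sketch does not clean up the snapshot if a middle step fails; a production version would likely add a `trap` for that.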

Cloud Snapshots

  • Block-level
  • Near-instant to initiate
  • Usually crash-consistent (like a sudden power loss; the application must be able to recover from it)

Backup Consistency (Critical Concept)

Filesystem Consistency

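The filesystem's data blocks are consistent as of one point in time, but the application's state may not be (for example, in-flight transactions or unflushed buffers).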

Application-Aware Backup

  • Database dump (mysqldump, pg_dump)
  • Flush and freeze operations
  • Pre-backup hooks

Never rely only on file copy for active databases.
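
As one illustration of an application-aware backup, here is a rough sketch using PostgreSQL's `pg_dump`, which takes a consistent dump even while the database is in use. The function name `dump_database`, the database name `appdb`, and the output directory are hypothetical; connection settings are assumed to come from the usual `PG*` environment variables or `~/.pgpass`.

```shell
#!/bin/sh
# Sketch: application-aware PostgreSQL dump, published only when complete.
# Database name and output directory are hypothetical examples.
set -eu

# dump_database DB OUT_DIR: prints the path of the finished dump on success.
dump_database() {
    db=$1
    out_dir=$2
    final="$out_dir/$db-$(date +%Y%m%d-%H%M%S).dump"
    tmp="$final.partial"

    # Custom format is compressed and restorable with pg_restore.
    pg_dump --format=custom --file="$tmp" "$db" || return 1

    # Reading the archive's table of contents catches unreadable dumps.
    pg_restore --list "$tmp" > /dev/null || { rm -f "$tmp"; return 1; }

    # Rename only after the checks pass, so a *.dump file is never partial.
    mv "$tmp" "$final" || return 1
    echo "$final"
}

# Example call (commented out so the file can be sourced safely):
# dump_database appdb /backup/db
```

Writing to a `.partial` name first means a monitoring job or a sync to off-host storage never picks up a half-written dump.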


Scenario 2 — rsync During Active Writes

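If rsync runs while the application is writing, the backup may contain:

  • Partially copied files
  • Files captured at different moments, leaving corrupted application state
Safer approaches:
  • Stop the application briefly during the copy
  • Copy from a snapshot instead of the live filesystem
  • Use database-native backup tools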

Incremental vs Full Backup

Full

  • Complete copy
  • Large storage cost

Incremental

  • Only files changed since the previous backup
  • Efficient in storage and time
  • Restore needs the last full backup plus every incremental taken after it

Production strategy often combines:

  • Weekly full
  • Daily incremental
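
The full-plus-incremental chain can be sketched with GNU tar's `--listed-incremental` option (this assumes GNU tar; bsdtar does not support it). All paths below are throwaway temp directories invented for the demo, not locations from this lesson.

```shell
#!/bin/sh
# Sketch: full + incremental chain with GNU tar (--listed-incremental).
# Everything lives under a temp directory created for this demo.
set -eu

work=$(mktemp -d)
src="$work/data"
archives="$work/archives"
restore="$work/restore"
mkdir -p "$src" "$archives" "$restore"

echo "v1" > "$src/config.txt"

# Level 0 (full): tar records directory state in the snapshot file.
tar --create --file="$archives/full.tar" \
    --listed-incremental="$archives/state.snar" -C "$src" .

sleep 1  # keep the demo's timestamps clearly after the full backup
echo "v2" > "$src/config.txt"
echo "new" > "$src/added.txt"

# Level 1 (incremental): only entries changed since state.snar was written.
tar --create --file="$archives/incr1.tar" \
    --listed-incremental="$archives/state.snar" -C "$src" .

# Restore: replay the chain in order. /dev/null tells tar to treat the
# archives as incremental without reading or updating a snapshot file.
for archive in full.tar incr1.tar; do
    tar --extract --file="$archives/$archive" \
        --listed-incremental=/dev/null -C "$restore"
done

cat "$restore/config.txt"
ls "$restore"
```

The restore loop is where the "more complex restore chain" shows up: losing `full.tar` or any intermediate archive breaks every restore point after it.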

Retention Strategy

Common policy:

  • Daily: 7 days
  • Weekly: 4 weeks
  • Monthly: 12 months

Retention must balance:

  • Storage cost
  • Compliance
  • Recovery window
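
A minimal sketch of enforcing the policy above by file age. It assumes backups are written as `*.tar.gz` files into `daily/`, `weekly/`, and `monthly/` directories under one root; that layout, the function names, and the example path are hypothetical.

```shell
#!/bin/sh
# Sketch: age-based pruning for the daily/weekly/monthly policy.
# The directory layout and *.tar.gz naming are assumptions for illustration.
set -eu

# prune_tier DIR DAYS: delete archives in DIR older than DAYS days.
# find rounds age down to whole days, so "+7" matches files 8+ days old.
prune_tier() {
    dir=$1
    days=$2
    [ -d "$dir" ] || return 0
    find "$dir" -maxdepth 1 -type f -name '*.tar.gz' \
        -mtime +"$days" -print -delete
}

# apply_retention ROOT: 7 days of dailies, 4 weeks of weeklies, 12 months of monthlies.
apply_retention() {
    root=$1
    prune_tier "$root/daily"   7
    prune_tier "$root/weekly"  28
    prune_tier "$root/monthly" 365
}

# Example call (hypothetical path):
# apply_retention /backup/archive
```

Replacing `-delete` with nothing (keeping `-print`) gives a safe preview of what would be removed. On immutable or object-locked storage, retention is normally enforced by the storage system's lifecycle rules rather than by a script like this.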

Ransomware Protection

  • Backups must be stored off-host
  • Use object storage with versioning
  • Enable immutability (WORM mode)
  • Use separate credentials for backup storage, so a compromised host cannot delete or overwrite backups

If an attacker can delete the backups, they are useless.


Scenario 3 — Backup Exists but Restore Fails

This is common.

Reasons:

  • Incomplete backup
  • Permission issues
  • Missing dependencies
  • Restore never tested

Restore Testing (Most Important Step)

Simulate recovery:

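rsync -aHAX /backup/data/ /restore-test/

Use the same flags as the backup (-aHAX) so hard links, ACLs, and extended attributes are restored as well. Then start the application against the restore-test environment.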

Validate:

  • Data integrity
  • Permissions
  • Application startup

Backup without restore testing = false security.
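
The data-integrity part of the validation can be automated with a checksum manifest written at backup time and verified after each test restore. This is a sketch assuming GNU coreutils and findutils; the function names, the `MANIFEST.sha256` file, and the temp directories are invented for the demo.

```shell
#!/bin/sh
# Sketch: checksum manifest written at backup time, verified after restore.
# Uses throwaway temp directories; MANIFEST.sha256 is a hypothetical name.
set -eu

# write_manifest DIR FILE: record SHA-256 of every file under DIR.
# FILE must be an absolute path outside DIR.
write_manifest() {
    ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$2"
}

# verify_restore DIR FILE: succeed only if every listed file is present
# in DIR and its content matches the recorded checksum.
verify_restore() {
    ( cd "$1" && sha256sum --check --quiet "$2" )
}

work=$(mktemp -d)
mkdir -p "$work/backup/sub" "$work/restore-test"
echo "orders" > "$work/backup/sub/db.txt"
echo "settings" > "$work/backup/app.conf"

write_manifest "$work/backup" "$work/MANIFEST.sha256"
cp -a "$work/backup/." "$work/restore-test/"

if verify_restore "$work/restore-test" "$work/MANIFEST.sha256"; then
    echo "restore verified"
else
    echo "restore FAILED verification" >&2
fi
```

The manifest catches missing and altered files, but not extra files, wrong permissions, or an application that fails to start, so those checks still have to be done separately.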


Backup Monitoring

  • Alert on failed backup jobs
  • Check for backup size anomalies (a sudden drop often means missing data)
  • Track changes in backup duration
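
A rough sketch of the size and freshness checks, written so the exit status can feed whatever alerting is already in place. It assumes GNU `stat`; the function name, path, and thresholds are hypothetical, and duration tracking is left out.

```shell
#!/bin/sh
# Sketch: freshness and minimum-size check for a backup artifact.
# Thresholds and paths are hypothetical examples.
set -eu

# check_backup FILE MIN_BYTES MAX_AGE_HOURS
# Returns 0 if FILE exists, is at least MIN_BYTES, and is not older
# than MAX_AGE_HOURS; returns 2 otherwise.
check_backup() {
    file=$1
    min_bytes=$2
    max_age_hours=$3

    if [ ! -f "$file" ]; then
        echo "CRITICAL: $file missing"
        return 2
    fi

    size=$(stat -c %s "$file")    # GNU stat: size in bytes
    mtime=$(stat -c %Y "$file")   # GNU stat: mtime as epoch seconds
    now=$(date +%s)
    age_hours=$(( (now - mtime) / 3600 ))

    if [ "$size" -lt "$min_bytes" ]; then
        echo "CRITICAL: $file is $size bytes, expected >= $min_bytes"
        return 2
    fi
    if [ "$age_hours" -gt "$max_age_hours" ]; then
        echo "CRITICAL: $file is ${age_hours}h old, limit ${max_age_hours}h"
        return 2
    fi
    echo "OK: $file ($size bytes, ${age_hours}h old)"
}

# Example call (hypothetical path and thresholds):
# check_backup /backup/db/latest.dump 1048576 26
```

Checking the artifact itself, rather than only the job's exit code, also catches the case where the backup job silently stopped being scheduled.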

Mental Model

  • Backup is a recovery contract
  • Consistency matters more than frequency
  • Off-host is mandatory
  • Immutable storage protects against ransomware
  • Restore testing is non-optional

Common Production Mistakes

  • Keeping backups on same server
  • Not testing restore
  • Backing up corrupted data
  • Ignoring database-specific backup needs
  • No retention policy
  • No monitoring of backup jobs

Production Checklist

  • Use consistent backup method (snapshot or app-aware)
  • Keep backups off-host
  • Enable immutability/versioning
  • Implement retention strategy
  • Monitor backup success
  • Test restores at least quarterly
  • Document recovery procedure