Filesystems (ext4/xfs) Basics

Pick defaults, mount options, and avoid performance/safety traps.

Filesystems in Production (ext4 vs XFS)

In production, filesystems do not fail politely. They remount read-only, corrupt metadata, refuse writes, or lie about available space.

You do not need academic filesystem theory. You need to understand:

  • How ext4 and XFS behave after a crash
  • How journaling affects recovery
  • What to do when the system remounts read-only
  • Why “No space left on device” can happen even when df looks fine

ext4 vs XFS – Production Differences

ext4

  • Very common default filesystem
  • Uses journaling (metadata journal)
  • Supports fsck (offline repair)
  • A sensible, low-surprise default for small and medium volumes

XFS

  • Designed for large-scale systems
  • High parallel I/O performance
  • Cannot be shrunk in practice; it can only be grown (xfs_growfs)
  • Repaired with xfs_repair (fsck.xfs does nothing and exits successfully)

Production rule: ext4 is conservative and predictable. XFS scales better under heavy write workloads.


Scenario 1 — Filesystem Remounted Read-Only

Symptoms

  • Application logs show write failures
  • System errors: "Read-only file system"
  • Container crashes

Diagnosis

dmesg -T | grep -i "read-only"
mount | grep -E '\(ro[,)]'

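If a filesystem that should be writable is mounted ro, the kernel most likely forced it read-only after detecting corruption or an I/O failure. This is what ext4 does with errors=remount-ro. XFS instead shuts the filesystem down, and operations on it return I/O errors.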
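The check above can be made stricter by matching ro as a whole mount option rather than as a substring. The following is a minimal sketch; the function name list_ro_mounts is hypothetical, and it assumes the /proc/mounts field order (device, mount point, type, options).

```shell
# List mounts whose option list contains the standalone "ro" flag.
# Input format (as in /proc/mounts): device mountpoint fstype options dump pass
list_ro_mounts() {
    mounts_file="${1:-/proc/mounts}"
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2, $3, $1; break }
    }' "$mounts_file"
}

# Usage: list_ro_mounts            (reads /proc/mounts)
#        list_ro_mounts saved.txt  (reads a saved copy)
```

Splitting the option list avoids false matches on options such as errors=remount-ro, which contain "ro" but do not mean the mount is read-only.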

Why It Happens

  • Disk I/O errors
  • Metadata corruption
  • Power loss during write
  • Storage backend instability

Fix Strategy

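Step 1: Do NOT blindly remount read-write. The tempting quick fix is:

sudo mount -o remount,rw /

If corruption or a failing disk caused the remount, writing again can worsen the damage. Find the root cause in the kernel log first.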
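One way to enforce "look before you remount" is a small gate that scans the kernel log for storage errors. This is a sketch only: both function names are hypothetical, and the grep patterns are assumptions, since the exact message wording varies by kernel version and driver.

```shell
# Count kernel log lines that suggest storage trouble. Reads log text on stdin.
# The patterns below are illustrative assumptions, not an exhaustive list.
storage_error_count() {
    grep -ciE 'i/o error|ext4-fs error|xfs.*corruption' || true
}

# Succeeds only when no suspicious lines were found on stdin.
no_storage_errors_logged() {
    count=$(storage_error_count)
    [ "$count" -eq 0 ]
}

# Usage:
#   dmesg -T | no_storage_errors_logged \
#       || echo "storage errors logged: investigate before remounting rw"
```

A clean result does not prove the disk is healthy; it only means the current kernel log holds none of the listed patterns.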

Step 2: Check filesystem type:

df -T

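If ext4:

Unmount the filesystem and run fsck. If it is the root filesystem and cannot be unmounted, boot from rescue media and run fsck from there: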

sudo umount /dev/sdX1
sudo fsck -f /dev/sdX1

If XFS:

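XFS cannot be repaired while mounted. If xfs_repair refuses to run because the log is dirty, mount and unmount the filesystem once so the log is replayed. Zeroing the log with -L is a last resort, because it can discard recent metadata updates.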

sudo umount /dev/sdX1
sudo xfs_repair /dev/sdX1

Important: fsck.xfs neither checks nor repairs. It exits successfully without doing anything. For a read-only check, use xfs_repair -n.
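The branch on filesystem type can be captured in a tiny helper that prints the right repair command instead of running it. This is a sketch; repair_command_for is a hypothetical name, and it only covers the two filesystem families discussed on this page.

```shell
# Print (do not run) the repair command for a filesystem type and device.
repair_command_for() {
    fstype="$1"
    dev="$2"
    case "$fstype" in
        ext2|ext3|ext4) echo "fsck -f $dev" ;;
        xfs)            echo "xfs_repair $dev" ;;
        *)              echo "unsupported filesystem type: $fstype" >&2
                        return 1 ;;
    esac
}

# Usage (findmnt ships with util-linux):
#   repair_command_for "$(findmnt -no FSTYPE /data)" "$(findmnt -no SOURCE /data)"
```

Printing the command keeps a human in the loop: the device must still be unmounted, and the kernel log reviewed, before anything is executed.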


Scenario 2 — "No Space Left on Device" but df Shows Free Space

Symptoms

  • Writes fail with ENOSPC ("No space left on device")
  • df -h shows free disk space

Common Causes

  • Inodes exhausted
  • Reserved blocks (ext4): the remaining free space is usable only by root
  • Open deleted files still consuming space

Diagnosis

df -i

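If IUse% is 100%, inodes are exhausted. On ext4 the inode count is fixed when the filesystem is created, so the practical fix is to delete or consolidate small files.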
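To catch inode exhaustion before it reaches 100%, the df -i output can be filtered against a threshold. A minimal sketch follows; inode_pressure is a hypothetical name, the 90% default is an arbitrary assumption, and mount points containing spaces are not handled.

```shell
# Print mount points whose inode usage is at or above a threshold (percent).
# Reads `df -iP` output on stdin; column 5 is IUse%, column 6 the mount point.
inode_pressure() {
    threshold="${1:-90}"
    awk -v t="$threshold" 'NR > 1 {
        use = $5
        sub(/%$/, "", use)
        # Some filesystems report "-" instead of a number; skip those.
        if (use ~ /^[0-9]+$/ && use + 0 >= t) print $6, use "%"
    }'
}

# Usage: df -iP | inode_pressure 90
```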

Check for deleted open files:

sudo lsof | grep deleted

If large files are deleted but still held open by a process, their disk space is not freed until the process closes them or exits.
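The deleted-but-open behavior is easy to reproduce safely in a shell. The sketch below uses a temporary file and two file descriptors of the current shell; nothing in it is specific to ext4 or XFS.

```shell
# Demonstration: an unlinked file keeps its data while a descriptor is open.
tmp=$(mktemp)
exec 3>"$tmp"          # writer keeps the file open on fd 3
exec 4<"$tmp"          # independent reader on fd 4
rm "$tmp"              # the directory entry is gone; the inode is not
printf 'hello\n' >&3   # writes still succeed and still consume space
IFS= read -r line <&4
echo "$line"           # prints: hello
exec 3>&- 4<&-         # closing the last descriptors releases the space
```

This is why restarting, or signalling a log reopen in, the process that holds the file frees the space, while deleting the file again does nothing.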


Scenario 3 — Journal Replay After Crash

After a sudden reboot, ext4 replays its journal automatically at mount time.

dmesg -T | grep -i "recovery"

ext4 journal recovery is automatic and usually safe.

XFS also performs log recovery:

dmesg -T | grep -i xfs

If journal replay fails, the filesystem usually will not mount, and manual repair (fsck for ext4, xfs_repair for XFS) may be required.


Scenario 4 — Filesystem Corruption on Large XFS Volume

XFS corruption symptoms:

  • Metadata errors in dmesg
  • Cannot create files
  • Directory listing fails

dmesg -T | grep -i "xfs"

Repair process:

sudo umount /data
sudo xfs_repair -v /dev/sdX1

Never run xfs_repair on a mounted filesystem.
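Because repair is destructive if pointed at the wrong device, it can help to rehearse the sequence in dry-run mode first. The following is a sketch under stated assumptions: run and xfs_offline_check are hypothetical helpers, DRY_RUN defaults to printing commands only, and the script is assumed to run as root when DRY_RUN=0.

```shell
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Unmount, then inspect with xfs_repair -n, which reports problems
# without modifying the filesystem and exits non-zero if it finds any.
xfs_offline_check() {
    dev="$1"
    mnt="$2"
    run umount "$mnt" || return 1
    run xfs_repair -n "$dev" \
        || echo "problems reported; next step: xfs_repair -v $dev"
}

# Usage: xfs_offline_check /dev/sdX1 /data
```

The helper stops at the read-only check on purpose, so the actual repair remains a separate, deliberate step.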


Reserved Blocks in ext4

ext4 reserves about 5% of its blocks for root by default.

sudo tune2fs -l /dev/sdX1 | grep "Reserved block"

Adjust (carefully) if needed:

sudo tune2fs -m 1 /dev/sdX1

Reducing reserved space increases usable capacity but reduces safety margin.
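To see what the reservation costs on a given volume, the block counts from tune2fs -l can be turned into a size and a percentage. This is a minimal sketch; reserved_summary is a hypothetical name, and it assumes the "Block count", "Reserved block count", and "Block size" fields printed by tune2fs -l.

```shell
# Summarize ext4 reserved space from `tune2fs -l` output on stdin.
reserved_summary() {
    awk -F': *' '
        $1 == "Block count"          { total = $2 }
        $1 == "Reserved block count" { reserved = $2 }
        $1 == "Block size"           { bsize = $2 }
        END {
            if (total > 0)
                printf "reserved: %d MiB (%.1f%% of blocks)\n",
                       reserved * bsize / 1048576, 100 * reserved / total
        }'
}

# Usage: sudo tune2fs -l /dev/sdX1 | reserved_summary
```

For example, 131072 reserved blocks of 4096 bytes out of 2621440 total come to 512 MiB, or 5.0% of the volume.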


Mental Model

  • A filesystem is not just storage: it is metadata plus allocation logic
  • ext4 is stable and conservative
  • XFS is scalable and parallel-friendly
  • Read-only remount is a protection mechanism
  • Journal replay protects metadata consistency

Common Production Mistakes

  • Running fsck on a mounted filesystem
  • Using fsck on XFS instead of xfs_repair
  • Ignoring dmesg I/O errors
  • Remounting rw without root cause analysis
  • Forgetting inode limits
  • Not checking deleted-but-open files

Production Checklist

  • Identify filesystem type with df -T
  • Check kernel logs before any repair
  • Unmount before running repair tools
  • Check inode usage (df -i)
  • Check open deleted files (lsof)
  • Validate storage backend health
  • Test write operations after repair