Linux Kdump & Fadump

This guide simplifies the technical concepts of Kdump and Fadump, the two primary mechanisms used to capture "crash dumps" (memory snapshots) when a Linux kernel fails.

1. The Basics: Why do we need Crash Dumps?

When a server crashes or hangs, the goal is First Failure Data Capture (FFDC): capturing the contents of system memory as a vmcore file at the moment the problem occurs, so you can analyze it later with tools like crash.

Kdump (The Standard)

Kdump is the go-to mechanism for most Linux systems. It uses kexec to boot into a second, "clean" kernel (called the capture kernel) without performing a full hardware reboot.

How it works: A region of RAM is reserved at boot time for the capture kernel. When the main kernel crashes, the system immediately jumps to the capture kernel, which saves the crashed kernel's memory to disk or over the network.
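Whether the reservation described above is actually in place can be checked from the kernel command line and the kexec state files in sysfs. The following is a minimal sketch: `crashkernel_arg` and `kdump_reservation_report` are hypothetical helper names, while `/proc/cmdline`, `/sys/kernel/kexec_crash_loaded`, and `/sys/kernel/kexec_crash_size` are standard kernel interfaces.

```bash
# Hypothetical helper: print the crashkernel= value found in a kernel
# command line string (prints nothing if the parameter is absent).
crashkernel_arg() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^crashkernel=//p' | head -n 1
}

# Hypothetical helper: report the reservation state of the running system.
kdump_reservation_report() {
    echo "crashkernel parameter: $(crashkernel_arg "$(cat /proc/cmdline)")"
    # 1 means a capture kernel is loaded and ready; 0 means it is not.
    echo "capture kernel loaded: $(cat /sys/kernel/kexec_crash_loaded 2>/dev/null)"
    # Size in bytes of the memory reserved for the capture kernel.
    echo "reserved bytes: $(cat /sys/kernel/kexec_crash_size 2>/dev/null)"
}
```

Running `kdump_reservation_report` on the host after a reboot should show a non-empty crashkernel value and a non-zero reserved size if the reservation succeeded.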

Fadump (The IBM POWER Alternative)

Firmware-Assisted Dump (Fadump) is specific to IBM POWER systems. Instead of jumping straight into a preloaded capture kernel, it relies on the platform firmware to preserve the crashed kernel's memory across a reboot.

Why use it? It can be more robust than Kdump because the system goes through a full reset, so PCI slots and I/O devices are reinitialized before the dump is captured. This gives the dump a "clean" environment even if a misbehaving driver or device caused the crash.

2. Configuration Comparison: RHEL vs. SLES

While both distributions use the same underlying technology, the commands and file paths differ slightly.

Step 1: Reserving Memory

You must tell the system how much RAM to set aside for the crash mechanism via the crashkernel boot parameter.

| Feature | RHEL (Red Hat / CentOS) | SLES (SUSE) |
| --- | --- | --- |
| Tool | `grubby` | `yast2 kdump` or `/etc/default/grub` |
| Command | `grubby --args="crashkernel=2048M" --update-kernel=ALL` | Add `crashkernel=2048M` to `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub` |
| Apply Changes | Reboot system | `grub2-mkconfig -o /boot/grub2/grub.cfg` then reboot |
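On SLES, the crashkernel option has to be added to whatever is already in `GRUB_CMDLINE_LINUX_DEFAULT`; replacing the whole value would drop the existing boot options. The sketch below shows one way to append it, assuming GNU `sed` and a double-quoted value. `add_crashkernel` is a hypothetical helper, and the example works on a scratch file rather than the live `/etc/default/grub`.

```bash
# Hypothetical helper: append crashkernel=<size> to the existing
# GRUB_CMDLINE_LINUX_DEFAULT value in the given file, keeping other options.
# It does nothing if a crashkernel= option is already present on that line.
add_crashkernel() {
    file=$1
    size=$2
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*crashkernel=' "$file"; then
        return 0
    fi
    # Insert the option just before the closing double quote.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 crashkernel=${size}\"/" "$file"
}

# Example run against a scratch copy, not the live file.
scratch=$(mktemp)
printf '%s\n' 'GRUB_TIMEOUT=8' \
    'GRUB_CMDLINE_LINUX_DEFAULT="splash=silent quiet"' > "$scratch"
add_crashkernel "$scratch" 2048M
cat "$scratch"
```

After reviewing the result, the same function could be pointed at `/etc/default/grub` (as root, with a backup), followed by the `grub2-mkconfig` step and a reboot.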

Step 2: Selecting a Target

Where should the vmcore file be saved? You can choose local storage or a remote server.

Local: /var/crash (Default).
Network (SSH): Sends the dump to a remote server over an encrypted SSH connection.
Network (NFS): Mounts a remote file system to save the dump.
Raw: Writes directly to a specific partition (e.g., /dev/sdb1).
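On RHEL, each of the targets above corresponds to a directive in `/etc/kdump.conf`. The sketch below prints the lines for one target type. `kdump_target_lines` is a hypothetical helper and the host, export, and device values are placeholders; the directive names (`path`, `ssh`, `sshkey`, `nfs`, `raw`) are kdump.conf keywords. SLES configures the target differently, through `/etc/sysconfig/kdump`.

```bash
# Hypothetical helper: print the /etc/kdump.conf lines for one target type.
# $1 is the target type, $2 the remote host, NFS export, or device.
kdump_target_lines() {
    case "$1" in
        local) echo "path /var/crash" ;;
        ssh)   echo "ssh $2"
               # Key the capture environment uses to log in to the remote host.
               echo "sshkey /root/.ssh/kdump_id_rsa"
               echo "path /var/crash" ;;
        nfs)   echo "nfs $2"
               # With nfs, path is taken relative to the mounted export.
               echo "path /var/crash" ;;
        raw)   echo "raw $2" ;;
        *)     echo "unknown target type: $1" >&2
               return 1 ;;
    esac
}

# Placeholder user and host, for illustration only.
kdump_target_lines ssh kdump@dumphost.example.com
```

Only one target should be active in the file at a time, and the kdump service needs a restart after the file changes.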

3. Optimizing the Dump with `makedumpfile`

Memory dumps can be massive. To save space, kdump uses a "core collector" called makedumpfile to compress the data and leave out pages that are not needed for analysis.

Common Filtering Levels (-d flag):

Level 1: Exclude zero-filled pages.
Level 16: Exclude free pages.
Level 31: Exclude zero-filled, cache, user-data, and free pages, leaving essentially only kernel data (smallest file size, and the most commonly used setting).
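The value passed to `-d` is a sum of bit flags, one per page type, which is why 31 excludes everything that 1 and 16 exclude and more. The sketch below illustrates the arithmetic; the variable names and the `dump_level` helper are hypothetical, and the bit values follow the makedumpfile man page.

```bash
# Page types that makedumpfile can exclude, with their bit values.
ZERO=1        # pages filled with zeros
CACHE=2       # non-private cache pages
CACHE_PRIV=4  # private cache pages
USER=8        # user process data pages
FREE=16       # free pages

# Hypothetical helper: OR the given bit values into one dump level.
dump_level() {
    level=0
    for bit in "$@"; do
        level=$((level | bit))
    done
    echo "$level"
}

dump_level "$ZERO" "$FREE"                                 # prints 17
dump_level "$ZERO" "$CACHE" "$CACHE_PRIV" "$USER" "$FREE"  # prints 31
```

A lower level keeps more data at the cost of a larger file, which can matter when the problem under investigation involves user-space or page-cache contents.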

RHEL Example in /etc/kdump.conf:

core_collector makedumpfile -l --message-level 1 -d 31

4. Specifics for Fadump (IBM POWER)

Fadump must be explicitly enabled. It uses the /sys/kernel/fadump/ directory for management.

To enable Fadump:

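Add fadump=on to your kernel boot parameters using grubby (RHEL) or by editing the GRUB file (SLES), then reboot.
Verify status: cat /sys/kernel/fadump/enabled (1 means Fadump is enabled).
Registration: The kernel must register with the firmware to handle the crash: echo 1 > /sys/kernel/fadump/registered. The kdump service normally does this when it starts, so manual registration is usually only needed for troubleshooting.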
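The status checks above can be wrapped in a small script. This is a sketch under one assumption worth verifying on your kernel: older kernels exposed the same attributes as `/sys/kernel/fadump_enabled` and `/sys/kernel/fadump_registered` instead of the `/sys/kernel/fadump/` directory. `fadump_attr` is a hypothetical helper that tries both locations.

```bash
# Hypothetical helper: read a fadump sysfs attribute under the given sysfs
# root. Tries <root>/kernel/fadump/<name> first, then the older
# <root>/kernel/fadump_<name>. Prints "unsupported" if neither file exists.
fadump_attr() {
    root=$1
    name=$2
    if [ -r "$root/kernel/fadump/$name" ]; then
        cat "$root/kernel/fadump/$name"
    elif [ -r "$root/kernel/fadump_$name" ]; then
        cat "$root/kernel/fadump_$name"
    else
        echo "unsupported"
    fi
}

echo "fadump enabled:    $(fadump_attr /sys enabled)"
echo "fadump registered: $(fadump_attr /sys registered)"
```

On a system where Fadump is ready to capture a dump, both lines should report 1; on non-POWER hardware both report "unsupported".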

5. Summary Checklist

Install: Ensure kexec-tools is installed.
Reserve: Set the crashkernel size in GRUB.
Configure: Set the destination path and compression in /etc/kdump.conf (RHEL) or /etc/sysconfig/kdump (SLES).

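Test: Trigger a deliberate crash to confirm that a dump is captured. This panics the kernel immediately, so run it as root and only on a system you can afford to take down:

```bash
# Enable all SysRq functions, then force a kernel crash.
echo 1 > /proc/sys/kernel/sysrq
echo c > /proc/sysrq-trigger
```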
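Once the system is back up after the test crash, the next step is to confirm that a vmcore was actually written. The sketch below looks for the newest one, assuming the default local target `/var/crash` and a dump file named `vmcore`; `latest_vmcore` is a hypothetical helper.

```bash
# Hypothetical helper: print the most recently modified vmcore file under
# a dump directory (default /var/crash), or nothing if none exists.
latest_vmcore() {
    dir=${1:-/var/crash}
    # List every file named vmcore, newest first, and keep the first one.
    find "$dir" -type f -name vmcore -exec ls -t {} + 2>/dev/null | head -n 1
}

core=$(latest_vmcore)
if [ -n "$core" ]; then
    echo "Found dump: $core"
    ls -lh "$core"
else
    echo "No vmcore found; check the kdump service status and logs."
fi
```

The file it reports can then be opened with crash, together with a debug vmlinux that matches the crashed kernel.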
