atop: The Black Box Recorder for Your Linux System

2026-06-01

You get a Slack message at 9am: "the box was hammered at 3am, what happened?" You SSH in, run top, see normal load, and stare blankly. The moment is gone. Unless you installed atop — in which case the moment is sitting in /var/log/atop/, waiting for you to replay it.

atop is a top-like process monitor with one trick that changes everything: it runs as a daemon and writes a compact binary snapshot of CPU, memory, disk, network, and every process to disk, every 10 minutes by default. Months of history fit in a few hundred MB. You can scrub through time like a video.

Install and enable the recorder:

sudo apt install atop
sudo systemctl enable --now atop      # the recorder
sudo systemctl enable --now atopacct  # process accounting: catches short-lived procs

Replay yesterday's logfile:

atop -r /var/log/atop/atop_20260531
#   t       step forward one interval
#   T       step backward
#   b       jump to a specific time (prompts for hh:mm, e.g. 03:00)
#   m d n c switch to memory / disk / network / commandline view
#   C       sort by CPU; M by memory; D by disk

The killer feature: with process accounting enabled, atop retains processes that died during the interval. When a runaway Python script gets OOM-killed at 3:07am and vanishes, normal monitoring loses it. atop still lists it, with its name in angle brackets, its exit code or fatal signal number, and the resources it consumed during that final interval.
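To pull those dead processes out of a logfile non-interactively, a small filter over the parseable PRG records can help. This is a sketch, not a definitive recipe: exited_procs is a hypothetical helper name, and the field positions (time in field 5, name in field 8, state in field 9 with "E" meaning exited, exit code in field 14) are assumptions based on the layout described in atop(1), so check them against your version. Process names containing spaces will shift the fields.

```shell
# Hypothetical filter: print time, name, and exit code for tasks that ended
# during an interval. Reads `atop -P PRG` output on stdin.
# Assumed field layout (verify with atop(1) for your version):
#   $5 = time, $8 = name in parentheses, $9 = state ("E" = exited), $14 = exit code
exited_procs() {
  awk '$1 == "PRG" && $9 == "E" { print $5, $8, $14 }'
}
```

Usage would look like: atop -r /var/log/atop/atop_20260531 -b 03:00 -e 03:15 -P PRG | exited_procs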

Need a window around the incident?

atop -r /var/log/atop/atop_20260531 -b 02:55 -e 03:15
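If you build such windows often, the arithmetic can be scripted. The sketch below only prints the replay command for a window centered on an incident time; atop_window_cmd and the ATOP_LOGDIR variable are hypothetical names, and it assumes GNU date and the atop_YYYYMMDD naming used above. A window that crosses midnight spans two logfiles, which this helper does not handle.

```shell
# Hypothetical helper: print (not run) an atop replay command for a window
# of +/- N minutes around an incident. Assumes GNU date.
atop_window_cmd() {
  local day=$1            # e.g. 2026-05-31
  local at=$2             # e.g. 03:07
  local pad=${3:-10}      # minutes before and after; defaults to 10
  local logdir=${ATOP_LOGDIR:-/var/log/atop}
  local center begin end stamp

  # Go through epoch seconds so the minute arithmetic carries across hours.
  center=$(date -d "$day $at" +%s) || return 1
  begin=$(date -d "@$((center - pad * 60))" +%H:%M)
  end=$(date -d "@$((center + pad * 60))" +%H:%M)
  stamp=$(date -d "$day" +%Y%m%d)

  printf 'atop -r %s/atop_%s -b %s -e %s\n' "$logdir" "$stamp" "$begin" "$end"
}
```

Calling atop_window_cmd 2026-05-31 03:07 10 prints a command you can review and then paste or pipe to sh.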

The parseable mode is where scripting gets fun. Each category emits fixed-field records, so awk just works:

# Top memory hogs between 2 and 3am yesterday
# (PRM fields: $5 = time, $8 = process name, $12 = resident size in KiB)
atop -r /var/log/atop/atop_20260531 -b 02:00 -e 03:00 -P PRM \
  | awk '$1=="PRM" && $12 > 500000 {print $5, $8, $12}' \
  | sort -k3 -n -r | head

# Disk write volume per process during the incident
# (PRD fields: $8 = process name, $15 = sectors written, 512 bytes each)
atop -r /var/log/atop/atop_20260531 -b 03:00 -e 03:15 -P PRD \
  | awk '$1=="PRD" {print $8, $15}' \
  | sort | datamash -W -g 1 sum 2 | sort -k2 -n -r
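Because a PRM record is emitted for every process in every interval, a long-running process appears many times in a window. A minimal sketch of collapsing that to one line per process with its peak resident size follows. peak_rss is a hypothetical helper, and it assumes the process name is field 8 and the resident size in KiB is field 12, which is worth confirming in atop(1) for your version. Names containing spaces will shift the fields.

```shell
# Hypothetical helper: peak resident memory per process name.
# Reads `atop -P PRM` output on stdin; prints "name peak_kib", largest first.
# Assumed fields: $8 = name in parentheses, $12 = resident size in KiB.
peak_rss() {
  awk '$1 == "PRM" {
         rss = $12 + 0                     # force numeric comparison
         if (rss > peak[$8]) peak[$8] = rss
       }
       END { for (name in peak) printf "%s %d\n", name, peak[name] }' \
    | sort -k2,2nr
}
```

Usage would look like: atop -r /var/log/atop/atop_20260531 -b 02:00 -e 03:00 -P PRM | peak_rss | head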

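Labels you can request with -P include CPU, MEM, DSK, and NET for the system, and PRC, PRM, PRD, and PRN for per-process data (PRN needs the netatop kernel module), among others. Combine them with commas: -P PRC,PRM,PRD. Use -P ALL for everything.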

For aggregate reports in the style of sar, with the per-process detail that sar lacks still available in the same logfile, there's atopsar:

atopsar -A -r /var/log/atop/atop_20260531 -b 02:00 -e 04:00
atopsar -c -r /var/log/atop/atop_20260531    # cpu only
atopsar -m -r /var/log/atop/atop_20260531    # memory only

Why this beats the alternatives for single-host postmortems:

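Tune /etc/default/atop: set LOGINTERVAL=60 (INTERVAL=60 on older packages) for one snapshot per minute on critical boxes, then restart the atop service. Logs rotate daily, and stock packaging prunes them after LOGGENERATIONS days (28 by default), so raise that value if you want months of history. Disk cost: trivial. Forensic value when something breaks at 3am: enormous.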
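As a rough sketch, the settings file on a recent Debian or Ubuntu package might look like the fragment below. The variable names are an assumption about current packaging (older packages used different names, and RPM-based systems keep the file under /etc/sysconfig), so check the file your package ships before editing it.

```shell
# /etc/default/atop -- sketch only; variable names assumed from recent
# Debian/Ubuntu packaging, verify against the file on your system.
LOGINTERVAL=60         # seconds between snapshots (packaged default: 600)
LOGGENERATIONS=28      # days of daily logfiles to keep before pruning
LOGPATH=/var/log/atop  # where the daily atop_YYYYMMDD files are written
```

After editing, sudo systemctl restart atop makes the recorder pick up the new interval. Expect logfiles to grow roughly in proportion to the snapshot rate.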

Key Takeaway: Install atop on every server now, because the most useful time to have a black box recorder is before the crash you haven't had yet.
