2026-05-05
You write a backup script. You drop it in cron. Six months later, the database is so big the script takes 90 minutes, but cron fires it every hour. Now you have two — then three — copies racing each other, corrupting the tarball, and OOMing the box at 3am.
Junior fix: a PID file with echo $$ > /tmp/lock and a trap. It mostly works. It also leaks stale locks when a process is kill -9'd, races on creation, and has subtle bugs you'll discover in production.
The right answer has been sitting in util-linux since forever: flock(1). It uses kernel-level flock(2) advisory locks, which the kernel automatically releases when the holding process dies. No stale locks. No races. Roughly 8KB of C.
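A quick way to convince yourself of the "no stale locks" claim is to kill a lock holder with SIGKILL and check the lock afterwards. This is a minimal sketch, assuming a Linux box with util-linux flock; the temp file, the fd number 9 and the variable names are made up for the demo.

```shell
#!/bin/bash
lock=$(mktemp)

# Holder: open fd 9 on the lockfile, lock it, then exec into sleep so the
# process we kill later is the one that owns the descriptor.
( exec 9>"$lock"; flock -x 9; exec sleep 30 ) &
holder=$!

# Poll (up to ~5 s) until the holder owns the lock.
i=0
while [ "$i" -lt 50 ] && flock -n "$lock" true; do
    sleep 0.1
    i=$((i + 1))
done

# While the holder is alive, a non-blocking attempt should fail.
if flock -n "$lock" true; then while_held=free; else while_held=busy; fi

# No trap, no cleanup: just kill it the hard way.
kill -9 "$holder"
wait "$holder" 2>/dev/null

# The kernel closed fd 9 for the dead process, so the lock is gone.
if flock -n "$lock" true; then after_kill=free; else after_kill=busy; fi

echo "while held: $while_held, after kill -9: $after_kill"
rm -f "$lock"
```

The lockfile itself is still on disk after the kill. That is fine: the file is only a handle, and the lock lives in the kernel.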
# Run at most one instance. If another is running, exit silently.
* * * * * flock -n /var/lock/backup.lock /usr/local/bin/backup.sh
-n means non-blocking: if the lock is held, exit immediately with status 1. No queue buildup, no log spam.
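To see what cron would see when runs overlap, you can simulate it by hand. A rough sketch: `sleep 3` stands in for the slow backup, and the temp lockfile and variable names are placeholders.

```shell
#!/bin/bash
lock=$(mktemp)

# First run: takes the lock and holds it for the duration of the job.
flock -n "$lock" sleep 3 &
first=$!
sleep 1   # crude: give the first run time to acquire the lock

# Second run overlaps the first, so it should give up immediately.
flock -n "$lock" sleep 3
second_status=$?

wait "$first"
first_status=$?

echo "first run: $first_status, overlapping run: $second_status"
rm -f "$lock"
```

The overlapping run never starts its command at all; flock exits before `sleep` is executed.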
#!/bin/bash
(
flock -x -w 10 200 || { echo "locked, giving up" >&2; exit 1; }
# critical section — only one process can be here
rsync -a /data/ /backup/
) 200>/var/lock/sync.lock
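File descriptor 200 is opened by the redirection on the closing parenthesis, and flock locks that fd. When the subshell exits, the fd closes and the kernel drops the lock, even on kill -9, panic, or power loss. One caveat: anything the subshell leaves running in the background inherits fd 200 and keeps the lock held until it exits too.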
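If the whole script is the critical section, a common variation is to open the descriptor with `exec` instead of a subshell, so the lock lasts until the script exits or closes the fd. The sketch below is an illustration, not the only way to do it; it uses fd 9 rather than 200 so it also runs under shells that only allow single-digit descriptors, and the lockfile is a throwaway temp file.

```shell
#!/bin/bash
lock=$(mktemp)

# Open the lockfile on fd 9 for the rest of the script, then lock it.
exec 9>"$lock"
flock -n 9 || { echo "locked, giving up" >&2; exit 1; }

# Anyone else opening the same file now finds it locked.
if flock -n "$lock" true; then during=free; else during=busy; fi

# Closing the descriptor releases the lock; so would exiting the script.
exec 9>&-

if flock -n "$lock" true; then after=free; else after=busy; fi

echo "with fd 9 open: $during, after closing it: $after"
rm -f "$lock"
```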
#!/bin/bash
# Re-exec self under flock if not already locked
[ "${FLOCKER}" != "$0" ] && exec env FLOCKER="$0" flock -en "$0" "$0" "$@"
# ... rest of script runs with $0 locked ...
The script uses itself as the lockfile. No /var/lock hygiene. No cleanup. The FLOCKER env var prevents infinite re-exec.
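Here is a small harness that exercises the self-locking trick by launching the same script twice. It is a sketch with one deliberate change: the re-exec goes through `bash "$0"` instead of `"$0"`, so the demo does not depend on the temp file being executable. The temp script and the `sleep 3` payload are stand-ins.

```shell
#!/bin/bash
script=$(mktemp)

# A throwaway script that locks itself, then pretends to work for 3 s.
cat > "$script" <<'EOF'
#!/bin/bash
[ "${FLOCKER}" != "$0" ] && exec env FLOCKER="$0" flock -en "$0" bash "$0" "$@"
sleep 3
EOF

bash "$script" &
first=$!
sleep 1   # crude: let the first copy re-exec and take the lock

# Second copy: flock -n fails on the script file, so it exits with status 1.
bash "$script"
second_status=$?

wait "$first"
first_status=$?

echo "first copy: $first_status, second copy: $second_status"
rm -f "$script"
```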
Most people forget flock supports reader/writer semantics:
flock -s lockfile cat data.db # shared (multiple readers OK)
flock -x lockfile rebuild-index # exclusive (blocks readers)
Combine with -w 30 for a 30-second timeout, or -E 75 to set a custom exit code for when -n can't acquire the lock or -w times out. That lets your monitoring distinguish "lock held" from real errors.
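The reader/writer behavior and the custom exit code are easy to check together. A sketch under the usual assumptions (util-linux flock, a temp lockfile, `sleep 4` standing in for a slow reader; 75 is just the example code from above):

```shell
#!/bin/bash
lock=$(mktemp)

# A slow reader holds a shared lock for a few seconds.
flock -s "$lock" sleep 4 &
reader=$!

# Poll (up to ~5 s) until the reader actually holds its lock.
i=0
while [ "$i" -lt 50 ] && flock -x -n "$lock" true; do
    sleep 0.1
    i=$((i + 1))
done

# A second reader is admitted alongside the first.
flock -s -n "$lock" true
second_reader=$?

# A writer is refused at once with -n, and after 1 s with -w 1.
flock -x -n -E 75 "$lock" true
writer_nonblock=$?
flock -x -w 1 -E 75 "$lock" true
writer_timeout=$?

# Once the reader is done, the writer gets in.
wait "$reader"
flock -x -n -E 75 "$lock" true
writer_after=$?

echo "reader: $second_reader, writer: $writer_nonblock/$writer_timeout, later: $writer_after"
rm -f "$lock"
```

In a real job, the wrapped command's own exit status passes through flock unchanged, so pick a conflict code your command never returns.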
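mkdir as mutex: mkdir works but leaks on kill -9. flock can't.
lockfile(1) from procmail: flock is in util-linux, ships with virtually every Linux distribution, and has simpler semantics.
OnUnitActiveSec (systemd timers): flock needs no systemd, so it also works on the pre-systemd relic in the corner.

flock locks are per-fd, not per-file (strictly, per open file description). If your script opens the same lockfile twice in two redirections, the two descriptors count as independent lock holders: a lock taken through one blocks an exclusive lock attempt through the other, even inside the same process. And NFS support depends on your server; for cross-host coordination, use a real lock manager. For everything on a single box, flock is the answer.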
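The per-descriptor behavior is easiest to understand by watching one shell fight itself. A minimal sketch, with arbitrary fd numbers and a temp lockfile:

```shell
#!/bin/bash
lock=$(mktemp)

# Two separate opens of the same file: two independent lock holders.
exec 7>"$lock" 8>"$lock"
# A duplicate of fd 7: same open file description, so same lock holder.
exec 9>&7

flock -n 7
first=$?     # takes the lock

flock -n 8
second=$?    # refused: conflicts with the lock held through fd 7

flock -n 9
dup=$?       # succeeds: fd 9 shares fd 7's lock

echo "fd 7: $first, fd 8 (second open): $second, fd 9 (dup of 7): $dup"

exec 7>&- 8>&- 9>&-
rm -f "$lock"
```

Without -n, the second call would block forever waiting on a lock its own process holds, which is the kind of bug that only shows up at 3am.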
Check it: flock --help. It's already installed. You just never read the man page.
flock gives you race-free, kernel-cleaned mutex locks in one line of shell.
