2026-05-10
You know the drill. You need to extract the RX bytes from ifconfig, or the PID of the process listening on port 8080, or the mount point of /dev/sda2. So you reach for awk '{print $4}', count columns, get it wrong because the output has a header line, fix it, then discover the format is different on macOS. Thirty years of shell scripting and we're still parsing whitespace by hand.
jc (by Kelly Brazil) is a Python tool that parses the output of 100+ standard Unix commands, log formats, and config files into structured JSON. Then you pipe it into jq and you're done. No regex, no column counting, no platform-specific brittleness.
pip install jc
# or: apt install jc
The basic invocation. Two equivalent forms:
dig example.com | jc --dig
jc dig example.com # magic mode — jc runs the command for you
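Now the practical wins. Find the process listening on port 8080 without the awk dance (in recent jc releases the details from -p should land under .opts; check with jc -p if your version differs):
ss -tlnp | jc --ss | jq '.[] | select(.local_port_num==8080) | .opts'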
Get the IP address of eth0 as a clean string, the same way on Linux and BSD:
jc ifconfig eth0 | jq -r '.[0].ipv4_addr'
Disk usage as actual numbers, not "47G" strings you have to parse. Drop -h so df reports sizes as plain block counts; use_percent arrives as an integer either way:
jc df | jq '.[] | select(.use_percent > 80) | .filesystem'
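If the threshold needs to vary, the filter can be wrapped in a small shell function. This is a minimal sketch, not anything jc ships: the name full_filesystems and the default of 80 are made up for illustration, and it assumes jq 1.5 or later (for --argjson) and input shaped like jc's df output, a list of objects with use_percent and filesystem.

```shell
# Print filesystems whose use_percent exceeds a threshold.
# Reads jc-style df JSON on stdin; $1 is the threshold (hypothetical default: 80).
full_filesystems() {
    # --argjson passes the threshold as a JSON number, so the comparison
    # is numeric rather than string-based.
    jq -r --argjson limit "${1:-80}" \
        '.[] | select(.use_percent > $limit) | .filesystem'
}
```

Usage would look like `jc df | full_filesystems 90`. Passing the limit through --argjson keeps shell quoting out of the jq program itself.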
What jc parses goes well beyond the obvious. There are parsers for ps, lsof, netstat, route, iptables, uptime, free, mount, last, w, who, id, uname, stat, find, tree, git log, systemctl, crontab, dpkg-l, rpm-qi, pip list... and parsers for files: /etc/passwd, /etc/hosts, /etc/fstab, resolv.conf, ssh_config, INI, X509 certs, syslog, Apache/nginx access logs, even iso-datetime strings.
The killer use case is config inspection in Ansible. Instead of regex-matching command output in your playbooks, register the output and pipe through jc:
- name: Check disk usage
  shell: df -h | jc --df
  register: df_out
- fail:
    msg: "Root partition over 90%"
  when: (df_out.stdout | from_json | selectattr('mounted_on','equalto','/') | first).use_percent > 90
For one-off log forensics, the syslog and log-format parsers are gold. Access logs in Common or Combined Log Format (nginx's default) go through --clf, which emits status as a number, not a string:
cat /var/log/nginx/access.log | jc --clf | \
jq '[.[] | select(.status==500)] | group_by(.request_url) |
map({url: .[0].request_url, count: length}) | sort_by(.count) | reverse | .[0:5]'
Top 5 URLs returning 500s, in one pipeline, no Python script.
Two tricks the docs bury. First, jc -p pretty-prints the output for interactive exploration; run it first to see the schema before writing your jq filter. Second, jc -r gives "raw" mode, where numbers and dates stay as strings, which is useful when you're feeding the output to something that does its own type coercion.
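A quick way to see what raw mode changes is to list each field next to its JSON type. The helper below is a sketch under assumptions: field_types is a made-up name, it only looks at the first record, and it expects a JSON array of objects on stdin, which is what jc emits for most command parsers.

```shell
# List each field of the first record with its JSON type, tab-separated.
# Reads a JSON array of objects on stdin (the usual shape of jc output).
field_types() {
    jq -r '.[0] | to_entries[] | "\(.key)\t\(.value | type)"'
}
```

Running `jc df | field_types` and then `jc -r df | field_types` side by side should show which fields jc converts to numbers and which it leaves as strings.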
The mainstream alternative is awk/cut/grep chains, which are write-only code that breaks when output formats drift between distros or versions. jc inverts the contract: the parser owns the format quirks, your jq filter just names the fields you want. Your scripts read like queries instead of column-counting puzzles.
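To make the drift concrete, here is a small sketch of the same awk column index landing on different fields. The header layouts follow GNU df and macOS df; the data rows and the sixth_column name are made up for illustration.

```shell
# Naive extraction: skip the header, print the sixth whitespace-separated column.
sixth_column() {
    awk 'NR > 1 { print $6 }'
}

# Sample output; the numbers are invented, only the column layout matters.
linux_df='Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/sda2      102400000 51200000  51200000  50% /'

macos_df='Filesystem   512-blocks     Used Available Capacity iused      ifree %iused  Mounted on
/dev/disk1s1  976490576 22000000 500000000     5%  500000 4881952880    0%   /'

printf '%s\n' "$linux_df" | sixth_column   # prints the mount point: /
printf '%s\n' "$macos_df" | sixth_column   # prints the iused count: 500000
```

With jc in front, both platforms would be queried by field name (for example .mounted_on) instead of by position, which is the inversion described above.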
