RFC 5987: Character Set and Language Encoding for HTTP Header Field Parameters

2026-05-28

RFC: RFC 5987

Published: 2010

Authors: Julian Reschke

If you've ever wondered why Content-Disposition headers in HTTP downloads contain bizarre-looking strings like filename*=UTF-8''r%C3%A9sum%C3%A9.pdf with that strange double-apostrophe sequence, RFC 5987 is the answer. It's the unsung hero that lets your web app serve a file named "日本語.pdf" or "café-menu.docx" without mangling it into gibberish.

The Problem: HTTP header fields, following a tradition rooted in RFC 822 and its descendants, are only reliably interoperable when they carry ASCII. But filenames, content descriptions, and other parameter values frequently need non-ASCII characters: accented Latin letters, Cyrillic, CJK ideographs, emoji. For years, browsers each invented their own incompatible workarounds: IE used raw percent-encoded UTF-8, Firefox tried RFC 2231, Safari did something else. The result was that Content-Disposition: attachment; filename="..." was a minefield, and developers resorted to user-agent sniffing just to make downloads work cross-browser.

The Design: RFC 5987 takes the encoding mechanism originally defined in RFC 2231 (which was built for MIME email headers) and ports a subset of it to HTTP. The author, Julian Reschke, a longtime HTTPbis contributor, deliberately stripped out the parts of RFC 2231 that nobody implemented correctly, particularly parameter continuations, which split one long value across numbered segments such as filename*0* and filename*1*. What's left is the clean part: the ext-value syntax.

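The format is three fields separated by single quotes: charset'language'value, where the value is the percent-encoded text. For example, UTF-8'en'r%C3%A9sum%C3%A9.pdf, or UTF-8''r%C3%A9sum%C3%A9.pdf when the language is left empty.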
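As a rough illustration of that three-field layout, here is a minimal Python sketch of an encoder. The function name encode_ext_value and the constant ATTR_CHAR_SAFE are made up for this example; the set of characters left unescaped follows the attr-char rule in RFC 5987, and the sketch assumes UTF-8 is always the charset.

```python
from urllib.parse import quote

# Punctuation that RFC 5987's attr-char rule allows unescaped.
# quote() already leaves ASCII letters and digits alone.
ATTR_CHAR_SAFE = "!#$&+-.^_`|~"


def encode_ext_value(text: str, language: str = "") -> str:
    """Build charset'language'value-chars, always using UTF-8 (hypothetical helper)."""
    value_chars = quote(text, safe=ATTR_CHAR_SAFE, encoding="utf-8")
    return f"UTF-8'{language}'{value_chars}"


print(encode_ext_value("r\u00e9sum\u00e9.pdf"))  # UTF-8''r%C3%A9sum%C3%A9.pdf
print(encode_ext_value("a b.txt", "en"))         # UTF-8'en'a%20b.txt
```

Note that the apostrophe itself is not in the safe set, so a literal ' inside the value is percent-encoded and can never be mistaken for a field delimiter.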

The asterisk suffix on the parameter name (e.g., filename* versus filename) is the signal that this is an extended-format value. Crucially, the spec allows both forms to appear side-by-side: a server can send filename="resume.pdf"; filename*=UTF-8''r%C3%A9sum%C3%A9.pdf. Old clients use the ASCII fallback; modern clients prefer the starred version.
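The paired-parameter pattern can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular framework does it: content_disposition is a hypothetical helper, the ASCII fallback is derived by stripping accents (one of several reasonable strategies), and "download" is an arbitrary placeholder stem for names with no ASCII characters at all.

```python
import unicodedata
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Return a header value carrying both filename and filename* (hypothetical helper)."""
    # ASCII fallback: decompose accented letters, then drop whatever is still non-ASCII.
    decomposed = unicodedata.normalize("NFKD", filename)
    fallback = decomposed.encode("ascii", "ignore").decode("ascii")
    # Keep the quoted-string simple by replacing characters that would need escaping.
    fallback = fallback.replace("\\", "_").replace('"', "_")
    if not fallback or fallback.startswith("."):
        fallback = "download" + fallback  # assumed placeholder stem

    # Extended form: percent-encoded UTF-8 with an empty language field.
    extended = quote(filename, safe="", encoding="utf-8")
    return "attachment; " + f'filename="{fallback}"; ' + f"filename*=UTF-8''{extended}"


print(content_disposition("r\u00e9sum\u00e9.pdf"))
# attachment; filename="resume.pdf"; filename*=UTF-8''r%C3%A9sum%C3%A9.pdf
```

Putting the plain filename first and the starred one second mirrors the example above; a client that understands filename* is expected to prefer it regardless of order.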

Why It Matters Today: Every time a user downloads a file with a non-English name from Gmail, Dropbox, GitHub, or your company's intranet, RFC 5987 is doing the work. It's referenced by RFC 6266 (the modern Content-Disposition spec), RFC 8187 (which obsoleted 5987 with minor tweaks), and is baked into countless web framework helpers — Django's FileResponse, Express's res.download(), ASP.NET's File() action result. If you've ever called one of those, you've shipped RFC 5987 to production.

Quirks: The language tag field exists almost as a vestige. Reschke kept it for symmetry with RFC 2231 and MIME, but in practice implementations generally ignore it. The doubled apostrophes are a visual oddity that confuses every developer on first encounter. They're not a typo: they're the two field delimiters with an empty language tag between them. Also worth noting: RFC 5987 was superseded in 2017 by RFC 8187, which dropped the requirement to support ISO-8859-1, leaving UTF-8 as the only encoding recipients must handle. The wire format is unchanged.
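Splitting a value on its first two apostrophes makes the role of the doubled quotes easy to see. The following Python sketch uses a hypothetical decode_ext_value helper; accepting both UTF-8 and ISO-8859-1 reflects the original RFC 5987 requirement, and a stricter RFC 8187-style decoder might accept UTF-8 only.

```python
from urllib.parse import unquote

SUPPORTED_CHARSETS = ("utf-8", "iso-8859-1")


def decode_ext_value(ext_value: str) -> tuple[str, str, str]:
    """Split charset'language'value-chars and decode the value (hypothetical helper)."""
    # Exactly two delimiters are expected; fewer raises ValueError on unpacking.
    charset, language, value_chars = ext_value.split("'", 2)
    if charset.lower() not in SUPPORTED_CHARSETS:
        raise ValueError(f"unsupported charset: {charset!r}")
    text = unquote(value_chars, encoding=charset, errors="strict")
    return charset, language, text


print(decode_ext_value("UTF-8''r%C3%A9sum%C3%A9.pdf"))
# ('UTF-8', '', 'résumé.pdf')
```

The middle element of the result is the empty string: the language field is present in the syntax but carries nothing.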

It's a small, focused RFC — about 12 pages — but it's a textbook example of standards work: take a messy de-facto situation, pick the cleanest of the competing implementations, document it precisely, and shepherd browsers toward convergence.

Why it matters: RFC 5987 is the reason file downloads with non-ASCII filenames work the same way in every browser — a tiny encoding convention quietly powering internationalized web apps everywhere.
