How to insert a global assembly trampoline into a binary in an LLVM backend

2026-06-04

Stack Overflow: View Question

Tags: c++, assembly, clang, llvm, compiler-construction

Score: 5 | Views: 102

The asker is building an LLVM pass that does two related things: rewrite specific call sites to go through an indirection (a trampoline), and emit that trampoline as a single, globally-visible chunk of assembly into the final binary. The call-site rewriting is the easy half — a MachineFunctionPass can swap the call target on the MI stream. The hard half is materializing the trampoline itself as a first-class symbol so the rewritten calls can name it at link time.

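Why it's tricky. LLVM gives you several places to inject assembly, each with different visibility and lifetime semantics: the IR itself, a module-level inline-asm blob (appendModuleInlineAsm), and the MC streamer driven by the AsmPrinter.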

Direction. For a trampoline that must exist exactly once per module and be referenced by name from rewritten calls, the cleanest approach is the AsmPrinter route combined with an IR-level symbol declaration:

  1. In a ModulePass, declare an external Function (no body) with the trampoline's mangled name and the correct signature. This gives the MachineFunctionPass a stable GlobalValue to retarget calls to.
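  2. Subclass the target's AsmPrinter and override emitEndOfAsmFile (or emitStartOfAsmFile) to emit the trampoline section header, .globl, alignment, the label, and the instructions: either as MCInsts through OutStreamer->emitInstruction(), or as a single OutStreamer->emitRawText() for pure asm. Note that MCInstPrinter only prints instructions as text and does not emit them, and that emitRawText() works only when the output is textual assembly; an object-file streamer does not support it.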
  3. Guard the emission with a module-level marker (a named metadata node or a module flag) so it fires once per TU and only when the rewriter actually inserted references.
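To make steps 2 and 3 concrete, here is a minimal sketch of the text a trampoline emitter might produce, plus an emit-once guard. It is deliberately free of LLVM headers so it compiles on its own. The names TrampolineSpec, buildTrampolineAsm, and TrampolineEmitter are hypothetical, and the directive spelling assumes x86-64 ELF with GNU-style syntax (other targets spell .type differently). In a real backend the resulting string would be handed to something like OutStreamer->emitRawText() or appendModuleInlineAsm, and the "referenced" bit would come from module metadata rather than a C++ member.

```cpp
#include <string>

// Hypothetical description of the trampoline; every field is illustrative.
struct TrampolineSpec {
  std::string name;    // symbol the rewritten calls refer to
  std::string body;    // target-specific instructions, one per line
  unsigned log2Align;  // alignment as a power of two, for .p2align
};

// Build the assembly text for one globally visible trampoline:
// section, visibility, alignment, symbol type, label, body, and size.
std::string buildTrampolineAsm(const TrampolineSpec &spec) {
  std::string out;
  out += "\t.text\n";
  out += "\t.globl\t" + spec.name + "\n";
  out += "\t.p2align\t" + std::to_string(spec.log2Align) + "\n";
  out += "\t.type\t" + spec.name + ",@function\n";
  out += spec.name + ":\n";
  out += spec.body;
  if (!spec.body.empty() && spec.body.back() != '\n')
    out += '\n';
  out += "\t.size\t" + spec.name + ", .-" + spec.name + "\n";
  return out;
}

// Emit-once guard: produces the text only if the rewriter recorded at
// least one reference, and only on the first request.
class TrampolineEmitter {
public:
  void noteReference() { referenced_ = true; }

  // Returns the trampoline text once; returns an empty string otherwise.
  std::string emitOnce(const TrampolineSpec &spec) {
    if (!referenced_ || emitted_)
      return {};
    emitted_ = true;
    return buildTrampolineAsm(spec);
  }

private:
  bool referenced_ = false;
  bool emitted_ = false;
};
```

A caller would invoke noteReference() each time a call site is retargeted and call emitOnce() from the end-of-file hook. Keeping the text builder separate from the guard makes it easy to swap the raw-text path for an MCInst-based one later.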

Gotchas. Section placement matters: the trampoline must land in an executable section (.text) with appropriate alignment, and on ELF you'll want .type <symbol>,@function plus a matching .size so backtraces and the dynamic linker behave. COMDAT or linkonce_odr linkage avoids duplicate-symbol errors when the trampoline is emitted into multiple objects that later link together. LTO is another landmine: appendModuleInlineAsm survives LTO but is opaque to the optimizer, while a declared function symbol survives LTO cleanly only if the linker can find the definition in some TU. Finally, if the trampoline clobbers callee-saved (non-volatile) registers, the rewritten call sites need their register mask and liveness updated so the register allocator doesn't assume those values survive the call.
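The register-mask point can be illustrated with a small standalone model. LLVM represents a call's register mask as an array of 32-bit words with one bit per physical register, where a set bit means the register is preserved across the call. The sketch below mirrors that layout with a std::vector. The helper names and the register numbers used with it are made up for illustration, and a real pass would write the new mask into the call instruction's regmask operand instead of returning a vector.

```cpp
#include <cstdint>
#include <vector>

// Models the layout of a call register mask: bit (reg % 32) of word
// (reg / 32) is set when physical register `reg` is preserved by the call.
using RegMask = std::vector<uint32_t>;

// True if `reg` is marked as surviving the call.
bool isPreserved(const RegMask &mask, unsigned reg) {
  return ((mask[reg / 32] >> (reg % 32)) & 1u) != 0;
}

// Return a copy of `mask` in which every register the trampoline
// clobbers is no longer marked as preserved. The input is untouched,
// since masks are commonly shared between many call sites.
RegMask withClobbers(RegMask mask, const std::vector<unsigned> &clobbered) {
  for (unsigned reg : clobbered)
    mask[reg / 32] &= ~(1u << (reg % 32));
  return mask;
}
```

Copying rather than mutating matters here: the calling convention's default mask is typically a shared constant, so only the rewritten call sites should see the narrowed version.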

The challenge: Synthesizing a globally-visible assembly symbol from inside an LLVM pass means picking the right layer — IR, module-asm blob, or MC streamer — each with different trade-offs for linkage, LTO, and downstream tool visibility.
