How to Use Udis86 for x86 Disassembly: Step-by-Step Tutorial

What is Udis86

Udis86 is a lightweight, portable x86/x86-64 disassembler library and command-line tool. It converts machine code bytes into readable assembly instructions, useful for reverse engineering, debugging, teaching, and tooling that needs instruction-level analysis.

Prerequisites

  • A Unix-like system (Linux, macOS) or Windows with a POSIX environment (WSL, Cygwin, MSYS2).
  • Basic familiarity with x86 assembly and a terminal.
  • Build tools: gcc/clang, make, and Git.
  • Optional: a hex editor or objdump for inspection.

Installation (build from source)

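  1. Clone the repo (use the URL from the project page) and enter the source directory:

    Code

    git clone <udis86-repository-url>
    cd udis86

  2. Build and install:

    Code

    ./autogen.sh
    ./configure
    make
    sudo make install

    A fresh Git checkout ships without a configure script; ./autogen.sh generates it and needs autoconf, automake, and libtool. Release tarballs already include configure, so that step can be skipped there. On Windows, follow the project README for alternatives.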

Using the udis86 Command-Line Tool

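  1. Basic usage. The command-line front end that udis86 installs is named udcli. To disassemble a few bytes given as hex text:

    Code

    echo "55 48 89 e5" | udcli -64 -x -o 0x1000
    • -16, -32, -64: disassembly mode (32-bit is the default)
    • -o <pc>: starting program counter (virtual address) for the disassembly output
    • -x: read the input as whitespace-separated hex bytes instead of raw binary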
  2. Disassemble a binary file:

    Code

    udcli -64 -o 0x400000 /path/to/binary.bin

    If the file contains non-code data, skip to the code with -s <n> and limit the length with -c <n> (both are byte counts), or extract a code section first (use dd or objcopy).

  3. Read raw input from stdin:

    Code

    xxd -r -p bytes.hex | udcli -32 -o 0x0
  4. Common flags:

    • -s <n>: skip n bytes before disassembling
    • -c <n>: disassemble only n bytes
    • -noff, -nohex: hide the offset column or the hex-bytes column
    • -intel, -att: output syntax (Intel is the default)
    • -h: help

Using the udis86 Library in C

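  1. Minimal example (save as disasm_example.c):

    c

    #include <stdio.h>
    #include <stdint.h>
    #include <udis86.h>

    int main(void)
    {
        ud_t ud;
        uint8_t code[] = {0x55, 0x48, 0x89, 0xe5};  /* push rbp; mov rbp, rsp */

        ud_init(&ud);
        ud_set_input_buffer(&ud, code, sizeof(code));
        ud_set_mode(&ud, 64);               /* decode as 64-bit code */
        ud_set_pc(&ud, 0x1000);             /* address of the first byte */
        ud_set_syntax(&ud, UD_SYN_INTEL);   /* select Intel syntax for ud_insn_asm */

        while (ud_disassemble(&ud)) {       /* returns 0 when the input is exhausted */
            printf("0x%llx: %s\n",
                   (unsigned long long)ud_insn_off(&ud), ud_insn_asm(&ud));
        }
        return 0;
    }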
  2. Build:

    Code

    gcc -o disasm_example disasm_example.c -ludis86
  3. Key API calls:
    • Setup: ud_init, ud_set_input_buffer, ud_set_mode, ud_set_pc, ud_set_syntax
    • Decoding and inspection: ud_disassemble, ud_insn_asm, ud_insn_off, ud_insn_len

Tips for Accurate Disassembly

  • Set correct bit mode (32 vs 64).
  • Provide the correct starting PC to resolve relative addresses.
  • Strip data or use section boundaries to avoid disassembling non-code.
  • Use objdump/readelf to locate .text and symbol offsets.
  • For mixed code/data, use heuristics or manual inspection to find instruction entry points.

Example Workflow: Disassemble a Function from an ELF Binary

  1. Identify function address:

    Code

    readelf -s ./a.out | grep targetfunction
  2. Extract bytes for .text section:

    Code

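    objdump -s -j .text --start-address=0x401000 --stop-address=0x401050 ./a.out | awk '/^ [0-9a-f]+ /{print $2 $3 $4 $5}' | xxd -r -p > snippet.bin

    The awk filter keeps the four hex columns of each dump line and drops the address and ASCII columns. It assumes the range is a multiple of 16 bytes, as it is here, so that every dump line is full.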
  3. Disassemble:

    Code

    udcli -64 -o 0x401000 snippet.bin

Troubleshooting

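  • Instructions shown as "invalid", or output that makes no sense: check the architecture mode and confirm that the bytes are code rather than data.
  • Incorrect addresses: set the PC with -o on the command line or ud_set_pc in the library.
  • Build errors: install the build dependencies (autoconf, automake, libtool), or install a prebuilt package instead (for example, apt install libudis86-dev).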

Further Resources

  • Project repo and README for advanced build/use options.
  • udcli -h and the project documentation for the full flag and API reference.

This tutorial gives a practical, step-by-step path to disassembling x86 code with udis86, from installation to embedding the library in C programs.
