WinAFL — Fuzzing Windows Binaries

WinAFL is Ivan Fratric's Windows port of AFL. Where AFL on Linux forks a template process off a fork-server, WinAFL hooks a single target function inside the target process and drives it in an in-memory loop — no fork(), no re-execing the binary for every input. The result is a fast coverage-guided fuzzer for closed-source Windows software: parsers, file-format readers, network servers, decoders, and any COM/ActiveX surface you can isolate into a single callable routine.

This note is a working reference. Commands first, theory where it clarifies why a flag exists. Tested against WinAFL built from master with DynamoRIO 9.x on Windows 10/11 x64.

Architecture in One Page

┌────────────────────────────────────────────────────────┐
│                    afl-fuzz.exe                       │  ← mutation engine,
│  - picks testcase from queue                           │    schedules inputs,
│  - writes it to input file / stdin / shared memory     │    tracks coverage bitmap
│  - signals target via pipe / shared memory             │
└──────────┬─────────────────────────────────────────────┘
           │  named pipe  (\\.\pipe\afl_pipe_default)
           │  + shared memory (__AFL_SHM_ID)
           ▼
┌────────────────────────────────────────────────────────┐
│                   drrun.exe  (DynamoRIO)               │  ← dynamic binary
│    loads winafl.dll client into the target process     │    instrumentation
└──────────┬─────────────────────────────────────────────┘
           │
           ▼
┌────────────────────────────────────────────────────────┐
│                 target.exe (your victim)               │
│    main() → … → target_function(argA, argB)  ◄──┐      │
│                  │                              │      │
│                  └── winafl.dll intercepts:     │      │
│                      * saves registers/stack    │      │
│                      * executes function        │      │
│                      * on return: restore and ──┘      │
│                        loop up to -fuzz_iterations      │
└────────────────────────────────────────────────────────┘

Three pieces matter:

afl-fuzz.exe — the coverage-guided mutator. Manages the queue, dictionaries, scheduler.
DynamoRIO (drrun.exe + winafl.dll) — instruments the target at basic-block granularity and records coverage into the shared bitmap: basic-block coverage by default, edge coverage with -covtype edge.
Target function — the routine you pick inside the target binary. WinAFL saves its register/stack state the first time it's hit, runs it, restores, and loops. Each loop = one fuzz iteration.

The instrumentation mode is pluggable — you can replace DynamoRIO with Intel PT (hardware tracing) or Syzygy (static binary rewriting) without changing the rest.
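To make the coverage bitmap concrete, here is a minimal sketch of AFL-style edge counting. It assumes a 64 KB map and made-up basic-block IDs; the real client is C code running inside DynamoRIO, so treat this as an illustration of the idea rather than WinAFL's implementation.

```python
MAP_SIZE = 1 << 16  # assumed bitmap size (AFL's classic default)


def trace_edges(block_ids, map_size=MAP_SIZE):
    """Fold a sequence of basic-block IDs into an edge hit-count map."""
    bitmap = bytearray(map_size)
    prev = 0
    for cur in block_ids:
        idx = (cur ^ prev) % map_size
        bitmap[idx] = (bitmap[idx] + 1) & 0xFF  # 8-bit counters wrap around
        prev = cur >> 1  # shift so that A->B and B->A land in different slots
    return bitmap


def covered(bitmap):
    """Indices of map entries that were hit at least once."""
    return {i for i, hits in enumerate(bitmap) if hits}


# Hypothetical block IDs: same blocks, different order, different edges.
path_a = [0x1234, 0x5678, 0x9ABC]
path_b = [0x1234, 0x9ABC, 0x5678]

new_edges = covered(trace_edges(path_b)) - covered(trace_edges(path_a))
print(len(new_edges), "map entries are new in path_b")
```

An input is "interesting" to the fuzzer when it lights up entries no earlier input has touched, which is what the set difference models.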

Setup

Install DynamoRIO

# Grab the latest release (9.x known-good)
Invoke-WebRequest `
  -Uri "https://github.com/DynamoRIO/dynamorio/releases/download/release_9.0.1/DynamoRIO-Windows-9.0.1.zip" `
  -OutFile C:\tools\DynamoRIO.zip

Expand-Archive C:\tools\DynamoRIO.zip -DestinationPath C:\tools\
# → C:\tools\DynamoRIO-Windows-9.0.1\bin64\drrun.exe

Verify:

C:\tools\DynamoRIO-Windows-9.0.1\bin64\drrun.exe -version

Build WinAFL

git clone https://github.com/googleprojectzero/winafl.git
cd winafl
mkdir build64
cd build64

cmake -G "Visual Studio 17 2022" -A x64 `
  -DDynamoRIO_DIR=C:\tools\DynamoRIO-Windows-9.0.1\cmake ..

cmake --build . --config Release

Artifacts land in winafl\build64\bin\Release\:

Binary	Purpose
`afl-fuzz.exe`	Fuzzer driver
`winafl.dll`	DynamoRIO client
`winafl-cmin.py`	Corpus minimiser (Python script; copy it from the repository root if it is not in the output directory)
`afl-showmap.exe`	Run one input, dump coverage
`afl-tmin.exe`	Testcase minimiser

Build a 32-bit copy in a separate build32\ (generate with -A Win32 against the same DynamoRIO cmake directory) if you're fuzzing x86 targets, and point -D at bin32 when you run it. WinAFL is bitness-sensitive.

Sanity check

# Toy test: put a PNG into .\in\ then fuzz notepad reading a file.
# This will not find bugs — but if it ticks over, your install works.
.\afl-fuzz.exe -i in -o out -D C:\tools\DynamoRIO-Windows-9.0.1\bin64 -t 20000 -- `
  -coverage_module notepad.exe -fuzz_iterations 5000 `
  -target_module notepad.exe -target_offset 0x1000 -nargs 2 -- `
  notepad.exe @@

If afl-fuzz reaches the status screen and "exec speed" is non-zero, the plumbing works.

Core Command Line

afl-fuzz.exe [afl options] -- [instrumentation options] -- target.exe [target args]

Three argument groups separated by --. Everything before the first -- belongs to afl-fuzz. Everything between the two -- is consumed by the DynamoRIO client (winafl.dll). Everything after the second -- is the target command line.
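The grouping rule is easy to get wrong when a command line grows, so here is a small sketch that splits an argument list the same way. The function name and the sample command line are illustrative only; afl-fuzz does this parsing internally.

```python
def split_groups(argv):
    """Split an afl-fuzz style argument list at its first two '--' separators."""
    first = argv.index("--")               # raises ValueError if a separator is missing
    second = argv.index("--", first + 1)
    return argv[:first], argv[first + 1:second], argv[second + 1:]


cmdline = ("-i in -o out -t 5000 -D C:\\DR\\bin64 -- "
           "-coverage_module target.dll -target_module target.exe "
           "-target_method ParseFile -nargs 2 -fuzz_iterations 5000 -- "
           "target.exe @@").split()

afl_opts, client_opts, target_cmd = split_groups(cmdline)
print("afl-fuzz:", afl_opts)
print("client:  ", client_opts)
print("target:  ", target_cmd)
```

If an option shows up in the wrong group here, it is in the wrong place on the real command line too.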

afl-fuzz options

Flag	Meaning
`-i <dir>`	Input corpus directory
`-o <dir>`	Output directory (queue, crashes, hangs)
`-t <ms>`	Per-execution timeout (milliseconds)
`-f <file>`	Write testcase to this exact path (useful when target reads a fixed filename)
`-M master / -S slaveN`	Parallel mode — master and N slaves sharing an output dir
`-x <dict>`	Dictionary file (keywords, magic numbers)
`-m <mb>`	Memory limit for the target process, in MB (`-m none` removes it; useful for big parsers)
`-D <dynamorio_bin>`	Path to `drrun.exe`'s bin dir — selects DynamoRIO mode
`-P`	Use Intel PT tracing instead of DynamoRIO (needs a build with Intel PT support)
`-l <path>`	DLL providing custom test-case processing (for example, delivering input over a socket)
`@@`	In target args — replaced by path to current testcase

winafl.dll options (DynamoRIO client)

Flag	Meaning
`-target_module <name>`	Module (DLL/EXE) that contains the target function
`-target_method <sym>`	Symbol name of the target function (needs PDB)
`-target_offset <hex>`	Alternative to `-target_method` — RVA from module base
`-nargs <n>`	Number of arguments the target function takes
`-fuzz_iterations <n>`	How many times to loop the target before respawning (typical 5000)
`-coverage_module <name>`	Instrument only these modules (repeatable — use for each interesting DLL)
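`-persistence_mode <m>`	`native` (default: WinAFL re-runs the target function itself) or `in_app` (the application already calls the function in its own loop)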
`-call_convention <c>`	`stdcall`, `fastcall`, `thiscall`, `ms64` (x64)
`-debug`	Debug mode for standalone runs under `drrun.exe` (no afl-fuzz): writes a log of loaded modules, opened files and coverage to the working directory

Canonical invocation

.\afl-fuzz.exe ^
  -i corpus_in -o sync_dir -t 5000 -D C:\tools\DynamoRIO-Windows-9.0.1\bin64 ^
  -- ^
  -coverage_module target.dll -coverage_module parser.dll ^
  -fuzz_iterations 5000 ^
  -target_module target.exe -target_method ParseFile -nargs 2 ^
  -call_convention stdcall ^
  -- ^
  target.exe @@

The ^ characters are cmd.exe line continuations; PowerShell uses backticks instead. Keep it on one line in a .bat file and you'll sleep better.

Modes of Instrumentation

1. DynamoRIO (default)

Dynamic binary instrumentation. WinAFL ships as a DR client (winafl.dll); DR copies each basic block into its code cache and the client inserts coverage-tracking code as the block is built. Works on any Windows x86/x64 binary with no rebuild, no source.

Pros: turnkey, block or edge coverage (-covtype bb or -covtype edge), easy to add new coverage modules. Cons: ~3–10× runtime overhead, some targets misbehave under DR (anti-debug, exception-heavy code, TLS abuse).

-D C:\tools\DynamoRIO-Windows-9.0.1\bin64

2. Intel PT (hardware trace)

Uses Intel Processor Trace to record taken branches into a trace buffer, then decodes the trace into a coverage map. Overhead is often lower than DynamoRIO's ~3–10×, though the gain depends on how much trace the target generates.

Requirements: an Intel CPU with PT (Broadwell+), Windows 10 v1809 or later, and WinAFL built with -DINTELPT=1. Select the mode with afl-fuzz's -P flag instead of -D.

.\afl-fuzz.exe -i in -o out -P -t 5000 -- ^
  -coverage_module target.dll -target_module target.exe ^
  -target_method ParseFile -nargs 2 ^
  -fuzz_iterations 5000 -- target.exe @@

Caveat: IPT needs the target to behave deterministically on re-entry. Heavy async / alertable waits still break the loop.

3. Syzygy (static rewriting)

Rewrites the PE once, up-front, inserting instrumentation into each basic block. At fuzz time there's no DR overhead — the binary runs native. Closest thing to source-level AFL performance on Windows.

# Rewrite a target DLL with Syzygy instrumentation
instrument.exe --mode=afl --input-image=target.dll --output-image=target_afl.dll ^
  --force-decompose --cookie-check-hook

# Drop target_afl.dll into the target directory and fuzz without -D
.\afl-fuzz.exe -i in -o out -Y -t 5000 -- ^
  -target_module target_afl.dll -target_method ParseFile -nargs 2 ^
  -fuzz_iterations 5000 -- target.exe @@

-Y tells WinAFL the target is statically instrumented, so no DR injection happens. Syzygy needs full PDB symbols and works only on 32-bit PEs without CFG (/GUARD:CF), which is rare in modern Microsoft binaries and more common in third-party software.

Mode picker

Situation	Mode
First-time fuzzing, unknown target	DynamoRIO
Target crashes under DR	Intel PT
Hot loop, need max exec/s	Syzygy (if PE allows)
Target is a service / driver helper	DynamoRIO (attach mode)

Picking a Target Function

This is 70% of the work. A bad target function gives you either zero crashes or zero speed.

Properties of a good target function

Takes a file / buffer / string pointer as input — no hidden network state, no GUI events.
Is idempotent across calls — same input → same path, because WinAFL restores only registers and the immediate stack, not heap or globals.
Lives deep enough to skip startup cost — after DLL init, COM init, config parsing — but shallow enough to hit the interesting parser.
Returns cleanly — no ExitProcess, no longjmp, no throw. Exceptions break the persistent loop unless you wrap them.
Is reachable from main() — WinAFL attaches at process start and waits for the first call.

Finding it with IDA

1. Open target.exe in IDA. Wait for auto-analysis.
2. View → Open Subviews → Imports. Look for:
     CreateFileW / ReadFile          ← direct file I/O
     MapViewOfFile                   ← memory-mapped parsing
     RegQueryValueEx                 ← configuration-driven paths
     WSARecv / recv                  ← network surfaces
3. Cross-reference (X) each → trace up the call graph to the nearest
   function that takes a pointer + length.
4. Rename it: N → "ParseFileBuf" for clarity.
5. Note its RVA: address shown in IDA minus the module base.
   Example:   .text:0000000180023A40  →  RVA 0x23A40
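The subtraction in step 5 can be scripted when you are converting many addresses. This sketch uses the example values above; the image base 0x180000000 is the one implied by that example, so check the base IDA actually shows for your module.

```python
def rva(virtual_address, image_base):
    """Relative virtual address: offset of an address from its module base."""
    if virtual_address < image_base:
        raise ValueError("address lies below the image base")
    return virtual_address - image_base


offset = rva(0x180023A40, 0x180000000)
print(f"-target_offset 0x{offset:X}")
```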

Confirming it's actually called once per input

# Launch under WinDbg (it stops at the initial breakpoint), set the breakpoint, then run
windbg.exe -o target.exe .\in\seed.bin

bp target!ParseFileBuf
g

# Without symbols, break on the RVA instead:  bp target+0x23A40

You want the breakpoint to fire after startup and then once per input. If it fires during DllMain or inside a static initialiser, back off to a caller further up the stack.

Using `-target_offset` when there's no symbol

Without a PDB, -target_method can't resolve. Use the RVA:

-target_module target.exe -target_offset 0x23A40 -nargs 2

RVAs are stable across runs and reboots as long as the binary isn't re-linked or patched. ASLR only changes the module's load address, and WinAFL adds the offset to whatever base the module gets.

Setting `-nargs` and `-call_convention`

The target function's ABI drives two winafl options:

-nargs — number of stack/register arguments. WinAFL needs this to save/restore the arg slot so it can re-run the function. Over-count rather than under-count.
-call_convention — on x64, always ms64. On x86: stdcall (default), fastcall, thiscall (C++ member functions — first arg is this in ECX).

Example: fuzzing a C++ CParser::Parse(const wchar_t* path) on x86:

-target_method ?Parse@CParser@@QAEXPB_W@Z  ← mangled name from the PDB (assuming a void return type)
-call_convention thiscall -nargs 2     ← nargs = 1 + this

Writing a Harness

You rarely fuzz a target EXE directly. You write a thin harness: a tiny C program that links the victim DLL, parses the command-line @@ file, and calls the interesting function. The harness becomes your target — you control every knob.

Why harness?

Skip slow startup (config, network, UI).
Bypass sanity checks that would reject mutated inputs (e.g. TLS init, license).
Isolate the function — no heap pollution from unrelated code paths.
Reset state between iterations if needed.

Minimal harness template

// harness.c: build with  cl /MD harness.c target.lib
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>   // malloc, free

// Import (or GetProcAddress) the function to fuzz.
extern int __stdcall ParseFileBuf(unsigned char* data, size_t len);

// The target function WinAFL will hook. Keep it flat, no globals.
__declspec(noinline) int FuzzMe(const char* path)
{
    HANDLE h = CreateFileA(path, GENERIC_READ, 0, NULL,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE) return -1;

    DWORD size = GetFileSize(h, NULL);
    if (size == INVALID_FILE_SIZE) { CloseHandle(h); return -1; }

    // malloc(0) may return NULL, so always request at least one byte.
    unsigned char* buf = (unsigned char*)malloc(size ? size : 1);
    if (!buf) { CloseHandle(h); return -1; }

    DWORD read = 0;
    BOOL ok = ReadFile(h, buf, size, &read, NULL);
    CloseHandle(h);   // close on every iteration, or handles leak across the loop
    if (!ok) { free(buf); return -1; }

    // Swallow exceptions so the persistent loop survives malformed input.
    __try {
        ParseFileBuf(buf, read);
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        // Access violations etc.: WinAFL sees them via DR's exception
        // callback, not this __except, so this keeps the process alive
        // for benign exceptions only.
    }

    free(buf);
    return 0;
}

int main(int argc, char** argv)
{
    if (argc < 2) return 1;
    FuzzMe(argv[1]);   // ← call FuzzMe exactly once; WinAFL will loop it
    return 0;
}

Build:

cl /MD /Zi /Od harness.c /link target.lib /OUT:harness.exe

Key details:

__declspec(noinline) — stop the optimiser from inlining FuzzMe into main, otherwise there's no symbol to hook.
/Od — no optimisation. You want predictable basic blocks while iterating on the harness.
/Zi — emit a PDB so -target_method FuzzMe works without needing an RVA.
__try / __except — catches non-crash exceptions (e.g. C++ EH translated to SEH) that would otherwise tear down the persistent loop.

Fuzz the harness

.\afl-fuzz.exe -i corpus -o sync -t 5000 -D C:\tools\DynamoRIO-Windows-9.0.1\bin64 -- ^
  -coverage_module target.dll -coverage_module harness.exe ^
  -target_module harness.exe -target_method FuzzMe -nargs 1 ^
  -fuzz_iterations 5000 -- harness.exe @@

Note -coverage_module is repeated — instrument both your harness and the victim DLL so edges inside the parser show up in the coverage map.

Corpus Preparation

Quality of the starting corpus is the single biggest lever after target choice.

Sourcing seeds

# Pull sample files of the right format from the filesystem
Get-ChildItem C:\ -Filter *.pdf -Recurse -ErrorAction SilentlyContinue |
  Select-Object -First 50 | Copy-Item -Destination .\corpus_raw\

# Or from a public corpus repository (example: fuzzing-corpus)
git clone https://github.com/strongcourage/fuzzing-corpus.git

Minimise it

A tight corpus with maximum unique coverage runs faster and mutates better.

python .\winafl-cmin.py ^
  -D C:\tools\DynamoRIO-Windows-9.0.1\bin64 ^
  -t 10000 -i corpus_raw -o corpus_min ^
  -coverage_module target.dll ^
  -target_module harness.exe -target_method FuzzMe -nargs 1 ^
  -- harness.exe @@

winafl-cmin.py runs each file, records the coverage bitmap, and keeps only files that contribute new edges. Unlike afl-fuzz, the script takes the instrumentation options as its own arguments, with a single -- before the target command line. Expect 5–20× reduction.
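The selection idea can be sketched as a greedy pass over per-file coverage sets. This is a simplification under stated assumptions: the file names and edge sets are invented, and the real script may choose between equivalent files differently (for instance by file size).

```python
def minimise(coverage_by_file):
    """Greedy pass: keep a file only if it adds an edge no kept file has."""
    kept, seen = [], set()
    # Visit files with the most edges first so big contributors are kept early.
    order = sorted(coverage_by_file,
                   key=lambda name: (-len(coverage_by_file[name]), name))
    for name in order:
        edges = coverage_by_file[name]
        if edges - seen:          # at least one edge we have not seen yet
            kept.append(name)
            seen |= edges
    return kept


# Hypothetical coverage: edge IDs each seed reaches.
corpus = {
    "a.pdf": {1, 2, 3},
    "b.pdf": {2, 3},      # subset of a.pdf, adds nothing
    "c.pdf": {3, 4},      # contributes edge 4
}
print(minimise(corpus))
```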

Trim individual testcases

afl-tmin.exe chops bytes off a single file while preserving the coverage it produces. Run it over the crashes directory afterwards to get minimal reproducers.

.\afl-tmin.exe -D C:\tools\DynamoRIO-Windows-9.0.1\bin64 ^
  -i crash_orig.bin -o crash_min.bin -- ^
  -coverage_module target.dll ^
  -target_module harness.exe -target_method FuzzMe -nargs 1 ^
  -fuzz_iterations 1 -- harness.exe @@

Dictionaries

Format-aware mutation. AFL overwrites parts of the input with each dictionary token, or inserts the token, at every byte offset during the deterministic stages, and reuses the tokens at random in havoc.

Format

# dict.txt (one name="value" pair per line; the name is not quoted)
magic_header="%PDF-1.7"
endobj="endobj"
stream="stream"
int_max="\xff\xff\xff\xff"
version_1="\x01\x00\x00\x00"
# Hex-escaped bytes are allowed: \xHH decodes at load time.
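A malformed dictionary tends to fail quietly, so it can help to lint the file before a long run. The sketch below is a hypothetical checker, not AFL's own loader: it assumes entries of the form name="value" with an unquoted name, an optional @level suffix, and only \xHH, \\ and \" escapes.

```python
import re

ENTRY = re.compile(r'^([A-Za-z0-9_]+)(?:@\d+)?\s*=\s*"(.*)"$')
HEX_PAIR = re.compile(r"[0-9a-fA-F]{2}")


def decode_value(raw):
    """Turn the quoted part of an entry into the bytes it stands for."""
    out = bytearray()
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\":
            nxt = raw[i + 1:i + 2]
            if nxt == "x" and HEX_PAIR.fullmatch(raw[i + 2:i + 4]):
                out.append(int(raw[i + 2:i + 4], 16))
                i += 4
            elif nxt in ("\\", '"'):
                out.append(ord(nxt))
                i += 2
            else:
                raise ValueError(f"bad escape in {raw!r}")
        elif 32 <= ord(ch) < 127:
            out.append(ord(ch))
            i += 1
        else:
            raise ValueError(f"non-printable character in {raw!r}")
    return bytes(out)


def parse_dict(text):
    """Parse dictionary text into {name: token_bytes}; '#' lines are comments."""
    tokens = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = ENTRY.match(line)
        if not match:
            raise ValueError(f"malformed entry: {line!r}")
        tokens[match.group(1)] = decode_value(match.group(2))
    return tokens


print(parse_dict('# demo\nmagic="%PDF-"\nint_max="\\xff\\xff\\xff\\xff"\n'))
```

Running it over dict.txt before launching the fuzzer catches quoted names and broken escapes early.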

Pass it

.\afl-fuzz.exe -i in -o out -x dict.txt -- ...

Auto-extract from a binary

strings.exe + filter is a decent starting dictionary:

strings.exe -nobanner target.dll |
  Where-Object { $_.Length -gt 3 -and $_.Length -lt 32 -and $_ -notmatch '["\\]' } |
  ForEach-Object { "tok=`"$_`"" } |
  Set-Content dict.txt

Better: grab AFL's catalogue from afl/dictionaries/, which ships ready-made dicts for formats such as HTML, XML, JPEG, PNG, PDF, JS and SQL.

Parallel Fuzzing

One afl-fuzz instance = one core. You want all of them.

Topology

sync_dir/
  fuzzer01/    ← master  (-M fuzzer01)   deterministic stages
  fuzzer02/    ← slave   (-S fuzzer02)   havoc / splice only
  fuzzer03/    ← slave   (-S fuzzer03)
  fuzzer04/    ← slave   (-S fuzzer04)

Masters run deterministic bitflip + arithmetic stages. Slaves skip those and focus on random havoc — cheap, parallel, complementary.

Launch

REM Master (cmd.exe syntax: start is a cmd builtin, ^ continues the line)
start cmd /k .\afl-fuzz.exe -i in -o sync -M fuzzer01 -t 5000 ^
  -D C:\tools\DynamoRIO-Windows-9.0.1\bin64 -- ^
  -coverage_module target.dll -target_module harness.exe ^
  -target_method FuzzMe -nargs 1 -fuzz_iterations 5000 -- harness.exe @@

REM Slaves (repeat with fuzzer02, fuzzer03, …)
start cmd /k .\afl-fuzz.exe -i in -o sync -S fuzzer02 -t 5000 ^
  -D C:\tools\DynamoRIO-Windows-9.0.1\bin64 -- ^
  -coverage_module target.dll -target_module harness.exe ^
  -target_method FuzzMe -nargs 1 -fuzz_iterations 5000 -- harness.exe @@

Slaves read each other's queue every few seconds. New interesting inputs discovered by any instance propagate to all.

Pinning to cores

DynamoRIO can schedule oddly under load. Pin each instance:

start /affinity 0x1 cmd /k .\afl-fuzz.exe ... -M fuzzer01 ...
start /affinity 0x2 cmd /k .\afl-fuzz.exe ... -S fuzzer02 ...
start /affinity 0x4 cmd /k .\afl-fuzz.exe ... -S fuzzer03 ...
start /affinity 0x8 cmd /k .\afl-fuzz.exe ... -S fuzzer04 ...
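The affinity values are one-bit masks: core n gets 1 shifted left by n. A small sketch that generates the launch lines for any number of instances follows; the instance names match the topology above and the elided afl-fuzz arguments are left elided.

```python
def affinity_mask(core):
    """Hex mask for `start /affinity` that selects a single logical core."""
    if core < 0:
        raise ValueError("core index must be non-negative")
    return f"0x{1 << core:X}"


roles = ["-M fuzzer01", "-S fuzzer02", "-S fuzzer03", "-S fuzzer04"]
for core, role in enumerate(roles):
    print(f"start /affinity {affinity_mask(core)} cmd /k .\\afl-fuzz.exe ... {role} ...")
```

Note that the mask is hexadecimal, so the fifth core is 0x10, not 0x16.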

Per-instance status

# Snapshot queue/crash counts across all instances
Get-ChildItem sync -Directory | ForEach-Object {
  $stats = Get-Content (Join-Path $_.FullName 'fuzzer_stats') -Raw
  "$($_.Name): $($stats -split '\n' | Select-String 'execs_per_sec|unique_crashes')"
}
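For longer campaigns it is handy to collect the same numbers in a script. This sketch assumes the usual fuzzer_stats layout of "key : value" lines and the field names execs_per_sec and unique_crashes used in the snapshot above; the sample values are made up.

```python
from pathlib import Path


def parse_stats(text):
    """Parse 'key : value' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")   # split at the first colon only
        if sep:
            stats[key.strip()] = value.strip()
    return stats


def summarise(sync_dir):
    """Map instance name -> (execs_per_sec, unique_crashes) for a sync dir."""
    rows = {}
    for stats_file in sorted(Path(sync_dir).glob("*/fuzzer_stats")):
        s = parse_stats(stats_file.read_text())
        rows[stats_file.parent.name] = (
            float(s.get("execs_per_sec", 0)),
            int(s.get("unique_crashes", 0)),
        )
    return rows


sample = "execs_per_sec     : 312.40\nunique_crashes    : 0\nunique_hangs      : 7\n"
print(parse_stats(sample))
```

Call summarise("sync") from the directory that holds the sync folder to get one row per instance.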

Reading the UI

  +- process timing --------+
  |   run time : 0 days, 1 hrs, 14 min, 32 sec
  |   last new path : 0 days, 0 hrs, 3 min, 8 sec
  |   last uniq crash : none seen yet
  |   last uniq hang : 0 days, 0 hrs, 41 min, 2 sec
  +- cycle progress --------+- map coverage ----------+
  |  now processing : 174*  |    map density : 4.02%  |
  |  paths timed out : 3    |  count coverage : 2.81  |
  +-- stage progress -------+- findings in depth ----+
  |  now trying : havoc     |  favored paths : 42    |
  |  stage execs : 2048/4k  |   new edges on : 61    |
  |  total execs : 1.47M    | total crashes : 0      |
  |  exec speed : 312/sec   |  total hangs  : 7      |
  +-- fuzzing strategy ---+ +- path geometry --------+
  |   bit flips : 8/14k   | |    levels : 5          |
  |   byte flips : 6/3k   | |   pending : 87         |
  +-----------------------+ +------------------------+

Signals to watch:

Metric	Healthy	Bad — and what it means
`exec speed`	200–2000/sec	<50: target is slow or DR thrashing. Shrink corpus, reduce instrumentation scope, try IPT
`stability`	95–100%	<85%: non-deterministic target state leaking between iterations — harden harness or lower `-fuzz_iterations`
`map density`	climbing	Flat for hours: corpus/dictionary is too narrow — add seeds or tokens
`last new path`	< 30 min	Stalled: change the mutation strategy, add a dictionary, or pick a better target function
`unique crashes`	>0 eventually	None after 24h of healthy speed: often a target problem, not a fuzzer problem

Debugging stability drops

A 60% stability score means 40% of the map entries an input touches change from run to run even though the input is identical. Causes:

Global state mutated by the target function (statics, TLS).
Heap addresses leaking into control flow (pointer comparisons).
Random number use in the instrumented path.
Threads doing background work during the measurement window.

Fix by:

1. Move the target function up or down the call stack to a purer routine.
2. Reset the leaking global at the top of your harness.
3. Lower -fuzz_iterations to 1 (= non-persistent) as a diagnostic;
   if stability jumps to 100%, you have persistent-mode state leakage.
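To see where a stability figure comes from, here is a sketch that compares the coverage maps of repeated runs of one input. It is a simplification: each run is modelled as a dict of map entry to hit count with invented values, and the hit-count bucketing that AFL applies before comparing is ignored.

```python
def stability(runs):
    """Percentage of touched map entries whose hit count is identical in every run."""
    touched = set()
    for run in runs:
        touched |= run.keys()
    if not touched:
        return 100.0
    variable = {entry for entry in touched
                if len({run.get(entry, 0) for run in runs}) > 1}
    return 100.0 * (len(touched) - len(variable)) / len(touched)


# Two hypothetical runs of the same input: entries 4, 5 and 6 disagree.
run_1 = {1: 1, 2: 1, 3: 1, 4: 1, 5: 1}
run_2 = {1: 1, 2: 1, 3: 1, 4: 2, 6: 1}
print(f"{stability([run_1, run_2]):.1f}%")
```

The entries that end up in the variable set are the ones to chase: map them back to code and you usually land on the leaking global or background thread.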

Crash Triage

sync\fuzzerXX\crashes\ fills up. Each file is a complete input that made the harness crash. You want: which of these are the same bug? and which are exploitable?

De-duplicate (coarse)

AFL already groups by crash-path hash, but two paths can hit the same root cause. Group by faulting IP:

Get-ChildItem .\sync\fuzzer01\crashes\id* | ForEach-Object {
    $out = & cdb.exe -g -G -c "!analyze -v; q" .\harness.exe $_.FullName 2>&1 |
           Select-String "Exception Address|ExceptionAddress"
    "$($_.Name)  $out"
} | Sort-Object { $_ -replace '.*Address:\s*', '' }

Cluster identical exception addresses — those are (almost certainly) the same underlying bug.
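Once the debugger output is captured per crash file, the clustering itself is a few lines. This sketch assumes each output contains a line of the form "ExceptionAddress: <hex>" as matched by the pipeline above; the file names and addresses are invented.

```python
import re
from collections import defaultdict

ADDRESS = re.compile(r"ExceptionAddress:\s*([0-9a-fA-F`]+)")


def faulting_address(analyze_output):
    """Extract the exception address, normalised to lowercase hex without backticks."""
    match = ADDRESS.search(analyze_output)
    return match.group(1).replace("`", "").lower() if match else None


def cluster(outputs):
    """Group crash file names by faulting address."""
    groups = defaultdict(list)
    for name, text in sorted(outputs.items()):
        groups[faulting_address(text)].append(name)
    return dict(groups)


# Hypothetical debugger output, trimmed to the relevant line.
outputs = {
    "id_000001": "ExceptionAddress: 00007ff6a1b23a40 (harness+0x23a40)",
    "id_000002": "ExceptionAddress: 00007FF6`A1B23A40 (harness+0x23a40)",
    "id_000003": "ExceptionAddress: 00007ff6a1b2ffff (harness+0x2ffff)",
}
for address, names in cluster(outputs).items():
    print(address, names)
```

Crashes for which no address could be extracted end up under the key None, which is worth inspecting by hand.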

Classify with !exploitable

Microsoft's !exploitable extension (aka MSEC) rates crashes on a 4-tier scale: EXPLOITABLE, PROBABLY_EXPLOITABLE, PROBABLY_NOT_EXPLOITABLE, UNKNOWN.

# Load the extension once
.load C:\tools\msec\msec.dll

# Analyse
!exploitable -v

Typical verdicts:

Write AV at controlled address → EXPLOITABLE
Read AV near null → PROBABLY_NOT_EXPLOITABLE
Stack buffer overrun caught by /GS → EXPLOITABLE

Minimise the crash

.\afl-tmin.exe -D C:\tools\DynamoRIO-Windows-9.0.1\bin64 ^
  -i crashes\id_000012 -o crash_min.bin -- ^
  -coverage_module target.dll ^
  -target_module harness.exe -target_method FuzzMe -nargs 1 ^
  -fuzz_iterations 1 -- harness.exe @@

Minimal crash files are essential for:

Root-cause analysis (fewer bytes = clearer diff against a good input).
Writing a reliable exploit (you're not going to control 50k bytes).
Reporting to the vendor (short PoC = fast triage).

Step from crash to root cause

windbg.exe -c "g; !analyze -v" harness.exe crash_min.bin

# At the faulting instruction:
r                         ; register state
kb 40                     ; stack walk with args
!heap -p -a @rcx          ; is the faulting pointer a heap block we freed?
!teb ; !peb               ; general environment
ub . L10                  ; previous 10 instructions — how did we get here?

Then correlate with IDA:

Subtract module base from the faulting IP → RVA.
In IDA: G → enter image base + RVA → graph view.
Walk upward to find the allocation or copy that set up the bad pointer.

Performance Tuning

Shrink the coverage surface

Every basic block DR instruments costs time. Fuzz only modules that matter:

# Wasteful once the harness is stable: its file I/O and setup code add map entries that say nothing about the parser
-coverage_module harness.exe

# Good: just the parser
-coverage_module target_parser.dll

Tune `-fuzz_iterations`

Too low (e.g. 100) — you pay DR process startup cost constantly.
Too high (e.g. 100000) — leaked state builds up, stability drops.
Sweet spot for most targets: 1000–5000.

Measure empirically: run at several values and take the one with highest exec/s and stability > 95%.
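The selection rule can be written down directly. The measurements below are invented for illustration; substitute the exec/s and stability you observe for each -fuzz_iterations value.

```python
def best_setting(measurements, min_stability=95.0):
    """Return the -fuzz_iterations value with the highest exec/s among stable runs."""
    stable = [m for m in measurements if m["stability"] > min_stability]
    if not stable:
        return None   # nothing is stable enough: fix the harness first
    return max(stable, key=lambda m: m["execs_per_sec"])["fuzz_iterations"]


# Hypothetical sweep results.
runs = [
    {"fuzz_iterations": 100,    "execs_per_sec": 140.0, "stability": 99.8},
    {"fuzz_iterations": 1000,   "execs_per_sec": 410.0, "stability": 98.9},
    {"fuzz_iterations": 5000,   "execs_per_sec": 455.0, "stability": 96.2},
    {"fuzz_iterations": 100000, "execs_per_sec": 470.0, "stability": 81.0},
]
print(best_setting(runs))
```

In this made-up sweep the fastest setting is rejected because its stability is below the threshold, which is exactly the trade-off described above.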

Shared memory vs file on disk

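WinAFL can also deliver testcases through shared memory instead of a file (afl-fuzz's -s flag). The @@ argument then carries the name of the shared memory object rather than a file path. If your harness can consume a buffer instead of a file, you cut the per-iteration I/O cost dramatically:

// Harness maps the region named by argv[1] instead of opening a file.
// Layout assumed here: a 4-byte sample size followed by the sample bytes.
HANDLE map = OpenFileMappingA(FILE_MAP_ALL_ACCESS, FALSE, argv[1]);
unsigned char* shm = (unsigned char*)MapViewOfFile(map, FILE_MAP_ALL_ACCESS, 0, 0, 0);
unsigned int   len  = *(unsigned int*)shm;
unsigned char* data = shm + sizeof(unsigned int);

The WinAFL README documents the exact layout, and the repository's GDI+ sample harness shows the full pattern. Map the region once in main, then read the buffer inside the target function; expect roughly a 2–5× speedup on small inputs.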
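When debugging a shared-memory harness it helps to decode a dumped region offline. This sketch assumes the region starts with a 4-byte little-endian length followed by the sample bytes, and caps the length defensively; verify the layout against the WinAFL README for your version before relying on it.

```python
import struct


def read_sample(region, max_size=1_000_000):
    """Extract the sample from a region laid out as [uint32 length][data...]."""
    (size,) = struct.unpack_from("<I", region, 0)
    size = min(size, max_size, len(region) - 4)   # never read past the region
    return bytes(region[4:4 + size])


# Hypothetical region: a 5-byte sample followed by unused space.
region = struct.pack("<I", 5) + b"%PDF-" + b"\x00" * 11
print(read_sample(region))
```

The same clamping belongs in the C harness: a length field larger than the mapping must not turn into an out-of-bounds read in the harness itself.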

Disable Windows nuisance

# Tell WER to not pop a dialog on every crash
reg add "HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting" /v DontShowUI /t REG_DWORD /d 1 /f

# Turn Windows Error Reporting off entirely
reg add "HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting" /v Disabled /t REG_DWORD /d 1 /f

# Disable Defender real-time scanning on the fuzzing box (PowerShell, elevated)
Set-MpPreference -DisableRealtimeMonitoring $true

Real-time AV scans each testcase file the moment it lands on disk — single biggest speed killer on a Windows fuzzing host.

Common Failure Modes

"Target process terminated unexpectedly before first iteration"

The target crashed or exited before the target function was reached. Causes:

Startup requires arguments / config that's missing.
License check / DRM trips and exits.
DllMain of a dependent DLL fails.

Run the binary outside AFL with an argument of in\seed.bin first. It must complete normally.

"Spurious hang" / target times out every iteration

-t too low for the target's natural runtime. Raise it.
The target is waiting on a socket/window/pipe that the harness didn't short-circuit. Rewrite the harness to eliminate the wait.
The target started a background thread that never finishes. Either TerminateThread it from the harness or pick a shallower function.

Zero `execs_per_sec` after first iteration

-target_method resolves but WinAFL can't re-enter cleanly. Usual causes: the function relies on SEH, ends in a tail call, or -nargs / -call_convention doesn't match its real ABI.
If the application already calls the function in its own loop, try -persistence_mode in_app, which leaves loop control to the application.
Try a wrapper function in your harness that calls the real target — WinAFL hooks your wrapper instead.

Stability stuck near 60%

Classic persistent-mode state leak. Fixes in order of cost:

-fuzz_iterations 1 to confirm persistence is the cause.
Reset globals at the start of the target function.
Replace malloc with a per-iteration arena that your harness wipes.
Pick a target function that sits before the stateful code.

"DynamoRIO failed to attach" / child exits with STATUS_INVALID_IMAGE_FORMAT

Bitness mismatch. drrun.exe in bin64 can only instrument x64 targets; bin32 only x86. Check dumpbin /headers target.exe for the machine type and point -D at the matching DR directory.

Target uses CFG / Control Flow Guard

DR handles CFG, but Syzygy does not. If you need static rewriting on a CFG-hardened binary, strip /GUARD:CF by re-linking (if you have the OBJs) or use a hex editor to clear IMAGE_DLLCHARACTERISTICS_GUARD_CF (0x4000) in the PE header — only on a copy of the file, and understand that this may destabilise the binary.
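Before deciding whether Syzygy is an option, you can check the flag without a hex editor. The sketch below only reads the header and never modifies the file. It relies on the standard PE layout (e_lfanew at offset 0x3C, DllCharacteristics at offset 70 of the optional header for both PE32 and PE32+); the function names are illustrative.

```python
import struct

GUARD_CF = 0x4000  # IMAGE_DLLCHARACTERISTICS_GUARD_CF


def dll_characteristics(pe):
    """Return the DllCharacteristics field from raw PE file bytes."""
    if pe[:2] != b"MZ":
        raise ValueError("missing MZ header")
    (e_lfanew,) = struct.unpack_from("<I", pe, 0x3C)
    if pe[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    optional_header = e_lfanew + 4 + 20      # PE signature + COFF file header
    (flags,) = struct.unpack_from("<H", pe, optional_header + 70)
    return flags


def has_guard_cf(pe):
    """True if the image is marked as Control Flow Guard aware."""
    return bool(dll_characteristics(pe) & GUARD_CF)


# Usage (path is a placeholder):
#   with open("target.dll", "rb") as fh:
#       print(has_guard_cf(fh.read()))
```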

End-to-End Example: Fuzzing a Hypothetical PDF Parser

target:   libpdfcore.dll  (third-party, no PDB)
entry:    ParsePdfStream(const BYTE* data, size_t len)
                         RVA 0x14A20 in libpdfcore.dll (x64)
host exe: pdfview.exe     (loads libpdfcore.dll at startup)

1. Identify the function

# In IDA: open libpdfcore.dll, find 'ParsePdfStream' by its string xrefs
# to error messages like "malformed stream dict" — classic parser telltale.
# Record RVA: 0x14A20.

2. Build a harness

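// pdf_harness.c
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>   // malloc, free

typedef int (*ParsePdfStream_t)(const BYTE*, size_t);   // x64 has a single calling convention

static ParsePdfStream_t g_parse;   // resolved once in main, before the fuzz loop

// WinAFL hooks this; everything inside runs once per iteration.
__declspec(noinline) int FuzzMe(const char* path)
{
    HANDLE f = CreateFileA(path, GENERIC_READ, 0, NULL,
                           OPEN_EXISTING, 0, NULL);
    if (f == INVALID_HANDLE_VALUE) return -1;

    DWORD sz = GetFileSize(f, NULL);
    if (sz == INVALID_FILE_SIZE) { CloseHandle(f); return -1; }

    BYTE* b = (BYTE*)malloc(sz ? sz : 1);
    if (!b) { CloseHandle(f); return -1; }

    DWORD got = 0;
    BOOL ok = ReadFile(f, b, sz, &got, NULL);
    CloseHandle(f);
    if (!ok) { free(b); return -1; }

    __try { g_parse(b, got); }
    __except (EXCEPTION_EXECUTE_HANDLER) { }

    free(b);
    return 0;
}

int main(int argc, char** argv)
{
    if (argc < 2) return 1;

    HMODULE h = LoadLibraryA("libpdfcore.dll");
    if (!h) return 2;

    g_parse = (ParsePdfStream_t)((BYTE*)h + 0x14A20);   // module base + RVA from step 1
    return FuzzMe(argv[1]);
}

Build:

cl /MD /Zi /Od pdf_harness.c /link /OUT:pdf_harness.exe

FuzzMe gives WinAFL a symbolic hook, resolved through the PDB that /Zi emits. The DLL is loaded in main so that cost is paid once, outside the loop.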

3. Seed corpus

mkdir corpus_raw
Copy-Item C:\Users\Public\Documents\*.pdf corpus_raw\
python .\winafl-cmin.py -D C:\tools\DynamoRIO-Windows-9.0.1\bin64 `
  -t 10000 -i corpus_raw -o corpus_min `
  -coverage_module libpdfcore.dll `
  -target_module pdf_harness.exe -target_method FuzzMe -nargs 1 `
  -- pdf_harness.exe @@

4. Dictionary

# pdf.dict
magic="%PDF-"
eof="%%EOF"
obj="obj"
endobj="endobj"
stream="stream"
endstream="endstream"
xref="xref"
trailer="trailer"
filter_flate="/FlateDecode"
filter_a85="/ASCII85Decode"
flate_hdr="\x78\x9c"

5. Fuzz, 4-way parallel

start /affinity 0x1 cmd /k .\afl-fuzz.exe ^
  -i corpus_min -o sync -M m1 -x pdf.dict -t 8000 ^
  -D C:\tools\DynamoRIO-Windows-9.0.1\bin64 -- ^
  -coverage_module libpdfcore.dll ^
  -target_module pdf_harness.exe -target_method FuzzMe -nargs 1 ^
  -fuzz_iterations 3000 -- pdf_harness.exe @@

start /affinity 0x2 cmd /k .\afl-fuzz.exe ^
  -i corpus_min -o sync -S s1 -x pdf.dict -t 8000 ^
  -D C:\tools\DynamoRIO-Windows-9.0.1\bin64 -- ^
  -coverage_module libpdfcore.dll ^
  -target_module pdf_harness.exe -target_method FuzzMe -nargs 1 ^
  -fuzz_iterations 3000 -- pdf_harness.exe @@

REM s2, s3: same pattern, affinities 0x4 and 0x8

6. Triage after 24h

Get-ChildItem sync\*\crashes\id* | Group-Object {
    (& cdb.exe -g -G -c "!analyze -v; q" .\pdf_harness.exe $_.FullName 2>&1 |
     Select-String "FAULTING_IP" -Context 0,1) -join ''
}

Pick the first sample in each group, minimise:

.\afl-tmin.exe -D C:\tools\DynamoRIO-Windows-9.0.1\bin64 ^
  -i crashes\id_000007 -o id_000007.min -- ^
  -coverage_module libpdfcore.dll ^
  -target_module pdf_harness.exe -target_method FuzzMe -nargs 1 ^
  -fuzz_iterations 1 -- pdf_harness.exe @@

Load the minimised crash in WinDbg, walk back from the faulting IP, correlate with IDA, write the vuln up.

Advanced Patterns

Fuzzing a network service

The target receives over TCP; you can't easily replay a recv buffer from a file. Options:

Harness the parser, not the socket layer. Find the function that consumes the decoded buffer after recv, hook it directly, feed it bytes from @@.
Replace recv with a stub that reads from @@ on first call. Works via DR's drwrap or a patched import table in the harness.
Run a local proxy that accepts mutated bytes from AFL and forwards them to the real service. Slower but requires no reverse engineering of the parse routine.

Fuzzing COM / OLE surfaces

Instantiate the interface in the harness, call the method under test:

CoInitialize(NULL);
IShellLink* psl;
CoCreateInstance(&CLSID_ShellLink, NULL, CLSCTX_INPROC_SERVER,
                 &IID_IShellLink, (void**)&psl);
FuzzMe(psl, argv[1]);
psl->lpVtbl->Release(psl);
CoUninitialize();

Keep CoInitialize and CoCreateInstance outside FuzzMe, as above; you only want to pay for them once.

Fuzzing kernel drivers

WinAFL can't instrument kernel code. Workaround:

IOCTL fuzzing via a usermode harness — the harness opens the device and calls DeviceIoControl with mutated IOCTL buffers. Use WinAFL to drive the harness, but understand coverage is from the usermode side only.
kAFL / TKO — separate kernel fuzzers. Use those instead when you need real ring-0 coverage.

Grammar-aware inputs

For highly structured formats (JS engines, SQL parsers), bit-flip mutation is wasteful. Combine WinAFL with a grammar-based pre-mutator (e.g. Dharma, Grammarinator) that generates candidates, then lets AFL perform feedback-driven selection. The grammar produces valid skeletons; AFL bit-twiddles the fields.

Reference — Flags I Always Forget

Problem	Fix
Need to see exactly what WinAFL is doing	Run the target under `drrun.exe -c winafl.dll -debug …` without afl-fuzz and read the log it writes to the working directory
Fuzz runs but no coverage (`map density 0.00%`)	Wrong `-coverage_module`, or DR not injected. Do one standalone `-debug` run and check that the module shows up in the log
Dictionary not being applied	`-x` must come before the first `--`
Target respawns every iteration	`-fuzz_iterations 1` disables persistent mode; raise it
Fuzzer idle, GUI shows `pend_fav 0`	Your corpus is already exhausted for the current queue — add seeds or wait for havoc to find new paths
Crashes reproduce under `afl-tmin` but not bare harness	The environment or CWD differs; run the harness from the sync dir with the same arguments
`ERROR: Unable to start DR`	`-D` points at wrong bitness, or DR files are blocked by SmartScreen — `Unblock-File` the whole DR directory
