

8.5 Billion Executions. 2 Real Bugs. Here’s Why.

Learn how to run AFL++ fuzzing at scale, design harnesses, optimize coverage vs throughput, and reduce thousands of crashes into real vulnerabilities using CASR.
  • Posted on: Apr 23, 2026
  • By Vinay Kumar Rasala
  • Read time: 5 mins
  • Last updated on: Apr 23, 2026

AFL++ at Scale: Why crash volume doesn’t equal vulnerabilities

 

357 crash files. 2 actual bugs.

That is not a failure of fuzzing. It is a failure of interpretation.

In a recent AFL++ fuzzing campaign targeting libarchive, we ran approximately 8.5 billion executions across all fuzzing phases, generated over a thousand crash files, and ultimately reduced them to two unique crash sites through structured crash triage and deduplication.

This blog is a practical, engineering-first guide to that process:

  • How to design a multi-phase fuzzing workflow
  • How to build the right AFL++ instrumentation matrix
  • How to optimize coverage versus throughput
  • How to implement fuzzing crash triage at scale
  • How to move from crash volume to real vulnerabilities

If your fuzzing pipeline stops at crash counts, you are not measuring security. You are measuring noise.

Why AFL++ fuzzing produces high crash counts (and why they mislead)

Modern fuzzers like AFL++ are extremely good at generating output. That output, however, is not directly equivalent to vulnerabilities.

  • A single bug can be triggered via hundreds of input paths
  • Each path is logged as a separate crash
  • Parallel fuzzing instances amplify duplication

This is why:

  • Crash count does not equal vulnerability count
  • Coverage does not equal risk depth
  • Execution speed does not equal meaningful discovery

If you do not implement fuzzing deduplication and root-cause clustering, your results will always be inflated.

Target selection: Why libarchive works for fuzzing

libarchive is an ideal fuzzing target because:

  • It parses attacker-controlled archive inputs
  • It supports multiple formats such as tar, zip, cpio, ISO, and RAR
  • It is written in C with complex parsing logic
  • It is widely deployed in production systems

This creates a realistic attack surface where malformed inputs can trigger memory safety issues, null dereferences, parser inconsistencies, and denial-of-service conditions.

AFL++ build matrix: The most common fuzzing mistake

Before running AFL++, the most critical step is building the correct binaries.

Minimum viable AFL++ setup:

Binary         Purpose
Native (LTO)   Maximum throughput
ASAN           Memory error detection
CmpLog         Unlocks comparison-based paths
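Building that matrix means three separate compiles. A sketch, assuming an autotools-style build; the `bsdtar` artifact path and output names are illustrative:

```shell
# Native LTO build: maximum throughput, collision-free edge coverage
make clean; CC=afl-clang-lto ./configure
make -j"$(nproc)" && cp bsdtar target_native

# ASAN build: memory-error detection (AFL_USE_ASAN is read at compile time)
make clean; AFL_USE_ASAN=1 CC=afl-clang-lto ./configure
AFL_USE_ASAN=1 make -j"$(nproc)" && cp bsdtar target_asan

# CmpLog build: records comparison operands for afl-fuzz -c
make clean; AFL_LLVM_CMPLOG=1 CC=afl-clang-lto ./configure
AFL_LLVM_CMPLOG=1 make -j"$(nproc)" && cp bsdtar target_cmplog
```

The CmpLog binary is never fuzzed directly; it is handed to afl-fuzz via `-c`, while the native binary goes after `--` on the fast instances.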

Why LTO matters

LTO instrumentation provides full-program visibility, collision-free edge coverage, and automatic dictionary extraction.

Why not run only ASAN

ASAN introduces approximately two times the runtime overhead. Running all instances with ASAN reduces total executions and limits discovery.

Correct pattern

afl-fuzz -M main -i corpus -o afl_out -- ./target_native
afl-fuzz -S asan01 -i corpus -o afl_out -- ./target_asan

The native binary delivers speed. The ASAN binary validates memory safety.

Why CmpLog is critical

Many formats rely on strict comparisons and magic bytes. CmpLog allows AFL++ to observe runtime comparisons, extract operand values, and inject them into mutations. This significantly improves path discovery in structured formats.
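Wiring CmpLog in is one flag on one or two instances. A hedged sketch reusing the binaries from the build matrix (instance and directory names are illustrative):

```shell
# One secondary observes comparisons via the CmpLog binary (-c);
# -l 2 raises the CmpLog level to also solve transformed comparisons.
afl-fuzz -S cmplog01 -i afl_inp -o afl_out \
  -c ./target_cmplog -l 2 -- ./target_native
```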

Phase 1: CLI validation

Start simple.

Instead of building harnesses immediately, fuzz the CLI target:

afl-fuzz -i afl_inp -o afl_out/ -t 1000 -M FUZZ01_LTO \
-- ./lto_build/bin/bsdtar -xf @@ -C /tmp/out_bsdtar

 

Goal

 
  • Validate the toolchain
  • Ensure seeds hit real code paths
  • Build the initial corpus

Outcome

 
  • Seeds: grew from 29 to 64, then minimized to 42
  • Runtime: approximately 7 hours
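The seed reduction above maps to AFL++'s bundled minimizers. A sketch with the same CLI target (paths and the sample seed name are illustrative):

```shell
# Keep only seeds that contribute unique coverage
afl-cmin -i afl_inp -o afl_inp_min \
  -- ./lto_build/bin/bsdtar -xf @@ -C /tmp/out_bsdtar

# Optionally shrink individual seeds further
afl-tmin -i afl_inp_min/seed.tar -o seed.min.tar \
  -- ./lto_build/bin/bsdtar -xf @@ -C /tmp/out_bsdtar
```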

Key takeaway

Do not skip validation. Broken setups scale poorly.

Phase 2: Persistent mode in AFL++ (eliminating fork overhead)

The most important performance improvement in AFL++ is persistent mode.

Why it matters

 
  • Eliminates fork and execution overhead
  • Uses shared memory input
  • Improves throughput by five to twenty times

Minimal persistent loop

/* buf points at __AFL_FUZZ_TESTCASE_BUF, set once before the loop */
while (__AFL_LOOP(10000)) {
    int len = __AFL_FUZZ_TESTCASE_LEN;   /* shared-memory input length */
    struct archive *a = archive_read_new();
    archive_read_open_memory(a, buf, len);
    archive_read_next_header(a, &entry);
    archive_read_free(a);                /* free parser state every iteration */
}

 

Engineering rules

 
  • Always free parser state at the end of each iteration
  • Cap loop iterations (the 10,000 above) to limit state drift
  • Call __AFL_INIT() after expensive one-time setup, so the forkserver starts past it

Results

 
  • Approximately 394 executions per second per instance
  • Corpus: 42 to 1,059 inputs
  • Crashes: 0

 

Parallelization strategy: Avoiding redundant mutation work

Once persistent mode removes execution overhead, the next bottleneck is how effectively multiple fuzzing instances explore the input space.

The key secondary-side choice is the power schedule (-p), and the rule is simple: don’t give every secondary the same one.

If all instances run identical schedules, they quickly converge on similar mutations, leading to redundant work and poor CPU utilization. Mixed schedules ensure each instance explores a different region of the search space.

 

Recommended schedule distribution

 
  • -p explore → pushes toward new, unexplored coverage paths
  • -p exploit → focuses on inputs already near interesting states
  • -p rare → prioritizes rarely-hit edges, effective for corner-case discovery

This diversity ensures that parallel fuzzers are complementary rather than duplicative.

MOpt integration (targeted, not universal)

One secondary instance should run with -L 0 to enable MOpt (Mutation Operator Optimization).

MOpt uses a particle swarm optimization model to:

  • Track which mutation operators produce new coverage
  • Dynamically adjust mutation probabilities toward effective strategies

It performs best as a single adaptive instance within a heterogeneous setup, not as a replacement for all fuzzers.
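Putting the schedule mix and the single MOpt instance together, a launch script might look like this. Instance names, counts, and directories are illustrative:

```shell
# Main instance: stable anchor on the fast native binary
afl-fuzz -M main -i afl_inp -o afl_out -- ./target_native &

# Secondaries with deliberately different power schedules
afl-fuzz -S s01 -p explore -i afl_inp -o afl_out -- ./target_native &
afl-fuzz -S s02 -p exploit -i afl_inp -o afl_out -- ./target_native &
afl-fuzz -S s03 -p rare    -i afl_inp -o afl_out -- ./target_native &

# One adaptive MOpt secondary (-L 0) in the heterogeneous mix
afl-fuzz -S s04 -L 0 -i afl_inp -o afl_out -- ./target_native &

# One ASAN secondary validates memory safety at lower speed
afl-fuzz -S asan01 -i afl_inp -o afl_out -- ./target_asan &
wait
```

All instances share one `-o` directory, so they synchronize interesting inputs with each other automatically.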

Key takeaway

Persistent mode unlocks throughput.

The parallel strategy determines whether that throughput translates into meaningful coverage growth or wasted cycles.

This phase builds coverage, not bugs.

Phase 3: Throughput optimization in AFL++ (maximizing exec/sec)

Once coverage stabilizes, shift the objective.

Strategy change

 
  • Reduce per-iteration work
  • Focus on high-probability crash paths
  • Reuse the Phase 2 corpus

Results

 
  • 5.2 billion executions
  • Approximately 7,400 executions per second
  • 270 crashes

Insight

Coverage plateaued while crash volume increased. This indicates duplication rather than new bug discovery.
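Plateaus like this are visible in AFL++'s bundled monitoring tools; directory names follow the earlier commands:

```shell
# One-line summary per instance: exec/sec, corpus size, pending, crashes
afl-whatsup -s afl_out

# Per-instance graphs of edges found and exec/sec over time
afl-plot afl_out/main plot_out
```

When the edges-over-time curve flattens while crashes keep accumulating, you are duplicating, not discovering.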

Key takeaway

More speed increases crash volume, not necessarily the number of new vulnerabilities.

Phase 4: Expanding fuzzing coverage to deeper parser surfaces

To find new bugs, you must change the surface being tested.

New surfaces explored

 
  • ACL iterators
  • Sparse region traversal
  • Metadata structures
  • Nested object graphs

Why this matters

These areas introduce pointer-linked structures, count mismatches, and deep iteration paths.

Malformed inputs here trigger:

  • Null dereferences
  • Unbounded iteration
  • Arithmetic inconsistencies in linked regions

Execution strategy: Power schedule diversification

Expanding the surface alone is insufficient. Without a diversified mutation strategy, parallel fuzzers converge on similar paths and waste cycles.

Phase 4 introduces explicit power-schedule separation across secondaries, ensuring that each instance explores a distinct region of the input space.

Instance strategy

  • -p explore → drives discovery of new coverage paths
  • -p exploit → intensifies mutations near known interesting inputs
  • -L 0 (MOpt) → dynamically optimizes mutation operators based on observed effectiveness
  • -l 2 → raises the CmpLog level to include transformed comparisons, improving constraint solvability (laf-intel, which splits multi-byte comparisons into byte-wise checks, is a separate compile-time option)
  • -c (CmpLog) → captures runtime comparisons to guide input mutation
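Concretely, the comparison-solving instances from the list above could be launched like this. Binary names follow the earlier build matrix; the laf-intel build directory is hypothetical:

```shell
# CmpLog secondary: runtime comparison capture, level-2 transformations
afl-fuzz -S cmp01 -p explore -c ./target_cmplog -l 2 \
  -i afl_inp -o afl_out -- ./target_native &

# laf-intel requires its own instrumented build (compile-time option):
#   make clean; AFL_LLVM_LAF_ALL=1 CC=afl-clang-lto ./configure
#   AFL_LLVM_LAF_ALL=1 make && cp bsdtar target_laf
afl-fuzz -S laf01 -p rare -i afl_inp -o afl_out -- ./target_laf &
wait
```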

Why this matters

Same harness + same schedule = redundant work

Same harness + different schedules = parallel exploration

Each instance:

  • Mutates inputs differently
  • Prioritizes different execution paths
  • Contributes non-overlapping coverage

This is what allows deeper surfaces to actually introduce new bugs, rather than duplicating earlier crash classes.

Results

 
  • Coverage increased from approximately 8.6k to 9.7k edges
  • Crashes: 896
  • Hangs: 2,414

Interpretation

New crash classes emerged from deeper parser logic, not increased iteration volume.

The sharp increase in hangs reflects:

  • Traversal of nested iterator paths
  • Quadratic behavior in malformed structures

Critically, these were new execution regions, not extensions of earlier write-path bugs.

Key takeaway

New bugs come from:

New surfaces × Diverse mutation strategies

Not more iterations.

Surface expansion without schedule diversity produces redundancy, not discovery.

 

Corpus evolution across the fuzzing workflow

Stage           Files
Initial seeds   29
Post CLI        42
Post Phase 2    1,059
Post Phase 4    6,779
Final merged    36,310

Raw corpus growth is exponential. Unique coverage is not.

Crash funnel: From noise to signal


This funnel represents the most important concept in fuzzing at scale. Large volumes of crashes collapse into a very small number of real issues.

Crash triage: From 1,166 crashes to 2 bugs

Fuzzing produces crashing inputs. It does not directly produce vulnerabilities.

Triage pipeline

 
  1. Reproduce crashes using ASAN
  2. Deduplicate AFL++ outputs
  3. Generate CASR reports
  4. Cluster by stack trace similarity

Results

Stage                Count
Raw crashes          ~1,166
Reproducible         357
Unique crash sites   2

Root cause

Both bugs were located in:

archive_entry_sparse.c

Operator insight

If a fuzzing campaign produces hundreds of crashes, more than ninety percent are typically duplicates.

Key takeaway

Crash triage is where fuzzing becomes engineering.

Minimal AFL++ setup

To replicate this workflow, start with:

  • One native binary with LTO
  • One ASAN binary
  • Persistent mode harness
  • CmpLog enabled
  • One master and two secondary instances

Expand only after stability is confirmed.

Common mistakes in fuzzing pipelines

 
  • Running only ASAN builds
  • Skipping persistent mode
  • Ignoring CmpLog
  • Treating crash count as vulnerability count
  • Not implementing a triage pipeline

What this means for modern AppSec pipelines

This is not just a fuzzing problem. It reflects a broader failure in application security pipelines:

  • Too much output
  • Too little interpretation
  • Weak connection to real risk

Security tools identify where systems break. Engineering determines what actually matters.

This is the shift toward execution-aware security:

  • Focus on runtime behavior
  • Collapse duplicate findings
  • Prioritize root causes

From noise to signal

Fuzzing produces noise. Engineering produces signal.

The difference is everything.

A well-run fuzzing workflow should:

  • Generate large volumes of data
  • Collapse that data aggressively
  • Produce a small, actionable set of bugs

If your pipeline ends at 357 crashes, it is incomplete.
If it ends at 2 root causes, it is useful.

FAQs

 

What is AFL++ fuzzing?

AFL++ is a coverage-guided fuzzing framework used to discover vulnerabilities by mutating inputs and observing program behavior.

Why do fuzzers generate many crashes?

Fuzzers generate many crashes because the same bug can be triggered through multiple execution paths.

How do you deduplicate fuzzing crashes?

You can deduplicate fuzzing crashes by using clustering tools such as CASR, which group crashes based on stack traces.

What is persistent mode in AFL++?

Persistent mode in AFL++ runs the target in a loop inside a single process, avoiding repeated process restarts and dramatically improving throughput.

Why is the crash count misleading?

The crash count is misleading because it reflects detection frequency rather than unique vulnerabilities.