Commit 4a8415ab authored by Max Moroz, committed by Commit Bot

[libFuzzer] Docs: re-write the Efficient Fuzzing Guide as per feedback from the tech writer.

Also removed the "Custom options" section, as it no longer needs to be advertised.
The only real use case we may still have is `close_fd_mask`, but it is discouraged and
is better used on a case-by-case basis, when it is unavoidable.

Bug: 539572, 827228
Change-Id: I036303e1da0c844be4c3970ba5351651a2251a89
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1729847
Commit-Queue: Max Moroz <mmoroz@chromium.org>
Reviewed-by: Jonathan Metzman <metzman@chromium.org>
Cr-Commit-Position: refs/heads/master@{#683649}
parent 234cc1c1
......@@ -24,12 +24,15 @@ Started Guide].
## Advanced Topics
* Improving fuzz target effectiveness: [Efficient Fuzzer Guide].
* Improving fuzz target effectiveness: [Efficient Fuzzing Guide].
* Creating a fuzz target that expects a protobuf (instead of a byte stream) as
input: [Guide to libprotobuf-mutator (LPM)].
**Note**: you can also use LPM to fuzz code that needs multiple mutated
*** note
**Note:** you can also use LPM to fuzz code that needs multiple mutated
inputs, or to generate inputs defined by a grammar.
***
* Reproducing bugs found by libFuzzer/AFL and reported by ClusterFuzz:
[Reproducing Bugs].
......@@ -60,7 +63,7 @@ Started Guide].
[ClusterFuzz]: https://clusterfuzz.com/
[ClusterFuzz Bugs]: https://bugs.chromium.org/p/chromium/issues/list?sort=-modified&colspec=ID%20Pri%20M%20Stars%20ReleaseBlock%20Component%20Status%20Owner%20Summary%20OS%20Modified&q=label%3AStability-LibFuzzer%2CStability-AFL%20label%3AClusterFuzz%20-status%3AWontFix%2CDuplicate&can=1
[ClusterFuzz Stats]: https://clusterfuzz.com/fuzzer-stats/by-fuzzer/fuzzer/libFuzzer/job/libfuzzer_chrome_asan
[Efficient Fuzzer Guide]: efficient_fuzzer.md
[Efficient Fuzzing Guide]: efficient_fuzzing.md
[Fuzzing]: https://en.wikipedia.org/wiki/Fuzzing
[Fuzzing on Chrome OS]: https://chromium.googlesource.com/chromiumos/docs/+/master/fuzzing.md
[Getting Started Guide]: getting_started.md
......
# Efficient Fuzzer Guide
This document describes ways to determine efficiency of a fuzz target and ways
to improve it.
## Overview
Being a coverage-driven fuzzing engine, libFuzzer considers a certain input
*interesting* if it results in new code coverage, i.e. it reaches a code that
has not been reached before. The set of all interesting inputs is called
*corpus*.
Items in corpus are constantly mutated in search of new interesting inputs.
Corpus can be shared across fuzzer runs and grows over time as new code is
reached.
There are several metrics you should look at to determine effectiveness of your
fuzz target:
* [Execution Speed](#Execution-Speed)
* [Code Coverage](#Code-Coverage)
* [Corpus Size](#Corpus-Size)
You can collect these metrics manually or take them from [ClusterFuzz status]
pages after a fuzz target is checked in Chromium repository.
The following things are extremely useful for improving fuzzing efficiency, so
we *strongly recommend* them for any fuzz target:
* [Seed Corpus](#Seed-Corpus)
* [Fuzzer Dictionary](#Fuzzer-Dictionary)
There are other ways that are useful in some cases, but not always applicable:
* [Custom Options](#Custom-Options)
* [Custom Build](#Custom-Build)
## Execution Speed
Fuzz target speed is calculated in executions per second. It is printed while a
fuzz target is running:
```
#19346 NEW cov: 2815 bits: 1082 indir: 43 units: 150 exec/s: 19346 L: 62
```
Because libFuzzer performs randomized mutations, it is critical to have it run
as fast as possible to navigate through the large search space efficiently and
find interesting code paths. You should try to get to at least 1,000 exec/s from
your fuzz target locally before submitting it to the Chromium repository.
### Initialization/Cleanup
Try to keep `LLVMFuzzerTestOneInput` function as simple as possible. If your
fuzzing function is too complex, it can bring down fuzzer execution speed OR it
can target very specific usecases and fail to account for unexpected scenarios.
Prefer to use static initialization and shared resources rather than performing
setup and teardown on every single input. Checkout example on
[startup initialization] in libFuzzer documentation.
You can skip freeing static resources. However, all resources allocated within
`LLVMFuzzerTestOneInput` function should be de-allocated since this function is
called millions of times during a fuzzing session. Otherwise, we will hit OOMs
frequently and reduce overall fuzzing efficiency.
### Memory Usage
Avoid allocation of dynamic memory wherever possible. Memory instrumentation
works faster for stack-based and static objects, than for heap allocated ones.
It is always a good idea to try different variants for your fuzz target locally,
and then submit the fastest implementation.
## Code Coverage
[Chrome libFuzzer coverage] provides a source-level coverage report for fuzz
targets from recent runs. Looking at the report might provide insight on how to
improve code coverage of a fuzz target.
You can also generate source-level coverage report locally on your particular
fuzzer by running the [coverage script] stored in Chromium repository. The
script provides detailed instructions as well as a usage example.
Note that code coverage of a fuzz target **depends heavily** on the corpus
provided when running the target, i.e. code coverage report generated by a fuzz
target launched without any corpus would not make much sense. To download the
corpus from ClusterFuzz, see [ClusterFuzz Corpus].
## Corpus Size
After running for a while, a fuzz target would reach a plateau and may stop
discovering new interesting inputs. Corpus for a reasonably complex target
should contain hundreds (if not thousands) of items.
Too small of a corpus size may indicate that fuzz target is hitting a code
barrier and is unable to get past it. Common cases of such issues include:
checksums, magic numbers, etc. However, it also could mean that it is impossible
for your fuzzer to reach a lot of code. The easiest way to diagnose this problem
is to generate and analyze a [coverage report](#Code-Coverage). To fix the
issue, you can:
* Change the code (e.g. disable crc checks while fuzzing), see
[Custom Build](#Custom-Build).
* Prepare or improve [seed corpus](#Seed-Corpus).
* Prepare or improve [fuzzer dictionary](#Fuzzer-Dictionary).
* Add [custom options](#Custom-Options).
## Seed Corpus
Seed corpus is a set of *valid* and *interesting* inputs that serve as starting
points for a fuzz target. If one is not provided, a fuzzing engine would have to
guess these inputs from scratch, which can take an indefinite amount of time
depending on size of the inputs and complexity of the target format.
Seed corpus works especially well for strictly defined file formats and data
transmission protocols.
* For file format parsers, add valid files from your test suite.
* For protocol parsers, add valid raw streams from test suite into separate
files.
Other examples include a graphics library seed corpus, which would be a variety
of small PNG/JPG/GIF files.
If you are running a fuzz target locally, you can pass a corpus directory as an
argument:
```
./out/libfuzzer/my_fuzzer ~/tmp/my_fuzzer_corpus
```
The fuzzer would store all the interesting inputs it finds in that directory.
While libFuzzer can start with an empty corpus, seed corpus is always useful and
in many cases is able to increase code coverage by an order of magnitude.
ClusterFuzz uses seed corpus defined in Chromium source repository. You need to
add a `seed_corpus` attribute to your `fuzzer_test` definition in BUILD.gn file:
```
fuzzer_test("my_fuzzer") {
...
seed_corpus = "test/fuzz/testcases"
...
}
```
You may specify multiple seed corpus directories via `seed_corpuses` attribute:
```
fuzzer_test("my_fuzzer") {
...
seed_corpuses = [ "test/fuzz/testcases", "test/unittest/data" ]
...
}
```
All files found in these directories and their subdirectories will be archived
into a `<my_fuzzer>_seed_corpus.zip` output archive.
If you can't store seed corpus in Chromium repository (e.g. it is too large,
cannot be open sourced, etc), you can upload the corpus to Google Cloud Storage
bucket used by ClusterFuzz:
1) Go to [Corpus GCS Bucket].
2) Open directory named `<my_fuzzer>`. If the directory does not exist,
please create it.
3) Upload corpus files into the directory.
Alternative and faster way is to use [gsutil] command line tool:
```bash
gsutil -m rsync <path_to_corpus> gs://clusterfuzz-corpus/libfuzzer/<my_fuzzer>
```
*** note
**Requirements:** You must have an @google.com and you must be logged into that
account to write to this bucket (@chromium.org will not work). You can use the
`gcloud auth login` command to log into your account in `gsutil` if you
installed `gsutil` through `gcloud`.
***
Note that if you upload the corpus to GCS, the `seed_corpus` attribute is not
needed in your `fuzzer_test` definition.
### Corpus Minimization
It's important to minimize seed corpus to a *small set of interesting inputs*
before uploading. The reason being that seed corpus is synced to all fuzzing
bots for every iteration, so it is important to keep it small both for fuzzing
efficiency and to prevent our bots from running out of disk space.
The minimization can be done using `-merge=1` option of libFuzzer:
```bash
# Create an empty directory.
mkdir seed_corpus_minimized
# Run the fuzzer with -merge=1 flag.
./my_fuzzer -merge=1 ./seed_corpus_minimized ./seed_corpus
```
After running the command above, `seed_corpus_minimized` directory will contain
a minimized corpus that gives the same code coverage as the initial
`seed_corpus` directory.
## Fuzzer Dictionary
It is very useful to provide fuzz target with a set of *common words or values*
that you expect to find in the input. Adding a dictionary highly improves the
efficiency of finding new units and works especially well in certain usecases
(e.g. fuzzing file format decoders or text based protocols like XML).
To add a dictionary, first create a dictionary file. This is a flat *ascii* text
file where tokens are listed one per line in the format of `name="value"`, where
`name` is optional and can be omitted, although it is a convenient way to
document the meaning of a particular token. The value must appear in quotes,
with hex escaping (\xNN) applied to all non-printable, high-bit, or otherwise
problematic characters (\\ and \" shorthands are recognized too). This syntax is
similar to the one used by [AFL] fuzzing engine (-x option).
An example dictionary looks like:
```
# Lines starting with '#' and empty lines are ignored.
# Adds "blah" word (w/o quotes) to the dictionary.
kw1="blah"
# Use \\ for backslash and \" for quotes.
kw2="\"ac\\dc\""
# Use \xAB for hex values.
kw3="\xF7\xF8"
# Key name before '=' can be omitted:
"foo\x0Abar"
```
Make sure to test your dictionary by running your fuzz target locally:
```bash
./out/libfuzzer/my_fuzzer -dict=<path_to_dict> <path_to_corpus>
```
If the dictionary is effective, you should see new units discovered in the
output.
To submit a dictionary to Chromium repository:
1) Add the dictionary file in the same directory as your fuzz target
2) Add `dict` attribute to `fuzzer_test` definition in BUILD.gn file:
```
fuzzer_test("my_fuzzer") {
...
dict = "my_fuzzer.dict"
}
```
The dictionary will be used automatically by ClusterFuzz once it picks up a new
revision build.
## Custom Options
Custom options help to fine tune libFuzzer execution parameters and will also
override the default values used by ClusterFuzz. Please read [libFuzzer options]
page for detailed documentation on how these work. A more up-to-date list of
options can be obtained by using libFuzzer's `-help=1` option (i.e. `./my_fuzzer
-help=1`).
Add the options needed in `libfuzzer_options` attribute to your `fuzzer_test`
definition in BUILD.gn file:
```
fuzzer_test("my_fuzzer") {
...
libfuzzer_options = [
# Suppress stderr output (not recommended, as it may silence useful info).
"close_fd_mask=2",
]
}
```
Please note that `dict` parameter should be provided
[separately](#Fuzzer-Dictionary). All other options can be passed using
`libfuzzer_options` property.
## Custom Build
If you need to change the code being tested by your fuzz target, you may use an
`#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` macro in your target code.
Note that patching target code is not a preferred way of improving corresponding
fuzz target, but in some cases that might be the only way possible, e.g. when
there is no intended API to disable checksum verification, or when target code
uses random generator that affects reproducibility of crashes.
[AFL]: http://lcamtuf.coredump.cx/afl/
[ClusterFuzz Corpus]: libFuzzer_integration.md#Corpus
[ClusterFuzz status]: libFuzzer_integration.md#Status-Links
[Corpus GCS Bucket]: https://console.cloud.google.com/storage/clusterfuzz-corpus/libfuzzer
[issue 638836]: https://bugs.chromium.org/p/chromium/issues/detail?id=638836
[coverage script]: https://cs.chromium.org/chromium/src/tools/code_coverage/coverage.py
[gsutil]: https://cloud.google.com/storage/docs/gsutil
[libFuzzer options]: https://llvm.org/docs/LibFuzzer.html#options
[startup initialization]: https://llvm.org/docs/LibFuzzer.html#startup-initialization
[Chrome libFuzzer coverage]: https://chromium-coverage.appspot.com/reports/latest_fuzzers_only/linux/index.html
# Efficient Fuzzing Guide
Once you have a fuzz target running, you can analyze and tweak it to improve its
efficiency. This document describes techniques to minimize fuzzing time and
maximize your results.
*** note
**Note:** If you haven’t created your first fuzz target yet, see the [Getting
Started Guide].
***
The most direct way to gauge the effectiveness of your fuzz target is to collect
metrics. You can get them manually, or take them from a [ClusterFuzz status]
page after your fuzz target is checked into the Chromium repository.
[TOC]
## Key metrics of a fuzz target
### Execution speed
A fuzzing engine such as libFuzzer typically explores a large search space by
performing randomized mutations, so it needs to run as fast as possible to find
interesting code paths.
Fuzz target speed is calculated in executions per second (`exec/s`). It is
printed while a fuzz target is running:
```
#11002 NEW cov: 1337 ft: 10934 corp: 707/409Kb lim: 1098 exec/s: 5333 rss: 27Mb L: 186/1098
```
You should aim for at least 1,000 exec/s from your fuzz target locally before
submitting it to the Chromium repository. If you’re under 1,000, consider the
following improvements:
* [Simplifying initialization/cleanup](#Simplifying-initialization-cleanup)
* [Minimizing memory usage](#Minimizing-memory-usage)
#### Simplifying initialization/cleanup
If your `LLVMFuzzerTestOneInput` function is too complex, it can decrease the
fuzzer’s execution speed. It can also cause the fuzzer to target specific
use-cases or fail to account for unexpected scenarios.
Instead of performing setup and teardown on each input, use static
initialization and shared resources. See the [startup initialization] section of
libFuzzer’s documentation for an example.
*** note
**Note:** You can skip freeing static resources. However, all other resources
allocated within the `LLVMFuzzerTestOneInput` function should be de-allocated,
since the function gets called millions of times during a fuzzing session. If
you don’t, you’ll often run out of memory and reduce overall fuzzing efficiency.
***
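As a rough sketch of the pattern described above, the following target performs
its expensive setup once through a function-local static and reuses it for every
input. `Environment` and `ParseInput` are hypothetical stand-ins for
illustration, not Chromium or libFuzzer APIs.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for expensive one-time setup (e.g. loading tables).
struct Environment {
  Environment() { ++init_count; }
  static int init_count;  // Counts how many times setup has run.
};
int Environment::init_count = 0;

// Hypothetical code under test; replace with the real API being fuzzed.
bool ParseInput(const Environment& env, const std::string& input) {
  (void)env;
  return !input.empty() && input[0] == '{';
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // Constructed on the first call only and intentionally never freed.
  static Environment* env = new Environment();

  // Per-input state is owned by a local object, so it is released on return.
  std::string input(reinterpret_cast<const char*>(data), size);
  ParseInput(*env, input);
  return 0;
}
```

Calling the function repeatedly runs the `Environment` constructor only once,
while everything created for a single input is destroyed before the next call.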
#### Minimizing memory usage
Avoid allocation of dynamic memory wherever possible. Memory instrumentation
works faster for stack-based and static objects than for heap-allocated ones.
*** note
**Note:** It’s always a good idea to try different variants for your fuzz target
locally, then submit only the fastest implementation to the Chromium repository.
***
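One possible way to avoid a heap allocation per input, sketched under the
assumption that the code under test needs a mutable copy of the data and that
inputs larger than a fixed bound are not interesting. `kMaxInputSize` and
`UppercaseInPlace` are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical upper bound on the input size the target cares about.
constexpr size_t kMaxInputSize = 256;

// Hypothetical code under test: modifies the buffer in place and returns the
// number of bytes it changed.
size_t UppercaseInPlace(uint8_t* data, size_t size) {
  size_t changed = 0;
  for (size_t i = 0; i < size; ++i) {
    if (data[i] >= 'a' && data[i] <= 'z') {
      data[i] = static_cast<uint8_t>(data[i] - 'a' + 'A');
      ++changed;
    }
  }
  return changed;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // A fixed-size stack buffer replaces a per-input std::vector allocation.
  std::array<uint8_t, kMaxInputSize> buffer{};
  const size_t length = std::min(size, buffer.size());
  if (length > 0) std::memcpy(buffer.data(), data, length);
  UppercaseInPlace(buffer.data(), length);
  return 0;
}
```

The trade-off is that bytes beyond `kMaxInputSize` are silently ignored, so this
variant only makes sense when the bound matches what the target really needs.
Whether it is actually faster depends on the target, so compare `exec/s` for
both variants locally.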
### Code coverage
You can check the percentage of code covered by your fuzz target to gauge
fuzzing effectiveness:
* Review aggregated Chrome coverage from recent runs by checking the [fuzzing
coverage] report. This report can provide insight on how to improve code
coverage.
* Generate a source-level coverage report for your fuzzer by running the
[coverage script] stored in the Chromium repository. The script provides
detailed instructions and a usage example.
*** note
**Note:** The code coverage of a fuzz target depends heavily on the corpus. A
well-chosen corpus will produce much greater code coverage. On the other hand,
a coverage report generated by a fuzz target without a corpus won't cover much
code. If you don’t have a corpus to use, you can download the [corpus from
ClusterFuzz]. For more information on the corpus, see
[Corpus Size](#Corpus-Size).
***
### Corpus size
A guided fuzzing engine such as libFuzzer considers an input (a.k.a. testcase
or corpus unit) *interesting* if the input results in new code coverage (i.e.,
if the fuzzer reaches code that has not been reached before). The set of all
interesting inputs is called the *corpus*. A corpus is shared across fuzzer runs
and grows over time.
If a fuzz target stops discovering new interesting inputs after running for a
while, it typically indicates that the fuzz target is hitting a code barrier
(also called a *coverage plateau*). The corpus for a reasonably complex target
should contain hundreds (if not thousands) of inputs.
If a fuzz target reaches a coverage plateau with a small corpus, the common
causes are checksums and magic numbers that mutated inputs rarely satisfy.
Alternatively, much of the code may simply be unreachable from your fuzzer. The
easiest way to diagnose the problem is to generate and analyze a
[coverage report](#code-coverage). Then, to fix the issue, try the following:
* Change the code (e.g., disable CRC checks while fuzzing) with a
[custom build](#Custom-build).
* Prepare or improve the [seed corpus](#Seed-corpus).
* Prepare or improve the [fuzzer dictionary](#Fuzzer-dictionary).
## Ways to improve a fuzz target
### Seed corpus
You can give your fuzz target a starting point by creating a set of valid and
interesting inputs called a *seed corpus*. If you don’t provide a seed corpus,
the fuzzing engine has to guess inputs from scratch, which can take time
(depending on the size of the inputs and the complexity of the target format).
In many cases, providing a seed corpus can increase code coverage by an order of
magnitude.
Seed corpuses work especially well for strictly defined file formats and data
transmission protocols:
* For file format parsers, add valid files from your test suite.
* For protocol parsers, add valid raw streams from a test suite into separate
files.
* For graphics libraries, add a variety of small PNG/JPG/GIF files.
#### Using a corpus locally
If you’re running a fuzz target locally, you can easily designate a corpus by
passing a directory as an argument:
```
./out/libfuzzer/my_fuzzer ~/tmp/my_fuzzer_corpus
```
The fuzzer stores all the interesting inputs it finds in the directory.
#### Creating a Chromium repository seed corpus
When running fuzz targets at scale, ClusterFuzz looks for a seed corpus defined
in the Chromium source repository. You can define one in your `BUILD.gn` file by
adding a `seed_corpus` attribute to your `fuzzer_test` target definition:
```
fuzzer_test("my_fuzzer") {
...
seed_corpus = "test/fuzz/testcases"
...
}
```
If you want to specify multiple seed corpus directories, use the `seed_corpuses`
attribute instead:
```
fuzzer_test("my_fuzzer") {
...
seed_corpuses = [ "test/fuzz/testcases", "test/unittest/data" ]
...
}
```
All files found in these directories and their subdirectories are stored in a
`<my_fuzzer>_seed_corpus.zip` output archive.
#### Uploading corpus files to GCS
If you can't store your seed corpus in the Chromium repository (e.g., it’s too
large, can’t be open-sourced, etc.), you can upload the corpus to the Google
Cloud Storage (GCS) bucket used by ClusterFuzz.
1) Open the [Corpus GCS Bucket] in your browser.
2) Search for the directory named `<my_fuzzer>`. If the directory does not
exist, create it.
3) In the `<my_fuzzer>` directory, upload your corpus files.
*** note
**Note:** If you upload your corpus to GCS, you don’t need to add the
`seed_corpus` attribute to your `fuzzer_test` target definition. However, adding
the seed corpus to the Chromium repository is the preferred approach.
***
You can do the same thing by using the [gsutil] command line tool:
```bash
gsutil -m rsync <path_to_corpus> gs://clusterfuzz-corpus/libfuzzer/<my_fuzzer>
```
*** note
**Note:** To write to this bucket using `gsutil`, you must be logged into your
@google.com account (@chromium.org will not work). You can use the `gcloud auth
login` command to log into your account in `gsutil` if you installed `gsutil`
through `gcloud`.
***
#### Minimizing a seed corpus
Your seed corpus is synced to all fuzzing bots for every iteration, so it's
important to minimize it to a small set of interesting inputs before uploading.
Keeping the seed corpus small improves fuzzing efficiency and prevents our bots
from running out of disk space.
You can minimize your seed corpus by using libFuzzer’s `-merge=1` option:
```bash
# Create an empty directory.
mkdir seed_corpus_minimized
# Run the fuzzer with -merge=1 flag.
./my_fuzzer -merge=1 ./seed_corpus_minimized ./seed_corpus
```
After running the command, the `seed_corpus_minimized` directory will contain a
minimized corpus that gives the same code coverage as your initial `seed_corpus`
directory.
### Fuzzer dictionary
You can help your fuzzer increase its coverage by providing a set of common
words or values that you expect to find in the input. Such a dictionary works
especially well for certain use-cases (e.g., fuzzing file format decoders or
text-based protocols like XML).
Add a fuzzer dictionary:
1) Create a flat ASCII text file that lists one input token per line in the
format `name="value"`. The value must appear in quotes with hex escaping
(`\xNN`) applied to all non-printable, high-bit, or otherwise problematic
characters (`\\` and `\"` shorthands are recognized, too). This syntax is
similar to the one used by the [AFL] fuzzing engine (`-x` option).
*** note
**Note:** `name` can be omitted, but it is a convenient way to document the
meaning of each token.
***
Here’s an example dictionary:
```
# Lines starting with '#' and empty lines are ignored.
# Adds "blah" word (w/o quotes) to the dictionary.
kw1="blah"
# Use \\ for backslash and \" for quotes.
kw2="\"ac\\dc\""
# Use \xAB for hex values.
kw3="\xF7\xF8"
# Key name before '=' can be omitted:
"foo\x0Abar"
```
2) Test your dictionary by running your fuzz target locally:
```bash
./out/libfuzzer/my_fuzzer -dict=<path_to_dict> <path_to_corpus>
```
If the dictionary is effective, you should see `NEW` units discovered in the
output.
3) Add the dictionary file in the same directory as your fuzz target, then add
the `dict` attribute to the `fuzzer_test` definition in your `BUILD.gn` file:
```
fuzzer_test("my_fuzzer") {
...
dict = "my_fuzzer.dict"
}
```
After the change lands in the Chromium repository and ClusterFuzz
picks up a new revision build, the dictionary is used automatically.
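If you generate dictionary tokens from binary data, the escaping rules from
step 1 can be applied mechanically. The helper below is a minimal sketch of
those rules; `MakeDictEntry` is a hypothetical name and is not part of libFuzzer
or Chromium.

```cpp
#include <cstdio>
#include <string>

// Formats raw bytes as one dictionary line: an optional name followed by a
// quoted value. Backslash and double quote use the shorthands; every other
// byte outside printable ASCII is written as \xNN.
std::string MakeDictEntry(const std::string& name, const std::string& value) {
  std::string out = name.empty() ? std::string() : name + "=";
  out += '"';
  for (unsigned char c : value) {
    if (c == '\\') {
      out += "\\\\";
    } else if (c == '"') {
      out += "\\\"";
    } else if (c >= 0x20 && c < 0x7F) {
      out += static_cast<char>(c);
    } else {
      char hex[5];  // "\xNN" plus the terminating NUL.
      std::snprintf(hex, sizeof(hex), "\\x%02X", static_cast<unsigned>(c));
      out += hex;
    }
  }
  out += '"';
  return out;
}
```

For instance, `MakeDictEntry("", "foo\nbar")` yields the line `"foo\x0Abar"`,
matching the last entry of the example dictionary.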
### Custom build
If you need to change the code being tested by your fuzz target, you can use an
`#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` macro in your target code.
*** note
**Note:** Patching target code is not a preferred way of improving the
corresponding fuzz target, but in some cases it might be the only way to do it
(e.g., when there is no intended API to disable checksum verification, or when
the target code uses a random generator that affects the reproducibility of
crashes).
***
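As an illustration of the checksum case, here is a minimal sketch that assumes a
made-up packet format with a one-byte trailing checksum. `ComputeChecksum` and
`IsPacketValid` are hypothetical; only the
`FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` macro comes from the text above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical checksum: the sum of all bytes, truncated to 8 bits.
uint8_t ComputeChecksum(const uint8_t* data, size_t size) {
  uint8_t sum = 0;
  for (size_t i = 0; i < size; ++i) sum = static_cast<uint8_t>(sum + data[i]);
  return sum;
}

// Hypothetical packet layout: payload bytes followed by a 1-byte checksum.
bool IsPacketValid(const uint8_t* packet, size_t size) {
  if (size < 2) return false;
#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
  // Fuzzing builds skip verification so mutated inputs reach the parser.
  return true;
#else
  return ComputeChecksum(packet, size - 1) == packet[size - 1];
#endif
}
```

Keeping the guarded region this small limits how far the fuzzing build diverges
from the code that ships.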
[AFL]: http://lcamtuf.coredump.cx/afl/
[ClusterFuzz status]: libFuzzer_integration.md#Status-Links
[Corpus GCS Bucket]: https://console.cloud.google.com/storage/clusterfuzz-corpus/libfuzzer
[Getting Started Guide]: getting_started.md
[corpus from ClusterFuzz]: libFuzzer_integration.md#Corpus
[coverage script]: https://cs.chromium.org/chromium/src/tools/code_coverage/coverage.py
[fuzzing coverage]: https://chromium-coverage.appspot.com/reports/latest_fuzzers_only/linux/index.html
[gsutil]: https://cloud.google.com/storage/docs/gsutil
[startup initialization]: https://llvm.org/docs/LibFuzzer.html#startup-initialization
......@@ -25,10 +25,12 @@ tools/mb/mb.py gen -m chromium.fuzz -b 'Libfuzzer Upload Mac ASan' out/libfuzzer
python tools\mb\mb.py gen -m chromium.fuzz -b "Libfuzzer Upload Windows ASan" out\libfuzzer
```
**Note**: You can also invoke [AFL] by using the `use_afl` GN argument, but we
*** note
**Note:** You can also invoke [AFL] by using the `use_afl` GN argument, but we
recommend libFuzzer for local development. Running libFuzzer locally doesn't
require any special configuration and gives quick, meaningful output for speed,
coverage, and other parameters.
***
It’s possible to run fuzz targets without sanitizers, but not recommended, as
sanitizers help to detect errors which may not result in a crash otherwise.
......@@ -43,8 +45,10 @@ sanitizers help to detect errors which may not result in a crash otherwise.
For more on builder and sanitizer configurations, see the [Integration
Reference] page.
*** note
**Hint**: Fuzz targets are built with minimal symbols by default. You can adjust
the symbol level by setting the `symbol_level` attribute.
***
### Creating your first fuzz target
......@@ -53,9 +57,11 @@ After you set up your build environment, you can create your first fuzz target:
1. In the same directory as the code you are going to fuzz (or next to the tests
for that code), create a new `<my_fuzzer>.cc` file.
**Note**: Do not use the `testing/libfuzzer/fuzzers` directory. This
*** note
**Note:** Do not use the `testing/libfuzzer/fuzzers` directory. This
directory was used for initial sample fuzz targets but is no longer
recommended for landing new targets.
***
2. In the new file, define a `LLVMFuzzerTestOneInput` function:
......@@ -79,11 +85,13 @@ After you set up your build environment, you can create your first fuzz target:
}
```
**Note**: Most of the targets are small. They may perform one or a few API calls
*** note
**Note:** Most of the targets are small. They may perform one or a few API calls
using the data provided by the fuzzing engine as an argument. However, fuzz
targets may be more complex if a certain initialization procedure needs to be
performed. [quic_stream_factory_fuzzer.cc] is a good example of a complex fuzz
target.
***
### Running the fuzz target
......@@ -119,9 +127,11 @@ your fuzz target is efficient, it will find a lot of them quickly. A `... pulse
For more information about the output, see [libFuzzer's output documentation].
**Note**: If you observe an `odr-violation` error in the log, please try setting
*** note
**Note:** If you observe an `odr-violation` error in the log, please try setting
the following environment variable: `ASAN_OPTIONS=detect_odr_violation=0` and
running the fuzz target again.
***
#### Symbolizing a stacktrace
......@@ -149,12 +159,14 @@ fuzz target, ClusterFuzz will run it at scale. Check the [ClusterFuzz status]
page after a day or two.
If you want to better understand and optimize your fuzz target’s performance,
see the [Efficient Fuzzer Guide].
see the [Efficient Fuzzing Guide].
**Note**: It’s important to run fuzzers at scale, not just in your own
*** note
**Note:** It’s important to run fuzzers at scale, not just in your own
environment, because local fuzzing will catch fewer issues. If you run fuzz
targets at scale continuously, you’ll catch regressions and improve code
coverage over time.
***
## Optional improvements
......@@ -166,9 +178,11 @@ You can make it more effective with several easy steps:
* **Create a seed corpus**. You can guide the fuzzing engine to generate more
relevant inputs by adding the `seed_corpus = "src/fuzz-testcases/"` attribute
to your fuzz target and adding example files to the appropriate directory. For
more, see the [Seed Corpus] section of the [Efficient Fuzzer Guide].
more, see the [Seed Corpus] section of the [Efficient Fuzzing Guide].
**Note**: make sure your corpus files are appropriately licensed.
*** note
**Note:** make sure your corpus files are appropriately licensed.
***
* **Create a mutation dictionary**. You can make mutations more effective by
providing the fuzzer with a `dict = "protocol.dict"` GN attribute and a
......@@ -195,7 +209,7 @@ You can make it more effective with several easy steps:
* **Generate a [code coverage report]**. See which code the fuzzer covered in
recent runs, so you can gauge whether it hits the important code parts or not.
**Note**: Since the code coverage of a fuzz target depends heavily on the
**Note:** Since the code coverage of a fuzz target depends heavily on the
corpus provided when running the target, we recommend running the fuzz target
built with ASan locally for a little while (several minutes / hours) first.
This will produce some corpus, which should be used for generating a code
......@@ -236,8 +250,10 @@ mutate multiple inputs at once.
If you need to mutate multiple inputs of various types and length, see [Getting
Started with libprotobuf-mutator in Chromium].
**Note**: This method requires extra effort, but works with APIs and data
*** note
**Note:** This method requires extra effort, but works with APIs and data
structures of any complexity.
***
#### Hash-based argument
......@@ -255,11 +271,13 @@ extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
```
**Note**: The hash value derived from the data is a random value, rather than a
*** note
**Note:** The hash value derived from the data is a random value, rather than a
meaningful one controlled by the fuzzing engine. A single bit mutation might
lead to a new code coverage, but the next mutation would generate a new hash
value and trigger another code path, without providing any real guidance to the
fuzzing engine.
***
#### Bytes taken from (data, size)
......@@ -348,16 +366,16 @@ vector or string object, for example, simply initialize that object by passing
[AFL]: AFL_integration.md
[AddressSanitizer]: http://clang.llvm.org/docs/AddressSanitizer.html
[ClusterFuzz status]: libFuzzer_integration.md#Status-Links
[Efficient Fuzzer Guide]: efficient_fuzzer.md
[Efficient Fuzzing Guide]: efficient_fuzzing.md
[FuzzedDataProvider]: https://cs.chromium.org/chromium/src/third_party/libFuzzer/src/utils/FuzzedDataProvider.h
[Fuzzer Dictionary]: efficient_fuzzer.md#Fuzzer-Dictionary
[Fuzzer Dictionary]: efficient_fuzzing.md#Fuzzer-dictionary
[GN]: https://gn.googlesource.com/gn/+/master/README.md
[Getting Started with libprotobuf-mutator in Chromium]: libprotobuf-mutator.md
[Integration Reference]: reference.md
[MemorySanitizer]: http://clang.llvm.org/docs/MemorySanitizer.html
[Seed Corpus]: efficient_fuzzer.md#Seed-Corpus
[Seed Corpus]: efficient_fuzzing.md#Seed-corpus
[UndefinedBehaviorSanitizer]: http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
[code coverage report]: efficient_fuzzer.md#Code-Coverage
[code coverage report]: efficient_fuzzing.md#Code-coverage
[crbug/598448]: https://bugs.chromium.org/p/chromium/issues/detail?id=598448
[google/fuzzing documentation page]: https://github.com/google/fuzzing/blob/master/docs/split-inputs.md#fuzzed-data-provider
[libFuzzer's output documentation]: http://llvm.org/docs/LibFuzzer.html#output
......