Commit 80708039 authored by Chris Palmer, committed by Commit Bot

Answer more of people's questions about the Rule Of 2.

Or, try to. Also give more examples.

Bug: None
Change-Id: If7d57944aa61e017bb4b6a7d144ded3b51b04318
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1497366
Reviewed-by: Robert Sesek <rsesek@chromium.org>
Reviewed-by: Emily Stark <estark@chromium.org>
Commit-Queue: Chris Palmer <palmer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#638254}
parent 9e508a68
......@@ -15,29 +15,67 @@ When code that handles untrustworthy inputs at high privilege has bugs, the
resulting vulnerabilities are typically of Critical or High severity. (See our
[Severity Guidelines](severity-guidelines.md).) We'd love to reduce the severity
of such bugs by reducing the amount of damage they can do (lowering their
privilege), avoiding the classes of memory corruption bugs (using a safe
privilege), avoiding the various types of memory corruption bugs (using a safe
language), or reducing the likelihood that the input is malicious (asserting the
trustworthiness of the source).
For the purposes of this document, our main concern is reducing (and hopefully,
ultimately eliminating) bugs that arise due to _memory unsafety_. [A recent
study by Matt Miller from Microsoft
Security](https://github.com/Microsoft/MSRC-Security-Research/blob/master/presentations/2019_02_BlueHatIL/2019_01%20-%20BlueHatIL%20-%20Trends%2C%20challenge%2C%20and%20shifts%20in%20software%20vulnerability%20mitigation.pdf)
states that "~70% of the vulnerabilities addressed through a security update
each year continue to be memory safety issues". A trip through Chromium's bug
tracker will show many, many vulnerabilities whose root cause is memory
unsafety. (For example, [Type=Bug-Security
sanitizer](https://bugs.chromium.org/p/chromium/issues/list?can=1&q=Type%3DBug-Security+sanitizer&colspec=ID+Pri+M+Stars+ReleaseBlock+Component+Status+Owner+Summary+OS+Modified&x=m&y=releaseblock&cells=ids).)
Security engineers in general, very much including Chrome Security Team, would
like to advance the state of engineering to where memory safety issues are much
more rare. Then, we could focus more attention on the application-semantic
vulnerabilities. 😊 That would be a big improvement.
## What?
Some definitions are in order.
### Untrustworthy Inputs
_Untrustworthy inputs_ are inputs that
* have non-trivial grammars; or
* have non-trivial grammars; and/or
* come from untrustworthy sources.
If there were an input type so simple that it were straightforward to write a
memory-safe handler for it, we wouldn't need to worry much about where it came
from **for the purposes of memory safety**, because we'd be sure we could handle
it. We would still need to treat the input as untrustworthy after
parsing, of course.
Unfortunately, it is very rare to find a grammar trivial enough that we can
trust ourselves to parse it successfully or fail safely. (But see
[Normalization](#Normalization) for a potential example.)
Obviously, any arbitrary peer on the Internet is an untrustworthy source without
some evidence of trustworthiness (which includes at least [a strong assertion of
the source's identity](#verifying-the-trustworthiness-of-a-source)).
_Unsafe implementation languages_ are languages that lack
[memory safety](https://en.wikipedia.org/wiki/Memory_safety), including at least
C, C++, and assembly language. Memory-safe languages include Go, Rust, Python,
Java, JavaScript, Kotlin, and Swift.
[Normalization](#Normalization) for a potential example.) Therefore, we do need
to concern ourselves with the provenance of such inputs.
Any arbitrary peer on the Internet is an untrustworthy source, unless we get
some evidence of its trustworthiness (which includes at least [a strong
assertion of the source's
identity](#verifying-the-trustworthiness-of-a-source)). When we can know with
certainty that an input is coming from the same source as the application itself
(e.g. Google in the case of Chrome, or Mozilla in the case of Firefox), and that
the transport is integrity-protected (such as with HTTPS), then it can be
acceptable to parse even complex inputs from that source. It's still ideal,
where feasible, to not have to trust the source — such as by parsing the input
in a sandbox.
### Unsafe Implementation Languages
_Unsafe implementation languages_ are languages that lack [memory
safety](https://en.wikipedia.org/wiki/Memory_safety), including at least C, C++,
and assembly language. Memory-safe languages include Go, Rust, Python, Java,
JavaScript, Kotlin, and Swift. (Note that the safe subsets of these languages
are safe by design, but of course implementation quality is a different story.)
### High Privilege
_High privilege_ is a relative term. The very highest-privilege programs are the
computer's firmware, the bootloader, the kernel, any hypervisor or virtual
......@@ -99,8 +137,10 @@ _normal_ or _minimal_ form, usually by first transforming it into a format with
a simpler grammar. We say that all data, file, and wire formats are defined by a
_grammar_, even if that grammar is implicit or only partially-specified (as is
so often the case). A file format with a particularly simple grammar is
[Farbfeld](https://tools.suckless.org/farbfeld/) (see the table at the top).
It's rare to find such a simple grammar, however.
[Farbfeld](https://tools.suckless.org/farbfeld/) (the grammar is represented in
the table at the top).
It's rare to find such a simple grammar for input formats, however.
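To make "particularly simple grammar" concrete, here is a minimal sketch of a complete Farbfeld parser. It assumes the layout given in the table on the Farbfeld page: an 8-byte `farbfeld` magic value, a 32-bit big-endian width, a 32-bit big-endian height, and then four 16-bit big-endian channels (RGBA) per pixel. The name `parse_farbfeld` and the `MAX_PIXELS` cap are hypothetical and chosen only for illustration; nothing here is Chromium code.

```python
import struct

MAGIC = b"farbfeld"
# Header: 8-byte magic, then width and height as 32-bit big-endian integers.
HEADER = struct.Struct(">8sII")
# Hypothetical cap on image size, so a hostile header cannot request a huge buffer.
MAX_PIXELS = 1 << 24


def parse_farbfeld(data: bytes):
    """Parse a Farbfeld image, failing closed on anything unexpected."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    magic, width, height = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("bad magic value")
    if width == 0 or height == 0 or width * height > MAX_PIXELS:
        raise ValueError("unreasonable dimensions")
    # Each pixel is four 16-bit channels, so the total length is fully determined.
    expected = HEADER.size + width * height * 8
    if len(data) != expected:
        raise ValueError("payload length does not match header")
    channels = struct.unpack_from(">%dH" % (width * height * 4), data, HEADER.size)
    return width, height, channels
```

The whole grammar fits in a handful of checks, and every input either matches exactly or is rejected. That is the property that is hard to get from a format like PNG.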
For example, consider the PNG image format, which is complex and whose [C
implementation has suffered from memory corruption bugs in the
......@@ -127,21 +167,51 @@ only process objects adhering to a well-defined, generally low-complexity
grammar. This is a big part of why [we like for Mojo messages to use structured
types](mojo.md#Use-structured-types).
For example, it would be safe enough to convert a PNG to an SkBitmap in a
sandboxed process, and then send the `SkBitmap` to a higher-privileged process
via IPC. Although there may be bugs in the IPC message deserialization code
and/or in Skia's `SkBitmap` handling code, we consider this safe enough for a
few reasons:
* we must accept the risk of bugs in Mojo deserialization; but thankfully
* Mojo deserialization is very amenable to fuzzing;
* it's a big improvement to scope bugs to smaller areas, like deserialization
functions and very simple classes like `SkBitmap` and `SkPixmap`; and
* ultimately this process results in parsing significantly simpler grammars (PNG
→ Mojo + `SkBitmap` in this case).
> (We have to accept the risk of memory safety bugs in Mojo deserialization
> because C++'s high performance is crucial in such a throughput- and
> latency-sensitive area. If we could change this code to be both in a safer
> language and still have such high performance, that'd be ideal. But that's
> unlikely to happen soon.)
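As a rough illustration of why the normalized form is easier to trust, the sketch below shows the kind of check a higher-privileged receiver could apply to an already-decoded bitmap. It is written in Python for brevity and is not the Mojo or Skia API: `NormalizedBitmap`, `validate_bitmap`, `BYTES_PER_PIXEL`, and `MAX_DIMENSION` are all hypothetical names, and the 4-bytes-per-pixel layout is an assumption.

```python
from dataclasses import dataclass

BYTES_PER_PIXEL = 4    # assumption: 8-bit RGBA, no row padding
MAX_DIMENSION = 16384  # hypothetical sanity limit


@dataclass(frozen=True)
class NormalizedBitmap:
    """Hypothetical stand-in for a decoded bitmap received over IPC."""
    width: int
    height: int
    pixels: bytes


def validate_bitmap(msg: NormalizedBitmap) -> NormalizedBitmap:
    """Check the few invariants the simple grammar has; reject everything else."""
    if not (0 < msg.width <= MAX_DIMENSION and 0 < msg.height <= MAX_DIMENSION):
        raise ValueError("dimensions out of range")
    if len(msg.pixels) != msg.width * msg.height * BYTES_PER_PIXEL:
        raise ValueError("pixel buffer does not match dimensions")
    return msg
```

Compared with a PNG decoder, the receiver's entire "parser" is two comparisons, which keeps the code that runs at high privilege small enough to review and fuzz thoroughly.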
### Safe Languages
Where possible, it's great to use a memory-safe language. Of the currently
approved set of implementation languages in Chromium, the most likely candidates
are Java (on Android only) and JavaScript (although we don't currently use it in
high-privilege processes like the browser). One can imagine Swift on iOS or
Kotlin on Android, too, although they are not currently used in Chromium.
Kotlin on Android, too, although they are not currently used in Chromium. (Some
of us on Security Team aspire to get more of Chromium in safer languages, but
that's a long-term, heavy lift.)
For an example of image processing, we have the pure-Java class
[BaseGifImage](https://cs.chromium.org/chromium/src/third_party/gif_player/src/jp/tomorrowkey/android/gifplayer/BaseGifImage.java?rcl=27febd503d1bab047d73df26db83184fff8d6620&l=27).
On Android, where we can use Java and also face a particularly high cost for
creating new processes (necessary for sandboxing), using Java to decode tricky
formats can be a great approach. We do a similar thing with the pure-Java
[JsonSanitizer](https://cs.chromium.org/chromium/src/services/data_decoder/public/cpp/android/java/src/org/chromium/services/data_decoder/JsonSanitizer.java),
to 'vet' incoming JSON in a memory-safe way before passing the input to the C++
JSON implementation.
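The "vet first, then hand off" pattern can be sketched in any memory-safe language. The Python example below is only an illustration of the idea and does not reproduce what `JsonSanitizer` actually does: it parses the input with the standard `json` module, rejects non-standard constants and overly deep nesting, and re-serializes the result in a compact form. The names `sanitize_json` and `MAX_DEPTH` are hypothetical.

```python
import json

MAX_DEPTH = 64  # hypothetical nesting limit


def _reject_constant(name):
    # json.loads calls this for NaN, Infinity, and -Infinity, which are not valid JSON.
    raise ValueError("non-standard constant: " + name)


def _check_depth(value, level=1):
    """Reject structures nested more deeply than MAX_DEPTH."""
    if level > MAX_DEPTH:
        raise ValueError("nesting too deep")
    if isinstance(value, dict):
        for child in value.values():
            _check_depth(child, level + 1)
    elif isinstance(value, list):
        for child in value:
            _check_depth(child, level + 1)


def sanitize_json(text: str) -> str:
    """Return a re-serialized copy of the input, or raise ValueError."""
    try:
        value = json.loads(text, parse_constant=_reject_constant)
    except (ValueError, RecursionError) as exc:
        raise ValueError("rejected: not well-formed JSON") from exc
    _check_depth(value)
    # Re-emit from the parsed value so the consumer only sees output we generated.
    return json.dumps(value, separators=(",", ":"))
```

Because the downstream parser only ever sees text produced by the serializer, it is exposed to a much narrower range of inputs than the original, attacker-controlled bytes.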
## Existing Code That Violates The Rule
Obviously, we still have a lot of code that violates this rule. For example,
until very recently, all of the network stack was in the browser process, and
its whole job is to parse complex and untrustworthy inputs (TLS, QUIC, HTTP,
DNS, X.509, and more). This dangerous combination is why bugs in that area of
code are often of Critical severity:
We still have a lot of code that violates this rule. For example, until very
recently, all of the network stack was in the browser process, and its whole job
is to parse complex and untrustworthy inputs (TLS, QUIC, HTTP, DNS, X.509, and
more). This dangerous combination is why bugs in that area of code are often of
Critical severity:
* [OOB Write in `QuicStreamSequencerBuffer::OnStreamData`](https://bugs.chromium.org/p/chromium/issues/detail?id=778505)
* [Stack Buffer Overflow in `QuicClientPromisedInfo::OnPromiseHeaders`](https://bugs.chromium.org/p/chromium/issues/detail?id=777728)
......