Answer more of people's questions about the Rule Of 2.

Or, try to. Also give more examples.

Bug: None
Change-Id: If7d57944aa61e017bb4b6a7d144ded3b51b04318
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1497366
Reviewed-by: Robert Sesek <rsesek@chromium.org>
Reviewed-by: Emily Stark <estark@chromium.org>
Commit-Queue: Chris Palmer <palmer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#638254}

Commit 80708039 authored Mar 06, 2019 by Chris Palmer Committed by Commit Bot Mar 06, 2019

Showing 1 changed file with 90 additions and 20 deletions

docs/security/rule-of-2.md +90 -20

--- a/docs/security/rule-of-2.md
+++ b/docs/security/rule-of-2.md
@@ -15,29 +15,67 @@ When code that handles untrustworthy inputs at high privilege has bugs, the
 resulting vulnerabilities are typically of Critical or High severity. (See our
 [Severity Guidelines](severity-guidelines.md).) We'd love to reduce the severity
 of such bugs by reducing the amount of damage they can do (lowering their
-privilege), avoiding the classes of memory corruption bugs (using a safe
+privilege), avoiding the various types of memory corruption bugs (using a safe
 language), or reducing the likelihood that the input is malicious (asserting the
 trustworthiness of the source).

+For the purposes of this document, our main concern is reducing (and hopefully,
+ultimately eliminating) bugs that arise due to _memory unsafety_. [A recent
+study by Matt Miller from Microsoft
+Security](https://github.com/Microsoft/MSRC-Security-Research/blob/master/presentations/2019_02_BlueHatIL/2019_01%20-%20BlueHatIL%20-%20Trends%2C%20challenge%2C%20and%20shifts%20in%20software%20vulnerability%20mitigation.pdf)
+states that "~70% of the vulnerabilities addressed through a security update
+each year continue to be memory safety issues". A trip through Chromium's bug
+tracker will show many, many vulnerabilities whose root cause is memory
+unsafety. (For example, [Type=Bug-Security
+sanitizer](https://bugs.chromium.org/p/chromium/issues/list?can=1&q=Type%3DBug-Security+sanitizer&colspec=ID+Pri+M+Stars+ReleaseBlock+Component+Status+Owner+Summary+OS+Modified&x=m&y=releaseblock&cells=ids).)
+
+Security engineers in general, very much including Chrome Security Team, would
+like to advance the state of engineering to where memory safety issues are much
+more rare. Then, we could focus more attention on the application-semantic
+vulnerabilities. 😊 That would be a big improvement.
+
 ## What?

+Some definitions are in order.
+
+### Untrustworthy Inputs
+
 _Untrustworthy inputs_ are inputs that

-  * have non-trivial grammars; or
+  * have non-trivial grammars; and/or
  * come from untrustworthy sources.

+If there were an input type so simple that it were straightforward to write a
+memory-safe handler for it, we wouldn't need to worry much about where it came
+from **for the purposes of memory safety**, because we'd be sure we could handle
+it. We would still need to treat the input as untrustworthy after
+parsing, of course.
+
 Unfortunately, it is very rare to find a grammar trivial enough that we can
 trust ourselves to parse it successfully or fail safely. (But see
-[Normalization](#Normalization) for a potential example.)
-
-Obviously, any arbitrary peer on the Internet is an untrustworthy source without
-some evidence of trustworthiness (which includes at least [a strong assertion of
-the source's identity](#verifying-the-trustworthiness-of-a-source)).
-
-_Unsafe implementation languages_ are languages that lack
-[memory safety](https://en.wikipedia.org/wiki/Memory_safety), including at least
-C, C++, and assembly language. Memory-safe languages include Go, Rust, Python,
-Java, JavaScript, Kotlin, and Swift.
+[Normalization](#Normalization) for a potential example.) Therefore, we do need
+to concern ourselves with the provenance of such inputs.
+
+Any arbitrary peer on the Internet is an untrustworthy source, unless we get
+some evidence of its trustworthiness (which includes at least [a strong
+assertion of the source's
+identity](#verifying-the-trustworthiness-of-a-source)). When we can know with
+certainty that an input is coming from the same source as the application itself
+(e.g. Google in the case of Chrome, or Mozilla in the case of Firefox), and that
+the transport is integrity-protected (such as with HTTPS), then it can be
+acceptable to parse even complex inputs from that source. It's still ideal,
+where feasible, to not have to trust the source — such as by parsing the input
+in a sandbox.
+
+### Unsafe Implementation Languages
+
+_Unsafe implementation languages_ are languages that lack [memory
+safety](https://en.wikipedia.org/wiki/Memory_safety), including at least C, C++,
+and assembly language. Memory-safe languages include Go, Rust, Python, Java,
+JavaScript, Kotlin, and Swift. (Note that the safe subsets of these languages
+are safe by design, but of course implementation quality is a different story.)
+
+### High Privilege

 _High privilege_ is a relative term. The very highest-privilege programs are the
 computer's firmware, the bootloader, the kernel, any hypervisor or virtual
@@ -99,8 +137,10 @@ _normal_ or _minimal_ form, usually by first transforming it into a format with
 a simpler grammar. We say that all data, file, and wire formats are defined by a
 _grammar_, even if that grammar is implicit or only partially-specified (as is
 so often the case). A file format with a particularly simple grammar is
-[Farbfeld](https://tools.suckless.org/farbfeld/) (see the table at the top).
-It's rare to find such a simple grammar, however.
+[Farbfeld](https://tools.suckless.org/farbfeld/) (the grammar is represented in
+the table at the top).
+
+It's rare to find such a simple grammar for input formats, however.

 For example, consider the PNG image format, which is complex and whose [C
 implementation has suffered from memory corruption bugs in the
@@ -127,21 +167,51 @@ only process objects adhering to a well-defined, generally low-complexity
 grammar. This is a big part of why [we like for Mojo messages to use structured
 types](mojo.md#Use-structured-types).

+For example, it would be safe enough to convert a PNG to an SkBitmap in a
+sandboxed process, and then send the `SkBitmap` to a higher-privileged process
+via IPC. Although there may be bugs in the IPC message deserialization code
+and/or in Skia's `SkBitmap` handling code, we consider this safe enough for a
+few reasons:
+
+* we must accept the risk of bugs in Mojo deserialization; but thankfully
+* Mojo deserialization is very amenable to fuzzing;
+* it's a big improvement to scope bugs to smaller areas, like deserialization
+  functions and very simple classes like `SkBitmap` and `SkPixmap`; and
+* ultimately this process results in parsing significantly simpler grammars (PNG
+  → Mojo + `SkBitmap` in this case).
+
+> (We have to accept the risk of memory safety bugs in Mojo deserialization
+> because C++'s high performance is crucial in such a throughput- and
+> latency-sensitive area. If we could change this code to be both in a safer
+> language and still have such high performance, that'd be ideal. But that's
+> unlikely to happen soon.)
+
 ### Safe Languages

 Where possible, it's great to use a memory-safe language. Of the currently
 approved set of implementation languages in Chromium, the most likely candidates
 are Java (on Android only) and JavaScript (although we don't currently use it in
 high-privilege processes like the browser). One can imagine Swift on iOS or
-Kotlin on Android, too, although they are not currently used in Chromium.
+Kotlin on Android, too, although they are not currently used in Chromium. (Some
+of us on Security Team aspire to get more of Chromium in safer languages, but
+that's a long-term, heavy lift.)
+
+For an example of image processing, we have the pure-Java class
+[BaseGifImage](https://cs.chromium.org/chromium/src/third_party/gif_player/src/jp/tomorrowkey/android/gifplayer/BaseGifImage.java?rcl=27febd503d1bab047d73df26db83184fff8d6620&l=27).
+On Android, where we can use Java and also face a particularly high cost for
+creating new processes (necessary for sandboxing), using Java to decode tricky
+formats can be a great approach. We do a similar thing with the pure-Java
+[JsonSanitizer](https://cs.chromium.org/chromium/src/services/data_decoder/public/cpp/android/java/src/org/chromium/services/data_decoder/JsonSanitizer.java),
+to 'vet' incoming JSON in a memory-safe way before passing the input to the C++
+JSON implementation.

 ## Existing Code That Violates The Rule

-Obviously, we still have a lot of code that violates this rule. For example,
-until very recently, all of the network stack was in the browser process, and
-its whole job is to parse complex and untrustworthy inputs (TLS, QUIC, HTTP,
-DNS, X.509, and more). This dangerous combination is why bugs in that area of
-code are often of Critical severity:
+We still have a lot of code that violates this rule. For example, until very
+recently, all of the network stack was in the browser process, and its whole job
+is to parse complex and untrustworthy inputs (TLS, QUIC, HTTP, DNS, X.509, and
+more). This dangerous combination is why bugs in that area of code are often of
+Critical severity:

  * [OOB Write in `QuicStreamSequencerBuffer::OnStreamData`](https://bugs.chromium.org/p/chromium/issues/detail?id=778505)
  * [Stack Buffer Overflow in `QuicClientPromisedInfo::OnPromiseHeaders`](https://bugs.chromium.org/p/chromium/issues/detail?id=777728)