Commit e555e608 authored by Daniel Cheng's avatar Daniel Cheng Committed by Commit Bot

Fix Mojo backcompat check to handle UTF-8 encoded files (take 2).

https://crrev.com/c/2223973 attempted to fix this check to handle UTF-8
encoded files. However, it appears that Python2's open() will sometimes
return a bytestring for a file containing UTF-8 encoded text. encode()
will then assume the bytestring is encoded using the default 'ascii'
encoding, which then explodes upon seeing any byte with a high byte set
(e.g. any character that's not representable in 7-bit ASCII).

All mojom files in Chrome should be encoded as UTF-8. Since Python 2.7,
io.open() has identical functionality to Python 3's open(), so by using
that and specifying the encoding argument, the UTF-8 assumption can be
explicitly encoded. This makes it safe to unconditionally call encode()
to generate a bytestring to pass into the Mojo parser.

Bonus: this CL adds some additional gymanstics for exceptions to make it
easier to find the parse failure by using six.reraise().

Note: the original issue discussed in the CL description was fixed in a
different way by https://crrev.com/c/2225327, which uses codecs.open().
However, Python 3's open() is an alias for io.open(), which has somewhat
different behavior from codecs.open().

Change-Id: I86be3c5bc97c78078fe212162eb88e519180c3b0
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2226093Reviewed-by: default avatarKen Rockot <rockot@google.com>
Reviewed-by: default avatarFergal Daly <fergal@chromium.org>
Commit-Queue: Daniel Cheng <dcheng@chromium.org>
Cr-Commit-Position: refs/heads/master@{#774937}
parent e4627b85
......@@ -12,12 +12,13 @@ This can be used e.g. by a presubmit check to prevent developers from making
breaking changes to stable mojoms."""
import argparse
import codecs
import errno
import io
import json
import os
import os.path
import shutil
import six
import sys
import tempfile
......@@ -26,6 +27,10 @@ from mojom.generate import translate
from mojom.parse import parser
class ParseError(Exception):
pass
def _ValidateDelta(root, delta):
"""Parses all modified mojoms (including all transitive mojom dependencies,
even if unmodified) to perform backward-compatibility checks on any types
......@@ -61,10 +66,16 @@ def _ValidateDelta(root, delta):
modules = override_modules
else:
modules = unmodified_modules
with codecs.open(os.path.join(root, mojom), encoding='utf-8') as f:
contents = ''.join(f.readlines())
ast = parser.Parse(contents, mojom)
with io.open(os.path.join(root, mojom), encoding='utf-8') as f:
contents = f.read()
try:
ast = parser.Parse(contents, mojom)
except Exception as e:
six.reraise(
ParseError,
'encountered exception {0} while parsing {1}'.format(e, mojom),
sys.exc_info()[2])
for imp in ast.import_list:
parseMojom(imp.import_filename, file_overrides, override_modules)
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment