Commit e555e608 authored by Daniel Cheng's avatar Daniel Cheng Committed by Commit Bot

Fix Mojo backcompat check to handle UTF-8 encoded files (take 2).

https://crrev.com/c/2223973 attempted to fix this check to handle UTF-8
encoded files. However, it appears that Python2's open() will sometimes
return a bytestring for a file containing UTF-8 encoded text. encode()
will then assume the bytestring is encoded using the default 'ascii'
encoding, which then explodes upon seeing any byte with a high byte set
(e.g. any character that's not representable in 7-bit ASCII).

All mojom files in Chrome should be encoded as UTF-8. Since Python 2.7,
io.open() has identical functionality to Python 3's open(), so by using
that and specifying the encoding argument, the UTF-8 assumption can be
explicitly encoded. This makes it safe to unconditionally call encode()
to generate a bytestring to pass into the Mojo parser.

Bonus: this CL adds some additional gymanstics for exceptions to make it
easier to find the parse failure by using six.reraise().

Note: the original issue discussed in the CL description was fixed in a
different way by https://crrev.com/c/2225327, which uses codecs.open().
However, Python 3's open() is an alias for io.open(), which has somewhat
different behavior from codecs.open().

Change-Id: I86be3c5bc97c78078fe212162eb88e519180c3b0
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2226093Reviewed-by: default avatarKen Rockot <rockot@google.com>
Reviewed-by: default avatarFergal Daly <fergal@chromium.org>
Commit-Queue: Daniel Cheng <dcheng@chromium.org>
Cr-Commit-Position: refs/heads/master@{#774937}
parent e4627b85
...@@ -12,12 +12,13 @@ This can be used e.g. by a presubmit check to prevent developers from making ...@@ -12,12 +12,13 @@ This can be used e.g. by a presubmit check to prevent developers from making
breaking changes to stable mojoms.""" breaking changes to stable mojoms."""
import argparse import argparse
import codecs
import errno import errno
import io
import json import json
import os import os
import os.path import os.path
import shutil import shutil
import six
import sys import sys
import tempfile import tempfile
...@@ -26,6 +27,10 @@ from mojom.generate import translate ...@@ -26,6 +27,10 @@ from mojom.generate import translate
from mojom.parse import parser from mojom.parse import parser
class ParseError(Exception):
pass
def _ValidateDelta(root, delta): def _ValidateDelta(root, delta):
"""Parses all modified mojoms (including all transitive mojom dependencies, """Parses all modified mojoms (including all transitive mojom dependencies,
even if unmodified) to perform backward-compatibility checks on any types even if unmodified) to perform backward-compatibility checks on any types
...@@ -61,10 +66,16 @@ def _ValidateDelta(root, delta): ...@@ -61,10 +66,16 @@ def _ValidateDelta(root, delta):
modules = override_modules modules = override_modules
else: else:
modules = unmodified_modules modules = unmodified_modules
with codecs.open(os.path.join(root, mojom), encoding='utf-8') as f: with io.open(os.path.join(root, mojom), encoding='utf-8') as f:
contents = ''.join(f.readlines()) contents = f.read()
ast = parser.Parse(contents, mojom) try:
ast = parser.Parse(contents, mojom)
except Exception as e:
six.reraise(
ParseError,
'encountered exception {0} while parsing {1}'.format(e, mojom),
sys.exc_info()[2])
for imp in ast.import_list: for imp in ast.import_list:
parseMojom(imp.import_filename, file_overrides, override_modules) parseMojom(imp.import_filename, file_overrides, override_modules)
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment