Commit 328e57ca authored by Janusz Majnert, committed by Commit Bot

Accept non utf-8 characters with python3

If input file contains non-utf-8 characters, running
GetResourceAllowlistFileList in python3 fails with a variation of:
"UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 24: invalid start byte"

This change makes python3 treat data as bytes, while leaving
python2 behavior unchanged.

Change-Id: Idb6f4cf8fe3a98551df91b1249a1b74ba2916acc
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2560505
Reviewed-by: Andrew Grieve <agrieve@chromium.org>
Commit-Queue: Andrew Grieve <agrieve@chromium.org>
Cr-Commit-Position: refs/heads/master@{#831015}
parent 3ae74629
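
The failure described in the commit message can be reproduced in isolation. The sketch below is illustrative only and is not part of the commit: the sample bytes and the helper names (`SAMPLE`, `read_text_utf8`, `read_bytes`, `demo`) are hypothetical. It assumes that text-mode decoding uses UTF-8, which is made explicit here with `encoding="utf-8"`; the original `open(p)` call relies on the platform default instead.

```python
import os
import tempfile

# Hypothetical sample: an allowlist marker surrounded by bytes that are not valid UTF-8.
SAMPLE = b"\x80\x81 AllowlistedResource<1234> \xff"


def read_text_utf8(path):
    # Text mode: Python 3 decodes the file contents and raises on invalid bytes.
    with open(path, encoding="utf-8") as f:
        return f.read()


def read_bytes(path):
    # Binary mode: no decoding happens, so any byte sequence is accepted.
    with open(path, "rb") as f:
        return f.read()


def demo():
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(SAMPLE)
        try:
            read_text_utf8(path)
            text_failed = False
        except UnicodeDecodeError:
            text_failed = True
        data = read_bytes(path)
    finally:
        os.remove(path)
    return text_failed, data
```

Under these assumptions the text-mode read raises `UnicodeDecodeError`, while the binary read returns the bytes unchanged, so the marker can still be found with a bytes search.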
@@ -111,15 +111,15 @@ def GetResourceAllowlistFileList(file_list_path):
   paths = ar.ExpandThinArchives(paths)
   resource_ids = set()
-  prefix = 'AllowlistedResource<'
+  prefix = b'AllowlistedResource<'
   for p in paths:
-    with open(p) as f:
+    with open(p, 'rb') as f:
       data = f.read()
     start_idx = 0
     while start_idx != -1:
       start_idx = data.find(prefix, start_idx)
       if start_idx != -1:
-        end_idx = data.find('>', start_idx)
+        end_idx = data.find(b'>', start_idx)
         resource_ids.add(int(data[start_idx + len(prefix):end_idx]))
         start_idx = end_idx
   return resource_ids
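
The scanning loop in this change can be tried on its own with an in-memory bytes value. The following is a minimal sketch rather than the code from the commit: the function name `extract_resource_ids` is hypothetical, the loop is restructured slightly, and the guard for an unterminated marker is an addition that does not appear in the diff.

```python
def extract_resource_ids(data, prefix=b"AllowlistedResource<"):
    """Collect the integer IDs that follow each prefix marker in a bytes blob."""
    resource_ids = set()
    start_idx = data.find(prefix)
    while start_idx != -1:
        end_idx = data.find(b">", start_idx)
        if end_idx == -1:
            # Unterminated marker: stop scanning (guard added for this sketch).
            break
        # int() accepts ASCII digits given as bytes, so no decoding is needed.
        resource_ids.add(int(data[start_idx + len(prefix):end_idx]))
        start_idx = data.find(prefix, end_idx)
    return resource_ids
```

Because both the prefix and the closing `>` are bytes literals, the search never has to decode the surrounding data, which is why invalid UTF-8 elsewhere in the file no longer matters.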