Commit f9b89e7b authored by lukasza, committed by Commit bot

Split run_tool.py into run_tool.py, extract_edits.py and apply_edits.py

The split will allow generation of edits on multiple configs (e.g. linux
vs windows OR rel vs dbg) and merging the edits before applying them once:
    $ tools/clang/scripts/run_tool.py rewrite_to_chrome_style \
        --generate-compdb --all out/rel >run_tool.linux.rel.out
    $ ...
    $ cat run_tool.*.out \
        | tools/clang/scripts/extract_edits.py \
        | tools/clang/scripts/apply_edits.py out/rel

Test steps:
- tools/clang/translation_unit/test_translation_unit.py
- tools/clang/scripts/test_tool.py rewrite_to_chrome_style
- manually running run_tool | extract_edits | apply_edits pipeline
  on WTF and verifying that it still builds after the rename

BUG=598138
TEST=See "Test steps" above.

Review-Url: https://codereview.chromium.org/2599193002
Cr-Commit-Position: refs/heads/master@{#440881}
parent d5c112c4
......@@ -69,14 +69,14 @@ represents one edit. Fields are separated by `:::`, and the first field must
be `r` (for replacement). In the future, this may be extended to handle header
insertion/removal. A deletion is an edit with no replacement text.
The edits are applied by [`run_tool.py`](#Running), which understands certain
The edits are applied by [`apply_edits.py`](#Running), which understands certain
conventions:
* The tool should munge newlines in replacement text to `\0`. The script
* The clang tool should munge newlines in replacement text to `\0`. The script
knows to translate `\0` back to newlines when applying edits.
* When removing an element from a 'list' (e.g. function parameters,
initializers), the tool should emit a deletion for just the element. The
script understands how to extend the deletion to remove commas, etc. as
initializers), the clang tool should emit a deletion for just the element.
The script understands how to extend the deletion to remove commas, etc. as
needed.
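As an illustration of the edit format and the `\0` convention described above, here is a minimal sketch of parsing a single edit line. It is not part of the commit: `parse_edit_line`, the `ParsedEdit` tuple and the sample path are hypothetical names, and the field order (type, path, offset, length, replacement) is assumed from how `apply_edits.py` splits each line.

```python
import collections

# Hypothetical container for illustration; the field order mirrors the split
# done in apply_edits.py (type, path, offset, length, replacement).
ParsedEdit = collections.namedtuple(
    'ParsedEdit', ('edit_type', 'path', 'offset', 'length', 'replacement'))


def parse_edit_line(line):
  """Parses one `r:::<path>:::<offset>:::<length>:::<replacement>` line."""
  # Split at most 4 times so that ':::' inside the replacement text survives.
  edit_type, path, offset, length, replacement = (
      line.rstrip('\r\n').split(':::', 4))
  if edit_type != 'r':
    raise ValueError('unsupported edit type: %r' % edit_type)
  # The clang tool encodes newlines as NUL characters; translate them back.
  replacement = replacement.replace('\0', '\n')
  return ParsedEdit(edit_type, path, int(offset), int(length), replacement)


# A replacement that spans two lines, and a deletion (empty replacement).
two_lines = parse_edit_line('r:::../../base/foo.cc:::120:::3:::int x;\0int y;')
deletion = parse_edit_line('r:::../../base/foo.cc:::200:::7:::')
```

A malformed line (too few fields, or a non-numeric offset) raises `ValueError`, which is the exception the real script catches in order to report unparsable edits.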
TODO: Document more about `SourceLocation` and how spelling loc differs from
......@@ -118,6 +118,12 @@ that are generated as part of the build:
```shell
ninja -C out/Debug # For non-Windows
ninja -d keeprsp -C out/Debug # For Windows
# experimental alternative:
gen_targets=$(ninja -C out/Debug -t targets all \
    | grep '^gen/[^: ]*\.[ch][pc]*:' \
    | cut -f 1 -d :)
ninja -C out/Debug $gen_targets
```
On Windows, generate the compile DB first, and after making any source changes.
......@@ -127,28 +133,53 @@ Then omit the `--generate-compdb` in later steps.
tools/clang/scripts/generate_win_compdb.py out/Debug
```
Then run the actual tool:
Then run the actual clang tool to generate a list of edits:
```shell
tools/clang/scripts/run_tool.py <toolname> \
--generate-compdb \
out/Debug <path 1> <path 2> ...
out/Debug <path 1> <path 2> ... >/tmp/list-of-edits.debug
```
`--generate-compdb` can be omitted if the compile DB was already generated and
the list of build flags and source files has not changed since generation.
`<path 1>`, `<path 2>`, etc. are optional arguments to filter the files to run
the tool across. This is helpful when sharding global refactorings into smaller
the tool against. This is helpful when sharding global refactorings into smaller
chunks. For example, the following command will run the `empty_string` tool
across just the files in `//base`:
against just the `.c`, `.cc`, `.cpp`, `.m`, `.mm` files in `//net`. Note that
the filtering is not applied to the *output* of the tool - the tool can emit
edits that apply to files outside of `//net` (e.g. edits that apply to headers
from `//base` that got included by source files in `//net`).
```shell
tools/clang/scripts/run_tool.py empty_string \
--generate-compdb \
out/Debug base
out/Debug net >/tmp/list-of-edits.debug
```
Note that some header files might only be included from generated files (e.g.
only from some `.cpp` files under `out/Debug/gen`). To make sure that the
contents of such header files are processed by the clang tool, the clang tool
needs to be run against the generated files. The only way to accomplish this
today is to pass the `--all` switch to `run_tool.py` - this will run the clang
tool against all the sources from the compilation database.
Finally, apply the edits as follows:
```shell
cat /tmp/list-of-edits.debug \
| tools/clang/scripts/extract_edits.py \
| tools/clang/scripts/apply_edits.py out/Debug <path 1> <path 2> ...
```
The `apply_edits.py` tool will only apply edits to files actually under control
of `git`. `<path 1>`, `<path 2>`, etc. are optional arguments to further filter
the files that the edits are applied to. Note that the semantics of these
filters are distinctly different from those of the `run_tool.py` filters - one
set of filters controls which files are edited, the other set of filters
controls which files the clang tool is run against.
## Debugging
Dumping the AST for a file:
......
#!/usr/bin/env python
# Copyright (c) 2013 The Chromium Authors. All rights reserved.
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.
"""Applies edits generated by a clang tool that was run on Chromium code.

Synopsis:

  cat run_tool.out | extract_edits.py | apply_edits.py <build dir> <filters...>

For example - to apply edits only to WTF sources:

  ... | apply_edits.py out/gn third_party/WebKit/Source/wtf

In addition to filters specified on the command line, the tool also skips edits
that apply to files that are not covered by git.
"""

import argparse
import collections
import functools
import multiprocessing
import os
import os.path
import subprocess
import sys

script_dir = os.path.dirname(os.path.realpath(__file__))
tool_dir = os.path.abspath(os.path.join(script_dir, '../pylib'))
sys.path.insert(0, tool_dir)

from clang import compile_db

Edit = collections.namedtuple('Edit',
                              ('edit_type', 'offset', 'length', 'replacement'))


def _GetFilesFromGit(paths=None):
  """Gets the list of files in the git repository.

  Args:
    paths: Prefix filter for the returned paths. May contain multiple entries.
  """
  args = []
  if sys.platform == 'win32':
    args.append('git.bat')
  else:
    args.append('git')
  args.append('ls-files')
  if paths:
    args.extend(paths)
  command = subprocess.Popen(args, stdout=subprocess.PIPE)
  output, _ = command.communicate()
  return [os.path.realpath(p) for p in output.splitlines()]


def _ParseEditsFromStdin(build_directory):
  """Parses the list of edits (one edit per line) read from stdin.

  The expected format is documented at the top of this file.

  Args:
    build_directory: Directory that contains the compile database. Used to
      normalize the filenames.

  Returns:
    A dictionary mapping filenames to the associated edits.
  """
  path_to_resolved_path = {}
  def _ResolvePath(path):
    if path in path_to_resolved_path:
      return path_to_resolved_path[path]

    if not os.path.isfile(path):
      resolved_path = os.path.realpath(os.path.join(build_directory, path))
    else:
      resolved_path = path

    if not os.path.isfile(resolved_path):
      sys.stderr.write('Edit applies to a non-existent file: %s\n' % path)
      resolved_path = None

    path_to_resolved_path[path] = resolved_path
    return resolved_path

  edits = collections.defaultdict(list)
  for line in sys.stdin:
    line = line.rstrip("\n\r")
    try:
      edit_type, path, offset, length, replacement = line.split(':::', 4)
      replacement = replacement.replace('\0', '\n')
      path = _ResolvePath(path)
      if not path: continue
      edits[path].append(Edit(edit_type, int(offset), int(length), replacement))
    except ValueError:
      sys.stderr.write('Unable to parse edit: %s\n' % line)
  return edits


def _ApplyEditsToSingleFile(filename, edits):
  # Sort the edits and iterate through them in reverse order. Sorting allows
  # duplicate edits to be quickly skipped, while reversing means that
  # subsequent edits don't need to have their offsets updated with each edit
  # applied.
  edit_count = 0
  error_count = 0
  edits.sort()
  last_edit = None
  with open(filename, 'rb+') as f:
    contents = bytearray(f.read())
    for edit in reversed(edits):
      if edit == last_edit:
        continue
      if (last_edit is not None and edit.edit_type == last_edit.edit_type and
          edit.offset == last_edit.offset and edit.length == last_edit.length):
        sys.stderr.write(
            'Conflicting edit: %s at offset %d, length %d: "%s" != "%s"\n' %
            (filename, edit.offset, edit.length, edit.replacement,
             last_edit.replacement))
        error_count += 1
        continue

      last_edit = edit
      contents[edit.offset:edit.offset + edit.length] = edit.replacement
      if not edit.replacement:
        _ExtendDeletionIfElementIsInList(contents, edit.offset)
      edit_count += 1
    f.seek(0)
    f.truncate()
    f.write(contents)
  return (edit_count, error_count)


def _ApplyEdits(edits):
  """Apply the generated edits.

  Args:
    edits: A dict mapping filenames to Edit instances that apply to that file.
  """
  edit_count = 0
  error_count = 0
  done_files = 0
  for k, v in edits.iteritems():
    tmp_edit_count, tmp_error_count = _ApplyEditsToSingleFile(k, v)
    edit_count += tmp_edit_count
    error_count += tmp_error_count
    done_files += 1
    percentage = (float(done_files) / len(edits)) * 100
    sys.stderr.write('Applied %d edits (%d errors) to %d files [%.2f%%]\r' %
                     (edit_count, error_count, done_files, percentage))

  sys.stderr.write('\n')
  return -error_count


_WHITESPACE_BYTES = frozenset((ord('\t'), ord('\n'), ord('\r'), ord(' ')))


def _ExtendDeletionIfElementIsInList(contents, offset):
  """Extends the range of a deletion if the deleted element was part of a list.

  This rewriter helper makes it easy for refactoring tools to remove elements
  from a list. Even if a matcher callback knows that it is removing an element
  from a list, it may not have enough information to accurately remove the list
  element; for example, another matcher callback may end up removing an adjacent
  list element, or all the list elements may end up being removed.

  With this helper, refactoring tools can simply remove the list element and not
  worry about having to include the comma in the replacement.

  Args:
    contents: A bytearray with the deletion already applied.
    offset: The offset in the bytearray where the deleted range used to be.
  """
  char_before = char_after = None
  left_trim_count = 0
  for byte in reversed(contents[:offset]):
    left_trim_count += 1
    if byte in _WHITESPACE_BYTES:
      continue
    if byte in (ord(','), ord(':'), ord('('), ord('{')):
      char_before = chr(byte)
    break

  right_trim_count = 0
  for byte in contents[offset:]:
    right_trim_count += 1
    if byte in _WHITESPACE_BYTES:
      continue
    if byte == ord(','):
      char_after = chr(byte)
    break

  if char_before:
    if char_after:
      del contents[offset:offset + right_trim_count]
    elif char_before in (',', ':'):
      del contents[offset - left_trim_count:offset]


def main():
  parser = argparse.ArgumentParser()
  parser.add_argument(
      'build_directory',
      help='path to the build dir (dir that edit paths are relative to)')
  parser.add_argument(
      'path_filter',
      nargs='*',
      help='optional paths to filter what files the tool is run on')
  args = parser.parse_args()

  filenames = set(_GetFilesFromGit(args.path_filter))
  edits = _ParseEditsFromStdin(args.build_directory)
  return _ApplyEdits(
      {k: v for k, v in edits.iteritems()
            if os.path.realpath(k) in filenames})


if __name__ == '__main__':
  sys.exit(main())
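
The reverse-order strategy described in the comment of `_ApplyEditsToSingleFile` can be seen in isolation in the following minimal sketch. It is not part of the commit: `apply_in_reverse` and the sample input are hypothetical, and the sketch leaves out conflict detection and the list-deletion handling.

```python
def apply_in_reverse(original, edits):
  """Applies (offset, length, replacement) edits to a bytes object.

  All offsets refer to the ORIGINAL contents. Applying the edits from the
  highest offset down means an earlier (lower) offset is never shifted by a
  replacement that has already been applied.
  """
  contents = bytearray(original)
  # set() drops exact duplicates, e.g. the same edit emitted for two configs.
  for offset, length, replacement in sorted(set(edits), reverse=True):
    contents[offset:offset + length] = replacement
  return bytes(contents)


original = b'int foo = bar(foo);'
edits = [
    (4, 3, b'new_foo'),   # first `foo`
    (14, 3, b'new_foo'),  # second `foo`
    (4, 3, b'new_foo'),   # duplicate of the first edit
]
result = apply_in_reverse(original, edits)
```

Had the edits been applied in ascending order, the second offset (14) would have pointed four bytes too early once the first replacement grew the buffer.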
#!/usr/bin/env python
# Copyright (c) 2016 The Chromium Authors. All rights reserved.
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.
"""Script to extract edits from clang tool output.

If a clang tool emits edits, then the edits should look like this:
    ...
    ==== BEGIN EDITS ====
    <edit1>
    <edit2>
    ...
    ==== END EDITS ====
    ...

extract_edits.py takes input that is concatenated from multiple tool invocations
and extracts just the edits. In other words, given the following input:
    ...
    ==== BEGIN EDITS ====
    <edit1>
    <edit2>
    ==== END EDITS ====
    ...
    ==== BEGIN EDITS ====
    <yet another edit1>
    <yet another edit2>
    ==== END EDITS ====
    ...
extract_edits.py would emit the following output:
    <edit1>
    <edit2>
    <yet another edit1>
    <yet another edit2>

This python script is mainly needed on Windows.
On unix this script can be replaced with running sed as follows:

    $ cat run_tool.debug.out \
        | sed '/^==== BEGIN EDITS ====$/,/^==== END EDITS ====$/{//!b};d' \
        | sort | uniq
"""


import sys


def main():
  unique_lines = set()
  inside_marker_lines = False
  for line in sys.stdin:
    line = line.rstrip("\n\r")
    if line == '==== BEGIN EDITS ====':
      inside_marker_lines = True
      continue
    if line == '==== END EDITS ====':
      inside_marker_lines = False
      continue
    if inside_marker_lines and line not in unique_lines:
      unique_lines.add(line)
      print line
  return 0


if __name__ == '__main__':
  sys.exit(main())
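
To make the merge use case from the commit message concrete, the sketch below runs the same marker-based extraction over two in-memory outputs instead of stdin. It is an illustration only: `extract_unique_edits` and the sample edit lines are hypothetical, and the two lists stand in for the saved output of `run_tool.py` on two build configurations.

```python
BEGIN_MARKER = '==== BEGIN EDITS ===='
END_MARKER = '==== END EDITS ===='


def extract_unique_edits(lines):
  """Returns the edit lines found between marker lines, without duplicates."""
  seen = set()
  inside_marker_lines = False
  edits = []
  for line in lines:
    line = line.rstrip('\r\n')
    if line == BEGIN_MARKER:
      inside_marker_lines = True
    elif line == END_MARKER:
      inside_marker_lines = False
    elif inside_marker_lines and line not in seen:
      seen.add(line)
      edits.append(line)  # first-seen order is preserved
  return edits


# Stand-ins for the saved output of the tool on two configurations.
linux_output = [
    'some diagnostic\n',
    '==== BEGIN EDITS ====\n',
    'r:::a.cc:::1:::2:::x\n',
    'r:::b.cc:::3:::4:::y\n',
    '==== END EDITS ====\n',
]
windows_output = [
    '==== BEGIN EDITS ====\r\n',
    'r:::a.cc:::1:::2:::x\r\n',  # same edit as on Linux, emitted only once
    'r:::c.cc:::5:::6:::z\r\n',
    '==== END EDITS ====\r\n',
    'trailing noise\r\n',
]
merged = extract_unique_edits(linux_output + windows_output)
```

Edits that both configurations agree on collapse into one line, while configuration-specific edits (here the one for `c.cc`) are kept, so the merged list can be applied once.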
......@@ -42,6 +42,60 @@ def _NumberOfTestsToString(tests):
  return '%d test%s' % (tests, 's' if tests != 1 else '')


def _RunToolAndApplyEdits(tools_clang_scripts_directory,
                          tool_to_test,
                          test_directory_for_tool,
                          actual_files):
  try:
    # Stage the test files in the git index. If they aren't staged, then
    # apply_edits.py will skip them when applying replacements.
    args = ['add']
    args.extend(actual_files)
    _RunGit(args)

    # Launch the following pipeline:
    #     run_tool.py ... | extract_edits.py | apply_edits.py ...
    args = ['python',
            os.path.join(tools_clang_scripts_directory, 'run_tool.py'),
            tool_to_test,
            test_directory_for_tool]
    args.extend(actual_files)
    run_tool = subprocess.Popen(args, stdout=subprocess.PIPE)

    args = ['python',
            os.path.join(tools_clang_scripts_directory, 'extract_edits.py')]
    extract_edits = subprocess.Popen(args, stdin=run_tool.stdout,
                                     stdout=subprocess.PIPE)

    args = ['python',
            os.path.join(tools_clang_scripts_directory, 'apply_edits.py'),
            test_directory_for_tool]
    apply_edits = subprocess.Popen(args, stdin=extract_edits.stdout,
                                   stdout=subprocess.PIPE)

    # Wait for the pipeline to finish running + check exit codes.
    stdout, _ = apply_edits.communicate()
    for process in [run_tool, extract_edits, apply_edits]:
      process.wait()
      if process.returncode != 0:
        print "Failure while running the tool."
        return process.returncode

    # Reformat the resulting edits via: git cl format.
    args = ['cl', 'format']
    args.extend(actual_files)
    _RunGit(args)

    return 0

  finally:
    # No matter what, unstage the git changes we made earlier to avoid polluting
    # the index.
    args = ['reset', '--quiet', 'HEAD']
    args.extend(actual_files)
    _RunGit(args)


def main(argv):
  if len(argv) < 1:
    print 'Usage: test_tool.py <clang tool>'
......@@ -76,72 +130,52 @@ def main(argv):
    print 'Tool "%s" does not have compatible test files.' % tool_to_test
    return 1

  try:
    # Set up the test environment.
    for source, actual in zip(source_files, actual_files):
      shutil.copyfile(source, actual)
    # Stage the test files in the git index. If they aren't staged, then
    # run_tools.py will skip them when applying replacements.
    args = ['add']
    args.extend(actual_files)
    _RunGit(args)
    # Generate a temporary compilation database to run the tool over.
    with open(compile_database, 'w') as f:
      f.write(_GenerateCompileCommands(actual_files, include_paths))

    args = ['python',
            os.path.join(tools_clang_scripts_directory, 'run_tool.py'),
            tool_to_test,
            test_directory_for_tool]
    args.extend(actual_files)
    run_tool = subprocess.Popen(args, stdout=subprocess.PIPE)
    stdout, _ = run_tool.communicate()
    if run_tool.returncode != 0:
      print 'run_tool failed:\n%s' % stdout
      return 1

    args = ['cl', 'format']
    args.extend(actual_files)
    _RunGit(args)

    passed = 0
    failed = 0
    for expected, actual in zip(expected_files, actual_files):
      print '[ RUN ] %s' % os.path.relpath(actual)
      expected_output = actual_output = None
      with open(expected, 'r') as f:
        expected_output = f.readlines()
      with open(actual, 'r') as f:
        actual_output = f.readlines()
      if actual_output != expected_output:
        failed += 1
        for line in difflib.unified_diff(expected_output, actual_output,
                                         fromfile=os.path.relpath(expected),
                                         tofile=os.path.relpath(actual)):
          sys.stdout.write(line)
        print '[ FAILED ] %s' % os.path.relpath(actual)
        # Don't clean up the file on failure, so the results can be referenced
        # more easily.
        continue
      print '[ OK ] %s' % os.path.relpath(actual)
      passed += 1
      os.remove(actual)

    if failed == 0:
      os.remove(compile_database)

    print '[==========] %s ran.' % _NumberOfTestsToString(len(source_files))
    if passed > 0:
      print '[ PASSED ] %s.' % _NumberOfTestsToString(passed)
    if failed > 0:
      print '[ FAILED ] %s.' % _NumberOfTestsToString(failed)
      return 1
  finally:
    # No matter what, unstage the git changes we made earlier to avoid polluting
    # the index.
    args = ['reset', '--quiet', 'HEAD']
    args.extend(actual_files)
    _RunGit(args)
  # Set up the test environment.
  for source, actual in zip(source_files, actual_files):
    shutil.copyfile(source, actual)
  # Generate a temporary compilation database to run the tool over.
  with open(compile_database, 'w') as f:
    f.write(_GenerateCompileCommands(actual_files, include_paths))

  # Run the tool.
  exitcode = _RunToolAndApplyEdits(tools_clang_scripts_directory, tool_to_test,
                                   test_directory_for_tool, actual_files)
  if (exitcode != 0):
    return exitcode

  # Compare actual-vs-expected results.
  passed = 0
  failed = 0
  for expected, actual in zip(expected_files, actual_files):
    print '[ RUN ] %s' % os.path.relpath(actual)
    expected_output = actual_output = None
    with open(expected, 'r') as f:
      expected_output = f.readlines()
    with open(actual, 'r') as f:
      actual_output = f.readlines()
    if actual_output != expected_output:
      failed += 1
      for line in difflib.unified_diff(expected_output, actual_output,
                                       fromfile=os.path.relpath(expected),
                                       tofile=os.path.relpath(actual)):
        sys.stdout.write(line)
      print '[ FAILED ] %s' % os.path.relpath(actual)
      # Don't clean up the file on failure, so the results can be referenced
      # more easily.
      continue
    print '[ OK ] %s' % os.path.relpath(actual)
    passed += 1
    os.remove(actual)

  if failed == 0:
    os.remove(compile_database)

  print '[==========] %s ran.' % _NumberOfTestsToString(len(source_files))
  if passed > 0:
    print '[ PASSED ] %s.' % _NumberOfTestsToString(passed)
  if failed > 0:
    print '[ FAILED ] %s.' % _NumberOfTestsToString(failed)
    return 1


if __name__ == '__main__':
......
......@@ -265,9 +265,5 @@ int main(int argc, const char* argv[]) {
      clang::tooling::newFrontendActionFactory<CompilationIndexerAction>();
  clang::tooling::ClangTool tool(options.getCompilations(),
                                 options.getSourcePathList());
  // This clang tool does not actually produce edits, but run_tool.py expects
  // this. So we just print an empty edit block.
  llvm::outs() << "==== BEGIN EDITS ====\n";
  llvm::outs() << "==== END EDITS ====\n";
  return tool.run(frontend_factory.get());
}