Reland "[Traffic Annotation] Fail on CQ on C++ parsing error"

This is a reland of c2b90515 TBR=rhalavati@chromium.org Original change's description: > [Traffic Annotation] Fail on CQ on C++ parsing error > > Previously, CQ would pass if extractor.py failed at parsing the C++ > declaration. With this CL, parsing errors make CQ turn red (to avoid > committing code that will make linux-annotator-rel fail). > > Also, always rerun extractor.py when there's an error to get stderr > contents. Previously, when running with --error-resilient, it wouldn't > re-run the Clang tool. This saved several minutes on CQ, but didn't > report a useful error message. Since extractor.py is _much_ faster, > the tradeoff is different: the error message is more important than > the couple of seconds saved. > > Finally: slightly improve extractor.py's error reporting when it fails > to parse the annotation declaration. This makes it more actionable and > gives better context to the CL author. > > TESTED=traffic_annotation_auditor_unittests, extractor_test.py, > annotation_tokenizer_test.py > > Bug: 1046794 > Change-Id: Id397dcfcced39fbebf72ffcdc6d6549c66c2910f > Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2028707 > Commit-Queue: Nicolas Ouellet-Payeur <nicolaso@chromium.org> > Reviewed-by: Ramin Halavati <rhalavati@chromium.org> > Cr-Commit-Position: refs/heads/master@{#737039} Bug: 1046794 Change-Id: Ia7672de311a2ab32bf7d45d7f379e60a96e92d47 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2036794 Commit-Queue: Nicolas Ouellet-Payeur <nicolaso@chromium.org> Reviewed-by: Nicolas Ouellet-Payeur <nicolaso@chromium.org> Auto-Submit: Nicolas Ouellet-Payeur <nicolaso@chromium.org> Cr-Commit-Position: refs/heads/master@{#738251}

Reland "[Traffic Annotation] Fail on CQ on C++ parsing error"
This is a reland of c2b90515 TBR=rhalavati@chromium.org Original change's description: > [Traffic Annotation] Fail on CQ on C++ parsing error > > Previously, CQ would pass if extractor.py failed at parsing the C++ > declaration. With this CL, parsing errors make CQ turn red (to avoid > committing code that will make linux-annotator-rel fail). > > Also, always rerun extractor.py when there's an error to get stderr > contents. Previously, when running with --error-resilient, it wouldn't > re-run the Clang tool. This saved several minutes on CQ, but didn't > report a useful error message. Since extractor.py is _much_ faster, > the tradeoff is different: the error message is more important than > the couple of seconds saved. > > Finally: slightly improve extractor.py's error reporting when it fails > to parse the annotation declaration. This makes it more actionable and > gives better context to the CL author. > > TESTED=traffic_annotation_auditor_unittests, extractor_test.py, > annotation_tokenizer_test.py > > Bug: 1046794 > Change-Id: Id397dcfcced39fbebf72ffcdc6d6549c66c2910f > Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2028707 > Commit-Queue: Nicolas Ouellet-Payeur <nicolaso@chromium.org> > Reviewed-by: Ramin Halavati <rhalavati@chromium.org> > Cr-Commit-Position: refs/heads/master@{#737039} Bug: 1046794 Change-Id: Ia7672de311a2ab32bf7d45d7f379e60a96e92d47 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2036794 Commit-Queue: Nicolas Ouellet-Payeur <nicolaso@chromium.org> Reviewed-by: Nicolas Ouellet-Payeur <nicolaso@chromium.org> Auto-Submit: Nicolas Ouellet-Payeur <nicolaso@chromium.org> Cr-Commit-Position: refs/heads/master@{#738251}
be1de9a4 · Nicolas Ouellet-Payeur · Commit Bot · 9e6cb6e9 · be1de9a4 · be1de9a4
Commit be1de9a4 authored Feb 04, 2020 by Nicolas Ouellet-Payeur Committed by Commit Bot Feb 04, 2020
13 changed files
--- a/tools/traffic_annotation/auditor/traffic_annotation_auditor.cc
+++ b/tools/traffic_annotation/auditor/traffic_annotation_auditor.cc
@@ -188,8 +188,8 @@ bool TrafficAnnotationAuditor::RunExtractor(
    ExtractorBackend backend,
    bool filter_files_based_on_heuristics,
    bool use_compile_commands,
-    bool rerun_on_errors,
-    const base::FilePath& errors_file) {
+    const base::FilePath& errors_file,
+    int* exit_code) {
  DCHECK(backend == ExtractorBackend::CLANG_TOOL ||
         backend == ExtractorBackend::PYTHON_SCRIPT);

@@ -243,7 +243,8 @@ bool TrafficAnnotationAuditor::RunExtractor(
  base::FilePath original_path;
  base::GetCurrentDirectory(&original_path);
  base::SetCurrentDirectory(source_path_);
-  bool result = base::GetAppOutput(cmdline, &extractor_raw_output_);
+  bool result = base::GetAppOutputWithExitCode(cmdline, &extractor_raw_output_,
+                                               exit_code);

  // If the extractor had no output, it means that the script running it could
  // not perform the task.
@@ -270,10 +271,7 @@ bool TrafficAnnotationAuditor::RunExtractor(
      std::string tool_errors;
      std::string options_file_text;

-      if (rerun_on_errors)
-        base::GetAppOutputAndError(cmdline, &tool_errors);
-      else
-        tool_errors = "Not Available.";
+      base::GetAppOutputAndError(cmdline, &tool_errors);

      if (!base::ReadFileToString(options_filepath, &options_file_text))
        options_file_text = "Could not read options file.";

--- a/tools/traffic_annotation/auditor/traffic_annotation_auditor.h
+++ b/tools/traffic_annotation/auditor/traffic_annotation_auditor.h
@@ -70,14 +70,14 @@ class TrafficAnnotationAuditor {
  // received from repository and heuristically filtered to only process the
  // relevant files. If |use_compile_commands| flag is set, the list of files is
  // extracted from compile_commands.json instead of git and will not be
-  // filtered.  If clang tool returns error, and |rerun_on_errors| is true, the
-  // tool is run again to record errors.  Errors are written to |errors_file| if
-  // it is not empty, otherwise LOG(ERROR).
+  // filtered.  If clang tool returns errors, the tool is run again to record
+  // errors.  Errors are written to |errors_file| if it is not empty, otherwise
+  // LOG(ERROR).
  bool RunExtractor(ExtractorBackend backend,
                    bool filter_files_based_on_heuristics,
                    bool use_compile_commands,
-                    bool rerun_on_errors,
-                    const base::FilePath& errors_file);
+                    const base::FilePath& errors_file,
+                    int* exit_code);

  // Parses the output of clang tool (|extractor_raw_output_|) and populates
  // |extracted_annotations_|, |extracted_calls_|, and |errors_|.

--- a/tools/traffic_annotation/auditor/traffic_annotation_auditor_ui.cc
+++ b/tools/traffic_annotation/auditor/traffic_annotation_auditor_ui.cc
@@ -2,6 +2,8 @@
 // Use of this source code is governed by a BSD-style license that can be
 // found in the LICENSE file.

+#include <stdlib.h>
+
 #include "base/files/file_util.h"
 #include "base/strings/string_number_conversions.h"
 #include "base/strings/string_split.h"
@@ -17,6 +19,10 @@

 namespace {

+// extractor.py returns 2 if it has a parsing error (from invalid C++ source
+// code). Even in error-resilient mode, this should cause a CQ failure.
+const int EX_PARSE_ERROR = 2;
+
 const char* HELP_TEXT = R"(
 Traffic Annotation Auditor
 Extracts network traffic annotations from the repository, audits them for errors
@@ -313,7 +319,7 @@ int main(int argc, char* argv[]) {
  if (command_line.HasSwitch("help") || command_line.HasSwitch("h") ||
      argc == 1) {
    printf("%s", HELP_TEXT);
-    return 1;
+    return EXIT_FAILURE;
  }

  base::FilePath build_path = command_line.GetSwitchValuePath("build-path");
@@ -341,14 +347,14 @@ int main(int argc, char* argv[]) {
          << "The value for 'limit' switch should be a positive integer.";

      // This error is always enforced, as it is a commandline switch.
-      return 1;
+      return EXIT_FAILURE;
    }
  }

  // If 'error-resilient' switch is provided, 0 will be returned in case of
  // operational errors, otherwise 1.
  bool error_resilient = command_line.HasSwitch("error-resilient");
-  int error_value = error_resilient ? 0 : 1;
+  int error_value = error_resilient ? EXIT_SUCCESS : EXIT_FAILURE;

 #if defined(OS_WIN)
  for (const auto& path : command_line.GetArgs()) {
@@ -371,7 +377,7 @@ int main(int argc, char* argv[]) {
        << "You must specify a compiled build directory to run the auditor.\n";

    // This error is always enforced, as it is a commandline switch.
-    return 1;
+    return EXIT_FAILURE;
  }

  // If source path is not provided, guess it using build path.
@@ -401,9 +407,18 @@ int main(int argc, char* argv[]) {
      auditor.ClearPathFilters();
    }

-    if (!auditor.RunExtractor(backend, filter_files, all_files,
-                              !error_resilient, errors_file)) {
-      LOG(ERROR) << "Failed to run clang tool.";
+    int extractor_exit_code = EXIT_SUCCESS;
+    if (!auditor.RunExtractor(backend, filter_files, all_files, errors_file,
+                              &extractor_exit_code)) {
+      LOG(ERROR) << "Failed to run extractor.py. (exit code "
+                 << extractor_exit_code << ")";
+      // Parsing errors cause failures, even in error-resilient mode.
+      if (extractor_exit_code == EX_PARSE_ERROR) {
+        LOG(ERROR) << "The Traffic Annotation Auditor failed to parse a "
+                   << "network annotation definition. (see CppParsingError "
+                   << "above)";
+        return EX_PARSE_ERROR;
+      }
      return error_value;
    }


--- a/tools/traffic_annotation/bin/README.md
+++ b/tools/traffic_annotation/bin/README.md
@@ -75,4 +75,4 @@ The following two lines will be updated by the above script, and the modified
 README should be committed along with the updated .sha1 checksums.

 CLANG_REVISION = '64a362e7216a43e3ad44e50a89265e72aeb14294'
-LASTCHANGE=79120883ec3483cea3995017488dff0310def70c-refs/heads/master@{#704087}
+LASTCHANGE=ae8f603e9821f418ed34b6e0908e69e350543207-refs/heads/master@{#736938}
--- a/tools/traffic_annotation/bin/linux64/traffic_annotation_auditor.sha1
+++ b/tools/traffic_annotation/bin/linux64/traffic_annotation_auditor.sha1
-c4b30752082017388670c8e22a45eee2e00e5849
\ No newline at end of file
+bac011e685683ec9cfe4d21f6bb29271f9dec05b
\ No newline at end of file
--- a/tools/traffic_annotation/bin/win32/traffic_annotation_auditor.exe.sha1
+++ b/tools/traffic_annotation/bin/win32/traffic_annotation_auditor.exe.sha1
-fb3084359026d56b2f198979a765334871ec337b
\ No newline at end of file
+1c7e532efe730de200485ad8db6f658e87ceb53b
\ No newline at end of file
--- a/tools/traffic_annotation/scripts/annotation_tokenizer.py
+++ b/tools/traffic_annotation/scripts/annotation_tokenizer.py
@@ -30,10 +30,22 @@ TOKEN_REGEXEN = [
    ('right_paren', re.compile(r'(\))')),
 ]

+# Number of characters to include in the context (for error reporting).
+CONTEXT_LENGTH = 20

 Token = namedtuple('Token', ['type', 'value', 'pos'])


+class CppParsingError(Exception):
+  """An error during C++ parsing/tokenizing."""
+
+  def __init__(self, expected_type, body, pos, file_path, line_number):
+    context = body[pos:pos + CONTEXT_LENGTH]
+    msg = ("Expected {} in annotation definition at {}:{}.\n" +
+           "near '{}'").format(expected_type, file_path, line_number, context)
+    Exception.__init__(self, msg)
+
+
 class Tokenizer:
  """Simple tokenizer with basic error reporting.

@@ -51,16 +63,20 @@ class Tokenizer:
    """Like assert(), but reports errors in a _somewhat_ useful way."""
    if token and token.type == expected_type:
      return
-    raise Exception(
-        "Expected %s in annotation definition at %s:%d.\nnear '%s'" %
-        (expected_type, self.file_path, self.line_number,
-         self.body[self.pos:self.pos+10]))
+    # Skip whitespace to make the error message more useful.
+    pos = self._skip_whitespace()
+    raise CppParsingError(expected_type, self.body, pos, self.file_path,
+                          self.line_number)
+
+  def _skip_whitespace(self):
+    """Return the position of the first non-whitespace character from here."""
+    whitespace_re = re.compile(r'\s*')
+    return whitespace_re.match(self.body, self.pos).end()

  def _get_token(self):
    """Return the token here, or None on failure."""
-    # Skip initial whitespace
-    whitespace_re = re.compile(r'\s*')
-    pos = whitespace_re.match(self.body, self.pos).end()
+    # Skip initial whitespace.
+    pos = self._skip_whitespace()

    # Find the token here, if there's one.
    token = None

--- a/tools/traffic_annotation/scripts/annotation_tokenizer_test.py
+++ b/tools/traffic_annotation/scripts/annotation_tokenizer_test.py
@@ -9,7 +9,8 @@ Unit tests for annotation_tokenizer.py.

 import unittest

-from annotation_tokenizer import Tokenizer
+from annotation_tokenizer import Tokenizer, CppParsingError
+

 class AnnotationTokenizerTest(unittest.TestCase):
  def testRealAnnotationDefinition(self):
@@ -61,19 +62,19 @@ class AnnotationTokenizerTest(unittest.TestCase):
  def testAdvanceErrorPaths(self):
    tokenizer = Tokenizer('  hello , ', 'foo.txt', 33)
    tokenizer.advance('symbol')
-    with self.assertRaisesRegexp(Exception,
+    with self.assertRaisesRegexp(CppParsingError,
                                 'Expected symbol.+at foo.txt:33'):
      # There are no more tokens.
      tokenizer.advance('symbol')

    tokenizer = Tokenizer('"hello"', 'foo.txt', 33)
-    with self.assertRaisesRegexp(Exception,
+    with self.assertRaisesRegexp(CppParsingError,
                                 'Expected comma.+at foo.txt:33'):
      # The type doesn't match.
      tokenizer.advance('comma')

    tokenizer = Tokenizer('{', 'foo.txt', 33)
-    with self.assertRaisesRegexp(Exception,
+    with self.assertRaisesRegexp(CppParsingError,
                                 'Expected string_literal.+at foo.txt:33'):
      # Not a valid token at all.
      tokenizer.advance('string_literal')

--- a/tools/traffic_annotation/scripts/extractor.py
+++ b/tools/traffic_annotation/scripts/extractor.py
@@ -13,9 +13,13 @@ import argparse
 import os
 import re
 import sys
+import traceback

 from annotation_tools import NetworkTrafficAnnotationTools
-from annotation_tokenizer import Tokenizer
+from annotation_tokenizer import Tokenizer, CppParsingError
+
+# Exit code for parsing errors. Other runtime errors return 1.
+EX_PARSE_ERROR = 2

 ANNOTATION_TYPES = {
    'DefineNetworkTrafficAnnotation': 'Definition',
@@ -235,7 +239,11 @@ def main():
  for file_path in args.file_paths:
    if not args.no_filter and os.path.abspath(file_path) not in compdb_files:
      continue
-    annotation_definitions.extend(extract_annotations(file_path))
+    try:
+      annotation_definitions.extend(extract_annotations(file_path))
+    except CppParsingError:
+      traceback.print_exc()
+      return EX_PARSE_ERROR

  # Print output.
  for annotation in annotation_definitions:

--- a/tools/traffic_annotation/scripts/extractor_test.py
+++ b/tools/traffic_annotation/scripts/extractor_test.py
@@ -61,7 +61,8 @@ class ExtractorTest(unittest.TestCase):
      (stdout, stderr) = map(dos2unix, proc.communicate())

      self.assertEqual(expected_stderr, remove_tracebacks(stderr))
-      self.assertEqual(int(bool(expected_stderr)), proc.returncode)
+      expected_returncode = 2 if expected_stderr else 0
+      self.assertEqual(expected_returncode, proc.returncode)
      self.assertEqual(expected_stdout, stdout)



--- a/tools/traffic_annotation/scripts/test_data/too_few_args-stderr.txt
+++ b/tools/traffic_annotation/scripts/test_data/too_few_args-stderr.txt
-Exception: Expected comma in annotation definition at too_few_args.cc:8.
+CppParsingError: Expected comma in annotation definition at too_few_args.cc:8.
 near ');
 '
--- a/tools/traffic_annotation/scripts/test_data/wrong_arg_type-stderr.txt
+++ b/tools/traffic_annotation/scripts/test_data/wrong_arg_type-stderr.txt
-Exception: Expected string_literal in annotation definition at wrong_arg_type.cc:8.
-near 'should_be_'
+CppParsingError: Expected string_literal in annotation definition at wrong_arg_type.cc:8.
+near 'should_be_a_string_l'
--- a/tools/traffic_annotation/scripts/test_data/wrong_arg_type.cc
+++ b/tools/traffic_annotation/scripts/test_data/wrong_arg_type.cc
@@ -5,4 +5,6 @@
 #include "net/traffic_annotation/network_traffic_annotation.h"

 net::NetworkTrafficAnnotationTag kTrafficAnnotation =
-    net::DefineNetworkTrafficAnnotation(should_be_a_string_literal, "XXX");
+    net::DefineNetworkTrafficAnnotation(
+        should_be_a_string_literal_and_has_a_long_name,
+        "XXX");