Commit b602383e, authored by Ben Pastene, committed by Commit Bot

Gracefully kill child procs of CrOS VM tests on SIGTERM.

We collect logs from the VM after tests. The problem is, if the test
hangs indefinitely, we won't get a chance to grab the logs before
swarming kills us when we reach the timeout.

Swarming sends a SIGTERM, waits for grace_period, then SIGKILLs the task
if it's still running. This CL will catch the SIGTERM and kill what
should be the frozen test process. This will let cros_run_vm_test pull
the logs before we get SIGKILLed.

Bug: 848402
Change-Id: I48be1de865e3b287584978b5461f15e2bae49dfd
Reviewed-on: https://chromium-review.googlesource.com/1087800
Commit-Queue: Ben Pastene <bpastene@chromium.org>
Reviewed-by: John Budorick <jbudorick@chromium.org>
Cr-Commit-Position: refs/heads/master@{#565357}
parent 178d8ad3
@@ -25,6 +25,7 @@
 python_version: "2.7"
 
 # Used by:
+# build/chromeos/run_vm_test.py
 # third_party/catapult
 #
 # This version must be compatible with the version range specified by
......
-#!/usr/bin/env python
+#!/usr/bin/env vpython
 #
 # Copyright 2018 The Chromium Authors. All rights reserved.
 # Use of this source code is governed by a BSD-style license that can be
@@ -10,10 +10,12 @@ import json
 import logging
 import os
 import re
+import signal
 import stat
 import subprocess
 import sys
 
+import psutil
 
 CHROMIUM_SRC_PATH = os.path.abspath(os.path.join(
     os.path.dirname(__file__), '..', '..'))
@@ -206,9 +208,26 @@ def vm_test(args):
   if not env_copy.get('GN_ARGS'):
     env_copy['GN_ARGS'] = 'is_chromeos = true'
   env_copy['PATH'] = env_copy['PATH'] + ':' + os.path.join(CHROMITE_PATH, 'bin')
-  rc = subprocess.call(
+  test_proc = subprocess.Popen(
       cros_run_vm_test_cmd, stdout=sys.stdout, stderr=sys.stderr, env=env_copy)
+
+  # Traps SIGTERM and kills all child processes of cros_run_vm_test when it's
+  # caught. This will allow us to capture logs from the VM if a test hangs
+  # and gets timeout-killed by swarming. See also:
+  # https://chromium.googlesource.com/infra/luci/luci-py/+/master/appengine/swarming/doc/Bot.md#graceful-termination_aka-the-sigterm-and-sigkill-dance
+  def _kill_child_procs(trapped_signal, _):
+    logging.warning(
+        'Received signal %d. Killing child processes of test.', trapped_signal)
+    if not test_proc or not test_proc.pid:
+      # This shouldn't happen?
+      logging.error('Test process not running.')
+      return
+    for child in psutil.Process(test_proc.pid).children():
+      logging.warning('Killing process %s', child)
+      child.kill()
+  signal.signal(signal.SIGTERM, _kill_child_procs)
+  test_proc.wait()
 
   # Create a simple json results file for the sanity test if needed. The results
   # will contain only one test ('cros_vm_sanity_test'), and will either be a
   # PASS or FAIL depending on the return code of cros_run_vm_test above.
@@ -222,7 +241,7 @@ def vm_test(args):
     with open(args.test_launcher_summary_output, 'w') as f:
       json.dump(json_results.GenerateResultsDict([run_results]), f)
 
-  return rc
+  return test_proc.returncode
 
 
 def main():
......
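
For readers who want to try the SIGTERM-trapping pattern outside of Chromium, here is a minimal standalone sketch. It is not the code from this CL: it assumes Python 3 on a POSIX system, uses only the standard library, and kills the launched process itself rather than that process's children (the CL uses psutil to enumerate the children of cros_run_vm_test so that cros_run_vm_test survives and can pull logs). The name `run_until_sigterm` is hypothetical.

```python
import signal
import subprocess
import sys


def run_until_sigterm(cmd):
  """Runs cmd and returns its exit code.

  Hypothetical helper, not part of run_vm_test.py. If this wrapper receives
  SIGTERM while cmd is still running, cmd is killed so that wait() returns
  and any cleanup placed after this call still gets a chance to run.
  """
  proc = subprocess.Popen(cmd)

  def _on_sigterm(signum, _frame):
    # Only kill the process if it has not already exited.
    if proc.poll() is None:
      proc.kill()

  # Install the handler after the process exists, mirroring the order in
  # the CL, and restore the previous handler once the process is done.
  previous = signal.signal(signal.SIGTERM, _on_sigterm)
  try:
    proc.wait()
  finally:
    signal.signal(signal.SIGTERM, previous)
  return proc.returncode


if __name__ == '__main__':
  # Example: wrap a command that would otherwise hang for a minute.
  sys.exit(run_until_sigterm(
      [sys.executable, '-c', 'import time; time.sleep(60)']))
```

Because the handler returns instead of exiting, control falls through to whatever follows the wait, which is the property the CL relies on to collect VM logs within the swarming grace period.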