Commit 1e916aa9 authored by Juan Antonio Navarro Perez, committed by Commit Bot

[soundwave] Keep only 4 months of data in soundwave study outputs

This prevents the size of the output csv from growing without bounds.
The limit on the input that DataStudio can take seems to be around
100 MiB.

Change-Id: Ib4656fe066686931135e2280e042fc832f1bb520
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1844974
Auto-Submit: Juan Antonio Navarro Pérez <perezju@chromium.org>
Commit-Queue: Sami Kyöstilä <skyostil@chromium.org>
Reviewed-by: Sami Kyöstilä <skyostil@chromium.org>
Cr-Commit-Position: refs/heads/master@{#703268}
parent 883389e7
@@ -21,6 +21,11 @@ def PostProcess(df):
   df['timestamp'] = df.groupby(
       ['test_suite', 'bot', 'point_id'])['timestamp'].transform('min')
 
+  # Prevent the size of the output from growing without bounds. Limit for
+  # DataStudio input appears to be around 100MiB.
+  four_months_ago = pandas.Timestamp.utcnow() - pandas.DateOffset(months=4)
+  df = df[df['timestamp'] > four_months_ago.tz_convert(None)].copy()
+
   # We use all runs on the latest day for each quarter as reference.
   df['quarter'] = df['timestamp'].dt.to_period('Q')
   df['reference'] = df['timestamp'].dt.date == df.groupby(
...
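The cutoff logic added in this change can be tried in isolation. The following is a minimal sketch, not the code from the commit: the helper name `keep_recent`, its `months` and `now` parameters, and the sample data are hypothetical, and it assumes the `timestamp` column holds tz-naive datetimes expressed in UTC. That assumption is why the tz-aware cutoff has its timezone dropped with `tz_convert(None)` before the comparison.

```python
import pandas


def keep_recent(df, months=4, now=None):
  """Return a copy of df keeping only rows newer than `months` months ago.

  Hypothetical helper for illustration. Assumes df['timestamp'] holds
  tz-naive datetimes expressed in UTC.
  """
  if now is None:
    now = pandas.Timestamp.now(tz='UTC')
  cutoff = now - pandas.DateOffset(months=months)
  # Drop the timezone so the cutoff compares cleanly with tz-naive values.
  cutoff = cutoff.tz_convert(None)
  # The comparison is strict: a row exactly at the cutoff is dropped.
  return df[df['timestamp'] > cutoff].copy()


# Fixed "now" so the example is deterministic; the cutoff is 2019-06-07 12:00.
now = pandas.Timestamp('2019-10-07 12:00', tz='UTC')
df = pandas.DataFrame({
    'timestamp': pandas.to_datetime([
        '2019-01-15 00:00',
        '2019-06-07 12:00',
        '2019-06-08 00:00',
        '2019-10-01 00:00',
    ]),
    'value': [1.0, 2.0, 3.0, 4.0],
})
recent = keep_recent(df, now=now)
print(recent)
```

Passing `now` explicitly keeps the example reproducible. The trailing `.copy()` mirrors the commit and keeps later column assignments from acting on a view of the original frame.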