Commit bdaedbe2 authored by Alex Turner, committed by Commit Bot

Ensure no rules are repeated before consolidation step

There are two deduplication steps currently in the filter list
generation procedure. One simply removes duplicated rules. The other
consolidates rules that differ only by the domains to which they apply.
The removal of duplicated rules should be done before the consolidation
step as the current ordering can lead to domains being repeated in a
rule.
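
To illustrate the problem, here is a minimal sketch that runs the two steps in both orders on a tiny input. The sample rules and the `consolidate` helper name are assumptions made for this illustration; the awk program itself is copied unchanged from the diff.

```sh
#!/bin/sh
# Hypothetical sample rules (not from EasyList): the first rule appears twice.
rules='||ads.example^$script,domain=a.com
||ads.example^$script,domain=a.com
||ads.example^$script,domain=b.com'

# Consolidation step, copied from the diff: merges rules that differ only
# by their domain list. The function name is illustrative.
consolidate() {
  awk -F,domain= '{ if(!length($2)) table[$1] = ""; else table[$1 FS] = length(table[$1 FS]) ? table[$1 FS] "|" $2 : $2; } END{ for (key in table) print key table[key] }'
}

# Old ordering: consolidate first, then remove duplicate lines.
old_order=$(printf '%s\n' "$rules" | consolidate | sort | uniq)

# New ordering: remove duplicate lines first, then consolidate.
new_order=$(printf '%s\n' "$rules" | sort | uniq | consolidate)

echo "old: $old_order"
echo "new: $new_order"
```

With the old ordering, the duplicated rule contributes `a.com` twice to the merged domain list, and the later `uniq` cannot undo that because the merged line is no longer a duplicate of anything. With the new ordering, each domain appears once.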

Bug: 1067711
Change-Id: I3126e28f8232c0b5eaf8c021432542fad4e5ca03
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2134776
Reviewed-by: Josh Karlin <jkarlin@chromium.org>
Commit-Queue: Alex Turner <alexmt@chromium.org>
Cr-Commit-Position: refs/heads/master@{#756809}
parent b387ea6d
@@ -91,8 +91,8 @@ An example using [EasyList](https://easylist.to/easylist/easylist.txt) follows:
Appends whitelist rules and also deduplicates rules which only differ by their set of affected domains.
```sh
1. grep ^@@ easylist.txt >> smaller_list.txt
-2. awk -F,domain= '{ if(!length($2)) table[$1] = ""; else table[$1 FS] = length(table[$1 FS]) ? table[$1 FS] "|" $2 : $2; } END{ for (key in table) print key table[key] }' smaller_list.txt > smaller_list_deduped.tmp && mv smaller_test_deduped.tmp smaller_list.txt
-3. sort smaller_list.txt | uniq > final_list.txt
+2. sort smaller_list.txt | uniq > deduped_smaller_list.txt
+3. awk -F,domain= '{ if(!length($2)) table[$1] = ""; else table[$1 FS] = length(table[$1 FS]) ? table[$1 FS] "|" $2 : $2; } END{ for (key in table) print key table[key] }' deduped_smaller_list.txt > final_list.txt
```
## 5. Turn the final list into a form usable by Chromium tools