yfriedman@chromium.org authored
Picked up changes:
047ef5d Hookup the MarkupParser into title detection.
885a0af Also track whitespace text in TextBlocks.
39ef3db Add more colors to debug output.
0577e31 Add comments about UnicodePatternGenerator ranges
805c65e Simplify iteration in *RulesClassifier
fe610d7 Make debug output colorful
edc26e0 Introduce a DomDistillerTestSuite for running multiple tests in one context.
7a7f948 Convert relative "poster" attribute to absolute for HTML5 video.
73225c7 Remove unused Clone method from TextDocument/TextBlock.
bb65cef Don't merge non-content lists with trailing content.
3d00eba check for opacity for element visibility
79f3bae ignore invisible page links
8416ada Allow DocumentTitleMatchClassifier to match repeated titles.
8dade4b Fix id check for comment exclusion.
7f91f0d improve heuristics for tables in eval set
5fa3c95 Make DomDistiller's text-only output include all extracted text.

BUG=367233,368941,376107,378385,380792,381973

Review URL: https://codereview.chromium.org/403803002

git-svn-id: svn://svn.chromium.org/chrome/trunk/src@284208 0039d316-1c4b-4281-b951-d872f2087c98
ff294095