SSE2 optimization for a-rate Oscillator

Add SSE2 optimization for the main processing loop for the a-rate
Oscillator and for the computation in WaveDataForFundamentalFrequency.

Profiling showed WaveDataForFundamentalFrequency was taking 27% of the
total time. The optimized version takes about half as long by computing
4 results at a time.

The failed test is caused by changing the linear interpolation formula
from (1-f)*x0 + f*x1 to the mathematically equivalent x0+f*(x1-x0).
However, this isn't exactly the same in floating-point. The updated
formula has one less operation so it's a useful speed up.

Using a float virtual_read_index in the SIMD loop causes significant
loss of precision (50 dB or less in the osc sweep tests instead of
100+ dB), so we need to do a slightly more complex SIMD version with
double virtual_read_index.

WebAudio bench results without this CL:

TEST                                μs    MIN   Q1    MEDIAN  Q3    MAX   MEAN     STDDEV
Oscillator.frequency-linear-a-rate  1884  1884  1948  1969    1997  2598  1987.72  94.13

With this CL:

TEST                                μs    MIN   Q1    MEDIAN  Q3    MAX   MEAN     STDDEV
Oscillator.frequency-linear-a-rate  1348  1348  1397  1418    1453  2028  1474.74  151.59

We see about a 25% improvement in speed (based on the mean).

Bug: 1013118
Change-Id: I818834c64c3529b26467dbaa7623354f92b47b2a
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2264204
Reviewed-by: Hongchan Choi <hongchan@chromium.org>
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Commit-Queue: Raymond Toy <rtoy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#789097}
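
The two interpolation forms named in the description can be sketched as below. This is a minimal illustration, not the code from the CL: the function names LerpWeighted, LerpDelta, and LerpDelta4 are hypothetical, and the 4-wide version only assumes that "computing 4 results at a time" means one SSE register of floats. It falls back to a scalar loop when SSE2 is not available.

```cpp
#include <cassert>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics (also pulls in the SSE float ops)
#endif

// Original form: (1-f)*x0 + f*x1. Two multiplies, one subtract, one add.
inline float LerpWeighted(float x0, float x1, float f) {
  assert(f >= 0.0f && f <= 1.0f);
  return (1.0f - f) * x0 + f * x1;
}

// Updated form: x0 + f*(x1-x0). One subtract, one multiply, one add.
// Mathematically equal to LerpWeighted, but the rounding steps differ,
// so the two results are not guaranteed to match bit for bit.
inline float LerpDelta(float x0, float x1, float f) {
  assert(f >= 0.0f && f <= 1.0f);
  return x0 + f * (x1 - x0);
}

// Hypothetical 4-wide version of LerpDelta:
// out[i] = x0[i] + f[i] * (x1[i] - x0[i]) for i in 0..3.
inline void LerpDelta4(const float* x0, const float* x1, const float* f,
                       float* out) {
#if defined(__SSE2__)
  __m128 v0 = _mm_loadu_ps(x0);  // unaligned loads of 4 floats each
  __m128 v1 = _mm_loadu_ps(x1);
  __m128 vf = _mm_loadu_ps(f);
  __m128 result = _mm_add_ps(v0, _mm_mul_ps(vf, _mm_sub_ps(v1, v0)));
  _mm_storeu_ps(out, result);
#else
  // Scalar fallback for targets without SSE2.
  for (int i = 0; i < 4; ++i)
    out[i] = LerpDelta(x0[i], x1[i], f[i]);
#endif
}
```

For inputs that are exactly representable, such as x0 = 1, x1 = 3, f = 0.5, both scalar forms return 2. For arbitrary inputs they can differ in the last bits, which is the kind of difference the description blames for the failed test.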