    SSE2 optimization for a-rate Oscillator · 7d8ffaab
    Raymond Toy authored
    Add SSE2 optimization for the main processing loop for the a-rate
    Oscillator and for the computation in WaveDataForFundamentalFrequency.
    Profiling showed WaveDataForFundamentalFrequency was taking 27% of the
    total time.  The optimized version takes about half as long by computing
    4 results at a time.
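
    As a rough illustration of "4 results at a time", here is a minimal
    sketch, not the code in this CL. LerpBlock and its parameters are
    hypothetical names; it processes four floats per iteration with
    SSE2 intrinsics and falls back to scalar code for the tail, or for
    the whole block when SSE2 is unavailable.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper: out[i] = x0[i] + f[i] * (x1[i] - x0[i]) for n samples.
void LerpBlock(const float* x0, const float* x1, const float* f,
               float* out, std::size_t n) {
  assert(x0 && x1 && f && out);
  std::size_t i = 0;
#if defined(__SSE2__)
  // Main loop: four lanes per iteration, unaligned loads and stores.
  for (; i + 4 <= n; i += 4) {
    const __m128 v0 = _mm_loadu_ps(x0 + i);
    const __m128 v1 = _mm_loadu_ps(x1 + i);
    const __m128 vf = _mm_loadu_ps(f + i);
    const __m128 r = _mm_add_ps(v0, _mm_mul_ps(vf, _mm_sub_ps(v1, v0)));
    _mm_storeu_ps(out + i, r);
  }
#endif
  // Scalar tail (or the whole block without SSE2).
  for (; i < n; ++i) {
    out[i] = x0[i] + f[i] * (x1[i] - x0[i]);
  }
}
```

    The scalar tail uses the same formula as the vector loop, so the
    last n % 4 samples are computed the same way as the rest.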
    
    The failed test is caused by changing the linear interpolation formula
    from (1-f)*x0 + f*x1 to the mathematically equivalent x0 + f*(x1-x0).
    The two forms round differently in floating-point, so the results
    are not bit-identical.  The updated formula has one fewer operation,
    so it's a useful speedup.
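
    For reference, the two interpolation forms can be written side by
    side as below.  This is a minimal sketch; LerpWeighted and LerpDelta
    are hypothetical names, not functions from this CL.

```cpp
#include <cassert>

// Original form: one subtract, two multiplies, one add.
inline float LerpWeighted(float x0, float x1, float f) {
  assert(f >= 0.0f && f <= 1.0f);
  return (1.0f - f) * x0 + f * x1;
}

// Updated form: one subtract, one multiply, one add.
inline float LerpDelta(float x0, float x1, float f) {
  assert(f >= 0.0f && f <= 1.0f);
  return x0 + f * (x1 - x0);
}
```

    The forms agree when every intermediate is exactly representable,
    but not in general.  A deliberately extreme case: with x0 = 1e8f,
    x1 = 1.0f and f = 1.0f, LerpWeighted returns 1 while LerpDelta
    returns 0 under IEEE 754 single-precision evaluation, because
    x1 - x0 rounds to -1e8.  With audio-range samples the difference
    is typically confined to the last bits.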
    
    Using a float virtual_read_index in the SIMD loop causes significant
    loss of precision (50 dB or less in the osc sweep tests instead of
    100+ dB), so we need to do a slightly more complex SIMD version with
    double virtual_read_index.
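
    The precision issue can be reproduced outside the oscillator with a
    scalar sketch like the one below.  AdvanceReadIndex, the increment
    and the table size are hypothetical stand-ins for illustration; the
    real loop in this CL is more involved.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: advances a wrapping read index by `increment`
// for `steps` samples.  T is the accumulator type being compared.
template <typename T>
T AdvanceReadIndex(T increment, int steps, T table_size) {
  assert(table_size > 0);
  T index = 0;
  for (int i = 0; i < steps; ++i) {
    index += increment;
    if (index >= table_size) {
      index -= table_size;  // Wrap back into [0, table_size).
    }
  }
  return index;
}
```

    Calling this with T = float and T = double for the same increment
    and a large step count shows the float index drifting away from the
    double one, since each float addition rounds to a much coarser grid
    once the index is large.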
    
    WebAudio bench results without this CL:
    
    TEST	μs	MIN	Q1	MEDIAN	Q3	MAX	MEAN	STDDEV
    Oscillator.frequency-linear-a-rate	1884	1884	1948	1969	1997	2598	1987.72	94.13
    
    With this CL:
    TEST	μs	MIN	Q1	MEDIAN	Q3	MAX	MEAN	STDDEV
    Oscillator.frequency-linear-a-rate	1348	1348	1397	1418	1453	2028	1474.74	151.59
    
    We see about a 25% improvement in speed (based on the mean, 1987.72
    vs. 1474.74).
    
    Bug: 1013118
    Change-Id: I818834c64c3529b26467dbaa7623354f92b47b2a
    Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2264204
    Reviewed-by: Hongchan Choi <hongchan@chromium.org>
    Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
    Commit-Queue: Raymond Toy <rtoy@chromium.org>
    Cr-Commit-Position: refs/heads/master@{#789097}
periodic_wave.h 5.88 KB