Commit
See #79

The new code is both more readable and faster.

Old assembly:

    mov eax, edi
    cmp edi, 256
    jb .LBB1_2
    sar eax, 31
    not al
    .LBB1_2:
    ret

New assembly:

    xor ecx, ecx
    test edi, edi
    cmovns ecx, edi
    cmp ecx, 255
    mov eax, 255
    cmovl eax, ecx
    ret

Benchmark results:

    Benchmarking decode a 512x512 JPEG: Warming up for 3.0000 s
    Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 12.5s or reduce sample count to 40
    decode a 512x512 JPEG
        time:   [2.4692 ms 2.4873 ms 2.5106 ms]
        change: [-18.558% -17.141% -15.659%] (p = 0.00 < 0.05)
        Performance has improved.
    Found 7 outliers among 100 measurements (7.00%)
      1 (1.00%) high mild
      6 (6.00%) high severe

    Benchmarking decode a 512x512 progressive JPEG: Warming up for 3.0000 s
    Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 28.0s or reduce sample count to 20
    decode a 512x512 progressive JPEG
        time:   [5.5010 ms 5.5212 ms 5.5459 ms]
        change: [-12.718% -11.746% -10.721%] (p = 0.00 < 0.05)
        Performance has improved.
    Found 9 outliers among 100 measurements (9.00%)
      3 (3.00%) high mild
      6 (6.00%) high severe

    extract metadata from an image
        time:   [1.3028 us 1.3110 us 1.3207 us]
        change: [+1.8341% +2.8787% +3.8439%] (p = 0.00 < 0.05)
        Performance has regressed.
    Found 9 outliers among 100 measurements (9.00%)
      7 (7.00%) high mild
      2 (2.00%) high severe
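
For readers who prefer source to disassembly, the two listings above can be read as two ways of clamping a signed 32-bit value into the range 0..=255. The Rust below is only a sketch reconstructed from the assembly. The function names are hypothetical and the commit's actual source may be written differently.

```rust
/// Branchy variant, reconstructed from the "old" assembly.
/// Hypothetical name; not taken from the commit itself.
pub fn clamp_to_u8_branchy(x: i32) -> u8 {
    // `cmp edi, 256` + `jb`: an unsigned compare, so negative inputs
    // (huge when viewed as u32) fall through to the slow path.
    if (x as u32) < 256 {
        x as u8
    } else {
        // `sar eax, 31` yields -1 for negative x and 0 otherwise;
        // `not al` inverts the low byte, giving 0 or 255 respectively.
        !((x >> 31) as u8)
    }
}

/// Branchless variant, reconstructed from the "new" assembly.
/// Hypothetical name; max/min typically lower to the cmov pair shown above.
pub fn clamp_to_u8_minmax(x: i32) -> u8 {
    x.max(0).min(255) as u8
}
```

Both functions should return the same byte for every `i32` input, so a quick equivalence check over a handful of boundary values (negative, 0, 255, 256, large positive) is an easy sanity test before comparing timings.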