small idct code improvement
See #79
The new code is both more readable and faster

Old assembly:
        mov     eax, edi
        cmp     edi, 256
        jb      .LBB1_2
        sar     eax, 31
        not     al
   .LBB1_2:
        ret

New assembly:
        xor     ecx, ecx
        test    edi, edi
        cmovns  ecx, edi
        cmp     ecx, 255
        mov     eax, 255
        cmovl   eax, ecx
        ret

Benchmark results:

Benchmarking decode a 512x512 JPEG: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 12.5s or reduce sample count to 40
decode a 512x512 JPEG   time:   [2.4692 ms 2.4873 ms 2.5106 ms]
                        change: [-18.558% -17.141% -15.659%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) high mild
  6 (6.00%) high severe

Benchmarking decode a 512x512 progressive JPEG: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 28.0s or reduce sample count to 20
decode a 512x512 progressive JPEG
                        time:   [5.5010 ms 5.5212 ms 5.5459 ms]
                        change: [-12.718% -11.746% -10.721%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) high mild
  6 (6.00%) high severe

extract metadata from an image
                        time:   [1.3028 us 1.3110 us 1.3207 us]
                        change: [+1.8341% +2.8787% +3.8439%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
  7 (7.00%) high mild
  2 (2.00%) high severe
lovasoa committed Apr 10, 2020
1 parent dab997e commit 76e993a
Showing 1 changed file with 1 addition and 7 deletions.
8 changes: 1 addition & 7 deletions src/idct.rs
@@ -297,13 +297,7 @@ fn dequantize_and_idct_block_1x1(coefficients: &[i16], quantization_table: &[u16
 // take a -128..127 value and stbi__clamp it and convert to 0..255
 fn stbi_clamp(x: i32) -> u8
 {
-    // trick to use a single test to catch both cases
-    if x as u32 > 255 {
-        if x < 0 { return 0; }
-        if x > 255 { return 255; }
-    }
-
-    x as u8
+    x.max(0).min(255) as u8
 }

 fn stbi_f2f(x: f32) -> i32 {
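
The change above swaps a branchy clamp for a `max`/`min` chain. The sketch below is one way to check that the two bodies return the same result. The names `clamp_old`, `clamp_new`, and `clamps_agree` are hypothetical and do not appear in the repository; the function bodies are copied from the removed and added lines of `stbi_clamp`.

```rust
// Hypothetical name: mirrors the removed body of `stbi_clamp`.
fn clamp_old(x: i32) -> u8 {
    // Negative values wrap to large u32 values, so one unsigned
    // comparison catches both x < 0 and x > 255.
    if x as u32 > 255 {
        if x < 0 { return 0; }
        if x > 255 { return 255; }
    }
    x as u8
}

// Hypothetical name: mirrors the added body of `stbi_clamp`.
fn clamp_new(x: i32) -> u8 {
    x.max(0).min(255) as u8
}

// Hypothetical helper: true if both versions agree on every value in lo..=hi.
fn clamps_agree(lo: i32, hi: i32) -> bool {
    (lo..=hi).all(|x| clamp_old(x) == clamp_new(x))
}

fn main() {
    // Sample a range that covers both clamping boundaries with margin.
    assert!(clamps_agree(-100_000, 100_000));
    // Check the extremes of i32 explicitly.
    assert_eq!(clamp_old(i32::MIN), clamp_new(i32::MIN));
    assert_eq!(clamp_old(i32::MAX), clamp_new(i32::MAX));
    println!("old and new clamp agree on all sampled inputs");
}
```

This only checks that the results match; it says nothing about speed, which is what the benchmark figures in the commit message measure. On Rust 1.50 and later, `x.clamp(0, 255) as u8` would presumably express the same thing, but that method was not yet stable when this commit was made.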