Closed Bug 472952 Opened 16 years ago Closed 15 years ago

Build libmozjs with Intel's icc compiler

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

RESOLVED DUPLICATE of bug 461808

People

(Reporter: gal, Unassigned)

Attachments

(1 file)

      No description provided.
Depends on: 472951
TEST                   COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:           1.192x as fast    1091.7ms +/- 0.2%   915.8ms +/- 0.3%     significant

=============================================================================

  3d:                  1.200x as fast     155.5ms +/- 0.8%   129.6ms +/- 0.6%     significant
    cube:              1.198x as fast      45.3ms +/- 1.7%    37.8ms +/- 1.5%     significant
    morph:             1.25x as fast       29.6ms +/- 1.2%    23.7ms +/- 2.0%     significant
    raytrace:          1.184x as fast      80.6ms +/- 0.5%    68.1ms +/- 0.6%     significant

  access:              1.158x as fast     135.6ms +/- 0.6%   117.1ms +/- 0.3%     significant
    binary-trees:      1.088x as fast      38.3ms +/- 0.9%    35.2ms +/- 0.9%     significant
    fannkuch:          1.21x as fast       60.7ms +/- 0.6%    50.3ms +/- 0.7%     significant
    nbody:             1.191x as fast      23.7ms +/- 1.5%    19.9ms +/- 1.1%     significant
    nsieve:            1.103x as fast      12.9ms +/- 1.8%    11.7ms +/- 3.0%     significant

  bitops:              1.125x as fast      39.7ms +/- 2.3%    35.3ms +/- 1.4%     significant
    3bit-bits-in-byte: ??                   1.7ms +/- 20.3%     2.0ms +/- 0.0%     not conclusive: might be *1.176x as slow*
    bits-in-byte:      1.151x as fast      10.7ms +/- 3.2%     9.3ms +/- 3.7%     significant
    bitwise-and:       1.053x as fast       2.0ms +/- 0.0%     1.9ms +/- 11.9%     significant
    nsieve-bits:       1.145x as fast      25.3ms +/- 1.4%    22.1ms +/- 1.0%     significant

  controlflow:         *1.040x as slow*    32.4ms +/- 1.1%    33.7ms +/- 1.0%     significant
    recursive:         *1.040x as slow*    32.4ms +/- 1.1%    33.7ms +/- 1.0%     significant

  crypto:              1.25x as fast       60.9ms +/- 0.7%    48.8ms +/- 0.6%     significant
    aes:               1.22x as fast       35.5ms +/- 1.1%    29.1ms +/- 1.4%     significant
    md5:               1.36x as fast       19.1ms +/- 1.2%    14.0ms +/- 0.0%     significant
    sha1:              1.105x as fast       6.3ms +/- 5.5%     5.7ms +/- 6.1%     significant

  date:                1.25x as fast      217.1ms +/- 0.5%   174.0ms +/- 0.4%     significant
    format-tofte:      1.34x as fast      117.2ms +/- 0.4%    87.6ms +/- 0.6%     significant
    format-xparb:      1.156x as fast      99.9ms +/- 0.8%    86.4ms +/- 0.4%     significant

  math:                1.078x as fast      40.0ms +/- 0.8%    37.1ms +/- 0.6%     significant
    cordic:            *1.152x as slow*    19.1ms +/- 1.2%    22.0ms +/- 0.0%     significant
    partial-sums:      1.56x as fast       14.0ms +/- 0.0%     9.0ms +/- 0.0%     significant
    spectral-norm:     1.131x as fast       6.9ms +/- 3.3%     6.1ms +/- 3.7%     significant

  regexp:              -                   55.2ms +/- 0.5%    55.1ms +/- 0.7% 
    dna:               -                   55.2ms +/- 0.5%    55.1ms +/- 0.7% 

  string:              1.25x as fast      355.3ms +/- 0.2%   285.1ms +/- 0.4%     significant
    base64:            1.26x as fast       15.4ms +/- 2.4%    12.2ms +/- 2.5%     significant
    fasta:             1.23x as fast       70.5ms +/- 0.5%    57.2ms +/- 0.5%     significant
    tagcloud:          1.24x as fast      109.0ms +/- 0.4%    88.2ms +/- 0.6%     significant
    unpack-code:       1.28x as fast      129.6ms +/- 0.3%   101.6ms +/- 0.5%     significant
    validate-input:    1.189x as fast      30.8ms +/- 1.0%    25.9ms +/- 0.9%     significant
The numbers above show a roughly 19% speedup (1.192x overall, 1091.7ms down to 915.8ms) for a PGO-ed icc build (profile input: SunSpider) against a GCC build. Both are OPT builds on Mac OS X.
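For reference, the "x as fast" figure in the table is consistent with the FROM time divided by the TO time. The helpers below are a minimal sketch of that arithmetic; the function names are made up for illustration and are not part of SunSpider.

```c
/* Speedup factor: baseline time divided by new time.
 * Matches the TOTAL row above when fed its FROM and TO values. */
double speedup(double from_ms, double to_ms) {
    return from_ms / to_ms;
}

/* Share of the baseline run time that was saved, in percent. */
double time_saved_percent(double from_ms, double to_ms) {
    return 100.0 * (from_ms - to_ms) / from_ms;
}
```

With the totals above, speedup(1091.7, 915.8) is about 1.192 and time_saved_percent(1091.7, 915.8) is about 16.1, so "1.192x as fast" corresponds to roughly 16% less run time.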

The speedup on Windows might be even more dramatic, since ICC can compile the threaded interpreter, whereas the Windows build currently uses a switch statement.
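To illustrate the difference, here is a toy sketch of the two dispatch styles. It is not SpiderMonkey code: the opcodes and function names are invented for the example. The threaded version relies on the GNU C "labels as values" extension (`&&label` and `goto *ptr`), which GCC and ICC accept and MSVC does not.

```c
#include <stddef.h>

/* Hypothetical three-instruction bytecode acting on one accumulator. */
enum { OP_INC, OP_DOUBLE, OP_HALT };

/* Switch dispatch: every instruction goes back through the same switch. */
int run_switch(const unsigned char *code) {
    int acc = 0;
    for (size_t pc = 0;; pc++) {
        switch (code[pc]) {
        case OP_INC:    acc += 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        default:        return -1; /* unknown opcode */
        }
    }
}

/* Threaded dispatch: each handler jumps straight to the next handler
 * through a table of label addresses (GNU C extension).
 * Opcodes outside the table are not checked in this sketch. */
int run_threaded(const unsigned char *code) {
    static void *const labels[] = { &&do_inc, &&do_double, &&do_halt };
    int acc = 0;
    size_t pc = 0;
#define DISPATCH() goto *labels[code[pc++]]
    DISPATCH();
do_inc:
    acc += 1;
    DISPATCH();
do_double:
    acc *= 2;
    DISPATCH();
do_halt:
    return acc;
#undef DISPATCH
}
```

Both functions compute the same result for a given program; for the sequence OP_INC, OP_INC, OP_DOUBLE, OP_INC, OP_HALT each returns 5. The threaded form gives every handler its own indirect jump, which tends to help branch prediction in interpreter loops.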
Build instructions:

CC=/opt/intel/Compiler/11.0/056/bin/ia32/icc CXX=/opt/intel/Compiler/11.0/056/bin/ia32/icpc ../configure --enable-optimize --disable-debug

export PATH=/opt/intel/Compiler/11.0/056/bin/ia32:${PATH}

XCFLAGS=-prof-gen make 

<run sunspider>

make clean
XCFLAGS=-prof-use make

<benchmark sunspider>
Comparison of TM built with ICC/PGO (FROM) to V8 bleeding edge (TO):

TEST                   COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:           1.023x as fast    918.8ms +/- 0.2%   898.5ms +/- 0.1%     significant

=============================================================================

  3d:                  1.39x as fast     130.8ms +/- 0.7%    93.8ms +/- 0.5%     significant
    cube:              1.42x as fast      38.8ms +/- 1.9%    27.3ms +/- 1.8%     significant
    morph:             *1.66x as slow*    23.7ms +/- 1.5%    39.4ms +/- 0.9%     significant
    raytrace:          2.52x as fast      68.3ms +/- 0.5%    27.1ms +/- 0.8%     significant

  access:              1.98x as fast     117.3ms +/- 0.6%    59.2ms +/- 1.0%     significant
    binary-trees:      9.03x as fast      35.2ms +/- 0.9%     3.9ms +/- 5.8%     significant
    fannkuch:          2.46x as fast      50.5ms +/- 0.7%    20.5ms +/- 1.8%     significant
    nbody:             1.026x as fast     19.8ms +/- 1.5%    19.3ms +/- 1.8%     significant
    nsieve:            *1.31x as slow*    11.8ms +/- 2.6%    15.5ms +/- 2.4%     significant

  bitops:              *1.130x as slow*   35.4ms +/- 1.4%    40.0ms +/- 1.5%     significant
    3bit-bits-in-byte: *1.86x as slow*     2.1ms +/- 10.8%     3.9ms +/- 5.8%     significant
    bits-in-byte:      1.179x as fast      9.2ms +/- 3.3%     7.8ms +/- 3.9%     significant
    bitwise-and:       *6.37x as slow*     1.9ms +/- 11.9%    12.1ms +/- 3.4%     significant
    nsieve-bits:       1.37x as fast      22.2ms +/- 1.4%    16.2ms +/- 1.9%     significant

  controlflow:         12.5x as fast      33.7ms +/- 1.0%     2.7ms +/- 12.8%     significant
    recursive:         12.5x as fast      33.7ms +/- 1.0%     2.7ms +/- 12.8%     significant

  crypto:              1.187x as fast     48.9ms +/- 0.8%    41.2ms +/- 1.1%     significant
    aes:               1.66x as fast      29.1ms +/- 0.8%    17.5ms +/- 2.2%     significant
    md5:               1.120x as fast     14.0ms +/- 0.0%    12.5ms +/- 3.0%     significant
    sha1:              *1.93x as slow*     5.8ms +/- 5.2%    11.2ms +/- 2.7%     significant

  date:                1.54x as fast     174.9ms +/- 0.5%   113.7ms +/- 0.3%     significant
    format-tofte:      1.44x as fast      88.2ms +/- 0.8%    61.2ms +/- 0.5%     significant
    format-xparb:      1.65x as fast      86.7ms +/- 0.4%    52.5ms +/- 0.7%     significant

  math:                *1.74x as slow*    37.3ms +/- 0.9%    65.0ms +/- 0.0%     significant
    cordic:            *1.37x as slow*    22.1ms +/- 1.0%    30.3ms +/- 1.1%     significant
    partial-sums:      *2.86x as slow*     9.1ms +/- 2.5%    26.0ms +/- 0.0%     significant
    spectral-norm:     *1.43x as slow*     6.1ms +/- 3.7%     8.7ms +/- 4.0%     significant

  regexp:              *3.59x as slow*    55.6ms +/- 0.7%   199.6ms +/- 0.2%     significant
    dna:               *3.59x as slow*    55.6ms +/- 0.7%   199.6ms +/- 0.2%     significant

  string:              1.006x as fast    284.9ms +/- 0.2%   283.3ms +/- 0.3%     significant
    base64:            *1.84x as slow*    12.2ms +/- 2.5%    22.5ms +/- 1.7%     significant
    fasta:             1.80x as fast      57.3ms +/- 0.6%    31.9ms +/- 0.7%     significant
    tagcloud:          1.010x as fast     88.1ms +/- 0.5%    87.2ms +/- 0.3%     significant
    unpack-code:       *1.036x as slow*  101.4ms +/- 0.4%   105.0ms +/- 0.0%     significant
    validate-input:    *1.42x as slow*    25.9ms +/- 0.9%    36.7ms +/- 1.3%     significant
So it seems icc's sin() implementation is a bit off: the result differs from the GCC build in the last printed digit. This affects ICC at -O2 as well as with no optimization.

whale:src gal$ ./Darwin_OPT.OBJ/js -e "print(Math.sin(10))"
-0.5440211108893699
whale:src gal$ ./Darwin_ICC_OPT.OBJ/js -e "print(Math.sin(10))"
-0.5440211108893698
whale:src gal$
MSVC: -0.5440211108893698, matching ICC (tested using a Firefox Windows build).

Maybe GCC is the one that is off here. This should be investigated further in a separate bug.
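One way to quantify the discrepancy is to count how many representable doubles lie between the two results. The helper below is a small sketch of that idea, assuming IEEE 754 doubles and two finite inputs of the same sign; the name `ulp_distance` is made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Distance between two finite doubles of the same sign, measured in
 * units in the last place (ULPs). Assumes IEEE 754 binary64, where the
 * bit patterns of same-sign values are ordered like integers. */
int64_t ulp_distance(double a, double b) {
    int64_t ia, ib;
    memcpy(&ia, &a, sizeof ia); /* reinterpret bits without aliasing issues */
    memcpy(&ib, &b, sizeof ib);
    return ia > ib ? ia - ib : ib - ia;
}
```

Applied to the two printed values, ulp_distance(-0.5440211108893699, -0.5440211108893698) is 1, so the GCC build and the ICC/MSVC builds return adjacent doubles and the disagreement is a last-bit rounding difference.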
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → DUPLICATE