Closed Bug 545303 Opened 14 years ago Closed 14 years ago

TM: non API function for JS_CallTracer

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: gwagner, Assigned: gwagner)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(5 files, 3 obsolete files)

Luke noticed that all the GC JS_CALL_*_TRACER macros and the TRACE_JSVALS macro call JS_CallTracer which is a public API function.
So I made a non-public js_CallTracer which contains the body of JS_CallTracer and gets called by JS_CallTracer and changed all the internal uses to call js_CallTracer directly.

The surprising outcome is:
TEST                   COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:           1.008x as fast    721.2ms +/- 0.2%   715.5ms +/- 0.3%     significant

=============================================================================

  3d:                  -                 116.8ms +/- 0.4%   116.7ms +/- 0.5% 
    cube:              -                  38.4ms +/- 1.0%    38.4ms +/- 1.0% 
    morph:             -                  21.0ms +/- 0.0%    21.0ms +/- 0.0% 
    raytrace:          -                  57.4ms +/- 0.6%    57.3ms +/- 0.6% 

  access:              ??                 95.3ms +/- 0.5%    95.4ms +/- 0.7%     not conclusive: might be *1.001x as slow*
    binary-trees:      -                  19.7ms +/- 1.8%    19.6ms +/- 1.9% 
    fannkuch:          *1.018x as slow*   44.2ms +/- 0.7%    45.0ms +/- 0.0%     significant
    nbody:             -                  19.5ms +/- 1.9%    19.5ms +/- 1.9% 
    nsieve:            1.053x as fast     11.9ms +/- 1.9%    11.3ms +/- 3.1%     significant

  bitops:              ??                 31.1ms +/- 1.7%    31.3ms +/- 1.1%     not conclusive: might be *1.006x as slow*
    3bit-bits-in-byte: -                   1.3ms +/- 26.6%     1.2ms +/- 25.1% 
    bits-in-byte:      *1.011x as slow*    9.0ms +/- 0.0%     9.1ms +/- 2.5%     significant
    bitwise-and:       ??                  1.8ms +/- 16.7%     2.0ms +/- 0.0%     not conclusive: might be *1.111x as slow*
    nsieve-bits:       -                  19.0ms +/- 0.0%    19.0ms +/- 0.0% 

  controlflow:         *1.070x as slow*    7.1ms +/- 3.2%     7.6ms +/- 4.9%     significant
    recursive:         *1.070x as slow*    7.1ms +/- 3.2%     7.6ms +/- 4.9%     significant

  crypto:              1.029x as fast     46.7ms +/- 2.0%    45.4ms +/- 1.8%     significant
    aes:               -                  26.9ms +/- 3.4%    25.8ms +/- 2.9% 
    md5:               -                  13.2ms +/- 2.3%    13.0ms +/- 0.0% 
    sha1:              -                   6.6ms +/- 5.6%     6.6ms +/- 5.6% 

  date:                1.019x as fast    104.6ms +/- 0.4%   102.7ms +/- 0.7%     significant
    format-tofte:      1.023x as fast     54.5ms +/- 0.7%    53.3ms +/- 0.6%     significant
    format-xparb:      1.014x as fast     50.1ms +/- 0.5%    49.4ms +/- 0.7%     significant

  math:                -                  34.9ms +/- 2.5%    34.9ms +/- 1.8% 
    cordic:            ??                 15.5ms +/- 2.4%    15.6ms +/- 2.4%     not conclusive: might be *1.006x as slow*
    partial-sums:      ??                 12.7ms +/- 2.7%    12.8ms +/- 2.4%     not conclusive: might be *1.008x as slow*
    spectral-norm:     -                   6.7ms +/- 5.2%     6.5ms +/- 5.8% 

  regexp:              ??                 36.1ms +/- 0.6%    36.3ms +/- 1.0%     not conclusive: might be *1.006x as slow*
    dna:               ??                 36.1ms +/- 0.6%    36.3ms +/- 1.0%     not conclusive: might be *1.006x as slow*

  string:              1.014x as fast    248.6ms +/- 0.3%   245.2ms +/- 0.2%     significant
    base64:            ??                 11.2ms +/- 2.7%    11.4ms +/- 3.2%     not conclusive: might be *1.018x as slow*
    fasta:             1.025x as fast     50.0ms +/- 0.0%    48.8ms +/- 0.6%     significant
    tagcloud:          1.007x as fast     84.0ms +/- 0.0%    83.4ms +/- 0.4%     significant
    unpack-code:       1.026x as fast     80.1ms +/- 0.5%    78.1ms +/- 0.3%     significant
    validate-input:    ??                 23.3ms +/- 1.5%    23.5ms +/- 1.6%     not conclusive: might be *1.009x as slow*
Attached patch patch (obsolete) — Splinter Review
Assignee: general → anygregor
Cool... I wasn't expecting SS to get any faster (I thought we didn't GC during SS?); what about the GC-heavy tests that you've been using to compare our GC to WebKit's in that other bug?
I modified one of Igors benchmark and run it on my Mac Pro:

Processor Dual-Core Intel Xeon @ 2.66 GHz
Number Of Processors:	2


test();

function test()
{
 
  function generate_big_object_graph()
  {
    var root = {};
    f(root, 17);
    return root;
    function f(parent, depth) {
      if (depth == 0) 
          return;
      --depth;

      f(parent.a = {}, depth);
      f(parent.b = {}, depth);
    }
  }

  function f(obj) {
    with (obj)
      return arguments;
  }

  
  gc();
  
  var start = Date.now();
  x = null;
  x = f(generate_big_object_graph());
  var create_end = Date.now();
  gc();
  
  var fullgc_end = Date.now();
  x = null;
  gc();
  
  var emptygc_end = Date.now();
  var actual = "";
  actual = " alloc: "+(create_end - start)+
  "\n used : "+(fullgc_end - create_end) + 
  "\n free : "+(emptygc_end - fullgc_end);
   
  print(actual);

}

Tip:
 alloc: 108
 used : 22
 free : 25

Patch:
 alloc: 107
 used : 19
 free : 24
So, if I am reading the benchmark correctly, the 'used' time is when there is a lot of marking going on, and that shows, depending on the variation in measurement, a 15% improvement.  Cool!
Attached patch patch (obsolete) — Splinter Review
Attachment #426155 - Attachment is obsolete: true
Bonus points for renaming "Trace" to "GCTrace" or "GCScan" or something that won't be confusing forever. ;-)

/be
(In reply to comment #2)
> (I thought we didn't GC during SS?)

Me too -- Gregor, what is the story?

/be
(In reply to comment #6)
> Bonus points for renaming "Trace" to "GCTrace" or "GCScan" or something that
> won't be confusing forever. ;-)

Holy hannah, yes.
(In reply to comment #6)
> Bonus points for renaming "Trace" to "GCTrace" or "GCScan" or something that
> won't be confusing forever. ;-)

Lets rename it do the original name, js_CallMarker
Attached patch patch (obsolete) — Splinter Review
I am very confused! Even with this little patch where I extract js_CallTracer and move the public JS_CallTracer function to jsapi.cpp I get 5-8 ms speedup on SS.
I also tried it on Linux but nothing happens there.
Could somebody try this on Mac as well?
Attached patch patchSplinter Review
How about js_CallGCMarker ?
Attachment #426266 - Attachment is obsolete: true
Attachment #426358 - Attachment is obsolete: true
Attached image GC Times Tip
I used this benchmark to collect more data on this patch:
All numbers represent again cycles * 1E6.

test();

function test()
{
 
  function generate_big_object_graph()
  {
    var root = {};
    f(root, 17);
    return root;
    function f(parent, depth) {
      if (depth == 0) 
          return;
      --depth;

      f(parent.a = {}, depth);
      f(parent.b = {}, depth);
    }
  }

  function f(obj) {
    with (obj)
      return arguments;
  }

  for (var i = 0; i != 10; ++i) {
    gc();
    x = null;
    x = f(generate_big_object_graph());
    gc();
    gc();
    x = null;
    gc();
  }
}
The marking time reduces by about 5%.
Attached image GC Times Tip on Linux
Previous histograms were Mac results.
These are Linux numbers.
Amazing how the finalizer times vary between these two OS.
If I only had a PC I would be able to get some numbers for Windows... :)
Attachment #426781 - Flags: review?(igor)
Comment on attachment 426781 [details] [diff] [review]
patch

js_locked_atom_tracer(JSDHashTable *tabl
>     JSAtomHashEntry *entry = TO_ATOM_ENTRY(hdr);
>     JSTracer *trc = (JSTracer *)arg;
> 
>     if (entry->keyAndFlags == 0) {
>         /* Ignore uninitialized entries during tracing. */
>         return JS_DHASH_NEXT;
>     }
>     JS_SET_TRACING_INDEX(trc, "locked_atom", (size_t)number);
>-    JS_CallTracer(trc, ATOM_ENTRY_KEY(entry),
>+    js_CallGCMarker(trc, ATOM_ENTRY_KEY(entry),
>                   IS_STRING_TABLE(table) ? JSTRACE_STRING : JSTRACE_DOUBLE);

Nit: align the second argument line.
Attachment #426781 - Flags: review?(igor) → review+
Whiteboard: fixed-in-tracemonkey
http://hg.mozilla.org/mozilla-central/rev/3b25677f1fee
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: