Closed Bug 504640 Opened 15 years ago Closed 11 years ago

Animation suffers from large pauses which appear to be garbage collection

Categories

(Core :: JavaScript Engine, defect)

x86
Windows Vista
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: jrmuizel, Unassigned)

References

()

Details

The animation at
http://people.mozilla.org/~jmuizelaar/world-map-fn.html
pauses every 3 seconds or so. 

I'm assuming this is a garbage collection problem because if you look at the memory graph in perfmon you can see a sawtooth wave with the peaks matching up with the pauses.

The animation plays just fine in Chrome, Opera, and Safari, with no sawtooth wave in perfmon and no pauses.

http://people.mozilla.org/~jmuizelaar/world-map.html is a different version that does the same thing with a different structure for the code. It suffers the same problem.
Yes.

When you remove all of the drawing from the profile we spend:
96% in js_Interpret
 56% is in js_MontiorLoopEdge
 39.7% is in js_NewArrayObject
  39.6% is in js_GC
   14.5% is in jsd_ObjectHook
   10.8% is in js_FinalizeStringRT
   10.1% is in array_finalize

If you focus js_GC the big time users are:
27.1% in js_FinalizeStringRT, 25% of which is in free()
24.9% in array_finalize()
11.1% in PR_Lock()
The pauses are definitely caused by js_GC. (The cycle collector is not to blame.) Here is a bit of timing log, where |dt| is the time in ms taken by this collection, and |T| is the time since the last GC started (thus, the period of the pauses). The js_GC lines print out at the same time the pause occurs. I'm going to see if I can get a bit more data on why this is happening (why frequent GCs, and why so slow):

js_GC dt=26.081690 T=2425.768185
js_GC dt=251.137470 T=1781.884940
js_GC dt=258.916435 T=1927.528605
js_GC dt=259.983690 T=1948.617705
js_GC dt=259.920510 T=1875.315610
js_GC dt=260.224160 T=1976.343610
cycle collector
js_GC dt=78.465785 T=374.011295
js_GC dt=226.496170 T=1534.087965
js_GC dt=257.596830 T=2011.833785
js_GC dt=257.231440 T=1930.198200
js_GC dt=258.293100 T=1905.468170
js_GC dt=257.173410 T=1996.877805
js_GC dt=258.711355 T=1877.218695
js_GC dt=258.234165 T=1986.075890
js_GC dt=259.157145 T=1970.652625
Can you split the mark and sweep portions out? I have a hunch.
dt_1 is mark, dt_2 is sweep, thus 95% of GC time is in sweep.

js_GC dt=258.178230 T=1841.567980 dt_1=11.310565 dt_2=246.867665
js_GC dt=261.475435 T=2018.390295 dt_1=11.494820 dt_2=249.980615
js_GC dt=263.672275 T=1887.543940 dt_1=11.124675 dt_2=252.547600
js_GC dt=265.855710 T=1885.269530 dt_1=12.292560 dt_2=253.563150
So if we were to, say, defer sweep until allocation time, and then sweep a few hundred objects per, I bet that animation would get a lot smoother?
Delayed sweeping sounds good. I'll do a quick measurement of how much sweep per allocation would work on this one.

Shark profile from an optimized build that shows it's finalizers, and specifically |free|:

        Self    Total   
	 5.8%	100.0%	js_GC	
	 3.0%	 44.6%	    js_FinalizeStringRT	
	 2.6%	 41.2%	        free	
	25.9%	 31.0%	            szone_free	
	 2.7%	 39.7%	    js_FinalizeObject	
	 1.0%	 37.0%	        array_finalize(JSContext*, JSObject*)	
	 3.1%	 35.1%	            free	
	16.0%	 24.8%	                szone_free
Depends on: 505612
We get completely owned in the iterator:

  for (p in countries) {
       var i = countries[p];
       i[0](ctx, i[1], i[2]);
  }

This is where all the string allocations come from. Number strings from 1 to very large:

#0  js_NewString (cx=0x1466800, chars=0x27023040, length=2) at ../../../js/src/jsstr.cpp:2731
#1  0x00290498 in JS_NewStringCopyZ (cx=0x1466800, s=0xbfffa5c5 "65") at ../../../js/src/jsapi.cpp:5377
#2  0x0031daf7 in NumberToStringWithBase (cx=0x1466800, d=65, base=10) at ../../../js/src/jsnum.cpp:853
#3  0x0031db52 in js_NumberToString (cx=0x1466800, d=65) at ../../../js/src/jsnum.cpp:862
#4  0x00389f9f in js_ValueToString (cx=0x1466800, v=131) at ../../../js/src/jsstr.cpp:2909
#5  0x003163b2 in CallEnumeratorNext (cx=0x1466800, iterobj=0x39f25140, flags=1, rval=0xbfffa758) at ../../../js/src/jsiter.cpp:566
#6  0x00316468 in js_CallIteratorNext (cx=0x1466800, iterobj=0x39f25140, rval=0xbfffa758) at ../../../js/src/jsiter.cpp:588
#7  0x003b3447 in CallIteratorNext_tn (cx=0x1466800, pc=0x4492a4 ":", iterobj=0x39f25140) at ../../../js/src/jstracer.cpp:11846
#8  0x001c7f69 in ?? ()
#9  0xbfffcde8 in ?? ()
#10 0x003d63fa in js_MonitorLoopEdge (cx=0x1466800, inlineCallCount=@0xbfffd2b8) at ../../../js/src/jstracer.cpp:5623
(In reply to comment #8)
> We get completely owned in the iterator:
> 
>   for (p in countries) {
>        var i = countries[p];
>        i[0](ctx, i[1], i[2]);
>   }

I assume this code would perform better:

    for (var p = 0; p < countries.length; ++p) {
        var i = countries[p];
        i[0](ctx, i[1], i[2]);
    }

I also note that in this situation, p is only used as an array index. Would it be at all possible to optimize the original kind of loop to avoid the string conversions back and forth?
(In reply to comment #9)
> (In reply to comment #8)
> > We get completely owned in the iterator:
> > 
> >   for (p in countries) {
> >        var i = countries[p];
> >        i[0](ctx, i[1], i[2]);
> >   }
> 
> I assume this code would perform better:
> 
>     for (var p = 0; p < countries.length; ++p) {
>         var i = countries[p];
>         i[0](ctx, i[1], i[2]);
>     }

Yup.

> I also note that in this situation, p is only used as an array index. Would it
> be at all possible to optimize the original kind of loop to avoid the string
> conversions back and forth?

The original JS implementation I did at Netscape in 1995, and the SpiderMonkey rewrite of it in 1996 (which reused a lot of code but switched to GC from ref counting and added jsval in place of a fat discriminated union), both would keep int type for the for-in loop variable if the property was an index. ECMA "fixed" this to require string. Oh well.

Andreas and I talked about optimizing this. We might want a magic string subtype, or possibly something more magic (like a pseudo-boolean) to represent "indexes". It wouldn't need to store the chars. Computing any given digit is easy (unsigned div and mod). I suggested a spin-out bug, not sure Andreas filed it yet.

/be
Seems like we could refactor CallIteratorNext to specialize for the integer case, since we can see that we're dealing with an array, and that we're only using as an integer.  No doubt some tricky edges there, but worth another bug.

(For that specific case, if we can guard on "simple dense array with length N", then the loop changes to a simple integer sequence...)
Specializing jsiter.cpp for int is not a problem, the issue is required language semantics. If you try to iterate internally over int index values, you will need a barrier to stringify for typeof, +, and other operators that can tell string from int.

/be
Index optimization filed as bug 505818.
(In reply to comment #11)
> Seems like we could refactor CallIteratorNext

CallEnumeratorNext, for reference.

Lacking a magic type, we have to stringify here per ECMA-262.

/be
(In reply to comment #9)
> I assume this code would perform better:
> 

It does (in all browsers), but we still get the pauses.
The remaining pause is recompilation. We currently recompile after each GC. Bug for that is filed already.
(In reply to comment #16)
> The remaining pause is recompilation. We currently recompile after each GC. Bug
> for that is filed already.

Do you mean bug 506117? I'm still seeing pauses with a Minefield nightly from Sept. 14.
(In reply to comment #17)
> (In reply to comment #16)
> > The remaining pause is recompilation. We currently recompile after each GC. Bug
> > for that is filed already.
> 
> Do you mean bug 506117?

That bug is not enough. It should block a bug about not purging JITted code on every GC, and this bug should depend on that bug.

/be
bug 508707 does what #18 asks for. Adding the dependency.
Depends on: 508707
#19: I meant bug 506117.
(In reply to comment #19)
> bug 508707 does what #18 asks for. Adding the dependency.

Andreas reminded me it was bug 506117 that did this, after I pointed out that bug 508707 wasn't the bug ;-).

/be
Depends on: 506117
No longer depends on: 508707
I wonder if bug 482853 and bug 482204 are this, too?
Looking at the GC log in the error console (via javascript.options.mem.log = true), I see GC pause times of 12ms.  Seems fixed by general GC improvements and igc.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.