Closed Bug 456721 Opened 16 years ago Closed 15 years ago

Control GC frequency / high water mark of TraceMonkey via about:config

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking


RESOLVED FIXED
mozilla1.9.2a1

People

(Reporter: gyuyoung.kim, Assigned: gyuyoung.kim)

References

(Depends on 1 open bug)

Details

Attachments

(2 files, 16 obsolete files)

3.39 KB, patch
mrbkap
: review+
mrbkap
: superreview+
Details | Diff | Splinter Review
3.31 KB, patch
Details | Diff | Splinter Review
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; ko; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1
Build Identifier: fennec-0.9pre.en-US.linux-arm

In SpiderMonkey, I have an idea for Fennec. As you know, memory management is important on mobile devices, so I think SpiderMonkey also needs to be optimized for mobile. Even though there are already some GC call conditions in SpiderMonkey, it seems to me the conditions need to be modified for mobile.

A summary of my idea is below:

1. Set up a threshold for SpiderMonkey.
   e.g.) gcMaxBytes = 1.5 MB

2. Call GC before allocating a new arena.
 2-1. If the GC frees enough memory, we don't need to allocate a new arena.
 2-2. If the GC cannot free enough memory, we allocate a new arena.

3. After GC execution, if the free-memory ratio across the arena lists is too small, we increase the threshold.
   e.g.) gcMaxBytes = 1.5 MB -> 2.0 MB -> 2.5 MB -> 3.0 MB
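The three steps above can be sketched as a toy model. All names here (ToyRuntime, alloc_arena, run_gc) are hypothetical stand-ins, not the patch's actual code; the real fields are rt->gcBytes and rt->gcMaxBytes in JSRuntime:

```c
#include <stddef.h>

#define ARENA_SIZE     (4 * 1024)       /* GC_ARENA_SIZE in jsgc.cpp */
#define THRESHOLD_STEP (512 * 1024)     /* 1.5 MB -> 2.0 MB -> 2.5 MB ... */

typedef struct {
    size_t gcBytes;     /* bytes currently held in GC arenas */
    size_t gcMaxBytes;  /* current threshold, initially e.g. 1.5 MB */
} ToyRuntime;

/* Stand-in for js_GC(): frees 'reclaimable' bytes and reports how much. */
static size_t run_gc(ToyRuntime *rt, size_t reclaimable)
{
    rt->gcBytes -= reclaimable;
    return reclaimable;
}

/* Step 2: GC before allocating a new arena once the threshold is reached.
 * Step 3: if the GC freed too little, grow the threshold by one step. */
static void alloc_arena(ToyRuntime *rt, size_t reclaimable)
{
    if (rt->gcBytes + ARENA_SIZE > rt->gcMaxBytes) {
        if (run_gc(rt, reclaimable) < ARENA_SIZE)
            rt->gcMaxBytes += THRESHOLD_STEP;
    }
    rt->gcBytes += ARENA_SIZE;
}
```

Starting at a 1.5 MB threshold, the first allocation that cannot be satisfied by a GC bumps gcMaxBytes to 2.0 MB, matching the progression in step 3.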

When I ran the SunSpider test on Fennec, with and without the patch, I got the following results.
(For the time being, I only measure gcBytes, which is tracked by NewGCArena().)

  Fennec:
    - GC Call: 168 times
    - Max gcBytes: 9,724 KB
    - Total Allocated Arena Size: 317,276 KB
    - SunSpider Result: 111,785.8 ms +/- 2.3%

  Fennec with this patch:
    - GC Call: 445 times
    - Max gcBytes: 1,536 KB
    - Total Allocated Arena Size: 275,548 KB
    - SunSpider Result: 122,376.4 ms +/- 2.1%

SUT
  - N810 device.
  - Fennec Version: 0.9 pre version.
  - Connection: Wireless LAN 

As the results above show, the patch makes SpiderMonkey around 10% slower, but it caps peak memory and reduces memory consumption.

I know this patch has many problems, but if we can reduce the GC execution overhead, it will be helpful to Fennec. I would appreciate any advice or thoughts on my idea.

Reproducible: Always
Attached patch Patch-v0.1 (obsolete) — Splinter Review
Attached image Description of idea (obsolete) —
Assignee: nobody → general
Component: General → JavaScript Engine
Product: Fennec → Core
QA Contact: general → general
Flags: wanted1.9.1?
Attachment #340115 - Flags: superreview?(brendan)
Attachment #340115 - Flags: review?(shaver)
Please don't add tabs to Mozilla source files. Thanks.

Cc'ing folks who may have thoughts on the idea here.

/be
Gyu-Young, exactly what are you measuring when you report "Total Allocated Arena Size"?
It seems there are two major memory-allocation functions in SpiderMonkey:
NewGCArena() and JS_malloc().

Memory allocated by NewGCArena() is managed by the GC directly, but it only supports fixed-size allocations.

JS_malloc(), on the other hand, allocates variable-size memory, and memory allocated by JS_malloc() is only influenced by the GC indirectly.

For the time being, I only measured memory allocated by NewGCArena(). By "Total Allocated Arena Size" I mean the total memory allocated during the SunSpider test.
Attached file Patch-v0.1 (obsolete) —
From now on, I will not use tabs in source code. Thanks, brendan.
Attachment #340115 - Attachment is obsolete: true
Attached patch Patch-v0.1 (obsolete) — Splinter Review
Attachment #340518 - Attachment is obsolete: true
> By "Total Allocated Arena Size" I mean the total memory allocated during
> the SunSpider test.

Robin:
Sorry, I would like to correct the sentence above:

By "Total Allocated Arena Size" I mean the total memory that was allocated by NewGCArena() during the SunSpider test.
Gyu-Young,

We need a high water mark of memory use during SunSpider, not the sum of the total allocations.

For example, suppose a series of GCs result in the following allocation schemes:

#1: alloc 5, free 4, alloc 5, free 5, alloc 5, free 6
#2: alloc 6, free 6

In scheme #1, the "total allocated arena size", as I believe you calculate it, is 15.  That's 5 + 5 + 5.  In scheme #2, it is 6.  But, in both cases, no more than 6 units of memory were ever consumed.  That is, the high water mark was 6 units, in both scheme #1 and scheme #2.

Consider also this GC allocation scheme:

#3: alloc 10, free 10

and compare it with #1.  #1 has a "total allocated arena size" of 15, whereas #3 has 10.  But, #3 uses much more memory (10 units) than #1 (which uses 6 units).

So, there is not much connection between "total allocated arena size" and the memory used at any one given time.  If I'm mistaken about how you calculate "total allocated arena size", then please enlighten me.

The idea of using more frequent garbage collection to reduce memory usage certainly makes sense.  We need to collect a high water mark of memory use, to measure your algorithm's effectiveness.
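The distinction can be checked by replaying the schemes above. This is illustrative code only, not part of any patch:

```c
#include <stddef.h>

/* Replay an alloc/free trace: positive entries allocate that many units,
 * negative entries free them. Returns the high water mark of live memory
 * and stores the sum of all allocations in *total. */
static int replay(const int *ops, size_t n, int *total)
{
    int live = 0, hwm = 0;
    *total = 0;
    for (size_t i = 0; i < n; i++) {
        if (ops[i] > 0)
            *total += ops[i];      /* "total allocated" counts only allocs */
        live += ops[i];
        if (live > hwm)
            hwm = live;            /* peak of memory actually in use */
    }
    return hwm;
}
```

For scheme #1 this gives a total of 15 but a high water mark of 6; for scheme #2 both are 6; for scheme #3 the total is 10 and the high water mark is 10 — exactly the divergence described above.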
Robin:

Thank you for your advice.

I understand the high water mark you mentioned as the peak of memory consumption. I agree that a high water mark is more important than total memory consumption. I already mentioned a high water mark in comment #1, but it seems I didn't express it clearly.

In comment #1, "Max gcBytes" means the high water mark of memory allocated by NewGCArena().

 Fennec:
    - Max gcBytes: 9,724 KB  
      (9,724 KB means there were 2,431 arenas (4 KB per arena) at the high water mark)

 Fennec with this patch:
    - Max gcBytes: 1,536 KB 
      (1,536 KB means there were 384 arenas (4 KB per arena) at the high water mark)

SpiderMonkey has a counter variable for measuring arena memory consumption, as below:

In NewGCArena() of jsgc.cpp, 
   963: rt->gcBytes += GC_ARENA_SIZE (4 KB)

In DestroyGCArenas() of jsgc.cpp, 
   981: rt->gcBytes -= GC_ARENA_SIZE (4 KB)

So, I measured the maximum value of rt->gcBytes. Whenever NewGCArena() is called, I check whether rt->gcBytes is bigger than before, in order to measure the peak of rt->gcBytes during the SunSpider test.
(I haven't found any memory profilers for the N810 yet; if you know of any, please let me know.)

If I don't still understand what you meant correctly, please let me know.

p.s.
>> If I'm mistaken about how you calculate "total allocated arena size", then please enlighten me.

I counted NewGCArena() executions in order to measure the "Total Allocated Arena Size" during the SunSpider test, because each NewGCArena() call consumes 4 KB. I just wanted to show how much this patch can reduce memory usage during the SunSpider test, together with the high water mark (Max gcBytes).
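The peak tracking described above amounts to keeping a running maximum alongside the two counters. A sketch with toy globals, not the actual instrumented jsgc.cpp:

```c
#include <stddef.h>

#define GC_ARENA_SIZE (4 * 1024)

static size_t gcBytes;      /* mirrors rt->gcBytes */
static size_t gcBytesPeak;  /* the added high-water-mark instrumentation */

/* Mirrors the increment in NewGCArena(), plus the peak check. */
static void new_gc_arena(void)
{
    gcBytes += GC_ARENA_SIZE;
    if (gcBytes > gcBytesPeak)
        gcBytesPeak = gcBytes;
}

/* Mirrors the decrement in DestroyGCArenas(). */
static void destroy_gc_arena(void)
{
    gcBytes -= GC_ARENA_SIZE;
}
```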
Gyu-Young,

I mistook your "Max gcBytes" statistic for the new "gcMaxBytes" value which you introduced.  Sorry.  Using the high water mark of gcBytes does achieve what I suggested - it is a high water mark of memory use.

I would argue that reporting "Total Allocated Arena Size" at all is not useful. Worse, it might deceive. It is not a good metric for describing the value of your patch. Your other 3 metrics are good, IMO.

In the future, I think it would be useful to be able to dynamically enable the patch's behaviour, but not only for Fennec.  It might be useful for all Firefox users, say, to be able to make this kind of space-time tradeoff.  I wonder if it's worth coding that way, now?
Gyu-Young,

I also think it might be nice to see your 3 stats (# GCs, max gcBytes, runtime) not just for SunSpider, but for each of its component tests.

This would be to avoid the possibility that there is just one test within SunSpider which dominates and determines the max gcBytes for the whole run.  In fact, this would not surprise me, though I don't know which of the tests it would be.

I know that collecting these stats is lots of work, but I do think that it would give a better idea of the value of your patch.  In other words, I would want to know if the patch benefits all of the SunSpider tests equally, or if all but one of them do not benefit at all while one benefits greatly.
Robin:

I've attached the SunSpider results. As you mentioned, there are big differences between the two in some cases, especially the crypto aes and string tagcloud tests.
 - aes:      *1.35x as slow*  1975.6ms +/- 3.3%  2664.0ms +/- 67.8% significant
 - tagcloud: *1.29x as slow*  4627.2ms +/- 4.8%  5956.2ms +/- 66.4% significant
 
Fennec:
 - SunSpider Result: 107,982.6 ms +/- 3.9%
 - GC #: 149 times
 - Maximum rt->gcBytes (a high water mark): 10,772 KB 

Fennec with this patch:
 - SunSpider Result: 115,121.4 ms +/- 0.7%
 - GC #: 496 times
 - Maximum rt->gcBytes (a high water mark): 1,536 KB

SUT
 - HW: Nokia N810
 - Source Code: pulled today from http://hg.mozilla.org/mozilla-central and http://hg.mozilla.org/mobile-browser.

>> In the future, I think it would be useful to be able to dynamically enable the patch's behaviour, but not only for Fennec.

If we can make it toggleable in about:config, like the JIT, I also think that would be useful.

>> I wonder if it's worth coding that way, now?
 
I made this patch in order to try out my idea.
One key point of my idea is to look for free memory via GC before allocating a new arena. The idea also needs to calculate the free-memory ratio across the arena lists in order to decide the threshold on maximum memory usage.
So, I implemented the free-memory-ratio calculation in js_GC().
(Of course, I think this patch needs to be discussed and modified; that is why I'm attaching it as v0.1.)

BTW, we can simply change the GC trigger in js/src/xpconnect/src/xpcruntimesvc.cpp:
   236: const uint32 gGCSize = 32L * 1024L * 1024L; /* pref? */
   ...
   274: JS_SetGCParameter(mRuntime, JSGC_MAX_BYTES, 0xffffffff);

The source code above determines the maximum threshold, since it sets rt->gcMaxBytes and rt->gcMaxMallocBytes. This seems like a rather coarse condition to me, so I think my idea is better than this for Fennec.
It is very interesting that the aes and tagcloud tests run 35% and 29% faster *with* the patch. This suggests that those tests are written in such a way that they produce a lot of garbage (unreferenced objects) quickly. More frequent GC reduces memory consumption so much that the tests run much faster. I wonder if others know how much those 2 tests are dominated by garbage?

Is it easy to get a "Max gcBytes" value for each of the tests, to get an idea of the high water mark of memory usage?
>> Is it easy to get a "Max gcBytes" value for each of the tests, to get an idea
>> of the high water mark of memory usage?

In order to measure "Max rt->gcBytes" for each of the tests, I ran the SunSpider test in the SpiderMonkey shell. Please find my test results attached.

However, I noticed a performance difference between Fennec and the SpiderMonkey shell.

In particular, the following two test cases run faster with this patch than without:
7	bitops-bitwise-and.js	 	2,301 	   2,099      1.09x as slow
8	bitops-nsieve-bits.js	 	4,609 	   4,056      1.13x as slow 

The SpiderMonkey shell with this patch is a little faster in comparison with the original.
(I ran each test 3 times.)

I guess the reason is that this patch reduces GC execution time, because far fewer arenas need to be scanned during each GC.

SpiderMonkey shell:
  - Sunspider Result : 100,577 ms
  - GC # : 26
  - Maximum rt->gcBytes : 11,892 KB

SpiderMonkey shell with this patch:
  - Sunspider Result : 100,559 ms
  - GC # : 52
  - Maximum rt->gcBytes : 3,072 KB

SUT
 - H/W: Nokia N810
 - # of Test Runs: 3 per test.
 - Test: In order to reproduce the same environment, I modified js.cpp as below:

   3893:   #ifdef NS_OSSO
                 rt = JS_NewRuntime(32L * 1024L * 1024L);
                 JS_SetGCParameter(rt, JSGC_MAX_BYTES, 0x00180000);
   3896:   #else
                 rt = JS_NewRuntime(64L * 1024L * 1024L);
   3898:   #endif
It looks like, on your Nokia N810, the runtime is essentially unchanged with your patch.  But, the high water mark of memory consumption is drastically reduced on each test.  Further, there are no "bad cases" of runtime increase due to the patch on any of the SunSpider tests.

The stats are a strong argument in favour of the patch.  The basic idea is a really good one.  Good work, Gyu-Young.

Note that I haven't reviewed the patch per se.  There may be better ways to achieve the same result.

I hope others will pay this patch the attention it deserves.  I think that the various thresholds involved should be tunable via about:config, eventually.
There are some style issues with Patch-v0.1:

a) We don't use Hungarian notation in spidermonkey, or anything like it (no "g" for your global consts as in gGCExpandMaxByte, for example, and no "n" for numbers(?) as in nArenaCount).

b) Make sure your spacing matches the spacing around it.  Variable names in structs should be spaced away from the type-name identically to the structs nearby.  Look around for examples.

c) Indentation like this is wrong:
+            if (rt->gcBytes > rt->gcMaxBytes && 
+                    rt->gcCallCountForNormal < GC_CALL_LIMIT && 
+                    rt->gcPoke) {
+                        rt->nArenaKind = flindex;

You need to make your code match the code around it.

+            if (rt->gcBytes > rt->gcMaxBytes &&
+                rt->gcCallCountForNormal < GC_CALL_LIMIT &&
+                rt->gcPoke) {
+                rt->nArenaKind = flindex;

And here:

+        if (*doubleFlags != (jsbitmap) -1) {
+#ifdef NS_OSSO
+        rt->gcCallCountForDouble = 0;

Should be:

+        if (*doubleFlags != (jsbitmap) -1) {
+#ifdef NS_OSSO
+            rt->gcCallCountForDouble = 0;


d) Please make sure your patch does not add whitespace at the end of lines.

e) This line exceeds even the new 99 column limit, make sure you split it across multiple lines in a style that matches nearby code:
0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
+            rt->gcArenaRatio = (int)(rt->nFreeThing * 100)/(rt->nArenaCount * THINGS_PER_ARENA(arenaList->thingSize));

f) Spaces around operators (including casts), to match surrounding code:
+        rt->gcArenaRatio = (int)(rt->nFreeThing * 100)/(rt->nArenaCount * DOUBLES_PER_ARENA);

Should be:
+        rt->gcArenaRatio = (int) (rt->nFreeThing * 100) / (rt->nArenaCount * DOUBLES_PER_ARENA);

What does "NS_OSSO" mean?  What are the consequences of taking this patch without the preprocessor guards?  (ie., just changing to this for all platforms)

This needs Igor or Brendan's review, as well.
Attached patch Patch-v0.1 (obsolete) — Splinter Review
Thank you for your advice and help. I tried to modify my patch according to your comments. If this patch still has nits, please let me know.

>> What does "NS_OSSO" mean? 

It seems to me NS_OSSO is a preprocessor define for the Nokia Maemo platform; some source files already use it for Fennec. I used it for the time being because I haven't found a more appropriate guard for this patch yet.
 
>> What are the consequences of taking this patch
>> without the preprocessor guards?  (ie., just changing to this for all platforms)

I don't know whether this patch can land without the preprocessor guard, because it still makes Fennec and Firefox a bit slower.

Igor:

Could you give me your advice on this idea?
Attachment #340519 - Attachment is obsolete: true
Comment on attachment 342024 [details] [diff] [review]
Patch-v0.1

A quick remark:

>@@ -3585,6 +3657,15 @@ js_GC(JSContext *cx, JSGCInvocationKind 
>             METER(nkilledarenas++);
>         } else {
>             ap = &a->prev;
>+#ifdef NS_OSSO
>+            rt->arenaCount++;
>+            if (rt->arenaKind == GC_DOUBLE_KIND) {
>+                for (i = 0; i != DOUBLES_PER_ARENA; ++i) {
>+                    if (!IsMarkedDouble(a, i))
>+                        rt->freeThing++;
>+                }
>+            }
>+#endif

Avoid this scan-counting; rather, count the doubles in the arena while marking them, using a counter in the arena instead of the current flag for double presence. A typical pattern with doubles is a burst of double allocations -> GC -> a few arenas with only a few holes among the double cells. So this counting just wastes CPU time and trashes the caches.
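Igor's suggestion amounts to maintaining the count in the mark pass itself instead of re-scanning the mark bits afterwards. A hedged sketch with toy types (ToyArena, mark_double), not the real arena layout:

```c
#include <stddef.h>

#define DOUBLES_PER_ARENA 128   /* illustrative; the real value depends on the arena layout */

typedef struct {
    unsigned char marked[DOUBLES_PER_ARENA];  /* stand-in for the mark bitmap */
    unsigned liveDoubles;                     /* counter maintained while marking */
} ToyArena;

/* Counting happens in the same pass that sets the mark bit, so no extra
 * sweep over the arena (and no extra cache traffic) is needed later. */
static void mark_double(ToyArena *a, size_t i)
{
    if (!a->marked[i]) {
        a->marked[i] = 1;
        a->liveDoubles++;
    }
}

static unsigned free_doubles(const ToyArena *a)
{
    return DOUBLES_PER_ARENA - a->liveDoubles;
}
```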
(In reply to comment #18)
> Igor:
> 
> Could you give me your advice about this idea ?

I think the current patch spends too much effort on GC-arena heuristics but ignores the malloc accounting. The idea behind JS_malloc is not only to have a function that auto-reports out-of-memory errors, but also to count the bytes that are stored inside GC things and can be freed only by running the GC. But since JS_free does not take a size argument, this accounting can only be used as a heuristic. This may lead to a bad situation where a script allocates a few strings with huge character arrays, sucking all memory out of the system.
Igor:

Thank you for your advice. 

>> I think the current patch spends too much efforts on GC-arena heuristics
>> but ignores the malloc accounting.

I know that JS_malloc() already has malloc counting, which can forecast memory use alongside the bytes tracked by NewGCArena().

However, I wanted to manage the maximum memory-usage threshold dynamically, in order to keep the engine from occupying too much memory. So, I thought it was better to use the free-memory ratio for dynamically increasing/decreasing that threshold. Thus, I did not use the JS_malloc count but used gcBytes (from NewGCArena()) directly.

Anyway, as you mentioned, I agree that the calculation can waste CPU time and trash the caches.
So, I'm trying to find an appropriate implementation of this idea without the unnecessary calculation.

By the way, I wonder what you think about this idea in a mobile environment.

p.s.
>> This may lead to a bad situation where a script allocates a few strings
>> with huge character arrays, sucking all memory out of the system.

I'm not sure whether that problem is caused by my patch or not.
(In reply to comment #21)

> So, I thought it was better to use
> a free memory's ratio for increasing/decreasing the threshold related to memory
> usage dynamically. Thus, I do not use JS_malloc count but use
> gcBytes(NewGCArena) directly. 

The problem with the current implementation (which the suggested patch does not fix) is that gcBytes does not account for all bytes allocated for GC things. For example, the string chars and arrays to store properties of objects are allocated using malloc. Similarly, malloc/new is heavily used to implement various DOM wrappers exposing the native objects into the JS world. Such memory is not freed until the GC frees the correspondent GC things and can contribute substantially to the total number of bytes that a script can allocate. 

Yet the only contribution of these extra malloc bytes to the decisions about running the GC is a simple heuristic on the number of malloced bytes since the last GC. This number may have nothing to do with the total malloc size, which can be arbitrarily larger than the number of bytes malloced since the last GC.

Hence my problem with the patch: it focuses just on GC arena bytes by adding many new heuristics, but it completely ignores the malloc accounting, which may well be as relevant for the real web as gcBytes. This is too one-sided IMO. It would be nice to have a version of JS_free that takes a size argument, so at least some insight into the typical behavior of the total malloced size can be gained.

Alternatively, I would like to see a patch that focuses just on improving the accounting for doubles. It would be interesting to know how changing just that part affects the benchmarks.

> By the way, I wonder how do you think about this idea in mobile environment.

The current heuristics in the engine come from a desktop perspective, so it is not surprising that alternatives work better on mobile.
Attached patch Patch-v0.2 (obsolete) — Splinter Review
I tried to modify my patch according to your advice:
 - Do not ignore the malloc accounting.
 - Do not spend too much effort on GC-arena heuristics.

 (If I have misunderstood what you meant, please let me know.)

The key of my idea is to adjust the memory-usage thresholds (rt->gcMaxBytes and rt->gcMaxMallocBytes) dynamically. In order to accomplish that, it is important to predict how much memory SpiderMonkey will use in the near future. My previous patch made use of the free-memory ratio to predict it, because I think more live things (i.e., memory cells in use) mean heavier work for SpiderMonkey, and therefore the thresholds need to be increased. However, that involves too many heuristics, so I changed to a simpler way of prediction in this new patch.

In order to forecast SpiderMonkey's memory usage, the new patch calculates the difference in gcBytes across a GC, i.e., the memory returned to the system right after the GC. It is difficult to calculate the corresponding difference for the malloc accounting used by JS_malloc() unless JS_free() takes a size parameter. It seems we would need a header field recording the size of each memory cell, or a memory allocation table, to know the freed size in JS_free(), which would make SpiderMonkey more complex. Thus, I opted to use gcBytes.

In the new patch, I build a condition using average cell utilization, as below:

  if (rt->gcGCBeforeBytes - rt->gcLastBytes < (int) (gcCommonDiff * 0.3)) {...}
  else if (rt->gcGCBeforeBytes - rt->gcLastBytes > (int) (gcCommonDiff * 3)) {...}

I set a basic unit of 512 KB for increasing/decreasing the thresholds. According to the attached stats, average cell utilization is about 30% during the SunSpider test, so I use both 512 KB and the 30% utilization in the condition. If the difference in gcBytes across a GC is bigger than 1.5 MB, we can assume the GC arenas hold only about 512 KB of live GC cells, and therefore we can reduce the thresholds (rt->gcMaxBytes and rt->gcMaxMallocBytes) by 512 KB. However, I haven't found a solid rationale for increasing the thresholds yet; for the time being, I use 512 KB * 30%. I still need to find an appropriate heuristic for increasing/decreasing the thresholds.
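Under the assumptions above (taking gcCommonDiff as the 512 KB unit), the adjustment rule can be sketched as follows. The function name and parameters are hypothetical illustrations, not the patch's actual code:

```c
#define STEP_BYTES (512L * 1024L)   /* the basic 512 KB adjustment unit */

/* Adjust the threshold after a GC, based on how many bytes the GC freed. */
static long adjust_threshold(long maxBytes, long freedByGC)
{
    long gcCommonDiff = STEP_BYTES;   /* assumed common-difference unit */

    if (freedByGC < (long) (gcCommonDiff * 0.3))
        maxBytes += STEP_BYTES;       /* little garbage: raise the ceiling */
    else if (freedByGC > gcCommonDiff * 3)
        maxBytes -= STEP_BYTES;       /* lots of garbage (> 1.5 MB): lower it */
    return maxBytes;
}
```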

The results below are from SunSpider. There is hardly any performance difference, though this patch seems a little slower (about 3~5%) than normal on average.

Test Result
 
  - Current SpiderMonkey
    * SunSpider Result : 116,044.4 ms +/- 2.3%
    * GC # : 148 times
    * A high water mark of rt->gcBytes : 11,664 KB
    * A high water mark of rt->gcMallocBytes : 3,162 KB
    * Average Cell Utilization : 17.2 %
    
 - This patch
    * SunSpider Result :116,932.8 ms +/- 1.2%
    * GC # : 304 times
    * A high water mark of rt->gcBytes : 3,072 KB
    * A high water mark of rt->gcMallocBytes : 2,411 KB
    * Average Cell Utilization : 29.1 %

SUT
 - HW : Nokia N810
 - Test Date : 2008-10-14
Attachment #342024 - Attachment is obsolete: true
I turned on the JS_GCMETER preprocessor define to trace GC allocation stats. According to these stats, this patch makes average cell utilization about 12 percentage points higher than normal.
Assignee: general → gyuyoung.kim
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment on attachment 343161 [details] [diff] [review]
Patch-v0.2

Igor, could you give some more guidance if not review? Thanks,

/be
Attachment #343161 - Flags: review?(igor)
Comment on attachment 343161 [details] [diff] [review]
Patch-v0.2

>@@ -1823,7 +1829,12 @@ js_NewGCThing(JSContext *cx, uintN flags
>         return NULL;
>     }
> 
>+#ifndef JS_GC_MOBILE
>     doGC = (rt->gcMallocBytes >= rt->gcMaxMallocBytes && rt->gcPoke);
>+#else
>+    doGC = ((rt->gcMallocBytes >= rt->gcMaxMallocBytes || rt->gcBytes > rt->gcMaxBytes) &&
>+             rt->gcPoke);
>+#endif

Currently the condition rt->gcBytes > rt->gcMaxBytes is checked in NewGCArena, and the function returns null when it is true. The reason behind this is that, as long as there are unused cells in the allocated arenas, gcBytes cannot increase. So why is it necessary to check the condition here?
(In reply to comment #23)
> Key of my idea is to adjust thresholds of memory usage(rt->gcMaxBytes and
> rt->gcMaxMallocBytes) dynamically.

Hm, this sounds similar to what the browser is already doing when it schedules the GC when gcBytes exceeds gcLastBytes by a certain factor (16 in the current implementation). See also bug 453432, which adds this GC scheduling to the engine itself and allows the factor to be controlled at runtime.
>> So why is it necessary to check for the condition here?

I wanted to check both rt->gcMallocBytes and rt->gcBytes again, to limit memory usage more strictly. As you mentioned in the comment above, it looks like unnecessary work; that was my mistake.

>> this sounds similar to what the browser is already doing when it schedules
>> the GC when gcBytes exceeds gcLastBytes by a certain factor (16 in the current
>> implementation).

It seems the IsGCThresholdReached() function implements something similar to my idea. The difference between them is the way of predicting the threshold: my patch makes use of the difference in gcBytes across a GC, with the thresholds (gcCommonDiff * 3 and * 0.3), whereas IsGCThresholdReached() uses the difference between the last GC's arena bytes and the current arena bytes, plus a trigger factor, to check whether the threshold is reached. I don't know which is the better solution; for now, the patch on bug 453432 seems to be a better implementation than mine.

In my opinion, the trigger factor still comes from a desktop perspective (1,600%). I wonder how you decided on that factor?
Attached patch Patch based on Bug 453432 (obsolete) — Splinter Review
I tried to implement my idea on top of bug 453432's patch; I modified it as attached. When I set the trigger value to 300%, the result is similar to the previous one.

 -  On bug 453432's patch

    * SunSpider Result : 111,033.0 ms +/- 6.4%
    * GC # : 143  
    * A high water mark of rt->gcBytes : 11,968 KB
    * A high water mark of rt->gcMallocBytes : 2,409 KB
  
 - Patch attached on bug 453432's patch

    * SunSpider Result :115,744.8 ms +/- 2.0%
    * GC # :  316 
    * A high water mark of rt->gcBytes : 3,072 KB
    * A high water mark of rt->gcMallocBytes : 2,410 KB

SUT
 - HW : Nokia N810
    CPU : TI OMAP 2420 400 MHz
    RAM : 128 MB
 - Browser : Fennec 1.0 a1 

I need to further test/consider about this patch, but it seems to me this patch can implement my idea as well.
(In reply to comment #29)
> In my opnion, the trgger factor still comes from desktop perspective (1,600%).
> I wonder how did you decide on the factor(1,600%) ?

That was simple - running the SunSpider benchmarks in the browser, see bug 417052 comment 17.
I ran the SunSpider test at each trigger factor, based on bug 453432's patch. The results are as below:

T.F    GC #    peak of gcBytes  peak of gcMallocBytes     SunSpider Result 
=====  ======  ===============  ====================== =======================
100 %           ( Fennec can't be run because of too many GC invocations )
300 %   295      3,928 KB           2,409 KB             121,899.0ms +/- 1.7% 
400 %   233      4,686 KB           2,958 KB             119,175.4ms +/- 6.8%
600 %   189      5,136 KB           3,162 KB             121,736.6ms +/- 11.8% 
800 %   168      6,048 KB           2,409 KB             115,558.4ms +/- 2.0% 
1000 %  156      7,840 KB           3,161 KB             118,986.0ms +/- 2.6% 
1200 %  150      9,172 KB           3,161 KB             113,915.8ms +/- 2.0% 
1600 %  148      12,080 KB          3,160 KB             118,480.4ms +/- 5.5%   

SUT
 - HW : Nokia N810
    CPU : TI OMAP 2420 400 MHz
    RAM : 128 MB
 - Browser : Fennec 1.0 a1
 - Connection : Wireless LAN
 - I only changed the trigger factor in JS_SetGCParameter().
    JS_SetGCParameter(mJSRuntime, JSGC_TRIGGER_FACTOR, "value");

The SunSpider results are not fully consistent. However, in general, the run time tends to increase as the trigger factor decreases. (There is about a 2~5% difference between 300% and 1,600%.)

The results above show a coherent relationship between GC frequency and peak memory. According to my previous tests, I think 300% is the best trigger factor for mobile.
Also, the out-of-memory checks were already expanded for mobile, so I think we do not need to cap the thresholds (rt->gcMaxBytes and rt->gcMaxMallocBytes) at specific values.

If bug 453432's patch is merged into trunk, the source code can be modified as below:

In XPCJSRuntime::XPCJSRuntime() of js/src/xpconnect/src/xpcjsruntime.cpp

+#ifndef JS_GCMOBILE
+        // GC will be called when gcBytes is 1600% of gcLastBytes.
+        JS_SetGCParameter(mJSRuntime, JSGC_TRIGGER_FACTOR, 1600);
+#else
+        // GC will be called when gcBytes is 300% of gcLastBytes.
+        JS_SetGCParameter(mJSRuntime, JSGC_TRIGGER_FACTOR, 300);
+#endif

This could be turned on/off in about:config, or we could let the user set the trigger factor in about:config directly.
So far, I have looked at how high the high water mark of memory usage is and how the GC frequency should be set for Fennec. On mobile, as you know, there are many different devices with different hardware environments, and the heap memory Fennec can use also differs per device. Thus, I think the best way to set appropriate memory-usage limits and GC frequency is via about:config for each device environment.

I made a patch, dependent on bug 453432, with which the high water mark and GC frequency can be adjusted via about:config.
(I think bug 453432 already implements my idea that the high-water-mark value changes dynamically during the browser's runtime.)
Attachment #343389 - Attachment is obsolete: true
Attachment #343161 - Flags: review?(igor)
Attachment #344245 - Flags: review?(igor)
Attachment #344245 - Flags: superreview?(brendan)
Depends on: 453432
Igor: 

I wonder how bug 453432 is progressing?
And I'm waiting for your comments on my last patch.
(In reply to comment #34)
> Igor: 
> 
> I wonder how bug 453432 is progressing?
> And I'm waiting for your comments on my last patch.

I will do the review after landing a reviewed patch from the bug 453432.
Flags: wanted-fennec1.0+
Blocks: 459117
Dave, this is the adaptive GC patch I was talking about.

/be
Igor: review ping.

/be
Let me try to summarize. It appears that the patch for bug 453432 does 2 things:

  - Move the place where we decide to run a GC to the JS allocator. (This makes a lot of sense.)
  - Consolidate the criteria for running GC and make them user-tunable. The criteria are: Run a GC when either of these is true:

      bytes allocated by JS_malloc since last GC > X
      arena bytes / max(arena bytes since last GC, 8k) > Y

            where X and Y are set by the user, and in current FF,
                   X = 32M, Y = 16
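Under those definitions, the combined check can be sketched as below. The function name and the exact arithmetic are my own illustration of the summarized criteria, not bug 453432's actual code:

```c
/* Run a GC when either consolidated criterion holds:
 *  - malloced bytes since the last GC exceed X, or
 *  - arena bytes divided by max(arena bytes at last GC, 8k) exceed Y. */
static int should_gc(unsigned long mallocedSinceGC,
                     unsigned long gcBytes, unsigned long gcLastBytes,
                     unsigned long X, unsigned long Y)
{
    unsigned long base = gcLastBytes > 8192 ? gcLastBytes : 8192;
    return mallocedSinceGC > X || gcBytes / base > Y;
}
```

With X = 32M and Y = 16 this reproduces the then-current Firefox behavior; lowering Y (e.g. to 3, as in gyuyoung's N810 experiments) triggers GC much earlier.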

IMO, the criteria are bogus but definitively fixing them is major work for post-3.1.


This patch, in its latest form, adds browser prefs for X and Y so that they can be tuned for the N810. 

Also, gyuyoung's experiments have shown that on the N810, total SS run time differences for Y in the range [3, 16] are of the magnitude of noise, and that the lower the number the better for maximum memory usage, which is important for the N810. He also found that for Y = 3, X could be infinity and it really doesn't matter (on SS). 


Did I get that right? If so, I think the key is to hurry up and land bug 453432: between this bug and bug 468880, we know that GC scheduling is currently unacceptable for current/upcoming applications without tuning, and apparently bug 453432 is blocking all GC tuning.
(In reply to comment #37)
> Igor: review ping.

See comment 35: due to the extreme sensitivity of the current mochitest, unit, etc. tests to the details of GC scheduling, fixing bug 453432 takes time.
It seems this patch can now be adapted to mozilla-central because bug 474801's fix has landed. So I ran SunSpider with this patch on a Linux PC, with javascript.options.jit.content turned on.

Trigger  GC      Peak of      Peak of           SunSpider
Factor   Count   gcBytes      gcMallocBytes     Result
=======  =====   ==========   =============     ====================
 100%    Firefox can't run SunSpider because of too many GC invocations
 200%    130       3,592 KB       2,459 KB      1,850.0 ms +/- 2.5%
 300%     91       4,140 KB       2,798 KB      1,747.2 ms +/- 2.9%
 400%     67       5,808 KB       2,797 KB      1,733.0 ms +/- 3.6%
 600%     58       6,544 KB       2,738 KB      1,642.2 ms +/- 2.4%
 800%     57      10,592 KB       3,144 KB      1,700.8 ms +/- 5.0%
1000%     60      15,360 KB       3,201 KB      1,659.2 ms +/- 4.4%
1200%     60      18,528 KB       3,083 KB      1,672.0 ms +/- 5.5%
1400%     59      21,504 KB       3,196 KB      1,725.2 ms +/- 4.0%
1600%     58       5,944 KB       3,170 KB      1,717.2 ms +/- 14.8%

System Under Test
 - CPU : Intel(R) Core(TM)2 Duo CPU  E7300  @ 2.66GHz
 - RAM : 2GB
 - OS : Ubuntu 8.04
 - Test Source Code : mozilla-central on 30-Jan-2009.

As you can see from the results, there is one strange thing: when the trigger factor is set to 1600%, the peak of gcBytes is much smaller than at 1400%. However, the peak of gcBytes was similar to 1400% when jit.content was turned off. I don't know why jit.content affects the peak of gcBytes; I'm trying to find the reason.

Anyway, in my opinion, this patch can still control GC invocation frequency and limit peak memory usage.

If I made any mistakes, please let me know. Thank you.
Gyuyoung, it looks like bug 453432 is going to be delayed for a while. So you don't need to target your patch to go on top of that bug. 

We really want this patch, so could you verify that your pre-453432 patch (attachment 343164 [details], perhaps?) works on tip and then put that up for review by brendan? We'll try to expedite the review process.
Flags: wanted1.9.1? → wanted1.9.1+
Attachment #340116 - Attachment is obsolete: true
Attachment #341082 - Attachment is obsolete: true
Attachment #341380 - Attachment is obsolete: true
Attachment #343161 - Attachment is obsolete: true
Attachment #343162 - Attachment is obsolete: true
Attachment #343164 - Attachment is obsolete: true
My previous patches don't need to be maintained further because they have some problems, so I changed their status to "obsolete".

Now I am focusing only on my last patch. It can control the GC frequency and a high water mark of TraceMonkey's memory usage via the about:config page. In the future, Mozilla-based browsers will run on various devices, so in my opinion they will need this patch.

This patch can now be merged into mozilla-central; I succeeded in applying it to the latest mozilla-central trunk. If my patch has any nits, please let me know.

Thanks.
Summary: GC call optimization of SpiderMonkey for Fennec → Control GC frequency/a high water mark of Tracemonkey via about:config
I changed the title of this bug to something more concrete.
Attachment #344245 - Flags: superreview?(brendan)
Attachment #344245 - Flags: review?(igor)
Attachment #344245 - Flags: review?(brendan)
Bug 453432 is RESOLVED WONTFIX. Gyuyoung, this seems to require a new patch here. Or is it a bigger problem?

/be
Attachment #344245 - Attachment description: Patch-v0.3 depends on Bug 453432 → Patch-v0.3 does not depend on Bug 453432
Attachment #344245 - Flags: superreview?(mrbkap)
Attachment #344245 - Flags: review?(brendan)
Attachment #344245 - Flags: review+
Comment on attachment 344245 [details] [diff] [review]
Patch-v0.3 does not depend on Bug 453432

>diff -r f5593ad6dcc0 dom/src/base/nsJSEnvironment.cpp
>+static int
>+SetMemoryHighWaterMarkPrefChangedCallback(const char* aPrefName, void* aClosure)
>+{
>+  PRInt32 highwatermark = nsContentUtils::GetIntPref(aPrefName, 32);
>+  if (highwatermark >= 32) {
>+    JS_SetGCParameter(nsJSRuntime::sRuntime, JSGC_MAX_MALLOC_BYTES,
>+                      highwatermark * 1024L * 1024L);
>+    JS_SetGCParameter(nsJSRuntime::sRuntime, JSGC_MAX_BYTES,
>+                      0xffffffff);
>+  } else {
>+    JS_SetGCParameter(nsJSRuntime::sRuntime, JSGC_MAX_MALLOC_BYTES,
>+                      highwatermark * 1024L * 1024L);

You're setting MAX_MALLOC_BYTES to the same thing in both arms of the conditional. Please move it out in front of the "if".

I'd also appreciate a comment on why MAX_BYTES is clamped to 0xffffffff and MAX_MALLOC_BYTES is not since it's not at all clear from reading the code here.

sr=mrbkap to get this in the tree and unblock mobile, but please file a followup bug on the bizarre and unclear relationship between MAX_MALLOC_BYTES and MAX_BYTES (do we need 2 prefs for those?).
Attachment #344245 - Flags: superreview?(mrbkap) → superreview+
(In reply to comment #44)
> Bug 453432 is RESOLVED WONTFIX. Gyuyoung, this seems to require a new patch
> here. Or is it a bigger problem?
> 
> /be

This patch needs the code related to "rt->gcTriggerFactor" in order to control the GC frequency. However, that code was already merged into mozilla-central by bug 474801, so it seems this patch can land in mozilla-central regardless of bug 453432.
I made a mistake in my patch. In the "if" block, I wanted to use 32 MB, not highwatermark, as below.

-    JS_SetGCParameter(nsJSRuntime::sRuntime, JSGC_MAX_MALLOC_BYTES,
-                      highwatermark * 1024L * 1024L);

+    JS_SetGCParameter(nsJSRuntime::sRuntime, JSGC_MAX_MALLOC_BYTES,
+                      32L * 1024L * 1024L);

To my knowledge, gcMaxBytes is set to 0xffffffff and gcMaxMallocBytes is set to 32 MB when the JSRuntime is created in XPCJSRuntime::XPCJSRuntime() in xpcjsruntime.cpp. So I limit the first one to 0xffffffff and the second one to 32 MB, even if the user sets the high water mark above 32 MB.

I appended the requested comment to this patch. Please review the patch.
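To make the clamping just described concrete, here is a standalone sketch of how the pref value maps to the JSGC_MAX_MALLOC_BYTES limit. The function name is mine; the real patch applies this value through JS_SetGCParameter in the pref-changed callback.

```cpp
#include <cstdint>

// Sketch only: maps the high-water-mark pref value (in MB) to the
// JSGC_MAX_MALLOC_BYTES limit. Values of 32 MB or more are clamped to
// the 32 MB default set in XPCJSRuntime; smaller values apply directly.
uint64_t MallocHighWaterMarkBytes(int32_t prefMB) {
    const uint64_t MB = 1024 * 1024;
    uint64_t mb = prefMB >= 32 ? 32 : static_cast<uint64_t>(prefMB);
    return mb * MB;
}
```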
(In reply to comment #45)
> sr=mrbkap to get this in the tree and unblock mobile, but please file a
> followup bug on the bizarre and unclear relationship between MAX_MALLOC_BYTES
> and MAX_BYTES (do we need 2 prefs for those?).

I filed a bug: bug 482904 - Unclear relationship between MAX_MALLOC_BYTES and MAX_BYTES.

Thank you.
Comment on attachment 366762 [details] [diff] [review]
Patch depend upon Bug 474801 for check in

Marking review request so it shows up on brendan's radar.
Attachment #366762 - Flags: review?(brendan)
The last patch is the same as the previous patch, which brendan already reviewed, so I didn't request review again. The reason for changing the patch's title is that this patch no longer needs to depend on bug 453432. If I should request review again in situations like this, I will do so.
Attachment #366762 - Flags: superreview?(mrbkap)
Comment on attachment 366762 [details] [diff] [review]
Patch depend upon Bug 474801 for check in

>+    //There are two options of memory usage in tracemonkey. One is
>+    //to use malloc() and the other is to use memory for GC.(E.g. 

Move that space at the end of the above line to before the "(E.g.".

>+  SetMemoryHighWaterMarkPrefChangedCallback("javascript.options.mem.high-water-mark",

Use _ not - in pref names.

r=me with that, would appreciate an sr= from mrbkap.

/be
I modified the patch per brendan's comment.
Attachment #369597 - Flags: superreview?(mrbkap)
Attachment #369597 - Flags: review?(brendan)
Attachment #366762 - Attachment is obsolete: true
Attachment #366762 - Flags: superreview?(mrbkap)
Attachment #366762 - Flags: review?(brendan)
I made a mistake: I didn't change high-water-mark to high_water_mark everywhere.
Attachment #344245 - Attachment is obsolete: true
Attachment #369600 - Flags: superreview?(mrbkap)
Attachment #369600 - Flags: review?(brendan)
Attachment #369597 - Attachment is obsolete: true
Attachment #369597 - Flags: superreview?(mrbkap)
Attachment #369597 - Flags: review?(brendan)
Comment on attachment 369600 [details] [diff] [review]
 Patch depend upon Bug 474801 for check in   

Looks good.
Attachment #369600 - Flags: superreview?(mrbkap) → superreview+
Comment on attachment 369600 [details] [diff] [review]
 Patch depend upon Bug 474801 for check in   

Just noticed gcfrequency could arguably be better named gc_frequency.

/be
Attachment #369600 - Flags: review?(brendan) → review+
OK, I changed gcfrequency to gc_frequency. I didn't request review again; if I need to request review again for the last patch, please let me know.
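For reference, the two prefs with their final underscore names could be set in about:config or a user.js like this. The high_water_mark name is confirmed in the review comments above; the exact prefix of the gc_frequency pref is my assumption, and the values are examples, not recommendations.

```javascript
// user.js sketch -- pref names per the review comments above;
// the javascript.options.mem prefix for gc_frequency is assumed.
user_pref("javascript.options.mem.high_water_mark", 32); // MB; >= 32 keeps defaults
user_pref("javascript.options.mem.gc_frequency", 300);   // trigger factor, percent
```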
Attachment #369930 - Flags: review?(brendan)
Attachment #369930 - Flags: review?(brendan) → review?(mrbkap)
Attachment #369930 - Flags: superreview+
Attachment #369930 - Flags: review?(mrbkap)
Attachment #369930 - Flags: review+
Attachment #369600 - Attachment is obsolete: true
Gyuyoung, if this is ready to be checked in, please add the keyword 'checkin-needed'.
Status: NEW → ASSIGNED
Keywords: checkin-needed
It seems that the path of nsJSEnvironment.cpp was changed
(dom/src/base/nsJSEnvironment.cpp -> dom/base/nsJSEnvironment.cpp).
Do I need to submit the patch again?
It would be preferable if you did so, yes, particularly considering the two months of changes between now and when the patch was submitted.  You might get lucky and get a committer willing to do that work for you, but I wouldn't bet on it -- best to make patches as easy as possible to commit so that they're as likely as possible to be committed.
Somewhat lame that we missed checking this in. This was wanted for 1.9.1. Could someone volunteer pushing this in after the refresh? If not I can push it into TM.
Fortunately, this patch still applies to mozilla-central. Should I request review for the patch again? In my opinion, this patch doesn't need to be reviewed again.
No review needed, just checkin help.

/be
Attachment #382228 - Flags: review+
Attachment #382228 - Flags: superreview+
Comment on attachment 382228 [details] [diff] [review]
Patch - new path of nsJSEnvironment.cpp

>diff -r b34586dde1b5 dom/base/nsJSEnvironment.cpp
>--- a/dom/base/nsJSEnvironment.cpp    Wed Mar 11 15:00:43 2009 +1300
>+++ b/dom/base/nsJSEnvironment.cpp    Wed Mar 11 13:31:47 2009 +0900
>@@ -3756,6 +3756,38 @@ ReportAllJSExceptionsPrefChangedCallback
>   return 0;
> }
> 
>+static int
>+SetMemoryHighWaterMarkPrefChangedCallback(const char* aPrefName, void* aClosure)
>+{
>+  PRInt32 highwatermark = nsContentUtils::GetIntPref(aPrefName, 32);
>+
>+  if (highwatermark >= 32) {
>+    //There are two options of memory usage in tracemonkey. One is
>+    //to use malloc() and the other is to use memory for GC. (E.g.
>+    //js_NewGCThing()/RefillDoubleFreeList()).
>+    //Let's limit the high water mark for the first one to 32MB,
>+    //and second one to 0xffffffff.

Ob. nit: space after // (one space only) and before the words of the comment lines is prevailing style.

Blake, could you help get this in?

/be
Assignee: gyuyoung.kim → nobody
Component: JavaScript Engine → DOM
QA Contact: general → general
Assignee: nobody → gyuyoung.kim
(In reply to comment #61)
> Should I request review for the patch again? In my opinion, this patch
> doesn't need to be reviewed again.

To clarify Brendan's answer, the answer in general depends on your second sentence above.  If changes to make a patch work were non-existent or minimal, no repeated review is necessary.  If a patch needed many changes, or those changes were tricky, you should ask for review again.  Basically it's your call, although (not surprisingly) if your judgment in such matters is shown to be wrong often enough people will be less inclined to trust your judgment and may still request re-reviews.
To Brendan,
I added a space between "//" and the words.

To jeff,
Sorry for my poor expression. I didn't mean to judge it myself.
Attachment #382228 - Attachment is obsolete: true
(In reply to comment #65)
> To jeff,
> Sorry for my poor expression. I didn't mean to judge it myself.

Wait, my answer specifically said it was fine to decide for yourself that it doesn't need review!  Maybe I'm the one not communicating well.  :-)

The rule is this: we trust patch authors to decide whether un-bitrotted patches need re-review until we see evidence that their decisions have a good chance of being wrong.  Is that clear?
Ok, it is very clear now, thanks. :)
When will this patch be merged into tracemonkey or mozilla-central?
Comment on attachment 382241 [details] [diff] [review]
Patch - Add a space between "//" and words
[Checkin: Comment 69]


http://hg.mozilla.org/mozilla-central/rev/a565fa8f3583
Attachment #382241 - Attachment description: Patch - Add a space between "//" and words → Patch - Add a space between "//" and words [Checkin: Comment 69]
Attachment #382241 - Flags: approval1.9.1.1?
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Keywords: checkin-needed
OS: Other → All
Hardware: Other → All
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.2a1
Attachment #382241 - Flags: approval1.9.1.1? → approval1.9.1.2?
Is this needed primarily for Fennec? If so, that's being built off of the soon-to-be-created mozilla-1.9.2 branch so I'd rather see it wait until then. If it's needed for the 1.9.1.x train, can someone tell me why?
I think this patch is not needed primarily for Fennec. As you know, it just controls the GC frequency and the high water mark of TraceMonkey's memory usage for various devices, so we need to decide how to set the values for each device. It seems to me there are not yet many different target devices for Fennec.
See also bug 506125. We are trying to move away from malloc and gc-thing bytes accounting by SpiderMonkey, toward using OS-based RSS and other VM measures for memory pressure feedback.

/be
Comment on attachment 382241 [details] [diff] [review]
Patch - Add a space between "//" and words
[Checkin: Comment 69]

We'll look at this for 1.9.1.3.
Attachment #382241 - Flags: approval1.9.1.2? → approval1.9.1.3?
(In reply to comment #72)
> See also bug 506125. We are trying to move away from malloc and gc-thing bytes
> accounting by SpiderMonkey, toward using OS-based RSS and other VM measures for
> memory pressure feedback.
> 
> /be

It seems bug 506125's solution is better than the current one. However, if bug 506125's patch is pushed to trunk, this patch should be reverted from trunk. I'm considering how to adapt this bug's concept based on bug 506125.
Comment on attachment 382241 [details] [diff] [review]
Patch - Add a space between "//" and words
[Checkin: Comment 69]

I'm not seeing the compelling need for this change on the older branch, and I'm interpreting the later comments as saying you're even reevaluating this approach. 1.9.2/Fennec has this fix, that should be good enough. Re-request approval if we've misinterpreted this bug.
Attachment #382241 - Flags: approval1.9.1.3?
(In reply to comment #75)
> (From update of attachment 382241 [details] [diff] [review])
> I'm not seeing the compelling need for this change on the older branch, and I'm
> interpreting the later comments as saying you're even reevaluating this
> approach. 1.9.2/Fennec has this fix, that should be good enough.

Sorry for my late reply. I don't have a better approach now. If I have any ideas, I will discuss them with the community. Thanks.
Depends on: 520970
Component: DOM → DOM: Core & HTML