Closed Bug 1245485 Opened 8 years ago Closed 8 years ago

Also disable updateDecommittedRegion for windows?

Categories

(Core :: JavaScript: GC, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
firefox47 --- fixed

People

(Reporter: h4writer, Assigned: terrence)

References

Details

Attachments

(1 file, 2 obsolete files)

I'm looking at performance differences between the javascript engine in windows and linux and trying to pinpoint the differences and find ways to improve our performance on windows. Therefore I created two VirtualBox images with the same spec (memory/harddrive size/cores/...). On both I build a js shell and ran octane on it. The biggest difference was raytrace and I looked into it.

The analysis in octane-raytrace showed that we were spending much more time in MinorGC:

Windows:
> MinorGC:               Reason  PRate Size    Time canIon mkVals mkClls mkSlts mcWCll mkGnrc ckTbls mkRntm mkDbgr clrNOC collct  tenCB swpABO updtIn frSlts  clrSB  sweep resize pretnr logPtT
> ...
> MinorGC:       OUT_OF_NURSERY   0.3%    1     851      8      8     10      8     10      7      8     63     22      9     18      7      7     11      7      8     93    382      7      8
> MinorGC:       OUT_OF_NURSERY   0.0%    1     753      4      4      4      4      7      4      7     38     12      5      6      4      4      6      3      8    118    400      7      8
> MinorGC:       OUT_OF_NURSERY   0.2%    1     705      4      4      6      5      5      4      4     42     13      6     13      4      4      7      4      5     89    399      4      4
> MinorGC:       OUT_OF_NURSERY   0.2%    1     689      4      4      5      5      6      4      4     44     13      6     14      4      4      7      4      5     91    378      4      5
> MinorGC:       OUT_OF_NURSERY   0.1%    1     664      5      4      3      5      7      4      4     39     11      4      7      4      4      6      5      4     83    378      4      5
> MinorGC:       OUT_OF_NURSERY   0.2%    1     682      4      4      6      4      5      3      4     41     12      4     12      4      4      7      4      5     85    390      4      4
> MinorGC:       OUT_OF_NURSERY   0.1%    1     669      4      4      3      4      6      4      3     38     11      5      6      4      4      7      5      4     83    385      4      4
> MinorGC:       OUT_OF_NURSERY   0.2%    1     850      3      3      5      4      5      4    190     39     12      4     27      3      4      7      4      5     84    359      4      4
> MinorGC:       OUT_OF_NURSERY   0.0%    1     604      3      4      4      4      6      3      3     35     10      4      5      3      3      6      4      4     71    361      3      3
> MinorGC:       OUT_OF_NURSERY   0.3%    1     853      8      8      9      8      7      8      7     60     22      8     18      8      8     11      7      8     93    388      7      8
> MinorGC:       OUT_OF_NURSERY   0.2%    1     691      3      3      4      4      9      3      2     34      9      6     10      3      4      5      3      4     74    437      2      3
> MinorGC:       OUT_OF_NURSERY   0.1%    1    1002      6      5      6      6      5      5      5     75     10     16     10      6      6     10      5      7    124    572      5      6
> MinorGC:       OUT_OF_NURSERY   0.1%    1    1330      3      4      5      3      6      3      6     41     10      5     52      6      7      9      6      8    164    885      5      7
> MinorGC:       OUT_OF_NURSERY   0.1%    1    1163      4      5      5      6      5      5      5     67     14      6      8      5      5      8      5      7    101    795      4      6
> MinorGC:       OUT_OF_NURSERY   0.2%    1     698      4      4      6      4      5      4      4     42     12      5     12      4      4      6      4      4     85    402      3      4
> MinorGC:       OUT_OF_NURSERY   0.0%    1     868      7      8      7      8     10      7      5     65     22      8     10      7      8     12      7      5    110    425      4      5
> MinorGC:       OUT_OF_NURSERY   0.2%    1     819      4      5      6      4      5      5      4     66     13      5     13      4      5      8      4      5     94    471      4      5
> ...

Linux:
> MinorGC:               Reason  PRate Size    Time canIon mkVals mkClls mkSlts mcWCll mkGnrc ckTbls mkRntm mkDbgr clrNOC collct  tenCB swpABO updtIn frSlts  clrSB  sweep resize pretnr logPtT
> ...
> MinorGC:       OUT_OF_NURSERY   0.3%    1     129      1      1      2      2      2      1      1     17      4      2      5      2      1      3      2      2     41      4      1      1
> MinorGC:       OUT_OF_NURSERY   0.0%    1     120      1      1      2      2      3      1      1     13      4      2      3      1      2      2      2      2     41      4      1      2
> MinorGC:       OUT_OF_NURSERY   0.2%    1     176      2      2      2      1      1      1      2     15      5      2      4      1      1      2      1      3     46     28      2      3
> MinorGC:       OUT_OF_NURSERY   0.2%    1     135      2      2      2      1      2      1      1     19      4      1      8      2      2      3      1      2     42      5      1      1
> MinorGC:       OUT_OF_NURSERY   0.1%    1     182      2      3      3      3      4      3      2     22      9      3      4      3      3      4      2      3     44      5      3      3
> MinorGC:       OUT_OF_NURSERY   0.2%    1     277      3      3     15      3      3      3      3     40      9      3     21      4      8     10     16      3     44      5      3      2
> MinorGC:       OUT_OF_NURSERY   0.1%    1     181      2      3      3      2      4      2      2     21      8      3      4      3      3      4      2      3     44      5      2      3
> MinorGC:       OUT_OF_NURSERY   0.2%    1     233      3      4      5      3      4      3      2     23      8      3     11      3     38      4      3      3     44      6      3      2
> MinorGC:       OUT_OF_NURSERY   0.0%    1     206      3      3      2      3      4      2      2     21      9      3      3      3      3      4      3      3     56      5      2      3
> MinorGC:       OUT_OF_NURSERY   0.3%    1     187      3      3      3      3      3      3      2     23      8      3      7      3      3      5      2      3     44      5      3      3
> MinorGC:       OUT_OF_NURSERY   0.2%    1     186      2      3      4      3      6      3      3     20      8      3      6      2      3      4      3      3     44      6      3      3
> MinorGC:       OUT_OF_NURSERY   0.1%    1     181      2      3      3      3      3      3      3     22      8      3      4      2      3      4      3      3     44      6      3      3
> MinorGC:       OUT_OF_NURSERY   0.1%    1     529      7      7      9      7      9      7      6     55     21      8     48      7      8     10      7      8    130     16      6      7
> MinorGC:       OUT_OF_NURSERY   0.1%    1     491      7      7      7      7      7      7      7     58     21      8     11      7      7     10      7      8    132     14      7      7
> MinorGC:       OUT_OF_NURSERY   0.2%    1     174      2      2      3      3      3      2      2     20     15      3      6      2      2      4      2      2     44      5      3      2
> MinorGC:       OUT_OF_NURSERY   0.0%    1     163      3      2      3      2      3      3      2     18      8      2      3      2      2      3      2      3     44      5      3      2
> MinorGC:       OUT_OF_NURSERY   0.2%    1     177      3      2      3      2      3      2      3     31      7      2      6      3      2      4      2      2     44      5      3      2
> MinorGC:       OUT_OF_NURSERY   0.2%    1     175      2      2      3      3      3      3      2     19      7      3     15      2      3      3      2      3     43      5      2      3
> MinorGC:       OUT_OF_NURSERY   0.1%    1     162      2      2      2      2      3      2      2     19      7      3      3      2      2      4      2      2     44      5      2      3
> MinorGC:       OUT_OF_NURSERY   0.2%    1     220      2      2      3      2      3      2      2     19      7      2      6      3      3      3      2      3     43      4      2      3
> MinorGC:       OUT_OF_NURSERY   0.0%    1     160      3      2      2      3      3      2      2     17      7      2      3      2      3      3      2      2     44      5      2      2
> MinorGC:       OUT_OF_NURSERY   0.2%    1     165      2      3      3      2      2      3      2     19      7      3      6      3      3      3      2      3     44      5      3      2
> MinorGC:       OUT_OF_NURSERY   0.1%    1     163      2      2      2      3      3      2      2     18      7      3      3      3      3      3      2      3     43      5      3      2
> MinorGC:       OUT_OF_NURSERY   0.2%    1     186      2      2      3      2      2      3      3     20      7      2     23      3      3      4      2      2     44      5      3      2
> MinorGC:       OUT_OF_NURSERY   0.2%    1     169      3      2      3      3      3      3      3     18      7      2      5      2      2      3      2      2     43     13      2      2
> MinorGC:       OUT_OF_NURSERY   0.1%    1     157      3      2      2      2      3      2      2     18      7      3      3      3      3      3      2      3     43      5      2      3
> MinorGC:       OUT_OF_NURSERY   0.2%    1     252      2      3      3      2      3      2      3     19      7      3      5      2      2      4      2     11     95      5      3      3
> MinorGC:       OUT_OF_NURSERY   0.0%    1     156      2      2      2      2      3      2      2     17      7      3      2      2      3      3      3      3     42      5      2      3
> ...


In these results I found that the 'resize' part took the most time. And following that code I saw that we call updateDecommittedRegion which takes this time. We disabled this for Mac in bug 994054. Seems that Windows is also slow here.
Disabling that for windows, brought me close to linux performance. (3000 point increase to 8000, linux is around 9000. Still need to find the other issues.)
=> Can we disable this for windows too?

Something interesting to note here, is that in the VirtualBox I can see this difference quite good. We also have browser build on AWFY, which is quite interesting.
https://arewefastyet.com/#machine=17&view=single&suite=octane&subtest=Raytrace: Seems quite good raytrace performance
https://arewefastyet.com/#machine=31&view=single&suite=octane&subtest=RayTrace: Is now back at quite good performance. But went down for some reason.
Whichever solution, I will definitely need to see how this affects browser builds, instead of shell builds...
=> Could it be that nursery behaves different in browser builds and as consequence we don't call 'updateDecommittedRegion' as often.

Looking at the code and runtime information in shrinkAllocableSpace I saw:
https://dxr.mozilla.org/mozilla-central/source/js/src/gc/Nursery.cpp#714
For every nursery gc in raytrace, numActiveChunks_ = 1. As a result we don't decrease the number of active chunks. In that case does it even make sense to decommit that? Since we will immediately use it again?
=> wouldn't it be better to only call "updateDecommittedRegion" when numActiveChunks has actually decreased?
> Looking at the code and runtime information in shrinkAllocableSpace I saw:
> https://dxr.mozilla.org/mozilla-central/source/js/src/gc/Nursery.cpp#714
> For every nursery gc in raytrace, numActiveChunks_ = 1. As a result we don't
> decrease the number of active chunks. In that case does it even make sense
> to decommit that? Since we will immediately use it again?
> => wouldn't it be better to only call "updateDecommittedRegion" when
> numActiveChunks has actually decreased?
oh I meant: Does it make sense to markt this as uncommitted, when it was never touched anyway and still should be uncommitted?


@Terrence: I got some questions for you ;)
Flags: needinfo?(terrence)
Blocks: 1245487
Maybe it's a stupid question, but you did set JSGC_DISABLE_POISONING=1 to false, right? Raytrace is the benchmark most affected by this.
(In reply to Guilherme Lima from comment #2)
> Maybe it's a stupid question, but you did set JSGC_DISABLE_POISONING=1 to
> false, right? Raytrace is the benchmark most affected by this.

Good question. IIRC this is only enabled by default in nightly builds, right? Since I build my own builds I think I shouldn't use that?
Poisoning is enabled when either JS_CRASH_DIAGNOSTICS or JS_GC_ZEAL are set [1]. NIGHTLY_BUILD implies JS_CRASH_DIAGNOSTICS if address sanitizer isn't used [2] - but NIGHTLY_BUILD is set based on GRE_MILESTONE [3], which in turn is set if the version number in config/milestone.txt contains 'a1' [4]. The milestone is 47.0a1 right now on m-c [5], so I think it's on by default if you're working off m-c. It's probably possible to override that with build flags, but I'm not sure what combinations of NIGHTLY_BUILD and RELEASE_BUILD are valid - by the looks of things Aurora/Dev Edition should have neither.

[1] https://dxr.mozilla.org/mozilla-central/rev/584870f1cbc5d060a57e147ce249f736956e2b62/js/src/jsutil.h#358
[2] https://dxr.mozilla.org/mozilla-central/rev/584870f1cbc5d060a57e147ce249f736956e2b62/js/src/jsutil.h#353
[3] https://dxr.mozilla.org/mozilla-central/rev/584870f1cbc5d060a57e147ce249f736956e2b62/configure.in#3434
[4] https://dxr.mozilla.org/mozilla-central/rev/584870f1cbc5d060a57e147ce249f736956e2b62/configure.in#3424
[5] https://dxr.mozilla.org/mozilla-central/rev/584870f1cbc5d060a57e147ce249f736956e2b62/config/milestone.txt#13
You are the one who filled bug 1158095! My point is that maybe you're just measuring that GC poisoning is slower on Windows than on Linux, or at least that it has a bigger impact on Windows performance.
(In reply to Guilherme Lima from comment #5)
> You are the one who filled bug 1158095! My point is that maybe you're just
> measuring that GC poisoning is slower on Windows than on Linux, or at least
> that it has a bigger impact on Windows performance.

Oh definitely. I just assumed because I build myself and didn't put "--enable-js-diagnostics" that I was fine.
I forgot we now enable it for all 'default' builds. I.e. it was enabled.

New numbers with JSGC_DISABLE_POISONING=1
windows:
> MinorGC:               Reason  PRate Size    Time canIon mkVals mkClls mkSlts mcWCll mkGnrc ckTbls mkRntm mkDbgr clrNOC collct  tenCB swpABO updtIn frSlts  clrSB  sweep resize pretnr logPtT                
> MinorGC:       OUT_OF_NURSERY   0.2%    1     290      4      5      8      5      7      5      5     60     14      6     14      5      5      8      5      5      5     22      4      4                
> MinorGC:       OUT_OF_NURSERY   0.1%    1     263      4      4      4      5      8      5      3     56     13      5      8      4      5      7      5      5      4     21      4      5                           
> MinorGC:       OUT_OF_NURSERY   0.2%    1     239      4      4      5      4     10      4      3     41     12      5     13      4      4      7      4      4      4     17      4      5                
> MinorGC:       OUT_OF_NURSERY   0.2%    1     253      4      5      5      5      6      5      4     41     13      5     22      4      4      7      4      4      5     18      4      5                
> MinorGC:       OUT_OF_NURSERY   0.1%    1     238      4      5      4      5      4      5      4     43     12      5      7      5      4      8      5      5      4     18      4      4                
> MinorGC:       OUT_OF_NURSERY   0.2%    1     211      4      4      5      3      5      4      3     37     10      5     11      4      5      6      4      4      4     15      3      4                
> MinorGC:       OUT_OF_NURSERY   0.1%    1     407      8      8      8      8     10      7      7     67     23      8     12      7      8     11      7      8      7     20      7      8 

linux:
> MinorGC:               Reason  PRate Size    Time canIon mkVals mkClls mkSlts mcWCll mkGnrc ckTbls mkRntm mkDbgr clrNOC collct  tenCB swpABO updtIn frSlts  clrSB  sweep resize pretnr logPtT
> MinorGC:       OUT_OF_NURSERY   0.2%    1      88      1      1      3      2      2      2      1     15      4      2      4      2      1      2      2      1      2      4      1      2
> MinorGC:       OUT_OF_NURSERY   0.1%    1      82      2      2      2      2      1      1      2     14      5      2      2      2      2      3      2      2      2      5      2      2
> MinorGC:       OUT_OF_NURSERY   0.1%    1      91      2      2      3      2      2      2      2     19      5      2      4      2      2      2      2      2      1      5      2      2
> MinorGC:       OUT_OF_NURSERY   0.1%    1      91      1      1      1      2      2      2      2     22      5      2      3      2      1      3      2      2      2      5      2      2
> MinorGC:       OUT_OF_NURSERY   0.2%    1      89      1      1      4      2      3      2      1     16      4      2      5      2      2      3      2      2      2      5      2      2

i.e. resize is still a magnitude 10 slower sadly
(In reply to Hannes Verschore [:h4writer] from comment #1)
> > Looking at the code and runtime information in shrinkAllocableSpace I saw:
> > https://dxr.mozilla.org/mozilla-central/source/js/src/gc/Nursery.cpp#714
> > For every nursery gc in raytrace, numActiveChunks_ = 1. As a result we don't
> > decrease the number of active chunks. In that case does it even make sense
> > to decommit that? Since we will immediately use it again?
> > => wouldn't it be better to only call "updateDecommittedRegion" when
> > numActiveChunks has actually decreased?
> oh I meant: Does it make sense to markt this as uncommitted, when it was
> never touched anyway and still should be uncommitted?
> 
> 
> @Terrence: I got some questions for you ;)

Nursery decommit was added on short notice for B2G. My experience with arena decommit on windows was that it was not that severe, but I guess this case is different. I'll try to improve it today.
Attached patch improve_nursery_decommit-v0.diff (obsolete) — Splinter Review
Only decommit the delta, instead of assuming that the OS will be able to tell that repeatedly decommitting the same region is a nop.
Assignee: nobody → terrence
Status: NEW → ASSIGNED
Flags: needinfo?(terrence)
Attachment #8715401 - Flags: review?(sphink)
Attached patch improve_nursery_decommit-v0.diff (obsolete) — Splinter Review
Don't forget to qref first.
Attachment #8715401 - Attachment is obsolete: true
Attachment #8715401 - Flags: review?(sphink)
Attachment #8715406 - Flags: review?(sphink)
Also, we don't need to decommit to the end anymore, we can just decommit the newly unused portion.
Attachment #8715406 - Attachment is obsolete: true
Attachment #8715406 - Flags: review?(sphink)
Attachment #8715417 - Flags: review?(sphink)
Comment on attachment 8715406 [details] [diff] [review]
improve_nursery_decommit-v0.diff

Review of attachment 8715406 [details] [diff] [review]:
-----------------------------------------------------------------

This is ok, but I think it would be better to change

  updateDecommittedRegion(int priorChunkUsage)

to

  setNumActiveChunks(int newChunks)

or

  setActiveNurserySize(int numChunks)

or something. The "remember to call updateDecommittedRegion if you drop the chunk size" technique seems error-prone.

Whether you'd want to collapse growAllocableSpace/shrinkAllocableSpace into one setAllocableSpace and do it directly there, I'm not so sure. I'm fine with just the internal setNumActiveChunks or whatever.

r=me with that change

::: js/src/gc/Nursery.cpp
@@ +119,2 @@
>  #ifndef JS_GC_ZEAL
> +    if (numActiveChunks_ == numNurseryChunks_)

MOZ_ASSERT(numNurseryChunks_ >= numActiveChunks_), just on general principle?
Attachment #8715406 - Attachment is obsolete: false
Comment on attachment 8715417 [details] [diff] [review]
improve_nursery_decommit-v1.diff

Review of attachment 8715417 [details] [diff] [review]:
-----------------------------------------------------------------

Oops. Moving review to newer patch.

This is ok, but I think it would be better to change

  updateDecommittedRegion(int priorChunkUsage)

to

  setNumActiveChunks(int newChunks)

or

  setActiveNurserySize(int numChunks)

or something. The "remember to call updateDecommittedRegion if you drop the chunk size" technique seems error-prone.

Whether you'd want to collapse growAllocableSpace/shrinkAllocableSpace into one setAllocableSpace and do it directly there, I'm not so sure. I'm fine with just the internal setNumActiveChunks or whatever.

r=me with that change
Attachment #8715417 - Flags: review?(sphink) → review+
Attachment #8715406 - Attachment is obsolete: true
Results on AWFY:
25% win on dromaeo-3d-cube
47% win on dromaeo-sunspider-access-binary-trees
38% win on dromaeo-v8-raytrace
37% jetstream-proto-raytracer
10% win on jetstream-regexp-2010
5% win on Octane-RegExp
small win on dromaeo-v8-deltablue
small win on dromaeo-jslib-traverse-prototype

No regressions!
\o/
And it is a 5~7% win on Octane-Raytrace (the results were noisy before the patch, and now there's basically no variation from run to run).
https://hg.mozilla.org/mozilla-central/rev/87283160a274
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: