Long cycle collector pauses with an addon installed

VERIFIED FIXED in Firefox -esr60

Status

()

P1
blocker
VERIFIED FIXED
a year ago
10 months ago

People

(Reporter: savy.mathieu, Assigned: jonco)

Tracking

({perf, regression})

60 Branch
mozilla62
perf, regression
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(relnote-firefox 60+, firefox-esr52 unaffected, firefox-esr6060+ verified, firefox60blocking verified, firefox61+ verified, firefox62+ verified)

Details

Attachments

(2 attachments, 2 obsolete attachments)

(Reporter)

Description

a year ago
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.162 Safari/537.36

Steps to reproduce:

I am developper at ZenHub. Several users reported that since Firefox 60 (dev and nightly version), Firefox periodically freezes for 1 to 2 seconds. I have reproduced with Firefox 61.

An easy way to reproduce is to install ZenHub (sorry, I haven't found a more straight forward way) and heads to a busy ZenHub board: https://github.com/saltstack/salt#boards?repos=122537149,18853426,1390248,6910286,12017151,47988358

Launch the performance profiler

And start randomly scrolling vertically and horizontally like in this gif: http://www.giphy.com/gifs/3s8X6ZaStlUBSBl4Fk

You should get huge garbage collection periods like that: https://imgur.com/a/zjhlo


Here's the description of the problem by one of our users: https://github.com/ZenHubIO/support/issues/1383#issuecomment-376321483


Actual results:

Garbage collector runs for up to 2 seconds and freezes everything.


Expected results:

In Firefox 59 (the current latest release), everything works fine
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:61.0) Gecko/20100101 Firefox/61.0
Build ID: 20180330100123

I have tested this issue on latest Nightly build and latest Fx release build using Mac OS X 10.13.3. When scrolling horizontally like in the gif above, I can see big garbage collector period but my browser is not freezing. 

Do you know of any particular page that could demonstrate the issue like the one in the provided gif ? 

I think the right component would be Core:: JavaScript: GC. Please change if this is not the correct component.
Flags: needinfo?(savy.mathieu)
Component: Untriaged → JavaScript: GC
Product: Firefox → Core
(Reporter)

Comment 2

a year ago
I tried to find a better way to reproduce but couldn't. The fact that the garbage collector is running for long period of time in Firefox 61 but not in 59 cannot be considered as a sufficient regression?

Your browser is not freezing but what about your framerate? Doesn't it drop to under 1 frame par second during those garbage collection period?
Flags: needinfo?(savy.mathieu)
(Reporter)

Comment 3

11 months ago
The simple fact of loading our extension in memory is enough to trigger those freezes. 

Update steps to reproduce the issue:

- Install the Firefox extension from here: https://www.zenhub.com/extension
- In GitHub, load a page where the extension is not going to do anything like the wiki page https://github.com/jquery/jquery/wiki
- Open the performance tool and launch it

You should see huge garbage collection periods. If that's not the case (sometimes the problem doesn't show), open a new tab with `https://github.com/jquery/jquery/wiki` wait a few seconds, and launch the performance tool again.

I ran some experiments with the extension:
- If I remove the call to our main() function inside the content script, the garbage collector still goes crazy. Therefore, I think we can exclude that it is not the code ran by the extension that triggers those garbage collection.

- Removing `require()` calls in order to reduce the size of the final javascript file makes the garbage collection periods smaller. If I remove all dependencies, the problem disappears completely.

I think that the problem has to do with loading the extension is memory.

Here's a snapshot of one of the garbage collection we experienced: https://perf-html.io/public/4cf5853a4990e2496cc025c25a13154dd30b4c8c/flame-graph/?thread=3&threadOrder=0-2-3-4-1&v=3
(Reporter)

Updated

11 months ago
Iteration: --- → 60.4 - Mar 12
Has Regression Range: --- → yes
Keywords: regression

Comment 4

11 months ago
This also happens on Firefox 62.0a1, here is the snapshot: https://perfht.ml/2FVTovK.

Could this be related to 1447391?

(FYI I work with mathieuu and have been helping him debug the issue)

Updated

11 months ago
Iteration: 60.4 - Mar 12 → ---
Has Regression Range: yes → no
Keywords: regressionwindow-wanted
(Reporter)

Updated

11 months ago
Has Regression Range: no → yes
Has STR: --- → no
Keywords: regressionwindow-wanted
(Reporter)

Comment 5

11 months ago
I cannot figure out how to put the regression range but here is the commit responsible: https://hg.mozilla.org/integration/mozilla-inbound/rev/1b4d5be72031

We had two different persons bisecting it and arrived at this specific commit.

Updated

11 months ago
Flags: needinfo?(jcoppeard)
Blocks: 1432794
status-firefox60: --- → wontfix
status-firefox61: --- → affected
status-firefox-esr52: --- → unaffected
status-firefox-esr60: --- → affected
These are cycle collections (CCs), not GCs, though the time is spent in the GC machinery called by the CC.

And this does remind me of bug 1447391, in that JS::GCCellPtr::mayBeOwnedByOtherRuntimeSlow is again taking up an uncomfortably large percentage of the time. This time, though, it's not being called by TraversalTracer::onChild; that accounts for 8% of that CC and mayBeOwnedByOtherRuntimeSlow takes 34%. So it looks like a different incarnation of the same problem.

Still, that's a ton of time in CC, and various of the GC helper functions seem to be doing a ton of work. I guess something in  the bug 1432794 patch causes that?
jonco: the bug 1432794 regression range is most likely the relevant thing here, but do you think this patch is overkill for speeding up mayBeOwnedByOtherRuntimeSlow? The only thing I see remotely "slow" about that function is that it isn't inlined.
Attachment #8973876 - Flags: review?(jcoppeard)
Status: UNCONFIRMED → NEW
status-firefox62: --- → affected
Ever confirmed: true
(Assignee)

Comment 8

11 months ago
The changes in bug 1432794 should in no way affect cycle collection.  That bug means we do less setup work when parsing off main thread, but the changes are not visible from the main thread.

Andrew, do you have any idea why the CC might be taking so long here?
Flags: needinfo?(continuation)
(Assignee)

Comment 9

11 months ago
Comment on attachment 8973876 [details] [diff] [review]
Make inlineable isPermanentAtom

Review of attachment 8973876 [details] [diff] [review]:
-----------------------------------------------------------------

This has shown up in a couple of places so I reckon it's worth improving this path.

::: js/src/gc/GC.cpp
@@ +8460,1 @@
>      return as<Symbol>().isWellKnownSymbol();

This method should probably become symbolMayBeOwnedByOtherRuntime() at this point.
Attachment #8973876 - Flags: review?(jcoppeard) → review+
(Assignee)

Comment 10

11 months ago
(In reply to mathieuu from comment #5)
Thanks for the bug report.

How did you bisect this - did you use mozregression or did you build the browser yourself?
Flags: needinfo?(jcoppeard) → needinfo?(savy.mathieu)
(Reporter)

Comment 11

11 months ago
Jeff and I both used mozregression and both ended up finding the same commit. We have done it twice given that good commits are consist in waiting long enough to make sure the garbage collector won't freeze the page.

With a profile where the Zenhub extension was installed and logged into GitHub:

The command we ran: mozregression --good 59 --bad 60 --profile "/Users/xxxxx/Library/Application Support/Firefox/Profiles/xxxx.default" --profile-persistence clone-first --arg="https://github.com/jquery/jquery/wiki

Then, we opened the performance tool and waited. If the garbage collector freezes, tag the build as bad. If not, open a new tab with the same address and measure performance again a couple of times to make sure it doesn't happen. If not, tag good.

Here's the full log: https://pastebin.com/gim66c2E
Flags: needinfo?(savy.mathieu)
(In reply to Jon Coppeard (:jonco) from comment #8)
> Andrew, do you have any idea why the CC might be taking so long here?

Not really. The main JS engine changes that I remember that caused big CC perf regressions added new GC things, but didn't make them things we could represent in the CC graph, so you'd get an exponential blowup in the number of things the CC ended up tracing. I don't think that applies here.

The only other thing I can think of that would regress CC performance would be making many objects gray instead of black.
Flags: needinfo?(continuation)
See Also: → bug 1242861
You know, as long as I'm going to mess around with this, the Symbol changes are actually much smaller than the String ones. Might as well finish this off. There shouldn't be many Symbols, but maybe the magic of inlining will somehow optimize away more of this or something.
Attachment #8974889 - Flags: review?(jcoppeard)
Assignee: nobody → sphink
Status: NEW → ASSIGNED
Hmm, so the commit in comment 5 matches the CC regression on Nightly, so I think this is the same issue I've been trying to address in bug 1460636 and bug 1460385.
(Assignee)

Updated

11 months ago
Attachment #8974889 - Flags: review?(jcoppeard) → review+
(Assignee)

Comment 15

11 months ago
(In reply to Andrew McCreight [:mccr8] from comment #14)
OK, it seems like bug 1432794 is implicated after all.  I'll take a deeper look.
(Assignee)

Comment 16

11 months ago
I've reproduced this as described above.  Here's some cycle collector timings without/with the patch in bug 1432794.  The latter show much higher total time.

Without:

    CC(T+187.4)[content-97370] max pause: 21ms, total time: 90ms, slices: 8, suspected: 3268, visited: 7515 RCed and 160046 GCed, collected: 4384 RCed and 60111 GCed (60111|0|0 waiting for GC)
    ForgetSkippable 12 times before CC, min: 0 ms, max: 2 ms, avg: 1 ms, total: 12 ms, max sync: 0 ms, removed: 14462
    CC(T+293.5)[default-97367] max pause: 2ms, total time: 2ms, slices: 1, suspected: 340, visited: 826 RCed and 611 GCed, collected: 12 RCed and 0 GCed (38|0|0 waiting for GC)
    ForgetSkippable 37 times before CC, min: 0 ms, max: 0 ms, avg: 0 ms, total: 9 ms, max sync: 0 ms, removed: 10243
    CC(T+188.7)[content-97370] max pause: 45ms, total time: 88ms, slices: 5, suspected: 714, visited: 3866 RCed and 159890 GCed, collected: 2977 RCed and 59955 GCed (120066|0|0 waiting for GC)
    ForgetSkippable 3 times before CC, min: 1 ms, max: 1 ms, avg: 1 ms, total: 4 ms, max sync: 0 ms, removed: 10215
    CC(T+300.1)[default-97367] max pause: 2ms, total time: 2ms, slices: 1, suspected: 89, visited: 458 RCed and 1603 GCed, collected: 80 RCed and 952 GCed (952|0|0 waiting for GC)
    ForgetSkippable 10 times before CC, min: 0 ms, max: 1 ms, avg: 0 ms, total: 5 ms, max sync: 0 ms, removed: 2023
    CC(T+196.0)[content-97370] max pause: 25ms, total time: 171ms, slices: 15, suspected: 4745, visited: 11646 RCed and 246889 GCed, collected: 10838 RCed and 146797 GCed (146797|0|0 waiting for GC)
    ForgetSkippable 16 times before CC, min: 0 ms, max: 2 ms, avg: 1 ms, total: 16 ms, max sync: 0 ms, removed: 9256
    CC(T+197.1)[content-97370] max pause: 23ms, total time: 176ms, slices: 15, suspected: 422, visited: 7600 RCed and 246457 GCed, collected: 6993 RCed and 146448 GCed (293245|0|0 waiting for GC)
    ForgetSkippable 9 times before CC, min: 1 ms, max: 2 ms, avg: 1 ms, total: 12 ms, max sync: 0 ms, removed: 11356
    CC(T+68.6)[content-97385] max pause: 2ms, total time: 2ms, slices: 1, suspected: 180, visited: 405 RCed and 2958 GCed, collected: 156 RCed and 2840 GCed (2840|0|0 waiting for GC)
    ForgetSkippable 6 times before CC, min: 0 ms, max: 0 ms, avg: 0 ms, total: 1 ms, max sync: 0 ms, removed: 2300
    CC(T+305.2)[content-97368] max pause: 1ms, total time: 1ms, slices: 1, suspected: 49, visited: 350 RCed and 316 GCed, collected: 230 RCed and 176 GCed (176|0|0 waiting for GC)
    ForgetSkippable 8 times before CC, min: 0 ms, max: 0 ms, avg: 0 ms, total: 1 ms, max sync: 0 ms, removed: 902
    CC(T+306.7)[default-97367] max pause: 2ms, total time: 2ms, slices: 1, suspected: 392, visited: 767 RCed and 1160 GCed, collected: 8 RCed and 540 GCed (540|0|0 waiting for GC)
    ForgetSkippable 9 times before CC, min: 0 ms, max: 2 ms, avg: 0 ms, total: 5 ms, max sync: 0 ms, removed: 6650
    CC(T+204.1)[content-97370] max pause: 15ms, total time: 102ms, slices: 8, suspected: 3539, visited: 6387 RCed and 165651 GCed, collected: 5477 RCed and 65710 GCed (65710|0|0 waiting for GC)
    ForgetSkippable 11 times before CC, min: 0 ms, max: 3 ms, avg: 1 ms, total: 16 ms, max sync: 0 ms, removed: 13316
    CC(T+75.7)[content-97385] max pause: 1ms, total time: 1ms, slices: 1, suspected: 354, visited: 537 RCed and 1343 GCed, collected: 143 RCed and 308 GCed (308|0|0 waiting for GC)
    ForgetSkippable 6 times before CC, min: 0 ms, max: 9 ms, avg: 1 ms, total: 10 ms, max sync: 0 ms, removed: 767
    CC(T+313.2)[default-97367] max pause: 2ms, total time: 2ms, slices: 1, suspected: 75, visited: 398 RCed and 651 GCed, collected: 8 RCed and 27 GCed (27|0|0 waiting for GC)
    ForgetSkippable 8 times before CC, min: 0 ms, max: 1 ms, avg: 0 ms, total: 5 ms, max sync: 0 ms, removed: 1581
    CC(T+210.6)[content-97370] max pause: 49ms, total time: 91ms, slices: 4, suspected: 2473, visited: 5227 RCed and 153551 GCed, collected: 4720 RCed and 53376 GCed (53376|0|0 waiting for GC)
    ForgetSkippable 14 times before CC, min: 0 ms, max: 1 ms, avg: 0 ms, total: 11 ms, max sync: 0 ms, removed: 7384
    CC(T+83.8)[content-97385] max pause: 0ms, total time: 0ms, slices: 1, suspected: 22, visited: 298 RCed and 1035 GCed, collected: 0 RCed and 0 GCed (0|0|0 waiting for GC)
    ForgetSkippable 6 times before CC, min: 0 ms, max: 7 ms, avg: 1 ms, total: 8 ms, max sync: 0 ms, removed: 3

With: 

    CC(T+218.6)[content-97347] max pause: 1640ms, total time: 2680ms, slices: 27, suspected: 1646, visited: 2708 RCed and 121987 GCed, collected: 1949 RCed and 22055 GCed (22055|0|0 waiting for GC)
    ForgetSkippable 7 times before CC, min: 0 ms, max: 1 ms, avg: 0 ms, total: 3 ms, max sync: 0 ms, removed: 2052
    CC(T+222.8)[content-97347] max pause: 71ms, total time: 727ms, slices: 22, suspected: 700, visited: 2241 RCed and 121267 GCed, collected: 1185 RCed and 21344 GCed (43399|0|0 waiting for GC)
    ForgetSkippable 3 times before CC, min: 0 ms, max: 1 ms, avg: 0 ms, total: 2 ms, max sync: 0 ms, removed: 4498
    CC(T+1.2)[content-97351] max pause: 74ms, total time: 781ms, slices: 16, suspected: 2042, visited: 2584 RCed and 101120 GCed, collected: 2183 RCed and 52526 GCed (52526|0|0 waiting for GC)
    ForgetSkippable 6 times before CC, min: 0 ms, max: 0 ms, avg: 0 ms, total: 1 ms, max sync: 0 ms, removed: 1750
    CC(T+2.4)[content-97351] max pause: 73ms, total time: 770ms, slices: 15, suspected: 307, visited: 498 RCed and 85511 GCed, collected: 201 RCed and 36980 GCed (89506|0|0 waiting for GC)
    ForgetSkippable 2 times before CC, min: 0 ms, max: 0 ms, avg: 0 ms, total: 0 ms, max sync: 0 ms, removed: 434
    CC(T+306.8)[default-97344] max pause: 6ms, total time: 6ms, slices: 1, suspected: 2399, visited: 4589 RCed and 9473 GCed, collected: 2982 RCed and 3067 GCed (3067|0|0 waiting for GC)
    ForgetSkippable 10 times before CC, min: 0 ms, max: 1 ms, avg: 0 ms, total: 6 ms, max sync: 0 ms, removed: 6980
    CC(T+10.5)[content-97351] max pause: 50ms, total time: 149ms, slices: 4, suspected: 89, visited: 833 RCed and 68695 GCed, collected: 713 RCed and 20176 GCed (20176|0|0 waiting for GC)
    ForgetSkippable 7 times before CC, min: 0 ms, max: 0 ms, avg: 0 ms, total: 0 ms, max sync: 0 ms, removed: 136
    CC(T+313.6)[default-97344] max pause: 4ms, total time: 4ms, slices: 1, suspected: 182, visited: 1691 RCed and 6487 GCed, collected: 43 RCed and 103 GCed (103|0|0 waiting for GC)
    ForgetSkippable 14 times before CC, min: 0 ms, max: 0 ms, avg: 0 ms, total: 6 ms, max sync: 0 ms, removed: 5270
    CC(T+233.5)[content-97347] max pause: 1843ms, total time: 2659ms, slices: 24, suspected: 3473, visited: 4854 RCed and 133302 GCed, collected: 2701 RCed and 33334 GCed (33334|0|0 waiting for GC)
    ForgetSkippable 11 times before CC, min: 0 ms, max: 2 ms, avg: 0 ms, total: 8 ms, max sync: 0 ms, removed: 8947
    CC(T+237.6)[content-97347] max pause: 69ms, total time: 731ms, slices: 25, suspected: 681, visited: 2574 RCed and 133152 GCed, collected: 1704 RCed and 33229 GCed (66563|0|0 waiting for GC)
    ForgetSkippable 2 times before CC, min: 1 ms, max: 1 ms, avg: 1 ms, total: 2 ms, max sync: 0 ms, removed: 8174
    CC(T+17.6)[content-97351] max pause: 71ms, total time: 708ms, slices: 14, suspected: 4, visited: 90 RCed and 48519 GCed, collected: 0 RCed and 0 GCed (0|0|0 waiting for GC)
    ForgetSkippable 6 times before CC, min: 0 ms, max: 0 ms, avg: 0 ms, total: 0 ms, max sync: 0 ms, removed: 7
Summary: Garbage collector periodically freezes Firefox → Long cycle collector pauses with an addon installed
From looking around in another bug, the object groups with the propids are all being held alive by addon scripts (RES in that case). There were around 7000 a single content process. Perhaps your patch somehow made it so that those scripts ended up being marked grey instead of black? Or maybe it caused us to create a lot more of them?
Depends on: 1460957
Depends on: 1460636, 1460385
See Also: bug 1460636, bug 1460385
Attachment #8973876 - Attachment is obsolete: true
Attachment #8974889 - Attachment is obsolete: true
Depends on: 1461012
I think it makes sense to have this bug as the central place for the general issue, rather than track each possible fix separately.
Severity: normal → blocker
status-firefox60: wontfix → affected
Priority: -- → P1
tracking-firefox60: --- → ?
tracking-firefox61: --- → ?
tracking-firefox62: --- → ?
With my patch in bug 1460636, the total time spent in a cycle collection in the content process (which is shown in the browser console when you set javascript.options.mem.log to true in about:config) is reduced from around 1800ms to around 80ms.
(Reporter)

Comment 20

11 months ago
That's a very nice performance improvement. Let me know whenever a build with that revision becomes available to download with mozregression, I will re-test the original addon problem.
tracking-firefox60: ? → +
tracking-firefox61: ? → +
tracking-firefox62: ? → +
tracking-firefox-esr60: --- → ?
If you update to the latest version of Firefox Nightly, from 5-12 or later, it will have my patch in bug 1460636. Please try it out and let me know if it helps when you get a chance. Thanks.
Flags: needinfo?(savy.mathieu)

Comment 22

11 months ago
Please add "PERF" key word if this is a performance bug.
Some STR that reproduces this, based on comment 3:

1. Go to about:config and set javascript.options.mem.log to true. This will let you see the CC times in the browser console.
2. Go to https://www.zenhub.com/extension and install the addon (you'll have to agree to some prompt).
3. Go to https://github.com/jquery/jquery/wiki
4. Wait for maybe 10 seconds for the page to load and the browser to settle down a bit after the load.
5. Open the browser console, and filter on "cc(" so you only see entries from the cycle collector. There will still be a lot of entries.

You can see some examples of what the good and bad behavior are in comment 16.

To break this down a little, you'll see lines that look like this:
CC(T+14883.7)[content-9384] max pause: 6ms, total time: 8ms, slices: 2, suspected: 256, visited: 834 RCed and 2035 GCed, 

There are two relevant parts here. One is the the "content" near the start. You are interested in entries that say "content" and not "default" in that same location. (We are interested in the content process and not the main process.) The second part is the time after "total time". If the bug is not present, the total time will be under 100ms (maybe even under 10ms). If the bug IS present, then total time will be hundreds or even thousands of milliseconds.
Keywords: perf
For step two, note that you have to scroll down the page a bit and click on the button to install it in Firefox. For step 3, you don't have to be logged in to GitHub. I can reproduce the issue even in a clean profile.
I was thinking about how bug 1432794 could be related to this.

When we create a new group with a prototype that is Object.prototype and some other protos, we immediately mark the group as having unknown properties and no properties are recorded. After that bug, placeholder protos are used instead of Object.prototype etc and we mark the object literal groups as having unknown properties when we merge the compartments, however (1) in ObjectGroup::markUnknown we don't actually clear or null out the properties list and (2) in ObjectGroup::traceChildren we still trace all the properties, even if unknownProperties().

So that suggests we should maybe call setNewGroupUnknown when we create (some of) these placeholder objects. Also, ObjectGroup::markUnknown should probably null out the properties list.
(Assignee)

Comment 26

11 months ago
(In reply to Jan de Mooij [:jandem] from comment #25)
Thank Jan, it looks like that's the problem! \o/
(Assignee)

Comment 27

11 months ago
As suggested, set 'new group unknown' flag on various prototype placeholders and assert that this as done correctly when merging.

With this patch the longest CC I get when attempting to reproduce the issue is 50ms and most are much shorter.
Attachment #8975320 - Flags: review?(jdemooij)
Comment on attachment 8975320 [details] [diff] [review]
bug1449033-set-new-group-unknown

Review of attachment 8975320 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!
Attachment #8975320 - Flags: review?(jdemooij) → review+

Comment 29

11 months ago
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/aafd8f9a6ed4
Set new group unknown flag on placeholder prototypes where necessary r=jandem
Assignee: sphink → jcoppeard
Do you think it would be better to backport this patch to release, or bug 1460385 and bug 1460636? I think my patches are fairly low risk, but I have no idea how risky this is. This patch sounds like it actually fixes the issue rather than just the performance fault.
Flags: needinfo?(jcoppeard)

Comment 31

11 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/aafd8f9a6ed4
Status: ASSIGNED → RESOLVED
Last Resolved: 11 months ago
status-firefox62: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
(In reply to Andrew McCreight [:mccr8] from comment #30)
> This patch sounds like it actually fixes the
> issue rather than just the performance fault.

AFAIK this is unrelated to the shape issue, so we might want to uplift your patch for that anyway (because we did see it show up big time in that Google Inbox profile).
(Assignee)

Comment 33

11 months ago
(In reply to Andrew McCreight [:mccr8] from comment #30)
I think we should backport this patch since it fixes the root problem.  Uplifting anything is a risk, but this patch definitely makes things more correct.  Up to you if you think other patches should be uplifted too.
Flags: needinfo?(jcoppeard)
(Assignee)

Comment 34

11 months ago
Comment on attachment 8975320 [details] [diff] [review]
bug1449033-set-new-group-unknown

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1432794.
[User impact if declined]: Possible performance regression (long CC times) with addons installed.
[Is this code covered by automated tests?]: Yes
[Has the fix been verified in Nightly?]: I checked that the problem does no longer reproduces in latest nightly.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: I'd say this is low risk.
[Why is the change risky/not risky?]: Bug 1432794 changed the way scripts are compiled off-thread and inadvertantly introuduced a subtle difference in the output.  This change removes that difference, so with this change the system is closer to its original state.
[String changes made/needed]: None.
Attachment #8975320 - Flags: approval-mozilla-beta?
(Assignee)

Comment 35

11 months ago
Approval Request Comment
[Feature/Bug causing the regression]: Bug 1432794.
[User impact if declined]: Possible performance regression (long CC times) with addons installed.
[Is this code covered by automated tests?]: Yes
[Has the fix been verified in Nightly?]: I checked that the problem does no longer reproduces in latest nightly.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: I'd say this is low risk.
[Why is the change risky/not risky?]: Bug 1432794 changed the way scripts are compiled off-thread and inadvertantly introuduced a subtle difference in the output.  This change removes that difference, so with this change the system is closer to its original state.
[String changes made/needed]: None.
Attachment #8975422 - Flags: approval-mozilla-release?
Comment on attachment 8975320 [details] [diff] [review]
bug1449033-set-new-group-unknown

Fix for a perf regression we shipped in Fx60 which gets us back closer to pre-60 behavior. Approved for 61.0b5.
Attachment #8975320 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment 37

11 months ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-beta/rev/2e4a18b1c547
status-firefox61: affected → fixed
No longer depends on: 1460385
Hi Krupa, Tom, it seems likely this patch and issue will drive 60.0.1. We are hoping to gtb asap. KMag mentioned the need to test this thoroughly, especially the webext that were triggering this perf issue. Could you please get QA folks from your team ready/started? Thanks!
Flags: needinfo?(tgrabowski)
Flags: needinfo?(kraj)

Updated

11 months ago
tracking-firefox60: + → blocking
Comment on attachment 8975422 [details] [diff] [review]
bug1449033-release

We'll need this on the ESR 60.0.1 dot release as well.
Attachment #8975422 - Flags: approval-mozilla-esr60?
tracking-firefox-esr60: ? → 60+
Comment on attachment 8975422 [details] [diff] [review]
bug1449033-release

This perf issue will drive the 60.0.1 mainline and ESR release.
Attachment #8975422 - Flags: approval-mozilla-release?
Attachment #8975422 - Flags: approval-mozilla-release+
Attachment #8975422 - Flags: approval-mozilla-esr60?
Attachment #8975422 - Flags: approval-mozilla-esr60+

Comment 41

11 months ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-release/rev/45c0a10df388
status-firefox60: affected → fixed

Comment 42

11 months ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-esr60/rev/0ff2170567cb
status-firefox-esr60: affected → fixed
(In reply to Ritu Kothari (:ritu) from comment #38)
> Hi Krupa, Tom, it seems likely this patch and issue will drive 60.0.1. We
> are hoping to gtb asap. KMag mentioned the need to test this thoroughly,
> especially the webext that were triggering this perf issue. Could you please
> get QA folks from your team ready/started? Thanks!

Cosmin Badescu and Stefan Georgiev are leading the testing efforts on this.
Flags: needinfo?(tgrabowski)
Flags: needinfo?(stefan.georgiev)
Flags: needinfo?(cosmin.badescu)

Comment 44

10 months ago
We started the testing and we can see some serious failures for Linux(total time > 2s) using Beta 61.0b5. Also there are some tests where the total time recorded is greater than 100ms. You can see the partial results under https://goo.gl/rzsXtW. 

We will continue the testing and add results for the rest of scenarios in the same document.

Thanks,
Victor
(In reply to Victor Carciu from comment #44)
> We started the testing and we can see some serious failures for Linux(total
> time > 2s) using Beta 61.0b5. Also there are some tests where the total time
> recorded is greater than 100ms. You can see the partial results under
> https://goo.gl/rzsXtW. 

Thanks for testing these. I'm glad we can get some more confidence that the patch worked.

I looked at these results, and I guess I should have clarified a bit: it is okay if the total CC times for a particular process  are above 100ms sometimes. Sometimes the CC has to do some work, like if you close a tab and we're cleaning it up, or the page discarded a bunch of data. This won't always clean up everything immediately, so it is normal to see even 2 or 3 in a row that are longer.

In particular, the Linux Nightly results look totally fine to me. There's only a single slightly longer CC. The OSX beta result also looks fine to me. We get 2 or 3 CCs in a row for a process, but then the times return to normal. The beta Windows looks a little less good, because the CC time goes up to around 200ms for 5 or 6 CCs in a row, but it does eventually drop down again.

The Linux beta result does look a little problematic, where the total time is very long (2000ms), and it is persisting for 3 CCs in a row. Is there a way to see more context? I only see those 3 CCs for that process. Again, if the times are much shorter before and after that, I think it isn't too bad.

As seen in comment 16, the big problem here was when we had long total CC times on every CC.
Added this to Firefox and ESR 60.0.1 release notes
relnote-firefox: --- → 60+
(Reporter)

Comment 47

10 months ago
I can no longer reproduce the bug on Firefox nightly (62.0a1 - 2018-05-15). Good job!
Flags: needinfo?(savy.mathieu)
In terms of the test plan, I should add that an increase in CC times after opening and particularly closing tabs is expected, even to multiple seconds.

Comment 49

10 months ago
I can reproduce this issue on Firefox 59.0.3 (20180427210249) under Win 7 32-bit, Mac OS X 10.13.3 and Ubuntu 14.04 64-bit.

The total time for the content process is over 100ms and in some of the cases it went over 2000ms.

This issue is verified as fixed on Firefox 62.0a1 latest, Firefox 61.0b5 (20180514150347), Firefox 60.0.1 (20180514191118) and ESR 60.0.1 (20180514191126) under Win 7 32-bit, Mac OS X 10.13.3 and Ubuntu 14.04 64-bit.

Here is a link with the tests results: https://docs.google.com/document/d/1V00C0NsQqJPHQFd5ZK0rpIAi-1v0YpakWVHDVC-hv5E/edit# 

Some increases of the content process, have been seen when we opened/closed tabs, but this it is expected. (comment48)
Status: RESOLVED → VERIFIED
status-firefox60: fixed → verified
status-firefox61: fixed → verified
status-firefox62: fixed → verified
status-firefox-esr60: fixed → verified
Flags: needinfo?(stefan.georgiev)
Flags: needinfo?(kraj)
Flags: needinfo?(cosmin.badescu)
See Also: → bug 1462391
Duplicate of this bug: 1443299
You need to log in before you can comment on or make changes to this bug.