Closed Bug 1062907 Opened 10 years ago Closed 10 years ago

3% Ubuntu 64 Session Restore test on Inbound (v.35) September 3rd from push 4b56ee08673c

Categories

(Testing :: Talos, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jmaher, Unassigned)

References

Details

(Keywords: perf, regression, Whiteboard: [talos_regression])

we have a clear jump in our session restore numbers: http://graphs.mozilla.org/graph.html#tests=[[313,131,35]]&sel=none&displayrange=30&datatype=running I did some retriggers here: https://tbpl.mozilla.org/?tree=Mozilla-Inbound&fromchange=82e1223b8639&tochange=55f0e713ff42&jobname=Ubuntu%20HW%2012.04%20x64%20mozilla-inbound%20talos%20other you can see we stay <1440 and after the regression we are always >1450 So the offending push: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=4c3e6c741c46&tochange=4b56ee08673c lots of stuff landed at the same time, so many people get to help figure out the root cause!
when I get access to bug 1061027 and bug 1061028, I will mark this as blocking them until we find the root cause.
:Yoric, I could use your expert opinion here on what is causing the regression
Flags: needinfo?(dteller)
Blocks: 1061027, 1061028
I strongly doubt that this is bug 992894, as that bug introduces only Windows-specific code. At a glance, I do not not have an expert opinion on the rest. Nothing appears related to Session Restore.
Flags: needinfo?(dteller)
(In reply to Joel Maher (:jmaher) from comment #1) I'm sure these two changes are not the cause of this regression. Bug 1061027 is merely a compile-time change and does not change the dynamic runtime behavior in any way (the object code should be identical). The changed function is only used for MathML content. Bug 1061028 adds a do_QueryFrame() call in a method (RemoveLetterFrames) which is only used for elements that have a matching ::first-letter style rule, which I doubt any of our chrome has.
mats, thanks for taking a few minutes to look at this. It could easily be another patch.
This shouldn't be by bug 1061618 - "Extend MOZ_UPDATER ifdef to fix blank pages on about:preferences" because that code change only affects the UI code of the Preferences and is only the modification of the parameters of an event listener and the exclusion of an event listener if Firefox is built without updater. The sessionstore.js in the profile of the test doesn't contain about:preferences: http://hg.mozilla.org/build/talos/file/849898ecb9be/talos/startup_test/sessionrestore/profile/sessionstore.js
Fwiw, bug 1050214 added 256 strings of 4 chars and 4 strings of 2 chars in any JSRuntime (i.e. any JSContext or equivalently, any global), so this might have a small impact on memory, but I don't expect that big an impact Is there a way to run the session restore benchmarks on a trybuild? I could try to backout it locally, just to see whether it has an impact.
this isn't obvious via trychooser: try: -b o -p linux64 -u none -t other_l64 if for some reason that doesn't work, then do: try: -b o -p linux64 -u none -t all thanks for looking into this :bbouvier!
Without the backout of bug 1050214 (i.e. current state): https://tbpl.mozilla.org/?tree=Try&rev=707d1b5d74e6 With the backout: https://tbpl.mozilla.org/?tree=Try&rev=1812931b5a4c No problem for helping looking into this (I've mentored the discussed bug). I'll look at the test results once the try builds are both done.
Flags: needinfo?(benj)
Not sure how to read this, but it seems that bug 1050214 is the culprit here: https://datazilla.mozilla.org/?product=Firefox&x86=false&repository=Try-Non-PGO&os_version=Ubuntu%2012.04&stop=1410274781&project=talos&start=1409669981&graph_search=1812931b5a4c&os=linux&test=media_tests&test=ts_paint&test=tpaint&test=a11yr&test=sessionrestore Can you please confirm? If that's the case, not sure we can do any better (I've been told the same thing happened when Intl in JS landed: a lot of new stuff had to be added anyways, so these regressions were unavoidable :/).
Flags: needinfo?(benj) → needinfo?(jmaher)
Hey bbouvier, it looks like bug 1050214 is the problem- and it is true that sometimes changing modules or pgo, or other random things cause us to regress. Usually this is seen on PGO and regular opt builds are not affected. As this is a single platform, single test, and the patch doesn't have obvious gotchas- I would advise resolving this as wontfix and being satisfied that we have some form of documentation as to what the regression is related to. Feel free to resolve this if you agree.
Flags: needinfo?(jmaher)
Flags: needinfo?(benj)
I've asked other JS peers if there was another way to do the same thing without affecting too much memory, but it's inherent to the feature: we're adding new names to an object which is in the global namespace, and these names need to be stored at some point. We're doing a quick refactoring of this code in bug 1066020 but this shouldn't affect memory. I don't see anything else we can do that would help here :/ so attemptively closing this as wontfix.
No longer blocks: 992894, 1061027, 1061028
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(benj)
Resolution: --- → WONTFIX
bbouvier, who made the decision that this feature is worth the 3% regression? Is that documented anywhere?
Flags: needinfo?(benj)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #13) > bbouvier, who made the decision that this feature is worth the 3% > regression? Is that documented anywhere? Here is the full rationale: the regressing bug adds static properties (258 of them) to a global JS object, the SIMD object. SIMD in JavaScript is not in the ES6 spec yet, it shall be in ES7 but the spec is only being started to be written, so these static properties may move away in the future. In the meanwhile, considering that: 1) the SIMD object is present in Nightlies only, so this doesn't affect non-Nightly users. 2) according to comment 11, it happens on only one platform. 3) I have been told Intl.js has added some regressions like this in the past, for similar reasons. 4) we've always added experimental features in Nightly without any issues, and (imo) we even stand as leaders in the field of implementing new experimental JS features. With all of this in mind, I indeed (implicitly) made the choice to keep this. I realize this is not my call to decide whether we can accept this regression or not, even on Nightlies, and would back it out without any problems, were I asked to do so. Please let me know your call about it.
Flags: needinfo?(benj) → needinfo?(benjamin)
I doubt it's my call either: I'm mainly asking for clarity about who's accountable for this decision. Is that Jorendorff as module owner of JS?
Flags: needinfo?(benjamin) → needinfo?(jorendorff)
Till reminded me: The self-hosting global is shared with worker threads, so it must be initialized and then frozen. JS_DefineFunctions already has some code that prevents us from trying to define self-hosted functions when initializing the self-hosting global: if (cx->runtime()->isSelfHostingGlobal(cx->global())) continue; We can use similar logic to make sure SIMD is not defined in the self-hosting global. This is a somewhat short-sighted fix. I'm open to alternatives.
Flags: needinfo?(jorendorff)
Reopening. > This is a somewhat short-sighted fix. I'm open to alternatives. New idea in bug 1067459. Till is taking it. Should have more info tomorrow.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Bug 1084319 reverted these changes, as the SIMD masks won't be part of the spec anymore. Now that bug 1084319 landed, has the situation improved?
are you sure it is bug 1084319? I don't see any commit messages linked in there.
oh, I have verified this on the graph server, thanks for updating this bug. Is there any chance we can do this same backout on Aurora?
(In reply to Joel Maher (:jmaher) from comment #21) > oh, I have verified this on the graph server, thanks for updating this bug. > Is there any chance we can do this same backout on Aurora? Patch 2 from this bug (which does the backout) applies cleanly on Aurora and build cleanly. If it sticks in inbound, I'll ask for uplift.
(In reply to Joel Maher (:jmaher) from comment #21) > oh, I have verified this on the graph server, thanks for updating this bug. > Is there any chance we can do this same backout on Aurora? Thinking about it, the SIMD object is Nightly only, so the object isn't even created in Aurora at all, so I doubt this will have any effect on memory usage. Can we close now that bug 1083238 landed in m-c?
Flags: needinfo?(jmaher)
there is a good chance this didn't make it to aurora, since our fix for this is nightly only and the aurora session restore regressions are not as bad as trunk I am fine closing this.
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Flags: needinfo?(jmaher)
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.