Closed
Bug 1118618
Opened 10 years ago
Closed 10 years ago
[e10s] Slow script/plugin hang UI
Categories
(Core :: DOM: Content Processes, defect)
Tracking
()
RESOLVED
FIXED
mozilla38
Tracking | Status | |
---|---|---|
e10s | m4+ | --- |
People
(Reporter: billm, Assigned: billm)
References
Details
Attachments
(3 files, 1 obsolete file)
72.32 KB,
patch
|
mrbkap
:
review+
|
Details | Diff | Splinter Review |
3.81 KB,
patch
|
bent.mozilla
:
review+
|
Details | Diff | Splinter Review |
18.43 KB,
patch
|
mconley
:
review+
|
Details | Diff | Splinter Review |
Here's the plan for this bug: 1. If a web page has a slow script, we tell the chrome process and it puts up a UI describing the problem and giving the user the option to terminate the script, debug it, kill the content process, or ignore the slow script. 2. If a plugin hangs, we tell the chrome process and it puts up a UI describing the problem and giving the user the option to kill the plugin, kill the content process, or ignore the hang. 3. If there chrome process is waiting for a CPOW when we want to display the hang UI, then we time out the CPOW. I'll file a separate bug to figure out the best UX for this. Right now I have a simple UI that I think we should start with.
Assignee | ||
Comment 1•10 years ago
|
||
There's a comment in ProcessHangMonitor.cpp that describes the architecture. I hope it's sufficient.
Attachment #8545028 -
Flags: review?(mrbkap)
Assignee | ||
Comment 2•10 years ago
|
||
This is pretty straightforward. There's a comment in nsIHangReport.idl in the previous patch that probably would be helpful to read.
Attachment #8545029 -
Flags: review?(mconley)
Assignee | ||
Comment 3•10 years ago
|
||
Almost forgot this patch. It allows us to find out when a synchronous IPC starts and ends.
Attachment #8545403 -
Flags: review?(bent.mozilla)
Comment 5•10 years ago
|
||
Comment on attachment 8545029 [details] [diff] [review] frontend-changes Review of attachment 8545029 [details] [diff] [review]: ----------------------------------------------------------------- This is really exciting stuff, Bill! Very, very cool. Just a few notes. Thanks! ::: browser/base/content/browser.js @@ +1192,5 @@ > Services.obs.addObserver(gXPInstallObserver, "addon-install-complete", false); > window.messageManager.addMessageListener("Browser:URIFixup", gKeywordURIFixup); > window.messageManager.addMessageListener("Browser:LoadURI", RedirectLoad); > > + ProcessHangMonitor.init(); It looks like ProcessHangMonitor is only supposed to be initted once globally. Initializing it should probably be moved to nsBrowserGlue.js, since that's where we init global things instead of per-window things. ::: browser/base/content/tabbrowser.xml @@ +1489,5 @@ > > if (wasActive) > aBrowser.focus(); > > + var evt = document.createEvent("Events"); let, not var ::: browser/locales/en-US/chrome/browser/browser.dtd @@ +825,5 @@ > <!ENTITY panicButton.thankyou.msg1 "Your recent history is cleared."> > <!ENTITY panicButton.thankyou.msg2 "Safe browsing!"> > <!ENTITY panicButton.thankyou.buttonlabel "Thanks!"> > > +<!ENTITY processHang.terminateScript.label "Terminate JavaScript"> We might want to run some of these strings by UX or Matej Novak to make sure we've got the terminology right for our users. There's some pretty scary technical language in here, and the audience isn't going to just be developers. ::: browser/locales/en-US/chrome/browser/browser.properties @@ +435,5 @@ > dataReportingNotification.button.label = Choose What I Share > dataReportingNotification.button.accessKey = C > > +# Process hang reporter > +processHang.message = A web page is causing %1$S to run slowly. What would you like to do? Nit - I don't think the spaces in front of = add much, unless you want to align it with the other items. I wouldn't worry about aligning though. ::: browser/modules/ProcessHangMonitor.jsm @@ +10,5 @@ > +let Cu = Components.utils; > + > +this.EXPORTED_SYMBOLS = ["ProcessHangMonitor"]; > + > +Cu.import("resource://gre/modules/Services.jsm"); Might as well have this loaded lazily. @@ +13,5 @@ > + > +Cu.import("resource://gre/modules/Services.jsm"); > + > +/** > + * This JSM is responsible for observing content process hang reports This is great, but in general I think each method of this module needs to be documented individually. @@ +29,5 @@ > + // List of hang reports that haven't expired or been dismissed by > + // the user. The list contains pairs [report, timer]. Each time the > + // hang is reported, the timer is refreshed so it expires after > + // HANG_EXPIRATION_TIME. > + _activeReports: [], Instead of an Array, this is probably a great place to use a Map that keys on reports to timers. @@ +90,5 @@ > + callback: null, > + }]; > + > + nb.appendNotification(message, "process-hang", > + "chrome://browser/content/aboutRobots-icon.png", This is probably fine for now, but we should file a follow-up bug to make sure UX gets a chance to supply something better if they have something. @@ +109,5 @@ > + let frameLoader = browser.QueryInterface(Ci.nsIFrameLoaderOwner).frameLoader; > + let reportToShow = null; > + for (let [report, timer] of this._activeReports) { > + if (report.isReportForBrowser(frameLoader)) { > + reportToShow = report; Might as well just return report here, and return null at the end of the function? Not sure we need reportToShow. @@ +132,5 @@ > + let callback = { > + handleResult: (pref) => { > + ignore = pref.value; > + }, > + handleCompletion: () => { I guess we don't care about the reason passed to this function? @@ +159,5 @@ > + > + // If a new tab is selected or if a tab changes remoteness, then > + // we may need to show or hide a hang notification. > + > + if (event.type == "TabSelect") { Make this an or? @@ +172,5 @@ > + win.gBrowser.tabContainer.addEventListener("TabRemotenessChange", this, true); > + }, > + > + untrackWindow: function(win) { > + win.gBrowser.tabContainer.removeEventListener("TabSelect", this, true); We need to untrack a window when it closes, otherwise I think we're going to leak. Where do we monitor for window closing so we can untrack? @@ +176,5 @@ > + win.gBrowser.tabContainer.removeEventListener("TabSelect", this, true); > + win.gBrowser.tabContainer.removeEventListener("TabRemotenessChange", this, true); > + }, > + > + updateWindows: function() { Nit - can you please put this closer to updateWindow? @@ +302,5 @@ > + } > + let uri = browser.currentURI.spec; > + cps.set(uri, "browser.hangs.ignore.slow_script", true, browser.loadContext); > + } else { > + cps.setGlobal("browser.hangs.ignore.plugin", true, null); So we're going to ignore plugin hangs for all domains if the user chooses to ignore for a single domain? Or are you thinking along the lines of, "we got a Flash hang, ignore all Flash hangs from here on out"? If so, I'm worried that we're mixing mental models for the user here. Some choices are going to be applied to domains, and some are going to be applied globally. That feels confusing.
Attachment #8545029 -
Flags: review?(mconley) → review-
Assignee | ||
Comment 6•10 years ago
|
||
New patch.
Attachment #8545029 -
Attachment is obsolete: true
Attachment #8546108 -
Flags: review?(mconley)
Assignee | ||
Comment 7•10 years ago
|
||
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #5) > Comment on attachment 8545029 [details] [diff] [review] > frontend-changes > > Review of attachment 8545029 [details] [diff] [review]: > ----------------------------------------------------------------- > > We might want to run some of these strings by UX or Matej Novak to make sure > we've got the terminology right for our users. There's some pretty scary > technical language in here, and the audience isn't going to just be > developers. I filed bug 1119442 for UX. I also made the terminology a little simpler. > ::: browser/modules/ProcessHangMonitor.jsm > @@ +10,5 @@ > > +let Cu = Components.utils; > > + > > +this.EXPORTED_SYMBOLS = ["ProcessHangMonitor"]; > > + > > +Cu.import("resource://gre/modules/Services.jsm"); > > Might as well have this loaded lazily. I don't think it makes sense to load Services lazily. Pretty much every module uses it, so it's already going to be loaded. The only savings is the cost of loading an already loaded module, which should be cheap. > @@ +172,5 @@ > > + win.gBrowser.tabContainer.addEventListener("TabRemotenessChange", this, true); > > + }, > > + > > + untrackWindow: function(win) { > > + win.gBrowser.tabContainer.removeEventListener("TabSelect", this, true); > > We need to untrack a window when it closes, otherwise I think we're going to > leak. Where do we monitor for window closing so we can untrack? When a window is closed, Gecko automatically disconnects all its event listeners. So there isn't any need for this. > @@ +302,5 @@ > > + } > > + let uri = browser.currentURI.spec; > > + cps.set(uri, "browser.hangs.ignore.slow_script", true, browser.loadContext); > > + } else { > > + cps.setGlobal("browser.hangs.ignore.plugin", true, null); > > So we're going to ignore plugin hangs for all domains if the user chooses to > ignore for a single domain? > > Or are you thinking along the lines of, "we got a Flash hang, ignore all > Flash hangs from here on out"? > > If so, I'm worried that we're mixing mental models for the user here. Some > choices are going to be applied to domains, and some are going to be applied > globally. That feels confusing. Yeah, the ignore thing kinda bothered me. I really didn't like how you can't stop ignoring hangs. The content pref service unfortunately doesn't have any kind of generic UI, and I don't know how we would expose that otherwise. I don't think ignoring hangs is all that useful. They should be rare and you can always just close the box. So I removed all the ignore machinery.
Comment 8•10 years ago
|
||
Comment on attachment 8546108 [details] [diff] [review] frontend-changes v2 Review of attachment 8546108 [details] [diff] [review]: ----------------------------------------------------------------- I think this is fine, once the last set of things I found get fixed. ::: browser/base/content/browser.js @@ +148,5 @@ > > XPCOMUtils.defineLazyModuleGetter(this, "PageThumbs", > "resource://gre/modules/PageThumbs.jsm"); > > +// Needed by browser.xul. TBH, I don't think this comment adds much value. ::: browser/modules/ProcessHangMonitor.jsm @@ +244,5 @@ > + * Install event handlers on |win| to watch for events that would > + * cause a different hang report to be displayed. > + */ > + trackWindow: function(win) { > + //win.gBrowser.tabContainer.addEventListener("TabSelect", this, true); Commented out code... this should probably be put back in? Or, the untrackWindow function needs to get the removeEventListener taken out. @@ +276,5 @@ > + timer.cancel(); > + timer.initWithCallback(this, HANG_EXPIRATION_TIME, timer.TYPE_ONE_SHOT); > + } > + > + // Otherwise create a new timer and display the report. This doesn't look right - we're going to fall through and create a new timer and map it to this report even if we already had one. Was this supposed to be in an else?
Attachment #8546108 -
Flags: review?(mconley) → review+
Comment on attachment 8545403 [details] [diff] [review] sync-notification Review of attachment 8545403 [details] [diff] [review]: ----------------------------------------------------------------- Looks good!
Attachment #8545403 -
Flags: review?(bent.mozilla) → review+
Comment 10•10 years ago
|
||
Comment on attachment 8545028 [details] [diff] [review] platform-changes Review of attachment 8545028 [details] [diff] [review]: ----------------------------------------------------------------- This looks good. A couple of nits. ::: dom/base/nsGlobalWindow.cpp @@ +10965,5 @@ > + filename.get(), > + lineno); > + if (action == ProcessHangMonitor::Terminate) { > + return KillSlowScript; > + } else if (action == ProcessHangMonitor::StartDebugger) { Nit: here and below: nix the else-after-returns. ::: dom/ipc/ProcessHangMonitor.cpp @@ +190,5 @@ > + > +template<> > +struct RunnableMethodTraits<HangMonitorChild> > +{ > + typedef HangMonitorChild Class; Nit: these traits structs should use two-space indentation. @@ +387,5 @@ > +} > + > +void > +HangMonitorChild::ClearHang() > +{ Can we assert IsMainThread here?
Attachment #8545028 -
Flags: review?(mrbkap) → review+
Assignee | ||
Comment 11•10 years ago
|
||
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/f7d82a8a8e94 remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/8ab6c26d26f5
Comment 12•10 years ago
|
||
sorry had to back this out for bustage like: https://treeherder.mozilla.org/logviewer.html#?job_id=5388242&repo=mozilla-inbound
Comment 13•10 years ago
|
||
https://hg.mozilla.org/projects/cedar/rev/23cc21e9bfef https://hg.mozilla.org/projects/cedar/rev/c1549e8bf04e
Assignee | ||
Comment 14•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/cf5c463cd993 https://hg.mozilla.org/integration/mozilla-inbound/rev/e12319cbdcd3
Assignee | ||
Comment 15•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/6f0f1a79b379
Assignee | ||
Comment 16•10 years ago
|
||
Had to back this out because it seems to be slowing down b2g ICS debug xpcshell tests to the point where they sometimes time out. A bit strange since it's disabled on b2g, but maybe it's real. I'll investigate. https://hg.mozilla.org/integration/mozilla-inbound/rev/13c3bf8c3e21
And a clobber touch to hopefully fix build bustage: https://hg.mozilla.org/integration/mozilla-inbound/rev/1810fd0f78b4
https://hg.mozilla.org/mozilla-central/rev/1810fd0f78b4
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Assignee | ||
Updated•10 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 19•10 years ago
|
||
Relanded in one changeset (I was lazy). https://hg.mozilla.org/integration/mozilla-inbound/rev/63680efe6d55 I had to set the max script run time to 0 on b2g. Otherwise I guess we trigger the interrupt callback too much and it's too slow.
Comment 20•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/63680efe6d55
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Comment 22•10 years ago
|
||
¡Hola Bill! Is there some threshold to be tweaked? I do see the new UI quite a few time on Nightly's start-up and I'm pretty sure I do see this way more often than the old modal window. ¡Gracias!
Flags: needinfo?(wmccloskey)
Assignee | ||
Comment 23•10 years ago
|
||
(In reply to alex_mayorga from comment #22) > I do see the new UI quite a few time on Nightly's start-up and I'm pretty > sure I do see this way more often than the old modal window. If you're seeing the UI after waking up your laptop, then that's bug 1114345. However, we also decreased the threshold for when it appears. The old slow script dialog appeared after 10 seconds. This one appears after 3 seconds. That number may require some tweaking. However, unlike the old dialog, the script continues to run while the pop-up is shown, so it doesn't really interrupt what the user is doing so much. What was happening on your computer when the pop-up appeared? Three seconds is still a long time for a script to run. Does your computer swap a lot? Were there programs running in the background?
Flags: needinfo?(wmccloskey) → needinfo?(alex_mayorga)
Comment 24•10 years ago
|
||
¡Hola Bill! As it is when I try to move the pointer to the button the notification would be gone by the time I reach it so it is rather confusing. Perhaps the notification bar should stick around until dismissed? I see it pop up a lot during a session restore of not to big sessions (i.e. a couple windows and less than 20 tabs total). My feature request is for it not to pop-up that many times when doing a session restore. The physical memory is at 70% tops so is not really swapping. Did you try this with e10s?
Flags: needinfo?(alex_mayorga) → needinfo?(wmccloskey)
Assignee | ||
Comment 25•10 years ago
|
||
Hopefully bug 1123956 will fix some of these issues. The popup is supposed to stay around for 5 seconds after the hang ends. I've increased that to 10 seconds. If that's not enough, I guess we can keep it around until the user closes it. > Did you try this with e10s? Yes. This bug is specific to e10s.
Flags: needinfo?(wmccloskey)
You need to log in
before you can comment on or make changes to this bug.
Description
•