Closed
Bug 1434593
Opened 6 years ago
Closed 6 years ago
Nightly sometimes completely freezes while being in autoscrolling mode
Categories
(Core :: Graphics: WebRender, defect, P1)
RESOLVED
FIXED
mozilla60
Version | Tracking | Status
---|---|---|
firefox-esr52 | --- | unaffected |
firefox58 | --- | unaffected |
firefox59 | --- | unaffected |
firefox60 | --- | unaffected |
People
(Reporter: jan, Assigned: botond)
References
(Blocks 1 open bug, )
Details
(Keywords: hang, nightly-community)
Attachments
(3 files)
Nightly 60 x64 20180131100706 de_DE @ Debian Testing (KDE, Radeon RX480)
main profile: gfx.webrender.all, general.autoScroll and others

Sometimes, while in autoscrolling mode, my Nightly completely freezes and I have to kill it. This can sometimes also happen if I click on my bookmarks menu toolbar icon.

A few minutes ago I reproduced it in a fresh profile with gfx.webrender.all;true and general.autoScroll;true on about:newtab, but that was luck: it seems to be intermittent, or I am not aware of the correct STR. Since I'm not currently using a non-WR Nightly, I can't say whether it's a non-WR regression.

I am suffering from bug 1432375 and its "Failed to lock new back buffer". Maybe they are related?
Reporter | ||
Comment 1•6 years ago
|
||
Either I am forgetful, or this is a regression from sometime in January. I have had hit-testing enabled since 2017-01-27. I will disable it for a few hours to be absolutely sure that it isn't the cause.
Reporter | ||
Comment 2•6 years ago
|
||
main profile: gfx.webrender.all;true, gfx.webrender.hit-test;FALSE, general.autoScroll;true and others. (Restarted multiple times.)

It looks like hit-testing is innocent. I was autoscrolling around on https://discovery.cryptosense.com/analyze/mx.h.terrax.net/f7be078 and got the full freeze.
Comment 3•6 years ago
|
||
It's plausible that this could be a result of bug 1433579. I'll write a patch for that soon and we can see if it fixes it for you.
Depends on: 1433579
Updated•6 years ago
|
Blocks: stage-wr-nightly
Priority: -- → P1
Reporter | ||
Comment 5•6 years ago
|
||
Haven't seen it yet. Looks promising. Thank you! :)
Status: NEW → RESOLVED
Closed: 6 years ago
status-firefox60:
affected → ---
No longer depends on: 1433579
Flags: needinfo?(jan)
Resolution: --- → DUPLICATE
Reporter | ||
Comment 6•6 years ago
|
||
Nightly 60 x64 20180202102708 de_DE @ Debian Testing (KDE, Radeon RX480)
main profile: webrender + gpu process

It happened again. (Yesterday I re-enabled the gpu-process because of bug 1432375 comment 3.)

Is displaying the autoscroll icon a problem in itself (like showing a bookmarks (sub)menu, which can turn black)? I've never seen a black square instead of an autoscroll icon, though.

Is there something I can do? Could I run some special build and redirect all console output to a log file?
Comment 7•6 years ago
|
||
I think if you run a debug build and redirect the output to a log file, that should print some useful information if it detects a deadlock scenario. If it's freezing for some reason other than a deadlock we'll probably need to get a stack of the main thread to see why it's stuck, and the best way to do that is to use gdb and attach to the main process when this happens. Are you familiar with using gdb at all? If not I can provide some instructions.
Reporter | ||
Comment 8•6 years ago
|
||
I haven't seen it anymore.
Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
Resolution: --- → WORKSFORME
Comment 9•6 years ago
|
||
Just saw such a freeze on Nightly x64 20180215103933 @ Gentoo (KDE 5 / Intel) Webrender, no GPU process. P.S. In my experience it happens only when starting the autoscroll.
Comment 10•6 years ago
|
||
Hi, same for me: Nightly x64 20180217100053, Fedora 27 (Gnome/Intel 520), Webrender.all and autoscroll enabled.

I know how to reproduce the bug:
1.) Middle click
2.) Move the mouse slightly inside the circle without scrolling
3.) Scroll up or down with the mouse wheel
4.) Firefox is frozen

It happens only with Webrender enabled.
Reporter | ||
Comment 11•6 years ago
|
||
Yes, I can reproduce comment 10 in a fresh profile with gfx.webrender.all. Thank you!
Status: RESOLVED → REOPENED
status-firefox58:
--- → unaffected
status-firefox59:
--- → unaffected
status-firefox60:
--- → unaffected
status-firefox-esr52:
--- → unaffected
Resolution: WORKSFORME → ---
Reporter | ||
Comment 12•6 years ago
|
||
STR from comment 10
bad: 2018-01-12, 2017-12-01, 2017-11-05, 2017-09-15

mozregression --good 2017-08-15 --bad 2017-09-15 --pref layers.acceleration.force-enabled:true gfx.webrender.enabled:true gfx.webrendest.enabled:true gfx.webrender.layers-free:true gfx.webrender.blob-images:true image.mem.shared:true layout.display-list.retain:false general.autoScroll:true startup.homepage_welcome_url:"https://hacks.mozilla.org/"

> 20:16.63 INFO: Last good revision: b6c847346cb6b71b24fc5156d36215c32bbcd71d
> 20:16.63 INFO: First bad revision: 46a3b3a3ae708ebee123a8aef0887659fc688750
> 20:16.63 INFO: Pushlog:
> https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=b6c847346cb6b71b24fc5156d36215c32bbcd71d&tochange=46a3b3a3ae708ebee123a8aef0887659fc688750
> 46a3b3a3ae70 Kartikaya Gupta — Bug 1389143 - Send event regions override information to APZ in layers-free mode. r=jrmuizel
> 81116c9af1b1 Kartikaya Gupta — Bug 1389143 - Send event regions items over to APZ in layers-free mode. r=jrmuizel
> a8b2d799b3c0 Kartikaya Gupta — Bug 1389143 - Preserve the lastASR tracker across recursions. r=jrmuizel
> 52917ec218cb Kartikaya Gupta — Bug 1389143 - Refactor to extract helper method. r=jrmuizel
Reporter | ||
Comment 13•6 years ago
|
||
(At step 3 I scrolled up and then it froze.)
Reporter | ||
Comment 14•6 years ago
|
||
(In case you wonder: only inside Nightly/ESR/Thunderbird/Chromium do I have a white Gtk mouse pointer; elsewhere (and in autoscrolling mode in Nightly/TB/ESR) I have my regular dark KDE mouse pointer.)
Comment 15•6 years ago
|
||
Thanks, I can reproduce. Doesn't happen on all pages though. It's another APZ lock ordering problem which results in a deadlock. Still trying to figure out what codepath has the bad lock ordering.
Comment 16•6 years ago
|
||
This seems to be entirely autoscrolling-related and looks like it could happen with webrender disabled as well. In fact, I do see the deadlock warning reported without webrender too. I suspect that with WR enabled, APZ acquires the tree lock frequently in order to push updates to WR, so the deadlock (browser freeze) gets hit much more often in that case.
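To illustrate the kind of lock-ordering problem described here, the following is a minimal sketch of a lock-order checker. It is not Gecko's deadlock detector. The class `LockOrderChecker` and the lock names `treeLock` and `apzcLock` are hypothetical, and the sketch assumes the intended order is tree lock first, then the per-controller lock.

```javascript
// Hypothetical sketch: a tiny lock-order checker, not Gecko's deadlock detector.
class LockOrderChecker {
  constructor() {
    this.held = [];         // locks currently held, in acquisition order
    this.edges = new Set(); // observed "A held while acquiring B" pairs, stored as "A>B"
    this.warnings = [];     // human-readable reports of conflicting orders
  }

  acquire(name) {
    for (const outer of this.held) {
      // If the opposite order was observed earlier, two threads using
      // these two paths concurrently could deadlock.
      if (this.edges.has(`${name}>${outer}`)) {
        this.warnings.push(
          `potential deadlock: ${outer} -> ${name} conflicts with ${name} -> ${outer}`
        );
      }
      this.edges.add(`${outer}>${name}`);
    }
    this.held.push(name);
  }

  release(name) {
    const i = this.held.lastIndexOf(name);
    if (i !== -1) this.held.splice(i, 1);
  }
}

const checker = new LockOrderChecker();

// Assumed "good" path: tree lock first, then a per-controller lock.
checker.acquire("treeLock");
checker.acquire("apzcLock");
checker.release("apzcLock");
checker.release("treeLock");

// Inverted path: the per-controller lock is still held when the tree lock is requested.
checker.acquire("apzcLock");
checker.acquire("treeLock");
checker.release("treeLock");
checker.release("apzcLock");

console.log(checker.warnings);
```

The checker only records orderings it has seen, so it reports the inversion even on a run where no freeze actually occurs. That is consistent with a deadlock warning showing up without webrender while the freeze itself is rarer.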
Comment 17•6 years ago
|
||
Botond, can you take a look at the backtrace above? Looks like the cancel autoscroll notification (around stack frame #54) triggers a sort of reentrancy back into APZ code at stack frame #7 and it tries to get the tree lock while we are still holding an APZC lock deep in the stack. Is it necessary that all this happen synchronously, or can we dispatch the notification async to fix this ordering issue?
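The difference between dispatching the notification synchronously and asynchronously can be sketched as follows. This is a simplified single-threaded model, not Gecko code: `makeController`, `holdingApzcLock`, and the manual `pending` queue are all hypothetical stand-ins, and a boolean flag takes the place of a real lock.

```javascript
// Hypothetical model: names below are illustrative, not Gecko APIs.
function makeController(dispatch) {
  const state = { holdingApzcLock: false, reentrantCalls: [] };

  // Listener that calls back into the controller, like the reentrancy
  // described in the comment above. It records whether the lock was
  // still held at the moment it ran.
  const onCancel = () => {
    state.reentrantCalls.push(state.holdingApzcLock ? "lock-held" : "lock-free");
  };

  state.cancelAutoscroll = () => {
    state.holdingApzcLock = true;  // stands in for taking the APZC lock
    dispatch(onCancel);            // notify listeners
    state.holdingApzcLock = false; // stands in for releasing the lock
  };
  return state;
}

// Synchronous dispatch: the listener runs while the lock is still held.
const syncCtl = makeController((fn) => fn());
syncCtl.cancelAutoscroll();

// Asynchronous dispatch: the listener is queued and runs after the lock is released.
const pending = [];
const asyncCtl = makeController((fn) => pending.push(fn));
asyncCtl.cancelAutoscroll();
while (pending.length > 0) pending.shift()();

console.log(syncCtl.reentrantCalls, asyncCtl.reentrantCalls);
```

In the synchronous case the listener observes the lock as held, which is where a second lock acquisition in the wrong order could occur. Queuing the notification moves the listener out from under the lock.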
Flags: needinfo?(botond)
Assignee | ||
Comment 18•6 years ago
|
||
That is quite the stack trace :)
Assignee: nobody → botond
Flags: needinfo?(botond)
Assignee | ||
Comment 19•6 years ago
|
||
When APZ cancels an autoscroll, it notifies browser.xml (via the apz:cancel-autoscroll observer notification). It's needless for browser.xml to then send a notification back to APZ via TabParent::StopApzAutoscroll(). I actually have code intended to prevent this from happening:

    // Set this._autoScrollScrollId to null, so in stopScroll() we
    // don't call stopApzAutoscroll() (since it's APZ that
    // initiated the stopping).
    this._autoScrollScrollId = null;
    this._autoScrollPresShellId = null;

but I'm running this code *after* calling hidePopup(). It didn't occur to me that hidePopup() can fire the "popuphidden" event (whose handler calls stopScroll()) synchronously.
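The ordering problem can be modeled in isolation. The sketch below is a hypothetical stand-in for the autoscroll logic, not the actual browser.xml code or the landed patch: the class `AutoscrollModel`, the `clearIdsFirst` switch, the placeholder id values, and the `apzStopCalls` counter are all assumptions made for illustration. It assumes, as described above, that hiding the popup runs the "popuphidden" handler synchronously.

```javascript
// Hypothetical stand-in for the autoscroll logic in browser.xml.
class AutoscrollModel {
  constructor(clearIdsFirst) {
    this.clearIdsFirst = clearIdsFirst;
    this._autoScrollScrollId = 42;   // arbitrary placeholder id
    this._autoScrollPresShellId = 7; // arbitrary placeholder id
    this.apzStopCalls = 0;           // counts notifications sent back to APZ
  }

  // Stand-in for hidePopup(): fires "popuphidden" synchronously.
  hidePopup() {
    this.onPopupHidden();
  }

  onPopupHidden() {
    this.stopScroll();
  }

  stopScroll() {
    if (this._autoScrollScrollId !== null) {
      this.apzStopCalls++; // stands in for calling stopApzAutoscroll()
      this._autoScrollScrollId = null;
      this._autoScrollPresShellId = null;
    }
  }

  // Stand-in for handling the apz:cancel-autoscroll notification.
  onApzCancelAutoscroll() {
    if (this.clearIdsFirst) {
      // Clear the ids before hiding the popup, so stopScroll() sees null.
      this._autoScrollScrollId = null;
      this._autoScrollPresShellId = null;
      this.hidePopup();
    } else {
      // Problematic order: stopScroll() runs inside hidePopup(),
      // before the ids have been cleared.
      this.hidePopup();
      this._autoScrollScrollId = null;
      this._autoScrollPresShellId = null;
    }
  }
}

const buggy = new AutoscrollModel(false);
buggy.onApzCancelAutoscroll();

const fixed = new AutoscrollModel(true);
fixed.onApzCancelAutoscroll();

console.log(buggy.apzStopCalls, fixed.apzStopCalls);
```

In this model, clearing the ids before hiding the popup is enough to suppress the call back into APZ. Whether the real fix takes exactly this shape is not stated in this comment.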
Comment hidden (mozreview-request) |
Assignee | ||
Comment 21•6 years ago
|
||
I can repro the hang (intermittently) without the fix, and not with it. I also checked what the compositor thread is doing during the deadlock, and it is indeed in PushStateToWR().
Comment 22•6 years ago
|
||
mozreview-review |
Comment on attachment 8952834 [details]
Bug 1434593 - Ensure that browser.xml does not send APZ back a notification after APZ notifies it of canceling autoscroll.

https://reviewboard.mozilla.org/r/222062/#review227958

LGTM, thanks!
Attachment #8952834 -
Flags: review?(bugmail) → review+
Comment 23•6 years ago
|
||
Pushed by bballo@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8455df429711
Ensure that browser.xml does not send APZ back a notification after APZ notifies it of canceling autoscroll. r=kats
Comment 24•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/8455df429711
Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Comment 25•6 years ago
|
||
Thanks, fixed! Now I can use webrender for daily usage.