Closed Bug 1434593 Opened 6 years ago Closed 6 years ago

Nightly sometimes completely freezes while being in autoscrolling mode

Categories

(Core :: Graphics: WebRender, defect, P1)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox-esr52 --- unaffected
firefox58 --- unaffected
firefox59 --- unaffected
firefox60 --- unaffected

People

(Reporter: jan, Assigned: botond)

References

(Blocks 1 open bug)

Details

(Keywords: hang, nightly-community)

Attachments

(3 files)

Nightly 60 x64 20180131100706 de_DE @ Debian Testing (KDE, Radeon RX480)
main profile: gfx.webrender.all, general.autoScroll and others

Sometimes, while in autoscrolling mode, my Nightly freezes completely and I have to kill it.
This can sometimes also happen when I click on my bookmarks menu toolbar icon.

I reproduced it a few minutes ago in a fresh profile with gfx.webrender.all;true and general.autoScroll;true on about:newtab, but only by luck: either it is intermittent or I don't know the correct STR.

Since I don't use Nightly without WR, I can't say at the moment whether this also regressed without WR.

I am suffering from bug 1432375 and its "Failed to lock new back buffer". Maybe they are related?
Unless my memory is failing me, this feels like a regression from sometime in January. I have had hit-testing enabled since 2017-01-27. I will disable it for some hours to be absolutely sure that it isn't the cause.
main profile: gfx.webrender.all;true, gfx.webrender.hit-test;FALSE, general.autoScroll;true and others.
(Restarted multiple times.)

It looks like hit-testing is innocent.
I was autoscrolling around on https://discovery.cryptosense.com/analyze/mx.h.terrax.net/f7be078 and got the full freeze.
It's plausible that this is a result of bug 1433579. I'll write a patch for that soon and we can see if it fixes it for you.
Depends on: 1433579
Can you see if you can still reproduce this?
Flags: needinfo?(jan)
Haven't seen it yet. Looks promising. Thank you! :)
Status: NEW → RESOLVED
Closed: 6 years ago
No longer depends on: 1433579
Flags: needinfo?(jan)
Resolution: --- → DUPLICATE
Nightly 60 x64 20180202102708 de_DE @ Debian Testing (KDE, Radeon RX480)
main profile: webrender + gpu process

It happened again.
(Yesterday I re-enabled the gpu-process because of bug 1432375 comment 3.)

Is displaying the autoscroll icon a problem itself? (Like showing a bookmarks (sub)menu which can get black?)
But I've never seen a black square instead of an autoscroll icon.

Is there something I can do? Could I run some special build and redirect all console output to a log file?
Status: RESOLVED → REOPENED
Depends on: 1433579
Resolution: DUPLICATE → ---
I think if you run a debug build and redirect the output to a log file, that should print some useful information if it detects a deadlock scenario. If it's freezing for some reason other than a deadlock we'll probably need to get a stack of the main thread to see why it's stuck, and the best way to do that is to use gdb and attach to the main process when this happens. Are you familiar with using gdb at all? If not I can provide some instructions.
I haven't seen it anymore.
Status: REOPENED → RESOLVED
Closed: 6 years ago
Resolution: --- → WORKSFORME
Just saw such a freeze on Nightly x64 20180215103933 @ Gentoo (KDE 5 / Intel)
Webrender, no GPU process.

P.S. In my experience it happens only when starting the autoscroll.
Hi, same for me, Nightly x64 20180217100053 Fedora 27 (Gnome/Intel 520) Webrender.all and autoscroll enabled
I know how to reproduce the bug:
1.) Middle click
2.) Move the mouse slightly inside the circle, without scrolling
3.) Scroll up or down with the mouse wheel
4.) Firefox is frozen
It happens only with Webrender enabled
Yes, I can reproduce comment 10 in a fresh profile with gfx.webrender.all. Thank you!
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
STR from comment 10

bad: 2018-01-12, 2017-12-01, 2017-11-05, 2017-09-15

mozregression --good 2017-08-15 --bad 2017-09-15 --pref layers.acceleration.force-enabled:true gfx.webrender.enabled:true gfx.webrendest.enabled:true gfx.webrender.layers-free:true gfx.webrender.blob-images:true image.mem.shared:true layout.display-list.retain:false general.autoScroll:true startup.homepage_welcome_url:"https://hacks.mozilla.org/"
> 20:16.63 INFO: Last good revision: b6c847346cb6b71b24fc5156d36215c32bbcd71d
> 20:16.63 INFO: First bad revision: 46a3b3a3ae708ebee123a8aef0887659fc688750
> 20:16.63 INFO: Pushlog:
> https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=b6c847346cb6b71b24fc5156d36215c32bbcd71d&tochange=46a3b3a3ae708ebee123a8aef0887659fc688750

> 46a3b3a3ae70	Kartikaya Gupta — Bug 1389143 - Send event regions override information to APZ in layers-free mode. r=jrmuizel
> 81116c9af1b1	Kartikaya Gupta — Bug 1389143 - Send event regions items over to APZ in layers-free mode. r=jrmuizel
> a8b2d799b3c0	Kartikaya Gupta — Bug 1389143 - Preserve the lastASR tracker across recursions. r=jrmuizel
> 52917ec218cb	Kartikaya Gupta — Bug 1389143 - Refactor to extract helper method. r=jrmuizel
Attached video 2018-02-18_22-22-16.mp4
(At step 3 I scrolled up and then it froze.)
(In case you wonder: only inside Nightly/ESR/Thunderbird/Chromium do I have a white Gtk mouse pointer; otherwise (and in autoscrolling mode in Nightly/TB/ESR) I have my regular dark KDE mouse pointer.)
Thanks, I can reproduce. Doesn't happen on all pages though. It's another APZ lock ordering problem which results in a deadlock. Still trying to figure out what codepath has the bad lock ordering.
Attached file Backtrace
This seems to be entirely autoscrolling-related and looks like it could happen with webrender disabled as well. In fact I do see the deadlock warning get reported without webrender as well.

I suspect that with WR enabled APZ is frequently acquiring the tree lock in order to push updates to WR, so the deadlock (browser freeze) gets hit much more frequently in that case.
Botond, can you take a look at the backtrace above? Looks like the cancel autoscroll notification (around stack frame #54) triggers a sort of reentrancy back into APZ code at stack frame #7 and it tries to get the tree lock while we are still holding an APZC lock deep in the stack. Is it necessary that all this happen synchronously, or can we dispatch the notification async to fix this ordering issue?
Flags: needinfo?(botond)
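To illustrate the question above, here is a minimal single-threaded sketch of why a synchronous notification can produce a bad lock order while a deferred one does not. Everything in it is hypothetical: `withLock`, `ORDER`, `pending` and `cancelAutoscroll` are illustration names, not APZ code, and the rule "tree lock before APZC lock" is an assumption inferred from the backtrace description. It only models the ordering; it is not the fix that landed in this bug.

```javascript
// Assumed ordering rule: the tree lock must be taken before any APZC lock.
const ORDER = { tree: 0, apzc: 1 };
const held = [];        // locks held by the current call stack, in acquisition order
const violations = [];  // recorded lock-order violations
const pending = [];     // notifications deferred until no lock is held

// Run fn while "holding" the named lock, recording any order violation.
function withLock(name, fn) {
  const last = held[held.length - 1];
  if (last !== undefined && ORDER[name] < ORDER[last]) {
    violations.push(`${name} acquired while holding ${last}`);
  }
  held.push(name);
  try {
    return fn();
  } finally {
    held.pop();
  }
}

// Models APZ canceling an autoscroll while holding an APZC lock.
// The observer re-enters APZ and wants the tree lock.
function cancelAutoscroll(notifySync) {
  withLock("apzc", () => {
    const notify = () => withLock("tree", () => {});
    if (notifySync) {
      notify();             // re-entrancy while the APZC lock is still held
    } else {
      pending.push(notify); // deferred: runs after the APZC lock is released
    }
  });
}

// Run deferred notifications once the stack has unwound.
function drainPending() {
  while (pending.length > 0) {
    pending.shift()();
  }
}

cancelAutoscroll(true);
const syncViolations = violations.length;

cancelAutoscroll(false);
drainPending();
const asyncViolations = violations.length - syncViolations;

console.log({ syncViolations, asyncViolations });
```

In this model the synchronous path records one violation and the deferred path records none, because the deferred callback runs only after the APZC lock has been released.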
That is quite the stack trace :)
Assignee: nobody → botond
Flags: needinfo?(botond)
When APZ cancels an autoscroll, it notifies browser.xml (via the apz:cancel-autoscroll observer notification). There is no need for browser.xml to then send a notification back to APZ via TabParent::StopApzAutoscroll().

I actually have code intended to prevent this from happening:

    // Set this._autoScrollScrollId to null, so in stopScroll() we
    // don't call stopApzAutoscroll() (since it's APZ that
    // initiated the stopping).
    this._autoScrollScrollId = null;
    this._autoScrollPresShellId = null;

but I'm running this code *after* calling hidePopup(). It didn't occur to me that hidePopup() can fire the "popuphidden" event (whose handler calls stopScroll()) synchronously.
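The ordering problem can be sketched with a small stand-alone model. `FakePopup` and `FakeBrowser` are hypothetical stand-ins, not the real browser.xml binding, and the counter only stands in for the call that ends up in TabParent::StopApzAutoscroll(). The sketch assumes, as described above, that hidePopup() fires "popuphidden" synchronously and that the handler calls stopScroll().

```javascript
// Hypothetical popup: hidePopup() invokes the "popuphidden" handler
// synchronously, before returning to the caller.
class FakePopup {
  constructor() {
    this.onpopuphidden = null;
  }
  hidePopup() {
    if (this.onpopuphidden) {
      this.onpopuphidden();
    }
  }
}

class FakeBrowser {
  constructor() {
    this.popup = new FakePopup();
    this.popup.onpopuphidden = () => this.stopScroll();
    this._autoScrollScrollId = 42;   // arbitrary illustrative ids
    this._autoScrollPresShellId = 7;
    this.stopApzAutoscrollCalls = 0; // counts notifications sent back to APZ
  }

  stopScroll() {
    // Only notify APZ if an APZ autoscroll is still recorded as active.
    if (this._autoScrollScrollId !== null) {
      this.stopApzAutoscrollCalls++;
      this._autoScrollScrollId = null;
      this._autoScrollPresShellId = null;
    }
  }

  // Order described as problematic: ids are cleared after hidePopup(),
  // so the synchronous stopScroll() still sees a non-null id.
  onApzCancelClearAfter() {
    this.popup.hidePopup();
    this._autoScrollScrollId = null;
    this._autoScrollPresShellId = null;
  }

  // Ids are cleared first, so stopScroll() skips the call back to APZ.
  onApzCancelClearBefore() {
    this._autoScrollScrollId = null;
    this._autoScrollPresShellId = null;
    this.popup.hidePopup();
  }
}

const clearAfter = new FakeBrowser();
clearAfter.onApzCancelClearAfter();

const clearBefore = new FakeBrowser();
clearBefore.onApzCancelClearBefore();

console.log(clearAfter.stopApzAutoscrollCalls, clearBefore.stopApzAutoscrollCalls);
```

With the ids cleared after hidePopup(), the model sends one notification back to APZ; with the ids cleared first, it sends none.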
I can repro the hang (intermittently) without the fix, and not with it. I also checked what the compositor thread is doing during the deadlock, and it is indeed in PushStateToWR().
Comment on attachment 8952834 [details]
Bug 1434593 - Ensure that browser.xml does not send APZ back a notification after APZ notifies it of canceling autoscroll.

https://reviewboard.mozilla.org/r/222062/#review227958

LGTM, thanks!
Attachment #8952834 - Flags: review?(bugmail) → review+
Pushed by bballo@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8455df429711
Ensure that browser.xml does not send APZ back a notification after APZ notifies it of canceling autoscroll. r=kats
https://hg.mozilla.org/mozilla-central/rev/8455df429711
Status: REOPENED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Thanks, fixed! Now I can use webrender for daily usage.