Closed Bug 1471508 Opened Last year Closed 2 months ago

Webpages don't render at all since Firefox Quantum 60.0

Component: Core :: CSS Parsing and Computation
Type: defect
Priority: P3
Version: 60 Branch
Hardware: x86_64
OS: Windows 10
Status: RESOLVED WORKSFORME
Reporter: bhaalsen
Assignee: Unassigned
Keywords: regression
Attachments: 2 files

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:59.0) Gecko/20100101 Firefox/59.0
Build ID: 20180427210249

Steps to reproduce:

Open any webpage.

Usually, it doesn't happen right away; only after a few page loads. Disabling Addons, using a new Profile or starting empty (vs. loading a previous session) has no effect.

This is on Windows 10 Professional (Version 1709, Build 16299.125); installed as German with everything switched to English using a Language Pack.
Hardware is a Dell OptiPlex 7040 with Intel Core i7-6700, 32GB RAM, two AMD Radeon R5 340X (with Crossfire DISABLED) and two 27'' LG Screens (both at 1920x1080x60Hz @32bpp) connected via DVI (one per GPU)


Actual results:

The status bar shows "Reading $url..." followed by "Transferring data from $url..." after waiting a little, but the webpage view itself stays blank (white background, as if it were about:blank).
Opening the F12 developer tools and picking the Network tab shows the loads successfully complete, and the Response tab in there renders the page correctly.

Pages that were open in a previous session, as well as pages that are listed as Homepage (in my case: multiple ones, delimited by a vertical pipe character) show up correctly on start, but stop working after some time (either by not reacting at all, as if the page was frozen; or by showing the white page on load as described earlier)

This happens with all versions tested starting from 60.0 (including 60.0.1 and 60.0.2) up until 61.0.
The only way to "fix" this is by reverting back to 59.0.3 where the issue does not occur.


Expected results:

Pages load and render correctly instead of the white page.
Slight correction: after trying to reproduce it, nothing happened until I left my machine for a meeting (locking the machine using Windows+L). When I came back, the browser was locked up when trying to navigate either on previously loaded pages or opening a new one.
Trying to load Google shows a white screen but the preview seems fine. No discernable CPU usage (about 1-2%, nothing that really screams).
Please
1. Enter about:support into the address bar.
2. Click the "Copy text to clipboard" button.
3. Paste the text into Notepad and save the file.
4. Click the "Attach File" link above comment 0 to upload it.

(In reply to bhaalsen from comment #0)
> This happens with all versions tested starting from 60.0 (including 60.0.1
> and 60.0.2) up until 61.0.

It's always best to check if an issue is reproducible in a brand new profile with the latest Nightly.
https://support.mozilla.com/kb/profile-manager-create-and-remove-firefox-profiles
https://www.mozilla.org/firefox/nightly/all/

> The only way to "fix" this is by reverting back to 59.0.3 where the issue
> does not occur.

It would be helpful if you could find the regression range.
https://mozilla.github.io/mozregression/quickstart.html
Component: Untriaged → Graphics
Flags: needinfo?(bhaalsen)
OS: Unspecified → Windows 10
Product: Firefox → Core
Hardware: Unspecified → x86_64
Trying the current nightly 63.0a1 (2018-06-26) at the moment; it appears that _just_ locking the machine and coming back immediately doesn't trigger it, so I'll leave it under similar conditions as before (machine locked, left alone for some time). Perhaps some sleep state or similar on the GPU is causing this (although I'm not sure if it also happened with software rendering, which I tried some time back when I first noticed the issue with 60.0).

Depending on my results I'll update about:support later; and I'll also give mozregression a shot.
Saved an output of about:support from 60.0.2 earlier where I was able to reproduce it. Nightly 63.0a1 (2018-06-26) has a similar (if not the same) issue, but it appears after about an hour of usage (regardless of whether the machine was locked or not).
Sometimes the pages would load, and sometimes the tab would just get stuck on the white page. Usually one or more loads of the same page get "stuck" before it finally does load, but there is no real pattern as to when this actually happens.

Also noteworthy: whenever this happens, opening the F12 developer tools appears ok, but closing the developer tools leaves the top half white and the bottom half black (with a dark theme on the developer tools; as if the developer tools were still there...at least the panel that hosts them).

I'll run mozregression next; but it seems a little tedious since it doesn't simply reproduce cleanly/consistently (or with short turnaround times).
Hm, it looks a little like the bisect isn't really conclusive (or rather, not too specific; because whatever it found seemed unlikely to me...):

Bisecting on mozilla-central [2018-04-30 - 2018-05-09]
Tested mozilla-central build: 2018-05-05 (verdict: b)
Tested mozilla-central build: 2018-05-03 (verdict: b)
Tested mozilla-central build: 2018-05-02 (verdict: b)
Tested mozilla-central build: 2018-05-01 (verdict: g)
Bisecting on mozilla-central [d2a4720d - 2d83e184]
Tested mozilla-central build: 176bba69 (verdict: b)
Tested mozilla-central build: d28c45eb (verdict: b)
Bisecting on autoland [14dc1b26 - d28c45eb]
Tested mozilla-central build: d858e46e (verdict: b)
Tested mozilla-central build: db8dd9bc (verdict: b)
Tested mozilla-central build: 2f76d0c8 (verdict: s)

And the remainder of the log (messages before that seem to indicate IPC with the tested build; so omitted):

2018-06-28T14:20:01: INFO : Narrowed inbound regression window from [14dc1b26, d858e46e] (7 builds) to [14dc1b26, db8dd9bc] (3 builds) (~1 steps left)
2018-06-28T14:20:01: INFO : Running autoland build built on 2018-05-01 11:46:50.904000, revision 2f76d0c8
2018-06-28T14:20:08: ERROR : Unable to start the application
2018-06-28T14:20:20: DEBUG : Starting merge handling...
2018-06-28T14:20:20: DEBUG : Using url: https://hg.mozilla.org/integration/autoland/json-pushes?changeset=db8dd9bc2e5cc5c61b7d329481b7b25eb5b10ef8&full=1
2018-06-28T14:20:21: DEBUG : Found commit message:
Bug 1457942 - Add 'en-CA' to Firefox Nightly builds r=delphine

MozReview-Commit-ID: LCt5lOYpaY9

2018-06-28T14:20:21: INFO : The bisection is done.
2018-06-28T14:20:21: INFO : Stopped

I'll stick to 2018-05-01 for now to see if that "good" decision was wrong; for the others it was pretty obvious about 45 minutes to one hour after start while that one didn't show any problems.

Anything else I could/should try?
Flags: needinfo?(bhaalsen)
Seems I still had a build running in the background; rerunning the bisect for the remaining range to see if it returns something useful.
Well, that didn't really last too long. Different error message ("Unable to exploit the merge commit."), yet again no really conclusive result (at least for me):

2018-06-28T15:26:18: INFO : Narrowed inbound regression window from [14dc1b26, db8dd9bc] (3 builds) to [14dc1b26, 2f76d0c8] (2 builds) (~1 steps left)
2018-06-28T15:26:18: DEBUG : Starting merge handling...
2018-06-28T15:26:18: DEBUG : Using url: https://hg.mozilla.org/integration/autoland/json-pushes?changeset=2f76d0c80038e8ff2065082a4d2dfaf0e8a261d7&full=1
2018-06-28T15:26:19: DEBUG : Found commit message:
Merge mozilla-central to autoland. a=merge on a CLOSED TREE

2018-06-28T15:26:19: DEBUG : This is a merge from mozilla-central
2018-06-28T15:26:19: DEBUG : Using url: https://hg.mozilla.org/mozilla-central/json-pushes?changeset=372d779ec72010064f004df03914022d714b99bf
2018-06-28T15:26:19: DEBUG : Using url: https://hg.mozilla.org/mozilla-central/json-pushes?fromchange=372d779ec72010064f004df03914022d714b99bf&tochange=d2a4720d1c334b64d88a51678758c27ba8f03c89
2018-06-28T15:26:19: DEBUG : Got exception
2018-06-28T15:26:32: INFO : Stopped
It appears the "good" mark for 2018-05-01 was bogus, I'll continue with the bisect between 2018-04-30 and 2018-05-01 (not sure how many changes there were).
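To illustrate why a single wrong "good" verdict derails a bisect like the one above, here is a minimal Python sketch of a plain binary search over an ordered list of builds. It is not mozregression's actual code; the build labels, `bisect_first_bad`, and both tester functions are hypothetical and exist only for this illustration. The "impatient" tester stands in for stopping a test run before the 45 to 60 minutes the problem needed to show up.

```python
from typing import Callable, Sequence


def bisect_first_bad(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary search for the first bad build.

    Assumes builds[0] is known good and builds[-1] is known bad.
    """
    lo, hi = 0, len(builds) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid
        else:
            lo = mid
    return builds[hi]


# Hypothetical build labels; in this toy setup the regression landed in "b3".
builds = ["b0", "b1", "b2", "b3", "b4", "b5", "b6", "b7", "b8"]
truly_bad = set(builds[3:])


def patient_tester(build: str) -> bool:
    # Waits long enough for the problem to manifest on every build.
    return build in truly_bad


def impatient_tester(build: str) -> bool:
    # Gives up too early on "b4" and wrongly reports it as good.
    if build == "b4":
        return False
    return build in truly_bad


print(bisect_first_bad(builds, patient_tester))    # finds the real culprit
print(bisect_first_bad(builds, impatient_tester))  # lands on a later, unrelated build
```

Because each verdict discards half of the remaining range, the one false "good" permanently excludes the real culprit, and the search can only end on some later build. That matches the implausible first result in this bug (a localization change).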
Thanks for continuing to bisect this issue!

I'll mark this P3 until we know more how this affects current builds.
Priority: -- → P3
No problem. I'm a developer myself, I know how valuable a good bisect can be; especially in a case like this one where it doesn't appear to be easily reproducible.
Not to mention the fact that I'd actually like to update to the most recent build.

However, I just reached 2018-04-30 (every build takes at least 40-45 minutes before the problem manifests) and it seems that even that one doesn't work.
I'm guessing there is a divergence between the released 59.0.3 (with date 2018-04-30) and the nightly 61.0a1 (with date 2018-04-30) that might be responsible.

I'll keep bisecting until I find a more useful result. I might have to go back until 2018-03-13 though, which is the vanilla 59.0.0 release date.
Just to let you know I'm still running the bisect; getting closer to a range that looks like it could be a possible culprit.

2018-07-03T16:50:29: INFO : Narrowed inbound regression window from [4b484657, 0476c309] (7 builds) to [4b484657, e9edf63c] (4 builds) (~2 steps left)
2018-07-03T16:50:29: INFO : Running autoland build built on 2018-02-28 20:13:34.164000, revision c88a04f0

What jumps out in that range (https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=4b48465720bb75ef569951454ecf3da86515544a&tochange=e9edf63c2aea997f44c6e22b26bdedb861687b02) are the commits related to issue https://bugzilla.mozilla.org/show_bug.cgi?id=1438974

I'll finish this bisect to verify this does indeed cause the problem; but this will probably take me until tomorrow (mostly because I'm going home now). I hope I can present you some conclusive result by then.
And the bisect is done. I was hoping the tool would try to pick individual commits from in there, but I guess there are no automatic individual builds for it, so the best I can offer at this point is this:

app_name: firefox
build_date: 2018-02-28 20:10:25.344000
build_file: C:\Users\bhaal\.mozilla\mozregression\persist\023a3a83d667--autoland--target.zip
build_type: inbound
build_url: https://queue.taskcluster.net/v1/task/ZJkOaKd5RzC_G1AcpgAiyA/runs/0/artifacts/public%2Fbuild%2Ftarget.zip
changeset: 023a3a83d667d1c10319def65340467cb64db086
pushlog_url: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=4b48465720bb75ef569951454ecf3da86515544a&tochange=023a3a83d667d1c10319def65340467cb64db086
repo_name: autoland
repo_url: https://hg.mozilla.org/integration/autoland
task_id: ZJkOaKd5RzC_G1AcpgAiyA

2018-07-04T08:28:30: INFO : Narrowed inbound regression window from [4b484657, e9edf63c] (4 builds) to [4b484657, c88a04f0] (3 builds) (~1 steps left)
2018-07-04T08:28:30: INFO : Running autoland build built on 2018-02-28 20:10:25.344000, revision 023a3a83
2018-07-04T08:28:38: INFO : Launching c:\Users\bhaal\AppData\Local\Temp\tmpfmlcpe\firefox\firefox.exe
2018-07-04T08:28:38: INFO : Application command: c:\Users\bhaal\AppData\Local\Temp\tmpfmlcpe\firefox\firefox.exe -profile c:\users\emanue~1.wla\appdata\local\temp\tmp8ol8in
2018-07-04T08:28:38: INFO : application_buildid: 20180228184333
2018-07-04T08:28:38: INFO : application_changeset: 023a3a83d667d1c10319def65340467cb64db086
2018-07-04T08:28:38: INFO : application_display_name: Firefox Nightly
2018-07-04T08:28:38: INFO : application_id: {ec8030f7-c20a-464f-9b0e-13a3a9e97384}
2018-07-04T08:28:38: INFO : application_name: Firefox
2018-07-04T08:28:38: INFO : application_remotingname: firefox
2018-07-04T08:28:38: INFO : application_repository: https://hg.mozilla.org/integration/autoland
2018-07-04T08:28:38: INFO : application_vendor: Mozilla
2018-07-04T08:28:38: INFO : application_version: 60.0a1
2018-07-04T08:28:38: INFO : platform_buildid: 20180228184333
2018-07-04T08:28:38: INFO : platform_changeset: 023a3a83d667d1c10319def65340467cb64db086
2018-07-04T08:28:38: INFO : platform_repository: https://hg.mozilla.org/integration/autoland
2018-07-04T08:28:38: INFO : platform_version: 60.0a1
2018-07-04T09:05:01: INFO : Unable to read VR Path Registry from C:\Users\bhaal\AppData\Local\openvr\openvrpaths.vrpath
2018-07-04T09:05:01: INFO : [Child 15500, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
2018-07-04T09:05:01: INFO : Unable to read VR Path Registry from C:\Users\bhaal\AppData\Local\openvr\openvrpaths.vrpath
2018-07-04T09:05:01: INFO : [Parent 7620, Gecko_IOThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
2018-07-04T09:05:01: INFO : [Parent 7620, Gecko_IOThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
2018-07-04T09:05:01: INFO : Unable to read VR Path Registry from C:\Users\bhaal\AppData\Local\openvr\openvrpaths.vrpath
2018-07-04T09:05:01: INFO : [Child 4148, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
2018-07-04T09:05:01: INFO : [Child 4148, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
2018-07-04T09:05:01: INFO : Unable to read VR Path Registry from C:\Users\bhaal\AppData\Local\openvr\openvrpaths.vrpath
2018-07-04T09:05:01: INFO : [Child 3236, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
2018-07-04T09:05:01: INFO : [Child 3236, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
2018-07-04T09:05:01: INFO : Unable to read VR Path Registry from C:\Users\bhaal\AppData\Local\openvr\openvrpaths.vrpath
2018-07-04T09:05:01: INFO : [Child 3560, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
2018-07-04T09:05:01: INFO : [Child 3560, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
2018-07-04T09:05:01: INFO : Unable to read VR Path Registry from C:\Users\bhaal\AppData\Local\openvr\openvrpaths.vrpath
2018-07-04T09:05:01: INFO : [Child 12356, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
2018-07-04T09:05:01: INFO : [Child 12356, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
2018-07-04T09:05:01: INFO : Unable to read VR Path Registry from C:\Users\bhaal\AppData\Local\openvr\openvrpaths.vrpath
2018-07-04T09:05:01: INFO : [GPU 16844, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
2018-07-04T09:05:01: INFO : 
2018-07-04T09:05:01: INFO : ###!!! [Child][MessageChannel::SendAndWait] Error: Channel error: cannot send/recv
2018-07-04T09:05:01: INFO : 
2018-07-04T09:05:04: INFO : Narrowed inbound regression window from [4b484657, c88a04f0] (3 builds) to [4b484657, 023a3a83] (2 builds) (~1 steps left)
2018-07-04T09:05:04: DEBUG : Starting merge handling...
2018-07-04T09:05:04: DEBUG : Using url: https://hg.mozilla.org/integration/autoland/json-pushes?changeset=023a3a83d667d1c10319def65340467cb64db086&full=1
2018-07-04T09:05:05: DEBUG : Found commit message:
Bug 1438974 - Dispatch to the appropriate event target. r=smaug

MozReview-Commit-ID: 6mCk1PjStND


2018-07-04T09:05:05: INFO : The bisection is done.
2018-07-04T09:05:05: INFO : Stopped


If there's anything else I can help with (such as testing whether new builds fix this), please let me know.
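When a bisect runs over several days, the "Narrowed inbound regression window" lines are the part of the log worth keeping. The following is a small Python sketch that extracts them; the regular expression is based only on the line format visible in the logs in this bug, and `narrowed_windows` is a hypothetical helper name, not part of mozregression.

```python
import re

# Matches lines such as:
#   ... INFO : Narrowed inbound regression window from [a, b] (3 builds)
#   to [a, c] (2 builds) (~1 steps left)
# The pattern is an assumption derived from the log excerpts in this bug.
WINDOW_RE = re.compile(
    r"Narrowed inbound regression window "
    r"from \[([0-9a-f]+), ([0-9a-f]+)\] \((\d+) builds\) "
    r"to \[([0-9a-f]+), ([0-9a-f]+)\] \((\d+) builds\)"
)


def narrowed_windows(log_text: str) -> list:
    """Return one dict per 'Narrowed ... window' line found in log_text."""
    windows = []
    for match in WINDOW_RE.finditer(log_text):
        windows.append(
            {
                "old_range": (match.group(1), match.group(2)),
                "old_builds": int(match.group(3)),
                "new_range": (match.group(4), match.group(5)),
                "new_builds": int(match.group(6)),
            }
        )
    return windows


# Sample line copied from the log in this comment.
sample = (
    "2018-07-04T09:05:04: INFO : Narrowed inbound regression window from "
    "[4b484657, c88a04f0] (3 builds) to [4b484657, 023a3a83] (2 builds) "
    "(~1 steps left)"
)

for window in narrowed_windows(sample):
    print(window["new_range"], window["new_builds"])
```

The last extracted `new_range` gives the two short changeset hashes that bound the final pushlog range.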
Blocks: 1438974
Has Regression Range: --- → yes
Hm, this is certainly a worrisome bug report. Thank you so much for helping track down the regression range!

It does seem plausible that the switch to async CSS parsing could somehow be related to your indefinite hangs, but it's hard to say exactly how. A few things I'd like to check:

(1) Can you confirm that you're running an entirely fresh profile? You mentioned this in comment 0, but the about:support in comment 4 shows various addons enabled. For investigation it would be really helpful to always test with clean profiles, since that eliminates a lot of variables.

(2) I'd like to figure out whether the content process is stuck in an infinite loop, or whether the load is just stalled. You mentioned in comment 1 that there was no discernable CPU usage - were you looking at the Firefox process, or the "Web Content" process? The Web Content process is the one I would expect to see pegged to 100% CPU, which would explain the inability to repaint the page after closing devtools. Are you able to easily switch to other tabs?

(3) It seems like there may be something specific to your system causing issues, since I'd imagine we would have heard more about this if it was happening to everyone. Are you in an office with other similar machines, and if so, does this happen on those machines as well? I also notice you have antivirus running - does anything change when you disable it?

(4) Assuming the browser UI is generally responsive, it would be useful to capture a profile when the hang occurs. You can install the profiler at https://perf-html.io/ , and then get a permalink to your profile with the share button. Use a Nightly build for this, since release builds don't have all the symbols.
Flags: needinfo?(bhaalsen)
1) I tried it once initially, but since it didn't seem to make a difference I kept using my profile for testing (since, after all, it took at least half an hour if not more for the problem to become visible, and I did have to use a browser to keep doing my work).
Will do, though.

2) I was looking at the process list, where only firefox.exe showed up; I don't see any "Web Content" processes here at the moment.

3) That was also kind of odd; I thought it would've hit the forums/tracker already when I checked back as 60.0 was released.
I'm in an office and we do have a few almost identical machines, but they either use Chrome/different browsers or don't see the issue (or at least haven't seen it happen often enough to notice, and shrugged it off as some server issue).
AV is simply the default one that comes with Windows, I'll see if it makes a difference (although, it doesn't affect 59.0.3 and below, so it seems odd if it did).

4) I'll give that a shot, but it'll take a while ;)
It looks like, just as with my initial bisect result, I also misjudged the results of a clean profile.

A clean profile seems to work just fine, so I tried disabling half the addons I have. And so far, with Personas Plus (ID: personas[at]christopher.beard) and Personas Rotator (ID: {6e73f6b7-b9ab-44b8-b744-6393e3c2e351}) disabled, everything seems fine as well.

I'll try and see if it stays that way with them disabled (mostly because I don't really use them anymore anyways), hoping it fixes my problem.
In that case, I'd also like to apologize for wasting your time with a bug report for the wrong product.
No worries! It would actually still be really helpful if you could narrow down what addon it was. If it's really causing hard-to-diagnose problems like this, we definitely want to warn the developers and get it fixed. So if you're able to determine which addon was causing the problem that's a great start, and if you're able to capture a profile that's even better.

Thanks for helping make Firefox better! :-)
Alright, it appears that Personas Rotator (ID: {6e73f6b7-b9ab-44b8-b744-6393e3c2e351}) is causing the issue. I vaguely remember that it was supposed to rotate themes every hour or so; and that is consistent with the rough 45 minute to 1 hour window I saw for the issue to manifest (ie. it seems to happen as soon as Rotator tries to apply a new theme).
Disabling the Addon immediately fixes the loading issue, both for new tabs and for already affected tabs that didn't load before.

I cannot really tell if the Addon works at all though, since it only shows an empty popup when selecting the icon on the toolbar; for Personas Plus it shows the list of installed themes (which Rotator was supposed to switch through). Back when Personas first came out, it required an active login on the Personas page; otherwise it wouldn't switch themes, so that might be related (since I'm pretty sure that I haven't logged in there in years).

Are reports for Addons also handled here on Bugzilla, or should I go report this elsewhere?
Flags: needinfo?(bhaalsen)
Component: Graphics → CSS Parsing and Computation
(In reply to bhaalsen from comment #17)
> Are reports for Addons also handled here on Bugzilla, or should I go report
> this elsewhere?

Looks like this is known, looking at https://addons.mozilla.org/en-US/firefox/addon/personas-rotator/reviews/. Given those reviews, it's likely that this addon is indeed the underlying issue.

Andreas, do you know if there's any procedure for this kind of add-on? Looks like this add-on broke more than a year ago.
Flags: needinfo?(awagner)
I recommend contacting the developer (again, maybe?). We can also restrict compatibility. Do we know which version started breaking which Firefox version?
Flags: needinfo?(awagner)
(In reply to Andreas Wagner [:TheOne] [use NI] from comment #19)
> I recommend contacting the developer (again, maybe?). We can also restrict
> compatibility. Do we know which version started breaking which Firefox
> version?

The specific (scary) breakage reported in this bug started in FF60 according to comment 12. The reviews of the add-on suggest that it may have been broken since before that, but I think the symptoms here are serious enough that we should mark it as incompatible with 60+. Can you make that happen?
Flags: needinfo?(awagner)
Alright, I set the compatibility for all versions of the add-on to Firefox 0-59.*

Also adding Baris, the developer of the add-on. Baris, please let us know when the issue is fixed so we can look into removing the compatibility override.
Flags: needinfo?(awagner) → needinfo?(barisderin)
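For readers unfamiliar with a range such as "0-59.*", the following Python sketch shows how it can be read: any 59.x release matches, and 60.0 and later do not. This is a simplified illustration, not the version comparator Firefox or addons.mozilla.org actually uses; it handles only dotted numeric versions with an optional `*` wildcard (so not pre-release strings such as 60.0a1), and `parse_version` and `in_compat_range` are hypothetical names.

```python
import math


def parse_version(version: str) -> list:
    """Split a dotted version into comparable parts; '*' sorts above any number."""
    parts = []
    for piece in version.split("."):
        parts.append(math.inf if piece == "*" else int(piece))
    return parts


def in_compat_range(version: str, minimum: str, maximum: str) -> bool:
    """Return True if minimum <= version <= maximum, part by part."""
    v, lo, hi = parse_version(version), parse_version(minimum), parse_version(maximum)
    width = max(len(v), len(lo), len(hi))

    def pad(parts: list) -> list:
        # Missing trailing parts count as 0, so "60" compares like "60.0.0".
        return parts + [0] * (width - len(parts))

    return pad(lo) <= pad(v) <= pad(hi)


# Release versions mentioned in this bug, checked against the 0-59.* override.
for release in ["59.0.3", "60.0", "60.0.2", "61.0"]:
    print(release, in_compat_range(release, "0", "59.*"))
```

Under this reading, the reporter's known-good 59.0.3 stays inside the range, while every affected release from 60.0 onward falls outside it.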
I've had the Addon loaded since...well...dunno, when did personas launch? 5-6 years ago? If it weren't for the blank page breakage reported here, I probably wouldn't have noticed that the Addon itself was broken.
(In reply to bhaalsen from comment #22)
> I've had the Addon loaded since...well...dunno, when did personas launch?
> 5-6 years ago? If it weren't for the blank page breakage reported here, I
> probably wouldn't have noticed that the Addon itself was broken.

Thanks for noticing, and taking the time to track it down! You've made the Firefox experience better for 2000 users (the number of current users of Personas Rotator). :-)
Thank you very much for bug reporting. I will trace it and will try to fix it as soon as possible.
Flags: needinfo?(barisderin)
(In reply to Baris Derin from comment #24)
> Thank you very much for bug reporting. I will trace it and will try to fix
> it as soon as possible.

Awesome, thanks!

The extension was never updated, but lightweight themes were discontinued a while ago. It doesn't appear the extension causes this kind of breakage anymore, since it shouldn't be able to load themes at all. Confirmed this in latest release.

Status: UNCONFIRMED → RESOLVED
Closed: 2 months ago
Resolution: --- → WORKSFORME