blank and unresponsive screen since version 98 on Sony Xperia XZ1
Categories
(Core :: mozglue, defect, P1)
Tracking
()
Version | Tracking | Status |
---|---|---|
firefox-esr91 | --- | unaffected |
firefox99 | --- | wontfix |
firefox100 | --- | wontfix |
firefox101 | --- | verified |
firefox102 | --- | verified |
People
(Reporter: agi, Assigned: gsvelto)
References
(Regression)
Details
(Keywords: regression, Whiteboard: [geckoview:m102])
Attachments
(1 file)
48 bytes, text/x-phabricator-request | RyanVM: approval-mozilla-beta+
From github: https://github.com/mozilla-mobile/fenix/issues/24226.
Steps to reproduce
After the APK is updated to version 98 or 99 beta, opening the app shows a blank screen (no decoration, no URL bar, no buttons, nothing to interact with). Rolling back to 97.3.0 makes Firefox work again.
Expected behaviour
Firefox should show menus and web content
Actual behaviour
The app cannot be interacted with
Device name
Sony Xperia XZ1
Android version
Android 8
Firefox release type
Firefox
Firefox version
98.0.0
Device logs
Cannot pull logs from the application. I checked logcat while opening the app but could not find anything obvious (no tracebacks, at least). I can look for something more specific if anyone knows a keyword I should search for.
Additional information
No response
Change performed by the Move to Bugzilla add-on.
Reporter | ||
Comment 1•2 years ago
Some users report that the 98 upgrade causes the browser to be completely unresponsive on the Sony Xperia XZ1.
Comment 2•2 years ago
This bug is a regression between GV 97 and 98.
Can we ask the reporter to bisect the Fenix or GVE Nightly 98 builds using mozregression?
(In reply to Chris Peterson [:cpeterson] from comment #2)
This bug is a regression between GV 97 and 98.
Can we ask the reporter to bisect the Fenix or GVE Nightly 98 builds using mozregression?
Hi, I am not the original reporter, but I have the same issue. How can I run mozregression on an Android device?
Comment 4•2 years ago
AIUI you want to use mozregression -n gve to bisect the geckoview example app. It should try to use a connected Android device (you need to ensure that remote debugging is enabled so it can be accessed via adb). https://mozilla.github.io/mozregression/ has some general information. Note that I haven't actually done this myself, so it may be there are some more steps required to get things working.
Comment 5•2 years ago
Jamie: I'm going to speculatively guess this is somewhat likely to be a gfx issue, or at least you're in a good position to get a device and help find the regression range.
Comment 6•2 years ago
Thanks bmarne. I wrote some instructions on the github issue here. Please let me know if you need any help with that!
Comment 7•2 years ago
I've ordered a device. In the meantime, if we get a regression range soon we can decide whether to revert the commit, else we can move affected users to software webrender (assuming it's a driver/webrender bug)
Comment 8•2 years ago
From github, this seems to affect the XZ (Snapdragon 820, Adreno 530), as well as XZ1 and XZ1C (Snapdragon 835, Adreno 540). All users who have reported an Android version have reported 8.0. I'd assume all Sony devices with those chips on Android 8 are affected.
A user also gave this regression range: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=524ae1444004ec35dfcc348104060e91e1efce82&tochange=b34a32e1fc3e6fe4e0b7bcbb68cca4b797e9733d
I'm not sure what to think about that. Only bug 1752168 seems related to rendering, and it doesn't seem like it should cause this
Comment 9•2 years ago
(In reply to Jamie Nicol [:jnicol] from comment #8)
A user also gave this regression range: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=524ae1444004ec35dfcc348104060e91e1efce82&tochange=b34a32e1fc3e6fe4e0b7bcbb68cca4b797e9733d
Since the bug happens at app startup, maybe bug 1751041 (Compute the process startup timestamp early during startup) is related? Maybe these devices have a clock bug that is causing GeckoView to hang?
Comment 10•2 years ago
Could be. Hopefully my device arrives tomorrow or the day after and I can get a precise range. For now, there's nothing to indicate to me we should remotely blocklist webrender. We could perhaps make some test builds with potential causes backed out, and ask users to test them. Other than that we may just have to sit tight until it arrives.
Comment 11•2 years ago
Two users on github have confirmed that bug 1751041 was the regressing bug.
Gabriele, any ideas why that could have caused this on these devices? Do you have any suggestions for a fix or even just some logging that we can ask the users to test an APK with? And can we safely revert this patch for 100 if we need to?
Assignee | ||
Comment 12•2 years ago
The issue is odd but is not the first one that gets reported, see bug 1759541 too. The patch in bug 1751041 can be reverted safely but it will cause one of the hazard tests to fail, see bug 1678152 comment 39. I'll try and dig into the change itself to see if there's something wrong with it that can be addressed directly.
Assignee | ||
Comment 13•2 years ago
I gave this and bug 1759541 some thought and I think I know what might be happening: moving the timestamp so early during startup means we're running it in a static initializer. That function is not trivial: at the very least it calls getenv(), strcmp() and clock_gettime(), which all come from glibc (or bionic on Android). Both libraries might still be setting things up during static initialization, so those calls might not work properly that early. There's also the issue of sandboxing: I don't know how the content sandbox behaves at that point, and maybe I should have kept that in mind.
Either way, what might be happening is that content processes are crashing or getting stuck on startup. I'll write a patch later that reverts the change without breaking the GC hazards test. What bugs me, though, is that this will paper over the problem: the next time we move timestamp generation too early we might hit this again, and that might happen purely by chance if some code that calls TimeStamp::ProcessCreation() indirectly gets moved around.
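To illustrate the concern in comment 13, here is a minimal C++ sketch of the difference between computing a value in a static initializer and computing it lazily on first use. It is not the actual mozglue code: the names NowNs, ComputeProcessCreation, gComputeCalls and the environment variable EXAMPLE_RESTARTED are made up for this example, and the real TimeStamp::ProcessCreation() logic is more involved.

```cpp
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <time.h>  // clock_gettime (POSIX)

// Counts how often the computation runs (for illustration only).
static int gComputeCalls = 0;

// Hypothetical helper: reads the monotonic clock in nanoseconds.
static std::uint64_t NowNs() {
  timespec ts{};
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return static_cast<std::uint64_t>(ts.tv_sec) * 1000000000ULL +
         static_cast<std::uint64_t>(ts.tv_nsec);
}

// Hypothetical stand-in for the non-trivial work: like the function
// discussed in comment 13, it calls getenv(), strcmp() and clock_gettime().
static std::uint64_t ComputeProcessCreation() {
  ++gComputeCalls;
  const char* flag = std::getenv("EXAMPLE_RESTARTED");  // made-up variable
  const bool restarted = flag != nullptr && std::strcmp(flag, "1") == 0;
  (void)restarted;  // unused in this sketch
  return NowNs();
}

// Eager variant (the risky pattern): a namespace-scope object with a
// dynamic initializer runs before main(), when libc may not be fully set up.
// static const std::uint64_t sEagerCreation = ComputeProcessCreation();

// Lazy variant: a function-local static is initialized on the first call,
// and that initialization is thread-safe since C++11.
std::uint64_t ProcessCreation() {
  static const std::uint64_t sCreation = ComputeProcessCreation();
  return sCreation;
}
```

With the lazy variant, the libc calls only happen when some caller first asks for the value, and every later call returns the cached result. The trade-off is the one raised in comment 13: nothing stops a future caller from invoking ProcessCreation() from its own static initializer, which would bring the early call back.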
Assignee | ||
Comment 14•2 years ago
Reporter | ||
Comment 15•2 years ago
FWIW my Xperia XZ1 is supposedly being delivered on Apr 30th.
Comment 16•2 years ago
Hi all!
I've checked whether I can reproduce this issue on a Sony Xperia Z5 Premium (Android 7.1.1) device.
I've installed Beta 97.0.0-beta.6, browsed a little. I've then updated to Beta 98.0.0-beta.4. Everything works just fine: the tabs opened on Beta 97 were still displayed after update, no blank tabs/screens.
The same behavior on RC 97.1.0, updated to RC 98.1.0.
Comment 17•2 years ago
Moving this bug to the mozglue component because this is a mozglue bug, not a GeckoView bug.
Comment 18•2 years ago
Set release status flags based on info from the regressing bug 1751041
Comment 19•2 years ago
Gabriele, after this fix lands in Nightly, can you please uplift it to Beta 101?
Assignee | ||
Comment 20•2 years ago
(In reply to Chris Peterson [:cpeterson] from comment #19)
Gabriele, after this fix lands in Nightly, can you please uplift it to Beta 101?
Sure! I'm leaving the NI? so I don't forget.
Comment 21•2 years ago
Mike, can you please take a look at this patch? We'd like to get this fixed ASAP.
Comment 22•2 years ago
Pushed by gsvelto@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/1b5e8b70a1bf Compute the process creation timestamp lazily r=glandium
Comment 23•2 years ago
bugherder |
Assignee | ||
Comment 24•2 years ago
Comment on attachment 9274057 [details]
Bug 1766342 - Compute the process creation timestamp lazily r=glandium
Beta/Release Uplift Approval Request
- User impact if declined: Fenix is unusable on certain devices
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: Yes
- If yes, steps to reproduce: Launch a nightly build of Fenix on a Sony Xperia XZ1 device and ensure it works instead of just showing a blank screen.
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): This change is effectively a revert of bug 1751041. Before my change in bug 1751041 this code had worked for years, so moving back to the old behavior can't hurt.
- String changes made/needed: none
- Is Android affected?: Yes
Comment 25•2 years ago
Comment on attachment 9274057 [details]
Bug 1766342 - Compute the process creation timestamp lazily r=glandium
Approved for Fenix 101.0.0-beta.5.
Comment 26•2 years ago
bugherder uplift |
Comment 27•2 years ago
I tested the issue using a Sony Xperia Z5 (Android 7.0) by installing Firefox RC 97.3.0 and then updating to RC 98.3.0, but was unable to reproduce the issue. We also tested the issue using a Sony Xperia, model number SGP511 (Android 6.0.1), and we were not able to reproduce it there either. Unfortunately we do not have a device that is affected by this issue.
Comment 28•2 years ago
Multiple people in the upstream issue have verified that Nightly and Beta are working as expected now.