Closed
Bug 841041
Opened 11 years ago
Closed 11 years ago
[B2G][OTA] Ridiculously janky / unresponsive behavior after OTA update (making it hard to even unlock the phone)
Categories
(Firefox OS Graveyard :: General, defect)
Tracking
(blocking-b2g:tef+, firefox19 wontfix, firefox20 wontfix, firefox21 fixed, b2g18 fixed, b2g18-v1.0.0 wontfix, b2g18-v1.0.1 fixed)
People
(Reporter: nkot, Assigned: dhylands)
References
Details
(Keywords: smoketest)
Attachments
(5 files, 1 obsolete file)
Description:
Homescreen does not display after OTA update without having to restart the device.

Repro Steps:
1) Go to Settings => Device information => Software updates => Check now
2) Install new System Update 2013-02-13-071150
3) Wait for the homescreen to appear, or press the power button if the display went off

Expected:
* Locked homescreen displays; user is able to unlock and use the device

Actual:
* Homescreen appears black (see screenshot)
* Homescreen displays successfully after restart

Repro frequency: 100% (3/3 devices)
* Screenshot attached
* Log file attached

Notes:
Updated from:
Gecko: 70c8f2cf813626e8c7b0f89676e1a62fe4ddfcae
Gaia: ecca2ee860825547d5e1109436b50b74dfe9261e
Build ID: 20130212070205
Reporter
Comment 1•11 years ago
Comment 2•11 years ago
That sounds bad. Can you consistently reproduce?
blocking-b2g: --- → tef?
Component: Gaia → Gaia::Homescreen
Comment 3•11 years ago
Something most definitely blew up here. Weird logcat errors:

02-13 08:46:05.212: I/Gecko(108): ###!!! [Parent][AsyncChannel] Error: Channel error: cannot send/recv
02-13 08:46:11.508: E/GeckoConsole(20200): [JavaScript Error: "formatURLPref: Couldn't get pref: app.update.url.details" {file: "jar:file:///system/b2g/omni.ja!/components/nsURLFormatter.js" line: 126}]
Updated•11 years ago
blocking-b2g: tef? → ---
Comment 5•11 years ago
I ran into this today. :overholt did too.
Comment 6•11 years ago
Actually maybe I ran into a related but different issue. In my case the screen stayed entirely blank, then I got the boot image for quite a while, then back to entirely blank.
Comment 7•11 years ago
I ran into this yesterday and today. I have to pull the battery to get anything other than the "hardware" buttons to light up.
blocking-b2g: --- → tef?
Component: Gaia::Homescreen → General
Comment 9•11 years ago
(In reply to Andrew Overholt [:overholt] from comment #8)
> Gah, should probably be shira?

Why shira? Isn't this going to impact tef?
Comment 10•11 years ago
I've applied the update locally, and I also see "app.update.url.details" errors, but I'm able to get into the phone, albeit _very_ slowly. When the screen is off, pressing the power button to turn it back on takes roughly 15-20 seconds on my device, and then the screen almost immediately goes back off. If you tap the screen some while after you press the power button, you can avoid the timeout, but then you have to patiently drag the lock drawer up and press the unlock button while the device remains extremely unresponsive.

Doing a quick top, I noticed that the b2g process is eating 97-98% of the CPU!

 2577  0  98% S    36 163016K  53780K  fg root     /system/b2g/b2g

Looking into why that might be...
Flags: needinfo?(marshall)
Comment 11•11 years ago
(In reply to Jason Smith [:jsmith] from comment #9)
> (In reply to Andrew Overholt [:overholt] from comment #8)
> > Gah, should probably be shira?
>
> Why shira? Isn't this going to impact tef?

Agreed
Assignee: nobody → marshall
blocking-b2g: shira? → tef?
Comment 12•11 years ago
(See also bug 841517, which might be a dupe of this bug.)
Comment 13•11 years ago
So this has basically bricked the device for me. Restarting doesn't help, I just get a black screen. I can barely get the lockscreen to display.
Assignee
Comment 14•11 years ago
What does: top -t -m 5 show? (from an adb shell)
Comment 15•11 years ago
adb doesn't even see the device, for me at least. (adb shell says "error: device not found") (I have udev rules set up, and I ran adb w/ sudo for good measure - so I'm not getting blocked from seeing the device due to lack-of-privileges)
Comment 16•11 years ago
It showed /system/b2g/b2g at 96%. I ended up killing that process, and after it restarted the phone was responsive again. I'm not sure why yanking the battery out previously didn't fix it, unless I just gave it more time to finish whatever it was trying to do.
Comment 17•11 years ago
(In reply to Daniel Holbert [:dholbert] from comment #15)
> adb doesn't even see the device, for me at least. (adb shell says "error:
> device not found")
>
> (I have udev rules set up, and I ran adb w/ sudo for good measure - so I'm
> not getting blocked from seeing the device due to lack-of-privileges)

I ran into this at first. I think it was due to the device being so behind/bogged down that the daemon wasn't running (yet).
Assignee
Comment 18•11 years ago
(In reply to Lucas Adamski from comment #16)
> It showed /system/b2g/b2g at 96%.

I was hoping to see the whole line, so we could tell which thread was consuming the CPU (top -t shows individual threads).
Assignee
Comment 19•11 years ago
(In reply to Daniel Holbert [:dholbert] from comment #15)
> adb doesn't even see the device, for me at least. (adb shell says "error:
> device not found")
>
> (I have udev rules set up, and I ran adb w/ sudo for good measure - so I'm
> not getting blocked from seeing the device due to lack-of-privileges)

adb might be disabled. It is disabled by default for dogfooding. You can enable it via:
Settings->Device Information->More Information->Developer->Remote Debugging
Comment 20•11 years ago
(In reply to Dave Hylands [:dhylands] from comment #18)
> I was hoping to see the whole line, so we could tell which thread was
> consuming the CPU (top -t shows individual threads).

Sorry, I'd closed the window by the time I saw your question. :(
Comment 21•11 years ago
Spoke too soon. Checked for updates, applied it, now stuck again:

User 83%, System 16%, IOW 0%, IRQ 0%
User 272 + Nice 0 + Sys 54 + Idle 0 + IOW 0 + IRQ 0 + SIRQ 0 = 326

  PID   TID PR CPU% S     VSS     RSS PCY UID      Thread          Proc
  526   550  0  24% R 157880K  52444K  fg root     DOM Worker      /system/b2g
  526   549  0  24% R 157880K  52444K  fg root     DOM Worker      /system/b2g
  526   555  0  24% R 157880K  52444K  fg root     DOM Worker      /system/b2g
  526   546  0  24% R 157880K  52444K  fg root     DOM Worker      /system/b2g
  617   617  0   1% R   1088K    444K  fg root     top             top
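For anyone collecting the same data, here is a minimal Python sketch that adds up CPU% per thread name from `top -t` output like the sample above. The helper name `cpu_by_thread` is made up for illustration, and the parsing assumes the column layout shown in this comment (nine fixed columns, then the thread name, then the process name); other `top` builds may lay the columns out differently.

```python
from collections import defaultdict

# Sample copied from the comment above (Android `top -t` output).
SAMPLE = """\
User 83%, System 16%, IOW 0%, IRQ 0%
User 272 + Nice 0 + Sys 54 + Idle 0 + IOW 0 + IRQ 0 + SIRQ 0 = 326
  PID   TID PR CPU% S     VSS     RSS PCY UID      Thread          Proc
  526   550  0  24% R 157880K  52444K  fg root     DOM Worker      /system/b2g
  526   549  0  24% R 157880K  52444K  fg root     DOM Worker      /system/b2g
  526   555  0  24% R 157880K  52444K  fg root     DOM Worker      /system/b2g
  526   546  0  24% R 157880K  52444K  fg root     DOM Worker      /system/b2g
  617   617  0   1% R   1088K    444K  fg root     top             top
"""


def cpu_by_thread(top_output):
    """Sum the CPU% column per thread name (hypothetical helper)."""
    totals = defaultdict(int)
    for line in top_output.splitlines():
        parts = line.split()
        # Thread rows start with a numeric PID and TID; this skips the
        # summary lines and the column header.
        if len(parts) < 11 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        cpu = int(parts[3].rstrip("%"))
        # Thread names may contain spaces ("DOM Worker"), so take everything
        # between the nine fixed columns and the final process column.
        thread = " ".join(parts[9:-1])
        totals[thread] += cpu
    return dict(totals)


if __name__ == "__main__":
    print(cpu_by_thread(SAMPLE))  # {'DOM Worker': 96, 'top': 1}
```

On this sample the four DOM Worker threads together account for 96%, which matches the per-process figure reported earlier in the bug.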
Assignee
Comment 22•11 years ago
So yeah dholbert is seeing the same thing. 4 DOM Workers consuming most of the CPU. And they're all in the main process. I don't know what the DOM Workers do.
Assignee
Comment 23•11 years ago
I was able to reproduce by flashing my unagi with this image:
https://pvtbuilds.mozilla.org/pub/mozilla.org/b2g/nightly/mozilla-b2g18-unagi/2013/02/2013-02-12-07-02-05/
and then performing an OTA update, which updated to:
http://update.boot2gecko.org/nightly/b2g_update_20130214070203.mar?build_id=20130214070203&version=18.0
Assignee
Comment 24•11 years ago
I tried flashing
https://pvtbuilds.mozilla.org/pub/mozilla.org/b2g/nightly/mozilla-b2g18-unagi/2013/02/2013-02-12-07-02-05/
and then OTA updating to my locally built version, and it didn't reproduce (my locally built version is a v1-train build).

I reflashed and OTA updated as in comment 23 and it reproduced (so not a one-off).
Assignee
Comment 25•11 years ago
I was able to get into gdb and get some back traces. I'm not sure of the validity of the symbols since the image that was being wasn't from the tree I was in.
Comment 26•11 years ago
[clarifying summary]
Summary: [B2G][OTA] Homescreen fails to display after OTA update → [B2G][OTA] Ridiculously janky / unresponsive behavior after OTA update (making it hard to even unlock the phone)
(In reply to Dave Hylands [:dhylands] from comment #25)
> I'm not sure of the validity of the symbols since the image that was being
> wasn't from the tree I was in.

Yeah... The traces don't look right to me.
Comment 29•11 years ago
Do you guys need more information to debug this issue? I just reproduced these symptoms myself when updating to:

Gecko http://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/d1288313218e
Gaia 6544fdb8dddc56f1aefe94482402488c89eeec49
BuildID 20130214070203
Version 18.0
Comment 30•11 years ago
For what it's worth, if I pull the battery and reboot, I can recover the device back into a usable state. Just fodder for triage drivers to consider.
Updated•11 years ago
blocking-b2g: tef? → tef+
Assignee
Comment 31•11 years ago
I was able to reproduce using a local build.

STR:
1 - Modify gecko/toolkit/content/UpdateChannel.sh near the end to override the channel:
      channel = "foobar";
      return channel;
2 - Build.
3 - Create an update:
      ./build.sh gecko-update-full
4 - Set up the phone to use the update:
      tools/update-tools/test-update.py ${GECKO_OBJDIR}/dist/b2g-update/b2g-gecko-update.mar
5 - Do a Check Now and then do the update.

This message:
  AUS:SVC UpdateManager:get activeUpdate - channel has changed, reloading default preferences to workaround bug 802022
from here:
  https://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/update/nsUpdateService.js#2795
seems to be the key. I'm hypothesising that the reload-default-prefs is the actual trigger.
Attachment #714262 -
Attachment is obsolete: true
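To check whether a given device hit the same code path, one option is to scan a saved log for the AUS message quoted in comment 31. The Python sketch below does only that. `find_channel_change` is a hypothetical helper, and it assumes the log has already been captured to text and that the capture includes the AUS messages, which this bug does not confirm for every build.

```python
# Marker text taken from the AUS log message quoted in comment 31.
CHANNEL_CHANGED_MARKER = "channel has changed, reloading default preferences"


def find_channel_change(log_lines):
    """Return (line_number, line) pairs for lines containing the marker.

    Hypothetical helper: it only does a substring match and makes no
    assumption about the timestamp or tag prefix on each line.
    """
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(log_lines, start=1)
        if CHANNEL_CHANGED_MARKER in line
    ]


if __name__ == "__main__":
    # Two messages taken from this bug, without any logcat prefix.
    sample = [
        "###!!! [Parent][AsyncChannel] Error: Channel error: cannot send/recv",
        "AUS:SVC UpdateManager:get activeUpdate - channel has changed, "
        "reloading default preferences to workaround bug 802022",
    ]
    for number, line in find_channel_change(sample):
        print(number, line)
```

A match only shows that the channel-change branch ran; it does not by itself prove that the prefs reload is what triggers the CPU load.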
Assignee
Comment 32•11 years ago
I let the process run for a bit longer and grabbed another set of backtraces
Assignee
Comment 33•11 years ago
I picked one of the DOM Workers and just hit n over and over in gdb. It wound up doing these 4 lines continuously:

3428	#endif
(gdb)
3408	    WorkerRunnable* event;
(gdb)
3410	      MutexAutoLock lock(mMutex);
(gdb)
3412	      while (!mControlQueue.Pop(event) && !syncQueue->mQueue.Pop(event)) {
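The lines stepped through above are the top of a loop that tries to pop an event from two queues. As a generic illustration only (this is not the Gecko code and not a diagnosis of this bug), the Python sketch below contrasts a loop that keeps re-checking an empty queue with one that sleeps on a condition variable until an item is pushed. `EventQueue`, `pop_polling` and `pop_blocking` are hypothetical names.

```python
import threading
from collections import deque


class EventQueue:
    """Tiny queue used only to contrast polling with blocking."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()

    def push(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify()

    def pop_polling(self, max_checks):
        """Re-check the queue in a tight loop; give up after max_checks."""
        checks = 0
        while checks < max_checks:
            checks += 1
            with self._cond:
                if self._items:
                    return self._items.popleft(), checks
            # No wait here: the loop goes straight back to checking.
        return None, checks

    def pop_blocking(self):
        """Sleep on the condition variable until push() signals an item."""
        wakeups = 0
        with self._cond:
            while not self._items:
                self._cond.wait()  # releases the lock and sleeps
                wakeups += 1
            return self._items.popleft(), wakeups


def demo():
    # Polling an empty queue does work on every iteration.
    empty = EventQueue()
    _, checks = empty.pop_polling(max_checks=100_000)

    # Blocking wakes up only when the producer pushes an item.
    queue = EventQueue()
    producer = threading.Timer(0.05, queue.push, args=("event",))
    producer.start()
    item, wakeups = queue.pop_blocking()
    producer.join()
    return checks, item, wakeups


if __name__ == "__main__":
    print(demo())
```

A thread stuck in the polling variant stays runnable the whole time, which is the kind of behaviour that would show up as a thread in state R in `top -t`. Whether that is what happens in the worker code here is left open by this comment.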
Assignee
Comment 34•11 years ago
(In reply to Tony Chung [:tchung] from comment #29)
> Do you guys need more information to debug this issue?

I can reproduce at will now, so now it's mostly just trying to figure out what's going on.
Comment 35•11 years ago
Assignee: marshall → anygregor
Attachment #714586 -
Flags: review?(bent.mozilla)
Updated•11 years ago
Attachment #714586 -
Flags: review?(bent.mozilla) → review+
Comment 36•11 years ago
thx jdm!
https://hg.mozilla.org/integration/mozilla-inbound/rev/424de8168602

dhylands mentions that the bug is not completely gone.
Assignee: anygregor → dhylands
Whiteboard: leave-open
Assignee
Comment 37•11 years ago
I filed bug 841962 to follow up on this problem and removed the leave-open on this bug. At least with the patch applied in this bug, the phone now just takes a lot longer to boot up after an update-channel change, but it seems to perform OK.
Whiteboard: leave-open
Comment 38•11 years ago
https://hg.mozilla.org/mozilla-central/rev/424de8168602
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment 39•11 years ago
https://hg.mozilla.org/releases/mozilla-b2g18/rev/63211dc2a63e
https://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/af83e7e7f52a
status-b2g18: --- → fixed
status-b2g18-v1.0.0: --- → wontfix
status-b2g18-v1.0.1: --- → fixed
status-firefox19: --- → wontfix
status-firefox20: --- → wontfix
status-firefox21: --- → fixed
Target Milestone: --- → B2G C4 (2jan on)
Reporter
Comment 41•11 years ago
Verifying fix on V1-train branch: OTA goes smoothly. Tested with the following:

1. Manually flashed to Unagi build 2013-03-20-070206
   Gecko http://hg.mozilla.org/releases/mozilla-b2g18/rev/778da49486f0
   Gaia 6c3767c2dea43b5e9aff7d156d36d69649005621
2. revertNightly
3. OTA to Build 2013-03-21-070203
   Gecko http://hg.mozilla.org/releases/mozilla-b2g18/rev/7508c5a1026b
   Gaia 7af427d35c4d557c75b2060022815f07851acc28

The issue still seems to occur when doing an OTA from V.1.0.1 builds, but there are other bugs to cover that; please refer to bugs 847511 and 842932 (note that switching update channels takes place in that case).
Status: RESOLVED → VERIFIED