Closed Bug 1048639 Opened 10 years ago Closed 10 years ago

Homescreen does not paint (just see system wallpaper)

Categories

(Firefox OS Graveyard :: Gaia::Homescreen, defect)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

(blocking-b2g:2.0+, b2g-v2.0 fixed, b2g-v2.1 fixed)

RESOLVED FIXED
2.1 S3 (29aug)
blocking-b2g 2.0+
Tracking Status
b2g-v2.0 --- fixed
b2g-v2.1 --- fixed

People

(Reporter: ggrisco, Assigned: kgrandon)

Details

(Keywords: crash, Whiteboard: [caf-crash 263][caf priority: p1][b2g-crash][CR 686636][systemsfe])

Attachments

(8 files, 5 obsolete files)

1.50 MB, application/zip
4.35 MB, application/x-bzip
523.66 KB, image/png
862.21 KB, application/x-bzip
46 bytes, text/x-github-pull-request (tkundu: feedback+)
312.41 KB, application/zip
2.61 MB, application/x-bzip
46 bytes, text/x-github-pull-request (crdlc: review+)
[Blocking Requested - why for this release]:

This is a re-open of bug 1033618 which is still happening.
hi Kevin,

We are again hitting this issue after 48 hour stability testing. 
STR will be difficult. 

Observations:
1) |adb shell b2g-info| shows that homescreen process is not running. 
2) It's not a memleak.

Could you please provide a patch that logs more detail during homescreen launch, to help us understand why it is not relaunched after being killed?
Flags: needinfo?(kgrandon)
Whiteboard: [b2g-crash][CR 686636] → [caf priority: p1][b2g-crash][CR 686636]
Whiteboard: [caf priority: p1][b2g-crash][CR 686636] → [caf-crash 263][caf priority: p1][b2g-crash][CR 686636]
Keywords: crash
Observed on: 

Device: msm8610
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.043
Moz BuildID: 20140721000201
B2G Version: 2.0
Gecko Version: 32.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=8cb1a949f2e9650bb2c5598e78a6f24a58bbaf97
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=5f27d3ee3ccf01ac91a3efacb5e3e22ea62fd73c
Thanks for the info Tapas. It sounds like we should wait for a patch in bug 1038854 and check with that patch.
Flags: needinfo?(kgrandon)
Blocks a blocker.
blocking-b2g: 2.0? → 2.0+
Does this fall on your plate?
Flags: needinfo?(tlee)
According to comments 2 and 4, the patch in bug 1038854 only landed on 8/7. We need to retest with that patch.
Hi Greg, could you please help?
Flags: needinfo?(ggrisco)
I have no comment until we get the test results for bug 1038854. Please ni me again if bug 1038854 does not fix the issue.
Flags: needinfo?(tlee)
We have been running tests with the fixes in bug 1038854. We will update if it shows up again.
Flags: needinfo?(ggrisco)
(In reply to Inder from comment #9)
> We have been running tests with the fixes in bug 1038854. We will update if
> it shows up again.

The fix for bug 1038854 has already landed and has been built into AUs since AU51.
We've ruled out anything Nuwa or process launching related.  The homescreen is definitely running.  Still investigating why it doesn't paint.
Summary: Applications (e.g. homescreen) can fail to start if the preallocated process is in the process of dying → Homescreen does not paint (just see system wallpaper)
Attached file layerdump.tar.bz2 (obsolete) —
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #12)
> We've ruled out anything Nuwa or process launching related.  The homescreen
> is definitely running.  Still investigating why it doesn't paint.


As discussed, here is the layer dump for both the affected and unaffected devices.
Hi Kyle,

As discussed, here is the Gecko memory report from the unaffected device.
The homescreen is missing most of its DOM nodes, so this is a Gaia bug until proven otherwise.
Component: DOM: Content Processes → Gaia::Homescreen
Product: Core → Firefox OS
Hi Greg, Kyle,

Can you provide any of:

1 - Screenshot of the problem.
2 - Video of getting into this state.
3 - Dump of the dom nodes of the homescreen (according to comment 15).
Flags: needinfo?(khuey)
Flags: needinfo?(ggrisco)
4 - Also a logcat of the homescreen would be useful, didn't see one here.
Passing this to Kevin as he's looking into this issue.
Assignee: nobody → kgrandon
> 3 - Dump of the dom nodes of the homescreen (according to comment 15).

The about:memory reports have that (via the CC dumps).  I can show you how to read it tomorrow.

Tapas has #1 and #4.
Flags: needinfo?(khuey)
Flags: needinfo?(tkundu)
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #19)
> > 3 - Dump of the dom nodes of the homescreen (according to comment 15).
> 
> The about:memory reports have that (via the CC dumps).  I can show you how
> to read it tomorrow.
> 
> Tapas has #1 and #4.

Here is the link to download #1 and #4 from device which we debugged together for this issue.

https://drive.google.com/file/d/0B1cSMS8_GuAETF93SlpGVkM0OHc/edit?usp=sharing
Flags: needinfo?(tkundu)
Flags: needinfo?(ggrisco)
Attached image Screenshot of issue (obsolete) —
I see some lines in the logcat regarding dragdrop. I don't think this actually impacts this issue, but removing the line will help us debug it, and maybe even fix it. Bug 1051061 will fix this.
Depends on: 1051061
Tapas - we are going to try to get the patch from bug 1051061 reviewed and landed soon, but in the meantime would you want to try to test with this patch?

Also - the line numbers in the log do not match up. Could you provide us the homescreen zip file for us to debug this. The file should be located at: /system/b2g/webapps/verticalhome.gaiamobile.org/application.zip
Flags: needinfo?(tkundu)
Attached file verticalhome.gaiamobile.org.tar.bz2 (obsolete) —
(In reply to Kevin Grandon :kgrandon from comment #23)
> Tapas - we are going to try to get the patch from bug 1051061 reviewed and
> landed soon, but in the meantime would you want to try to test with this
> patch?
> 
Thanks. I asked our internal test team to test it.

> Also - the line numbers in the log do not match up. 

As we discussed in #gaia IRC, here are the exact gaia/gecko SHA1s that we picked up for the logs in Comment 20:

https://www.codeaurora.org/cgit/quic/lf/b2g/mozilla/gaia/commit/?h=mozilla/v2.0&id=8cc28fd31905a0ea2b2e15d13e80a0eab2feb1ba
https://www.codeaurora.org/cgit/quic/lf/b2g/mozilla/gecko/commit/?h=mozilla/v2.0&id=f7bd772b1e42774708a4ede13b149a1706a59b25

Please note that we also cherry-picked following patches on top of above gaia/gecko:

gaia patch: bug 1050751 attachment 8470312 [details] , bug 1050423
gecko patch: bug 1008791, bug 1030112, bug 1050423, bug 1028532, bug 1047149, bug 1044322, bug 1047645, bug 1049806 attachment 8469653 [details] [diff] [review]


> Could you provide us the
> homescreen zip file for us to debug this. 
> The file should be located at:
> /system/b2g/webapps/verticalhome.gaiamobile.org/application.zip

Yes. I attached it here.
Flags: needinfo?(tkundu)
Flags: needinfo?(kgrandon)
Thanks Tapas. These log lines do match up and I will see if there is additional information I can pull out of the logs.

Are you by chance able to provide a remote debug session to the device in this state with the app manager enabled for certified applications?

These are the device preferences necessary to access the app manager: https://github.com/mozilla-b2g/gaia/blob/master/build/preferences.js#L256

I am essentially looking for the HTML content of the body, as well as the content within the javascript object located at: app.grid.getItems().
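
For anyone collecting this from a remote debugging console, the request could be sketched roughly as below. This is only an illustration: `snapshotHomescreen` is a hypothetical helper, and the assumption that `app.grid.getItems()` returns a plain keyed object comes from the wording above rather than from the Gaia source. It records keys only, in case the items themselves are not safely serializable.

```javascript
// Hypothetical helper: summarize the homescreen state for a bug report.
// `doc` is a Document-like object and `app` is the homescreen app object.
// Assumes getItems() returns an object keyed by item identifier.
function snapshotHomescreen(doc, app) {
  const items = app.grid.getItems() || {};
  const keys = Object.keys(items);
  return {
    bodyHtml: doc.body.innerHTML, // markup currently in the page body
    itemCount: keys.length,       // how many grid items exist
    itemKeys: keys,               // keys only, not the item objects
  };
}

// Stand-in objects so the sketch runs outside the device.
const fakeDoc = { body: { innerHTML: '<div id="icons"></div>' } };
const fakeApp = { grid: { getItems: () => ({ a: {}, b: {} }) } };
console.log(snapshotHomescreen(fakeDoc, fakeApp));
```

On a device in the broken state, an empty or near-empty `bodyHtml` alongside a non-zero `itemCount` would suggest the items loaded but were never rendered.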
Flags: needinfo?(kgrandon) → needinfo?(tkundu)
(In reply to Kevin Grandon :kgrandon from comment #25)
> Thanks Tapas. These log lines do match up and I will see if there is
> additional information I can pull out of the logs.
> 
> Are you by chance able to provide a remote debug session to the device in
> this state with the app manager enabled for certified applications?
> 
> These are the device preferences necessary to access the app manager:
> https://github.com/mozilla-b2g/gaia/blob/master/build/preferences.js#L256
> 
> I am essentially looking for the HTML content of the body, as well as the
> content within the javascript object located at: app.grid.getItems().

Sorry, the device has been rebooted, but I can arrange another remote gdb session with the app manager enabled.
Thanks for the ideas.
(In reply to Kevin Grandon :kgrandon from comment #25)
> Thanks Tapas. These log lines do match up and I will see if there is
> additional information I can pull out of the logs.
> 
> Are you by chance able to provide a remote debug session to the device in
> this state with the app manager enabled for certified applications?
> 
> These are the device preferences necessary to access the app manager:
> https://github.com/mozilla-b2g/gaia/blob/master/build/preferences.js#L256
> 
> I am essentially looking for the HTML content of the body, as well as the
> content within the javascript object located at: app.grid.getItems().

Please note that analysis is always faster when we can work from logcat logs, so please give us a logging patch for this if possible.
Flags: needinfo?(tkundu) → needinfo?(kgrandon)
Tapas - here is a patch which will add a bunch of output to the vertical homescreen. It's not nearly as large as debugging IPC logging, but it should still provide significant logging. Please try this patch out if possible and send me the updated logs if you have any.

https://github.com/mozilla-b2g/gaia/pull/22805
Flags: needinfo?(kgrandon) → needinfo?(tkundu)
We got device ready with gdb access with this issue reproduced. Tapas is going to get in touch with Kevin for debug.
(In reply to Inder from comment #29)
> We got device ready with gdb access with this issue reproduced. Tapas is
> going to get in touch with Kevin for debug.
Sorry, nevermind -- wrong bug :(. We are still not able to reproduce this issue with additional logs.
Flags: in-moztrap?(ychung)
New test case needs to be added. There is no existing test case.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Test case added in moztrap:

https://moztrap.mozilla.org/manage/case/14321/
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Flags: in-moztrap?(ychung)
Flags: in-moztrap+
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage+][2.0-signoff-need+]
Whiteboard: [caf-crash 263][caf priority: p1][b2g-crash][CR 686636] → [caf-crash 263][caf priority: p1][b2g-crash][CR 686636][systemsfe]
Target Milestone: --- → 2.1 S2 (15aug)
Not reproducing anymore, closing. Will reopen if it comes back again.
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(tkundu)
Resolution: --- → WORKSFORME
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Attached file tapas_cr_686636.zip
It has reproduced again. The attachment contains the following logs:

1) Screenshot of the affected device after pressing the home key
2) Memory report from the affected device after pressing the home key
3) Kernel and logcat logs

We also have gdb access to the affected device. Please NI me if you want to debug it remotely using gdb.
NI :kgrandon here to get started on the investigation.
Flags: needinfo?(kgrandon)
Attached file about-memory-1.tar.bz2
Kyle,

I attached another memory report from the same device; it contains gc-cc logs for visualizing the DOM tree nodes.

I remember that the Gecko memory report and the layer tree dump (using the gdb trick) didn't help us last time, and you concluded that the homescreen does not have layers for the icons.

Please let me know how I can help you debug further.
In the latest screenshots there is zero chrome from the homescreen at all. Tracking as a system issue for now.
Component: Gaia::Homescreen → Gaia::System::Window Mgmt
Flags: needinfo?(kgrandon)
Attached image Screenshot of issue
Updating the screenshot based on latest findings. I don't think it's the same as the originally reported issues, so clearing out legacy attachments as well.
Attachment #8471088 - Attachment is obsolete: true
Attachment #8471208 - Attachment is obsolete: true
Attachment #8471222 - Attachment is obsolete: true
Attachment #8471720 - Attachment is obsolete: true
Attachment #8471838 - Attachment is obsolete: true
Attachment #8476084 - Attachment description: Screen shot of issue → Screenshot of issue
I agree with Kevin that this appears to be different since the search box is not painting.  The layer tree dump might prove useful after all.
Flags: needinfo?(khuey)
Although the DOM does look pretty much the same in both the system app and the homescreen compared to last time.
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #39)
> I agree with Kevin that this appears to be different since the search box is
> not painting.  The layer tree dump might prove useful after all.

Here is the layer tree dump in logcat after pressing the home key on the affected device.
Flags: needinfo?(khuey)
The layer tree dump indicates that we're scrolled down 56 pixels on the home screen, which is probably why we don't see the search box.  Other than that this appears to be identical to the last time we looked at this.
Component: Gaia::System::Window Mgmt → Gaia::Homescreen
Flags: needinfo?(khuey)
Just as an update: we have both Marionette and gdb access to the affected device. Please let us know if you want to try something on the device for debugging.
QA Whiteboard: [QAnalyst-Triage+][2.0-signoff-need+] → [QAnalyst-Triage+][lead-review+][2.0-signoff-need+]
(In reply to Tapas Kumar Kundu from comment #43)
> just as an update: We have both marionette and gdb access to affected
> device. Please let us know if you want to try something on device for
> debugging.

I checked with Kyle some time back and he had sufficient information to investigate. We agreed to do additional follow-up once he has some ideas from his debugging.
We noticed this in the logs which seems weird: 08-20 18:31:27.137 32331 32331 E GeckoConsole: [JavaScript Error: "TypeError: manifest.entry_points is undefined" {file: "app://verticalhome.gaiamobile.org/gaia_build_defer_index.js" line: 417}]

Tapas - this patch adds a small safeguard and some logging for when we can't find an entry point. Can you apply this patch and try to get into this scenario again? Thanks!
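
As an illustration of the kind of safeguard described above (not the contents of the attached patch), a lookup can check that `manifest.entry_points` exists before indexing into it and log when it does not. The function name `findEntryPoint` and the log message are hypothetical; only the `entry_points` field name is taken from the error message.

```javascript
// Hypothetical lookup: return the manifest section for an entry point,
// or null (after logging) when the manifest has no such entry point.
// Avoids "TypeError: manifest.entry_points is undefined".
function findEntryPoint(manifest, entryPoint, log = console.warn) {
  const entryPoints = manifest && manifest.entry_points;
  if (!entryPoints || !entryPoints[entryPoint]) {
    log('No entry point "' + entryPoint + '" in manifest: ' +
        (manifest && manifest.name));
    return null;
  }
  return entryPoints[entryPoint];
}

// A manifest with entry points resolves normally.
const withEntryPoints = {
  name: 'Communications',
  entry_points: { dialer: { name: 'Phone' } },
};
console.log(findEntryPoint(withEntryPoints, 'dialer'));

// A manifest without entry_points logs and returns null instead of throwing.
console.log(findEntryPoint({ name: 'Broken' }, 'dialer'));
```

The caller then has to handle `null`, for example by skipping that icon, which keeps one malformed record from aborting the rest of the homescreen setup.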
Attachment #8476335 - Flags: feedback?(tkundu)
I've added a second commit to the pull request with more logging.
(In reply to Kevin Grandon :kgrandon from comment #46)
> I've added a second commit to the pull request with more logging.

I am asking our internal team to reproduce with this logging. I will update here ASAP.
Attached file cr_686636.zip
Issue is reproduced on device around "08-21 13:55:00"
Flags: needinfo?(khuey)
Flags: needinfo?(kgrandon)
Kevin has this under control for now.
Flags: needinfo?(khuey)
This is a good hint: 08-21 13:55:07.076  9786  9786 E GeckoConsole: [JavaScript Error: "TypeError: cyclic object value" {file: "app://verticalhome.gaiamobile.org/gaia_build_defer_index.js" line: 330}]

Going to see if I can reproduce this on a phone, otherwise I will provide another patch which should flush this out a bit more.
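
For context, "cyclic object value" is the message Firefox reports when `JSON.stringify` is given a structure that references itself. Whether line 330 actually calls `JSON.stringify` is an assumption here. If it does, a replacer that tracks visited objects is one way to log such a value without throwing; `safeStringify` is a hypothetical name for this sketch.

```javascript
// Serialize a value for logging without throwing on cycles.
// Any object seen a second time (including a true cycle) is replaced
// with the marker string '[Circular]'.
function safeStringify(value) {
  const seen = new WeakSet();
  return JSON.stringify(value, (key, val) => {
    if (typeof val === 'object' && val !== null) {
      if (seen.has(val)) return '[Circular]';
      seen.add(val);
    }
    return val;
  });
}

// An object that refers to itself would make plain JSON.stringify throw.
const item = { name: 'icon' };
item.self = item;
console.log(safeStringify(item));
```

Note that this simple version also marks shared, non-cyclic references as `[Circular]`, which is usually acceptable for log output.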
Flags: needinfo?(kgrandon)
Tapas - I am investigating to see if I can reproduce the situation locally. In the meantime, I've refreshed the attached patch to change the logging. Would you mind applying it again and reporting the logs? Thank you.

https://github.com/mozilla-b2g/gaia/pull/23115
Flags: needinfo?(tkundu)
Attachment #8476335 - Flags: feedback?(tkundu) → feedback+
Flags: needinfo?(tkundu)
(In reply to Kevin Grandon :kgrandon from comment #51)
> Tapas - I am investigating to see if I can reproduce the situation locally.
> In the meantime, I've refreshed the attached patch to change the logging.
> Would you mind applying it again and reporting the logs? Thank you.
> 
> https://github.com/mozilla-b2g/gaia/pull/23115

Thanks for helping. I will give it a try.
Target Milestone: 2.1 S2 (15aug) → 2.1 S3 (29aug)
Is there any progress here? There hasn't been any update for five days.

Thanks
Flags: needinfo?(tkundu)
We thought this issue was fixed, as it had stopped showing up in stability runs.
But we have seen it again in the latest stability run today. I will update here as soon as I get logs from our test team.
(In reply to Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me) from comment #54)
> We thought that this issue is fixed as it was not coming stability runs.
> But we have seen this issue again in latest stability run today. I will
> update here as soon as I get logs from our test team .

Inder/Tapas, any logs on this yet ?
Flags: needinfo?(ikumar)
Attached file mozilla_logs.tar.bz2
(In reply to bhavana bajaj [:bajaj] from comment #55)
> (In reply to Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me) from
> comment #54)
> > We thought that this issue is fixed as it was not coming stability runs.
> > But we have seen this issue again in latest stability run today. I will
> > update here as soon as I get logs from our test team .
> 
> Inder/Tapas, any logs on this yet ?

We tried to reproduce it in a test build with the logging patch from Comment 52, but the issue did not reproduce in that build.

However, we are seeing the same issue in another test build.
Flags: needinfo?(tkundu)
Flags: needinfo?(kgrandon)
Flags: needinfo?(ikumar)
Hi Kevin,

Could you please take a look at the logs in Comment 56?
Attached file Github pull request
After inspecting the patch, it appears that the issue might not reproduce after applying it because we are catching the exception. It seems absolutely impossible to get into this case, so I'm not sure how it's happening, unless you have some test app that is doing weird things, or you are installing some malformed apps. If you are not seeing additional weirdness with the homescreen it seems that we can land this guard and it may fix it. Though it seems this may be an issue that the user will never see.

Cristian - Could you review this patch? Thanks!
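
The point above about catching the exception can be illustrated with a generic sketch. This is not the code in the pull request; `renderAll` and `renderItem` are hypothetical names. The idea is that when each item is processed inside its own `try`/`catch`, one malformed item is logged and skipped rather than stopping every item after it.

```javascript
// Hypothetical render loop: `renderItem` may throw for a malformed item.
// Failures are logged and skipped so the remaining items still render.
function renderAll(items, renderItem, log = console.error) {
  const rendered = [];
  for (const item of items) {
    try {
      rendered.push(renderItem(item));
    } catch (e) {
      log('Skipping item that failed to render: ' + e.message);
    }
  }
  return rendered;
}

// The null entry throws inside renderItem, but the other two still render.
const sampleItems = [{ name: 'a' }, null, { name: 'b' }];
console.log(renderAll(sampleItems, (it) => it.name.toUpperCase()));
```

Without the per-item guard, the exception from the second entry would propagate out of the loop, which is consistent with a homescreen that ends up with few or no icons painted.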
Attachment #8480998 - Flags: review?(crdlc)
Flags: needinfo?(kgrandon)
Comment on attachment 8480998 [details] [review]
Github pull request

LGTM, thanks
Attachment #8480998 - Flags: review?(crdlc) → review+
In master: https://github.com/mozilla-b2g/gaia/commit/4419091ae760328a52606920e335331a16fcb448
v2.0: https://github.com/mozilla-b2g/gaia/commit/aa7bcde4a751aaf65dd8c57bdfa6cbd75eabaa77

Tapas - please let us know if you run into this issue again or see something similar. Thanks!
Status: REOPENED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Tapas -- please monitor this fix in the stability runs. Keeping a ni on you.
Flags: needinfo?(tkundu)
(In reply to Inder from comment #61)
> Tapas -- please monitor this fix in the stability runs. Keeping a ni on you.

I will update here if this really fixes this bug :)
We have not seen this issue for a long time. It seems to be fixed!
Flags: needinfo?(tkundu)