Closed Bug 1041510 Opened 6 years ago Closed 6 years ago

[v2.1] Updating a contact causes the phone to restart

Categories

(Core :: Layout, defect, critical)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla34
blocking-b2g 2.1+
Tracking Status
b2g-v2.1 --- verified

People

(Reporter: viorela, Assigned: tnikkel)

References

Details

(Keywords: regression, smoketest, Whiteboard: [fromAutomation][xfail][regression])

Attachments

(7 files, 1 obsolete file)

Attached file logcat.txt
This issue is reproducible on the latest master build. After updating a contact, the phone restarts.
I was able to reproduce this both manually and with automation.
The issue is not reproducible on v2.0.

Prerequisites:
Make sure you have one contact saved on your phone

#STR:
1. Launch Contacts app
2. Tap on the existing contact
3. Tap on edit contact button 
4. Modify the name of the contact
5. Tap Update button 

#Expected results:
Contact's details page is returned, with the name updated

#Actual results:
The phone is restarted


Build info:
Device: Flame
Gecko     https://hg.mozilla.org/mozilla-central/rev/42c6a5418370
BuildID   20140721040210
Version   33.0a1
ro.build.version.incremental=108
ro.build.date=Tue Jun 10 19:40:40 CST 2014

Link to Jenkins report: http://jenkins1.qa.scl3.mozilla.com/job/flame.mozilla-central.ui.functional.smoke/11/console
Logcat attached
Severity: normal → critical
Whiteboard: [fromAutomation][xfail][regression]
blocking-b2g: --- → 2.1?
The phone also restarts after tapping the call log button to see all the calls made.

#STR: 
1. Launch Dialer app
2. Type a phone number, then press Call button
3. After the call is launched, press Hang up button
4. Tap Call log button (the one at the bottom left)

#Expected results:
The call log list is displayed

#Actual results:
The phone is restarted
From the inbound builds we can't get the usual build info.

But comparing the two builds' sources XML, it looks like the only diff is the Gecko commit.

Working with:
Mercurial-Information: <project name="https://hg.mozilla.org/integration/b2g-inbound" path="gecko" remote="hgmozillaorg" revision="7a809cadbf0d"/>

Failing with:
Mercurial-Information: <project name="https://hg.mozilla.org/integration/b2g-inbound" path="gecko" remote="hgmozillaorg" revision="4bafe35cfb65"/>


Gecko diff:
http://hg.mozilla.org/integration/b2g-inbound/pushloghtml?fromchange=7a809cadbf0d&tochange=4bafe35cfb65
Managed to get better build info:

last good:
#####    Unzip application.zip error.
Gecko     https://hg.mozilla.org/integration/b2g-inbound/rev/7a809cadbf0d
BuildID   20140720154716
Version   33.0a1
ro.build.version.incremental=109
ro.build.date=Mon Jun 16 16:51:29 CST 2014
B1TC00011220



First bad:
#####    Unzip application.zip error.
Gecko     https://hg.mozilla.org/integration/b2g-inbound/rev/4bafe35cfb65
BuildID   20140720181517
Version   33.0a1
ro.build.version.incremental=109
ro.build.date=Mon Jun 16 16:51:29 CST 2014
B1TC00011220
QA Contact: ddixon
I am also seeing this reboot in the following areas:

1. Long tapping a Homescreen icon
2. Move icons on the Homescreen.
3. The phone restarts when selecting 'Done.'

1. Edit an existing Calendar Event.
2. Set the start and end times to be the same
3. Trying to save will throw a pop-up message, and then the phone will restart.

1. Open an email.
2. Attempt to reply to the email, and the phone restarts.
MOZILLA INBOUND Regression Window: 

Last Working

Device: Flame Master
Build ID: 20140719062420
Gaia: Unknown
Gecko: 9350909a3401
Version: 33.0a1 (Master)
Firmware Version: v122
User Agent: Mozilla/5.0 (Mobile; rv:33.0) Gecko/33.0 Firefox/33.0

First Broken 

Device: Flame Master
Build ID: 20140719065518
Gaia: Unknown
Gecko: 24a69de91baa
Version: 33.0a1 (Master)
Firmware Version: v122
User Agent: Mozilla/5.0 (Mobile; rv:33.0) Gecko/33.0 Firefox/33.0

Last Working Gaia and First Broken Gecko
Issue DOES occur here. 
Gaia: Unknown
Gecko: 24a69de91baa

Last Working Gecko and First Broken Gaia
Issue DOES NOT occur here. 
Gaia: Unknown
Gecko: 9350909a3401

Gecko Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=9350909a3401&tochange=24a69de91baa

Possible Cause: 

Bug 1022612 - Remove ComputeVisibility pass when painting
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(jmitchell)
Broken by bug 1022612?
Blocks: 1022612
Flags: needinfo?(jmitchell) → needinfo?(roc)
I just went through all known instances of this restart issue. None of them occur on the Last Working build and all of them occur on the First Broken build from the Regression Window outlined in Comment 5.
Component: Gaia::Contacts → Layout
Product: Firefox OS → Core
I'm not seeing this in a build from a few hours ago.  (I tried the steps from comment 0 and the first set from comment 4.)

My build is based on https://hg.mozilla.org/mozilla-central/rev/0dc711216018 .
I've been able to reproduce this failure in the latest master build:

Gaia      e423c3be8d19c9a8a5ae2571f499c36dc6b0df89
Gecko     https://hg.mozilla.org/mozilla-central/rev/6f702709fab6
BuildID   20140722040212
Version   34.0a1
ro.build.version.incremental=108
ro.build.date=Tue Jun 10 19:40:40 CST 2014

I also found some other tests that failed with the same error (the phone restarted). STR below:
Case 1:
1. Long press on a Homescreen icon for ~2 sec => the phone restarts

Case 2:
Prerequisites:
Have an MP3 file saved on your phone
1. Launch Music app
2. Tap on any of the bottom buttons (Artists, Albums, Songs) => the phone restarts
The same thing happens with a 3gp file.

Case 3:
1. Launch Contacts app
2. Tap on settings button
3. Tap Import contacts button
4. On Import contacts page tap on Gmail
5. Enter your gmail credentials, then tap Sign in 
6. Tap Grant access, then wait for contacts list to be displayed
7. Tap Select all, then tap Import and wait for contacts to be imported 
8. Tap back => the phone restarts

The total number of automated tests that are affected by this bug is 10.
This is affecting loads of functionality, we should have backed out something by now.

Can someone test the patch cited in comment 6?
Duplicate of this bug: 1041875
I updated to today's build. The phone crashed in the Music app when I tapped the buttons while the database was populating. It has not crashed since when I go back in and tap the buttons.

Gaia      e423c3be8d19c9a8a5ae2571f499c36dc6b0df89
Gecko     https://hg.mozilla.org/mozilla-central/rev/6f702709fab6
BuildID   20140722040212
Version   34.0a1
ro.build.version.incremental=110
ro.build.date=Fri Jun 27 15:57:58 CST 2014
B1TC00011230
The crash when long tapping a homescreen icon is pretty bad and still occurs. I didn't see any crash report UI.
Does anybody have a crash report or stack of the crash?
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #14)
> Does anybody have a crash report or stack of the crash?

Actually, I raised a bug that is a dupe of this, and it contains the logcat when it happened:
https://bugzilla.mozilla.org/show_bug.cgi?id=1041875
This bug also has a logcat attachment. Unfortunately logcat doesn't contain crash stacks.
I can reproduce this locally. The underlying problem is the same as that for bug 1041608 - the layer tree has two ContainerLayer instances with the same scrollId. In this case we end up trying to use the same APZC instance for both of them and that ends up creating an APZC tree with an infinite cycle. This causes hangs/crashes when we try to walk it or enumerate the APZCs.
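
A minimal sketch of how this kind of cycle can arise, assuming the two layers are nested and the tree builder keeps one node per scrollId. Node, Layer, Build and ReachesRoot are hypothetical stand-ins for illustration, not Gecko's actual classes or functions.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for an APZC-like tree node (not Gecko's real class).
struct Node {
  Node* parent = nullptr;
  Node* lastChild = nullptr;
  Node* prevSibling = nullptr;
};

// Hypothetical stand-in for a scrollable container layer.
struct Layer {
  uint64_t scrollId;
  std::vector<Layer*> children;
};

// Build a node tree that mirrors the layer tree, reusing one node per
// scrollId. This reuse is the assumption that makes duplicate ids dangerous.
void Build(Layer* layer, Node* parent, std::map<uint64_t, Node>& nodes) {
  Node* node = &nodes[layer->scrollId];
  if (parent) {
    node->parent = parent;
    node->prevSibling = parent->lastChild;
    parent->lastChild = node;
  }
  for (Layer* child : layer->children) {
    Build(child, node, nodes);
  }
}

// Walk up the parent chain, giving up after maxSteps. Returns false when no
// root is found within the limit, which is what happens if the chain loops.
bool ReachesRoot(const Node* node, int maxSteps) {
  for (int i = 0; i < maxSteps; ++i) {
    if (!node->parent) {
      return true;
    }
    node = node->parent;
  }
  return false;
}
```

With a parent and child layer that share a scrollId, Build hands back the same node for both, so the node becomes its own parent and an unbounded walk would never terminate. With distinct ids the walk reaches the root normally.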
Duplicate of this bug: 1042206
Assignee: nobody → tnikkel
Attachment #8460434 - Flags: review?(roc)
Flags: needinfo?(roc)
See bug 1041608, comment 12 to comment 16 for info about these patches, where they were originally posted.
https://hg.mozilla.org/mozilla-central/rev/83e9ee5700bb
https://hg.mozilla.org/mozilla-central/rev/b102fe6b3565
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
I've tried most of the scenarios in this ticket with a build containing the fix, and can confirm that it's fixing the issue.

However, I experienced the same issue (phone restarting, in the same way it was, for example, when moving an icon around), when launching the "loqui im" application.

STR:
- install "loqui im" from the marketplace
- try launching it from the homescreen (launching it from the marketplace won't work, I created bug 1042624 for that)
- the phone should restart

Attached is the logcat during this restart.
(In reply to Mathieu Agopian [:magopian] from comment #24)
> STR:
> - install "loqui im" from the marketplace
> - try launching it from the homescreen (launching it from the marketplace
> won't work, I created bug 1042624 for that)
> - the phone should restart

Thanks, I'm able to reproduce this on my Flame as well. tn, it seems like the same problem as before. I'm attaching a display list and layer dump from the child process that shows the failure to flatten. Mind taking a look?
Flags: needinfo?(tnikkel)
(The above is from a build running hg cset 717cd9d89a80, which has the previous fixes from this bug).
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Kartikaya, how do you grab those traces you attached? If there's a page explaining how to properly provide traces, I'd love to know ;)
(In reply to Mathieu Agopian [:magopian] from comment #28)
> Kartikaya, how do you grab those traces you attached? If there's a page
> explaining how to properly provide traces, I'd love to know ;)

Unfortunately there isn't much documentation about this. You need to build with --enable-dump-painting in your mozconfig (or export B2G_DUMP_PAINTING=1 in your .userconfig) and then set the layout.display-list.dump pref to true.
Attachment #8461008 - Flags: review?(tnikkel) → review+
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #30)
> (In reply to Mathieu Agopian [:magopian] from comment #28)
> > Kartikaya, how do you grab those traces you attached? If there's a page
> > explaining how to properly provide traces, I'd love to know ;)
> 
> Unfortunately there isn't much documentation about this. You need to build
> with --enable-dump-painting in your mozconfig (or export B2G_DUMP_PAINTING=1
> in your .userconfig) and then set the layout.display-list.dump pref to true.

Note also that if you have a debug build ('export B2G_DEBUG=1' in your .userconfig) then you don't need '--enable-dump-painting' or 'export B2G_DUMP_PAINTING=1'.

BenWa just added these instructions to https://wiki.mozilla.org/FirefoxOS/Performance/App_Performance_Validation#5._Layer_Tree (see the "Get a display list dump" point).
Comment on attachment 8461008 [details] [diff] [review]
Improve display list logging

It would be great if we could just output the zindex if it's !=0. That way we only print it when it's relevant and save a bit of spew.
We set a z-index override of 0, however we use the override value being greater than zero as a sentinel to determine that it has been overridden. So we fall back to getting the actual z-index, which was -1 in this case, and we get sorted into the wrong place.
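
A rough sketch of the sentinel problem described here, and of one way to avoid it with an explicit flag. Only the member name mHasZIndexOverride comes from the patch under review; the two structs and everything else in them are hypothetical and do not reproduce the actual nsDisplayList code.

```cpp
#include <cstdint>

// Hypothetical item that treats "override > 0" as "an override was set".
struct BuggyItem {
  int32_t mFrameZIndex = 0;     // z-index computed from the frame's style
  int32_t mOverrideZIndex = 0;  // 0 doubles as "no override"

  void SetOverrideZIndex(int32_t aZIndex) { mOverrideZIndex = aZIndex; }

  int32_t ZIndex() const {
    // An override of 0 is indistinguishable from no override at all.
    return mOverrideZIndex > 0 ? mOverrideZIndex : mFrameZIndex;
  }
};

// Hypothetical item that records the override in a separate flag.
struct FixedItem {
  int32_t mFrameZIndex = 0;
  int32_t mOverrideZIndex = 0;
  bool mHasZIndexOverride = false;

  void SetOverrideZIndex(int32_t aZIndex) {
    mOverrideZIndex = aZIndex;
    mHasZIndexOverride = true;
  }

  int32_t ZIndex() const {
    return mHasZIndexOverride ? mOverrideZIndex : mFrameZIndex;
  }
};
```

For a frame z-index of -1 and an override of 0, the first variant still reports -1 and the second reports 0.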
Attachment #8461161 - Flags: review?(roc)
Flags: needinfo?(tnikkel)
(In reply to Timothy Nikkel (:tn) from comment #32)
> It would be great if we could just output the zindex if it's !=0. That way
> we only print it when it's relevant and save a bit of spew.

We can do this, but it'll be an extra nsPrintfCString() probably. I guess it's probably worth it considering this is debug only. I'll update the patch and test your latest fix.
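
As an illustration only, the conditional output could look roughly like the following. This sketch uses std::string from the standard library instead of nsPrintfCString, and the helper name ZIndexSuffix and the " z=" label are made up here, not taken from the patch.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: returns the z-index fragment for a display list log
// line, or an empty string when the z-index is 0 and not worth printing.
std::string ZIndexSuffix(int32_t aZIndex) {
  if (aZIndex == 0) {
    return std::string();
  }
  return " z=" + std::to_string(aZIndex);
}
```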
Updated to only print non-zero z-indices, carrying r+.

Confirmed your other patch works to fix the STR in comment 24.
Attachment #8461008 - Attachment is obsolete: true
Attachment #8461173 - Flags: review+
Comment on attachment 8461161 [details] [diff] [review]
Fix z-index override code

Review of attachment 8461161 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/base/nsDisplayList.h
@@ +2611,5 @@
>    // The frames from items that have been merged into this item, excluding
>    // this item's own frame.
>    nsTArray<nsIFrame*> mMergedFrames;
>    nsRect mBounds;
> +  bool mHasZIndexOverride;

Put the bool last.
Attachment #8461161 - Flags: review?(roc) → review+
I checked the scenarios described in comment 0, comment 1 and comment 9 and the issue with the phone restarting is not reproducible anymore in the latest master build.

Gaia      15c84c943e41ad834640a45e1e1c2ac804168af7
Gecko     https://hg.mozilla.org/mozilla-central/rev/30907d52c4c2
BuildID   20140723160203
Version   34.0a1
ro.build.version.incremental=109
ro.build.date=Mon Jun 16 16:51:29 CST 2014
https://hg.mozilla.org/mozilla-central/rev/3c04a2241acd
https://hg.mozilla.org/mozilla-central/rev/9c3d8f8b46f7
Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
Resolution: --- → FIXED
Gaia      c72257b2d27135bfcd68e89dd584182797784016
Gecko     https://hg.mozilla.org/mozilla-central/rev/06ac51c2b8a8
BuildID   20140724040205
Version   34.0a1
ro.build.version.incremental=110
ro.build.date=Fri Jun 27 15:57:58 CST 2014
B1TC00011230
Flame
Status: RESOLVED → VERIFIED
Depends on: 1043610
Switching 2.1? to 2.1+ on these fixed bugs, as they are regressions.

Nothing to land here; it's just flag cleanup of the 2.1? list. Please NI me if there is any confusion or disagreement.
blocking-b2g: 2.1? → 2.1+