Closed Bug 966476 Opened 10 years ago Closed 10 years ago

Settings app is killed when showing the passcode panel with the Hardware Composer turned on

Categories

(Core :: Panning and Zooming, defect)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME
blocking-b2g 1.4+

People

(Reporter: sotaro, Unassigned)

References

Details

(4 keywords, Whiteboard: [MemShrink:P2][xfail])

Attachments

(1 file)

Settings app is killed because of OOM during scrolling. I saw the crash on master hamachi. I did not see the problem on b2g v1.3. The problem happens only when APZ is enabled, but the crash does not happen on v1.3, so some other factor might contribute to the crash.

STR
[1] Start the Settings app.
[2] Scroll to the bottom of the page.
[3] Scroll very quickly to the top.
[4] Scroll very quickly to the bottom.

Repeat [3] and [4] until the app is killed by the low memory killer.
blocking-b2g: --- → 1.4?
Before the crash, the Settings app's rendering becomes white.
Component: Gaia::Settings → Graphics
Product: Firefox OS → Core
Severity: normal → critical
Keywords: perf, regression
Whiteboard: [MemShrink]
Here is another STR:

1. Open Settings
2. Open Screen Lock
3. Select to create Passcode
4. Type in the code
5. Tap "Create"

==> The Settings app is killed and the user is unable to set a passcode.
Keywords: smoketest
Attached file Logcat-Comment2
Logcat for issue described in Comment 2
This blocks signing in to WiFi from Settings as well.
Jason does this affect 1.3?
(In reply to Andreas Gal :gal from comment #5)
> Jason does this affect 1.3?

Nope.

Talked with Alexandre about this - he indicated he couldn't reproduce this when APZC is disabled. I've confirmed this as well - you only OOM here if APZC is enabled.
Component: Graphics → Panning and Zooming
Vivien recommended I ping Botond about this, so I'm putting needinfo on him.
Flags: needinfo?(botond)
Blocks: gaia-apzc
This was working on the 1/30/2014 build, busted on 1/31/2014 build. Working on translating to a range right now.
Working:

Build ID: 20140130040201 
Gecko: bf49e4428906 
Gaia: 0bc0e703df197d46dfffb9ac65cb85d2e3e10c4a 

Busted:

Build ID: 20140131095418 
Gecko: 735a648bca0d 
Gaia: aedd5c9636f305d4433491056a0ca984dfb859b1
APZC is enabled in 1.3 so it does affect 1.3 then?
Sounds like a recent regression in code that hasn't been uplifted to 1.3 yet. I see a number of possibilities in that window, but I'm not sure which is most likely. Can we narrow the window further using inbound or local builds?

Also, when people run into this, does the memory usage increase gradually before OOMing (a leak) or does it just OOM on one set of giant allocation requests?
At a minimum, we can definitely cut this window in half, as we now have two builds per day by default. Adding regressionwindow-wanted to cut this down to 12 hours.
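For reference, the width of a regression window can be read directly from the build IDs, which encode a timestamp as YYYYMMDDHHMMSS. The following is a minimal Python sketch using the working and busted build IDs reported in this bug; the helper name `window_between` is hypothetical and not part of any Mozilla tooling.

```python
from datetime import datetime, timedelta

# Build IDs encode the build timestamp as YYYYMMDDHHMMSS.
BUILD_ID_FORMAT = "%Y%m%d%H%M%S"


def window_between(working_id: str, broken_id: str) -> timedelta:
    """Return the time elapsed between a working build and a broken build."""
    working = datetime.strptime(working_id, BUILD_ID_FORMAT)
    broken = datetime.strptime(broken_id, BUILD_ID_FORMAT)
    return broken - working


# Working and busted build IDs as reported in this bug.
window = window_between("20140130040201", "20140131095418")
print(window)
```

With two builds per day, picking the intermediate build would split a window of this size roughly in half.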
Keywords: qablocker
Blocks: 964842
On Nexus S, I'm reproducing this easily; even just tapping in Settings or loading Settings can trigger it. I've also reproduced it on Desire Z when tapping on the dialer keypad while performing some USSD.
Whiteboard: [MemShrink] → [MemShrink][xfail]
Unlike most of the APZ problems this kills the child, not the parent, so P2.

Let me know if there's anything we can help with, but I suspect that kats et al. have this under control.
Whiteboard: [MemShrink][xfail] → [MemShrink:P2][xfail]
Last Working Environmental Variables:
Device: Buri v1.4 Mozilla RIL
BuildID: 20140130040201
Gaia: 0bc0e703df197d46dfffb9ac65cb85d2e3e10c4a
Gecko: bf49e4428906
Version: 29.0a1
Base Image: V1.2-device.cfg

First Broken Environmental Variables:
Device: Buri v1.4 Mozilla RIL
BuildID: 20140130160200
Gaia: 09064f43116d1b965cb3ab6516fa0f1fa3c98a4c
Gecko: 6f544aa66c1a
Version: 29.0a1
Base Image: V1.2-device.cfg
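The two Gecko changesets above bound the regression range, which can be viewed as a mozilla-central pushlog. A small Python sketch of how that URL is assembled from the last working and first broken revisions; the function name `pushlog_url` is hypothetical, while the base URL and the `fromchange`/`tochange` parameters are the ones used in the links posted in this bug.

```python
from urllib.parse import urlencode

# Base URL as used in the pushlog links posted in this bug.
PUSHLOG_BASE = "http://hg.mozilla.org/mozilla-central/pushloghtml"


def pushlog_url(last_good: str, first_bad: str) -> str:
    """Build a pushlog URL covering the pushes between two Gecko changesets."""
    query = urlencode({"fromchange": last_good, "tochange": first_bad})
    return f"{PUSHLOG_BASE}?{query}"


# Gecko revisions from the last working and first broken builds.
url = pushlog_url("bf49e4428906", "6f544aa66c1a")
print(url)
```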
(In reply to Jason Smith [:jsmith] from comment #18)
> http://hg.mozilla.org/mozilla-central/
> pushloghtml?fromchange=bf49e4428906&tochange=6f544aa66c1a

Maybe bug 956690?
Flags: needinfo?(botond)
Dale is going to manually verify a backout of bug 956690. If it verifies, then we'll back that patch out.
blocking-b2g: 1.4? → 1.4+
(In reply to Jason Smith [:jsmith] from comment #20)
> Dale is going to manually verify a backout of bug 956690. If it verifies,
> then we'll back that patch out.

Looks like Dale has left the work week, so I'm going to find someone else who can look into this.
Whiteboard: [MemShrink:P2][xfail] → [MemShrink:P2][xfail][status:Waiting on Vivien to verify a backout]
(In reply to Jason Smith [:jsmith] from comment #21)
> (In reply to Jason Smith [:jsmith] from comment #20)
> > Dale is going to manually verify a backout of bug 956690. If it verifies,
> > then we'll back that patch out.
> 
> Looks like Dale has left the work week, so I'm going to find someone else
> who can look into this.

I tried to back out the mentioned bug locally and I still see the crash :/
(In reply to Vivien Nicolas (:vingtetun) (:21) from comment #22)
> (In reply to Jason Smith [:jsmith] from comment #21)
> > (In reply to Jason Smith [:jsmith] from comment #20)
> > > Dale is going to manually verify a backout of bug 956690. If it verifies,
> > > then we'll back that patch out.
> > 
> > Looks like Dale has left the work week, so I'm going to find someone else
> > who can look into this.
> 
> I tried to back out the mentionned bug locally and I still see the crash :/

I can't reproduce if I turn off the HW composer. Will get a stack shortly.
Sorry the stack is a bit small:

(gdb) bt
#0  0x42f04c38 in hwc_set (dev=0x40310840, dpy=<value optimized out>, sur=<value optimized out>, list=0x114)
    at hardware/qcom/display/libhwcomposer/a-family/hwcomposer.cpp:1789
#1  0x00000000 in ?? ()
Summary: Settings app is killed because of OOM during scrolling → Settings app is killed when showing the passcode panel with the Hardware Composer turned on
Whiteboard: [MemShrink:P2][xfail][status:Waiting on Vivien to verify a backout] → [MemShrink:P2][xfail]
Sotaro, your change in this range is not the cause (I don't see how it could be), but do you have any insight into when HWC might crash as above while OGL compositing works? What should we be looking for? There were some layout/layer changes in the range; could it be that HWC doesn't like the type/number/whatever of layers we're sending it? This is the range: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=bf49e4428906&tochange=6f544aa66c1a
Flags: needinfo?(sotaro.ikeda.g)
(In reply to Vivien Nicolas (:vingtetun) (:21) from comment #24)
> Sorry the stack is a bit small:
> 
> (gdb) bt
> #0  0x42f04c38 in hwc_set (dev=0x40310840, dpy=<value optimized out>,
> sur=<value optimized out>, list=0x114)
>     at hardware/qcom/display/libhwcomposer/a-family/hwcomposer.cpp:1789
> #1  0x00000000 in ?? ()

On my master hamachi, after a backout of bug 956690, the STR in Comment 0 becomes difficult to reproduce, but the STR in Comment 2 still reproduces.

However, I did not see the HW composer abort shown above.
Flags: needinfo?(sotaro.ikeda.g)
(In reply to Vivien Nicolas (:vingtetun) (:21) from comment #24)
> Sorry the stack is a bit small:
> 
> (gdb) bt
> #0  0x42f04c38 in hwc_set (dev=0x40310840, dpy=<value optimized out>,
> sur=<value optimized out>, list=0x114)
>     at hardware/qcom/display/libhwcomposer/a-family/hwcomposer.cpp:1789
> #1  0x00000000 in ?? ()

I never saw the abort around there. I feel that this abort is a side effect of another problem. In my environment, hwcomposer.cpp:1789 is not in the set function but in the drawLayerUsingBypass() function. The debug information does not seem to be correct.
(In reply to Sotaro Ikeda [:sotaro] from comment #27)
> (In reply to Vivien Nicolas (:vingtetun) (:21) from comment #24)
> > Sorry the stack is a bit small:
> > 
> > (gdb) bt
> > #0  0x42f04c38 in hwc_set (dev=0x40310840, dpy=<value optimized out>,
> > sur=<value optimized out>, list=0x114)
> >     at hardware/qcom/display/libhwcomposer/a-family/hwcomposer.cpp:1789
> > #1  0x00000000 in ?? ()
> 
> I never saw the abort around there. I feel that this abort is side effect of
> other problem. on my environment, hwcomposer.cpp:1789 is not set function,
> but drawLayerUsingBypass() function. The debug information seems not correct.

Sorry, my environment is master hamachi. The stack above might be from other hardware.
I updated my master hamachi ROM from recent source code. With that ROM, the crash no longer happens with the STR in Comment 0 or the STR in Comment 2.
The only recent change around APZ seems to be Bug 968495, which changed apz.pan_repaint_interval from 250 ms to 40 ms.
Can QA confirm whether the problem happens on today's (most recent) ROM?
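As rough arithmetic on the apz.pan_repaint_interval change: if the interval is read as the spacing between repaint requests while panning (an assumption about how the pref is applied, not something confirmed in this bug), the two values correspond to the repaint rates computed below. The function name is hypothetical.

```python
def repaints_per_second(interval_ms: float) -> float:
    """Repaint requests per second if one request is sent every interval_ms.

    Assumption: the pref value is the spacing between repaint requests
    during a pan; this bug does not confirm that interpretation.
    """
    return 1000.0 / interval_ms


old_rate = repaints_per_second(250)  # pref value before Bug 968495
new_rate = repaints_per_second(40)   # pref value after Bug 968495
print(old_rate, new_rate)
```

Whether this change is related to the crash no longer reproducing is not established here.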
Keywords: qawanted
Bug 968495 just landed on m-c. Today's ROM might not include the change.
This issue is no longer present on today's Buri Master M-C build.

Environmental Variables:
Device: Buri Master M-C mozRIL
BuildID: 20140207040203
Gaia: 3fc26ae786e3869a7ef1e23afc9807ac1b4741f2
Gecko: d05c721ea1b0
Version: 30.0a1
Base Image: v1.2-device.cfg
Keywords: qawanted
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME