Closed Bug 845867 Opened 7 years ago Closed 6 years ago

crash in mozilla::layers::LayerManagerOGL::WorldTransformRect @ libGLES_hgl.so@0x4.... on Samsung ARMv6 devices with Broadcom VideoCore IV GPU running Gingerbread

Categories

(Core :: Graphics: Layers, defect, critical)

22 Branch
ARM
Android
defect
Not set
critical

Tracking

RESOLVED DUPLICATE of bug 925608
Tracking Status
firefox21 --- affected
firefox22 + affected
firefox23 + wontfix
firefox24 --- affected
firefox25 --- affected
fennec + ---

People

(Reporter: scoobidiver, Unassigned)

Details

(Keywords: crash, regression, topcrash-android-armv6, Whiteboard: [native-crash][ARMv6])

Crash Data

With combined signatures, it's the #4 top crasher in 22.0a1.
It first showed up in 22.0a1/20130221072044. The regression range might be (discontinuous across builds):
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=401b967b2dfc&tochange=702d2814efbf

Affected devices are:
* samsung GT-S5360 = Galaxy Y
* samsung GT-B5510L = Galaxy Y Pro
* samsung GT-S5830i = Galaxy Ace
* samsung GT-S5830M = Galaxy Ace

Signature 	libGLES_hgl.so@0x478b4
UUID	a25c158a-57b3-47e4-8ebe-25ebd2130227
Date Processed	2013-02-27 16:12:12
Uptime	32
Install Age	12.8 minutes since version was first installed.
Install Time	2013-02-27 15:59:28
Product	FennecAndroid
Version	22.0a1
Build ID	20130227030925
Release Channel	nightly
OS	Android
OS Version	0.0.0 Linux 2.6.35.7 #1 PREEMPT Fri Nov 9 13:21:57 KST 2012 armv6l samsung/GT-S5360/GT-S5360:2.3.6/GINGERBREAD/DDLK2:user/release-keys
Build Architecture	arm
Build Architecture Info	
Crash Reason	SIGBUS
Crash Address	0x0
App Notes 	
AdapterDescription: 'Broadcom -- VideoCore IV HW -- OpenGL ES 2.0 -- Model: GT-S5360, Product: GT-S5360, Manufacturer: samsung, Hardware: bcm21553'
EGL? EGL+ GL Context? GL Context+ GL Layers? GL Layers+ 
nothumb Build
samsung GT-S5360
samsung/GT-S5360/GT-S5360:2.3.6/GINGERBREAD/DDLK2:user/release-keys
Processor Notes 	sp-processor05.phx1.mozilla.com_20098:2008; exploitablity tool: ERROR: unable to analyze dump
EMCheckCompatibility	True
Adapter Vendor ID	Broadcom
Adapter Device ID	VideoCore IV HW
Device	samsung GT-S5360
Android API Version	10 (REL)
Android CPU ABI	armeabi

Frame 	Module 	Signature 	Source
0 	libGLES_hgl.so 	libGLES_hgl.so@0x478b4 	
1 	dalvik-heap (deleted) 	dalvik-heap @0x6d3ffe 	
2 	libxul.so 	gfxMatrix::TransformBounds const 	gfx/thebes/gfxMatrix.cpp:124
3 	dalvik-heap (deleted) 	dalvik-heap @0x733ffe 	
4 	libc.so 	libc.so@0x11026 	
5 	libGLES_hgl.so 	libGLES_hgl.so@0x476aa 	
6 	libEGL.so 	libEGL.so@0x433d 	
7 	dalvik-heap (deleted) 	dalvik-heap @0x733ffe 	
8 	libxul.so 	mozilla::layers::LayerManagerOGL::WorldTransformRect 	gfx/layers/opengl/LayerManagerOGL.cpp:1279
9 	libEGL.so 	libEGL.so@0x9a1e 	
10 	libxul.so 	mozilla::gl::GLContextEGL::MakeCurrentImpl 	gfx/gl/GLLibraryEGL.h:164
11 	libxul.so 	mozilla::layers::LayerManagerOGL::MakeCurrent 	obj-firefox/dist/include/GLContext.h:185
12 	libxul.so 	mozilla::layers::LayerManagerOGL::Render 	gfx/layers/opengl/LayerManagerOGL.cpp:1075
13 	dalvik-heap (deleted) 	dalvik-heap @0x81eaa9 	

More reports at:
https://crash-stats.mozilla.com/report/list?signature=libGLES_hgl.so%400x47710
https://crash-stats.mozilla.com/report/list?signature=libGLES_hgl.so%400x478b4
Crash Signature: [@ libGLES_hgl.so@0x47710 ] [@ libGLES_hgl.so@0x478b4 ] → [@ libGLES_hgl.so@0x47710 ] [@ libGLES_hgl.so@0x478b4 ] [@ libGLES_hgl.so@0x48e60 ] [@ libGLES_hgl.so@0x479b8 ] [@ libGLES_hgl.so@0x47b5c ]
Summary: crash in mozilla::layers::LayerManagerOGL::WorldTransformRect @ libGLES_hgl.so@0x47... on Samsung Galaxy Y and Ace with Broadcom VideoCore IV GPU running Gingerbread → crash in mozilla::layers::LayerManagerOGL::WorldTransformRect @ libGLES_hgl.so@0x4.... on Samsung Galaxy Y and Ace with Broadcom VideoCore IV GPU running Gingerbread
(In reply to Scoobidiver from comment #0)
> Affected devices are:
> * samsung GT-S5360 = Galaxy Y
> * samsung GT-B5510L = Galaxy Y Pro
> * samsung GT-S5830i = Galaxy Ace
> * samsung GT-S5830M = Galaxy Ace

We should have access to these devices in QA. We'll provide URLs for your testing.
I couldn't find any URLs in the trunk signatures listed in this bug.
Keywords: needURLs
Crash Signature: [@ libGLES_hgl.so@0x47710 ] [@ libGLES_hgl.so@0x478b4 ] [@ libGLES_hgl.so@0x48e60 ] [@ libGLES_hgl.so@0x479b8 ] [@ libGLES_hgl.so@0x47b5c ] → [@ gfxMatrix::TransformBounds(gfxRect const&) const ] [@ @0x0 | gfxMatrix::TransformBounds(gfxRect const&) const ] [@ libGLES_hgl.so@0x47710 ] [@ libGLES_hgl.so@0x478b4 ] [@ libGLES_hgl.so@0x48e60 ] [@ libGLES_hgl.so@0x479b8 ] [@ libGLES_hgl.so@0x47…
Snorp - can you take a look at recent related changes while we wait for QA feedback?
Assignee: nobody → snorp
There is way too much stuff in that range to see anything obvious. Streaming WebGL buffers is in there (bug 716859), but no reason to think it's involved at all. Also the stack trace looks like garbage (LayerManagerOGL::WorldTransformRect does not call any OpenGL functions).
Crash Signature: libGLES_hgl.so@0x47b5c ] [@ libGLES_hgl.so@0x47cfc ] → libGLES_hgl.so@0x47b5c ] [@ libGLES_hgl.so@0x47cfc ] [@ libGLES_hgl.so@0x47770 ] [@ libGLES_hgl.so@0x47d0c ] [@ libGLES_hgl.so@0x49004 ]
CC'ing a few gfx folks to help take a look.
Jet, we need an assignee from gfx.
Assignee: snorp → bugs
tracking-fennec: --- → ?
Jeff, do you have time to take a look at this?
Assignee: bugs → jgilbert
tracking-fennec: ? → 22+
Flags: needinfo?(jgilbert)
Only if I can get a device to work with. :)
Flags: needinfo?(jgilbert)
(In reply to Jeff Gilbert [:jgilbert] from comment #9)
> Only if I can get a device to work with. :)

Jeff, i'm at offsite today and tomorrow, but i do have a Galaxy Y on me right now.  I can drop it off on thursday in mountain view if you are there.  

Also, check with kbrosnan (3rd floor, MV), if he has a Galaxy Ace on him.
(In reply to Tony Chung [:tchung] from comment #10)
> (In reply to Jeff Gilbert [:jgilbert] from comment #9)
> > Only if I can get a device to work with. :)
> 
> Jeff, i'm at offsite today and tomorrow, but i do have a Galaxy Y on me
> right now.  I can drop it off on thursday in mountain view if you are there.
> 
> 
> Also, check with kbrosnan (3rd floor, MV), if he has a Galaxy Ace on him.

I've dropped off the Galaxy Y at Jeff's desk.  Need it back by thursday, if possible.
Jeff, this is pretty high on ARMv6 on 20, esp. if you add up those libGLES_hgl.so@0x4* signatures, see https://crash-analysis.mozilla.com/rkaiser/2013-03-28/2013-03-28.fennecandroid.20.0b7.armv6.topcrash.html - did you get anything out of that Galaxy Y?
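For anyone who wants to add up the libGLES_hgl.so@0x4* signatures from a crash-stats export, here is a minimal sketch. The regular expression is an assumption based on the signatures listed in this bug (module name, `@0x4`, then four more hex digits), and `hgl_family` is a hypothetical helper, not part of any crash-stats tooling.

```python
import re

# Assumed pattern, inferred from the signatures seen in this bug:
# "libGLES_hgl.so@0x4" followed by exactly four more hex digits.
HGL_FAMILY = re.compile(r"libGLES_hgl\.so@0x4[0-9a-f]{4}")


def hgl_family(signatures):
    """Return the signatures that belong to the libGLES_hgl.so@0x4* family."""
    return [s for s in signatures if HGL_FAMILY.fullmatch(s.strip())]


# Sample input: signatures and frames that appear in this bug.
signatures = [
    "libGLES_hgl.so@0x47710",
    "libGLES_hgl.so@0x478b4",
    "libGLES_hgl.so@0x48e60",
    "gfxMatrix::TransformBounds(gfxRect const&) const",
    "libEGL.so@0x433d",
]

matched = hgl_family(signatures)
print(matched)
```

Summing the per-signature report counts over `matched` would then give the combined rank; the counts themselves have to come from the topcrash report.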
Crash Signature: libGLES_hgl.so@0x47b5c ] [@ libGLES_hgl.so@0x47cfc ] [@ libGLES_hgl.so@0x47770 ] [@ libGLES_hgl.so@0x47d0c ] [@ libGLES_hgl.so@0x49004 ] → libGLES_hgl.so@0x47b5c ] [@ libGLES_hgl.so@0x47cfc ] [@ libGLES_hgl.so@0x47770 ] [@ libGLES_hgl.so@0x47d0c ] [@ libGLES_hgl.so@0x49004 ] [@ libGLES_hgl.so@0x4d3c4 ] [@ libGLES_hgl.so@0x4b8a8 ] [@ libGLES_hgl.so@0x4cee8 ] [@ libGLES_hgl.so@0x4e500…
Jeff, have you got a device now?
Flags: needinfo?(jgilbert)
Yep, working on reproing presently.
Flags: needinfo?(jgilbert)
It's the #13 top crasher in 23.0a1 and has never happened in Aurora.
I doubt the Aurora population is representative (see also bug 847021).
Jeff - Just checking on status
Flags: needinfo?(jgilbert)
I have a device, and will be picking up where I left off in reproing next week.
Flags: needinfo?(jgilbert) → needinfo?
This is by far the #1 issue on 22 beta and ARMv6; it's also a top 3 issue on 21 and ARMv6. Is there any progress here?
Flags: needinfo?
Flags: needinfo?(jgilbert)
I think it's the HGL version of bug 863313.

(In reply to Robert Kaiser (:kairo@mozilla.com) [away until early June] from comment #18)
> This is by far the #1 issue on 22 beta and ARMv6
Yes indeed. It accounts for 67% of crashes on ARMv6 devices over the last day in 22.0 Beta.
Given that it's probably too late to get something in for 22, I'm nominating for 23. That said, even 21 is affected as I laid out in comment #18.
(In reply to Robert Kaiser (:kairo@mozilla.com) [away until early June] from comment #20)
> That said, even 21 is affected as I laid out in comment #18.
Bug 863313 affects 21 as well, but both bugs were filed because of the spike since 22.0.
tracking-fennec: 22+ → +
It's the #1 top crasher in the first hours of 22.0 (all devices, not only ARMv6 devices) and accounts for 7.6% of all crashes.
The only crash I've been able to reproduce on 24/25 has been bug 863313, I think. (moz-gdb also refuses to attach for some reason, so it's really hard to tell (maybe armv6-related?)) Lack of repro case in this bug makes it really hard, too. (and as snorp mentioned, that stack trace is garbage)

Since this looks a lot like bug 863313, I would tentatively assume this is a dupe of it, and check back after 863313 is fixed, which it sounds like we have a lead on.
Flags: needinfo?(jgilbert)
It looks like this is turning out to be *by far* the top issue on the ARMv6 22.0 release. Right now, signatures from this bug occupy the top 3 spots on https://crash-analysis.mozilla.com/rkaiser/2013-06-26/2013-06-26.fennecandroid.22.0.armv6.topcrash.html
Summary: crash in mozilla::layers::LayerManagerOGL::WorldTransformRect @ libGLES_hgl.so@0x4.... on Samsung Galaxy Y and Ace with Broadcom VideoCore IV GPU running Gingerbread → crash in mozilla::layers::LayerManagerOGL::WorldTransformRect @ libGLES_hgl.so@0x4.... on Samsung ARMv6 devices with Broadcom VideoCore IV GPU running Gingerbread
Crash Signature: libGLES_hgl.so@0x4e500 ] [@ libGLES_hgl.so@0x4b5e4 ] [@ libGLES_hgl.so@0x48944 ] [@ libGLES_hgl.so@0x4cf1a ] [@ libGLES_hgl.so@0x4d3f6 ] → libGLES_hgl.so@0x4e500 ] [@ libGLES_hgl.so@0x4b5e4 ] [@ libGLES_hgl.so@0x48944 ] [@ libGLES_hgl.so@0x4cf1a ] [@ libGLES_hgl.so@0x4d3f6 ] [@ libGLES_hgl.so@0x47ea0 ] [@ libGLES_hgl.so@0x47c64 ] [@ libGLES_hgl.so@0x47a18 ] [@ libGLES_hgl.so@0x4da38…
Here is a non buggy stack trace (bp-1c5191dd-697a-4d55-a374-1ce2e2130702):
Frame 	Module 	Signature 	Source
0 	libGLES_hgl.so 	libGLES_hgl.so@0x479b8 	
1 	libxul.so 	mozilla::layers::ContainerLayer::DefaultComputeEffectiveTransforms(gfx3DMatrix const&) 	gfx/layers/Layers.cpp
2 	libxul.so 	mozilla::layers::CompositorOGL::BeginFrame(mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const*, gfxMatrix const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*) 	gfx/layers/opengl/CompositorOGL.cpp
3 	libxul.so 	mozilla::layers::LayerManagerComposite::Render() 	gfx/layers/composite/LayerManagerComposite.cpp
4 	libxul.so 	mozilla::layers::LayerManagerComposite::EndTransaction(void (*)(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, nsIntRegion const&, void*), void*, mozilla::layers::LayerManager::EndTransactionFlags) 	gfx/layers/composite/LayerManagerComposite.cpp
5 	libxul.so 	mozilla::layers::LayerManagerComposite::EndEmptyTransaction(mozilla::layers::LayerManager::EndTransactionFlags) 	gfx/layers/composite/LayerManagerComposite.cpp
6 	libxul.so 	mozilla::layers::CompositorParent::Composite() 	gfx/layers/ipc/CompositorParent.cpp
7 	libxul.so 	mozilla::layers::CompositorParent::ResumeComposition() 	gfx/layers/ipc/CompositorParent.cpp
8 	libxul.so 	RunnableMethod<mozilla::ipc::AsyncChannel, void (mozilla::ipc::AsyncChannel::*)(mozilla::ipc::AsyncChannel*, mozilla::ipc::AsyncChannel::Side), Tuple2<mozilla::ipc::AsyncChannel*, mozilla::ipc::AsyncChannel::Side> >::Run() 	ipc/chromium/src/base/tuple.h
9 	libxul.so 	MessageLoop::RunTask(Task*) 	ipc/chromium/src/base/message_loop.cc
10 	libxul.so 	MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) 	ipc/chromium/src/base/message_loop.cc
11 	libxul.so 	MessageLoop::DoWork() 	ipc/chromium/src/base/message_loop.cc
12 	libxul.so 	base::MessagePumpDefault::Run(base::MessagePump::Delegate*) 	ipc/chromium/src/base/message_pump_default.cc
13 	libxul.so 	MessageLoop::RunInternal() 	ipc/chromium/src/base/message_loop.cc
14 	libxul.so 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc
15 	libxul.so 	base::Thread::ThreadMain() 	ipc/chromium/src/base/thread.cc
16 	libxul.so 	ThreadFunc 	ipc/chromium/src/base/platform_thread_posix.cc
17 	libc.so 	libc.so@0x11bb2 	
18 	libc.so 	libc.so@0x1176e
Crash Signature: [@ gfxMatrix::TransformBounds(gfxRect const&) const ] [@ @0x0 | gfxMatrix::TransformBounds(gfxRect const&) const ] [@ libGLES_hgl.so@0x47710 ] [@ libGLES_hgl.so@0x478b4 ] [@ libGLES_hgl.so@0x48e60 ] [@ libGLES_hgl.so@0x479b8 ] [@ libGLES_hgl.so@0x47… → [@ gfxMatrix::TransformBounds(gfxRect const&) const ] [@ @0x0 | gfxMatrix::TransformBounds(gfxRect const&) const ] [@ _moz_cairo_matrix_transform_distance ] [@ libGLES_hgl.so@0x47710 ] [@ libGLES_hgl.so@0x478b4 ] [@ libGLES_hgl.so@0x48e60 ] [@ libGL…
If I see things correctly, then bug 887097 has landed in 23.0b2, but this stays hugely on top of the ARMv6 crash list in that version.
I've hit this, as well as bug 894933, occasionally on an ARMv6 phone, largely while interacting with the awesome screen, just after typing an address into it. I have been using nightly and beta to attempt to reproduce this.
We're going to build our final beta; without a fix for this, I'm marking it wontfix for 23.
Duplicate of this bug: 906505
So, if I look at the top signatures on ARMv6, which are all this bug - see https://crash-analysis.mozilla.com/rkaiser/2013-09-19/2013-09-19.fennecandroid.24.0.armv6.topcrash.html - I see that it's really that narrow in terms of graphics adapters:

Vendor 	Adapter 	Report Count 	Percentage
Broadcom 	VideoCore IV HW	853 	96.932 %
Broadcom 	VideoCore IV HW / Chainfire3D 	22 	2.500 %
Qualcomm 	VideoCore IV HW / Chainfire3D 	5 	0.568 %

And it's all Samsung devices with Android API level "10 (REL)" (we now get all that from signature summaries).
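As a quick sanity check of the table above, the percentages can be recomputed from the report counts. This is only a sketch: the counts are copied from the table, and the variable names are made up for illustration.

```python
# (vendor, adapter, report count) rows copied from the table above.
adapter_reports = [
    ("Broadcom", "VideoCore IV HW", 853),
    ("Broadcom", "VideoCore IV HW / Chainfire3D", 22),
    ("Qualcomm", "VideoCore IV HW / Chainfire3D", 5),
]

# Total number of reports across all adapter rows.
total = sum(count for _, _, count in adapter_reports)

# Share of each row as a percentage of the total, rounded to 3 decimals.
shares = [round(100 * count / total, 3) for _, _, count in adapter_reports]

for (vendor, adapter, count), share in zip(adapter_reports, shares):
    print(f"{vendor}\t{adapter}\t{count}\t{share:.3f} %")
```

The three rows add up to 880 reports, and every row names a VideoCore IV adapter string, which is what makes the adapter range so narrow.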
I can't make any useful progress, since I was unable to reproduce this (and I don't have the device anymore). For people who can reproduce this, what you'll probably want is an apitrace capture of the GL calls made leading up to the crash.
Assignee: jgilbert → nobody
This stays by far the topcrash on ARMv6, but from my current estimates, this might affect ~4k out of ~200k ARMv6 active installations on 25.0 release.

(How I got to that estimate: Google Play data shows us that 1-2% of our installs seem to be on ARMv6, so I applied that factor to current ADI, and the ~4k come out of looking into affected installs on the highest-ranked signatures.)
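The arithmetic behind that estimate can be sketched roughly as follows. Only the ~4k, ~200k and 1-2% figures come from the comment above; the overall ADI range is back-calculated from them and is not stated anywhere in this bug, so treat it as an assumption.

```python
# Figures taken from the comment above (both are rough estimates).
armv6_installs = 200_000     # ~200k ARMv6 active installations on 25.0
affected_installs = 4_000    # ~4k installs hitting the top signatures

# Share of ARMv6 installations affected, in percent.
affected_pct = 100 * affected_installs / armv6_installs

# Back-calculated (NOT stated in the bug): the overall ADI that would be
# implied if ARMv6 is 2% or 1% of all installs.
implied_adi_low = armv6_installs * 100 // 2
implied_adi_high = armv6_installs * 100 // 1

print(f"affected share of ARMv6 installs: {affected_pct:.1f}%")
print(f"implied overall ADI: {implied_adi_low:,} to {implied_adi_high:,}")
```

So the topcrash touches roughly one in fifty ARMv6 installations under these estimates.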
This crash seems to have dropped significantly in 28, probably because of bjacob's excellent work in bug 925608. There's still a handful I see on crash-stats but I think we can take off the topcrash status and/or mark this bug fixed?
Flags: needinfo?(kairo)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #33)
> This crash seems to have dropped significantly in 28, probably because of
> bjacob's excellent work in bug 925608. There's still a handful I see on
> crash-stats but I think we can take off the topcrash status and/or mark this
> bug fixed?

Yes, that is as expected and it's really awesome, esp. as it eliminated the vast majority of our ARMv6 crashes completely and 28 will be the best release ever for that class of devices, stability-wise. I'd mark this crash as a dupe of bug 925608, but feel free to resolve in other ways. :)
Flags: needinfo?(kairo)
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 925608