Closed Bug 771774 Opened 12 years ago Closed 11 years ago

crash with eglMakeCurrent failed (EGL error 3000) and abort message: "OpenGL-accelerated layers are a hard requirement on this platform [...]" on Vivante GPUs (e.g. in Rockchip rk29board, imapx200 and Vimicro chipsets)

Categories

(Firefox for Android Graveyard :: Toolbar, defect)

16 Branch
ARM
Android
defect
Not set
critical

Tracking

(firefox16+ affected, firefox17 affected, firefox18 affected, firefox19 affected)

RESOLVED WORKSFORME
Tracking Status
firefox16 + affected
firefox17 --- affected
firefox18 --- affected
firefox19 --- affected

People

(Reporter: scoobidiver, Unassigned)

References

Details

(Keywords: crash, regression, Whiteboard: [native-crash][startupcrash][armv6])

Crash Data

There are several startup crashes from the same user in 16.0a1/20120706075126.
It's likely a regression from bug 766251.

Signature 	mozalloc_abort | __swrite | libxul.so@0x10aa6ae
UUID	e1ccb83b-08b1-4915-9bef-9cb5e2120707
Date Processed	2012-07-07 10:23:47
Uptime	8
Last Crash	14 seconds before submission
Install Age	51 seconds since version was first installed.
Install Time	2012-07-07 10:24:41
Product	FennecAndroid
Version	16.0a1
Build ID	20120706075126
Release Channel	nightly
OS	Linux
OS Version	0.0.0 Linux 2.6.32.27 #143 PREEMPT Wed Sep 14 15:36:43 CST 2011 armv7l
Build Architecture	arm
Build Architecture Info	
Crash Reason	SIGSEGV
Crash Address	0x0
App Notes 	
AdapterDescription: 'An error occurred earlier while querying gfx info: eglMakeCurrent failed (EGL error 3000).  --  --  -- Model: AN10G2, Product: rk29sdk, Manufacturer: unknown, Hardware: rk29board'
xpcom_runtime_abort(###!!! ABORT: OpenGL-accelerated layers are a hard requirement on this platform. Cannot continue without support for them.: file /builds/slave/m-cen-andrd-ntly/build/widget/xpwidgets/nsBaseWidget.cpp, line 858)
unknown AN10G2
rockchip/rk29sdk/rk29sdk:2.3.1/GINGERBREAD/eng.hwg.20110913.192033:user/test-keys
EMCheckCompatibility	True

Frame 	Module 	Signature 	Source
0 	libmozalloc.so 	mozalloc_abort 	memory/mozalloc/mozalloc_abort.cpp:19
1 	libc.so 	__swrite 	
2 	libxul.so 	libxul.so@0x10aa6ae 	
3 	dalvik-heap (deleted) 	dalvik-heap @0x11f911f

More reports at:
https://crash-stats.mozilla.com/report/list?signature=mozalloc_abort+|+__swrite+|+libxul.so%400x10aa6ae
It was also hit by another user with an rk29board: bp-9922ec79-fa75-4a9b-92aa-f00732120707.
Crash Signature: [@ mozalloc_abort | __swrite | libxul.so@0x10aa6ae] → [@ mozalloc_abort | __swrite | libxul.so@0x10aa6ae] [@ mozalloc_abort ]
*Oh*

"An error occurred earlier while querying gfx info: eglMakeCurrent failed (EGL error 3000)."

I looked up what EGL error 0x3000 was and... it's EGL_SUCCESS.

That makes very little sense :-)

So here's the code producing this error:

http://hg.mozilla.org/mozilla-central/file/afbb478ed7a1/mobile/android/base/gfx/GfxInfoThread.java#l134

// obtain GL strings, store them in mDataQueue
if (!egl.eglMakeCurrent(eglDisplay, eglSurface, eglSurface, eglContext)) {
    eglError(egl, "eglMakeCurrent failed");
    return;
}

and the eglError function that it's calling is there:

http://hg.mozilla.org/mozilla-central/file/afbb478ed7a1/mobile/android/base/gfx/GfxInfoThread.java#l28

So anyway, what's happening is that eglMakeCurrent is returning FALSE which indicates an error:
http://www.khronos.org/opengles/documentation/opengles1_0/html/eglMakeCurrent.html
And we are doing the right thing by not continuing here: we can't continue if eglMakeCurrent fails. But when we print the error message, the EGL error code is EGL_SUCCESS, which is inconsistent with eglMakeCurrent having failed. This seems like it can only be a driver bug: the driver is not setting the EGL error code properly. So I would ignore this EGL_SUCCESS code and consider that, since eglMakeCurrent returns FALSE, we do the right thing by aborting.
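To make the inconsistency concrete, here is a minimal Python sketch (not the GfxInfoThread.java code) that decodes the code printed in the App Notes using the standard EGL error constants. It assumes the "3000" in the message is hexadecimal without a prefix; the helper names `egl_error_name` and `describe_make_current_failure` are made up for illustration.

```python
# Standard EGL error codes as defined in the EGL headers (egl.h).
EGL_ERRORS = {
    0x3000: "EGL_SUCCESS",
    0x3001: "EGL_NOT_INITIALIZED",
    0x3002: "EGL_BAD_ACCESS",
    0x3003: "EGL_BAD_ALLOC",
    0x3004: "EGL_BAD_ATTRIBUTE",
    0x3005: "EGL_BAD_CONFIG",
    0x3006: "EGL_BAD_CONTEXT",
    0x3007: "EGL_BAD_CURRENT_SURFACE",
    0x3008: "EGL_BAD_DISPLAY",
    0x3009: "EGL_BAD_MATCH",
    0x300A: "EGL_BAD_NATIVE_PIXMAP",
    0x300B: "EGL_BAD_NATIVE_WINDOW",
    0x300C: "EGL_BAD_PARAMETER",
    0x300D: "EGL_BAD_SURFACE",
    0x300E: "EGL_CONTEXT_LOST",
}


def egl_error_name(hex_code: str) -> str:
    """Decode a code printed in hex without prefix (assumption), e.g. '3000'."""
    return EGL_ERRORS.get(int(hex_code, 16), "unknown EGL error")


def describe_make_current_failure(returned_true: bool, hex_code: str) -> str:
    """Treat the boolean return value as authoritative, the error code as a hint."""
    if returned_true:
        return "ok"
    name = egl_error_name(hex_code)
    if name == "EGL_SUCCESS":
        # The call failed but no error was recorded: the code cannot be trusted.
        return "failed, error code not set (suspected driver bug)"
    return f"failed with {name}"


print(describe_make_current_failure(False, "3000"))
```

The point of the sketch is only that the FALSE return value decides the outcome; the error code is used for diagnostics and ignored when it contradicts the return value.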
So, these devices seem to use a Rockchip SoC, specifically the RK2918. This is where the "rk29" in the crash reports comes from.

http://en.wikipedia.org/wiki/Rockchip#RK2918

which says that it has a "Vivante GC800 GPU" which should be able to do GLES 2.0.

http://www.chipestimate.com/ip.php?id=18719
Summary: crash with gfx info error and abort message: "OpenGL-accelerated layers are a hard requirement on this platform. Cannot continue without support for them." → crash with gfx info error and abort message: "OpenGL-accelerated layers are a hard requirement on this platform [...]" on Rockchip rk2918 with Vivante GPU
We need to reach out to Vivante. Unfortunately I don't have any contact at Vivante. Kev, can you help here?

http://en.wikipedia.org/wiki/Vivante
Summary: crash with gfx info error and abort message: "OpenGL-accelerated layers are a hard requirement on this platform [...]" on Rockchip rk2918 with Vivante GPU → crash with gfx info error and abort message: "OpenGL-accelerated layers are a hard requirement on this platform [...]" on Rockchip rk29board (rk2918 with Vivante GPU)
Looking into this, I am not sure that it's a regression, i.e. that things were working just fine on these devices before this assertion. Indeed, we do check the return value of eglMakeCurrent in several places, and we fail a lot of things if that fails. So I suspect that things were already not working correctly (probably crashing or at least not displaying anything) and this change only made it noticeable in crash-stats.
Crash Signature: [@ mozalloc_abort | __swrite | libxul.so@0x10aa6ae] [@ mozalloc_abort ] → [@ mozalloc_abort | __swrite | libxul.so@0x10aa6ae] [@ mozalloc_abort | __swrite | libxul.so@0x10aa6ae | pkg.apk@0x676e63 ] [@ mozalloc_abort ]
It also happens with imapx200: bp-c5dc601a-20e4-4be3-8335-7b2ce2120714.
I was running into this crash with the Motorola Backflip. Do you know if that device should work?
Googling shows that the Backflip has an Adreno 130 GPU.

According to this,
https://developer.qualcomm.com/discover/chipsets-and-modems/adreno-gpu

the Adreno 130 only supports OpenGL ES 1.1.

So it can't support GL layers. Did GL layers ever work on it? This seems like exactly what this assertion wants to catch. Before, Fennec would have loaded and run for a few seconds but GL layers would have failed to initialize and assert.
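As an illustration of the kind of check this implies, here is a minimal Python sketch that decides from a GL_VERSION string whether a device reports OpenGL ES 2 or newer. It is not how Fennec makes the decision; it only assumes the version-string formats defined by the OpenGL ES specifications ("OpenGL ES-CM 1.1" or "OpenGL ES-CL 1.1" for ES 1.x profiles, "OpenGL ES 2.0 ..." for ES 2.0 and later). The function names are hypothetical.

```python
import re


def gles_major_version(gl_version: str):
    """Return the major OpenGL ES version from a GL_VERSION string, or None."""
    # ES 1.x strings carry a profile suffix (-CM or -CL); ES 2.0+ strings do not.
    m = re.match(r"OpenGL ES(?:-C[ML])? (\d+)\.(\d+)", gl_version)
    return int(m.group(1)) if m else None


def can_use_gl_layers(gl_version: str) -> bool:
    """GL layers are assumed here to need OpenGL ES 2.0 or newer."""
    major = gles_major_version(gl_version)
    return major is not None and major >= 2


for version in ("OpenGL ES-CM 1.1", "OpenGL ES 2.0"):
    print(version, "->", can_use_gl_layers(version))
```

A device like the Backflip would land in the first case, which is exactly the situation the assertion is meant to catch.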
(In reply to Scoobidiver from comment #7)
> It also happens with imapx200: bp-c5dc601a-20e4-4be3-8335-7b2ce2120714.

Some googling shows that the imapx200 seems to have a Vivante GC600 GPU:
http://www.techknow.t0xic.nl/forum/index.php?topic=2006.0

So the common factor here is Vivante GPUs. Renaming the bug.
Summary: crash with gfx info error and abort message: "OpenGL-accelerated layers are a hard requirement on this platform [...]" on Rockchip rk29board (rk2918 with Vivante GPU) → crash with gfx info error and abort message: "OpenGL-accelerated layers are a hard requirement on this platform [...]" on Vivante GPUs (e.g. in Rockchip rk29board and in imapx200 chipsets)
(In reply to Benoit Jacob [:bjacob] from comment #9)
> Googling shows that the Backflip has an Adreno 130 GPU.
> 
> According to this,
> https://developer.qualcomm.com/discover/chipsets-and-modems/adreno-gpu
> 
> the Adreno 130 only supports OpenGL ES 1.1.
> 
> So it can't support GL layers. Did GL layers ever work on it? This seems
> like exactly what this assertion wants to catch. Before, Fennec would have
> loaded and run for a few seconds but GL layers would have failed to
> initialize and assert.

Ok, that's fine. I only recently tried running Firefox on it so I don't know what the behaviour was like before, but I wouldn't be surprised at all if it did what you describe.
It's a popular nightly crash (2 users per nightly) specifically amongst users with ARMv6 phones.
(In reply to Scoobidiver from comment #12)
> It's a popular nightly crash (2 users per nightly) specifically amongst
> users with ARMv6 phones.

It's not surprising that ARMv6 phones would be overrepresented among phones not capable of OpenGL ES 2, although technically ARMv6 is orthogonal to OpenGL support. If we decide that that's an issue and assuming we keep confirming from the crash reports and googling that these phones don't have OpenGL ES 2 support, then the only thing we might conceivably do about this would be to add either OpenGL ES 1.x support or non-OpenGL support. OpenGL ES 1.x support would be expensive to code, test, and maintain, and might require OpenGL extensions that such devices wouldn't support (for example, if we need to convert between BGRA and RGBA, that's going to require fragment shaders or other extensions). Doing non-OpenGL support would be less expensive for us but would give poor performance, especially on underpowered ARMv6 phones.
Crash Signature: [@ mozalloc_abort | __swrite | libxul.so@0x10aa6ae] [@ mozalloc_abort | __swrite | libxul.so@0x10aa6ae | pkg.apk@0x676e63 ] [@ mozalloc_abort ] → [@ mozalloc_abort | __swrite | libxul.so@0x10aa6ae] [@ mozalloc_abort | __swrite | libxul.so@0x10aa6ae | pkg.apk@0x676e63 ] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ]
Crash Signature: [@ mozalloc_abort | __swrite | libxul.so@0x10aa6ae] [@ mozalloc_abort | __swrite | libxul.so@0x10aa6ae | pkg.apk@0x676e63 ] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ] → [@ mozalloc_abort | __swrite | libxul.so@0x10aa6ae] [@ mozalloc_abort | __swrite | libxul.so@0x10aa6ae | pkg.apk@0x676e63 ] [@ mozalloc_abort | __swrite | libxul.so@0x10cc079] [@ mozalloc_abort | __swrite | libxul.so@0x10d47f9] [@ mozalloc_abort | __s…
(In reply to Naoki Hirata :nhirata from comment #14)
> top crash in nightly

Really? I tried to get the list of top-crashers, and at rank #4 I indeed found mozalloc_abort linking to this bug, but most of the reports there have nothing to do with this bug:

https://crash-stats.mozilla.com/report/list?range_value=7&range_unit=days&date=2012-07-26&signature=mozalloc_abort&version=FennecAndroid%3A17.0a1

for example:

https://crash-stats.mozilla.com/report/index/a46a25c2-96f3-4ad9-9e71-0f95d2120725

Here the AppNotes do not have the message that is characteristic of this bug, and in fact they confirm that the bug did not happen on this phone.

It just seems that the way we do assertions (at least on Android) is utterly unfriendly to the way we classify crashers by signature.

Maybe we should revisit that and have a separate (maybe templated) abort function per assertion?
Or maybe find a way to make the assertion message part of the "signature", i.e. extend the concept of signature.
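A rough sketch of what "extending the signature" could look like, in Python. This is not how Socorro builds signatures; the `extended_signature` function, the "ABORT:" separator, and the 80-character cap are all assumptions made up for illustration.

```python
import re


def extended_signature(frames, abort_message=None, max_len=80):
    """Join stack frames as usual and, if present, append the abort message."""
    base = " | ".join(frames)
    if not abort_message:
        return base
    # Collapse whitespace and cap the length so signatures stay comparable.
    message = re.sub(r"\s+", " ", abort_message).strip()[:max_len]
    return f"{base} | ABORT: {message}"


print(extended_signature(
    ["mozalloc_abort"],
    "OpenGL-accelerated layers are a hard requirement on this platform",
))
```

With something like this, two aborts that share the generic mozalloc_abort frame but carry different messages would no longer be lumped under one signature.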
(In reply to Benoit Jacob [:bjacob] from comment #16)
> Or maybe find a way to make the assertion message part of the "signature"
> i.e. extend the concept of signature.

That's something we should already be doing for Java exceptions. Is there some way we can get to the assertion message in a crash report in a reasonable way (App Notes don't count, they're a mess, nothing we can clearly filter for)? If so, we should definitely file a bug on Socorro to look into using it in signatures. If not, maybe we should file a bug to get an "assertion message" field or so into crash annotations and then use that in signatures.
Keywords: topcrash
Hey, we control how our assertions work (except for those that come directly from libraries that we call into, but that's not the issue here). So we can do _anything_ with them ;-) For example, we could certainly have assertions copy their message into a pre-allocated fixed-size buffer. In fact, isn't that being done on some platforms already? I remember seeing assertion messages in crash reports. Once we have that, it's just a matter of patching breakpad/socorro to take that string and make it part of what's called the "signature". Or maybe even better, allow searches into the AppNotes / whatever field the assertion message goes to? In fact, if we had arbitrary deep searches into crash reports, then no matter where in crash reports the assertion messages go (AppNotes or elsewhere) we could search for them.
Deep searches into crash data are coming. The Socorro team has just hit a long row of snags when trying to deploy ElasticSearch (not to mention that other, higher priorities hit them every few months), but it should hopefully be deployed this quarter, and then they'll add search capability for more and more data step by step (as the full set of data from every crash will be indexed there).

If we can send the assertion message in a separate crash annotation field, that would surely be helpful, esp. once we get such search capabilities.
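Until such a field exists, the message can at least be pulled out of the App Notes text. Below is a minimal Python sketch that does this for the `xpcom_runtime_abort(###!!! ABORT: ...: file ..., line N)` form seen in the report at the top of this bug. The regular expression is an assumption based on that single sample, and `parse_abort` is a hypothetical helper, not part of Socorro.

```python
import re

# Pattern inferred from the one App Notes sample in this bug (assumption).
ABORT_RE = re.compile(
    r"xpcom_runtime_abort\(###!!! ABORT: (?P<message>.*?): "
    r"file (?P<file>\S+), line (?P<line>\d+)\)"
)


def parse_abort(app_notes: str):
    """Return the abort message, file and line found in App Notes, or None."""
    m = ABORT_RE.search(app_notes)
    if m is None:
        return None
    return {
        "message": m.group("message"),
        "file": m.group("file"),
        "line": int(m.group("line")),
    }


sample = (
    "xpcom_runtime_abort(###!!! ABORT: OpenGL-accelerated layers are a hard "
    "requirement on this platform. Cannot continue without support for them.: "
    "file /builds/slave/m-cen-andrd-ntly/build/widget/xpwidgets/nsBaseWidget.cpp, "
    "line 858)"
)
print(parse_abort(sample))
```

Parsing free text like this is fragile, which is the argument for a dedicated annotation field.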
I initially focused this bug on Vivante GPUs as that's where we initially got the most reports. I'm OK either way but we should decide once and for all whether we keep this bug Vivante-focused (in which case bug 778175 should be considered a separate bug, unless the HTC Buzz Wildfire happens to have a Vivante GPU?)
(In reply to Benoit Jacob [:bjacob] from comment #21)
> I initially focused this bug on Vivante GPUs as that's where we initially
> got the most reports. I'm OK either way but we should decide once and for
> all whether we keep this bug Vivante-focused (in which case bug 778175
> should be considered a separate bug, unless the HTC Buzz Wildfire happens to
> have a Vivante GPU?)

My bad, HTC Wildfire doesn't support OpenGL and I didn't know that Firefox needs OpenGL to run.
Whiteboard: [native-crash][startupcrash] → [native-crash][startupcrash][ARMv6]
Phones with ARMv7 processors can have GPUs that do not support OpenGL ES 2.0.
Whiteboard: [native-crash][startupcrash][ARMv6] → [native-crash][startupcrash]
(In reply to Benoit Jacob [:bjacob] from comment #21)
> I initially focused this bug on Vivante GPUs as that's where we initially
> got the most reports. I'm OK either way but we should decide once and for
> all whether we keep this bug Vivante-focused (in which case bug 778175
> should be considered a separate bug, unless the HTC Buzz Wildfire happens to
> have a Vivante GPU?)
As those Vivante GPUs are supposed to be compatible with OpenGL ES 2.0, I will undupe bug 778175.
Summary: crash with gfx info error and abort message: "OpenGL-accelerated layers are a hard requirement on this platform [...]" on Vivante GPUs (e.g. in Rockchip rk29board and in imapx200 chipsets) → crash with eglMakeCurrent failed (EGL error 3000) and abort message: "OpenGL-accelerated layers are a hard requirement on this platform [...]" on Vivante GPUs (e.g. in Rockchip rk29board and in imapx200 chipsets)
Crash Signature: __swrite | libxul.so@0x10cbf11 | png_get_user_transform_ptr ] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ] → __swrite | libxul.so@0x10cbf11 | png_get_user_transform_ptr ] [@ mozalloc_abort | __swrite | libxul.so@0x10ca9e5 ] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ]
Crash Signature: __swrite | libxul.so@0x10cbf11 | png_get_user_transform_ptr ] [@ mozalloc_abort | __swrite | libxul.so@0x10ca9e5 ] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ] → __swrite | libxul.so@0x10cbf11 | png_get_user_transform_ptr ] [@ mozalloc_abort | __swrite | libxul.so@0x10ca9e5 ] [@ mozalloc_abort | __swrite | libxul.so@0x10ca9a5] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ]
It's the #1 top crasher in 16.0b2 for ARMv6 devices, mainly caused by the imapx200 chipset.
The current Google Play blacklist in Beta only takes into account the CPU freq/RAM size: https://etherpad.mozilla.org/armv6blocklist
Depends on: 775237
Keywords: topcrash
Aha, I didn't realize that Vivante would be so widespread among ARMv6 devices. I agree that we should block these devices from showing Firefox on Google Play. Do you know how to do that, i.e. which Bugzilla component?
I updated my script from bug 778175 to count rk29board/imapx200 devices, and that confirms what Scoobidiver says. Look at what happens around September 7:

Command:

$ msg='OpenGL-accelerated layers are a hard requirement on this platform'
$ for x in $(cat dates); do
    # all Fennec 16/17/18 crash rows for that day
    rows() { zcat "$x-pub-crashdata.csv.gz" | grep 'FennecAndroid.1[678].0'; }
    echo "$x $(rows | grep -v armv6 | wc -l) $(rows | grep "$msg" | grep -v armv6 | wc -l) $(rows | grep armv6 | wc -l) $(rows | grep "$msg" | grep armv6 | wc -l) $(rows | grep armv6 | grep -E 'rk29board|imapx200' | wc -l) $(rows | grep "$msg" | grep armv6 | grep -E 'rk29board|imapx200' | wc -l)"
  done


***Note: this includes only versions 16+. Hence the sudden increase in numbers around the 15 release date, when 16 moved into beta.***
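For anyone who would rather not chain six grep pipelines, roughly the same six counts can be computed in one pass. This is a minimal Python sketch, not the script that produced the numbers in this comment. It uses the same plain substring and regex tests as the greps above; `count_rows` and `count_file` are hypothetical names.

```python
import gzip
import re

ABORT_MSG = "OpenGL-accelerated layers are a hard requirement on this platform"
VERSION_RE = re.compile(r"FennecAndroid.1[678].0")  # same pattern as the grep
VIVANTE_RE = re.compile(r"rk29board|imapx200")


def count_rows(lines):
    """Return (v7, v7_nogl, v6, v6_nogl, v6_vivante, v6_vivante_nogl)."""
    v7 = v7_nogl = v6 = v6_nogl = v6_viv = v6_viv_nogl = 0
    for line in lines:
        if not VERSION_RE.search(line):
            continue  # not a Fennec 16/17/18 row
        no_gl = ABORT_MSG in line
        if "armv6" not in line:
            # Like `grep -v armv6`: everything that is not ARMv6 counts as ARMv7.
            v7 += 1
            v7_nogl += no_gl
            continue
        v6 += 1
        v6_nogl += no_gl
        if VIVANTE_RE.search(line):
            v6_viv += 1
            v6_viv_nogl += no_gl
    return (v7, v7_nogl, v6, v6_nogl, v6_viv, v6_viv_nogl)


def count_file(path):
    """Apply count_rows to one gzip-compressed daily crash-data CSV."""
    with gzip.open(path, "rt", errors="replace") as handle:
        return count_rows(handle)
```

Calling `count_file("20120913-pub-crashdata.csv.gz")` (file name following the pattern used in the command above) would give the six numbers for one row of the table.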


Result:

Date | ARMv7   | ARMv7    | ARMv6   | ARMv6    | ARMv6   | ARMv6
     | crashes | no-GLES2 | crashes | no-GLES2 | Vivante | Vivante
     |         | crashes  |         | crashes  | crashes | no-GLES2
     |         |          |         |          |         | crashes
-----+---------+----------+---------+----------+---------+---------
0813 |     354 |        2 |      36 |       22 |      11 |    11
0814 |     450 |       31 |      19 |       15 |       0 |     0
0815 |     516 |       23 |      20 |        6 |       2 |     2
0816 |     390 |        8 |      19 |        9 |       0 |     0
0817 |     499 |        2 |      22 |       15 |      15 |    15
0818 |     427 |       11 |       9 |        5 |       5 |     5
0819 |     411 |       10 |       7 |        0 |       0 |     0
0820 |     469 |       45 |      12 |        8 |       0 |     0
0821 |     343 |       14 |       2 |        0 |       0 |     0
0822 |     408 |        8 |       3 |        0 |       0 |     0
0823 |     432 |       14 |       2 |        1 |       1 |     1
0824 |     388 |       12 |       1 |        1 |       0 |     0
0825 |     464 |        7 |       5 |        2 |       0 |     0
0826 |     396 |        8 |       1 |        0 |       0 |     0
0827 |     408 |       10 |       2 |        0 |       0 |     0
0828 |     544 |        6 |       4 |        0 |       0 |     0
0829 |     489 |       26 |       0 |        0 |       0 |     0
0830 |    1338 |       56 |       1 |        0 |       0 |     0
0831 |    5534 |      161 |       2 |        0 |       0 |     0
0901 |    6678 |      105 |       2 |        0 |       0 |     0
0902 |    7436 |      100 |       1 |        0 |       0 |     0
0903 |    6874 |      103 |       1 |        0 |       0 |     0
0904 |    6624 |       58 |       4 |        0 |       0 |     0
0905 |    6937 |       90 |       2 |        0 |       0 |     0
0906 |    6958 |      136 |      20 |       17 |       2 |     2
0907 |    7286 |       60 |      51 |       45 |      42 |    42
0908 |    8375 |       87 |     144 |      128 |     125 |   125
0909 |    8160 |       72 |     214 |      192 |     181 |   181
0910 |    7117 |       71 |     103 |       83 |      76 |    76
0911 |    7151 |       77 |     157 |      134 |     121 |   121
0912 |    7321 |       67 |     122 |      107 |      90 |    90
0913 |    7283 |       75 |     158 |      134 |     134 |   134


This shows that since September 7, nearly 90% of ARMv6 crashes are no-OpenGL ES 2 startup crashes, and 90%+ of those are on Vivante GPUs (rk29board and imapx200 hardware).
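As a cross-check, the shares can be recomputed from the ARMv6 columns of the table for 0907 through 0913. A small Python sketch; the choice of that seven-day window is my assumption about which rows the statement refers to.

```python
# (ARMv6 crashes, ARMv6 no-GLES2 crashes, ARMv6 Vivante no-GLES2 crashes)
# copied from the table rows 0907..0913.
rows = {
    "0907": (51, 45, 42),
    "0908": (144, 128, 125),
    "0909": (214, 192, 181),
    "0910": (103, 83, 76),
    "0911": (157, 134, 121),
    "0912": (122, 107, 90),
    "0913": (158, 134, 134),
}

total_crashes = sum(r[0] for r in rows.values())
total_no_gles2 = sum(r[1] for r in rows.values())
total_vivante = sum(r[2] for r in rows.values())

# Share of ARMv6 crashes that are the no-GLES2 abort.
no_gles2_share = 100.0 * total_no_gles2 / total_crashes
# Share of those no-GLES2 aborts that come from rk29board/imapx200 devices.
vivante_share = 100.0 * total_vivante / total_no_gles2

print(f"no-GLES2 share of ARMv6 crashes: {no_gles2_share:.1f}%")
print(f"Vivante share of no-GLES2 crashes: {vivante_share:.1f}%")
```

Over that window the first share comes out a bit under 90% and the second a bit over, so the overall picture holds: the ARMv6 crash volume is dominated by this abort on Vivante hardware.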
Crash Signature: __swrite | libxul.so@0x10cbf11 | png_get_user_transform_ptr ] [@ mozalloc_abort | __swrite | libxul.so@0x10ca9e5 ] [@ mozalloc_abort | __swrite | libxul.so@0x10ca9a5] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ] → __swrite | libxul.so@0x10cbf11 | png_get_user_transform_ptr ] [@ mozalloc_abort | __swrite | libxul.so@0x10ca9e5 ] [@ mozalloc_abort | __swrite | libxul.so@0x10ca9a5] [@ mozalloc_abort | __swrite | libxul.so@0x10cc341] [@ mozalloc_abort ] [@ libmozal…
Whiteboard: [native-crash][startupcrash] → [native-crash][startupcrash][armv6]
URLs listed:

6 	about:home
3 	about:blank
1 	file:///mnt/storage/sdcard/zj/tv.html
1 	http://www.google.ca/search?hl=en&redir_esc=&client=ms-android-samsung&source=an
Blocked as much as I could in bug 775237 for beta 16.
Crash Signature: __swrite | libxul.so@0x10cbf11 | png_get_user_transform_ptr ] [@ mozalloc_abort | __swrite | libxul.so@0x10ca9e5 ] [@ mozalloc_abort | __swrite | libxul.so@0x10ca9a5] [@ mozalloc_abort | __swrite | libxul.so@0x10cc341] [@ mozalloc_abort ] [@ libmozal… → __swrite | libxul.so@0x10cbf11 | png_get_user_transform_ptr ] [@ mozalloc_abort | __swrite | libxul.so@0x10ca9e5 ] [@ mozalloc_abort | __swrite | libxul.so@0x10ca9a5] [@ mozalloc_abort | __swrite | libxul.so@0x10cc341] [@ mozalloc_abort | 0 (deleted)@…
What further action should be taken?  ie how can we move this bug forward?
(In reply to Naoki Hirata :nhirata from comment #32)
> What further action should be taken?  ie how can we move this bug forward?
For ARMv6 devices, the Google Play blocklist has been updated (see bug 775237 comment 52).
For ARMv7 devices, nothing has been done.
(In reply to Scoobidiver from comment #33)
> (In reply to Naoki Hirata :nhirata from comment #32)
> > What further action should be taken?  ie how can we move this bug forward?
> For ARMv6 devices, the Google Play blocklist has been updated (see bug
> 775237 comment 52).
> For ARMv7 devices, nothing has been done.

No? The spreadsheet I provided there wasn't armv6 specific.
Good news, we now have a contact at Vivante (thanks Jeff) and have reported the issue.
(In reply to Benoit Jacob [:bjacob] from comment #34)
> > For ARMv7 devices, nothing has been done.
> No? The spreadsheet I provided there wasn't armv6 specific.
I meant the blocklist in Google Play hasn't been updated for ARMv7 devices AFAIK.

Anyway, this bug is still the #2 top crasher, with many dupes, in 16.0b4, impacting both ARMv6 and ARMv7 devices.
We should check against recent crash stats that the impacted ARMv6 devices are already blocklisted (i.e. that users installed FN from an .apk).
Crash Signature: (deleted)@0x11f911f] [@ mozalloc_abort | 0 (deleted)@0x11f911f] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ] → (deleted)@0x11f911f] [@ mozalloc_abort | 0 (deleted)@0x11f911f] [@ mozalloc_abort | __swrite | libxul.so@0x10cbc69] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ]
Android TV Box is impacted: bp-08268f89-676c-4bac-a28b-5b9852120928.
Indeed, it has an rk29board. Which is "arm", not armv6. You're right, we need to take action on armv7.
(In reply to Scoobidiver from comment #33)
> (In reply to Naoki Hirata :nhirata from comment #32)
> > What further action should be taken?  ie how can we move this bug forward?
> For ARMv6 devices, the Google Play blocklist has been updated (see bug
> 775237 comment 52).
> For ARMv7 devices, nothing has been done.

We've blocked as many devices as were possible on Google Play (both ARMv6 and ARMv7).

Are the majority of remaining crashes ARMv6 devices? If not, perhaps this represents a new regression?
Crash Signature: (deleted)@0x11f911f] [@ mozalloc_abort | 0 (deleted)@0x11f911f] [@ mozalloc_abort | __swrite | libxul.so@0x10cbc69] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ] → (deleted)@0x11f911f] [@ mozalloc_abort | 0 (deleted)@0x11f911f] [@ mozalloc_abort | __swrite | libxul.so@0x10cbc69] [@ mozalloc_abort | __swrite | libxul.so@0x10cc339] [@ mozalloc_abort | CursorWindow (deleted)@0x76e63] [@ mozalloc_abort | encode_mcu…
(In reply to Alex Keybl [:akeybl] from comment #39)
> We've blocked as many devices as were possible on Google Play (both ARMv6
> and ARMv7).
Have you kept the block for Firefox 16 after ARMv6 support was postponed to 17?
Here are impacted ARMv7 devices in the first hours of 16.0: rk29sdk, sdkDemo, AN10G2, MW0821, QBEN QB700, TM-7022, AN8G2, MS-N7Y1, MW0812, TR810, ODYS-Xpress, MS-N7Y1, TR719, LT8029, imito am801, N12, U9GT_S, N12FT_A, ENERGY I724, TV-BOX
Crash Signature: encode_mcu_huff] [@ mozalloc_abort | fast_composite_scaled_bilinear_neon_8888_8_0565_normal_SRC] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ] → encode_mcu_huff] [@ mozalloc_abort | fast_composite_scaled_bilinear_neon_8888_8_0565_normal_SRC] [@ mozalloc_abort | __swrite | libxul.so@0x10cc315] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ]
Crash Signature: encode_mcu_huff] [@ mozalloc_abort | fast_composite_scaled_bilinear_neon_8888_8_0565_normal_SRC] [@ mozalloc_abort | __swrite | libxul.so@0x10cc315] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ] → encode_mcu_huff] [@ mozalloc_abort | fast_composite_scaled_bilinear_neon_8888_8_0565_normal_SRC] [@ mozalloc_abort | __swrite | libxul.so@0x10cc315] [@ mozalloc_abort | __swrite | libxul.so@0x10cc63d] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ]
Crash Signature: encode_mcu_huff] [@ mozalloc_abort | fast_composite_scaled_bilinear_neon_8888_8_0565_normal_SRC] [@ mozalloc_abort | __swrite | libxul.so@0x10cc315] [@ mozalloc_abort | __swrite | libxul.so@0x10cc63d] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ] → encode_mcu_huff] [@ mozalloc_abort | fast_composite_scaled_bilinear_neon_8888_8_0565_normal_SRC] [@ mozalloc_abort | __swrite | libxul.so@0x10cc315] [@ mozalloc_abort | __swrite | libxul.so@0x10cc63d] [@ mozalloc_abort | __swrite | libxul.so@0x1142f15…
Depends on: 803796
Crash Signature: libxul.so@0x1142f15] [@ mozalloc_abort | __swrite | libxul.so@0x1143c21] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ] → libxul.so@0x1142f15] [@ mozalloc_abort | __swrite | libxul.so@0x1143c21] [@ mozalloc_abort | __swrite | libxul.so@0x11453d1] [@ mozalloc_abort | __swrite | libxul.so@0x10cc995] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ]
Crash Signature: libxul.so@0x1142f15] [@ mozalloc_abort | __swrite | libxul.so@0x1143c21] [@ mozalloc_abort | __swrite | libxul.so@0x11453d1] [@ mozalloc_abort | __swrite | libxul.so@0x10cc995] [@ mozalloc_abort ] [@ libmozalloc.so@0x1704 ] → libxul.so@0x1142f15] [@ mozalloc_abort | __swrite | libxul.so@0x1143c21] [@ mozalloc_abort | __swrite | libxul.so@0x11453d1] [@ mozalloc_abort | __swrite | libxul.so@0x10cc995] [@ mozalloc_abort ] [@ mozalloc_abort | nsStringBuffer::Release] [@ libm…
Crash Signature: libmozalloc.so@0x1704 ] → libmozalloc.so@0x1704 ] [@ mozalloc_abort | __swrite | libxul.so@0x114642d]
This *requirement* (no Vivante GPU) seems more constraining than the min 512 MB RAM.
Crash Signature: libmozalloc.so@0x1704 ] [@ mozalloc_abort | __swrite | libxul.so@0x114642d] → mozalloc_abort | nsStringBuffer::Release()] [@ libmozalloc.so@0x1704 ] [@ mozalloc_abort | __swrite | libxul.so@0x114642d]
Summary: crash with eglMakeCurrent failed (EGL error 3000) and abort message: "OpenGL-accelerated layers are a hard requirement on this platform [...]" on Vivante GPUs (e.g. in Rockchip rk29board and in imapx200 chipsets) → crash with eglMakeCurrent failed (EGL error 3000) and abort message: "OpenGL-accelerated layers are a hard requirement on this platform [...]" on Vivante GPUs (e.g. in Rockchip rk29board, imapx200 and Vimicro chipsets)
Question: Currently, 'unknown (rk29sdk)' is marked as excluded in Google Play. A device such as http://tabletyplug.pl/produkty/tablety/plug-101/ has such an identifier (Jellybean/4x Mali-400MP) and a user in #mobile claims all is fine. Should 'rk29sdk' not be excluded here?
I don't remember where exactly, but somewhere we further investigated this and it appeared that while 100% of imapx200 hardware had the bug, only a minority of rk29board has the bug, and a majority of rk29board's are fine. So yes, there may well be room to un-blacklist a lot of rk29board's. Can you do it per-Android-version? The devices with the most recent Android versions will typically have newer drivers, hence are less likely to be affected. I would vouch for un-blacklisting rk29boards with Android 4.1 (JellyBean) or newer; maybe even 4.0 (ICS) or newer.
Can't do it by Android version on the store dashboard; it's all or nothing.
OK, then we could still un-blacklist specific devices that are known to be fine, such as the one mentioned in comment 42, and more generally, any recent device that is known to ship a recent (say 4.1+) version of Android.
Again, this applies only to rk29-based hardware. imapx200 hardware must stay on the blacklist.
(In reply to Benoit Jacob [:bjacob] from comment #45)
> OK, then we could still un-blacklist specific devices that are known to be
> fine, such as the one mentioned in comment 42, and more generally, any
> recent device that is known to ship a recent (say 4.1+) version of Android.

Unfortunately those devices are blocked not by device model or name but by the 'rk29sdk' token; so to offer Firefox to those devices, one would have to flip the toggle on 'rk29sdk'
Oh. Is that a fundamental limitation of the Android app store or is that something we can tweak on our end?
I found the data I had about that. It was an email conversation from 2 months ago. Here's what I had found:

==== BEGIN OLD EMAIL ====

bjacob:~/crash-stats$ zcat 20120930-pub-crashdata.csv.gz | grep rk29board | wc -l
544
bjacob:~/crash-stats$ zcat 20120930-pub-crashdata.csv.gz | grep rk29board | grep 'OpenGL-accelerated layers are a hard requirement on this platform' | wc -l
22
bjacob:~/crash-stats$ zcat 20120930-pub-crashdata.csv.gz | grep imapx200 | wc -l
175
bjacob:~/crash-stats$ zcat 20120930-pub-crashdata.csv.gz | grep imapx200 | grep 'OpenGL-accelerated layers are a hard requirement on this platform' | wc -l
175

This means that rk29board-based devices only occasionally reproduce this
bug (22 crashes out of 544), but imapx200-based devices reproduce ALL
the time (175 out of 175).

==== END OLD EMAIL ====

So before blacklisting, only about 4% (22 of 544) of the crashes we got on rk29board were that crash. That suggests that at least 95% of rk29boards are fine.
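The arithmetic behind those shares, as a short Python sketch using the four counts from the old email (the `crash_share` helper is just for illustration):

```python
def crash_share(matching: int, total: int) -> float:
    """Percentage of crash reports that carry the GL-layers abort message."""
    return 100.0 * matching / total


rk29board_share = crash_share(22, 544)   # rk29board reports with the abort
imapx200_share = crash_share(175, 175)   # imapx200 reports with the abort

print(f"rk29board: {rk29board_share:.1f}% of crashes are this bug")
print(f"imapx200:  {imapx200_share:.1f}% of crashes are this bug")
```

Note that this measures the share of crash reports, not of devices, so reading it as a share of healthy rk29boards is an extrapolation.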

Conclusion: you are right, we should probably un-blacklist rk29board's. Thanks!
Removed 'rk29sdk' from the developer console exclusion list on Firefox/Firefox Beta.
There's a usable stack trace: https://crash-stats.mozilla.com/report/list?signature=mozalloc_abort%28char+const*%29+|+NS_DebugBreak_P+|+nsBaseWidget%3A%3AComputeShouldAccelerate%28bool%29
Crash Signature: mozalloc_abort | nsStringBuffer::Release()] [@ libmozalloc.so@0x1704 ] [@ mozalloc_abort | __swrite | libxul.so@0x114642d] → mozalloc_abort | nsStringBuffer::Release()] [@ libmozalloc.so@0x1704 ] [@ mozalloc_abort | __swrite | libxul.so@0x114642d] [@ mozalloc_abort(char const*) | NS_DebugBreak_P | nsBaseWidget::ComputeShouldAccelerate(bool)]
Crash Signature: mozalloc_abort | nsStringBuffer::Release()] [@ libmozalloc.so@0x1704 ] [@ mozalloc_abort | __swrite | libxul.so@0x114642d] [@ mozalloc_abort(char const*) | NS_DebugBreak_P | nsBaseWidget::ComputeShouldAccelerate(bool)] → mozalloc_abort | nsStringBuffer::Release()] [@ libmozalloc.so@0x1704 ] [@ mozalloc_abort | __swrite | libxul.so@0x114642d] [@ mozalloc_abort | __swrite | libxul.so@0x1255449] [@ mozalloc_abort | __swrite | libxul.so@0x1253dd5] [@ mozalloc_abort | Cra…
"mozalloc_abort | 0 (deleted)@0x11f911f" continues to be the topcrash on ARMv6 for 18.0.
That's a cryptic signature, so how do we know that it relates to this issue?
https://crash-stats.mozilla.com/report/list?signature=mozalloc_abort%20%7C%200%20%28deleted%29%400x11f911f gives you a list of those crashes. As with most abort crashes, I guess there might be abort messages in the data that are way more useful than the signature.
Hm, indeed. I looked at 2 random reports from this query and they are both this exact bug.
The new crash signature since 20.0 contains nsBaseWidget::ComputeShouldAccelerate, but it also covers several other abort messages.
The abort message for this bug says basically that the driver is buggy.

Maybe a solution similar to the one in bug 778175 (prompt at startup) is a good idea even if some users know that their device is theoretically compatible.
Here in Toronto we just received an imapx200 device from China that will hopefully allow us to debug this. We need to root it first, see bug 836374.
Crash Signature: CrashReporter::AppendAppNotesToCrashReport(nsACString_internal const&)] [@ mozalloc_abort(char const*) | NS_DebugBreak_P | nsBaseWidget::ComputeShouldAccelerate(bool)] → CrashReporter::AppendAppNotesToCrashReport(nsACString_internal const&)] [@ mozalloc_abort | arena_dalloc] [@ mozalloc_abort(char const*) | NS_DebugBreak_P | nsBaseWidget::ComputeShouldAccelerate(bool)]
Since the fix of bug 838603 (GL layers force-enabled), it crashes later with the abort message of bug 834243. See bp-8939b83d-7849-4d4e-b0cb-896892130210.
It's no longer a top crasher.
Keywords: topcrash
Here is a crash with a non-buggy stack trace, sometimes with the message, bp-fedb8183-2ac9-4221-a87f-f971c2130305, and other times without, bp-07bb5f99-b258-4699-9dbf-0d8aa2130301.

Frame 	Module 	Signature 	Source
0 	libsurfaceflinger_client.so 	libsurfaceflinger_client.so@0x135e6 	
1 	libsurfaceflinger_client.so 	libsurfaceflinger_client.so@0x14a89 	
2 	libsurfaceflinger_client.so 	libsurfaceflinger_client.so@0x14b11 	
3 	libEGL_VIVANTE.so 	libEGL_VIVANTE.so@0x2c8a 	
4 	libEGL_VIVANTE.so 	libEGL_VIVANTE.so@0xa682 	
5 	dalvik-LinearAlloc (deleted) 	dalvik-LinearAlloc @0x1fc99e 	
6 	libdvm.so 	libdvm.so@0xa345a 	
7 	dalvik-heap (deleted) 	dalvik-heap @0x55b28e 	
8 	libdvm.so 	libdvm.so@0x5f077 	
9 	dalvik-heap (deleted) 	dalvik-heap @0x55b28e 	
10 	libdvm.so 	libdvm.so@0xa7fce 	
11 	libEGL_VIVANTE.so 	libEGL_VIVANTE.so@0x3112 	
12 	libEGL_VIVANTE.so 	libEGL_VIVANTE.so@0xabd2 	
13 	libdvm.so 	libdvm.so@0x50369 	
14 	libEGL.so 	libEGL.so@0x9b5e 	
15 	libEGL.so 	libEGL.so@0x3ce5 	
16 	libEGL.so 	libEGL.so@0x507f 	
17 	libxul.so 	mozilla::gl::GLContextEGL::SwapBuffers 	gfx/gl/GLLibraryEGL.h:298
18 	libxul.so 	mozilla::layers::LayerManagerOGL::Render 	gfx/layers/opengl/LayerManagerOGL.cpp:1072
19 	libxul.so 	mozilla::layers::LayerManagerOGL::EndTransaction 	gfx/layers/opengl/LayerManagerOGL.cpp:700
20 	libxul.so 	mozilla::layers::LayerManagerOGL::EndEmptyTransaction 	gfx/layers/opengl/LayerManagerOGL.cpp:641
21 	libxul.so 	mozilla::layers::CompositorParent::Composite 	gfx/layers/ipc/CompositorParent.cpp:614
22 	libxul.so 	RunnableMethod<mozilla::layers::CompositorParent, void 	ipc/chromium/src/base/tuple.h:383
23 	libxul.so 	MessageLoop::RunTask 	ipc/chromium/src/base/message_loop.cc:333
24 	libxul.so 	MessageLoop::DeferOrRunPendingTask 	ipc/chromium/src/base/message_loop.cc:341
25 	libxul.so 	MessageLoop::DoWork 	ipc/chromium/src/base/message_loop.cc:441
26 	libxul.so 	base::MessagePumpDefault::Run 	ipc/chromium/src/base/message_pump_default.cc:23
27 	libxul.so 	MessageLoop::RunInternal 	ipc/chromium/src/base/message_loop.cc:215
28 	libxul.so 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:208
29 	libxul.so 	base::Thread::ThreadMain 	ipc/chromium/src/base/thread.cc:156
30 	libxul.so 	ThreadFunc 	ipc/chromium/src/base/platform_thread_posix.cc:39
31 	libc.so 	libc.so@0xcf36 	
32 	libc.so 	libc.so@0xca8a 	

More reports at:
https://crash-stats.mozilla.com/report/list?signature=libsurfaceflinger_client.so%400x135e6
Crash Signature: CrashReporter::AppendAppNotesToCrashReport(nsACString_internal const&)] [@ mozalloc_abort | arena_dalloc] [@ mozalloc_abort(char const*) | NS_DebugBreak_P | nsBaseWidget::ComputeShouldAccelerate(bool)] → CrashReporter::AppendAppNotesToCrashReport(nsACString_internal const&)] [@ mozalloc_abort | arena_dalloc] [@ mozalloc_abort(char const*) | NS_DebugBreak_P | nsBaseWidget::ComputeShouldAccelerate(bool)] [@ libsurfaceflinger_client.so@0x135e6 ]
I don't see an abort in 20.0.1 with this abort message.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME
Product: Firefox for Android → Firefox for Android Graveyard