Open Bug 1595821 Opened 1 year ago Updated 4 months ago

Crash in [@ libc.so@0x1a41a | libhwui.so@0x2fe32]

Categories

(GeckoView :: General, defect, P5)

Unspecified
Android
defect

Tracking

(firefox-esr68 affected)

People

(Reporter: marcia, Unassigned)

Details

(Keywords: crash, regression)

Crash Data

This bug is for crash report bp-b9ac1565-293c-431d-9e10-a49850191112.

This is the #21 overall crash, and it seems to have grown in 68.0.2: https://bit.ly/32HzQXg. The crash affects API 27 users, mostly on Motorola devices.

Some correlations:

(99.80% in signature vs 03.56% overall) Module "u:object_r:product_prop:s0" = true
(99.80% in signature vs 05.69% overall) Module "u:object_r:base_os_prop:s0" = true
(99.80% in signature vs 07.24% overall) android_brand = motorola
(99.80% in signature vs 07.27% overall) android_manufacturer = motorola
(98.78% in signature vs 07.19% overall) android_board = msm8953
(100.0% in signature vs 09.24% overall) reason = SIGABRT
(100.0% in signature vs 11.98% overall) android_version = 27 (REL)
(98.78% in signature vs 07.32% overall) adapter_device_id = Adreno (TM) 506 [98.78% vs 14.29% if adapter_vendor_id = Qualcomm]

Top 10 frames of crashing thread:

0 libc.so libc.so@0x1a41a 
1 libhwui.so libhwui.so@0x2fe32 
2 libhwui.so libhwui.so@0x2fe32 
3 libhwui.so libhwui.so@0x2fe32 
4 liblog.so liblog.so@0x655f 
5 libhwui.so libhwui.so@0x9a90d 
6 liblog.so liblog.so@0x657a 
7 dalvik-main space (region space) (deleted) dalvik-main space @0xdb26163 
8 system@framework@boot-core-oj.art system@framework@boot-core-oj.art@0x8c263 
9 dalvik-main space (region space) (deleted) dalvik-main space @0xd99524d 

Priority: -- → P3

A recent crash report is here https://crash-stats.mozilla.org/report/index/b6b7df7d-b2d4-4600-83bf-ae77c0200629

Probably nothing we can do. Seems to be initiated in GL code, if I'm reading the stack correctly (so maybe having a GPU process could help? But that's a lot of work).

Priority: P3 → --
Product: Firefox for Android → GeckoView
Version: Firefox 68 → 68 Branch
Version: 68 Branch → Trunk

Setting to P5 since there doesn't seem to be anything actionable here.

Priority: -- → P5
Crash Signature: [@ libc.so@0x1a41a | libhwui.so@0x2fe32] → [@ libc.so@0x1a41a | libhwui.so@0x2fe32] [@ libc.so@0x22570 | libc.so@0x2254c | liblog.so@0x86b8 ]
Crash Signature: [@ libc.so@0x1a41a | libhwui.so@0x2fe32] [@ libc.so@0x22570 | libc.so@0x2254c | liblog.so@0x86b8 ] → [@ libc.so@0x1a41a | libhwui.so@0x2fe32] [@ libc.so@0x22570 | libc.so@0x2254c | liblog.so@0x86b8 ] [@ libllvm-glnext.so@0x732610 ]

We're seeing similar crashes with WebRender enabled (https://bugzilla.mozilla.org/show_bug.cgi?id=1609191). @jnicol looked at this, and it's an Adreno 505/506 driver bug causing random crashes during shader compilation. He has not been able to identify a workaround.

See Also: → 1609191

There is a user on reddit experiencing the issue - is there anything we can get from them to get some more insight into the issue and to get it fixed in GV?

https://www.reddit.com/r/firefox/comments/i9ist7/having_a_ton_of_problems_with_the_new_update_for/

Flags: needinfo?(ktaeleman)

@Jamie: could you check whether this is the same shader compilation issue, but with WR disabled?

Flags: needinfo?(ktaeleman) → needinfo?(jnicol)

I've asked the user some questions on reddit.

I think that the libllvm-glnext.so signature is out of place amongst the others here. Presumably it is shader compilation, but I doubt the others are. I think the reason for this confusion is that some of the shader compilation crashes we have seen do come from libhwui.

Looking at the crash dumps from the reddit user's crash and some other [@ libc.so@0x1a41a | libhwui.so@0x2fe32 ] ones, I can see glTexImage2D error! GL_OUT_OF_MEMORY (0x505) and GSL MEM ERROR: kgsl_sharedmem_al. (Perhaps in the logcat buffer? I don't really know how to read crash dumps, I just ran strings on the file.) So they might be OOM crashes in libhwui. I'm not sure how much we can do about that, other than reducing our memory usage. Historically, lots of our OOMs have come from poor layerization, so WebRender should help. I can't see any WebRender in that signature, though that might be due to the small Nightly population.

The other signatures, liblog.so and libllvm-glnext.so, do not contain any GL_OUT_OF_MEMORY strings, so they might have a different cause.
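
For anyone wanting to repeat the "ran strings on the file" check across several dumps, here is a rough sketch in Python. It is not a minidump parser: it only pulls out runs of printable ASCII, roughly as strings(1) does, and keeps the ones containing the two markers quoted above. The function names and the 4-character minimum are assumptions made for illustration, and the second marker is matched only as a prefix because the quoted text is truncated.

```python
import re

# Markers taken from the comment above. The second message is truncated in
# the original text, so only its leading part is matched here.
OOM_MARKERS = (b"GL_OUT_OF_MEMORY", b"GSL MEM ERROR")


def extract_strings(data: bytes, min_len: int = 4):
    """Yield runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group()


def find_oom_markers(data: bytes):
    """Return the printable strings in data that contain any OOM marker."""
    return [
        s.decode("ascii")
        for s in extract_strings(data)
        if any(marker in s for marker in OOM_MARKERS)
    ]
```

Reading a downloaded dump in binary mode and passing its bytes to find_oom_markers would list the matching lines. An empty result only means those strings were not captured in that dump, not that memory pressure was absent.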

Flags: needinfo?(jnicol)