Closed
Bug 1318667
Opened 8 years ago
Closed 8 years ago
Crash in libdvm.so@0x849e6
Categories
(Firefox for Android Graveyard :: General, defect, P1)
Tracking
(relnote-firefox 51+, fennec51+, firefox51+ verified, firefox52+ verified, firefox53+ verified, firefox54+ verified)
People
(Reporter: marcia, Assigned: sebastian)
References
Details
(Keywords: crash, regression, topcrash, Whiteboard: [MobileAS])
Crash Data
Attachments
(2 files)
59 bytes, text/x-review-board-request | ahunt: review+, Sylvestre: approval-mozilla-aurora+
5.66 KB, patch | Sylvestre: approval-mozilla-beta+, Sylvestre: approval-mozilla-release+
This bug was filed from the Socorro interface and is
report bp-793d8ae8-e984-4d8d-9baa-d7d0e2161118.
=============================================================
This appeared as a new startup crash in the first beta: http://bit.ly/2fM7oMI
All the devices seem to be TR10CS1 which is a tablet.
Comment 1•8 years ago
|
||
Hi Sebastian,
Can you help take a look at this one? It crashed in 51.0b1.
Flags: needinfo?(s.kaspari)
Reporter | ||
Comment 2•8 years ago
|
||
Adding a few signatures, with the added volume this is becoming the top overall crash on 51.0b1.
Crash Signature: [@ libdvm.so@0x849e6] → [@ libdvm.so@0x849e6]
[@ libdvm.so@0x84d96]
[@ libdvm.so@0x84c86]
[@ libdvm.so@0x849d6]
Keywords: topcrash
Assignee | ||
Comment 3•8 years ago
|
||
This seems to be a crash in native code. I'll flag it for triage.
tracking-fennec: --- → ?
Flags: needinfo?(s.kaspari)
Comment 4•8 years ago
|
||
This is crash number 2, 3, 4, and 7 of the top 10 crashes.
Crash Signature: [@ libdvm.so@0x849e6]
[@ libdvm.so@0x84d96]
[@ libdvm.so@0x84c86]
[@ libdvm.so@0x849d6] → [@ libdvm.so@0x849e6]
[@ libdvm.so@0x84d96]
[@ libdvm.so@0x84c86]
[@ libdvm.so@0x849d6]
[@ libdvm.so@0x84736]
[@ libdvm.so@0x3f686]
Comment 5•8 years ago
|
||
[Tracking Requested - why for this release]: top crash
tracking-firefox51:
--- → ?
Hello, I have a device which produces such crashes.
https://crash-stats.mozilla.com/report/index/b4912898-fd19-4737-9ae8-a48fc2161203
https://crash-stats.mozilla.com/report/index/22e9e03a-6619-4444-be28-d6d602161203
Dell Venue 7, Android 4.4.2, Intel Atom based.
Comment 8•8 years ago
|
||
Hi :sebastian,
Can you help find an owner for this?
Flags: needinfo?(s.kaspari)
This looks like it could be media related. Blake, could you folks take a look?
tracking-fennec: ? → 51+
Flags: needinfo?(s.kaspari) → needinfo?(bwu)
Comment 10•8 years ago
|
||
I don't recall libdvm.so being media related. I could be wrong.
John,
Could you have a look?
Flags: needinfo?(bwu) → needinfo?(jolin)
Comment 11•8 years ago
|
||
IIRC, libdvm is Dalvik Java VM. Do we have bugreport or logcat dump for these crashes?
Flags: needinfo?(jolin)
Yeah, libdvm is dalvik. I saw a few logcats that indicated there was some media stuff going on, but looking at some others now I don't see that, so maybe it's unrelated. Regardless, I don't really see anything actionable since the trace is so useless.
Comment 13•8 years ago
|
||
Mark 51 won't fix as there is nothing actionable now.
Reporter | ||
Comment 14•8 years ago
|
||
One of these signatures ([@ libdvm.so@0x849e6 ]) spiked slightly in the last Fennec beta, and there are now about 1500 crashes (202 installs) in the 51 release build of Fennec.
The crashing device in this signature is TR10CS1, which appears to be some kind of educational tablet.
Reporter | ||
Comment 15•8 years ago
|
||
Looks as if most of the other signatures map to different types of tablets, all running KitKat (API 19). Examples:
libdvm.so@0x849d6: Asus
libdvm.so@0x84c86: Dell
Crash Signature: [@ libdvm.so@0x849e6]
[@ libdvm.so@0x84d96]
[@ libdvm.so@0x84c86]
[@ libdvm.so@0x849d6]
[@ libdvm.so@0x84736]
[@ libdvm.so@0x3f686] → [@ libdvm.so@0x849e6]
[@ libdvm.so@0x84d96]
[@ libdvm.so@0x84c86]
[@ libdvm.so@0x849d6]
[@ libdvm.so@0x84736]
[@ libdvm.so@0x3f686]
[@ libdvm.so@0x85a56]
Updated•8 years ago
|
Keywords: regression
Reporter | ||
Comment 16•8 years ago
|
||
snorp: I know you noted in Comment 12 that this isn't actionable - any other ideas as to what we could do to investigate further? Volume is fairly high in early 51 data. Some of the comments in the first signature do mention videos.
Also is there any significance to the fact that it started in beta and wasn't present before that?
Flags: needinfo?(snorp)
Comment 17•8 years ago
|
||
These are all x86 devices. They don't exist in great numbers for Aurora & Nightly. TR10CS1 seems to be an educational tablet computer with a keyboard distributed to Venezuelan students. The Dell Venue devices might be the easiest to come by in the NA market. It is doubtful that these will be available as new, they were produced in 2014.
Reporter | ||
Updated•8 years ago
|
Crash Signature: [@ libdvm.so@0x849e6]
[@ libdvm.so@0x84d96]
[@ libdvm.so@0x84c86]
[@ libdvm.so@0x849d6]
[@ libdvm.so@0x84736]
[@ libdvm.so@0x3f686]
[@ libdvm.so@0x85a56] → [@ libdvm.so@0x849e6]
[@ libdvm.so@0x84d96]
[@ libdvm.so@0x84c86]
[@ libdvm.so@0x849d6]
[@ libdvm.so@0x84736]
[@ libdvm.so@0x3f686]
[@ libdvm.so@0x85a56]
[@ libdvm.so@0x85c86]
[@ libdvm.so@0x84a46]
Comment 18•8 years ago
|
||
These signatures are accounting for 35% of all reported crashes on Fennec 51 right now.
Comment 19•8 years ago
|
||
I temporarily halted 51 fennec staged rollout (which was set at 10% of all users) this morning while we investigated. It is now re-enabled.
The specific device most heavily affected is TR10CS1_19, seen here: https://crash-stats.mozilla.com/search/?signature=%5Elibdvm.so%400x8&android_model=%3DTR10CS1&android_model=%3DTR10CS1&version=51.0&date=%3E%3D2016-04-01T00%3A00%3A00.000Z&date=%3C2017-01-26T16%3A56%3A00.000Z&_sort=-date&_facets=signature&_facets=android_brand&_facets=android_model&_facets=android_device&_facets=android_hardware&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-android_device
I added TR10CS1_19 (just one of many TR10CS1 devices on the market) to the list of excluded devices in the Play Store. I am fairly sure this means that until we fix the crashing issue and un-exclude that device, the TR10CS1 users will keep the version they have, but won't be able to update, and won't be able to install Firefox from a new download.
Comment 20•8 years ago
|
||
This search should include all the problematic devices. Unfortunately ECS is an OEM so the devices show up from many different brands. The search looks for crashes from Firefox for Android where the signature starts with libdvm.so@0x and the device architecture is x86.
https://crash-stats.mozilla.com/search/?signature=%5Elibdvm.so%400x&cpu_arch=x86&product=FennecAndroid&_sort=-date&_facets=android_device&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=cpu_arch#facet-android_device
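For illustration, the core of that search URL can be rebuilt from its three filters. This is a minimal sketch, assuming only what the link above shows: the `signature`, `cpu_arch` and `product` query parameters, with the `^` prefix standing for "starts with". The class and method names (`CrashSearchUrl`, `x86LibdvmSearch`) are hypothetical, and the sort, facet and column parameters from the original link are left out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class CrashSearchUrl {
    // Base of the search page, taken from the link in the comment above.
    static final String BASE = "https://crash-stats.mozilla.com/search/";

    // Percent-encodes one query component ("^" becomes %5E, "@" becomes %40).
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every JVM, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    // Joins the parameters in insertion order into a query string.
    static String build(Map<String, String> params) {
        StringBuilder sb = new StringBuilder(BASE).append('?');
        boolean first = true;
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (!first) {
                sb.append('&');
            }
            sb.append(encode(entry.getKey())).append('=').append(encode(entry.getValue()));
            first = false;
        }
        return sb.toString();
    }

    // Hypothetical helper: the three filters described in the comment above.
    static String x86LibdvmSearch() {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("signature", "^libdvm.so@0x"); // signature starts with libdvm.so@0x
        params.put("cpu_arch", "x86");            // x86 devices only
        params.put("product", "FennecAndroid");   // Firefox for Android
        return build(params);
    }
}
```

Filtering on architecture rather than on brand or model is what lets one query cover the ECS-built devices that show up under many different names.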
Comment 21•8 years ago
|
||
I looked at a few of the crashes and they all seem to happen on the "GeckoIconTask" thread. GeckoIconTask landed in bug 1300543 for 51, so that makes sense. Maybe we can spin a dot release where we disable GeckoIconTask just for this device.
Assignee: nobody → s.kaspari
Flags: needinfo?(snorp) → needinfo?(s.kaspari)
Comment 22•8 years ago
|
||
BTW I think "GeckoIconTask" is only for loading favicons, so disabling it involves some loss of functionality but it's not significant IMO.
Comment 23•8 years ago
|
||
This may show up in beta 52 and if so maybe we can fix it in time for 52 release.
status-firefox52:
--- → ?
tracking-firefox52:
--- → +
Assignee | ||
Comment 24•8 years ago
|
||
(In reply to Jim Chen [:jchen] [:darchons] from comment #21)
> I looked at a few of the crashes and they all seem to happen on the
> "GeckoIconTask" thread.
Where do you see that it's related to GeckoIconTask? I can't find it in the crash reports. This is plain Java code. If it's crashing libdvm.so then something pretty weird is going on.
(In reply to Jim Chen [:jchen] [:darchons] from comment #22)
> BTW I think "GeckoIconTask" is only for loading favicons, so disabling it
> involves some loss of functionality but it's not significant IMO.
That's correct. That could be an emergency quick fix but isn't really a solution I would like to ship.
Flags: needinfo?(s.kaspari)
Comment 25•8 years ago
|
||
(In reply to Sebastian Kaspari (:sebastian) from comment #24)
> (In reply to Jim Chen [:jchen] [:darchons] from comment #21)
> > I looked at a few of the crashes and they all seem to happen on the
> > "GeckoIconTask" thread.
>
> Where do you see that it's related to GeckoIconTask? I can't find it in the
> crash reports. This is plain Java code. If it's crashing libdvm.so then
> something pretty weird is going on.
I looked at the binary minidump files from several crash reports, and the crashing threads were all GeckoIconTask.
Seems like some kind of dalvik bug we're hitting. There are definitely different things we can try (e.g. not use ThreadPoolExecutor), but we have to act quickly.
Reporter | ||
Updated•8 years ago
|
Crash Signature: [@ libdvm.so@0x849e6]
[@ libdvm.so@0x84d96]
[@ libdvm.so@0x84c86]
[@ libdvm.so@0x849d6]
[@ libdvm.so@0x84736]
[@ libdvm.so@0x3f686]
[@ libdvm.so@0x85a56]
[@ libdvm.so@0x85c86]
[@ libdvm.so@0x84a46] → [@ libdvm.so@0x849e6]
[@ libdvm.so@0x84d96]
[@ libdvm.so@0x84c86]
[@ libdvm.so@0x849d6]
[@ libdvm.so@0x84736]
[@ libdvm.so@0x3f686]
[@ libdvm.so@0x85a56]
[@ libdvm.so@0x85c86]
[@ libdvm.so@0x84a46]
[@ dalvik-zygote (deleted)@0x14bf]
Comment 26•8 years ago
|
||
Still the top crash for 51.0.1 for Fennec. I'm blocking updates (excluding in Play Store) for a few more of the devices most severely affected by the startup crash:
Dell Venue 8 - yellowtail
ZTE V975 - redhookbay
Acer B1-730HD - vespa
Acer A1-830 - ducati
Asus Transformer Pad (TF103CG) - K018
Assignee | ||
Comment 27•8 years ago
|
||
I got one of those devices from eBay and will debug this as soon as it arrives. In the meantime, excluding those devices on Google Play might be our best option. Disabling the IconTask for them is a bit cumbersome if we do not know all affected devices. Disabling this code for all users is not an option.
Comment 28•8 years ago
|
||
I added these tablets to the excluded devices:
ASUS Transformer Pad K018, K017, K014, K01A, K00g
Dell Venue 8 - yellowtail
Once we have a fix, we should try taking them off the list so folks can update.
Assignee | ||
Comment 29•8 years ago
|
||
I received the device today and the "good" news is that it's crashing permanently. I'll try to debug this today.
Assignee | ||
Comment 30•8 years ago
|
||
Took quite a while to get this reproducing with my own builds. So far only the release version crashed, but not Nightly, Aurora, Beta and local builds. However I can now reproduce it using the release branch and the official release branding:
The palette support library is the culprit. We use it to extract a dominant color from icons (we use the color in the UI). Coincidentally our release Firefox icon is triggering this crash in the support library. That's why it's mostly only happening in the release version and that's also the reason why it's happening so often (and on tablets): We load the icon to display it on the tab strip etc when loading about:home. But there's no reason why other icons shouldn't trigger that too.
51.0 is the first release where we switched to the palette library because it's significantly faster than our custom implementation (that we still have in the code base for other reasons). A quick fix is to switch back to our own implementation on x86 devices (and keep the faster library on other devices). One more reason to update the support library soon (bug 1333704).
I'll prepare a patch - should be slow and risk should be low.
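For illustration, the quick fix described above (custom implementation on x86, palette library everywhere else) could be sketched roughly as follows. This is not the landed patch. The class, enum and method names are hypothetical, and the ABI string is passed in as a plain parameter so the sketch runs on any JVM. On Android it would presumably come from the platform, for example `android.os.Build.CPU_ABI`, which is an assumption here.

```java
import java.util.Locale;

final class ColorExtractorChoice {
    /** Hypothetical labels for the two code paths named in the comment above. */
    enum Strategy { PALETTE_LIBRARY, CUSTOM_IMPLEMENTATION }

    /** Returns true for x86-family ABI strings such as "x86" or "x86_64". */
    static boolean isX86(String abi) {
        return abi != null && abi.toLowerCase(Locale.ROOT).startsWith("x86");
    }

    /**
     * Picks the color extraction strategy for a device ABI.
     * x86 devices fall back to the custom implementation; every other
     * ABI (or an unknown one) keeps the faster palette library.
     */
    static Strategy choose(String abi) {
        return isX86(abi) ? Strategy.CUSTOM_IMPLEMENTATION : Strategy.PALETTE_LIBRARY;
    }
}
```

A check like this is only as good as the ABI the device reports, so a device that identifies itself as ARM would still take the palette path.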
Comment hidden (mozreview-request) |
Assignee | ||
Comment 32•8 years ago
|
||
I need to create a separate patch for the release branch (and maybe others).
Comment 33•8 years ago
|
||
mozreview-review |
Comment on attachment 8833355 [details]
Bug 1318667 - Do not use palette library on x86 devices (Use BitmapUtils.getDominantColor()).
https://reviewboard.mozilla.org/r/109602/#review110668
Eughh. (I wonder if newer versions fix this?)
Attachment #8833355 -
Flags: review?(ahunt) → review+
Assignee | ||
Comment 34•8 years ago
|
||
This is the patch for the release branch.
Assignee | ||
Comment 35•8 years ago
|
||
Comment on attachment 8833355 [details]
Bug 1318667 - Do not use palette library on x86 devices (Use BitmapUtils.getDominantColor()).
This one applies to Aurora too (I'll add the details for the request later after adding all the patches).
Attachment #8833355 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 36•8 years ago
|
||
Comment on attachment 8833360 [details] [diff] [review]
1318667-Release.patch
This one applies to Beta and Release.
Attachment #8833360 -
Flags: approval-mozilla-release?
Attachment #8833360 -
Flags: approval-mozilla-beta?
Comment 37•8 years ago
|
||
Pushed by s.kaspari@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/26b85661155e
Do not use palette library on x86 devices (Use BitmapUtils.getDominantColor()). r=ahunt
Assignee | ||
Comment 38•8 years ago
|
||
(Request for Aurora, Beta and Release uplift)
Approval Request Comment
[Feature/Bug causing the regression]: In Firefox 51.0 we refactored the icon code and decided to switch to the palette library for color extraction (Faster than our own implementation). The switch was done in bug 1300569.
[User impact if declined]: We see a bunch of crashes on x86 devices. This doesn't happen for all icons but at least for our Firefox release icon, which gets loaded quite often. So for those users it's basically a crash loop.
[Is this code covered by automated tests?]: Yes, the new icon code is covered. But this crash is coming from inside Android's support library.
[Has the fix been verified in Nightly?]: Nightly is not directly affected - or at least we do not know which other website icon might trigger this. So far I manually verified the patch with a custom release build on a TF103CG.
[Needs manual test from QE? If yes, steps to reproduce]: Not necessarily. But the steps are: Get one of the affected devices. Install the release version of 51.0. Load a website, open a new tab.
[List of other uplifts needed for the feature/fix]: -
[Is the change risky?]: The patch itself is not risky.
[Why is the change risky/not risky?]: On x86 devices (that's the "smallest" group I can identify of impacted devices) we do not use the palette library anymore with that patch. Instead we fall back to our custom color extraction code. This code has been in place in previous releases and has no known crashes.
[String changes made/needed]: None
Comment 39•8 years ago
|
||
For the current release, might it not be better, and more in line with normal procedures, to just revert the patch from bug 1300569, and take this fix for version 52 and forward?
Assignee | ||
Comment 40•8 years ago
|
||
You'd need to revert the patch from bug 1300569 and (or only) patch 6 from bug 1300543 (this one actually replaces the code in the icon pipeline). And the code was modified since then so we'd need a custom patch again anyways. Not sure if we gain much by that.
Comment 41•8 years ago
|
||
(In reply to Sebastian Kaspari (:sebastian) from comment #40)
> You'd need to revert the patch from bug 1300569 and (or only) patch 6 from
> bug 1300543 (this one actually replaces the code in the icon pipeline). And
> the code was modified since then so we'd need a custom patch again anyways.
> Not sure if we gain much by that.
OK if this is simpler than the backout. I was just saying our policy is backout, and it is way past the release date for this with still hardly anyone running version 51 on Android devices, from what I can see. If this code avoids the issue and is easier than the backout, I am all for it.
Assignee | ||
Comment 42•8 years ago
|
||
Okay, I just verified. Just backing this one out is an option for release too:
https://hg.mozilla.org/mozilla-central/rev/4e9bf0dca65a
This works without conflicts and just revokes the change in the pipeline (we still include the library in the build though, we just don't use it) -> No crash.
Comment 43•8 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox54:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 54
Comment 44•8 years ago
|
||
Comment on attachment 8833360 [details] [diff] [review]
1318667-Release.patch
OK, let's take it on all branches. I also prefer the patch to the backout.
By the way, maybe we should write a non-regression test here?
Attachment #8833360 -
Flags: approval-mozilla-release?
Attachment #8833360 -
Flags: approval-mozilla-release+
Attachment #8833360 -
Flags: approval-mozilla-beta?
Attachment #8833360 -
Flags: approval-mozilla-beta+
Updated•8 years ago
|
Attachment #8833355 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 45•8 years ago
|
||
Possibly related to this g-bug (same Android version 19=4.4.0-4.4.4, on Intel devices):
https://code.google.com/p/android/issues/detail?id=174522
They claim to have added code to catch Exceptions in the palette library in support library 23.1.0 (we're on 23.4.0), so it could be completely unrelated.
Comment 46•8 years ago
|
||
bugherder uplift |
Comment 47•8 years ago
|
||
(In reply to Gerry Chang [:gchang] from comment #13)
> Mark 51 won't fix as there is nothing actionable now.
Now that we have something actionable shouldn't this be changed to affected?
Comment 48•8 years ago
|
||
I literally mid-aired your comment trying to change it.
Comment 49•8 years ago
|
||
bugherder uplift |
Comment 50•8 years ago
|
||
bugherder uplift |
Comment 51•8 years ago
|
||
Added (probably by Gerry) to the release notes with
"Fix a top crash caused by Android library (Palette) on some x86 devices (Bug 1318667)"
relnote-firefox:
--- → 51+
Assignee | ||
Comment 52•8 years ago
|
||
I just tested the release APK on my Asus tablet and something weird is happening: The x86 APK is working and does not crash anymore. However, I can install the ARM APK too. It doesn't show our "wrong architecture" toast; it just starts normally, and then crashes with the same signature. It looks like the tablet can not only run ARM APKs in some compatibility mode, it also pretends to be an ARM device, so our checks do not work. I'll file a separate bug for that. However, the consequence for this bug is that we are still going to see this crash if the wrong APK is installed on a device (hopefully this does not happen via Google Play).
Comment 54•8 years ago
|
||
We have verification of the fix from the duplicate.
Status: RESOLVED → VERIFIED
Assignee | ||
Updated•8 years ago
|
Iteration: --- → 1.15
Priority: -- → P1
Whiteboard: [MobileAS]
Updated•4 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard