Closed Bug 1412853 Opened 7 years ago Closed 5 years ago

Crash in compiler_builtins::int::udiv::__udivsi3::__aeabi_uidiv

Categories

(Core :: Audio/Video: cubeb, defect, P3)

Unspecified
Android
defect

Tracking

()

RESOLVED DUPLICATE of bug 1410456
mozilla60
Tracking Status
firefox57 --- unaffected
firefox58 + wontfix
firefox59 --- wontfix
firefox60 --- fixed

People

(Reporter: marcia, Assigned: achronop)

Details

(Keywords: crash, regression)

Crash Data

This bug was filed from the Socorro interface and is report bp-1a51c3bd-cd2d-439b-9611-bec1b0171021.
=============================================================

Seen while looking at nightly crash stats: http://bit.ly/2lpXkQ4. Crashes appear to have started with the 20171014100451 build. So far: 22 crashes across 14 installs, on a variety of devices.
media/libcubeb/src/cubeb_opensl.c hasn't changed for a long time, so not sure why we'd suddenly see a spike in crashes there unless we've enabled/changed some higher level feature that's triggering this.  A lot of the crash reports don't have any stack below the __aeabi_uidiv entry (which is resolving into Rust's stdlib, which seems wrong, but maybe common/inlined code?), but they're all on MediaPlayback threads.

I assume the SIGILL in __aeabi_uidiv would be a div-by-0?  For the crashes that do have stacks, if you trust the opensl_destroy_recorder entry, there's no obvious division in that code path and the crash report points to an assert (which, from a quick look at bionic's implementation, would crash with the usual SIGABRT).  If we're really crashing at the assert, it'd be pretty simple to rearrange opensl_destroy_recorder to free the buffers before reporting the error and then treat this as a non-fatal error.
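
A rough sketch of that rearrangement, for illustration only. The names here (recorder_state, recorder_setup, recorder_destroy) are hypothetical stand-ins and not the actual cubeb_opensl.c code, and the OS-level result is simulated with a plain int. The idea is just: remember the error, free the buffers unconditionally, then report the error instead of asserting on it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_BUFFERS 4

/* Hypothetical stand-in for the recorder-related stream state. */
struct recorder_state {
  void * buffers[NUM_BUFFERS];
  int clear_result; /* simulated result of the OS-level queue clear; 0 == success */
};

/* Setup helper for the sketch: allocate the buffers and pick the simulated result. */
void
recorder_setup(struct recorder_state * rec, int clear_result)
{
  assert(rec != NULL); /* a NULL here is a caller bug, not a runtime condition */
  for (int i = 0; i < NUM_BUFFERS; i++) {
    rec->buffers[i] = malloc(64);
  }
  rec->clear_result = clear_result;
}

/* Tear down the recorder. Buffers are always freed; a failed clear is
 * reported to the caller (and logged) rather than asserted on.
 * Returns 0 on success, or the non-zero simulated error code. */
int
recorder_destroy(struct recorder_state * rec)
{
  if (rec == NULL) {
    return 0; /* playback-only stream: nothing to tear down */
  }

  int rv = rec->clear_result; /* remember the error, do not abort yet */

  for (int i = 0; i < NUM_BUFFERS; i++) {
    free(rec->buffers[i]);
    rec->buffers[i] = NULL;
  }

  if (rv != 0) {
    fprintf(stderr, "recorder teardown reported error %d (non-fatal)\n", rv);
  }
  return rv;
}
```

With this shape a failing teardown would leak nothing and the caller decides whether the error matters. It would not explain a SIGILL, though, so it only helps if the assert really is where we crash.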

Maybe Alex has a better guess than I do?
Flags: needinfo?(achronop)
The strange thing is that the recorder is non-NULL in a playback scenario. The recorder is used for input, so I would expect a non-NULL value in a duplex scenario such as a WebRTC call, not in a playback scenario. For playback, the recorder is initialized to NULL at the beginning of opensl_stream_init and does not change throughout the session.

If we want to avoid the assert, I could add a different check before calling opensl_destroy_recorder, but I don't think that's the root cause here.
Flags: needinfo?(achronop)
I'm making this a low P1; it's Nightly on Android so usage is fairly low, but this could blow up on beta and release. Kinetik, would you be the right person to own this?
Rank: 8
Flags: needinfo?(kinetik)
Priority: -- → P1
Android crashes only, on 58

The only division in OpenCubeb appears to be "/ 1000", so I doubt a divide-by-0.
Also, I see no direct calls to SaferMultDiv(), nor anything that calls it.

And the library routine that's crashing is in Rust??
compiler_builtins::int::udiv::__udivsi3::__aeabi_uidiv 	src/libcompiler_builtins/src/lib.rs:40
That's the weird part

Snorp, any ideas?
Flags: needinfo?(snorp)
(In reply to Andreas Pehrson [:pehrsons] from comment #3)
> I'm making this a low P1; it's Nightly on Android so usage is fairly low,
> this could blow up on beta and release. Kinetik, would you be the right
> person to own this?

Not really, I just took a quick look while scanning through recent bugs in the libcubeb component.  Probably best to ask Anthony to find an owner for it.
Flags: needinfo?(kinetik)
(In reply to Matthew Gregan [:kinetik] from comment #5)
> Probably best to ask Anthony to find an owner for it.
Flags: needinfo?(ajones)
Blake - this might be something that Chunmin and John can collaborate on.

James - given that the code hasn't changed recently, have there been any build changes that could affect this code?
Flags: needinfo?(ajones) → needinfo?(bwu)
These are all the commits for that build-id:

386175:386117,386174   a31334a65a1c   2017-10-13 23:37 +0200   archaeopteryx
  merge mozilla-inbound to mozilla-central. r=merge a=merge

386176   e89e0285e766   2017-10-13 23:34 -0700   ffxbld
  No bug, Automated HSTS preload list update from host bld-linux64-spot-329 - a=hsts-update

386177   7192c4630797   2017-10-13 23:34 -0700   ffxbld
  No bug, Automated HPKP preload list update from host bld-linux64-spot-329 - a=hpkp-update

386178:386116   2ff3b62c2b14   2017-10-13 10:48 -0500   emilio
  servo: Merge #18863 - style: Share code between Gecko and Servo for DOM APIs (from emilio:dom-api-dont-repeat); r=jdm

386179   e2a919fabb6a   2017-10-13 09:47 -0700   continuation
  Bug 1408459 - Remove unused declaration of NS_MeanAndStdDev(). r=erahm

386180   8240c498e88c   2017-10-13 18:16 +0200   mail
  Bug 1407437 - Unskip test_security.py and test_ev_certificate.py. r=jmaher

386181   e7d01449105f   2017-10-13 12:51 -0400   kgupta
  Bug 1407213 - Update webrender to commit a624aa6d3b6006c510c8b14026567af4ac545d2f. r=jrmuizel

386182   f24e5fe2d767   2017-10-13 12:57 -0400   kgupta
  Bug 1407213 - Update Cargo lockfiles and re-vendor rust libraries. r=jrmuizel

386183   e748a2c59145   2017-10-13 12:58 -0400   a
  Bug 1407213 - Update bindings for changes in WR PR 1853. r=jrmuizel

386184   72d52134e375   2017-10-13 12:58 -0400   kgupta
  Bug 1407213 - Update reftest listings for various webrender changes. r=jrmuizel

386185   74a760610fd0   2017-09-25 11:19 +0800   mtseng
  Bug 1403459 - Passing transform-style from display item directly. r=kats

386186   e1863419e38b   2017-10-13 13:30 -0500   mozilla
  Bug 1406164 - We're bringing eBay back. r=flod

386187   c4b56772f649   2017-10-13 20:45 +0200   archaeopteryx
  Backed out changeset e2a919fabb6a (bug 1408459) for build bustage on Android 4.2 x86 opt: invalid conversion. r=backout

386188   c38729d10244   2017-10-12 18:15 -0400   tchiovoloni
  Bug 1408180 - Ensure LoginRec.toString doesn't contain the password. r=kitcambridge

386189   6077a8b545d6   2017-10-13 12:26 -0500   simon
  servo: Merge #18854 - Make optional the usage of some unstable features (from servo:servo-unstable-feature); r=nox

386190   fe87c31e80c4   2017-10-13 15:03 -0400   kgupta
  Bug 1408261 - Update profiler tracing stuff to use the new macro. r=mstange

386191   3a8f6746185e   2017-10-10 09:47 -0600   mozilla
  Bug 1408145: Report comm-central revision to treeherder when building from a comm branch; r=dustin

386192   0cd1372b3fa9   2017-10-13 12:43 -0700   bwerth
  Bug 1358299 Part 1: Stop collecting data for BOX_ALIGN_PROPS_IN_BLOCKS_FLAG probe. r=chutten

386193   83ae3528255f   2017-10-13 12:44 -0700   bwerth
  Bug 1358299 Part 2: Remove the histogram definition for BOX_ALIGN_PROPS_IN_BLOCKS_FLAG. r=chutten

386194   68ccc9cb5de0   2017-10-13 17:59 +0100   ato
  Bug 1408454 - Move error.pprint to format.pprint. r=whimboo

386195   35866a88d567   2017-10-13 11:48 -0700   nalexander
  Bug 1407872 - Use Python yes-like pipe for --no-interactive in |mach bootstrap|. r=gbrown

386196   366262985fa8   2017-10-02 13:27 -0700   nalexander
  Bug 1352599 - Part 1: Add a Proguard toolchain task for Android builds. r=froydnj

386197   4f50c65e62c3   2017-10-12 14:28 -0700   nalexander
  Bug 1352599 - Part 2: Add PROGUARD_JAR configure option. r=chmanchester

386198   6fa6873c200c   2017-10-13 14:11 -0500   emilio
  servo: Merge #18864 - style: Reformat a few signatures to follow a consistent style (from emilio:reformat); r=jdm

386199   20c73eec1bd2   2017-10-12 20:06 +0200   jh+bugzilla
  Bug 1407835 - Don't keep BrowserApp unnecessarily alive through sScreenOrientationDelegate. r=nalexander

386200   a0f88980b79c   2017-10-13 19:13 +0100   moz-ian
  Bug 1399429 - Properly determine if the content window is private for the contextMenu. r=Felipe

386201   be1d227ae5b3   2017-10-13 07:19 +0900   hikezoe
  Bug 1399314 - Move getBindingElementAndPseudo into shared/inspector/css-logic.js. r=ochameau

386202   eaa699653dc4   2017-10-13 07:19 +0900   hikezoe
  Bug 1399314 - Introdue CssLogic.getCSSStyleRules to get style rules for ::before and ::after pseudo elements handy. r=bgrins

386203   47f85792a78c   2017-10-04 13:19 -0400   botond
  Bug 1402995 - Reflow scrollbars after partial reflow of scroll frame. r=tnikkel

386204   2e9d07abdc13   2017-10-13 11:59 +1100   me
  Bug 1407843 part 1 - Introduce a global-level AtomArray type alias. r=froydnj

386205   39713e51fe82   2017-10-13 12:54 +1100   me
  Bug 1407843 part 2 - Move tree pseudo matching code from nsTreeBodyFrame into nsCSSRuleProcessor. r=heycam

386206   408f93434478   2017-10-13 13:50 +1100   me
  Bug 1407843 part 3 - Remove nsICSSPseudoComparator. r=heycam

386207   a01135b451a4   2017-10-13 13:22 +0900   mh+mozilla
  Bug 1408277 - Add a toolchain job for clang 5.0. r=froydnj

386208   bbc2354236d3   2017-10-05 15:55 -0700   nalexander
  Bug 1405412 - Pre: Allow toolchain task images to not cache tc-vcs. r=dustin

386209   d4d6cad20606   2017-10-03 11:45 -0700   nalexander
  Bug 1405412 - Migrate Android SDK to android-sdk-linux toolchain task. r=dustin

386210   27213145f391   2017-10-05 16:57 -0700   nalexander
  Bug 1405412 - Post: Remove JDK repackaging script. r=dustin

386211:386210,386175   1199e3c98d9f   2017-10-14 00:02 +0200   archaeopteryx
  merge mozilla-central to autoland. r=merge a=merge

386212   247bb54b615f   2017-10-12 17:31 -0400   bsilverberg
  Bug 1408099 - Fix ExtensionPreferencesManager.getLevelOfControl to deal with undefined settings, r=aswan

386213   c844b986015b   2017-10-13 18:33 +0900   mh+mozilla
  Bug 1408317 - Take endianness into consideration when looking for rust target. r=froydnj

386214   f9020d7696a9   2017-10-13 09:15 +0900   mh+mozilla
  Bug 1408224 - Avoid confusing errors in automation logs when failing to purge toolchain cache. r=mshal

386215   17fbd19e4360   2017-10-06 13:27 -0700   kyle
  Bug 1406224 - Remove nsIDOMHTMLImageElement; r=bz

386216   3bbb5071bd20   2017-10-04 14:39 -0700   ksteuber
  Bug 1394851 - downloads.download API should default to use Firefox's "Save As" pref r=kmag

386217   6d8203bb0816   2017-10-13 14:11 -0700   giles
  Bug 1408565 - mozboot: Upgrade to rust 1.21.0. r=nalexander

386218:386177,386217   0dd64d5842e8   2017-10-14 11:39 +0200   archaeopteryx
  merge autoland to mozilla-central. r=merge a=merge

386219:386174   b109053be6ac   2017-10-13 14:26 -0400   bzbarsky
  Bug 1397815.  Add memory reporting for button frame's mInnerFocusStyle and other additional style contexts.  r=emilio

386220   d2ab84fce45f   2017-10-13 16:01 -0400   ryanvm
  Bug 1407861 - Update WebVR test expectations now that Mac is enabled by default. r=dmu, r=bz

386221   19cc8ad9bfba   2017-10-09 20:36 -0400   ehsan
  Bug 1407309 - Part 1: Rewrite HTMLEditor::CopyLastEditableChildStyles() to use internal DOM APIs; r=masayuki

386222   490ba7b770e9   2017-10-09 20:39 -0400   ehsan
  Bug 1407309 - Part 2: Remove some dead code; r=masayuki

386223   d59889304b3f   2017-10-12 17:39 -0400   ehsan
  Bug 1408125 - Part 1: Add an optional argument to InsertNode() to allow callers to pass the child node when they know it; r=masayuki

386224   19fee71d7e27   2017-10-12 17:48 -0400   ehsan
  Bug 1408125 - Part 2: Pass the child node at the offset as an extra argument where possible to InsertNode(); r=masayuki

386225   cafd9cdafeb7   2017-10-13 16:29 -0400   spohl
  Bug 1405151: Ensure that crashes appear correctly in Socorro in the case of SIGABRT crashes on macOS. r=ted

386226   95dff4968a19   2017-09-26 14:19 -0700   giles
  Bug 1408211 - Update builders to rust 1.20.0. r=mshal

386227   f5034aeb6407   2017-10-14 00:05 +0200   archaeopteryx
  Backed out changeset 19fee71d7e27 (bug 1408125)

386228   1f8ee2e6f065   2017-10-14 00:05 +0200   archaeopteryx
  Backed out changeset d59889304b3f (bug 1408125) for asserting in clipboard test editor/libeditor/tests/test_bug1306532.html. r=backout on a CLOSED TREE

386229:386228,386175   62e4765b2d97   2017-10-14 00:11 +0200   archaeopteryx
  merge mozilla-central to mozilla-inbound. r=merge a=merge on a CLOSED TREE

386230   5278d6756359   2017-10-14 07:33 +0900   sotaro
  Bug 1407748 - Force enable alpha channel to make sure ANGLE use correct framebuffer formart r=jgilbert

386231   ec879327cd7c   2017-08-18 15:58 -0700   leonardo
  Bug 1374290 - Import diff contents from a local Test262 folder. r=shu

386232   0a6f7d9bc6f0   2017-10-13 14:40 -0700   leonardo
  Bug 1374290 - Test262 export script. r=shu

386233   5306302faf01   2017-09-12 13:45 -0700   leonardo
  Bug 1374290 - Test the local import script. r=sfink

386234   9789b0ad7982   2017-10-12 16:41 -0700   leonardo
  Bug 1374290 - Update the skip list. r=sfink

386235   873e2f3a9cf5   2017-10-14 14:02 +0900   mh+mozilla
  Fixup for cramtests and some python tests after bug 1407468. r=me

386236:386218,386235   d43c1c0fa038   2017-10-14 11:49 +0200   archaeopteryx
  merge mozilla-inbound to mozilla-central. r=merge a=merge
Just a wild guess: is it possible that Rust code installed a SIGABRT handler which leads to SIGILL?
Flags: needinfo?(jolin)
Interesting... IIUC the Rust source at src/libcompiler_builtins/src/lib.rs:40 [1] shows that it's an abort() function?

[1] https://github.com/rust-lang-nursery/compiler-builtins/blob/master/src/lib.rs#L40
Tracking 58+ for this crash. We have about 50 crashes on nightly so far, so I think this warrants tracking.
Since we are still lacking an owner for this and P1s need an owner, I'm assigning this to the triage owner until a better owner is found.
Assignee: nobody → ajones
Flags: needinfo?(ajones)
Indeed, crash looks fixed.
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(snorp)
Resolution: --- → WORKSFORME
Flags: needinfo?(ajones)
This signature is spiking now on 58.0b12, so I'm not sure WFM is the right resolution. Should we reopen this bug or file a new one?
Reopening - we didn't land anything to fix it.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Stack is a bit weird, but it appears we went into opensl_get_preferred_sample_rate() and called into the OS code there.  Perhaps some odd device/driver -- any correlations there?
Currently the #6 overall browser top crash in Fennec Beta 5.
Bug 1410456 removes `opensl_get_preferred_sample_rate`; we'll monitor once it lands.
Depends on: 1410456
No hits on Beta60 so far - looks like we might be in the clear here?
Flags: needinfo?(padenot)
Yeah. I mean, we still don't know _why_ this happened, but we've stopped doing the things that caused it.
Flags: needinfo?(padenot)
Assignee: ajones → achronop
Target Milestone: --- → mozilla60
Moving to P3 because there has been no activity for at least 24 weeks.
Priority: P1 → P3
Still looking good. I'm closing this as fixed by bug 1410456.
Status: REOPENED → RESOLVED
Closed: 6 years ago → 5 years ago
Resolution: --- → DUPLICATE