Closed Bug 1127464 Opened 10 years ago Closed 10 years ago

still crashing in nsObserverService::RemoveObserver(nsIObserver*, char const*)

Categories

(Firefox for Android Graveyard :: Toolbar, defect)

34 Branch
ARM
Android
defect
Not set
critical

Tracking

(firefox35 wontfix, firefox36+ wontfix, firefox37+ wontfix, firefox38+ wontfix, firefox39 fixed, fennec37+)

RESOLVED FIXED
Firefox 39
Tracking Status
firefox35 --- wontfix
firefox36 + wontfix
firefox37 + wontfix
firefox38 + wontfix
firefox39 --- fixed
fennec 37+ ---

People

(Reporter: snorp, Assigned: snorp)

References

Details

(Keywords: crash, reproducible, topcrash-android-armv7)

Crash Data

Attachments

(1 file, 3 obsolete files)

+++ This bug was initially created as a clone of Bug #1062758 +++ https://crash-stats.mozilla.com/report/index/3bee9ab7-9790-4b87-b55d-bd95d2150124 We are somehow exiting in the middle of a composite, possible due to divide-by-zero. No STR.
Assignee: nobody → snorp
[Tracking Requested - why for this release]:
tracking-fennec: --- → 37+
Messing with the summary to prevent gmail from being stupid.
Summary: crash in nsObserverService::RemoveObserver(nsIObserver*, char const*) → still crashing in nsObserverService::RemoveObserver(nsIObserver*, char const*)
We seem to have a signal handler installed to catch SIGFPE (and as a result, do not crash/exit), but I wonder if some versions of Android have their own which does exit.
Frame 22 in that crash is libgsl.so@0xd2a7, which on a 4.4.2 mako build is a "blx exit@plt" in ioctl_kgsl_sharedmem_alloc. Two instructions before, it's calling an os_alog function so there presumably is something in logcat. So, there are two issues here: - we should MOZ_CRASH during exit() instead of crashing while running some of our destructors. Since I want to keep our linker code working outside of our usecase, I'd rather make us crash optionally and enable that on our builds only, something like a MOZ_LINKER_CRASH_AT_EXIT define. - there's an OOM happening, which we might be able to do something about, considering the "many pictures (~450)" comment in the crash report.
This crash is reproducible whilst scrolling around the embedded Google map on http://www.flightradar24.com/data/flights/ua931/ https://crash-stats.mozilla.com/report/index/efc1f999-1b89-457a-9d9c-9bbdc2150203 Device: Galaxy Note 3 (4.4.2)
From some logcats that mfinkle sent me: 01-27 08:21:10.454 7364 7382 D skia : SkGrPixelRef::onReadPixels failed to alloc bitmap for result! 01-27 08:21:10.454 7364 7382 D skia : SkROLockPixelsPixelRef::onLockPixels failed! 01-27 08:21:10.674 7364 7513 W Adreno-GSL: <sharedmem_gpumem_alloc_id:1489>: sharedmem_gpumem_alloc: mmap failed errno 12 Out of memory 01-27 08:21:10.674 7364 7513 E Adreno-GSL: <ioctl_kgsl_sharedmem_alloc:1590>: ioctl_kgsl_sharedmem_alloc: FATAL ERROR : (null) So Skia was trying to do a readback and failed to allocate, but then there was some other allocation attempt in GL that caused the exit().
Milan there may be some sort of Skia or GL-related leak here
Component: General → Graphics, Panning and Zooming
Catalin it seems like we already had a crash with the STR you gave there. Can you remember which one that is? Did we fix it?
Flags: needinfo?(catalin.suciu)
https://bugzilla.mozilla.org/show_bug.cgi?id=1098227 This is the other crash and I can't reproduce it using the initial steps
Flags: needinfo?(catalin.suciu)
Looks like it is too late for 36...
An alternative to this patch would be turning the MOZ_CRASH() into a debug assert and just skipping the finalization. I like this one better because the app is still going down unexpectedly (crashing), and we will get a report. But it won't look like an actual crash like this one, we should get the assertion message.
Comment on attachment 8569446 [details] [diff] [review] Assert when we unexpectedly unload libraries on Android Review of attachment 8569446 [details] [diff] [review]: ----------------------------------------------------------------- ::: mozglue/linker/ElfLoader.cpp @@ +492,5 @@ > void > ElfLoader::Init() > { > Dl_info info; > + expect_shutdown = false; might as well set it in the ElfLoader constructor... except there is no constructor at the moment. Meh. That said, as I said in comment 4, defaults should be sensible to non-Fennec use, and defaulting to false means a voluntary crash will always happen in that case. @@ +591,5 @@ > > void > ElfLoader::__wrap_cxa_finalize(void *dso_handle) > { > + if (!ElfLoader::Singleton.IsShutdownExpected()) { \ What are the backslashes for? You can also remove ElfLoader::
Attachment #8569446 - Flags: review?(mh+mozilla) → review-
(In reply to Mike Hommey [:glandium] from comment #13) > Comment on attachment 8569446 [details] [diff] [review] > Assert when we unexpectedly unload libraries on Android > > Review of attachment 8569446 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: mozglue/linker/ElfLoader.cpp > @@ +492,5 @@ > > void > > ElfLoader::Init() > > { > > Dl_info info; > > + expect_shutdown = false; > > might as well set it in the ElfLoader constructor... except there is no > constructor at the moment. Meh. > > That said, as I said in comment 4, defaults should be sensible to non-Fennec > use, and defaulting to false means a voluntary crash will always happen in > that case. > Yeah, forgot about that. I'll add some #ifdef stuff. > @@ +591,5 @@ > > > > void > > ElfLoader::__wrap_cxa_finalize(void *dso_handle) > > { > > + if (!ElfLoader::Singleton.IsShutdownExpected()) { \ > > What are the backslashes for? You can also remove ElfLoader:: Oh, I had it in a macro at first, but then realized I only needed to check the one spot. Oops.
One interesting thing is that finalizers do not get called if you exit(0), so we won't be able to catch that.
Attachment #8569446 - Attachment is obsolete: true
Attachment #8570513 - Flags: review?(mh+mozilla)
Comment on attachment 8570513 [details] [diff] [review] Assert when we unexpectedly unload libraries on Android Review of attachment 8570513 [details] [diff] [review]: ----------------------------------------------------------------- ::: mozglue/linker/ElfLoader.cpp @@ +489,5 @@ > } > } > > +ElfLoader::ElfLoader() > + : expect_shutdown(false) Trivial enough to go in the class definition. Default should be true. @@ +497,5 @@ > void > ElfLoader::Init() > { > Dl_info info; > + expect_shutdown = false; redundant with the constructor. @@ +596,5 @@ > > void > ElfLoader::__wrap_cxa_finalize(void *dso_handle) > { > +#ifdef MOZ_LINKER_CRASH_UNEXPECTED_EXIT This shouldn't be a compile-time thing. Just make the load*Libs functions in APKOpen.cpp set the flag to false.
Attachment #8570513 - Flags: review?(mh+mozilla) → review-
Attachment #8570513 - Attachment is obsolete: true
Attachment #8571374 - Flags: review?(mh+mozilla)
Ugh, this seems to have some problems with PR_LoadLibrary. I guess I need to check that the library is libxul or something before asserting.
Comment on attachment 8571374 [details] [diff] [review] Assert when we unexpectedly unload libraries on Android Review of attachment 8571374 [details] [diff] [review]: ----------------------------------------------------------------- ::: mozglue/android/APKOpen.cpp @@ +407,2 @@ > GeckoStart(args, &sAppData); > + ElfLoader::Singleton.ExpectShutdown(true); Note, considering gecko doesn't survive the shutdown, we might as well never allow it for gecko. ::: mozglue/linker/ElfLoader.h @@ +459,5 @@ > friend int __wrap_dlclose(void *handle); > const char *lastError; > > private: > + ElfLoader() : expect_shutdown(true){} space before {
Attachment #8571374 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8571374 [details] [diff] [review] Assert when we unexpectedly unload libraries on Android Review of attachment 8571374 [details] [diff] [review]: ----------------------------------------------------------------- ::: mozglue/linker/ElfLoader.cpp @@ +591,5 @@ > ElfLoader::__wrap_cxa_finalize(void *dso_handle) > { > + if (!Singleton.IsShutdownExpected()) { > + MOZ_CRASH("Unexpected shutdown"); > + } Ah, wrt comment 19, you want thsi part in ~ElfLoader instead.
(In reply to Mike Hommey [:glandium] from comment #20) > Comment on attachment 8571374 [details] [diff] [review] > Assert when we unexpectedly unload libraries on Android > > Review of attachment 8571374 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: mozglue/android/APKOpen.cpp > @@ +407,2 @@ > > GeckoStart(args, &sAppData); > > + ElfLoader::Singleton.ExpectShutdown(true); > > Note, considering gecko doesn't survive the shutdown, we might as well never > allow it for gecko. We do survive if the main loop has exited, typically, because we have a chance to clean up after ourselves. This avoids the xpcom freak-out that happens when exit() is called randomly.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 39
37 and 38 are marked as affected. wdyt about uplifting the fix?
Flags: needinfo?(snorp)
(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #26) > 37 and 38 are marked as affected. wdyt about uplifting the fix? This isn't *really* fixed. The patch just keeps us from crashing in a stupid way. The root cause is an OOM (at least it seems to be most of the time). I don't see a reason to uplift it right now.
Flags: needinfo?(snorp)
From speaking with snorp, we're going to leave this one closed even though the issue hasn't really been resolved. There's nothing to uplift but we expect that the patch will produce different signatures that should help in the diagnosis of the OOM issue that this bug is about.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Lawrence did you mean to reopen this?
Flags: needinfo?(lmandel)
Nope.
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Flags: needinfo?(lmandel)
Resolution: --- → FIXED
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: