Closed Bug 1085742 Opened 5 years ago Closed 5 years ago

crash in nsAutoPtr<mozilla::CacheData>* nsTArray_Impl<nsAutoPtr<mozilla::CacheData>, nsTArrayInfallibleAllocator>::AppendElement<mozilla::CacheData*>(mozilla::CacheData* const&)

Categories

(Core :: Preferences: Backend, defect, critical)

All
Android
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla36
Tracking Status
firefox33 --- wontfix
firefox34 + wontfix
firefox35 + wontfix
firefox36 + wontfix
fennec 34+ ---

People

(Reporter: kbrosnan, Assigned: ehsan)

Details

(Keywords: crash, topcrash-android-armv7)

Crash Data

Attachments

(2 files)

[Tracking Requested - why for this release]: Topcrash, startup crash

This bug was filed from the Socorro interface and is 
report bp-64c6fe0b-b2f7-43d1-9fe9-2e5e82141014.
=============================================================

https://crash-stats.mozilla.com/report/list?signature=nsAutoPtr%3Cmozilla%3A%3ACacheData%3E*+nsTArray_Impl%3CnsAutoPtr%3Cmozilla%3A%3ACacheData%3E%2C+nsTArrayInfallibleAllocator%3E%3A%3AAppendElement%3Cmozilla%3A%3ACacheData*%3E%28mozilla%3A%3ACacheData*+const%26%29&product=FennecAndroid&query_type=contains&range_unit=weeks&process_type=any&version=FennecAndroid%3A33.0&hang_type=any#tab-reports

Top 10 devices crashing
samsung 	GT-I9505
asus 		Nexus 7
samsung 	GT-I9500
Sony 		D5503
samsung 	SCH-I545
LGE 		Nexus 4
LGE 		Nexus 5
samsung 	SAMSUNG-SM-N900A
samsung 	SM-G900P
samsung 	SAMSUNG-SM-G900A

No URLs
Component: General → Preferences: Backend
Product: Firefox for Android → Core
Ehsan, are you the right assignee here?
Assignee: nobody → ehsan.akhgari
Attachment #8509887 - Flags: review?(ehsan.akhgari)
FTR.... this is 100% speculative
Comment on attachment 8509887 [details] [diff] [review]
null_prefs_cache.patch

Review of attachment 8509887 [details] [diff] [review]:
-----------------------------------------------------------------

Not sure why I'm the right assignee here.  ;-)  But this is definitely incorrect since it breaks the caching of data unless the preferences service has been initialized.  The right fix would be to initialize it.  I'll write a patch.
Attachment #8509887 - Flags: review?(ehsan.akhgari) → review-
Actually, I can't quite tell how this can possibly happen.  AddBoolVarCache calls GetBool first, which should call InitStaticMembers, which should instantiate the preferences service, which should call GetInstanceForService, which should assign to gCacheData.  The only thing that I can think of to cause this is for Preferences::Init() to fail, but when that happens we can't really do much more than crash.  The question is, am I missing another scenario?  And also how can Preferences::Init() fail?

Benjamin, what do you think?
Flags: needinfo?(benjamin)
We're in the XPCOM startup sequence, but not especially early in it:

http://hg.mozilla.org/releases/mozilla-release/annotate/3d7920ecd8b4/xpcom/build/nsXPComInit.cpp#l684 is after the component manager is initialized, so it's not that do_GetService failed or anything like that.

It would be interesting to know whether Preferences::sShutdown is true at this point, which might indicate that we're trying to start XPCOM twice in the same process (which will definitely fail). The XPCOM launch comes from the Android Java code, so I really don't know about that. snorp?
Flags: needinfo?(benjamin) → needinfo?(snorp)
tracking-fennec: ? → 34+
Note this crash is dropping in rank but I don't believe that this spontaneously working. I suspect that this is driving people to stop using Firefox.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #6)
> We're in the XPCOM startup sequence, but not especially early in it:
> 
> http://hg.mozilla.org/releases/mozilla-release/annotate/3d7920ecd8b4/xpcom/
> build/nsXPComInit.cpp#l684 is after the component manager is initialized, so
> it's not that do_GetService failed or anything like that.
> 
> It would be interesting to know whether Preferences::sShutdown is true at
> this point, which might indicate that we're trying to start XPCOM twice in
> the same process (which will definitely fail). The XPCOM launch comes from
> the Android Java code, so I really don't know about that. snorp?

It can probably be started more than once, though that's not something that we'd ever do on purpose. The startup/shutdown of Gecko related to the Android Activity lifecycle is...messy. Once XRE shuts down, however, we call exit(), so it shouldn't be possible to start it again in that process.
Flags: needinfo?(snorp)
Brad - Ehsan is on PTO until next week. If we still want to try to fix this in 34, we can get a fix in on Thu for beta11. But, we'll need a different owner. Can you find someone else to take the bug this week?
Flags: needinfo?(blassey.bugs)
bsmedberg, what do you want to do here?
Flags: needinfo?(blassey.bugs) → needinfo?(benjamin)
As I said on IRC, I don't think this bandaid will actually make things better; we'd just be moving the crash. You could either:

* make NS_InitXPCOM early-fail if XPCOM has already started
* figure out and fix the dual-XPCOM-init

This assumes that dual XPCOM init is the cause, which seems likely to me but not guaranteed. Otherwise I don't have good suggestions.
Flags: needinfo?(benjamin)
Is this what we want then? I didn't see any other obvious way to tell if we had been initialized already, and it seems like we may not want to rely on gXPCOMShuttingDown at all.
Attachment #8525513 - Flags: review?(benjamin)
We're build 34 RC today. This is wontfix for 34.
Comment on attachment 8525513 [details] [diff] [review]
Bail out of XPCOM init early if we're already initialized

I think that gXPCOMShuttingDown would probably work, but this is also acceptable. I think I'd put the `initialized = true;` line up at the top, not at the bottom.

Also probably worth adding a MOZ_ASSERT... I'd tell you to do MOZ_RELEASE_ASSERT except that would probably just cause this bug to move in your particular case.

Shouldn't this be named "sInitialized" because it's a static?
Attachment #8525513 - Flags: review?(benjamin) → review+
https://hg.mozilla.org/mozilla-central/rev/1bfc3a8040da
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla37
Target Milestone: mozilla37 → mozilla36
35 is marked as affected. Do you think this can be uplifted?
Flags: needinfo?(snorp)
(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #17)
> 35 is marked as affected. Do you think this can be uplifted?

This was just a speculative fix, I think. Did it really help anything? The crash report link in comment #0 doesn't seem to show any instances higher than 33 for me, so it's probably incorrect.
Flags: needinfo?(snorp) → needinfo?(kbrosnan)
I think comment 18 means that this bug should be reopened.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I'm not seeing this in FennecAndroid topcrashers, wontfixing for 35 at this late stage in the cycle.
Flags: needinfo?(kbrosnan)
per comment #20
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.