Closed Bug 1120395 Opened 9 years ago Closed 9 years ago

crash in nsUrlClassifierStreamUpdater::OnStartRequest(nsIRequest*, nsISupports*)

Categories

(Core :: Networking, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla38
tracking-b2g backlog
Tracking Status
e10s + ---
firefox37 + verified
firefox38 --- verified

People

(Reporter: mihaelav, Assigned: mrbkap)

References

()

Details

(Keywords: crash, Whiteboard: [mozmill])

Crash Data

Attachments

(1 file, 2 obsolete files)

This bug was filed from the Socorro interface and is 
report bp-c456674d-4e54-46c6-8cfa-a07642150112.
=============================================================

We got this while running one of our functional mozmill tests.
Report: http://mozmill-daily.blargon7.com/#/functional/report/5bb4668cfa6af53a0ba505835d6355fa
monica - related to your recent changes maybe?
Flags: needinfo?(mmc)
After further investigation, this is not related to https://bugzilla.mozilla.org/show_bug.cgi?id=1100024.

These crashes date from at least 12/26, before https://bugzilla.mozilla.org/show_bug.cgi?id=1100024 landed (the first time). Most likely they are due to the timer in nsUrlClassifierStreamUpdater not setting the callback when it sets its own timer in https://bugzilla.mozilla.org/show_bug.cgi?id=1110891.
Depends on: 1110891
Some crash details:

Crash Reason 	SIGSEGV
Crash Address 	0x0

0 	libxul.so 	nsUrlClassifierStreamUpdater::OnStartRequest(nsIRequest*, nsISupports*) 	toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp
1 	libxul.so 	mozilla::net::HttpBaseChannel::DoNotifyListener() 	netwerk/protocol/http/HttpBaseChannel.cpp
2 	libxul.so 	nsRunnableMethodImpl<void (mozilla::net::HttpBaseChannel::*)(), void, true>::Run() 	xpcom/glue/nsThreadUtils.h
3 	libxul.so 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp
4 	libxul.so 	NS_ProcessNextEvent(nsIThread*, bool) 	xpcom/glue/nsThreadUtils.cpp
5 	libxul.so 	mozilla::dom::indexedDB::::QuotaClient::ShutdownTransactionService 	dom/indexedDB/ActorsParent.cpp
6 	libxul.so 	mozilla::dom::quota::QuotaManager::Observe(nsISupports*, char const*, char16_t const*) 	dom/quota/QuotaManager.cpp
7 	libxul.so 	nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) 	xpcom/ds/nsObserverList.cpp
8 	libxul.so 	nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) 	xpcom/ds/nsObserverService.cpp
9 	libxul.so 	nsXREDirProvider::DoShutdown() 	toolkit/xre/nsXREDirProvider.cpp
10 	libxul.so 	ScopedXPCOMStartup::~ScopedXPCOMStartup() 	toolkit/xre/nsAppRunner.cpp

Given by Soccorro it's a cross-platform crash.
OS: Linux → All
Whiteboard: [mozmill]
Release Note Request (optional, but appreciated)
[Why is this notable]:
[Suggested wording]:
[Links (documentation, blog post, etc)]:

[Tracking Requested - why for this release]:

[Blocking Requested - why for this release]:
blocking-b2g: --- → 2.0?
blocking-fx: --- → ?
tracking-b2g: --- → ?
relnote-b2g: --- → ?
Flags: sec-review?
Flags: needinfo?
Flags: in-testsuite?
Flags: in-moztrap?
Flags: a11y-review?
Release Note Request (optional, but appreciated)
[Why is this notable]:
[Suggested wording]:
[Links (documentation, blog post, etc)]:

[Tracking Requested - why for this release]:

[Blocking Requested - why for this release]:
Flags: needinfo?
Please do not play around with bug flags. Thanks.
blocking-b2g: 2.0? → ---
blocking-fx: ? → ---
relnote-b2g: ? → ---
Flags: sec-review?
Flags: needinfo?
Flags: in-testsuite?
Flags: in-moztrap?
Flags: a11y-review?
[Tracking Requested - why for this release]:

The signature has spiked in v37:
Product	        Version	Percentage	Number Of Crashes
Firefox 	37.0a1 	91.67 %	1057
FennecAndroid 	37.0a1 	4.08 %	47
Firefox 	34.0.5 	2.08 %	24
Firefox 	35.0b99 0.61 %	7
FennecAndroid 	34.0.1 	0.61 %	7
Firefox 	34.0 	0.52 %	6
Firefox 	35.0 	0.17 %	2
FennecAndroid 	35.0b8 	0.09 %	1
FennecAndroid 	35.0b11 0.09 %	1
Firefox 	35.0b3 	0.09 %	1
My mistake, the above table is for "nsUrlClassifierPrefixSet::Contains(unsigned int, bool*)"
but I suspect it might be the same underlying problem.  Here's the table for
"nsUrlClassifierStreamUpdater::OnStartRequest(nsIRequest*, nsISupports*)":
Product	Version	Percentage	Number Of Crashes
Firefox 	37.0a1 	76.92 %	10
Firefox 	37.0a2 	15.38 %	2
Firefox 	18.0b5 	7.69 %	1
Crash Signature: [@ nsUrlClassifierStreamUpdater::OnStartRequest(nsIRequest*, nsISupports*)] → [@ nsUrlClassifierStreamUpdater::OnStartRequest(nsIRequest*, nsISupports*)] [@ nsUrlClassifierPrefixSet::Contains(unsigned int, bool*)]
Submitted another crash on OSX 10.7 and also I have this one, is this related too? On the same machine:
https://crash-stats.mozilla.com/report/index/1195a38a-b6e5-417a-bdad-a777e2150120

Doesn't have this bug at relevant bugs, I'll file a different one if it's unrelated, I see it's also about UrlClassifier.
This signature is showing up as the #2 topcrasher for e10s-enabled Firefox 37 with 500 crashes in the last 7 days. It isn't on the list of topcrashers for 37 as a whole, though.  So it may be e10s related. 

https://crash-stats.mozilla.com/search/?product=Firefox&dom_ipc_enabled=!__null__&version=37.0a1&process_type=browser&date=%3E%3D2015-01-12&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature
tracking-e10s: --- → ?
Assignee: nobody → mrbkap
Tracking 37 until we know that this doesn't impact when e10s is disabled.
My crash in nsUrlClassifierPrefixSet::Contains was with e10s disabled.
(In reply to Mats Palmgren (:mats) from comment #13)
> My crash in nsUrlClassifierPrefixSet::Contains was with e10s disabled.

The crash in nsUrlClassifierPrefixSet::Contains was fixed in https://hg.mozilla.org/integration/mozilla-inbound/rev/fbe8067063a0 (https://bugzilla.mozilla.org/show_bug.cgi?id=1120145#c5) probably in 1/13's build. Are you still seeing this signature?
Flags: needinfo?(mats)
No, I haven't seen it in newer than 1/13 builds, so it's likely my crashes were that other bug.
Crash Signature: [@ nsUrlClassifierStreamUpdater::OnStartRequest(nsIRequest*, nsISupports*)] [@ nsUrlClassifierPrefixSet::Contains(unsigned int, bool*)] → [@ nsUrlClassifierStreamUpdater::OnStartRequest(nsIRequest*, nsISupports*)]
Flags: needinfo?(mats)
Flags: needinfo?(mrbkap)
Attached patch Possible patch (obsolete) — Splinter Review
Looking through the stacks, it seems like this is strictly a shutdown crash. It's definitely happening because we're clearing mDownloadErrorCallback before receiving OnStartRequest, which can happen when we're shutting down (xpcom-shutdown-threads causes nsUrlClassifierDBService to cancel any pending updates, which clears mDownloadErrorCallback, as well as most of the rest of our state). This patch tries to detect this state and bail out without doing further stuff.

Monica, does this seem reasonable to you?
Attachment #8554932 - Flags: review?(mmc)
This is probably not related to e10s, based on the stacks and is a shutdown crash.
Flags: needinfo?(mrbkap)
Comment on attachment 8554932 [details] [diff] [review]
Possible patch

Review of attachment 8554932 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM. gcp may want to take a look.
Attachment #8554932 - Flags: review?(mmc)
Attachment #8554932 - Flags: review?(gpascutto)
Attachment #8554932 - Flags: review+
Attachment #8554932 - Flags: review?(gpascutto) → review+
Comment on attachment 8554932 [details] [diff] [review]
Possible patch

Review of attachment 8554932 [details] [diff] [review]:
-----------------------------------------------------------------

From reading the code a bit, the shutdown ends up forcing an UpdateError in
https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#605
which calls into
https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp#65 which suggests that mIsUpdating or the callback != nullptr is what you are supposed to check. At least it's not obvious to me mChannel being null there is guaranteed to be directly related.

I think mChannel == nullptr is coming from StreamUpdater noticing the shutdown event too:
https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp#589

The only other place that clears it is https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp#565 but that's one that you patched, so this doesn't look right.

I'm going to change my review based on this, I think you need to check mIsUpdating because currently you're dependent on the observer noticing the shutdown event, but I can be convinced otherwise if I missed anything.
Attachment #8554932 - Flags: review+ → review-
Attached patch Possible patch v2 (obsolete) — Splinter Review
Attachment #8554932 - Attachment is obsolete: true
Attachment #8556090 - Flags: review?(gpascutto)
Attachment #8556090 - Flags: review?(gpascutto) → review+
I have no idea how that assertion can hit. I have stared at this code
for way too long now and am giving up. This patch is safer and is green
on try <https://treeherder.mozilla.org/#/jobs?repo=try&revision=a3dace903c55>.
Attachment #8556090 - Attachment is obsolete: true
Attachment #8558684 - Flags: review?(gpascutto)
Attachment #8558684 - Flags: review?(gpascutto) → review+
https://hg.mozilla.org/mozilla-central/rev/b3fa9d6b1310
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Crashed again over the weekend(2/7/2015	4:36 AM), with aurora 37.0a2 zh-CN on windows 8-32-2.

https://crash-stats.mozilla.com/report/index/ed8ad7d3-d62a-48f5-b224-065e12150209.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
...which is why it says fixed in Firefox 38 not in Firefox 37...
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
Comment on attachment 8558684 [details] [diff] [review]
Possible patch v3

Approval Request Comment
[Feature/regressing bug #]: Long standing bug exposed by e10s.
[User impact if declined]: Occasional crashes on shutdown.
[Describe test coverage new/current, TreeHerder]: None
[Risks and why]: Low, added check to prevent the crash.
[String/UUID change made/needed]: N/A
Attachment #8558684 - Flags: approval-mozilla-aurora?
Comment on attachment 8558684 [details] [diff] [review]
Possible patch v3

Trivial fix. Aurora+
Attachment #8558684 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: