Closed Bug 1061125 Opened 10 years ago Closed 3 years ago

crash in [@ nsBrowserStatusFilter::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, tag_nsresult)] from testSafeBrowsingNotificationBar.js

Categories

(Core :: DOM: Navigation, defect, P5)

31 Branch
defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox31 --- wontfix
firefox32 --- affected
firefox33 --- affected
firefox34 --- affected
firefox-esr24 --- wontfix
firefox-esr31 --- affected

People

(Reporter: danisielm, Unassigned)

References

()

Details

(Keywords: crash, Whiteboard: [mozmill])

Crash Data

Firefox 33.0a2 crashed in one of our test runs. Here is the crash report I submitted:
bp-2abc1ab3-7026-4f16-a48f-66acb2140901

I didn't find any related bugs for this crash, although we recently logged another one that happens with the same test (bug 1057987).

Here is the mozmill log:
> TEST-START | testSecurity/testSafeBrowsingNotificationBar.js | testNotificationBar
> ###!!! [Child][MessageChannel::SendAndWait] Error: Channel error: cannot send/recv
> mozcrash INFO | Saved minidump as /home/mozauto/.mozilla/firefox/Crash Reports/pending/019c98e3-90f8-06fe-16d6add0-35116aaa.dmp
> mozcrash INFO | Saved app info as /home/mozauto/.mozilla/firefox/Crash Reports/pending/019c98e3-90f8-06fe-16d6add0-35116aaa.extra
> PROCESS-CRASH | /home/mozauto/jenkins/workspace/mozilla-aurora_remote/data/mozmill-tests/firefox/tests/remote/testSecurity/testSafeBrowsingNotificationBar.js | application crashed [Unknown top frame]
> Crash dump filename: /home/mozauto/jenkins/workspace/mozilla-aurora_remote/data/profile/minidumps/019c98e3-90f8-06fe-16d6add0-35116aaa.dmp
> No symbols path given, can't process dump.
> MINIDUMP_STACKWALK not set, can't process dump.

The top 10 frames of the crashing thread:
Frame	Module	Signature	Source
0		@0x14cd835c	
1	libxul.so	nsBrowserStatusFilter::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, tag_nsresult)	toolkit/components/statusfilter/nsBrowserStatusFilter.cpp
2	libxul.so	nsDocLoader::DoFireOnStateChange(nsIWebProgress*, nsIRequest*, int&, tag_nsresult)	uriloader/base/nsDocLoader.cpp
3	libxul.so	nsDocLoader::doStopDocumentLoad(nsIRequest*, tag_nsresult)	uriloader/base/nsDocLoader.cpp
4	libxul.so	nsDocLoader::DocLoaderIsEmpty(bool)	uriloader/base/nsDocLoader.cpp
5	libxul.so	nsDocLoader::OnStopRequest(nsIRequest*, nsISupports*, tag_nsresult)	uriloader/base/nsDocLoader.cpp
6	libxul.so	nsLoadGroup::RemoveRequest(nsIRequest*, nsISupports*, tag_nsresult)	netwerk/base/src/nsLoadGroup.cpp
7	libxul.so	nsDocument::DoUnblockOnload()	content/base/src/nsDocument.cpp
8	libxul.so	nsUnblockOnloadEvent::Run()	content/base/src/nsDocument.cpp
9	libxul.so	nsThread::ProcessNextEvent(bool, bool*)	xpcom/threads/nsThread.cpp
This crash also happens on Windows, but overall it is more common (83%) on 32-bit machines. Current crash listings also go back to Firefox 31 ESR. Interestingly, the stack on Windows is much shorter. Maybe that is an indication of something:

0 	xul.dll 	nsBrowserStatusFilter::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, tag_nsresult) 	toolkit/components/statusfilter/nsBrowserStatusFilter.cpp
1 	xul.dll 	nsDocLoader::DoFireOnStateChange(nsIWebProgress* const, nsIRequest* const, int&, tag_nsresult) 	uriloader/base/nsDocLoader.cpp
2 	xul.dll 	nsDocLoader::doStopDocumentLoad(nsIRequest*, tag_nsresult) 	uriloader/base/nsDocLoader.cpp
3 	xul.dll 	nsDocLoader::DocLoaderIsEmpty(bool) 	uriloader/base/nsDocLoader.cpp

All of the above code is very old and has not changed in recent years. The only change I see in nsDocLoader around those lines is from bug 493701. Arpad, do you have any idea whether this could be related to your changes in April this year?
Severity: normal → critical
Component: General → Document Navigation
Flags: needinfo?(arpad.borsos)
OS: Linux → All
Product: Toolkit → Core
Hardware: x86 → All
Summary: crash in nsBrowserStatusFilter::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, tag_nsresult) → crash in [@ nsBrowserStatusFilter::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, tag_nsresult)]
Whiteboard: [mozmill]
Version: unspecified → 31 Branch
The only real change here is that the `nsListenerInfo` elements are now allocated directly inside the `AutoTArray`, which can reallocate its storage, rather than individually on the heap.
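
To illustrate why that difference can matter, here is a minimal sketch that uses `std::vector` as a stand-in for `AutoTArray` (an assumption for illustration; the two containers are not the same class). `ListenerInfo` is a hypothetical placeholder for `nsListenerInfo`. The sketch only compares addresses before and after growth and never dereferences a stale pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for nsListenerInfo; the field is illustrative only.
struct ListenerInfo {
  unsigned long notifyMask;
};

// Elements stored inline in the array: growing the array moves them into a
// new buffer, so an address taken earlier no longer refers to the element.
bool InlineElementMoves() {
  std::vector<ListenerInfo> listeners;
  listeners.push_back({1});
  const auto before = reinterpret_cast<std::uintptr_t>(&listeners[0]);
  // Ask for more capacity than is available to force a reallocation.
  listeners.reserve(listeners.capacity() + 64);
  assert(listeners.capacity() >= 65);
  const auto after = reinterpret_cast<std::uintptr_t>(&listeners[0]);
  return before != after;
}

// Elements allocated individually on the heap: the array only stores
// pointers, so growing it leaves each element at the same address.
bool HeapElementMoves() {
  std::vector<std::unique_ptr<ListenerInfo>> listeners;
  listeners.push_back(std::make_unique<ListenerInfo>());
  const ListenerInfo* before = listeners[0].get();
  listeners.reserve(listeners.capacity() + 64);
  const ListenerInfo* after = listeners[0].get();
  return before != after;
}
```

In this sketch the inline element changes address when the array grows, while the heap-allocated one stays put. Whether anything in the crashing code path actually holds an element reference across such a growth is not established here.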

Is this code executed in parallel?
We get a reference here: https://hg.mozilla.org/mozilla-central/rev/30f8577510ea#l1.152
Is it possible that some code executing the compaction here runs in parallel? https://hg.mozilla.org/mozilla-central/rev/30f8577510ea#l1.163

But then again, this should prevent calling the function on any pointer that is not a valid object, right? https://hg.mozilla.org/mozilla-central/rev/30f8577510ea#l1.156

And once we do_QueryReferent, the local nsCOMPtr holds a strong reference, right? So it’s not freed while we call the method on it?
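
As a rough analogy for that weak-to-strong promotion, the sketch below uses `std::weak_ptr::lock()` from the standard library. This is not the XPCOM API: `Listener` and `NotifyIfAlive` are hypothetical names, and the sketch only assumes that `do_QueryReferent` behaves similarly in handing back an owning pointer or null.

```cpp
#include <cassert>
#include <memory>

// Hypothetical listener type; it only counts how often it was notified.
struct Listener {
  int calls = 0;
  void OnStateChange() { ++calls; }
};

// Promotes the weak reference to a strong one before calling into the
// listener. Returns false if the listener has already been destroyed.
bool NotifyIfAlive(const std::weak_ptr<Listener>& weak) {
  // While `strong` is in scope it shares ownership, so the listener cannot
  // be freed in the middle of the call below.
  std::shared_ptr<Listener> strong = weak.lock();
  if (!strong) {
    return false;
  }
  strong->OnStateChange();
  return true;
}
```

If the analogy holds, the object being notified stays alive for the duration of the call. It says nothing about whether the array slot the weak reference was read from is still the intended one.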
Flags: needinfo?(arpad.borsos) → needinfo?(bzbarsky)
So one thread gets a reference via .GetNext, checks the flags, and then yields to another thread.
That other thread does some remove/append, so the pointer now refers to a *different* listener that does *not* match the flags. That thread yields.
The first thread picks up again, does the QueryReferent successfully, and then calls the method on the object that does *not* match the flags.

Is that a possible scenario?
Would the safest thing to try be to return to heap-allocating the listeners?
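
The interleaving described above can be sketched in a single-threaded simulation. Everything here is hypothetical: `ListenerEntry`, `NotifyWithStaleCheck`, and the use of `std::vector` with a slot index instead of a real element reference (which keeps the example well defined). It shows only the order of operations being asked about, not the actual nsDocLoader code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical listener record: an id plus the notification mask it asked for.
struct ListenerEntry {
  int id;
  unsigned notifyMask;
};

// Simulates the three steps of the scenario and returns the id of the
// listener that ends up being notified, or -1 if none is.
// Precondition: `listeners` is not empty.
int NotifyWithStaleCheck(std::vector<ListenerEntry>& listeners, unsigned flag) {
  assert(!listeners.empty());
  // Step 1: pick a slot and check its flags.
  const std::size_t slot = 0;
  const bool matched = (listeners[slot].notifyMask & flag) != 0;

  // Step 2: "another thread" removes and appends entries, so the same slot
  // now holds a different listener.
  listeners.erase(listeners.begin());
  listeners.push_back({99, 0u});

  // Step 3: the first thread resumes and trusts the earlier flag check.
  if (matched && slot < listeners.size()) {
    return listeners[slot].id;
  }
  return -1;
}
```

In the simulation the listener that gets notified is not the one whose flags were checked. Whether this can happen in practice depends entirely on whether anything can mutate the array between the check and the call.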
> Is this code executed in parallel?

It better not be.  This is all mainthread-only code.

The breakpad stacks are all truncated, sadly; there's no way that DocLoaderIsEmpty is the entrypoint!
Flags: needinfo?(bzbarsky)
Crash Signature: nsBrowserStatusFilter::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, tag_nsresult) → [@ nsBrowserStatusFilter::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, tag_nsresult) ]
Summary: crash in [@ nsBrowserStatusFilter::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, tag_nsresult)] → crash in [@ nsBrowserStatusFilter::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, tag_nsresult)] from testSafeBrowsingNotificationBar.js
Crash Signature: [@ nsBrowserStatusFilter::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, tag_nsresult) ] → [@ nsBrowserStatusFilter::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, tag_nsresult) ] [@ nsBrowserStatusFilter::OnStateChange ]

Hello! Wayne, could you please provide more detailed steps to reproduce this issue, or a testcase?
Is this issue being worked on?

Flags: needinfo?(vseerror)

Sorry, I have never crashed with this issue. I don't have a testcase.

Flags: needinfo?(vseerror)

Having said that, with only a couple of crashes per month, is there anything still worth pursuing?
bp-34719538-6d8b-49a2-89a0-d4a3a0210303
bp-545bd06a-39bd-46e0-b737-5ae2c0210305
Both are startup crashes.

Flags: needinfo?(nkochar)

This is very low crash volume, but since these are startup crashes, I'll still have farre take a look (after his current urgent assigned work) and confirm.

Severity: critical → S2
Flags: needinfo?(nkochar) → needinfo?(afarre)
Priority: -- → P5

Closing because no crashes have been reported for 12 weeks.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WORKSFORME
Flags: needinfo?(afarre)