Closed Bug 1546191 Opened 5 months ago Closed 2 days ago

Crash in [@ nsCORSListenerProxy::OnStopRequest]

Categories

(Core :: Networking: HTTP, defect, P1, critical)

Unspecified
Android
defect

Tracking

()

RESOLVED DUPLICATE of bug 1529911
Tracking Status
firefox-esr68 --- affected
firefox66 --- affected
firefox67 --- affected
firefox68 --- affected

People

(Reporter: gsvelto, Assigned: mayhemer)

Details

(Keywords: crash, Whiteboard: [necko-triaged])

Crash Data

This bug is for crash report bp-294c86aa-5506-4176-bd98-6dd940190421.

Top 10 frames of crashing thread:

0 libxul.so nsCORSListenerProxy::OnStopRequest netwerk/protocol/http/nsCORSListenerProxy.cpp:619
1 libxul.so mozilla::extensions::ChannelWrapper::RequestListener::OnStopRequest toolkit/components/extensions/webrequest/ChannelWrapper.cpp:945
2 libxul.so mozilla::net::HttpBaseChannel::DoNotifyListener netwerk/protocol/http/HttpBaseChannel.cpp:3278
3 libxul.so mozilla::net::HttpAsyncAborter<mozilla::net::nsHttpChannel>::HandleAsyncAbort netwerk/protocol/http/HttpBaseChannel.h:875
4 libxul.so mozilla::detail::RunnableMethodImpl<mozilla::net::nsHttpChannel*, void  xpcom/threads/nsThreadUtils.h:1174
5 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1180
6 libxul.so NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:486
7 libxul.so mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:88
8 libxul.so MessageLoop::Run ipc/chromium/src/base/message_loop.cc:290
9 libxul.so nsBaseAppShell::Run widget/nsBaseAppShell.cpp:137

Looks like the crash is caused because |listener| [1] is null. Maybe we should add a null check.

[1] https://searchfox.org/mozilla-central/rev/f46e2bf881d522a440b30cbf5cf8d76fc212eaf4/netwerk/protocol/http/nsCORSListenerProxy.cpp#619

Priority: -- → P2
Whiteboard: [necko-triaged]

Over 2600 crashes in Fennec 68 and was also high volume in Fennec 67. Perhaps we can do what Kershaw suggested in Comment 1? I also noticed a high correlation to ublock on release:

(62.06% in signature vs 06.63% overall) Module "uBlock0@raymondhill.net.xpi" = true
(62.06% in signature vs 07.15% overall) Addon "uBlock0@raymondhill.net" = true

Having said that, this isn't in the top 20 crash signatures overall at the moment on release.

P1 since the crash rate is quite high.

Priority: P2 → P1

All crashing calls are coming from ChannelWrapper::RequestListener::OnStopRequest. Apparently, listener, loaded from mOuterListener is null at [1]. The only possibility here is that OnStopRequest on the CORS proxy is called more than once. This looks to me as a bug in the webext backend.

Moving.

[1] https://searchfox.org/mozilla-central/rev/b38e3beb658b80e1ed03e0fdf64d225bd4a40327/netwerk/protocol/http/nsCORSListenerProxy.cpp#646,648

Component: Networking: HTTP → General
Product: Core → WebExtensions

All we're doing is firing off an event then passing on to the upstream listener[1] we got when calling setNewListener. So IIUC, we would have to be called twice as well. Or am I missing something here?

[1] https://searchfox.org/mozilla-central/rev/b38e3beb658b80e1ed03e0fdf64d225bd4a40327/toolkit/components/extensions/webrequest/ChannelWrapper.cpp#944-954

Flags: needinfo?(honzab.moz)

Got it, thanks. Then this is a necko issue, but needs some deeper look because we have added duplication checks long ago. Apparently, we are still leaking somewhere, just don't know if this has been fixed in 69 or not, or at all.

Although, according this report it seems impossible:
HttpBaseChannel.cpp@a092972b53f0e566a36770e7b03363036ff820ec (annotated)
nsHttpChannel.cpp@a092972b53f0e566a36770e7b03363036ff820ec (annotated)

HttpBaseChannel::DoNotifyListener() checks and sets mOnStopRequestCalled (on stack for the crash). nsHttpChannel::ContinueOnStopRequest also sets (doesn't check, tho) mOnStopRequestCalled. These two are only places we call OnStopRequest on the listener.

mOnStopRequestCalled is not a bit field, so possible threading issue is off the table too. It's never dropped back to false (on current mozilla-central)

It's also not a reentrance issue, something we fixed in bug 1529911.

So, only other option is that the proxy is somehow shared by more than one channels. Back to Necko.

Component: General → Networking: HTTP
Flags: needinfo?(honzab.moz)
Product: WebExtensions → Core

I will look into this.

Assignee: nobody → honzab.moz

One aspect is weird: the crash spikes are time-correlated, not build-id correlated. The sudden spike on Jul 12 and sudden drop on Jul 26 (but not back to the previous numbers!) look like an external influence. All crashes I examined had addons (uBlock, AdBlock, ...) installed. Some cancellation logic is in place here, that triggers a duplicate call path.

I was looking more deeply in the code and we can't share nsCORSListenerProxy among channels. Http channel is inserted one in the listener chain as the first thing in AsyncOpen (nsContentSecurityManager::doContentSecurityCheck call). This either synchronously fails, bubbling to asyncOpen result, or is done only once as we set a flag on the load info to not do this again (loadInfo->Set/GetInitialSecurityCheckDone(), in case of e.g. a redirect). The listener is only given to redirected-to channels, before that we don't call streamlistener methods on it, only the last channel does.

We have fixed duplicate calls (only possible cause of this bug then) in 69, bug 1529911. Fennec (all channels!) is ESR68 based. There are no desktop 69+ crashes.

Status: NEW → RESOLVED
Closed: 2 days ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1529911
You need to log in before you can comment on or make changes to this bug.