Crash in nsPrefetchService [@ nsIChannel::GetLoadInfo]

RESOLVED FIXED

Status

()

Core
DOM
--
critical
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: rbarker, Assigned: freesamael)

Tracking

(4 keywords)

51 Branch
crash, regression, reproducible, topcrash
Points:
---

Firefox Tracking Flags

(firefox50 unaffected, firefox51+ fixed)

Details

(Whiteboard: [necko-active] [fixed by patch from bug #1268962], crash signature)

Attachments

(1 attachment)

(Reporter)

Description

2 years ago
STR:
1) Visit micechat.com
2) Scroll to bottom of page
3) Select desktop theme.
4) Wait for page to reload.

Expected result: page reloads.
Actual result: fennec crashes with following stack trace.
This is 100% reproducible for me. This is a very recent regression as I just started seeing it August 9, 2016.

Stack:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 11956]
GetLoadInfo (this=0x0)
    at /Volumes/firefox/jpz/obj-arm-linux-androideabi/dist/include/nsIChannel.h:121
121	    mozilla::DebugOnly<nsresult> rv = GetLoadInfo(getter_AddRefs(result));
(gdb) bt
#0  GetLoadInfo (this=0x0)
    at /Volumes/firefox/jpz/obj-arm-linux-androideabi/dist/include/nsIChannel.h:121
#1  nsPrefetchNode::OnStartRequest (this=0x8f3e1dc0, aRequest=0x8ad18030, 
    aContext=<optimized out>)
    at /Volumes/firefox/jpz/uriloader/prefetch/nsPrefetchService.cpp:198
#2  0x9cbaac2a in mozilla::net::nsHttpChannel::CallOnStartRequest (
    this=this@entry=0x8ad18000)
    at /Volumes/firefox/jpz/netwerk/protocol/http/nsHttpChannel.cpp:1138
#3  0x9cbaaeda in mozilla::net::nsHttpChannel::ContinueOnStartRequest3 (
    this=this@entry=0x8ad18000, result=result@entry=NS_OK)
    at /Volumes/firefox/jpz/netwerk/protocol/http/nsHttpChannel.cpp:6329
#4  0x9cbaeb24 in mozilla::net::nsHttpChannel::ContinueOnStartRequest2 (
    this=0x8ad18000, result=<optimized out>)
    at /Volumes/firefox/jpz/netwerk/protocol/http/nsHttpChannel.cpp:6320
#5  0x9cbaeb8e in mozilla::net::nsHttpChannel::ContinueOnStartRequest1 (
    this=0x8ad18000, result=<optimized out>)
    at /Volumes/firefox/jpz/netwerk/protocol/http/nsHttpChannel.cpp:6297
#6  0x9cbaf6a0 in mozilla::net::nsHttpChannel::OnStartRequest (
    this=0x8ad18000, request=<optimized out>, ctxt=<optimized out>)
    at /Volumes/firefox/jpz/netwerk/protocol/http/nsHttpChannel.cpp:6269
#7  0x9caf2324 in nsInputStreamPump::OnStateStart (this=this@entry=0x8a78f940)
    at /Volumes/firefox/jpz/netwerk/base/nsInputStreamPump.cpp:524
#8  0x9caf6968 in nsInputStreamPump::OnInputStreamReady (this=0x8a78f940, 
    stream=<optimized out>)
    at /Volumes/firefox/jpz/netwerk/base/nsInputStreamPump.cpp:426
#9  0x9caa62e2 in nsInputStreamReadyEvent::Run (this=0x8a7b1ca0)
    at /Volumes/firefox/jpz/xpcom/io/nsStreamUtils.cpp:95
#10 0x9caaf2c8 in nsThread::ProcessNextEvent (this=0xaf85d200, 
    aMayWait=<optimized out>, aResult=0xaf7ff86f)
    at /Volumes/firefox/jpz/xpcom/threads/nsThread.cpp:1058
#11 0x9cac2ec8 in NS_ProcessNextEvent (aThread=<optimized out>, 
    aMayWait=<optimized out>)
    at /Volumes/firefox/jpz/xpcom/glue/nsThreadUtils.cpp:290
#12 0x9cc3cfce in mozilla::ipc::MessagePump::Run (this=0xaf864460, 
    aDelegate=0xaf86a060) at /Volumes/firefox/jpz/ipc/glue/MessagePump.cpp:96
#13 0x9cc184ce in RunHandler (this=0xaf86a060)
    at /Volumes/firefox/jpz/ipc/chromium/src/base/message_loop.cc:225
#14 MessageLoop::Run (this=0xaf86a060)
    at /Volumes/firefox/jpz/ipc/chromium/src/base/message_loop.cc:205
#15 0x9d936ef4 in nsBaseAppShell::Run (this=0x96ef8470)
    at /Volumes/firefox/jpz/widget/nsBaseAppShell.cpp:156
#16 0x9dd595c8 in nsAppStartup::Run (this=0x98edf1c0)
    at /Volumes/firefox/jpz/toolkit/components/startup/nsAppStartup.cpp:284
#17 0x9dd896ea in XREMain::XRE_mainRun (this=this@entry=0xaf7ff9e0)
    at /Volumes/firefox/jpz/toolkit/xre/nsAppRunner.cpp:4279
#18 0x9dd899aa in XREMain::XRE_main (this=this@entry=0xaf7ff9e0, 
    argc=argc@entry=6, argv=argv@entry=0xaf8064c8, 
    aAppData=aAppData@entry=0xafa82888 <sAppData>)
    at /Volumes/firefox/jpz/toolkit/xre/nsAppRunner.cpp:4406
#19 0x9dd89b92 in XRE_main (argc=6, argv=0xaf8064c8, 
    aAppData=0xafa82888 <sAppData>, aFlags=<optimized out>)
    at /Volumes/firefox/jpz/toolkit/xre/nsAppRunner.cpp:4497
#20 0x9dd89b92 in XRE_main (argc=6, argv=0xaf8064c8, 
    aAppData=0xafa82888 <sAppData>, aFlags=<optimized out>)
    at /Volumes/firefox/jpz/toolkit/xre/nsAppRunner.cpp:4497
I think bug 1268962 looks to be the most likely culprit
Flags: needinfo?(sawang)
Flags: needinfo?(bzbarsky)
Indeed.  The relevant code is in nsPrefetchNode::OnStartRequest and looks like this:

>+    nsCOMPtr<nsILoadInfo> loadInfo = mChannel->GetLoadInfo();

That actually looks wrong to me now that I read it more carefully.  Shouldn't we be getting the loadinfo off aRequest?  Seems to me that the current code actually does the wrong thing for redirects....

In any case, mChannel can be null if we went through nsPrefetchNode::CancelChannel.  So dereferencing it here without at least a null-check is wrong.

Samael, do you have time to check the redirect situation and fix as needed?
Flags: needinfo?(bzbarsky)
Created attachment 8779972 [details] [diff] [review]
Get loadinfo from aRequest instead of mChannel
Assignee: nobody → sawang
Status: NEW → ASSIGNED
Comment on attachment 8779972 [details] [diff] [review]
Get loadinfo from aRequest instead of mChannel

My bad :(

I added a simple function in the test script which should be able to reproduce the crash, along with the fix. Would you take a look?
Flags: needinfo?(sawang)
Attachment #8779972 - Flags: review?(bzbarsky)
Component: Networking → DOM
Duplicate of this bug: 1294467
Crash Signature: [@ nsPrefetchNode::OnStartRequest]
Comment on attachment 8779972 [details] [diff] [review]
Get loadinfo from aRequest instead of mChannel

>+  // Not acutally verifying any value, just to ensure cancelPrefetch

s/acutally/actually/

I would like us to add a correctness test for the actual tainting behavior as well.

I _think_ the following test would have failed with the old code: start a same-origin prefetch without CORS, have the server redirect to a different-origin URL which then returns a 404.  I _think_ that would have ended up with LoadTainting::Basic on mChannel->GetLoadInfo() and hence mShouldFireLoadEvent would have ended up false and we would have (incorrectly) gotten an error event.  In this situation we should fire a load event, not an error event.

r=me with such a test added and verification that it fails without this patch and passes with this patch.

If the test passes without this patch, please let me know and I'l try to figure out a test that does fail.
Attachment #8779972 - Flags: review?(bzbarsky) → review+
Duplicate of this bug: 1294449
Summary: Crash in nsPrefetchService in Fennec → Crash in nsPrefetchService in Fennec [@ nsIChannel::GetLoadInfo]
Moving tracking flag over from Bug 1294449.
tracking-firefox51: --- → +
(In reply to Boris Zbarsky [:bz] (Vacation until Aug 21) from comment #7)
> Comment on attachment 8779972 [details] [diff] [review]
> Get loadinfo from aRequest instead of mChannel
> 
> r=me with such a test added and verification that it fails without this
> patch and passes with this patch.

Yes this test works. I'm merging it back to bug 1268962 since it's already backed out.
This is not only Fennec but also at least Firefox. Updating the summary and flags accordingly.
Severity: normal → critical
Keywords: crash, topcrash
OS: Android → All
Hardware: ARM → All
Summary: Crash in nsPrefetchService in Fennec [@ nsIChannel::GetLoadInfo] → Crash in nsPrefetchService [@ nsIChannel::GetLoadInfo]
Crash Signature: [@ nsPrefetchNode::OnStartRequest] → [@ nsIChannel::GetLoadInfo] [@ nsPrefetchNode::OnStartRequest]
status-firefox50: --- → unaffected
Keywords: regression, reproducible
> Yes this test works.

In that it fails without the patch in this bug but passes with this patch?
Flags: needinfo?(sawang)
Crash volume for signature 'nsPrefetchNode::OnStartRequest':
 - Ni.  (version 51): 338 crashes from 2016-08-01.
 - Aur. (version 50): 0 crashes from 2016-08-01.
 - Beta (version 49): 0 crashes from 2016-08-02.
 - Rel. (version 48): 0 crashes from 2016-07-25.
 - Esr  (version 45): 0 crashes from 2016-04-20.

Crash volume on the last weeks (Week N is from 08-08 to 08-14):
              W. N-1
 - Ni.  (51)       0

Affected platforms: Windows, Mac OS X, Linux

Crash volume for signature 'nsIChannel::GetLoadInfo':
 - Ni.  (version 51): 6942 crashes from 2016-08-01.
 - Aur. (version 50): 0 crashes from 2016-08-01.
 - Beta (version 49): 0 crashes from 2016-08-02.
 - Rel. (version 48): 1 crash from 2016-07-25.
 - Esr  (version 45): 1 crash from 2016-04-20.

Crash volume on the last weeks (Week N is from 08-08 to 08-14):
              W. N-1  W. N-2  W. N-3  W. N-4  W. N-5  W. N-6  W. N-7
 - Ni.  (51)       0
 - Rel. (48)       1       0
 - Esr  (45)       0       0       0       1       0       0       0

Affected platform: Windows

Comment 16

2 years ago
16 automation job failures were associated with this bug yesterday.

Repository breakdown:
* autoland: 16

Platform breakdown:
* linux64: 9
* windows8-64: 4
* windows7-32: 3

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1294159&startday=2016-08-12&endday=2016-08-12&tree=all

Comment 18

2 years ago
79 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* autoland: 79

Platform breakdown:
* linux64: 34
* windows7-32: 24
* windows8-64: 21

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1294159&startday=2016-08-08&endday=2016-08-14&tree=all
(In reply to Boris Zbarsky [:bz] (Vacation until Aug 21) from comment #14)
> > Yes this test works.
> 
> In that it fails without the patch in this bug but passes with this patch?


Yes it is. The redirection case with either no-cache or 404 cause error events before patch, and generate load events after patch. I added both tests.
Flags: needinfo?(sawang)
Since the patch has been merged and checked in thro bug 1268962. I'll mark this fixed. Let me know if the issue still exists.
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
Fixed per
>(Wes Kocher (:KWierso) from bug #1268962 comment #53)
> https://hg.mozilla.org/mozilla-central/rev/0ec6e52818ae
status-firefox51: affected → fixed
Whiteboard: [necko-active] → [necko-active] [fixed by patch from bug #1268962]
You need to log in before you can comment on or make changes to this bug.