# Delay updating nsPrefetchNode's reference to redirect channel until we know redirect is approved

RESOLVED FIXED in mozilla17

Core
Networking: HTTP
--
critical
RESOLVED FIXED
5 years ago
5 years ago

## Tracking

### (Blocks: 1 bug, {crash})

Trunk
mozilla17
crash
Points:
---
Dependency tree / graph

## Attachments

### (2 attachments, 1 obsolete attachment)

 61.56 KB, text/plain Details 5.32 KB, patch jduell : review+ mayhemer : checkin+ Details | Diff | Splinter Review
(Reporter)

### Description

5 years ago
Created attachment 634507 [details]
crash report

1. http://www.universalstudioshollywood.com/attractions/transformers-the-ride-3d/
2. ABORT twice

ABORT: You can't dereference a NULL nsCOMPtr with operator->().: 'mRawPtr != 0', file c:\work\mozilla\builds\beta\mozilla\firefox-debug\dist\include\nsCOMPtr.h, line 809

0  KERNELBASE.dll + 0x33e2e
eip = 0x75653e2e   esp = 0x0031ce40   ebp = 0x0031ce44   ebx = 0x00000001
esi = 0x0913e688   edi = 0x08d35228   eax = 0x00000000   ecx = 0x00000001
edx = 0x00000000   efl = 0x00200202
Found by: given as instruction pointer in context
1  xul.dll!NS_DebugBreak_P [nsDebugImpl.cpp : 369 + 0x4]
eip = 0x68bc6af1   esp = 0x0031ce4c   ebp = 0x0031d264
Found by: previous frame's frame pointer
2  xul.dll!nsCOMPtr<nsIChannel>::operator->() [nsCOMPtr.h : 809 + 0x22]
eip = 0x676afc4a   esp = 0x0031d26c   ebp = 0x0031d284
Found by: call frame info
3  xul.dll!nsPrefetchNode::OnStopRequest(nsIRequest *,nsISupports *,unsigned int) [nsPrefetchService.cpp : 338 + 0xa]
eip = 0x685878cb   esp = 0x0031d28c   ebp = 0x0031d290
Found by: call frame info
4  xul.dll!nsStreamListenerTee::OnStopRequest(nsIRequest *,nsISupports *,unsigned int) [nsStreamListenerTee.cpp : 82 + 0x27]
eip = 0x676effcb   esp = 0x0031d298   ebp = 0x0031d2bc
Found by: call frame info
5  xul.dll!nsHttpChannel::OnStopRequest(nsIRequest *,nsISupports *,unsigned int) [nsHttpChannel.cpp : 4495 + 0x55]
eip = 0x677ad3e7   esp = 0x0031d2c4   ebp = 0x0031d35c
Found by: call frame info

Linux, Windows; Beta/14, Aurora/15, Nightly/16 Debug only. Doesn't crash opt builds.

I can crash reliably the first time with a fresh profile, but not reliably after that. Try start the browser or use a fresh profile if you can't reproduce.

### Comment 1

5 years ago
#9  0x00007fc903b3acdb in nsPrefetchNode::OnStopRequest (this=0x7fc8def6a3c0, aRequest=0x7fc8ddbfd058, aContext=0x0, aStatus=0) at ../../../spdy/uriloader/prefetch/nsPrefetchService.cpp:306
(gdb) print mChannel
\$1 = {
mRawPtr = 0x0
}

I've never read this code before, but it seems pretty clear to me the caching channel has been canceled and reset to null.

I put in a printf to confirm that and reproduced with it.

### Comment 2

5 years ago
Created attachment 634597 [details] [diff] [review]
patch 0

Boris, I'm not very familiar with this code - but the null check appears to be sufficient if the channel has been canceled.
Attachment #634597 - Flags: review?(bzbarsky)

### Comment 3

5 years ago
I have chased this crash many times before from crash stats (bug 598790) and never been able to come up with a situation in which we would have a success status but null channel. Got any theories?

5 years ago
Blocks: 598790

### Comment 4

5 years ago
Yeah, that that was my question.  How can we possibly end up with mChannel null but mStatus NS_OK?

Patrick, if you can reproduce this, would you mind adding printfs in nsPrefetchNode::CancelChannel and in the HTTP channel Cancel() code listing the addresses of the two objects and the nsresult value being used?

### Comment 5

5 years ago
It involves a redirect. Lovely. The new channel in accepted and the acceptance is posted async to the old channel. The new channel is then canceled. Then the old channel receives the posted onredirectverifycallback(). Then the old channel executes OnStopRequest(NS_OK because it hasn't been canceled). Off the top of my head I would have expected that OnStopRequest to come from the new channel, but maybe I'm wrong on that.

jason is the redir guru.

Here's the slime trail:

nsprefetchnode::AsyncOnChannelRedirect this=0x7f65f3b78e80 mChannel=0x7f65f25dc858 oldChannel=0x7f65f25dc858 newChannel=0x7f65f22c0058

nsprefetchnode::cancelchannel this=0x7f65f3b78e80 mChannel=0x7f65f22c0058 code=804b0002 (binding aborted)

nsHttpChannel::Cancel [this=0x7f65f22c0000 status=804b0002]

nsHttpChannel::OnRedirectVerifyCallback [this=0x7f65f25dc800] result=0 stack=2 mWaitingForRedirectCallback=1

nsHttpChannel::OnStopRequest [this=0x7f65f25dc800 request=0x7f65f384a200 status=0 mCanceled=0 mStatus=0]

I can repro this fairly easily with bob's url and a fresh profile.. about 1 in 5 times.. happens immediately (or not).

### Comment 6

5 years ago
> Off the top of my head I would have expected that OnStopRequest to come from the new
> channel

That's how it should work, I would think, yes.

### Comment 7

5 years ago
(In reply to Boris Zbarsky (:bz) from comment #6)
> > Off the top of my head I would have expected that OnStopRequest to come from the new
> > channel
>
> That's how it should work, I would think, yes.

maybe the fact that the new channel was canceled before the OnRedirectVerifyCallback was processed made that verification fail (even though it was sent as OK). Hopefully Jason can shed light.

### Comment 8

5 years ago
I'll take a look at this, but probably not before the end of the month.

Every time I wade into the redirect code my brain hurts :)

### Comment 9

5 years ago
Comment on attachment 634597 [details] [diff] [review]
patch 0

I really don't think this is the right fix...
Attachment #634597 - Flags: review?(bzbarsky) → review-
(Assignee)

### Comment 10

5 years ago
(In reply to Jason Duell (:jduell) from comment #8)
> Every time I wade into the redirect code my brain hurts :)

I know it well, so I could take a look at this.  But as well, probably not sooner then the next month.
(Assignee)

### Updated

5 years ago
Assignee: nobody → honzab.moz
(Assignee)

### Comment 11

5 years ago
Do I need flash installed to reproduce?
(Reporter)

### Comment 12

5 years ago
It is hard to say. The abort is so random it is hard to say if it only appears with Flash. I just got inconsistent results with beta and nightly on win7. I'll try again later.
(Assignee)

### Comment 13

5 years ago
Flash is not needed.  I managed to reproduce by moving code of nsPrefetchNode::CancelChannel to the tail of nsPrefetchNode::AsyncOnChannelRedirect.

Canceling the target (new) channel makes it fail to AsyncOpen.  Then we cannot process the redirect and fall back to processing the redirect request normally.  Hence OnStopRequest comes from the old channel and status is OK.
(Assignee)

### Comment 14

5 years ago
Created attachment 641213 [details] [diff] [review]
v1

I am using nsIRedirectResultListener to switch mChannel.  That callback is called after the redirect is finally going to happen or not.  When it tells the prefetch node redirect is happening, it switches its mChannel to the target (new) channel.

This way, when CancelChannel is called after AsyncOnChannelRedirect and before the redirect is actually going to happen, the original channel is canceled and therefor will carry the failure state to OnStopRequest correctly.

Bob, could you please test this patch?  Try: https://tbpl.mozilla.org/?tree=Try&rev=4568a6fd9ffa

As I see this patch now, I believe there are other places in the code that changes the "working" channel of a class in AsyncOnChannelRedirect to something that later may not be the true channel the notifications come from.  I think we should check these places and turn them to using OnRedirectResult.  OnRedirectResult signature should change to pass also the channel that is going to make the load as an arg to simplify the code.

There is an extension https://mxr.mozilla.org/addons/search?find=%2F&string=nsIRedirectResultListener that use this method, but only for logging.
Attachment #634597 - Attachment is obsolete: true
Attachment #641213 - Flags: review?(bzbarsky)
(Assignee)

### Updated

5 years ago
Status: NEW → ASSIGNED
(Reporter)

### Comment 15

5 years ago
I used the try build on windows 7 to load the page many times over night and did not experience the abort.

### Comment 16

5 years ago
Comment on attachment 641213 [details] [diff] [review]
v1

stealing (feel free to chime in too, bz)
Attachment #641213 - Flags: review?(bzbarsky) → review?(jduell.mcbugs)

### Comment 17

5 years ago
Comment on attachment 641213 [details] [diff] [review]
v1

Review of attachment 641213 [details] [diff] [review]:
-----------------------------------------------------------------

As discussed on IRC I think it would also be helpful to revert bug 584863, so we no longer have the semantics that cancelling a redirected-to channel also vetoes the redirect iff AsyncOpen hasn't been called on it yet.   But Honza has convinced me on IRC that the fix here is sufficient, and the bug 584863 reversion too risky to do before the next merge.  I've filed bug 773475 for that. (if it turns out this patch is buggy for any reason, I think that patch will also fix this crash, by itself, FWIW).

nsPrefetchNode::CancelChannel(nsresult error)
{
-    mChannel->Cancel(error);
-    mChannel = nsnull;
+    if (mChannel) {
+        mChannel->Cancel(error);
+        mChannel = nsnull;
+    }

While debugging with Honza's suggested extra CancelChannel inserted into AsyncOnChannelRedirect, I wound up hitting cancel twice, which blows up.  So I suggest the above addition to be safe, even though it's not clear it would ever happen in practice.

@@ +335,1 @@
>      return NS_ERROR_NO_INTERFACE;

Is there some reason why nsPrefetchNode isn't using the standard QI stuff here?  I.e why it fails instead of QIs to nsIRequestObserver, etc.

Not important for +r, just curious.
Attachment #641213 - Flags: review?(jduell.mcbugs) → review+
(Assignee)

### Comment 18

5 years ago
(In reply to Jason Duell (:jduell) from comment #17)

Thanks for the review.

>   nsPrefetchNode::CancelChannel(nsresult error)
>  {
> -    mChannel->Cancel(error);
> -    mChannel = nsnull;
> +    if (mChannel) {
> +        mChannel->Cancel(error);
> +        mChannel = nsnull;
> +    }
>
> While debugging with Honza's suggested extra CancelChannel inserted into
> AsyncOnChannelRedirect, I wound up hitting cancel twice, which blows up.  So
> I suggest the above addition to be safe, even though it's not clear it would
> ever happen in practice.

Hmmm... I think this is not necessary since we have never crashed here, AFAIK - crash stats are dead...

### Comment 19

5 years ago
> I think this is not necessary since we have never crashed here

I don't see the harm in fixing the code now in case it ever happens.  But either way is OK I guess.

Filed bug 773481 for fixing other classes that need the same OnRedirectResult fix we're doing here.
Summary: ABORT: You can't dereference a NULL nsCOMPtr with operator->().: 'mRawPtr != 0' [@ !nsPrefetchNode::OnStopRequest] → Delay updating nsPrefetchNode's reference to redirect channel until we know redirect is approved
(Assignee)

### Comment 20

5 years ago
Comment on attachment 641213 [details] [diff] [review]
v1

https://hg.mozilla.org/integration/mozilla-inbound/rev/c55c42ea076e
Attachment #641213 - Flags: checkin+
(Assignee)

### Comment 21

5 years ago
https://tbpl.mozilla.org/?tree=Try&rev=6c65467433ca

### Comment 22

5 years ago
https://hg.mozilla.org/mozilla-central/rev/c55c42ea076e
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla17
You need to log in before you can comment on or make changes to this bug.