Crash when opening ftp location Trunk [@ nsIndexedToHTML::OnDataAvailable][@ nsIndexedToHTML::OnStopRequest]

VERIFIED FIXED in mozilla0.9.9

Status

()

P1
critical
VERIFIED FIXED
17 years ago
17 years ago

People

(Reporter: kleist, Assigned: bbaetz)

Tracking

({crash, regression, topcrash+})

Trunk
mozilla0.9.9
crash, regression, topcrash+
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [fix-in-hand], crash signature, URL)

Attachments

(2 attachments, 1 obsolete attachment)

2.51 KB, text/plain
Details
v2
2.01 KB, patch
dougt
: review+
darin.moz
: superreview+
dbaron
: approval+
Details | Diff | Splinter Review
(Reporter)

Description

17 years ago
Build ID: 2002 02 20 03. Windows 2000.

Incident ID: TB3190449K. I have reproduced the crash several times.

Comment 1

17 years ago
Confirming, 2002022003/Win98 -> TB3191961W

BTW: Reporter, always add keyword crash for bugs like this =)) 
Keywords: crash
confirming with win2k and a 2h old CVS build..

Reporter: 
Please select the correct component the next time, this one is easy.. :-)

My debug doesn't crash with FTP logging enabled.
Assignee: new-network-bugs → bbaetz
Status: UNCONFIRMED → NEW
Component: Networking → Networking: FTP
Ever confirmed: true
(Assignee)

Comment 3

17 years ago
Ooh, this is evil. Whats happening is that the data channel by the server is
being closed before we get the 550 for the RETR. This means that mRetrying is
false, and so we tell our listener that there was no data. Then when we get the
real data, we don't send the OnStart, so the stream converter isn't initialised.

I think I can work arround the crash by resetting mDelayedOnStartFired (ie the
same situation we would have got before I made us send onstart when we got no
data), but you'll probably still have the side effects of our streamlistener
being called twice. I'm really not sure how to fix that. Darin, if I call
Suspend() on my sokcet transport request, that will stop onstart, right? I could
then either Cancel() or Resume() it once I get the results from the control
connection, can't I? I'm not sure if that matters, though.
Status: NEW → ASSIGNED
Keywords: regression
Target Milestone: --- → mozilla0.9.9
(Assignee)

Comment 4

17 years ago
Created attachment 70777 [details] [diff] [review]
patch

OK, suspend and resume do work for this. I'll added the call to cancel in there
because I was worried that we wouldn't get the onstop when theother side closed
it. I could just call resume, but ut should be there anyway, I think, although
it doesn't solve the problem that block of code was added to prevent, which is
a much deeper issue.

Anyway, this fixes the crash.

Comment 5

17 years ago
there might be a race condition with this, since calling Suspend/Resume/Cancel
from the UI thread on a socket transport (which is running on the socket
transport thread) may not prevent an event already in the UI threads event queue
from being processed.  so, you may need to be prepared to handle
nsIStreamListener events even though you've suspended the socket transport request.
(Assignee)

Comment 6

17 years ago
The crash is prevented by just resettingg the variable, so I guess that the
worst that happens if this fails is that "Document: done" is dislpayed too
early, and the entry isn't in the cache.

Alos, I suspend as soon as I create it, but before I kick off the transfer, so
there shouldn't be any data in the pipe. Suspend is synchronous wrt new
notifications, isn't it?
Comment on attachment 70777 [details] [diff] [review]
patch

I have reviewed this, but I believe that it need much baking.
(Assignee)

Comment 8

17 years ago
tever is going to run the ftp regression test. Otherwise I'll do the one liner
patch for 0.9.9, which will fix the crash, but leave things like send page
broken, and won't use the cache. Thats better than crashing, though...

Updated

17 years ago
OS: Windows 2000 → All
Hardware: PC → All
(Assignee)

Comment 9

17 years ago
*** Bug 127587 has been marked as a duplicate of this bug. ***
(Assignee)

Comment 10

17 years ago
tever is seeing some problems with saving files, but I suspect that they are
related to the other save as bugs we have - I don't have a problem, with or
without my patch.

Can someone else try out the patch? Otherwise I'll go with the one liner crash
fix for 0.9.9, and get the real one in for 1.0.
(Assignee)

Comment 11

17 years ago
I'll attach the alternative one liner, but I really do think that we should get
the real fix in for 0.9.9.
Keywords: patch, review
Priority: -- → P1
Whiteboard: [fix-in-hand]
(Assignee)

Comment 12

17 years ago
Actually, the one liner still causes crashes on ftp.mutt.org (akk's test case).
The full patch stops the crash there as well, although I also did manage to see
the hanging download, but thats not a regression - compare to 0.9.8. Thats
because of the pasv mess. I hope to have that sorted out for 1.0.

Tom, are you seeing any hangs which you don't see in a nightly without my patch?

Comment 13

17 years ago
This one is seeing enough external crashes to make the Trunk topcrash lists.
Updating summary and keywrod info.
Keywords: topcrash
Summary: Crash when opening ftp location → Crash when opening ftp location Trunk [@ nsIndexedToHTML::OnDataAvailable]

Comment 14

17 years ago
Created attachment 71308 [details]
Call Stacks

The Trunk topcrash lists are also showing crashes ending at the signature:
nsIndexedToHTML::OnStopRequest. (Attaching both stacks) Wondering if that crash
will be taken care of by the fix for this bug. bbaetz, any idea if your patch
wil also fix this bug? If not, let me know and I will file a separate bug
report. Thanks.
*** Bug 127665 has been marked as a duplicate of this bug. ***
(Assignee)

Comment 16

17 years ago
Yes, thats the same crash - you'll get the OnStop if the directory was empty.

Updated

17 years ago
Summary: Crash when opening ftp location Trunk [@ nsIndexedToHTML::OnDataAvailable] → Crash when opening ftp location Trunk [@ nsIndexedToHTML::OnDataAvailable][@ nsIndexedToHTML::OnStopRequest]

Comment 17

17 years ago
Comment on attachment 70777 [details] [diff] [review]
patch

sr=darin
Attachment #70777 - Flags: superreview+
nominating for nsbeta1 since it is a topcrash bug. Marking it as nsbeta1+ since it 
has patch. 
Comment on attachment 70777 [details] [diff] [review]
patch

r=dougt
Attachment #70777 - Flags: review+
I marked the bug r=dougt with the knowledge that tever reported problems which
sound unreleated and minor compared to the crash.  

Tever - please confirm this.
(Assignee)

Comment 22

17 years ago
Checked in.
Status: ASSIGNED → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → FIXED
(Assignee)

Comment 23

17 years ago
ftp.netscape.com/Welcome sees this hang, and thats one of the tbox test urls.
Sleestack backed this out - reopening.

I think I know what the problem was, but won't get to it before tomorrow.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 24

17 years ago
OK, I kow what was wrong - I wasn't calling ->Resume in the RETR case. I'll
attach a patch which works for me.

However, theres another problem - why did my patch work at all? I tested
downloading a large file from localhost, and several small files. I should not
have got any data at all, ever, but I did. If I call Suspend() before any data
is received/sent, will I still get the first ODA? Itspossible that my file could
have been in one chunk over localhost, I guess, but a 1 meg chunk seems a bit
large to me. I definately appear to be avoiding the onstop, which is why this
fixes the crash.

Comments?
Status: REOPENED → ASSIGNED
(Assignee)

Comment 25

17 years ago
Created attachment 71652 [details] [diff] [review]
v2
Attachment #70777 - Attachment is obsolete: true
I am checkin out that last patch.  so far looks good.

Comment 27

17 years ago
Doug, yeah there were a few problems with it.  I'd like to do some more testing
on the new patch.  Can you get a test build to me so I can take a look at it.
*** Bug 128657 has been marked as a duplicate of this bug. ***
Tever, *what* are the problems.  List them please.  
*** Bug 128771 has been marked as a duplicate of this bug. ***

Comment 31

17 years ago
Details caught by Bugtoaster using Windows XP build 2002030208:
  Program: mozilla.exe (no version)
  Company: not provided
   Module: necko.dll (no version)
  Company: not provided
 Incident: {EB1BA655-4B4F-4BB1-A622-56FD910D22A1}
Exception:
          Exception Code: C0000005
         Virtual Address: 608FBF06
  Image Relative Address: 0001BF06

EAX=00000000 EBX=02770564 ECX=00000000 EDX=00000000
ESI=80000000 EDI=02770558
 DS=00000023  ES=00000023  FS=00000038
 SS=00000023 ESP=00000007 EBP=FFFFFFFF
 CS=0000001B EIP=00000000            EFlag=00000246
@EIP=????
@ESP=????

Comment 32

17 years ago
bbaetz: it would be nice if you could add some explanations in the code as to
why you need to suspend + resume mDPipeRequest.

Comment 33

17 years ago
Actually there was only one problem - I could not download at all.  Clicking a
file on an ftp site or typing it in the url bar would result in a hang.    
(Assignee)

Comment 34

17 years ago
tever: Can you still reproduce that with my second patch?I think darin or dougt
were going to help you with builds.

darin: OK.
(Assignee)

Comment 35

17 years ago
*** Bug 129206 has been marked as a duplicate of this bug. ***

Comment 36

17 years ago
Reproducible crash, marking as topcrash+
Keywords: topcrash → topcrash+

Comment 37

17 years ago
bbaetz:  This looks good to me on windows but I still need to test a linux
build.  I will take a look at it this afternoon.  I went to about 10 different
server types, did some misc downloads from a number of popular sites plus our
internal server.  I haven't crashed so far.

Comment 38

17 years ago
I did notice that I am getting an assert for every download with the message -
'Channel doesn't have a prompt!  'prompt',  file d: ... \nsFTPChannel.cpp line
543.'   Is that a problem?
(Assignee)

Comment 39

17 years ago
OK, thanks.

That assertion is normal - the save-as code doesn't set a prompter. There is a
bug on this, but I'm not at home now to check my mail archives for the #. I
think its assigned to law.

Comment 40

17 years ago
looks good on linux.   I am seeing a bit of a delay before the progress dialog
closes but I think this is because I am running mozilla over the network.
(Assignee)

Comment 41

17 years ago
The progress dialog is sometimes a bit slow closing.

so, if it works for everyone, can I get review. darin/dougt?

Comment 42

17 years ago
Comment on attachment 71652 [details] [diff] [review]
v2

sr=darin
Attachment #71652 - Flags: superreview+
Comment on attachment 71652 [details] [diff] [review]
v2

r=dougt.  been runing with it for a while.
Attachment #71652 - Flags: review+
a=asa (on behalf of drivers) for checkin to the 0.9.9 branch as well. Bradley,
can you get this into the branch quickly? We're coming down to the wire. Thanks.
(Assignee)

Comment 46

17 years ago
Checked in, trunk and branch
Status: ASSIGNED → RESOLVED
Last Resolved: 17 years ago17 years ago
Resolution: --- → FIXED

Comment 47

17 years ago
*** Bug 130330 has been marked as a duplicate of this bug. ***

Comment 48

17 years ago
tom: this bug was open while I was on sabatical.
Should it be marked VERIFIED?

Comment 49

17 years ago
yeah, the changes were checked in and it is working fine - verified
Status: RESOLVED → VERIFIED
Crash Signature: [@ nsIndexedToHTML::OnDataAvailable] [@ nsIndexedToHTML::OnStopRequest]
You need to log in before you can comment on or make changes to this bug.