Closed Bug 194402 Opened 23 years ago Closed 23 years ago

publishing doesn't complete; opens multiple FTP control connetctions

Categories

(Core Graveyard :: Networking: FTP, defect, P1)

Product:

Component:

Type:

defect

Priority:

P1

Severity:

critical

Tracking

(Not tracked)

Status:

VERIFIED FIXED

Milestone:

mozilla1.4beta

People

(Reporter: chrispetersen, Assigned: darin.moz)

References

(
URL
)

Details

(Keywords: topembed-)

Attachments

(4 files, 1 obsolete file)

Stack trace from crash reporter 23 years ago Chris Petersen 5.90 KB, text/plain		Details
Talkback info under Linux from Sarah L 23 years ago Chris Petersen 1.26 KB, text/plain		Details
A reduced test case of apple's site 23 years ago Chris Petersen 4.32 KB, text/html		Details
v1 patch 23 years ago Darin Fisher 52.34 KB, patch		Details \| Diff \| Splinter Review
v1.1 patch 23 years ago Darin Fisher 70.35 KB, patch	dougt : review+ sfraser_bugs : superreview+	Details \| Diff \| Splinter Review

Reporter

Description

•

23 years ago

Build: 2003-02-20-03(OS X Mach-0), 2003-02-20-04 (Win32) Platform: All Expected Results: The publishing process should finish the transfer of image amd html data. What I got: During the transfer of images to the server, the Composer application crashes. Steps to reproduce: 1) Go to www.apple.com 2) Select Edit page from File menu 3) In Composer, select Publish as 4) In the Settings tab, type a ftp address in the Publishing address field 5) Type a valid user name and password 6) In the Publish tab, type file name with a .html. Click Publish. 7) The file transfer should begin. After a dozen or so images have been transfered, the application crashes. Note: This doesn't happen at every site. I can't reproduce , for example, with google.com or a simple HTML page created in Composer. It could be related to number on elements on the page perhaps.

Reporter

Comment 1

•

23 years ago

Also, this problem occurs in the 7.02 (OS X and Linux) builds.

Reporter

Comment 2

•

23 years ago

Attached file Stack trace from crash reporter — Details

Reporter

Comment 3

•

23 years ago

Attached file Talkback info under Linux from Sarah L — Details

sairuh (rarely reading bugmail)

Comment 4

•

23 years ago

brade suggests assigning to darin...

Assignee: dougt → darin

Severity: normal → critical

Keywords: crash

Assignee

Comment 5

•

23 years ago

#0 0x00000000 in 0x0 #1 0x0087a0c8 in nsSocketTransportService::RememberHost(nsACString const&, unsigned short, PRIPv6Addr*) ooh... this is a duplicate of bug 192207. *** This bug has been marked as a duplicate of 192207 ***

Status: NEW → RESOLVED

Closed: 23 years ago

Resolution: --- → DUPLICATE

Reporter

Comment 6

•

23 years ago

Attached file A reduced test case of apple's site — Details

Attempting to publish this document results in a crash.

Assignee

Comment 7

•

23 years ago

interesting, publishing this page seems to cause us to create an endless number of FTP connections. as a result we hit the hard limit on the number of allowed socket connections. bug 192207 is about the fact that we crash when we reach that hard limit. with my patch for bug 192207 in my tree, we no longer crash. however, publishing still fails. i'm going to reopen this bug to track the publishing failure. i'm removing the crash keyword and updating the summary to reflect the real problem.

Severity: critical → major

Status: RESOLVED → REOPENED

Depends on: 192207

Keywords: crash

Priority: -- → P2

Resolution: DUPLICATE → ---

Summary: While publishing to a ftp server from Composer, the application suddenly crashes → publishing doesn't complete; keeps opening new FTP connections

Target Milestone: --- → mozilla1.3final

Assignee

Comment 8

•

23 years ago

ok, the problem is that we don't do anything to gate or limit the number of simultaneous FTP connections. netstat indicates that a bunch of control and data connections have been opened. it looks like the old socket transport service had some code to make things continue to work even when the hard limit on the number of sockets is reached, but there was a big XXX comment indicating that this condition was not fully implemented. perhaps it just barely worked before. ultimately, i think we need the socket transport to recover gracefully when the hard limit on the number of sockets is reached, and maybe we need FTP to limit how many sockets it tries to open.

Bradley Baetz (:bbaetz)

Comment 9

•

23 years ago

ftp should be being limited by the per-host limit in teh socket transport service, shouldn't it?

Doug Turner (:dougt)

Comment 10

•

23 years ago

both. we should limit the control to 4-5 per site and a total number of controls to 12-15. In prefs, of course.

Kathleen :Brade

Comment 11

•

23 years ago

I'm grasping at straws here... What I don't fully understand at the moment is that the persist code is supposed to be attempting to serialize publishing of multiple files. Composer appears to be setting the flag for non-local files that are published and I don't see any changes in the past few weeks in the persist code in that vicinity. Was it working based on some old assumption for how necko worked? Long ago, I did setup my own (linux) ftp server and limit the number of connections that could be made to 1 and I was able to successfully publish an html file with several images. Why are the connections not closing? Perhaps I have the pref set for immediate timeout?

Assignee

Comment 12

•

23 years ago

-> 1.4

Status: REOPENED → ASSIGNED

Target Milestone: mozilla1.3final → mozilla1.4alpha

Comment 13

•

23 years ago

brade: I'm going through the current ftp bugs, and I don't see any reports of excessive ftp control connections from downloading, so perhaps it has only to do with uploading? If you use different functions than downloading, I could see how the connection management logic might break somehow. Any idea when this regressed?

Summary: publishing doesn't complete; keeps opening new FTP connections → publishing doesn't complete; opens multiple FTP control connetctions

Assignee

Updated

•

23 years ago

Keywords: nsbeta1, topembed

Priority: P2 → P1

Michael Buckland

Comment 14

•

23 years ago

Discussed in edt. Minusing. Not a currrent concern for our embedding customers.

Keywords: topembed → topembed-

Assignee

Updated

•

23 years ago

Severity: major → critical

Target Milestone: mozilla1.4alpha → mozilla1.4beta

Assignee

Comment 15

•

23 years ago

Attached patch v1 patch (obsolete) — Details — Splinter Review

this patch cleans up a lot of stuff. 1- socket transport now asks the socket transport service (STS) if it is ok to "attach a new socket". if the STS says yes, then the socket transport goes ahead and does its normal stuff. else, the socket transport asks the STS to notify it when it can attach a new socket. this allows us to limit the number of sockets allocated by mozilla, which is a requirement on some platforms. 2- now, to play nicely with this, FTP also has to limit the number of control connections it caches. so, i made nsFtpProtocolHandler::mRootConnectionList hold no more than 8 connections. the list is ordered by age, so i just trim from the head of the list. well, actually... i look for the eldest connection to the same host and trim that first. if no matches, then i trim the eldest overall. 3- finally, when publishing we don't close the data connection until the application shuts down. i fixed this by ensuring that mDPipe is closed when the channel completes. so, just like we have the channel telling the state "DataConnectionEstablished" ... we now have the channel telling the state "DataConnectionComplete". i tried to avoid this explicit function call, but i couldn't come up with anything nearly as clean and simple. 4- finally, i touched up a few things in the async stream copier... mainly cosmetic.

Assignee

Updated

•

23 years ago

Attachment #119315 - Flags: review?(dougt)

Doug Turner (:dougt)

Comment 16

•

23 years ago

Comment on attachment 119315 [details] [diff] [review] v1 patch I don't understand this comment before or after this change. Maybe we can explain why we are cheating? - // we'll cheat a little bit here since we know that NS_AsyncCopy does pass - // an event queue. + // we'll cheat a little bit here since we know that NS_AsyncCopy does not + // pass an event queue. I don't see the nsISocketEventHandler interface file. You may want to reorder these so that the more common interface is the first in the list: +NS_IMPL_THREADSAFE_ISUPPORTS4(nsSocketTransport, + nsISocketEventHandler, nsISocketTransport, nsITransport, nsIDNSListener) don't you have a race in InitiateSocket? Suppose the limit is reached on the number of sockets created. A new request comes in, and you return with |notifyWhenCatAttachSocket|. Sometime later a socket becomes available and before the MSG_RETRY_INIT_SOCKET event is handled (but is already has been posted), a different socket calls CanAttachSocket() Jud made me to it, i swear! : - (void) mFTPState->DataConnectionEstablished(); + mFTPState->DataConnectionEstablished(); I am worried about cycles with the new ftp handler global. Can you make that we do clean up properly. Extra points for making any references to "ftp" consistent wrt case. :-)

Assignee

Comment 17

•

23 years ago

>I don't understand this comment before or after this change. Maybe we can >explain why we are cheating? because we ignore the "eventQ" parameter (assuming it will always be null). i'll add a comment to that effect. >I don't see the nsISocketEventHandler interface file. it is included at the bottom of nsISocketTransportService.idl >You may want to reorder these so that the more common interface is the first in >the list: i chose that ordering based on the fact that nsISocketEventHandler would probably be the most commonly QI'd interface (given that it is part of the event dispatch mechanism for the socket thread). >don't you have a race in InitiateSocket? yes, and that is ok. it is by design. to avoid that race would have required re-entrant callbacks. i would prefer to avoid that. since each call to InitiateSocket checks CanAttachSocket, everything will be fine. sure, an older socket request will sometimes lose out, but i don't think that will be a big issue. i'll document the code. as for DataConnectionEstablished, it returns |void| now =) >I am worried about cycles with the new ftp handler global. nsFtpProtocolHandler owns no references to the nsFtpState, so i think this is safe. plus, my logs confirm that each nsFtpState is destroyed, so i think this should be safe. extra points... no thanks! ;-)

Mark Smith [:mcs]

Comment 18

•

23 years ago

It sounds like this patch will also fix bug 193908 (because FTP data connections will now be closed in a timely fashion). Darin, do you agree?

Doug Turner (:dougt)

Comment 19

•

23 years ago

fix the comment to be more verbose. You may want to document the fact that there is a race condition and mention this bug.

Assignee

Comment 20

•

23 years ago

mark: yeah, it should fix that too.

Blocks: 193908

Assignee

Comment 21

•

23 years ago

Attached patch v1.1 patch — Details — Splinter Review

Attachment #119315 - Attachment is obsolete: true

Assignee

Updated

•

23 years ago

Attachment #119315 - Flags: review?(dougt)

Assignee

Updated

•

23 years ago

Attachment #119341 - Flags: superreview?(bzbarsky)

Attachment #119341 - Flags: review?(dougt)

Doug Turner (:dougt)

Updated

•

23 years ago

Attachment #119341 - Flags: review?(dougt) → review+

Assignee

Updated

•

23 years ago

Attachment #119341 - Flags: superreview?(bzbarsky) → superreview?(alecf)

Simon Fraser [no longer active]

Updated

•

23 years ago

Attachment #119341 - Flags: superreview?(alecf) → superreview+

Assignee

Comment 22

•

23 years ago

fixed-on-trunk

Status: ASSIGNED → RESOLVED

Closed: 23 years ago → 23 years ago

Resolution: --- → FIXED

Comment 23

•

23 years ago

Talking to brade in #composer yesterday, there was a thought that perhaps this patch would fix the problem with publishing to a webdav server. We've tested this here at OEone, and it doesn't fix the problems I reported in bug 201210. Publishing to a webdav server is still broken with patch 1.1 applied.

Assignee

Comment 24

•

23 years ago

right, this patch has nothing to do with HTTP.

Reporter

Comment 25

•

23 years ago

Ok, using the reduced test case this problem is no longer occuring. In previous builds, using this test case would reproduce the crash. Verified on the 2003-04-10-08 Macho and 2003-04-10-04 Win32 trunk builds.

Status: RESOLVED → VERIFIED

Updated

•

23 years ago

Depends on: 201663

Updated

•

2 years ago

Product: Core → Core Graveyard

You need to log in before you can comment on or make changes to this bug.