Closed Bug 790184 Opened 12 years ago Closed 12 years ago

[SPDY] Firefox V15 & 15.0.1 fails with attachment upload on gmail with squid proxy

Categories

(Core :: Networking: HTTP, defect)

15 Branch
x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla18
Tracking Status
firefox15 --- affected
firefox16 + ---
firefox17 + fixed
firefox18 --- fixed

People

(Reporter: informatique, Assigned: mcmanus)

References

(Blocks 1 open bug)

Details

(Whiteboard: [spdy])

Attachments

(1 file, 2 obsolete files)

User Agent: Mozilla/5.0 (X11; Linux i686; rv:10.0.2) Gecko/20100101 Firefox/10.0.2 Iceweasel/10.0.2
Build ID: 20120217184109

Steps to reproduce:

Bonjour et merci de votre Fondation.
Depuis la mise à jour 15.0, 
connecté sur Gmail, j'essaye d'ajouter un fichier joint de 5Mo à un mail. 

Mon réseau utilise un proxy de type squid. 
Mon débit disponible en upload est de l'ordre de 300kbits/s .


Actual results:

Cela démarre très lentement et cela finit par un message "Échec de l'ajout de la pièce jointe". 
Différents essais de version et d'environnement:
J'essaye une piece jointe de 2Mo: cela fonctionne
Piece Jointe 5Mo , FF V15.0.1(windows XP SP3) , même erreur.
Piece Jointe 5Mo , FF V12 (windows XP SP3) : cela fonctionne
Piece Jointe 5Mo , FF V10 (Linux Debian Squeeze) : cela fonctionne.
Piece jointe 5Mo , Invalidant tous les plugins : cela ne change pas le comportement.

A signaler: sous IE8 on voit encore "fonctionnalité de base des pieces jointe" dans les parametres de gmail. On ne voit plus cette option dans Firefox.


Expected results:

La pièce jointe aurait dû se télécharger tranquillement.


Merci de vos efforts.

Bien amicalement
Jerome
Group: core-security
Hello Jerome !

I'm sorry but we usually English in bugzilla. I have translated your report using google translate but it makes the communication a little bit difficult if we use mixed languages. 

As a test, could you please open about:config (enter as URL in Firefox), confirm the warning, enter "spdy" in the top filter bar without the quotes and change the entry "network.http.spdy.enabled" to "false". Restart Firefox and try it again to upload the attachment. Does it work better ?

btw: I added Loic to this report. He should understand french and make it easier for you in case you don't understand english.
@Jerome: Vous devriez utiliser le forum français de support sur Geckozone (http://www.geckozone.org/forum/) et reposter ici s'il y a vraiment un bug dans Firefox 15.

Translation: he says he's using a network with a Squid proxy and FF15 fails to attach 5-MB attachment to an email on Gmail but works with 2-MB attachment.
I redirected him to the French support board anyway.
Hi Matthias and Loic ..

Many apologies! Sorry for french speaking here!

I have just tried your tuning on V15.0.1
"network.http.spdy.enabled" to "false" , and it works  fine !

Now, how may I deploy this tuning over my LAN , because lots of  my users are very annoyed ?

Thanks a lot for your solution.

Jerome
>"network.http.spdy.enabled" to "false" , and it works  fine !
Great, that confirms that i was right.

>Now, how may I deploy this tuning over my LAN
I apologize for the issues but I can't answer your question.
Bugzilla is a pure bug database and our missions is to fix bugs and not for support help. The french support forum can probably help. One way could be a Mozilla autoconfiguration file.

Anyway, would you help us with a log file if the networking developer requests it ?
He or I will post the instructions in that case.
Component: Untriaged → Networking: HTTP
Product: Firefox → Core
Summary: Firefox V15 & 15.0.1/windows XP SP3 + GMAIL: Échec de l'ajout de la pièce jointe → [SPDY] Firefox V15 & 15.0.1 fails with attachment upload on gmail with squid proxy
log instructions: https://developer.mozilla.org/en-US/docs/HTTP_Logging

this would be critical - though the squid notes are interesting.. as is http://code.google.com/p/chromium/issues/detail?id=69813#c62
Assignee: nobody → mcmanus
Whiteboard: [spdy]
Jerome:
Could you please help us and generate a http log and attach it here using the "add an attachment" link in this report ?

Be sure that spdy is enabled again and follow the instructions posted in the link from Patrick in Comment#5
It would help our network developer /Patrick) to get this issue fixed.
OK 
-spdy enabled again
-problem repeated
-Log recorded.
-Gzipped log.txt size: 13Mo
I can upload it to the dedicated area you will show me, unless
you tell me what to grep in it. 
Jerome
Send me the log via mail and i will host it on my server
Log sent to Matthias at his email adress.
Jerome
jerome - you got a ping timeout. interesting.

I wonder if squid isn't doing any flow control.

would you say that you have a much faster connection to the proxy than squid does to gmail? (e.g. squid is on your lan and squid has poorish internet connectivity)
This is pretty interesting.

FF uploads for 44 seconds, still has a way to go, and then generates a spdy ping because it hasn't heard anything from the server in a while and it wants to make sure it isn't uploading into the wind.

If the buffering is too deep, either at squid or in the local socket buffers (or a combination of both), then that buffer bloat creates a huge round trip for that ping.. it doesn't make it back in 8 seconds and the session is aborted.

Our local socket buffers probably need to be tighter for spdy because of the inline ping.. but proxies will still play a role. We might need to extend the deadline considerably in the face of streams that have been uploading a lot of data - which makes dead connection detection much harder but most sessions are not woken up by this kind of large upload.
this bug was filed on linux, and on non-windows platforms we generally use TCP auto tuned buffer sizes. Doing a 25MB gmail attachment I saw my local OS socket buffers grow as large as 4MByte. If I've only got 1 megabit of upstream bandwidth that's 32 seconds worth - way too much local buffering which prevents the ping from being answered quickly enough. Cancel's will also perform poorly.

I could see pings take several seconds to complete and that's on my pretty well connected desktop. The reported showed >200ms ping times, so his situation would perform worse.

Let's start with a change to get that under control - when doing large uploads set the buffer size to 128KB as we do for windows (which doesn't auto tune on older versions). Leave autotuning in tact when uploads aren't going on because that's a memory friendly thing to do.

That reduces the ping turnaround time I see to at most 1 second.

I'll make some try builds and ask jerome to try it out. Its possible that squid will still be adding too much buffering and we'll need to adjust the timeout period - but generally proxies are pretty aware of buffering too much [though admittedly they are more aware of it in the downstream direction]
interestingly this problem will be mitigated by v3 in ff16 as the per stream flow control will typically have the effect of keeping the buffers short. However that is a per server setting so we shouldn't simply rely on it.
Attached patch patch 0 (obsolete) — Splinter Review
In my testing I confirmed that this kept buffer queues down, improved ping rtt times substantially, and didn't impact overall upload time.
Attachment #668488 - Flags: review?(honzab.moz)
Maybe the spdy ping is designed wrong?  

I remember AllPeers multiplexer protocol was after a lot of research finally not using ping/pong reply, but just a regular ping, or no-op packet (with no expected reply to that ping) in 35 seconds after something has been sent out to the peer.  It was symmetric, from the peer it was expected to receive something in 35 seconds as well.  When for 60 seconds (if I remember the numbers well) nothing was heard from the peer, the connection closed.

We wouldn't need to play with buffers (what I'm not a big fan of, btw) when the ping wouldn't be of the request/reply type.

Thought for SPDY spec...
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Comment on attachment 668488 [details] [diff] [review]
patch 0

Review of attachment 668488 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/SpdyStream2.cpp
@@ +481,5 @@
> +  // the session and cap the send buffers at 128KB. (10Mbit/sec @ 100ms)
> +  //
> +  if ((mTotalSent > (128 * 1024)) && !mSetTCPSocketBuffer) {
> +    mSetTCPSocketBuffer = 1;
> +    mSocketTransport->SetSendBufferSize(128 * 1024);

Turn this to one or even two separate prefs please.
Attachment #668488 - Flags: review?(honzab.moz) → review+
(In reply to Honza Bambas (:mayhemer) from comment #16)
> Maybe the spdy ping is designed wrong?  
> 
> I remember AllPeers multiplexer protocol was after a lot of research finally
> not using ping/pong reply, but just a regular ping, or no-op packet (with no
> expected reply to that ping) in 35 seconds after something has been sent out
> to the peer.  It was symmetric, from the peer it was expected to receive
> something in 35 seconds as well.  When for 60 seconds (if I remember the
> numbers well) nothing was heard from the peer, the connection closed.
> 

its an interesting suggestion. It would have to be negotiated due to some implementation's concerns about battery - but that could be ok as the negotiation wouldn't be a pre-req to doing something useful.

> We wouldn't need to play with buffers (what I'm not a big fan of, btw) when
> the ping wouldn't be of the request/reply type.

as long as this stays tcp based relatively short buffers are still pretty interesting in order to effectively support prioritization and cancel semantics, right? We don't aggressively do a lot with that right now, but its potentially the biggest responsiveness benefit we can get and could forseeably mean tightening them further to match cwnd (which isn't easily knowable). Its also a reason that this shouldn't be tcp based in the long run.

> 
> Thought for SPDY spec...

Thanks!
jerome, will you try out a build from

http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mcmanus@ducksong.com-66f3e649c9a0/

using your proxy and let me know if the attachment succeeds?
Flags: needinfo?(informatique)
Attached patch patch v1 (obsolete) — Splinter Review
small idl change to nsISocketTransport
Attachment #668488 - Attachment is obsolete: true
Attachment #668624 - Flags: superreview?(cbiesinger)
Attachment #668624 - Flags: review+
Comment on attachment 668624 [details] [diff] [review]
patch v1

nsSocketTransport2.cpp

Don't you have to do locking to access mFD, similar to GetSelfAddr?
Attachment #668624 - Flags: superreview?(cbiesinger) → superreview-
(the IDL change looks good though)
Attached patch patch 2Splinter Review
its a good point. is this what you had in mind?
Attachment #668624 - Attachment is obsolete: true
Attachment #668643 - Flags: superreview?(cbiesinger)
Attachment #668643 - Flags: review+
Comment on attachment 668643 [details] [diff] [review]
patch 2

yes. thanks.
Attachment #668643 - Flags: superreview?(cbiesinger) → superreview+
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/c71ac48601d9 - it built on Windows, which is nice, and pile-of-letters Armv7a ICS, which is whatever, but not on any other platforms.
Sigh, this probably needs bug 730805, which I've been neglecting for awhile.  I might pick up the little I've done for that bug and finish up a patch by the end of the weekend, but in the meantime, just casting to some natural type is probably the quickest fix.
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #27)
> Sigh, this probably needs bug 730805, which I've been neglecting for awhile.
> I might pick up the little I've done for that bug and finish up a patch by
> the end of the weekend, but in the meantime, just casting to some natural
> type is probably the quickest fix.

thanks jeff, but we won't need it here.. this bounced because I screwed up and left in a diag line I slipped in to locally confirm a small sr requested change before pushing.
 https://hg.mozilla.org/integration/mozilla-inbound/rev/ecd4c4304219

the first landing of this patch ended up in a range that triggered a talos regression warning (along with other patches).. so I took the backout as an opportunity to run it and a control run (i.e. same tip without the patch) on try and it compares fine on tp5.

https://tbpl.mozilla.org/?tree=Try&rev=a5bf591261ee
and
https://tbpl.mozilla.org/?tree=Try&rev=408ff9d50ddb
https://hg.mozilla.org/mozilla-central/rev/ecd4c4304219
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
I added the tracking flags here because the original bug was filed on a release version and it seemed the filer was concerned about getting a fix to be used across a significant number of users.

Since Firefox 16 is coming out on Tuesday, it seemed lame to try to create a version 15 version with the fix.  For the same reason it is probably way too late to get this into the version 16 release.  However, if a 16.0.1 is required for some other reason this would be good to take as a tag-along.

I really think we should try to take this for the Firefox 17 beta however.
OH should have said especially since it is my understanding that the next ESR release is going to be based on version 17 and this proxy messing up the ping time issue is highly likely to impact enterprise deployments.
I think the best justification for getting a fix in sooner rather than later is at http://www.squid-cache.org/:

"Squid is used by hundreds of Internet Providers world-wide to provide their users with the best possible web access."

Please nominate for Aurora 17 uplift (before Monday morning PT) or Beta 17 uplift (after Monday morning PT) to get the fix into that Firefox version. Please make sure to include a risk evaluation, since whether or not we take this as part of a 16.0.1 would depend upon risk/testing.
Comment on attachment 668643 [details] [diff] [review]
patch 2

[Approval Request Comment]
Bug caused by (feature/regressing bug #): original spdy issue
User impact if declined: long running gmail attachments or other spdy uploads can time out depending on network path properties. I have heard some generic reports of non-reproducible attachment problems so this may be relevant.
Testing completed (on m-c, etc.): generally manual testing. it has only been on m-c for 1 day. Fairly safe approach but we don't have feedback yet if it is totally effective outside of tested scenarios.
Risk to taking this patch (and alternatives if risky): the code change will only effect spdy sessions that have done uploads.
String or UUID changes made by this patch: none

This should probably be uplifted atomically with 798423
Attachment #668643 - Flags: approval-mozilla-aurora?
Attachment #668643 - Flags: approval-mozilla-aurora? → approval-mozilla-beta+
Hi Everybody,
 
Many thanks for all your fine work.

I tried out this nightly build, and it worked with a 10Mb attachment in my (old) squid environment (squid-2.6.STABLE16) .
Cool !
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mcmanus@ducksong.com-66f3e649c9a0/

Many friendly greets 

Jerome
Flags: needinfo?(informatique)
thanks jerome
See Also: → 1868987
See Also: → 1830918
Blocks: 1877243
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: