Closed Bug 1083930 Opened 10 years ago Closed 9 years ago

Uploading files to google drive takes up 100% of CPU

Categories

(Core :: Networking: HTTP, defect)

32 Branch
x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla44
Tracking Status
firefox44 --- fixed

People

(Reporter: enrique.arizonbenito, Assigned: mcmanus)

References

(Depends on 1 open bug)

Details

(Keywords: perf, regression)

Attachments

(1 file)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:32.0) Gecko/20100101 Firefox/32.0
Build ID: 20140924083558

Steps to reproduce:

Log into Google drive (https://drive.google.com). Upload a file.


Actual results:

The file uploads properly, but CPU usage is close to 100% and the GUI blocks.


Expected results:

The CPU must be close to 0% while uploading. In Google Chrome it takes about 1-3%.
Severity: normal → critical
Second test:

 Open GDrive in a private browser window (Ctrl+Shift+N). Upload a test file (20 MB). This time it is the "Socket Thread" that takes up to 99.9% of CPU.
When I cancel the upload, the CPU goes down to 0% ("Socket Thread" disappears from my visible "top" window).
Did you test with a fresh profile?
https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-firefox-profiles
Flags: needinfo?(enrique.arizonbenito)
Severity: critical → normal
@Loic: I tested with a fresh profile and got similar results. The "Socket Thread" takes about 99.9% of the CPU when uploading.

I also tested locally with a local server and a web app that talks to the server through AJAX requests, and I noticed that CPU usage is quite high there too. Doing 2 simultaneous AJAX requests to the local server takes about 20 milliseconds of real CPU time for the "Socket Thread" according to top. I'm using a Core i7. It looks to me like it is related to some spin-lock problem (http://en.wikipedia.org/wiki/Spinlock), but I don't have deep knowledge of the Firefox network stack.
Flags: needinfo?(enrique.arizonbenito)
Product: Firefox → Core
Keywords: perf
Maybe this bug is similar to bug 559676.
Same problem here on Arch Linux with Firefox 35.0.1, uploading to Google+ Photos results in 100% cpu usage.
Did it use to work normally in the past, before FF3x or lower?
Is it specific to Linux?
(In reply to Loic from comment #6)
> Did it use to work normally in the past, before FF3x or lower?
> Is it specific to Linux?

It works normally on 31.x (Windows). Since 32.0 it causes 100% CPU in nsSocketTransportService::DoPollIteration. It is similar to bug 559676, which is an older bug, so I think it was fixed in some release, but 32.0 introduced the issue again.

You can upload a large file to https://www.virustotal.com/en to reproduce the problem.
See Also: → 559676
Oops, I forgot to post the testcase. :D

STR (for Windows, but surely similar on Linux/OSX)
1) I created an empty file of ~125 MB with the command:
fsutil file createnew [name_of_file.extension] [size_in_bytes]
2) Open https://www.virustotal.com/en
3) Scan the empty file
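
For Linux/OSX, where fsutil is not available, the empty test file from step 1 could be produced with a small C++17 program instead. This is only a sketch: the helper name create_empty_file is hypothetical and not part of the original testcase, and it relies on std::filesystem::resize_file extending the file with zero bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

// Create a zero-filled file of exactly `size_bytes` bytes at `path`.
// Returns true on success. (Hypothetical helper, for illustration only.)
bool create_empty_file(const std::string& path, std::uintmax_t size_bytes) {
    assert(!path.empty());
    {
        // Create (or truncate) the file, then close it before resizing.
        std::ofstream out(path, std::ios::binary | std::ios::trunc);
        if (!out) {
            return false;
        }
    }
    std::error_code ec;
    // Growing a file with resize_file makes the new area read back as zeros.
    std::filesystem::resize_file(path, size_bytes, ec);
    return !ec;
}
```

Calling create_empty_file("upload_test.bin", 125ull * 1024 * 1024) would give a file of roughly the size used in the steps above; the file name is an arbitrary example.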

To observe the CPU use, it's better to switch to another tab, where the reading is more stable (probably because it avoids variations from graphics flushing).
VirusTotal starts by computing the hash then uploads the file.
CPU use in both cases:
-good= 25% (hash) then (upload)
-bad= 40% (hash) then (upload)

Regression range:
good=2014-05-29
bad=2014-05-30
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=1e712b724d17&tochange=e6f113c83095
Status: UNCONFIRMED → NEW
Ever confirmed: true
@Loic 
I think you were facing/mixing in another bug (Bug 1061428 and its dup, Bug 1054972).
And Bug 1054972 was fixed in Firefox 33.
My bad, you're right! Sorry for the mess.
Keywords: regression
The spike in CPU is pretty easy to see when uploading to Google Drive. Perhaps unsurprisingly, it also appears to have had a fairly consistent negative effect on my upload throughput (watching network utilization in Windows Task Manager).

Regression range: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=2893f60d5903&tochange=41a54c8add09

Too old for mozregression to bisect on inbound, unfortunately. However, local build bisection confirms that this is caused by bug 378637. The range fits nicely with the comments in this bug indicating that it regressed between Fx31 and Fx32 too.
Blocks: 378637
Component: Untriaged → Networking: HTTP
Flags: needinfo?(mcmanus)
I'm working on this. It's a problem with the workaround for https://bugzilla.mozilla.org/show_bug.cgi?id=1213084
Assignee: nobody → mcmanus
Flags: needinfo?(mcmanus)
Keywords: regression
Attachment #8676354 - Flags: review?(hurley)
Comment on attachment 8676354 [details] [diff] [review]
cpu spin during large h2/spdy upload

Review of attachment 8676354 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/nsHttpConnection.cpp
@@ +1404,5 @@
> +    MOZ_ASSERT(!mForceSendTimer);
> +    mForceSendPending = true;
> +    mForceSendTimer = do_CreateInstance("@mozilla.org/timer;1");
> +    return mForceSendTimer->InitWithFuncCallback(
> +        nsHttpConnection::ForceSendIO, this, 17, nsITimer::TYPE_ONE_SHOT);

It would be nice to not have the 17 hanging around here - I know it's (most likely) the "old windows tick interval" referenced in the comment a few lines up, but it always feels squicky to have a number just hanging out like that.
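
To illustrate the two points in play here (a named constant instead of the bare 17, and a pending flag that arms at most one one-shot timer), here is a toy, self-contained sketch. It is not the actual patch: ToyConnection, kForceSendIntervalMs, MaybeForceSend, Tick and SimulateForcedSends are hypothetical names, the early return when a send is already pending is an assumption inferred from the "Maybe" in MaybeForceSendIO, and a simulated millisecond clock stands in for nsITimer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical name for the bare 17 in the hunk above; the value comes from the diff.
constexpr std::uint32_t kForceSendIntervalMs = 17;

// Toy stand-in for a connection. This is NOT the real nsHttpConnection.
class ToyConnection {
public:
    // Request a forced send. Returns true if a new one-shot timer was armed,
    // false if one was already pending and the request was coalesced.
    bool MaybeForceSend(std::uint64_t nowMs) {
        if (mForceSendPending) {
            return false;  // assumed behavior: do not re-arm while pending
        }
        mForceSendPending = true;
        mFireAtMs = nowMs + kForceSendIntervalMs;
        return true;
    }

    // Advance the simulated clock. Fires the pending forced send once it is due.
    bool Tick(std::uint64_t nowMs) {
        if (!mForceSendPending || nowMs < mFireAtMs) {
            return false;
        }
        mForceSendPending = false;
        ++mSendsForced;
        return true;
    }

    std::uint32_t SendsForced() const { return mSendsForced; }

private:
    bool mForceSendPending = false;
    std::uint64_t mFireAtMs = 0;
    std::uint32_t mSendsForced = 0;
};

// Ask for a forced send on every simulated millisecond and count how many
// forced sends actually run within durationMs.
std::uint32_t SimulateForcedSends(std::uint64_t durationMs) {
    ToyConnection conn;
    for (std::uint64_t t = 0; t < durationMs; ++t) {
        conn.Tick(t);
        conn.MaybeForceSend(t);
    }
    assert(conn.SendsForced() <= durationMs / kForceSendIntervalMs);
    return conn.SendsForced();
}
```

In this toy model, requesting a forced send on every millisecond of a simulated second results in only a few dozen sends rather than one per request, which is the rate-limiting effect the timer is meant to have.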

@@ +1414,5 @@
>  {
>      LOG(("nsHttpConnection::ForceRecv [this=%p]\n", this));
>      MOZ_ASSERT(PR_GetCurrentThread() == gSocketThread);
>  
> +    return NS_DispatchToCurrentThread(new HttpConnectionForceIO(this, true));

I know it's not the point of this bug, but is there any particular reason to not do the same kind of MaybeForceSendIO for Recv? (makes the code more symmetrical, which I always like)

::: netwerk/protocol/http/nsHttpConnection.h
@@ +350,5 @@
> +
> +private:
> +    // For ForceSend()
> +    static void ForceSendIO(nsITimer *aTimer, void *aClosure);
> +    nsresult    MaybeForceSendIO();

nit: align these with the rest of the members
Attachment #8676354 - Flags: review?(hurley) → review+
> @@ +1414,5 @@
> >  {
> >      LOG(("nsHttpConnection::ForceRecv [this=%p]\n", this));
> >      MOZ_ASSERT(PR_GetCurrentThread() == gSocketThread);
> >  
> > +    return NS_DispatchToCurrentThread(new HttpConnectionForceIO(this, true));
> 
> I know it's not the point of this bug, but is there any particular reason to
> not do the same kind of MaybeForceSendIO for Recv? (makes the code more
> symmetrical, which I always like)

I like symmetry too, but in this case the change is also kinda gross.. and I can't find a case where the recv path gets into the same root spinning problem (not in the presence of a https proxy), so I'm pretty hesitant to touch it. The ForceSend() path is really pretty confined and understandable in comparison.
https://hg.mozilla.org/mozilla-central/rev/20035adc5b89
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
I note that this can also be seen when downloading large files to YouTube / Vimeo. Hopefully this is also fixed in version 44? :)
*uploading, sorry