Closed Bug 1195767 Opened 9 years ago Closed 9 years ago

mozilla::TaskQueues seem pretty heavy weight on OS X

Categories: Core :: XPCOM, defect
Priority: Not set
Severity: normal
Status: VERIFIED FIXED
Target Milestone: mozilla43
Tracking Status: firefox42 --- verified; firefox43 --- verified
People: Reporter: jrmuizel; Assigned: froydnj
References: Blocks 1 open bug
Attachments: 7 files

While playing a video on YouTube, I see a number of threads using 6% CPU in Instruments. The majority of this time seems to be spent in synchronization overhead, spread across functions like PR_Wait, PR_EnterMonitor, and PR_ExitMonitor.

I don't expect this bug to be actionable without some kind of microbenchmark but I'm filing it now to make people aware of a potential problem.
Blocks: 1195790
So, the task queues run on top of a thread pool. I'd expect the real work to be distributed across the threads in the thread pool, and for those threads to be idle a lot of the time as well.

Are the profiler samples in synchronization primitives you're seeing actually overhead, or is it just the profiler's way of saying that these threads spend most of their time blocked on their input queues?
(In reply to Bobby Holley (:bholley) from comment #1)
> So, the task queues run on top of a thread pool. I'd expect the real work to
> be distributed across the threads in the thread pool, and for those threads
> to be idle a lot of the time as well.
> 
> Are the profiler samples in synchronization primitives you're seeing
> actually overhead, or is it just the profiler's way of saying that these
> threads spend most of their time blocked on their input queues?

I believe, but am not certain, that this is actual overhead and not just blocked time. Instruments has an option for 'Record waiting time', and that was not checked while I was profiling.
This may not be related, but Jeff previously found that OS X locks on 10.10 were weirdly slow (as seen in bug 1137963), to the extent that debug builds were much slower.
I just profiled YouTube on OS X. I see the overhead, but it's not in TaskQueue per se; it's mostly in nsThreadPool.

I see the majority of CPU time being spent in nsThreadPool::Run as expected, but interestingly, only about 15% of that is spent in mozilla::TaskQueue::Runner::Run, where the actual work happens. The rest of it is spent in PR_Wait, PR_EnterMonitor, and PR_ExitMonitor.

So either the measurements are screwy (and reporting waiting time as active CPU time), or we're wasting a huge amount of CPU time on super-slow synchronization primitives. I'm tempted to believe the former, but the incident in comment 3 shows real-world (non-profiler) confirmation of problems like that.

So I think we need to investigate this further. Nathan, do you have cycles to take a look, or should we get it on the radar of the Low Level Tools Team?
Flags: needinfo?(nfroyd)
Here's a screenshot after my profiler run. Don't forget to profile the plugin-container and not firefox. ;-)
I will take a look, but it will be later this week.  Setting ni? for myself as a reminder.
Flags: needinfo?(nfroyd)
Ah, looks like a Bugzilla bug.
Flags: needinfo?(nfroyd)
Jeff, I assume your measurements were taken on 10.10?  Do you have measurements for 10.9?
Flags: needinfo?(jmuizelaar)
My measurements are from 10.9
Flags: needinfo?(jmuizelaar)
Mine were on 10.10.
I'm doing the measurements on 10.10, and I'm kind of a novice when it comes to Instruments, so I've just been poking around.  I see roughly the same thing as comment 5: for a thread running a thread pool, there's ~15-20% in TaskQueue::Runner::Run, and the rest spread out in synchronization functions.  This is all with the Time Profiler in Instruments.

Activity Monitor says the CPU is definitely working.

The weird thing is that after turning on NSPR logging for nsThreadPool, I reach a steady state where there are no messages being emitted to the log...but I still see samples in Instruments accumulating underneath the synchronization primitives called by nsThreadPool::Run.  So there's some loop happening that bypasses the logging, but I haven't been able to construct a scenario where that happens yet.
...or it's possible that Instruments is just accumulating samples in the synchronization routines because that's where we happen to be paused.  It's not clear to me whether the samples are being accumulated because there's actual work going on or because the thread is just stopped there.

I think there is actual work going on, because there are other thread pools (like the image decoding pool) and, AFAICT, those threads are not accumulating samples.  It's only a handful of threads that are getting sampled, and those all line up with media threads that I can see in lldb.
Attached file playback-thread.log
Hi, I'm a Gecko hacker who just learned how much fun it is to debug e10s code.

OK, after learning how NSPR logs work with child processes, I can see the MediaPlayback thread pool running.  And it looks like we just have threads waking up fairly constantly to do nothing but fall back asleep again.

The attached log shows NSPR_LOG_MODULES=nsThreadPool:5 for one of the MediaPlayback threads.  It's pretty big, but the following sort of behavior is not unusual:

2015-08-27 22:19:09.028607 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) MediaPlayback waiting [60000]
2015-08-27 22:19:09.029467 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) done waiting
2015-08-27 22:19:09.029481 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) MediaPlayback waiting [60000]
2015-08-27 22:19:09.029628 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) done waiting
2015-08-27 22:19:09.029643 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) MediaPlayback waiting [59999]
2015-08-27 22:19:09.126898 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) done waiting
2015-08-27 22:19:09.126904 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) MediaPlayback waiting [59902]
2015-08-27 22:19:09.127437 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) done waiting
2015-08-27 22:19:09.127457 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) MediaPlayback waiting [59902]
2015-08-27 22:19:09.127677 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) done waiting
2015-08-27 22:19:09.127870 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) MediaPlayback waiting [59901]
2015-08-27 22:19:09.128693 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) done waiting
2015-08-27 22:19:09.128711 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) MediaPlayback waiting [59900]
2015-08-27 22:19:09.128957 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) done waiting
2015-08-27 22:19:09.128970 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) MediaPlayback waiting [59900]
2015-08-27 22:19:09.131001 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) done waiting
2015-08-27 22:19:09.131144 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) MediaPlayback waiting [59898]
2015-08-27 22:19:09.131224 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) done waiting
2015-08-27 22:19:09.131249 UTC - 511442944[11fcb5bc0]: THRD-P(11fd2d710) MediaPlayback running [120111d60]

That is, within ~100ms, we repeated the following sequence nine times:

- started waiting on the thread pool's monitor:

http://mxr.mozilla.org/mozilla-central/source/xpcom/threads/nsThreadPool.cpp#219

- got woken up from our wait
- dropped the monitor, since we exited a ReentrantMonitorAutoEnter block
- reacquired the monitor
- reacquired the monitor again for the event queue to check whether we have any events
- dropped the monitor

Locks may be cheap, but they're not free.  And then imagine four threads (or more, since the MediaPlayback pool has a max of 12) in a thread pool doing all this, and the CPU usage numbers and the profiles start to become very believable: we're doing a bad job of farming out the work to the threads in the pool.  NSPR's synchronization primitives are not great, but I don't think they're at fault here.

Eliminating the last two steps from the above is pretty easy.  I'm not sure if there's a better way to do the first four steps.  It might be worth using a different thread pool loop for the media stuff, since the media pools want all the threads running all the time, and taking advantage of that would make managing the pool significantly easier...
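To make the loop shape concrete, here is a minimal sketch of the idle cycle described above, with standard C++ primitives standing in for NSPR's ReentrantMonitor and the real nsThreadPool/nsEventQueue classes (all names hypothetical, not the actual Gecko code):

#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>

struct ReentrantPoolSketch {
  std::recursive_mutex mMonitor;             // stands in for ReentrantMonitor
  std::condition_variable_any mCV;
  std::deque<std::function<void()>> mEvents;

  bool GetPendingEvent(std::function<void()>& aEvent) {
    // Re-enters the monitor the worker already holds: the redundant
    // acquire/release pair noted in the sequence above.
    std::lock_guard<std::recursive_mutex> lock(mMonitor);
    if (mEvents.empty()) {
      return false;
    }
    aEvent = std::move(mEvents.front());
    mEvents.pop_front();
    return true;
  }

  void WorkerIdleCycle() {
    std::unique_lock<std::recursive_mutex> lock(mMonitor);
    mCV.wait(lock);                 // wait: drops, then reacquires the monitor
    std::function<void()> event;
    if (!GetPendingEvent(event)) {  // reacquire + release, again
      return;                       // nothing to do: back to waiting
    }
    lock.unlock();                  // drop the monitor to run the event
    event();
  }
};

Multiply those lock operations by every thread in the pool on every wakeup and the CPU numbers above start to look plausible.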
Flags: needinfo?(nfroyd)
The comment here suggests that we might AddRef/Release, but we really do
no such thing.  Let's deal with the transfer of ownership directly,
rather than going through nsCOMPtr.  This change makes the code slightly
smaller, and it also makes later refactorings to pull the lock out of
this function easier to do, since we don't have to consider how to hold
the lock within the lifetime of the nsCOMPtr temporary.
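For illustration, a minimal sketch of the before/after shape of this change, with std::unique_ptr standing in for already_AddRefed<nsIRunnable> and hypothetical function names (the real patch uses aRunnable.take() inside nsEventQueue::PutEvent):

#include <cassert>
#include <memory>
#include <utility>

struct Event {};  // stands in for nsIRunnable

// Before: route the incoming owner through a smart-pointer temporary,
// then leak the reference into the raw slot.  The temporary adds
// nothing, and its lifetime has to be reasoned about relative to the
// queue lock.
void PutEventViaTemporary(Event** aSlot, std::unique_ptr<Event> aEvent) {
  std::unique_ptr<Event> temp(std::move(aEvent));  // the extra hop
  assert(!*aSlot);
  *aSlot = temp.release();
}

// After: transfer ownership straight into the slot.
void PutEventDirect(Event** aSlot, std::unique_ptr<Event> aEvent) {
  assert(!*aSlot);
  *aSlot = aEvent.release();
}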

Gerald, you said you looked at some nsThreadPool stuff in bug 1161405, so you
get to review.  If you are overloaded or don't feel comfortable reviewing this
stuff, please say so!  I can find someone else.
Attachment #8657135 - Flags: review?(gsquelart)
Clients of nsEventQueue don't always need fully reentrant monitors.
Let's account for that by having a base class templated on the monitor
type.  This change also opens up the possibility of having the monitor
for the event queue not owned by the event queue itself, but by the
client class, which makes a lot more sense than the current design.
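A rough sketch of the shape this enables, with standard C++ types standing in for Gecko monitors (hypothetical names, not the landed code):

#include <deque>
#include <functional>
#include <mutex>

// The queue is templated on its monitor type and borrows the monitor
// from its caller instead of owning a ReentrantMonitor itself.
template <typename MonitorType>
class EventQueueBase {
public:
  void PutEvent(std::function<void()> aEvent, MonitorType& aMonitor) {
    std::lock_guard<MonitorType> lock(aMonitor);
    mQueue.push_back(std::move(aEvent));
  }

  bool GetPendingEvent(std::function<void()>& aEvent, MonitorType& aMonitor) {
    std::lock_guard<MonitorType> lock(aMonitor);
    if (mQueue.empty()) {
      return false;
    }
    aEvent = std::move(mQueue.front());
    mQueue.pop_front();
    return true;
  }

private:
  std::deque<std::function<void()>> mQueue;
};

// A client that owns one plain mutex shared between its own state and
// its queue: the arrangement the next patch puts nsThreadPool into.
class PoolClient {
  std::mutex mMonitor;
  EventQueueBase<std::mutex> mQueue;
};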
Attachment #8657136 - Flags: review?(gsquelart)
There's no reason nsThreadPool needs to use a reentrant monitor for
locking its event queue.  Having it use a non-reentrant one should be
slightly more efficient, both in the general operation of the monitor,
and that we're not performing redundant locking in methods like
nsThreadPool::Run.  This change also eliminates the only usage of
nsEventQueue::GetReentrantMonitor.
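In miniature, the worker's idle cycle once the pool and its queue share one plain, non-reentrant monitor might look like this (standard C++ stand-ins, hypothetical names):

#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>

void WorkerIdleCycle(std::mutex& aMonitor,
                     std::condition_variable& aCV,
                     std::deque<std::function<void()>>& aEvents) {
  std::unique_lock<std::mutex> lock(aMonitor);
  while (aEvents.empty()) {
    aCV.wait(lock);  // a single wait; no reentrant re-entry for the queue
  }
  std::function<void()> event = std::move(aEvents.front());
  aEvents.pop_front();
  lock.unlock();     // drop the monitor exactly once to run the event
  event();
}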
Attachment #8657137 - Flags: review?(gsquelart)
The last commit eliminated the only client of this method, so we can
remove it now.
Attachment #8657138 - Flags: review?(gsquelart)
There's no reason to wake up all the threads in a thread pool when one
item gets placed in the queue.  Waking up one will serve the same
purpose and is significantly more efficient for thread pools with large
numbers of threads.

This is likely the most helpful patch, but it's also the riskiest, as it
modifies core behavior of our thread run loop.  Try says everything's green on
x86-64 Linux, however, and broadcast should only be necessary if we were ever
adding multiple items to the queue at once, which we're not.
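The change in miniature, with a std::condition_variable standing in for the NSPR monitor (a sketch under those stand-ins, not the landed patch):

#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>

struct WakeOneSketch {
  std::mutex mMonitor;
  std::condition_variable mCV;
  std::deque<std::function<void()>> mEvents;

  void PutEvent(std::function<void()> aEvent) {
    std::lock_guard<std::mutex> lock(mMonitor);
    mEvents.push_back(std::move(aEvent));
    // One event, one worker: notify_one() rather than notify_all().
    // Broadcast would only be needed if several events could be queued
    // per notification, which this queue never does.
    mCV.notify_one();
  }
};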
Attachment #8657140 - Flags: review?(gsquelart)
The patch series just posted appears to help some--at least profiling in Instruments no longer shows a mass of samples in locking methods.
Comment on attachment 8657135 [details] [diff] [review]
part 1 - remove nsCOMPtr temporary from nsEventQueue::PutEvent

Review of attachment 8657135 [details] [diff] [review]:
-----------------------------------------------------------------

r+ (looks correct to me), with style suggestions:

::: xpcom/threads/nsEventQueue.cpp
@@ +119,5 @@
>    }
>  
> +  nsIRunnable** location = &mTail->mEvents[mOffsetTail];
> +  MOZ_ASSERT(!*location);
> +  *location = aRunnable.take();

How about we just don't use an indirection, e.g.:
> MOZ_ASSERT(!mTail->mEvents[mOffsetTail]);
> mTail->mEvents[mOffsetTail] = aRunnable.take();
The MOZ_ASSERT only costs in DEBUG&non-optimized mode, which is already slow anyway.
And the hero statement is now clear about what it's doing, without using an indirection over 3 lines.
Of course, the disadvantage is maintenance of this duplicate expression; though hopefully that won't ever be needed. (right?)

If you prefer keeping the indirection, please consider making it |nsIRunnable*&|, and maybe giving it a name that shows its intent, e.g. 'emplacementLocation' or similar.
Attachment #8657135 - Flags: review?(gsquelart) → review+
Comment on attachment 8657136 [details] [diff] [review]
part 2 - create an nsEventQueueBase templated over the monitor type

Review of attachment 8657136 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with question and optional nit:

::: xpcom/threads/nsEventQueue.cpp
@@ +208,5 @@
> +
> +  return Base::Count(mon);
> +}
> +
> +#undef RAII_MONITOR_VARIABLE

What is this #undef about?

::: xpcom/threads/nsEventQueue.h
@@ +32,3 @@
>  // A threadsafe FIFO event queue...
> +template<typename MonitorType>
> +class nsEventQueueBase

This 'nsEventQueueBase' will not be public:ly derived, and will even be used directly (e.g., in nsThreadPool in the following patch), so is it really a *base* class?

Just naming it 'nsEventQueue' would probably cause too much trouble to be worth the effort here, as we'd either:
- have to rename the original nsEventQueue everywhere;
- Or maybe use more complicated templating magic, e.g.:
> template <typename MonitorType = ReentrantMonitor, bool InternalMonitor = true> nsEventQueue; with specializations for InternalMonitor being true/false.
(I'm actually liking this template better, but I understand if you don't want to rework everything now -- maybe a future bug, if you agree?)

Otherwise, you could rename nsEventQueueBase; the most explicit name might be something like 'nsExternallyMonitoredEventQueue', but it's quite a keyboardful.

All this is just a small nit for something that works as-is, so feel free to ignore!
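For concreteness, a rough rendering of the reviewer's suggested template, using standard C++ types as stand-ins (hypothetical; this variant was not pursued in this bug):

#include <deque>
#include <mutex>

template <typename MonitorType = std::recursive_mutex,  // ~ReentrantMonitor
          bool InternalMonitor = true>
class EventQueue;

// Default: the queue owns its monitor, like the original nsEventQueue.
template <typename MonitorType>
class EventQueue<MonitorType, true> {
public:
  void PutEvent(int aEvent) {
    std::lock_guard<MonitorType> lock(mMonitor);
    mQueue.push_back(aEvent);
  }

private:
  MonitorType mMonitor;
  std::deque<int> mQueue;  // int stands in for a refcounted runnable
};

// Specialization: the client owns the monitor and passes it in.
template <typename MonitorType>
class EventQueue<MonitorType, false> {
public:
  void PutEvent(int aEvent, MonitorType& aMonitor) {
    std::lock_guard<MonitorType> lock(aMonitor);
    mQueue.push_back(aEvent);
  }

private:
  std::deque<int> mQueue;
};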
Attachment #8657136 - Flags: review?(gsquelart) → review+
Comment on attachment 8657137 [details] [diff] [review]
part 3 - modify nsThreadPool to use a non-reentrant monitor

Review of attachment 8657137 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/threads/nsThreadPool.cpp
@@ +11,5 @@
>  #include "nsMemory.h"
>  #include "nsAutoPtr.h"
>  #include "prinrval.h"
>  #include "mozilla/Logging.h"
> +#include "mozilla/Monitor.h"

Shouldn't this #include be in the header, where Monitor is first needed?
Attachment #8657137 - Flags: review?(gsquelart) → review+
Attachment #8657138 - Flags: review?(gsquelart) → review+
Attachment #8657140 - Flags: review?(gsquelart) → review+
Thanks for the reviews!  I addressed the review comments, except for renaming nsEventQueueBase.  There are a couple other things to clean up around that class, and I figured that doing that work in a separate bug would be better.
Assignee: nobody → nfroyd
Blocks: 1202497
This seems to have caused bug 1085667 to permafail.
Flags: needinfo?(nfroyd)
(In reply to Nigel Babu [:nigelb] from comment #26)
> This seems to have caused bug 1085667 to permafail.

Patches here landed on 2015-09-08 (AEST), 12 hours before this comment; but bug 1085667 started failing again on 2015-09-03, so I doubt this is the cause.
This made a moderate improvement to the power consumption of YouTube on Mac. See bug 1195790 comment 8.
Turns out it's actually a Gaia bug. Tomcat is backing it out now. Sorry about the confusion, folks!
Flags: needinfo?(nfroyd)
Comment on attachment 8657135 [details] [diff] [review]
part 1 - remove nsCOMPtr temporary from nsEventQueue::PutEvent

This approval request is for all the patches in this bug.  It's not crucial that these patches are uplifted, but performance/power improvements seem like a reasonable aurora uplift request.

Approval Request Comment
[Feature/regressing bug #]: None.
[User impact if declined]: Higher CPU/power usage on YouTube and other video sites; see comment 28, for instance.
[Describe test coverage new/current, TreeHerder]: There are no tests specifically for these changes, but platform features that use this code are pretty well tested.
[Risks and why]: These patches are reasonably low risk.
[String/UUID change made/needed]: None
Attachment #8657135 - Flags: approval-mozilla-aurora?
Can someone verify this bug?
Keywords: verifyme
(In reply to Sylvestre Ledru [:sylvestre] from comment #32)
> Can someone verify this bug?

What sort of verification are you looking for?  Are you looking for something different than comment 28?
Flags: needinfo?(sledru)
Status: RESOLVED → VERIFIED
Flags: needinfo?(sledru)
Comment on attachment 8657135 [details] [diff] [review]
part 1 - remove nsCOMPtr temporary from nsEventQueue::PutEvent

Missed that, thanks.
Taking it in aurora then!
Attachment #8657135 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8657140 - Flags: approval-mozilla-aurora+
Attachment #8657136 - Flags: approval-mozilla-aurora+
Attachment #8657137 - Flags: approval-mozilla-aurora+
Attachment #8657138 - Flags: approval-mozilla-aurora+
Nathan, this is totally rad work - thank you so much for actively maintaining this stuff!
Encountered the following results using the powermetrics command-line utility under Mac OS X 10.11:

Nightly without this fix (from 2015-08-27)
Name                               ID     CPU ms/s  User%  Deadlines (<2 ms, 2-5 ms)  Wakeups (Intr, Pkg idle)  GPU ms/s
org.mozilla.nightly                246    454.77                                      2361.91 1528.85           42.82     
  plugin-container                 497    335.73    56.63  1074.87 43.70              1227.11 792.13            10.68     
  firefox                          495    121.93    62.44  1003.83 2.99               1134.33 736.66            32.14              6.16   


latest 44.0a1 (2015-10-26)
Name                               ID     CPU ms/s  User%  Deadlines (<2 ms, 2-5 ms)  Wakeups (Intr, Pkg idle)  GPU ms/s
org.mozilla.nightly                252    288.18                                      301.14  194.97            73.03     
  plugin-container                 520    170.45    76.65  53.27   35.72              196.14  123.91            48.15     
  firefox                          519    121.41    78.47  12.57   2.39               104.16  70.43             24.87     

I can see reduced CPU usage for the plugin-container after this fix. Is this enough to be called verified? Or should I look at something else too? Thanks in advance!
Flags: needinfo?(nfroyd)
(In reply to Alexandra Lucinet, QA Mentor [:adalucinet] from comment #37)
> I can see reduced CPU usage for the plugin-container after this fix. Is this
> enough to be called verified? Or should I look at something else too?
> Thanks in advance!

This is enough, thanks!
Flags: needinfo?(nfroyd)
(In reply to Nathan Froyd [:froydnj] from comment #38)
> This is enough, thanks!

Thanks, Nathan!
Same result encountered with 42.0RC build 1 (Build ID: 20151026170526):


Name                               ID     CPU ms/s  User%  Deadlines (<2 ms, 2-5 ms)  Wakeups (Intr, Pkg idle)  GPU ms/s
org.mozilla.firefox                284    210.45                                      223.73  171.24            139.35    
  firefox                          764    211.21    76.32  36.72   16.16              223.51  170.82            139.34    
  plugin-container                 765    0.24      67.25  0.00    0.00               0.40    0.40              0.00     


Marking accordingly.
Release Note Request (optional, but appreciated)
[Why is this notable]: Good to highlight performance wins for video playing
[Suggested wording]: "Lower CPU usage while playing videos on popular sites such as Youtube and Vimeo"  (The original problem was on OS X, and that's the only platform we've verified it on, but I'd be *very* surprised if it didn't have some effect on other platforms.)
[Links (documentation, blog post, etc)]: None; I was going to write a techy blog post on it, though, which made me think a relnote would be a good thing. :)
relnote-firefox: --- → ?
Flags: needinfo?(lhenry)
OK, great! A blog post would be very useful (best if it comes from a mozilla.org domain) and I can add the link any time during beta.
Flags: needinfo?(lhenry)
Summary: mozilla::TaskQueue's seem pretty heavy weight on OS X → mozilla::TaskQueues seem pretty heavy weight on OS X
Just realized this is for 42. I can still add a note for beta.  Sylvestre, it sounds like you will want this, anyway!
> "Lower CPU usage while playing videos on popular sites such as Youtube and Vimeo"

A supporting blog post with measurements would be very good. We don't want to oversell the improvement, because some users get upset if we claim an improvement and they don't see it themselves.
Our power usage for video playback on Mac sucks, and this bug appears to be about it sucking a little bit less. I'm not sure it's something we should draw too much attention to. By contrast, we have good power usage on Windows, albeit not as good as Edge's.
Since this is actually sylvestre's note for 42, he should make the call.  I suggest removing the note and then putting it back when we know we have a good supporting blog post to link to.
Flags: needinfo?(sledru)
Flags: needinfo?(nfroyd)
That'd be fine with me.  In comment 44, Anthony sounds like he would prefer to wait on relnoting changes until more substantial power reductions have been achieved anyway.
Flags: needinfo?(nfroyd)
Agreed! Thanks
Flags: needinfo?(sledru)
Removing the flag, please resubmit when we agree it is time to communicate on it.
relnote-firefox: ? → ---
While investigating bug 1230641, I've found that tasks take an unreasonably long time to be dispatched: we're talking 18ms from time to time, with an average of 8ms.

Interestingly, when I dropped the number of threads in the thread pool from 24 to 8, that average dropped to less than 3ms.

There are other issues at hand with bug 1230641, but the slowness of the task queues is certainly a big factor.
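For reference, one way to capture dispatch-to-run latency numbers like these is to timestamp each task when it is dispatched and again when it actually runs. A sketch with std::chrono (the in-tree analogue would use mozilla::TimeStamp; the wrapper name is hypothetical):

#include <chrono>
#include <cstdio>
#include <functional>
#include <utility>

using Clock = std::chrono::steady_clock;

// Wrap a task so that, when the pool finally runs it, it reports how
// long it sat in the queue.
std::function<void()> WrapWithLatencyProbe(std::function<void()> aTask) {
  Clock::time_point dispatched = Clock::now();
  return [aTask = std::move(aTask), dispatched] {
    auto waited = std::chrono::duration_cast<std::chrono::microseconds>(
        Clock::now() - dispatched);
    std::printf("queue latency: %lld us\n",
                static_cast<long long>(waited.count()));
    aTask();
  };
}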
See Also: → 1230641
(In reply to Jean-Yves Avenard [:jya] from comment #49)
> While investigating bug 1230641, I've found that tasks take an unreasonably
> long time to be dispatched: we're talking 18ms from time to time, with an
> average of 8ms.
> 
> Interestingly, when I dropped the number of threads in the thread pool from
> 24 to 8, that average dropped to less than 3ms.

Is that just the time spent in nsThreadPool::Dispatch, or were you measuring something else?  Is it possible for you to break down that 18ms by function, or even line?

Can you file a separate bug on that slowness?
Flags: needinfo?(jyavenard)
Sorry for the delay.

The issue is only seen on a particular Mac laptop with an Intel IGP, and only when performing particular graphics operations.

When the compositor is rendering GPU-based frames, everything slows to a crawl; I had first looked in the logs at how slowly tasks were dispatched. That is likely a red herring that is a direct consequence of the graphics bottleneck.
Flags: needinfo?(jyavenard)