Closed Bug 1192216 Opened 9 years ago Closed 6 years ago

Decide whether we should keep RDTSC for PerformanceMonitoring

Categories

(Toolkit :: Performance Monitoring, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

Status: RESOLVED WONTFIX

People

(Reporter: Yoric, Unassigned)

Once bug 1181175 has landed, we need to determine 1/ whether the RDTSC-based monitoring is fast enough; 2/ whether it discards too much data.

For 1/, we should keep an eye on Talos and on the benchmark from bug 1152930.
For 2/, we should keep an eye on PERF_MONITORING_TEST_CPU_RESCHEDULING_PROPORTION_MOVED. If the proportion is very low, we can simplify RDTSC by removing the same-CPU check. If the proportion is very high, RDTSC might discard too much data.
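For context, the same-CPU check amounts to reading the TSC together with a per-CPU identifier and discarding any interval during which the thread was rescheduled. Below is a minimal sketch of the idea, not the actual tree code; the `__rdtscp` intrinsic and treating the TSC_AUX value as a CPU identifier are assumptions about the platform:

```cpp
#include <cstdint>
#include <x86intrin.h>  // __rdtscp on GCC/Clang; MSVC provides it in <intrin.h>

struct TscSample {
  uint64_t ticks;     // raw timestamp counter
  unsigned int aux;   // IA32_TSC_AUX; the OS typically keys this to the CPU
};

static TscSample ReadTsc() {
  TscSample sample;
  sample.ticks = __rdtscp(&sample.aux);
  return sample;
}

// Returns true and fills *aElapsed only if the thread stayed on the same CPU,
// i.e. the two TSC readings are comparable. Discarded intervals are what
// PERF_MONITORING_TEST_CPU_RESCHEDULING_PROPORTION_MOVED counts as "moved".
static bool ElapsedOnSameCpu(const TscSample& aStart, const TscSample& aEnd,
                             uint64_t* aElapsed) {
  if (aStart.aux != aEnd.aux) {
    return false;  // rescheduled to another CPU: drop this measurement
  }
  *aElapsed = aEnd.ticks - aStart.ticks;
  return true;
}
```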
Protocol for http://mozilla.pettay.fi/moztests/events/event_speed_3.html (the benchmark of bug 1152930):
- run the benchmark without tweaking Firefox Nightly (this has Perf Monitoring enabled);
- run the benchmark with the following tweak (to deactivate Perf Monitoring):
  // from a chrome-privileged scope, e.g. the Browser Console
  let PerformanceStats = Cu.import("resource://gre/modules/PerformanceStats.jsm", {});
  PerformanceStats.performanceStatsService.isMonitoringJank = false;
According to my tests, I cannot see a difference in the speed of http://mozilla.pettay.fi/moztests/events/event_speed_3.html between runs with and without jank monitoring. This is using the latest Nightly on Mac OS X.
If you have a moment, bz, could you confirm?
Flags: needinfo?(bzbarsky)
I'm sorry for the lag here; I'm still digging out from vacation.  :(

Going to try to get to this on Tuesday (Monday is a holiday).
OK, so I did the following tests:

1)  Check out hg revision 4cb5a8a692e5 (this is from before bug 1181175 landed).
2)  Run http://mozilla.pettay.fi/moztests/events/event_speed_3.html normally.
3)  Run it with the isMonitoringJank = false bit from comment 1.
4)  Check out hg revision defee685fedd (from after bug 1181175).
5)  Run the test again normally.
6)  Run it with isMonitoringJank = false.
7)  Pull m-c tip.
8)  Run the test again normally.
9)  Run it with isMonitoringJank = false.

In all cases the versions with jank monitoring scored the same as the versions without, as far as I can tell, and the tests after bug 1181175 were a bit slower (about 5ms, on a baseline of 80ms) than the tests before.  I have no idea what to make of all that.  :(
Flags: needinfo?(bzbarsky)
That's disappointing. Barring any bug, post bug 1181175, any test without jank monitoring should be a glorified noop: a few constructors/destructors on the stack, a few `if`s and a few dereferences, but that's it.

I'll look further into it. Can you confirm that the benchmark is executed in a single tick (but several microticks) of the event loop?
Flags: needinfo?(bzbarsky)
Oh, I think I found a bug in the code: we fail to bail out early in some functions when jank monitoring is deactivated.
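For the record, the intended fast path is just a flag check at the top of the RAII helper. A rough sketch, not the actual `AutoStopwatch` code (names are illustrative):

```cpp
// Rough shape of the intended early bailout: when jank monitoring is off,
// constructing and destroying the stopwatch should cost only a flag test.
class AutoStopwatchSketch {
 public:
  explicit AutoStopwatchSketch(bool aIsMonitoringJank)
    : mActive(aIsMonitoringJank) {
    if (!mActive) {
      return;  // bail out before taking any timestamps (no RDTSC, no getrusage)
    }
    // ... record start timestamps here ...
  }

  ~AutoStopwatchSketch() {
    if (!mActive) {
      return;  // symmetric bailout on destruction
    }
    // ... record end timestamps and commit the interval here ...
  }

 private:
  const bool mActive;
};
```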
Not sure whether we mean the same things by "tick" and "microtick", but the way the benchmark works is that it runs a bunch of tasks off the main event loop.  Each of those tasks does a loop that calls dispatchEvent a number of times and measures the time that takes.  Then it averages together the times from all the tasks.

dispatchEvent does perform a microtask checkpoint when it's done, I think...
Flags: needinfo?(bzbarsky)
But I'm not 100% sure about that last.
Let me clarify: do you know if XPConnect::AfterProcessNextTask is called at the end of the test or at each call to dispatchEvent?
Flags: needinfo?(bzbarsky)
After the fix for bug 1179909 there is no such thing... What's the relevant function in the new world?
Flags: needinfo?(bzbarsky)
Sorry, I meant `AfterProcessTask`, which I believe is the relevant function in the new world.
Flags: needinfo?(bzbarsky)
Ah, XPCJSRuntime::AfterProcessTask?

That is _not_ called after every event dispatch, just once we return to the event loop.
Flags: needinfo?(bzbarsky)
Ok, so this means that we only call `getrusage` twice, which should make things much faster. I wonder what's going wrong. I'll try and profile the test early next week.
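For reference, the per-task CPU-time reading boils down to a pair of `getrusage` calls, one when the task starts and one in `AfterProcessTask`, and the difference is the CPU time charged to that turn of the event loop. A minimal sketch, not the actual `nsPerformanceStats` code (note that `RUSAGE_THREAD` is Linux-only; other platforms need `RUSAGE_SELF` or a native API):

```cpp
#include <cstdint>
#include <sys/resource.h>
#include <sys/time.h>

// Convert a struct timeval to microseconds.
static uint64_t ToMicroseconds(const timeval& aTv) {
  return uint64_t(aTv.tv_sec) * 1000000 + uint64_t(aTv.tv_usec);
}

// Read the user + system CPU time consumed so far, in microseconds.
// Returns false if getrusage fails.
static bool GetCpuTimeMicroseconds(uint64_t* aResult) {
#ifdef RUSAGE_THREAD
  const int who = RUSAGE_THREAD;  // Linux: current thread only
#else
  const int who = RUSAGE_SELF;    // elsewhere: whole process (coarser)
#endif
  rusage usage;
  if (getrusage(who, &usage) != 0) {
    return false;
  }
  *aResult = ToMicroseconds(usage.ru_utime) + ToMicroseconds(usage.ru_stime);
  return true;
}
```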
Just profiled on my Mac, with the stopwatch enabled, using Instruments and a sampling rate of 40µs.

Of the ~500ms of the test (I assume that's CPU time):
- 0.8ms were spent in `AutoStopwatch`;
- 0.2ms were spent in `nsPerformanceStats` (which should not be covered by the test but could interfere accidentally);
- 0.1ms in `Stopwatches::commit()`;
- 0.1ms in `Stopwatches::reset()`;
- 0ms on resizing storage.


If I deactivate all monitoring:
- 0.5ms were spent in `Stopwatches::reset()`, which is a bit surprising;
- 0.3ms were spent in `Stopwatches::commit()`, which is also surprising;
- 0.2ms were spent in `AutoStopwatch`;
- 0.2ms were spent in `nsPerformanceStats`.


Benchmark results are very similar. While I am surprised that the time spent in `Stopwatches::*` increases when we deactivate monitoring, nothing I see shows activity that should be measurable in any way.

That's with a single tab. Total time spent on `AutoStopwatch` increases a bit with two tabs open.
So far, I have no explanation for any slowdown. To make matters worse, the test does not seem reliable on my machine, as the average value seems to jump between ~100ms and ~300ms on the same instance of Firefox, with no major background task running that I know of.

Boris, in bug 1152930, how did you determine that the issue was caused by `getrusage`? If we can start from that, we might be able to get some progress.
Flags: needinfo?(bzbarsky)
So, I did another round of benchmarking. I ran each benchmark 5 times, and I didn't get much fluctuation in each version this time. I kept jank monitoring on in all cases.

* Nightly before bug 674779: 73ms.
* Nightly after bug 674779: 70ms.
* Nightly before bug 1181175: 101ms.
* Nightly after bug 1181175: 102ms.

So, while I see a huge increase over the last 6 months, neither this benchmarking nor my profiling in comment 15 can confirm that it is correlated with my patches. Am I missing something?
> Boris, in bug 1152930, how did you determine that the issue was caused by `getrusage`?

Profiling in Instruments.

So I just tried that again, with rev ab01fd91bc02 and rev 66945c5c4a40 (after and before bug 674779 respectively, I believe).  And now the profile shows very little getrusage in it.  :(  I have no idea why...

The huge increase you see may be from async stack stuff, if you're measuring in nightly.  Worth checking how the numbers look if you toggle "javascript.options.asyncstack" to false in about:config.
Flags: needinfo?(bzbarsky)
If I toggle off javascript.options.asyncstack, I shave off ~20ms. That is still ~10ms slower than around bug 674779, but not as dramatic.

So, at this stage, I am tempted to write this off as a false alarm. Do you concur, or did I forget something?
Flags: needinfo?(bzbarsky)
I know I saw a regression back when I filed bug 1152930.  I'm not seeing one now, and I have no idea why.  I agree that it's not clear what we can do with that, if anything.  :(
Flags: needinfo?(bzbarsky)
Well, performance has never been an exact science. I'll take today's measurements as a good sign, and we can always come back to this later if we observe another regression.
Depends on: 1206377
David, the AutoStopwatch ctor/dtor time I'm seeing in bug 1254240 accounts for about 1/6 of the main-thread time and about 1/3 of the time under JS::Call. Is that this bug, or should I file a separate one?
Flags: needinfo?(dteller)
I don't think that it is related to RDTSC, so I'd say new bug. That sounds pretty bad, though.
Flags: needinfo?(dteller)
about:performance is being redesigned; mass closing the bugs related to parts of the current about:performance page that we are not keeping.

Our goals with the redesign are to reduce the overhead caused by having the page opened, increase the reliability of the displayed information, and make the offered information actionable for most users. The back-end work is being tracked in bug 1419681.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX