Closed Bug 1203427 Opened 9 years ago Closed 9 years ago

Add logging for timer firings

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla43
Tracking Status
firefox43 --- fixed

People

(Reporter: n.nethercote, Assigned: n.nethercote)

Details

(Whiteboard: [Power:P1])

Attachments

(6 files, 4 obsolete files)

Timer firings cause wakeups, which increases power consumption. Therefore, the ability to profile timer firings is useful. The hard part is that you need to be able to identify the callback function when a timer fires.

Back in 2010 azakai tried passing in an extra argument to the timer initialization functions, in order to identify them, using source instrumentation via a script (https://wiki.mozilla.org/Mobile/Powersaving/Wakeups/Debugging). This worked but was fragile -- I resurrected his scripts and got them working but it took non-trivial effort.

Then last year rvitillo tried a less invasive approach of using dladdr() to lookup the callback name (http://robertovitillo.com/2014/02/04/idle-wakeups-are-evil/). This is great but it only works on Mac. (Even on Linux it doesn't work due to the linker not doing the right things with symbol tables.)

I'm working on something of a hybrid approach. There are lots of timers in the codebase but for profiling purposes a relatively small number of them account for almost all the timer firings that occur in practice. So I want to manually annotate the frequent ones with explicit names, and then have the dladdr() fallback available to identify which additional ones should be explicitly named.
Putting this up for feedback rather than review because it's still a bit rough.

This is part 3, but parts 1 and 2 are just minor refactorings so you don't need
to see them right now.

The key additions are the new initWithNamedFuncCallback() and
initWithNameableFuncCallback() functions, which can be used to give an explicit
name to a timer. The former takes a C string, the latter takes a callback,
which is necessary for timers in content JS code that occur via nsGlobalWindow.

Doing the output via MOZ_LOG seemed reasonable, though I don't like the ugly
thread IDs that get put onto every line. You invoke it like this:

  NSPR_LOG_MODULES=TimerFirings:4 firefox > out

And then simple post-processing can identify the timers that occur frequently.

Sample output for explicitly named timers within Gecko:

> 1562818432[7f795be55a00]: [16906]   fn timer (SLACK        100 ms): LayerActivityTracker
> 1562818432[7f795be55a00]: [16906]   fn timer (SLACK       1000 ms): nsIDocument::SelectorCache

Sample output for JS content timers:

> 1562818432[7f795be55a00]: [16906]   fn timer (ONE_SHOT       0 ms): (content
timer) chrome://browser/content/tabbrowser.xml:1807:0
> -2099578304[7f4772642000]: [16964]   fn timer (ONE_SHOT      60 ms): (content timer) http://int.nyt.com/applications/portals/assets/portal/app-d37d526acc5f1ea70c6a0bee273416cc.js:4:12064

Sample output for an unnamed timer obtained on Mac (which does have a working
dladdr()):

> 2082435840[100445640]: [8654]   fn timer (ONE_SHOT     250 ms): [from dladdr]
PresShell::sPaintSuppressionCallback(nsITimer*, void*)

Sample output for an unnamed timer obtained on Linux (which doesn't have a
working dladdr()):

> 1562818432[7f795be55a00]: [16906]   fn timer (ONE_SHOT     160 ms): ???[no dladdr on this platform]

The coverage could be improved quite a bit just by giving explicit names to a
few more timers.

I feel like this is far from the cleanest code I've ever written, so comments
on how that might be improved would be welcome. Thanks.
Attachment #8659099 - Flags: feedback?(nfroyd)
Here's a somewhat cleaned-up version. I also improved the coverage by giving
names to a few more timers.

BTW, here's sample output for timers that have the |Interface| and |Observer|
types:

> 403998592[7fc616a5ba00]: [8167] iface timer (ONE_SHOT     0 ms): 7fc5ed84cde0
> 403998592[7fc616a5ba00]: [8167]   obs timer (ONE_SHOT     0 ms): 7fc5ffa55f80

These aren't nearly as common as |Function| timers so I'm not too concerned by
the fact that I am only printing a pointer for them.
Attachment #8659320 - Flags: feedback?(nfroyd)
Attachment #8659099 - Attachment is obsolete: true
Attachment #8659099 - Flags: feedback?(nfroyd)
There are many sub-classes of nsExpirationTracker. In order to distinguish them
nicely in the logging of timer firings, it's necessary to manually name each
one. (This wouldn't be necessary if there was a way to stringify template
parameters, but there isn't.)
Attachment #8659751 - Flags: review?(nfroyd)
The first argument to this method is always nsGlobalWindow::TimerCallback, and
hard-wiring this makes subsequent patches simpler.
Attachment #8659752 - Flags: review?(nfroyd)
This makes the order of |aDelay| and |aType| match those of the InitWith*()
functions.

I've made this change because the inconsistency tripped me up during the
development of part 4.
Attachment #8659754 - Flags: review?(nfroyd)
Attachment #8659320 - Attachment is obsolete: true
Attachment #8659320 - Flags: feedback?(nfroyd)
Mentor: n.nethercote
Mentor: n.nethercote
Comment on attachment 8659756 [details] [diff] [review]
(part 4) - Remove trailing whitespace from nsITimer.idl

Review of attachment 8659756 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you for separating this out.
Attachment #8659756 - Flags: review?(nfroyd) → review+
Comment on attachment 8659754 [details] [diff] [review]
(part 3) - Change order of InitCommon() arguments

Review of attachment 8659754 [details] [diff] [review]:
-----------------------------------------------------------------

Hooray for a foolish consistency.
Attachment #8659754 - Flags: review?(nfroyd) → review+
Comment on attachment 8659751 [details] [diff] [review]
(part 1) - Add nsExpirationTracker::mName

Review of attachment 8659751 [details] [diff] [review]:
-----------------------------------------------------------------

We have the same problem vis-a-vis stringifying template parameters for things like MOZ_COUNT_CTOR.  It leads to sadness for people.
Attachment #8659751 - Flags: review?(nfroyd) → review+
Comment on attachment 8659752 [details] [diff] [review]
(part 2) - Remove an argument to nsTimeout::InitTimer

Review of attachment 8659752 [details] [diff] [review]:
-----------------------------------------------------------------

I'm OK with this, but I think this is more Andrew's code than mine.
Attachment #8659752 - Flags: review?(nfroyd)
Attachment #8659752 - Flags: review?(continuation)
Attachment #8659752 - Flags: feedback+
Comment on attachment 8659752 [details] [diff] [review]
(part 2) - Remove an argument to nsTimeout::InitTimer

Review of attachment 8659752 [details] [diff] [review]:
-----------------------------------------------------------------

Well, this is peculiar. I don't know why this code was like that, but it is pretty old so it should be okay.
Attachment #8659752 - Flags: review?(continuation) → review+
Comment on attachment 8659783 [details] [diff] [review]
(part 5) - Add logging of timer firings

Review of attachment 8659783 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the idl changes and using mozilla::Variant.

::: xpcom/threads/nsITimer.idl
@@ +160,5 @@
> +  [noscript] void initWithNamedFuncCallback(in nsTimerCallbackFunc aCallback,
> +                                            in voidPtr aClosure,
> +                                            in unsigned long aDelay,
> +                                            in unsigned long aType,
> +                                            in string aName);

Please move these declarations to the end of the interface definition so they don't interfere with vtable orderings.

::: xpcom/threads/nsTimerImpl.cpp
@@ +590,5 @@
> +  }
> +
> +  switch (aCallbackType) {
> +    case CallbackType::Function: {
> +      bool needToFreeName = false;

There has got to be some better way of doing this "sometimes a static buffer, sometimes a dynamically allocated before" dance.  (Not a criticism of the patch, just frustration that it's not easier.)

@@ +606,5 @@
> +
> +      } else {
> +        MOZ_ASSERT(mName.mTag == Name::NameType::None);
> +#if defined(XP_MACOSX)
> +        annotation = "[from dladdr] ";

I wonder if build peers would let us pass --dynamic-list to the linker, so we could pass in a list of relevant timer callbacks on Linux, so dladdr() could work there...at least for developer builds.

::: xpcom/threads/nsTimerImpl.h
@@ +124,5 @@
>  
> +  void LogFiring(CallbackType aCallbackType, CallbackUnion aCallbackUnion);
> +
> +  // A tagged union type representing one of (a) nothing, (b) a string, or (c)
> +  // a function.

We have mozilla/Variant.h for this.  It might require a bit more verbiage to use, but I don't think that'd be a bad thing.
Attachment #8659783 - Flags: review?(nfroyd) → review+
> Hooray for a foolish consistency.

“A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines." - Ralph Waldo Emerson

"Pythagoras was misunderstood, and Socrates, and Jesus, and Luther, and Copernicus, and Galileo, and Newton, and every pure and wise spirit that ever took flesh. To be great is to be misunderstood." - Ralph Waldo Emerson

"I pity the fool." - Mr. T
> There has got to be some better way of doing this "sometimes a static
> buffer, sometimes a dynamically allocated before" dance.  (Not a criticism
> of the patch, just frustration that it's not easier.)

Indeed. I tried and failed to come up with something better. Hard to avoid when I'm dealing with two pre-existing APIs that use different conventions.


> > +  // A tagged union type representing one of (a) nothing, (b) a string, or (c)
> > +  // a function.
> 
> We have mozilla/Variant.h for this.  It might require a bit more verbiage to
> use, but I don't think that'd be a bad thing.

One downside of Variant is that it doesn't handle the "nothing" case naturally; I'd have to use a dummy |int| or somesuch. I'll try it and see how it looks in comparison with the hand-rolled version.
Attached patch (part 5b) - Use mozilla::Variant (obsolete) — Splinter Review
Nathan, using mozilla::Variant is shorter but the handling of the "nothing"
case is a bit odd. Do you have any preference between the two versions?
Attachment #8660994 - Flags: feedback?(nfroyd)
Attached patch (part 5b) - Use mozilla::Variant (obsolete) — Splinter Review
Attachment #8661000 - Flags: feedback?(nfroyd)
Attachment #8660994 - Attachment is obsolete: true
Attachment #8660994 - Flags: feedback?(nfroyd)
Attachment #8661007 - Flags: feedback?(nfroyd)
Attachment #8661000 - Attachment is obsolete: true
Attachment #8661000 - Flags: feedback?(nfroyd)
Comment on attachment 8661007 [details] [diff] [review]
(part 5b) - Use mozilla::Variant

Review of attachment 8661007 [details] [diff] [review]:
-----------------------------------------------------------------

WFM.  This is maybe getting ridiculous, but you could use mozilla/Maybe.h:

mozilla::Maybe<mozilla::Variant<>>

That handles the Nothing case, and if you have Something, it must be one of the two variant cases.
Attachment #8661007 - Flags: feedback?(nfroyd) → feedback+
(In reply to Pulsebot from comment #21)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/309502b3ea1f

This was a follow-up patch where I linked to the MDN docs that I just wrote:
https://developer.mozilla.org/en-US/docs/Mozilla/Performance/TimerFirings_logging
> On non-Mac platforms, unannotated function timer firings will look like the following.
>
> 711637568[7f3219c48000]: [6835]    fn timer (ONE_SHOT    16 ms): ???[dladdr: unavailable/doesn't work on this platform]

Outputing something that can be postprocessed by fix_linux_stack.py would have been helpful.
> Outputing something that can be postprocessed by fix_linux_stack.py would
> have been helpful.

That requires a library name and offset. Can we get that from a function pointer?
Whiteboard: [Power] → [Power:P1]
The dli_fname and dli_fbase members of the Dl_info struct that dladdr fills contain, respectively, the library path and the base address where it's loaded. You can derive the lib name by taking the part of the path after tha last / if there is one, and the offset by substracting the address you're dladdr'ing with the base address.
(In reply to Mike Hommey [:glandium] from comment #26)
> The dli_fname and dli_fbase members of the Dl_info struct that dladdr fills
> contain, respectively, the library path and the base address where it's
> loaded.

Bug 1205941!

> You can derive the lib name by taking the part of the path after tha
> last / if there is one

It turns out using the whole path works; no need to trim it down.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: