Closed Bug 1173354 Opened 5 years ago Closed 3 years ago

Change GC statistics to use TimeStamp directly

Categories

(DevTools :: Performance Tools (Profiler/Timeline), defect, P3)

defect

Tracking

(firefox53 fixed)

RESOLVED FIXED
Firefox 53
Tracking Status
firefox53 --- fixed

People

(Reporter: tromey, Assigned: sfink)

References

Details

Attachments

(2 files, 2 obsolete files)

In bug 1159506 we're going to land a somewhat hackish solution to getting
the GC event timestamps to line up with other performance tool data sources.
However, after bug 858927 lands, we must go back and fix this up to
use TimeStamp directly.
Assignee: nobody → ttromey
Status: NEW → ASSIGNED
Depends on: 1159506, 858927
Triaging. Filter on ADRENOCORTICOTROPIC (yes).
Priority: -- → P3
Summary: change GC statistics to use TimeStamp directly → Change GC statistics to use TimeStamp directly
This does not change the two places where we expose a epoch-based timestamp to the outside world. Which I think was the original intention of the bug I'm attaching this patch to. I'm still not 100% sure what is desired there. (mozilla::TimeStamp does not have any sort of epoch, and we could just use ProcessCreation but then e10s would have different epochs for different processes.)

Anyway, this is what we need to get monotonic times for the GC statistics, and is better code anyway. (No more magic x / 1000 or x * 1000 to convert between seconds, milliseconds, microseconds.)
Attachment #8819112 - Flags: review?(jcoppeard)
Assignee: ttromey → sphink
Duplicate of this bug: 1312908
Comment on attachment 8819112 [details] [diff] [review]
Switch GC statistics over to TimeStamp/TimeDuration

Review of attachment 8819112 [details] [diff] [review]:
-----------------------------------------------------------------

This looks really good and is a great improvement over what we have already.  Not r+ing yet because the patch may be incomplete (see first comment).

::: dom/base/nsJSEnvironment.cpp
@@ +2126,5 @@
>  
>        if (!sShuttingDown) {
>          if (sPostGCEventsToObserver || Telemetry::CanRecordExtended()) {
>            nsString json;
> +          json.Adopt(aDesc.formatJSON(aCx, TimeStamp::Now()));

I can't see a change to GCDescription::formatJSON to take a TimeStamp rather than a uint64_t so this won't compile.  Did you post the right version of the patch?

::: js/src/gc/Nursery.cpp
@@ +181,5 @@
>      }
>  
>      PodZero(&startTimes_);
> +    PodZero(&profileDurations_);
> +    PodZero(&totalDurations_);

TimeDuration and TimeStamp have default constructors that initialise their contents to zero / the null timestamp, so we can remove these calls to PodZero.

@@ +504,5 @@
>      fprintf(stderr, "\n");
>  }
>  
>  /* static */ void
>  js::Nursery::printProfileTimes(const ProfileTimes& times)

This is not used and can go.  And I should really have called it printProfileDurations all along.

@@ +516,5 @@
> +/* static */ void
> +js::Nursery::printProfileDurations(const ProfileDurations& times)
> +{
> +    for (auto time : times)
> +        fprintf(stderr, " %6.0f", time.ToMilliseconds());

These times should be printed in microseconds otherwise they will all be zero (hopefully).

@@ +638,2 @@
>      rt->addTelemetry(JS_TELEMETRY_GC_MINOR_REASON, reason);
> +    if (totalTime.ToSeconds() > 1.0)

This should be ToMilliseconds().

::: js/src/gc/Nursery.h
@@ +328,5 @@
>          KeyCount
>      };
>  
> +    using ProfileTimes = mozilla::EnumeratedArray<ProfileKey, ProfileKey::KeyCount, mozilla::TimeStamp>;
> +    using ProfileDurations = mozilla::EnumeratedArray<ProfileKey, ProfileKey::KeyCount, mozilla::TimeDuration>;

nit: lines longer than 100 characters.

::: js/src/gc/Statistics.cpp
@@ +695,5 @@
>      char buffer[1024];
>      SprintfLiteral(buffer, format,
>                     (unsigned long long)timestamp,
> +                   longest.ToMilliseconds(),
> +                   total.ToMilliseconds(),

Oh wow, that really is much better :)

@@ +1396,5 @@
>      fprintf(stderr, "\n");
>  }
>  
>  /* static */ void
> +Statistics::printProfileTimes(const ProfileDurations& times)

This should probably get renamed to printProfileDurations too.

::: js/src/gc/Statistics.h
@@ +403,5 @@
>  #undef DEFINE_TIME_KEY
>          KeyCount
>      };
>  
> +    using ProfileDurations = mozilla::EnumeratedArray<ProfileKey, ProfileKey::KeyCount, mozilla::TimeDuration>;

nit: line too long
Oh, sorry for the nsJSEnvironment thing. That was a part of my original patch that I had to revert. We print out some timestamps relative to the epoch in some places, and mozilla::TimeStamp doesn't have any notion of an epoch. (The closest is ProcessCreation(), but that's different between eg different e10s processes.) So I reverted that part, but only tested by building the shell.

I don't know if the Statistics { using TimeStamp = mozilla::TimeStamp; } in this version is sketchy.
Attachment #8819486 - Flags: review?(jcoppeard)
Attachment #8819112 - Attachment is obsolete: true
Attachment #8819112 - Flags: review?(jcoppeard)
Comment on attachment 8819486 [details] [diff] [review]
Switch GC statistics over to TimeStamp/TimeDuration

Review of attachment 8819486 [details] [diff] [review]:
-----------------------------------------------------------------

This looks great apart from some issues about startTimestamp/endTimestamp.  Sorry for all the comments!

::: js/src/gc/Nursery.cpp
@@ +179,5 @@
>          }
>          reportTenurings_ = atoi(env);
>      }
>  
>      PodZero(&startTimes_);

TimeStamp also has a default constructor so this can go too I think.

::: js/src/gc/Statistics.cpp
@@ +743,5 @@
>                     gc::StateName(slice.finalState),
>                     budgetDescription,
>                     pageFaults,
>                     slice.start,
>                     slice.end);

This doesn't compile for me.  I'm not sure what's required here.

::: js/src/gc/Statistics.h
@@ +302,5 @@
>          JS::gcreason::Reason reason;
>          gc::State initialState, finalState;
>          gc::AbortReason resetReason;
> +        TimeStamp start, end;
> +        int64_t startTimestamp, endTimestamp;

I'm also not sure about these change to startTimestamp and endTimestamp.  These were added in bug 1159506 and were set to JS_GetEmbedderTime as an alternative to start and end which were set to PRMJ_Now.  This patch changes them to use PRMJ_Now which seems wrong...

I think the intention is to use TimeStamp for these but I'm not sure how we are going to expose that through debugger-created JS objects which is where this data ends up.
(In reply to Jon Coppeard (:jonco) from comment #6)

> I'm also not sure about these change to startTimestamp and endTimestamp. 
> These were added in bug 1159506 and were set to JS_GetEmbedderTime as an
> alternative to start and end which were set to PRMJ_Now.  This patch changes
> them to use PRMJ_Now which seems wrong...

Yeah, the timeline marker code needs TimeStamps, not PRMJ_Now.
JS_GetEmbedderTime can be removed now that TimeStamp is in mozglue... I will
file a follow-up bug for that.
Blocks: 1324434
Sorry to make you look through this again. Bugzilla's interdiff thing might work? Though I probably rebased, so maybe not.

Note that this patch requires at least one followup, because it revealed that I am mishandling nested GCs.
Attachment #8820943 - Flags: review?(jcoppeard)
Attachment #8819486 - Attachment is obsolete: true
Attachment #8819486 - Flags: review?(jcoppeard)
The code currently attempts to record the time spent in the JS_GC_BEGIN and JS_GC_END phases, which are when GC pre/post callbacks will be invoked. These callbacks are allowed to trigger a "nested" GC themselves. The code was attempting to account for this by "suspending" the (eg) JS_GC_BEGIN phase and later resuming it. But there is only one memory location that stores the time spent in that phase, so the nested GC's timings for that phase would actually include the beginning part of the parent phase. Also, the timings are cleared when they get reported, which meant that when resuming the parent phase, it had a bogus start time. I'm surprised it didn't report that phase taking about 1482364987 seconds or so? Or maybe it did; nesting is rare in practice.

Also, note that the nesting can theoretically be arbitrarily deep, in which case, what is getting timed is even weirder.

It's a fringe case, and it would be nice to record these sorts of timings, but I couldn't come up with a simple way of doing it, so I just ripped the whole mechanism out. Some of the support code is still there, now only used for the PHASE_MUTATOR pseudo-phase which itself could probably be ripped out. But for now, I just want whatever is necessary to land the TimeStamp patch.

If I ever get around to finishing up the gcstats rewrite/simplification, I'll probably murder PHASE_MUTATOR along with it. I was using it for some benchmarking stuff, but it's not worth the trouble -- originally, the implementation was simpler, but then I added the nested GC stuff and it turned out to handle PHASE_MUTATOR as a side effect, and I don't really want to bother going back to the simpler implementation. I'd rather remove it.
Attachment #8820957 - Flags: review?(jcoppeard)
It did a pretty good job of an interdiff -- see https://bugzilla.mozilla.org/attachment.cgi?oldid=8819486&action=interdiff&newid=8820943&headers=1

but note that several of the changes in the interdiff are actually just reverting changes I shouldn't have made in the first place. It's still a lot shorter, though!
Comment on attachment 8820943 [details] [diff] [review]
Switch GC statistics over to TimeStamp/TimeDuration

Review of attachment 8820943 [details] [diff] [review]:
-----------------------------------------------------------------

Cheers for the updates.

::: js/src/gc/Statistics.cpp
@@ +742,5 @@
>                     gc::StateName(slice.initialState),
>                     gc::StateName(slice.finalState),
>                     budgetDescription,
>                     pageFaults,
>                     slice.start,

This still doesn't work - do we need to do the "timestamp minus process creation time" thing here?
Attachment #8820943 - Flags: review?(jcoppeard) → review+
Comment on attachment 8820957 [details] [diff] [review]
Stop tracking GC begin/end callback time since it did not handle nesting

Review of attachment 8820957 [details] [diff] [review]:
-----------------------------------------------------------------

OK great, let's just get rid of this and simplify the code.

Please leave suspendPhases though, I'm going to use it in bug 1297122.
Attachment #8820957 - Flags: review?(jcoppeard) → review+
(In reply to Jon Coppeard (:jonco) from comment #11)
> ::: js/src/gc/Statistics.cpp
> @@ +742,5 @@
> >                     gc::StateName(slice.initialState),
> >                     gc::StateName(slice.finalState),
> >                     budgetDescription,
> >                     pageFaults,
> >                     slice.start,
> 
> This still doesn't work - do we need to do the "timestamp minus process
> creation time" thing here?

Weird. My compiler didn't barf on this. On my try push, osx and linux asan found this but other linux and windows did not. Grr.
Pushed by sfink@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8027870254d1
Switch GC statistics over to TimeStamp/TimeDuration, r=jonco
https://hg.mozilla.org/integration/mozilla-inbound/rev/f647e2eaa1ff
Stop tracking GC begin/end callback time since it did not handle nesting, r=jonco
https://hg.mozilla.org/mozilla-central/rev/8027870254d1
https://hg.mozilla.org/mozilla-central/rev/f647e2eaa1ff
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 53
Depends on: 1326343
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.