Closed Bug 1159486 Opened 6 years ago Closed 6 years ago

Profiler timestamps should use Timestamp and be consistent with timeline

Categories

(Core :: Gecko Profiler, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox41 --- fixed

People

(Reporter: shu, Assigned: tromey)

References

Details

Attachments

(1 file, 18 obsolete files)

41.85 KB, patch
tromey
: review+
Details | Diff | Splinter Review
Devtools folks raise the issue that various information that have timestamps attached to them, profiler included, are all offset from different starting times. It would be much easier to deal with if everything was offset from a uniform start time: TimeStamp::ProcessCreation.

Currently, the profiler records a start time on every call to nsIProfiler.StartProfiler. We should change the start time to be TimeStamp::ProcessCreation.
Summary: Offset profiler timestamps from process creation time → Profiler timestamps should use TimeStamp::ProcessCreation
Priority: -- → P1
So far this has hit one difficulty, which is that a profile marker is created
in ProfilerIOInterposeObserver:

  IOMarkerPayload* markerPayload = new IOMarkerPayload(aObservation.Reference(),
                                                       filename.get(),
                                                       aObservation.Start(),
                                                       aObservation.End(),
                                                       stack);

The start and end times are TimeStamps, and this Observation class seems to
be used in many places.
(In reply to Tom Tromey :tromey from comment #2)
> So far this has hit one difficulty, which is that a profile marker is created
> in ProfilerIOInterposeObserver:
> 
>   IOMarkerPayload* markerPayload = new
> IOMarkerPayload(aObservation.Reference(),
>                                                        filename.get(),
>                                                        aObservation.Start(),
>                                                        aObservation.End(),
>                                                        stack);
> 
> The start and end times are TimeStamps, and this Observation class seems to
> be used in many places.

We intentionally added start and end to the observation as opposed to just the duration so that we don't lose information about the start time. Really the intention with this data is to subtract start from end to obtain a duration -- in that case the process creation time is irrelevant.

I can understand the need for making those timestamps relative to process creation time in the case of setting a marker in the profiler timeline, but I'd prefer that code to be localized to ProfilerIOInterposeObserver as opposed to modifying the Observation class.
(In reply to Aaron Klotz [:aklotz] (please use needinfo) from comment #3)

> I can understand the need for making those timestamps relative to process
> creation time in the case of setting a marker in the profiler timeline, but
> I'd prefer that code to be localized to ProfilerIOInterposeObserver as
> opposed to modifying the Observation class.

Despite the bug title (I'll update it) the current plan is to try
to use JS_Now (aka PRMJ_Now) for times across all the devtools actors.

I think Observation makes sense as written.  The issue is that it is using
the wrong time API for us.  The fundamental problem is that we can't use
TimeStamp from SpiderMonkey.

I was thinking of having it capture the times using both APIs.
What do you think of that?
Flags: needinfo?(aklotz)
Summary: Profiler timestamps should use TimeStamp::ProcessCreation → Profiler timestamps should use JS_Now
Another stumbling block is that this has to touch CompositorParent::PostInsertVsyncProfilerMarker,
which gets its time stamp from CompositorVsyncDispatcher, which is part of a larger
facility.  See VsyncDispatcher.h etc.

I am starting to think we should go back to trying to make the allocations use TimeStamp somehow.
This is getting out of hand.

More generally it seems absurd to have two incompatible time notions.  So I'm
also wondering if we could move TimeStamp to be usable by SpiderMonkey.
Attached patch convert profiler to use doubles (obsolete) — Splinter Review
Here's a patch to change the profiler not to lose timestamp precision.
This patch changes the profiler to set sStartTime to the process
creation time, just once.
Attachment #8598941 - Attachment is obsolete: true
These patches (untested) show what we'd do in the profiler if we abandon
the JS_Now approach in favor of TimeStamp with process creation as the epoch.

Clearing the NI since this change renders the previous question irrelevant.
Flags: needinfo?(aklotz)
Attached patch convert profiler to use doubles (obsolete) — Splinter Review
Rebased.
Attachment #8603558 - Attachment is obsolete: true
Attachment #8603559 - Attachment is obsolete: true
Rebased.
Summary: Profiler timestamps should use JS_Now → Profiler timestamps should use Timestamp and be consistent with timeline
Attached patch convert profiler to use doubles (obsolete) — Splinter Review
Rebased.
Attachment #8608155 - Attachment is obsolete: true
Attachment #8608156 - Attachment is obsolete: true
Attached patch convert profiler to use doubles (obsolete) — Splinter Review
Rebased.
Attachment #8609482 - Attachment is obsolete: true
Jordan, this one might require some front-end changes, but I am not completely sure.
Flags: needinfo?(jsantell)
I think bug 1167800 will handle the front end stuff. We'll need to land that first
Flags: needinfo?(jsantell)
This is fine to independently land.
No longer depends on: 1167800
Assignee: nobody → ttromey
Status: NEW → ASSIGNED
Attached patch convert profiler to use doubles (obsolete) — Splinter Review
Rebased.
Attachment #8609480 - Attachment is obsolete: true
Attachment #8609481 - Attachment is obsolete: true
Attachment #8614821 - Flags: review?(shu)
Comment on attachment 8614821 [details] [diff] [review]
convert profiler to use doubles

Let me take that back for the moment, I see one hunk has to move.
Attachment #8614821 - Flags: review?(shu)
Attached patch convert profiler to use doubles (obsolete) — Splinter Review
Move one hunk from the second patch, and remember to update the UUID.
Attachment #8614821 - Attachment is obsolete: true
Attachment #8614822 - Attachment is obsolete: true
Attachment #8614830 - Flags: review?(shu)
Attachment #8614831 - Flags: review?(shu)
Comment on attachment 8614830 [details] [diff] [review]
convert profiler to use doubles

Review of attachment 8614830 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for doing this. With the recent un-packing on ARM and doubling the size of each entry, it would be good to make sure enabling profiling on B2G doesn't get us OOM-killed.

::: tools/profiler/GeckoProfilerFunc.h
@@ +58,2 @@
>  
> +JSObject *mozilla_sampler_get_profile_data(JSContext *aCx, double aSinceTime);

Existing, though could you convert these to T* style while you're here?

::: tools/profiler/ProfileEntry.h
@@ +67,5 @@
>      const char* mTagData;
>      char        mTagChars[sizeof(void*)];
>      void*       mTagPtr;
>      ProfilerMarker* mTagMarker;
>      float       mTagFloat;

Is there a need for mTagFloat anymore? All existing uses of floats could be converted to use the double field without loss of precision.

::: tools/profiler/PseudoStack.h
@@ +234,5 @@
>      // threads would not have any samples to copy forward while sleeping.
>      mSleepId++;
>    }
>  
> +  void addMarker(const char *aMarkerStr, ProfilerMarkerPayload *aPayload, double aTime)

Existing. Please convert to T* while you're here.

::: tools/profiler/TableTicker.cpp
@@ +213,5 @@
>    SpliceableJSONWriter b(mozilla::MakeUnique<OStreamJSONWriteFunc>(stream));
>    StreamJSON(b, aSinceTime);
>  }
>  
> +JSObject* TableTicker::ToJSObject(JSContext *aCx, double aSinceTime)

Existing. Please convert to T*.

::: tools/profiler/TableTicker.h
@@ +192,5 @@
>      return mPrimaryThreadProfile;
>    }
>  
> +  void ToStreamAsJSON(std::ostream& stream, double aSinceTime = 0);
> +  virtual JSObject *ToJSObject(JSContext *aCx, double aSinceTime = 0);

T*

::: tools/profiler/nsProfiler.cpp
@@ +116,5 @@
>    return NS_OK;
>  }
>  
>  NS_IMETHODIMP
> +nsProfiler::GetProfile(double aSinceTime, char **aProfile)

T*

::: tools/profiler/platform.cpp
@@ +182,5 @@
>  }
>  
>  ProfilerMarker::ProfilerMarker(const char* aMarkerName,
>      ProfilerMarkerPayload* aPayload,
> +    double aTime)

Existing. Mind fixing the indentation?

@@ +570,5 @@
>  
>    return t->ToJSON(aSinceTime);
>  }
>  
> +JSObject *mozilla_sampler_get_profile_data(JSContext *aCx, double aSinceTime)

T*
Attachment #8614830 - Flags: review?(shu) → review+
Wilson, I remember you testing profiling on b2g. Do you mind testing on flame-kk or something with Tom's patches and make sure that the increased memory use of the profiler doesn't result in OOM?
Flags: needinfo?(wilsonpage)
If this does OOM b2g, are there any compromises we can do? Less points of precision?
Comment on attachment 8614831 [details] [diff] [review]
make profiler use the process creation time

Review of attachment 8614831 [details] [diff] [review]:
-----------------------------------------------------------------

profiler_time and nsProfiler::GetElapsedTime are now poorly named, since we effectively removed the notion of "profiler time" for the notion of "time since process creation"

I'll file a followup to remove these and use nsDocShell::Now or something in both C++ and JS.
Attachment #8614831 - Flags: review?(shu) → review+
Blocks: 1172161
I applied patched but getting a build error [1].

[1] https://pastebin.mozilla.org/8836100
Flags: needinfo?(wilsonpage)
(In reply to Jordan Santell [:jsantell] [@jsantell] from comment #24)
> If this does OOM b2g, are there any compromises we can do? Less points of
> precision?

Reducing the precision will make the front end trickier.
There was some discussion of this in bug 1152829.

Another option might be to shrink the buffer a bit.
There are already tweaks (see
https://dxr.mozilla.org/mozilla-central/source/tools/profiler/GeckoProfilerImpl.h?from=GeckoProfilerImpl.h#331 )
in place for this.
(In reply to Wilson Page [:wilsonpage] from comment #26)
> I applied patched but getting a build error [1].
> 
> [1] https://pastebin.mozilla.org/8836100

I looked at this, but these errors don't come from this patch.
The fatal error seems to be a compiler bug, or perhaps just a glitch:

/home/vagrant/gecko-dev/js/src/jit/BaselineJIT.cpp:1116:31: note: '*((void*)(& i)+12).js::gc::ArenaCellIterImpl::span.js::gc::FreeSpan::first' was declared here
{standard input}: Assembler messages:
{standard input}:2826713: Warning: end of file in string; '"' inserted
arm-linux-androideabi-g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://source.android.com/source/report-bugs.html> for instructions.
Attached patch convert profiler to use doubles (obsolete) — Splinter Review
Rebased and updated according to the review.
Attachment #8614830 - Attachment is obsolete: true
Attachment #8614831 - Attachment is obsolete: true
Rebased.
Attachment #8617433 - Flags: review+
Attachment #8617434 - Flags: review+
Keywords: checkin-needed
5913 INFO TEST-UNEXPECTED-FAIL | browser/devtools/performance/test/browser_timeline-waterfall-rerender.js | A subset of the total markers was selected.

Looks pretty permafail in that Try push.
Keywords: checkin-needed
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #32)
> 5913 INFO TEST-UNEXPECTED-FAIL |
> browser/devtools/performance/test/browser_timeline-waterfall-rerender.js | A
> subset of the total markers was selected.
> 
> Looks pretty permafail in that Try push.

This is an unrelated intermittent that was fixed recently -- after the
baseline revision of my try push.  See bug 1170105.

Also see the try push in bug 1159507, which includes the patches from this bug,
but which has a newer baseline.  It does not trigger this failure.
Keywords: checkin-needed
Actually, I think something else is wrong with this patch.
It passes all tests but interactively I am seeing some weird behavior.
So, removing checkin-needed while I investigate.
Keywords: checkin-needed
And FWIW this needed a rebase and more changes as well.
(In reply to Tom Tromey :tromey from comment #34)
> Actually, I think something else is wrong with this patch.
> It passes all tests but interactively I am seeing some weird behavior.
> So, removing checkin-needed while I investigate.

The problem turns out to be an instance of bug 1152441.
I don't think this patch should land until that is worked out.
Depends on: 1152441
Attached patch convert profiler to use doubles (obsolete) — Splinter Review
Rebased; needed some more float->double changes.
Attachment #8617433 - Attachment is obsolete: true
Attachment #8617434 - Attachment is obsolete: true
Rebased; one hunk had to move to a different file.
This is the front end change.  I also changed the nsIProfiler so that
users can request the current "profiler time" even when the profiler
is not running.  This lets us compute the start time before any
possible profiler events are created.

On the Java side I just kept the existing behavior, but maybe it's
better to simply drop the mJavaTime stuff in favor of always using the
profiler time.
Attachment #8622674 - Flags: review+
Attachment #8622675 - Flags: review+
Comment on attachment 8622679 [details] [diff] [review]
get correct start time from profiler actor

I needed this to make sure the profile actually sync correctly in the
performance tool.  Otherwise the reported start time is 0, leading to
strange-looking results.

I think I'll have to squash this with the previous patch for it to
make sense (or, really, squash all 3 to avoid updating the UUID twice).

Also I think the "sync epoch" patch from bug 1152829 will be needed as well.
I am thinking I may need to roll part of that into this in order to make
sure that, after this patch, the profiler and timeline agree on a start time.
Attachment #8622679 - Flags: review?(shu)
Attachment #8622679 - Flags: review?(jsantell)
Comment on attachment 8622679 [details] [diff] [review]
get correct start time from profiler actor

Review of attachment 8622679 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, but the prop name of the start time maybe should be "currentTime" like the other methods for consistency (the fake profiler front still exposes "startTime" however)
Attachment #8622679 - Flags: review?(jsantell) → review+
Comment on attachment 8622679 [details] [diff] [review]
get correct start time from profiler actor

Review of attachment 8622679 [details] [diff] [review]:
-----------------------------------------------------------------

Nice.
Attachment #8622679 - Flags: review?(shu) → review+
You can land the patch in bug 1152829, but might make a change to ignore allocations if that's not yet ready. Also, and maybe preferable, not land that patch yet, but we can check in tests that the start times of the separate underlying actors are using the new time system, or synthesize a fake profile and that markers/frames line up correctly
No longer depends on: 1152441
(In reply to Jordan Santell [:jsantell] [@jsantell] from comment #43)
> You can land the patch in bug 1152829, but might make a change to ignore
> allocations if that's not yet ready. Also, and maybe preferable, not land
> that patch yet, but we can check in tests that the start times of the
> separate underlying actors are using the new time system, or synthesize a
> fake profile and that markers/frames line up correctly

What I was thinking was that, for the profiler time change specifically,
we have a way to tell if the actor is providing a start time that is consistent
with the timeline; see appended patch.

This seemed necessary for this patch due to the call to RecordingUtils.offsetSampleTimes.
Once the other patches are in, I think this can be replaced by the patch in
bug 1152829 -- I just didn't want to put in this patch and have the tree be
in an inconsistent state.

diff --git a/browser/devtools/performance/modules/logic/front.js b/browser/devtools/performance/modules/logic/front.js
index b7e0377..021fbf5 100644
--- a/browser/devtools/performance/modules/logic/front.js
+++ b/browser/devtools/performance/modules/logic/front.js
@@ -327,6 +327,14 @@ PerformanceFront.prototype = {
     let timelineStartTime = yield this._timeline.start(options);
     let memoryStartTime = yield this._memory.start(options);
 
+    if (startTime) {
+      // A non-zero profiler start time indicates that the profiler
+      // and the timeline have synced epochs.  So, choose the lower of
+      // the two as the start time.
+      startTime = Math.min(startTime, timelineStartTime);
+      timelineStartTime = startTime;
+    }
+
     let data = {
       profilerStartTime: startTime, timelineStartTime, memoryStartTime,
       generation, position, totalSize
Rebased and squashed.  After discussing with Jordan we decided to drop
the final time-synching hunk in favor of a later commit, as the
current patch does not make the current situation notably worse.
Attachment #8622674 - Attachment is obsolete: true
Attachment #8622675 - Attachment is obsolete: true
Attachment #8622679 - Attachment is obsolete: true
Attachment #8623230 - Flags: review+
A couple of tests checked that the profiler's initial start time was
0; but this is no longer true.
Attachment #8623230 - Attachment is obsolete: true
Attachment #8623476 - Flags: review+
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/4583dd29ef0e
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
You need to log in before you can comment on or make changes to this bug.