Closed Bug 1154115 Opened 5 years ago Closed 5 years ago

Rewrite profile JSON streaming

Categories

(Core :: Gecko Profiler, defect)

x86_64
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox40 --- fixed
firefox41 --- fixed

People

(Reporter: shu, Assigned: shu)

References

Details

Attachments

(9 files, 11 obsolete files)

20.22 KB, patch
djvj
: review+
Details | Diff | Splinter Review
44.43 KB, application/json
Details
6.41 KB, application/javascript
Details
107.48 KB, patch
mstange
: review+
Details | Diff | Splinter Review
7.13 KB, patch
vporof
: review+
jsantell
: review+
Details | Diff | Splinter Review
19.56 KB, patch
jsantell
: review+
Details | Diff | Splinter Review
59.98 KB, patch
vporof
: review+
Details | Diff | Splinter Review
119.17 KB, patch
jsantell
: review+
Details | Diff | Splinter Review
4.26 KB, patch
bgrins
: review+
Details | Diff | Splinter Review
Profiler data streaming time and JSON parsing time are LUDICROUS. We can get pretty far with naive deduplication of frame data.
Profiling a single run of js/src/octane/run.js:

Control
=======
Streaming to string:  2352 ms
Parsing string to JS: 1427 ms
JSON string size:     78905734 bytes

Dedup
=====
Streaming to string:  763 ms
Parsing string to JS: 155 ms
JSON string size:     10220947 bytes
Attachment #8592009 - Flags: review?(kvijayan)
Note that this changes the structure of the profile data.

Whereas before frame objects were inline, now frame objects are OOL. The "frames" property on a sample is now an array of integers, indices into the "framesTable" array, a sibling node to the "samples" array.
Comment on attachment 8592009 [details] [diff] [review]
Deduplicate frames when streaming profiler data.

Oops, how this handles shutdown streaming is wrong.
Attachment #8592009 - Flags: review?(kvijayan)
v2, fixes shutdown streaming issue.
Attachment #8592009 - Attachment is obsolete: true
Attachment #8592021 - Flags: review?(kvijayan)
Attached patch Deduplicate prefixes of stacks. (obsolete) — Splinter Review
This gets another 1MB on the Octane profile, at the cost of extra work for the
frontend.
(In reply to Shu-yu Guo [:shu] from comment #5)
> Created attachment 8592132 [details] [diff] [review]
> Deduplicate prefixes of stacks.
> 
> This gets another 1MB on the Octane profile, at the cost of extra work for
> the
> frontend.

The format for "stackTable" is:

"stackTable": [
  ...,
  [ leafFrameIndex] | [ stackPrefixIndex, leafFrameIndex ],
  ...
]

To recover the stack at index N:

stack(N) =
  if stackTable[N].length == 2
    stack(stackTable[N][0]) ++ [frameTable[stackTable[N][1]]
  else
    [frameTable[stackTable[N][0]]]
Comment on attachment 8592021 [details] [diff] [review]
Deduplicate frames when streaming profiler data.

Review of attachment 8592021 [details] [diff] [review]:
-----------------------------------------------------------------

See comments.  Otherwise looks good.  Nice results!

::: js/src/jit/OptimizationTracking.cpp
@@ +1275,1 @@
>  {

MOZ_ASSERT_IF(optsIndexOut, !*optsIndexOut)  at start.

@@ +1282,5 @@
> +    Maybe<uint8_t> optsIndex = entry.trackedOptimizationIndexAtAddr(addr, &entryOffset);
> +    if (optsIndexOut)
> +        *optsIndexOut = optsIndex;
> +    if (optsIndex.isSome())
> +        return (void*)(((uint8_t*) entry.nativeStartAddr()) + entryOffset);

From what I can see, what you are doing here is normalizing a regular returnaddr into jitcode to a well-defined one indicated by its entry or optimization site.

Do we really want to do that?  Consider the situation where we want to record line numbers of leaf frames.  We might have two leaf frame addresses 0xA and 0xB.  Neither of them have optinfo, so we 'normalize' both of them to entry.nativeStartAddr() and lose the opportunity to emit line number information.

Do we really need address normalization?  For returnAddrs, they will be shared implicitly so normalization is meaningless.  For leaf addresses, there is only 1 per sample and it shouldn't hurt that bad to leave them un-normalized and take the deduplication hit associated with not having them sort well into shared addresses.

::: tools/profiler/ProfileEntry.cpp
@@ +709,5 @@
> +        mSavedUniqueFrames->stream(b, mPseudoStack->mRuntime);
> +        mSavedStreamedUniqueFrames.clear();
> +        mSavedUniqueFrames.reset();
> +      } else {
> +        freshUniqueFrames.stream(b, mPseudoStack->mRuntime);

Question about how this works with regards to interaction with flushing unique frames on shutdown.

When we flush, we would symbolicate the frame data and save it as strings into mSavedStreamedUniqueFrames and mSavedStreamSamples.  After this, the sampler would still run and collect more samples (which would be pure native + pseudostack traces with no JS stack included).

This suggests that at streaming time, we may have _both_ mSavedStreamedUniqueFrames and freshUniqueFrames be nonempty, with both having to be streamed.

Likewise, above, we may have BOTH mSavedStreamSamples be non-empty, and have more un-flushed samples we have still collected that need to be streamed.

Is this case possible, and if so is it handled?  Or is it not possible, or otherwise does not need to be handled?
Attachment #8592021 - Flags: review?(kvijayan) → review+
Summary: Deduplicate frames in profiler data → Rewrite profile JSON streaming
Duplicate of this bug: 1154128
- Deduplicate strings
- Deduplicate frames
- Deduplicate stack prefixes
- Emit "schema" JSON, e.g., [1,2] instead of { "foo": 1, "bar": 2 }
- Remove JSStreamWriter in favor of mfbt's JSONWriter

This is about 4x+ faster and uses 10x+ less space.
Attachment #8592021 - Attachment is obsolete: true
Attachment #8592132 - Attachment is obsolete: true
Attachment #8596305 - Flags: review?(mstange)
Attached patch Rewrite profiler JSON streaming. (obsolete) — Splinter Review
Oops, forgot to add some files.
Attachment #8596305 - Attachment is obsolete: true
Attachment #8596305 - Flags: review?(mstange)
Attachment #8596307 - Flags: review?(mstange)
Comment on attachment 8596303 [details] [diff] [review]
Rewrite the JSAPI profiling API to use a FrameHandle, as to avoid multiple lookups in JitcodeGlobalTable.

Review of attachment 8596303 [details] [diff] [review]:
-----------------------------------------------------------------

Not done review yet.. but I made these comments so far.

::: js/public/ProfilingFrameIterator.h
@@ -129,5 @@
>      bool iteratorDone();
>  };
>  
> -extern JS_PUBLIC_API(ProfilingFrameIterator::FrameKind)
> -GetProfilingFrameKindFromNativeAddr(JSRuntime* runtime, void* pc);

Nice.  Seems like this is unused now.

@@ +185,2 @@
>      // Called once per frame.
> +    virtual void operator()(const FrameHandle& frame) = 0;

I'm a bit confused.  You're changing the abstract interface for ForEachProfiledFrameOp, but I don't see any changes to StreamJSFramesOp (in ProfileEntry.cpp), which implements this interface and presumably overrides the method.

::: js/src/jit/JitcodeMap.h
@@ +714,5 @@
>          MOZ_ASSERT(isQuery());
>          return query_;
>      }
>  
> +    void* regionNativeStartAddrAtAddr(JSRuntime* rt, void* ptr) const {

This name seems confusing.  Should we call this 'normalizedNativeAddrFor' or 'canonicalNativeAddrFor' or something instead?  'regionNativeStartAddr' conveys its implementation, but not the intent.  It would seem the intent is more relevant.
(In reply to Kannan Vijayan [:djvj] from comment #12)
> Comment on attachment 8596303 [details] [diff] [review]
> Rewrite the JSAPI profiling API to use a FrameHandle, as to avoid multiple
> lookups in JitcodeGlobalTable.
> 
> Review of attachment 8596303 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Not done review yet.. but I made these comments so far.
> 
> ::: js/public/ProfilingFrameIterator.h
> @@ -129,5 @@
> >      bool iteratorDone();
> >  };
> >  
> > -extern JS_PUBLIC_API(ProfilingFrameIterator::FrameKind)
> > -GetProfilingFrameKindFromNativeAddr(JSRuntime* runtime, void* pc);
> 
> Nice.  Seems like this is unused now.
> 
> @@ +185,2 @@
> >      // Called once per frame.
> > +    virtual void operator()(const FrameHandle& frame) = 0;
> 
> I'm a bit confused.  You're changing the abstract interface for
> ForEachProfiledFrameOp, but I don't see any changes to StreamJSFramesOp (in
> ProfileEntry.cpp), which implements this interface and presumably overrides
> the method.

The change you're expecting is in the second part, exactly in StreamJSFramesOp.

> 
> ::: js/src/jit/JitcodeMap.h
> @@ +714,5 @@
> >          MOZ_ASSERT(isQuery());
> >          return query_;
> >      }
> >  
> > +    void* regionNativeStartAddrAtAddr(JSRuntime* rt, void* ptr) const {
> 
> This name seems confusing.  Should we call this 'normalizedNativeAddrFor' or
> 'canonicalNativeAddrFor' or something instead?  'regionNativeStartAddr'
> conveys its implementation, but not the intent.  It would seem the intent is
> more relevant.

Sure, in the second part on the profiler side, this is actually referred to as canonicalization.
Attachment #8596307 - Flags: feedback?(kvijayan)
Shu, can you attach an example profile with the new format? Does the format have an easy-to-find marker that consumers can check for, if they want to handle both traditional and new profiles?
Seconding the need for some kind of value indicating the type of response, like a version number or probably better, a property indicating that frame refs are used, as the perf tools must handle older gecko versions
Attached file example.json
Example JSON of new format. I'll upload a new patch shortly with a big block comment explaining the format.
Attached patch Rewrite profiler JSON streaming. (obsolete) — Splinter Review
Add new block comment explaining format of thread profiles + bump profile version to 3.
Attachment #8596307 - Attachment is obsolete: true
Attachment #8596307 - Flags: review?(mstange)
Attachment #8596307 - Flags: feedback?(kvijayan)
Attachment #8596813 - Flags: review?(mstange)
Attachment #8596813 - Flags: feedback?(kvijayan)
(In reply to Markus Stange [:mstange] from comment #14)
> Shu, can you attach an example profile with the new format? Does the format
> have an easy-to-find marker that consumers can check for, if they want to
> handle both traditional and new profiles?

The "meta" property will have "version": 3 instead of "version": 2
Attached patch Rewrite profiler JSON streaming. (obsolete) — Splinter Review
I suck, forgot to squash the commits. The comment is in ProfileEntry.h
Attachment #8596814 - Flags: review?(mstange)
Attachment #8596814 - Flags: feedback?(kvijayan)
Attachment #8596813 - Attachment is obsolete: true
Attachment #8596813 - Flags: review?(mstange)
Attachment #8596813 - Flags: feedback?(kvijayan)
JS function that takes the new, version schema'ed, deduplicated profile and inflates it back to an version 2 profile.
(In reply to Shu-yu Guo [:shu] from comment #20)
> Created attachment 8597171 [details]
> inflate-schema-profile.js
> 
> JS function that takes the new, version schema'ed, deduplicated profile and
> inflates it back to an version 2 profile.

How much time does this take? Does it eat the time improvements we got in the first place?

Would it make sense just to define an interface for interacting with the profiler data and implementing it for each profile version?
(In reply to Nick Fitzgerald [:fitzgen] from comment #21)
> (In reply to Shu-yu Guo [:shu] from comment #20)
> > Created attachment 8597171 [details]
> > inflate-schema-profile.js
> > 
> > JS function that takes the new, version schema'ed, deduplicated profile and
> > inflates it back to an version 2 profile.
> 
> How much time does this take? Does it eat the time improvements we got in
> the first place?
> 
> Would it make sense just to define an interface for interacting with the
> profiler data and implementing it for each profile version?

You have to pay the price of inflating it eventually, lazily or eagerly. We could rewrite a bunch of code to handle the new version -- but the verdict, I think, was that maintaining 2 versions would be hell.

Eager inflation + the deduplication patch should still be much faster than what we have now. Moreover, we can selectively inflate just the samples of the main thread, just the markers, etc.

The numbers from some random run of all of Octane run from xpcshell

Original
--------
>> SHU buf 75207023 bytes
>> SHU ToJSObject took 3460 ms

Dedup + Inflate
---------------
>> SHU buf 3784403 bytes
>> SHU ToJSObject took 389 ms
>> SHU inflate took 275 ms
>> SHU total 275 + 389 = 664 ms

Note that's also inflating all of the profile with markers and all - it's faster just inflating the samples. Even so, that's 20x smaller and 5x faster.
Rebased on top of aSinceTime stuff landing.
Attachment #8596814 - Attachment is obsolete: true
Attachment #8596814 - Flags: review?(mstange)
Attachment #8596814 - Flags: feedback?(kvijayan)
Attachment #8598324 - Flags: review?(mstange)
Attachment #8598324 - Flags: feedback?(kvijayan)
Comment on attachment 8596303 [details] [diff] [review]
Rewrite the JSAPI profiling API to use a FrameHandle, as to avoid multiple lookups in JitcodeGlobalTable.

Review of attachment 8596303 [details] [diff] [review]:
-----------------------------------------------------------------

Not done review yet.. but I made these comments so far.

::: js/public/ProfilingFrameIterator.h
@@ -129,5 @@
>      bool iteratorDone();
>  };
>  
> -extern JS_PUBLIC_API(ProfilingFrameIterator::FrameKind)
> -GetProfilingFrameKindFromNativeAddr(JSRuntime* runtime, void* pc);

Nice.  Seems like this is unused now.

@@ +185,2 @@
>      // Called once per frame.
> +    virtual void operator()(const FrameHandle& frame) = 0;

I'm a bit confused.  You're changing the abstract interface for ForEachProfiledFrameOp, but I don't see any changes to StreamJSFramesOp (in ProfileEntry.cpp), which implements this interface and presumably overrides the method.

::: js/src/jit/JitcodeMap.cpp
@@ +1561,5 @@
> +    label_(label),
> +    depth_(depth)
> +{
> +    updateHasTrackedOptimizations();
> +    updateCanonicalAddress();

Seems like this is the only user of |updateCanonicalAddress|.  It makes sense to put updateHasTrackedOptimizations into a method, since its implementation refers to tracked optimization details, but does canonical address need it's own method?

Also, would be better to rename |updateHasTrackedOptimizations| to |initHasTrackedOptimizations| since it's only called at init time.

::: js/src/jit/JitcodeMap.h
@@ +714,5 @@
>          MOZ_ASSERT(isQuery());
>          return query_;
>      }
>  
> +    void* regionNativeStartAddrAtAddr(JSRuntime* rt, void* ptr) const {

This name seems confusing.  Should we call this 'normalizedNativeAddrFor' or 'canonicalNativeAddrFor' or something instead?  'regionNativeStartAddr' conveys its implementation, but not the intent.  It would seem the intent is more relevant.
Attachment #8596303 - Flags: review?(kvijayan) → review+
Comment on attachment 8598324 [details] [diff] [review]
Rewrite profiler JSON streaming.

Review of attachment 8598324 [details] [diff] [review]:
-----------------------------------------------------------------

This is really good! Sorry it took me so long to get to this.

I think it would have helped if you could have split the patch into more pieces, e.g. (1) convert to the other stream writer, (2) add Maybe<>s everywhere, (3) do the actual changes. But it's fine now.

::: tools/profiler/ProfileEntry.cpp
@@ +606,5 @@
> +
> +  aWriter.StartArrayElement();
> +  {
> +    uint32_t index = 0;
> +    uint32_t lastNonNullIndex = 0;

Please add a comment here that this is to avoid adding nulls at the end of the array, since we only need them if a non-null element follows.

@@ -406,5 @@
>    while (readPos != mWritePos) {
>      ProfileEntry entry = mEntries[readPos];
>      if (entry.mTagName == 'T') {
>        currentThreadID = entry.mTagInt;
> -      hasCurrentTime = false;

Don't you need to replace this with a currentTime.reset()? Or did you determine that this line was unnecessary in the old code?

::: tools/profiler/ProfileEntry.h
@@ +100,5 @@
> +  void WriteElement(mozilla::JSONWriter& aWriter, const char* aStr) {
> +    aWriter.IntElement(GetIndex(aStr));
> +  }
> +
> +  uint32_t GetIndex(const char* aStr);

I think it would be useful to add a comment that this inserts the string into the string table if it doesn't exist yet.

@@ +269,5 @@
>    ProfilerMarkerLinkedList mStoredMarkers;
>  };
>  
> +//
> +// ThreadProfile JSON Format

This comment is amazingly useful. Thank you very much!

::: tools/profiler/ProfilerBacktrace.h
@@ +17,5 @@
>    explicit ProfilerBacktrace(SyncProfile* aProfile);
>    ~ProfilerBacktrace();
>  
> +  // ProfilerBacktraces' stacks are deduplicated in the context of the
> +  // containing profile in which the backtrace is as a marker payload.

containing or contained?

::: tools/profiler/ProfilerMarkers.cpp
@@ +122,4 @@
>  {
> +  streamCommonProps("innerHTML", aWriter, aUniqueStacks);
> +  // TODO: Finish me
> +  //aWriter.NameValue("innerHTML", "<img src=''/>");

Hah, I was going to complain, but then I saw the old code...
Attachment #8598324 - Flags: review?(mstange) → review+
Hey :shu, while you wait for your final feedback, is it possible for you to post an updated version of this patch? It's bitrotted, and I'd like to base any work I do for bug 1116188 on it.

Thanks!
Assignee: nobody → shu
Flags: needinfo?(shu)
Hey Mike, I'm working on the devtools side of the profiler -- I think both the performance tool and gecko profiler will need the same conversion for this, requiring two things:

* A property set somewhere indicating what kind of format the profiler data will be in (devtools can handle this in one way, but not sure if that'd be accessible to GP) -- maybe a property of the nsIProfiler.
* A way to transform pre-Fx40 (or whenever this lands) profiler data to the new version. We'll need this in devtools to support older Geckos, and this will be exposed in a module loadable with the SDK's loader, so Gecko Profiler can use this. I think Shu is working on that now in this patch as well
(In reply to Jordan Santell [:jsantell] [@jsantell] from comment #27)
> Hey Mike, I'm working on the devtools side of the profiler -- I think both
> the performance tool and gecko profiler will need the same conversion for
> this, requiring two things:
> 
> * A property set somewhere indicating what kind of format the profiler data
> will be in (devtools can handle this in one way, but not sure if that'd be
> accessible to GP) -- maybe a property of the nsIProfiler.

The profiler JSON has a "version" property. Is that not good enough?

> * A way to transform pre-Fx40 (or whenever this lands) profiler data to the
> new version. We'll need this in devtools to support older Geckos, and this
> will be exposed in a module loadable with the SDK's loader, so Gecko
> Profiler can use this. I think Shu is working on that now in this patch as
> well

I'll write this soon.
Attached patch Rewrite profiler JSON streaming. (obsolete) — Splinter Review
Rebased for mconley.
Nope, a version property is good enough for our use case -- if we need to know it's format before hand, like if we're determining profiler support (like if we wanted to stop supporting the "old way"), then we'll have other ways to do that just fine.
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #26)
> Hey :shu, while you wait for your final feedback, is it possible for you to
> post an updated version of this patch? It's bitrotted, and I'd like to base
> any work I do for bug 1116188 on it.
> 
> Thanks!

Here you go.
Flags: needinfo?(shu)
Comment on attachment 8600475 [details] [diff] [review]
Make the performance devtool handle the new profiler JSON format.

Review of attachment 8600475 [details] [diff] [review]:
-----------------------------------------------------------------

With this I saw about a 7x speedup for computing the call tree for all of Octane. On my machine it takes about ~450ms (down from ~3000ms) to build the tree. It takes much longer to display.

Things still WIP:

- Tests still broken (maybe you guys wanna help fix?)
- No adapter from old->new version
- Left some measurement printfs in
Attachment #8600475 - Flags: review?(vporof)
Attachment #8600475 - Flags: review?(jsantell)
Comment on attachment 8600475 [details] [diff] [review]
Make the performance devtool handle the new profiler JSON format.

Review of attachment 8600475 [details] [diff] [review]:
-----------------------------------------------------------------

All in all, there's a lot going on here -- there's the inflation occuring, the rewriting of the tree model, and also many rewrites for optimization reasons (I think, although it's not clear to me).

Generally:

* All the optimization stuff makes less clear what's occurring -- and harder to maintain, and review. This should be done separately, and in an isolated way, in a place that we'll ideally never have to look at.
* Prefer objects as first arguments over taking X properties from an object and spreading them out over several arguments in a function
* More documentation and rational behind what's going on, especially in the cases where things are done for optimization reasons

::: browser/devtools/performance/modules/recording-utils.js
@@ -12,5 @@
>  
>  exports.RecordingUtils = {};
>  
>  /**
> - * Filters all the samples in the provided profiler data to be more recent

We'll need to keep this for backwards compat.

::: browser/devtools/performance/views/details-js-call-tree.js
@@ +82,2 @@
>  
> +    var start = Date.now();

remove debugging info

@@ +82,3 @@
>  
> +    var start = Date.now();
> +    let threadNode = new ThreadNode(thread.samples,

We should just pass in `thread` as the first argument, with option as the second

@@ +86,5 @@
> +                                    thread.frameTable,
> +                                    thread.stringTable,
> +                                    { startTime, endTime,
> +                                      contentOnly, invertTree, flattenRecursion });
> +    console.log(">> SHU prepareCallTree took " + (Date.now() - start) + " ms\n");

remove debugging info

::: browser/devtools/performance/views/details-js-flamegraph.js
@@ +62,2 @@
>  
> +    var start = Date.now();

remove debugging info

@@ +62,3 @@
>  
> +    var start = Date.now();
> +    let data = FlameGraphUtils.createFlameGraphDataFromSamples(thread.samples,

same thing, just should pass in `thread` as first argument

@@ +71,3 @@
>        showIdleBlocks: PerformanceController.getOption("show-idle-blocks") && L10N.getStr("table.idle")
>      });
> +    console.log(">> SHU flame graph took " + (Date.now() - start) + " ms");

remove debugging info

::: browser/devtools/shared/profiler/frame-utils.js
@@ +38,5 @@
>  /**
>   * Parses the raw location of this function call to retrieve the actual
>   * function name, source url, host name, line and column.
>   */
> +exports.parseLocation = function parseLocation(location, fallbackLine, fallbackColumn) {

Are all these changes to parseLocation faster? By how much? This is pretty unintuitive. Which of these changes are necessary for this patch? I'd rather minimal changes here, and we can handle improving parseLocation and isContent in bug 1160581

@@ +122,5 @@
> + *        If a chrome frame, the category.
> + * @return boolean
> + *         True if a content frame, false if a chrome frame.
> + */
> +function isContent(location, category) {

would prefer the signature here to still be an object

@@ +128,5 @@
> +  if (category) {
> +    return false;
> +  }
> +
> +  // Locations in frames with function names look like "functionName

"functionName (foo://bar)" should be all in one line, so its easier to read

@@ +165,5 @@
> +  }
> +  return frameTable._inflatedCache;
> +};
> +
> +exports.getOrAddInflatedFrame = function getOrAddInflatedFrame(cache, index, frameTable,

No need to indent extra args on the next line.

@@ +167,5 @@
> +};
> +
> +exports.getOrAddInflatedFrame = function getOrAddInflatedFrame(cache, index, frameTable,
> +                                                               stringTable) {
> +  let inflatedFrame = cache[index];

if (inflatedFrame !== null) {
  return inflatedFrame;
}

then continue everything else

@@ +169,5 @@
> +exports.getOrAddInflatedFrame = function getOrAddInflatedFrame(cache, index, frameTable,
> +                                                               stringTable) {
> +  let inflatedFrame = cache[index];
> +  if (inflatedFrame === null) {
> +    const LOCATION_SLOT = frameTable.schema.location;

`const` is probably less appropriate here than `let`

@@ +177,5 @@
> +    let frame = frameTable.data[index];
> +    inflatedFrame = cache[index] = new InflatedFrame(stringTable[frame[LOCATION_SLOT]],
> +                                                     frame[OPTIMIZATIONS_SLOT],
> +                                                     frame[LINE_SLOT],
> +                                                     frame[CATEGORY_SLOT]);

Signature for InflatedFrame that would be easier to read:

function InflatedFrame (frameTable, stringTable, index) {
  // Indices for frame data
  let { optimizations, line, category, location } = frameTable.schema;
  let frame = frameTable.data[index];
  this.location = stringTable[frame[location]];
  this.optimizations = frame[optimizations];
  this.line = frame[line];
  this.category = frame[category];
  // Would like isContent to still take an object with location and category properties, so we can use it in the future for other types of objects
  this.isContent = isContent(this.location, this.category);
}

@@ +191,5 @@
> + * @param object frameKeyOptions
> + * @return string
> + */
> +exports.getOrAddFrameKey = function getOrAddFrameKey(keyCache, inflatedFrame,
> +                                                     frameIndex, frameKeyOptions) {

Keep on one line

@@ +225,5 @@
> + * @param number line
> + * @param number category
> + */
> +function InflatedFrame(location, optimizations, line, category) {
> +  this._set(location, optimizations, line, category);

this._set() should just be put in the constructor, unless it'll be called elsewhere (I don't see it anywhere else)

@@ +314,5 @@
> +//
> +// They are written this way because they are hot. Each frame is checked for
> +// being content or chrome when processing the profile.
> +
> +function isColonSlashSlash(location, i) {

Again, what is the performance increase of these? I'd rather these be in a separate bug altogether if not necessary here, even if they are great performance increases

::: browser/devtools/shared/profiler/jit.js
@@ +110,5 @@
>   * @type {number} samples
>   * @type {number} id
>   */
>  
> +const OptimizationSite = exports.OptimizationSite = function (id, opts) {

Should add jsdoc notation here like other functions, I'm not sure what these args are

@@ +120,5 @@
>  /**
>   * Returns a boolean indicating if the passed in OptimizationSite
>   * has a "good" outcome at the end of its attempted strategies.
>   *
> + * @param {Array<string>} stringTable

There are no parameters for this function

@@ +164,4 @@
>   */
>  
> +const JITOptimizations = exports.JITOptimizations = function (rawSites, stringTable) {
> +  function maybeString(stringTable, index) {

This should be outside of the constructor, there are no free variables used here. If there's a reason this is inside the constructor, document it (and move it to the bottom of the function scope)

@@ +173,2 @@
>  
> +  addNewSite:

no no no no no labels. is this a performance thing? if it is, this should be heavily documented with a link to labels -> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/label

@@ +218,3 @@
>    }
> +
> +  this.optimizationSites = sites.sort((a, b) => b.samples - a.samples);;

All this is very confusing inside this constructor. Again, is this for performance reasons?

::: browser/devtools/shared/profiler/tree-view.js
@@ +141,2 @@
>        if (this.visibleCells.allocations) {
> +        let childrenAllocations = sum([c.allocations for (c of this.frame.calls)]);

Is this the old Gecko-only way for array comprehensions? We should use the new one https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Array_comprehensions

::: browser/devtools/shared/widgets/FlameGraph.js
@@ +977,5 @@
>   */
>  let FlameGraphUtils = {
>    _cache: new WeakMap(),
>  
> +  createFlameGraphDataFromSamples: function(samples, stackTable, frameTable, stringTable,

Needs documentation, and should take an object as the first arg rather than 4 more args

@@ +1069,2 @@
>        }
> +      inflateTime += (Date.now() - start);

debugging

@@ +1115,2 @@
>        }
> +      bucketTime += (Date.now() - start);

debugging

@@ +1129,5 @@
>      }
>  
>      this._cache.set(samples, out);
> +
> +    console.log(">> SHU inflateTime " + inflateTime + " ms");

debugging

@@ +1130,5 @@
>  
>      this._cache.set(samples, out);
> +
> +    console.log(">> SHU inflateTime " + inflateTime + " ms");
> +    console.log(">> SHU bucketTime " + bucketTime + " ms");

debugging

@@ +1176,4 @@
>     * @return string
>     */
> +  _formatLabel: function (key, frame) {
> +    let { functionName, fileName, line } = FrameUtils.parseLocation(key, frame.line);

Where's the idle case?
Attachment #8600475 - Flags: review?(jsantell)
Also there's been a few changes to several components here that'll need to be taken into account (everything here is moving fast for the next month or so...)
Depends on: 1160691
Blocks: 1160691
No longer depends on: 1160691
(In reply to Jordan Santell [:jsantell] [@jsantell] from comment #34)
> Comment on attachment 8600475 [details] [diff] [review]
> Make the performance devtool handle the new profiler JSON format.
> 
> Review of attachment 8600475 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> All in all, there's a lot going on here -- there's the inflation occuring,
> the rewriting of the tree model, and also many rewrites for optimization
> reasons (I think, although it's not clear to me).
> 
> Generally:
> 
> * All the optimization stuff makes less clear what's occurring -- and harder
> to maintain, and review. This should be done separately, and in an isolated
> way, in a place that we'll ideally never have to look at.
> * Prefer objects as first arguments over taking X properties from an object
> and spreading them out over several arguments in a function
> * More documentation and rational behind what's going on, especially in the
> cases where things are done for optimization reasons
> 
> ::: browser/devtools/performance/modules/recording-utils.js
> @@ -12,5 @@
> >  
> >  exports.RecordingUtils = {};
> >  
> >  /**
> > - * Filters all the samples in the provided profiler data to be more recent
> 
> We'll need to keep this for backwards compat.

Okay, I'll put it back.

> Are all these changes to parseLocation faster? By how much? This is pretty
> unintuitive. Which of these changes are necessary for this patch? I'd rather
> minimal changes here, and we can handle improving parseLocation and
> isContent in bug 1160581
> 

Numbers and patch are in bug 1160691. Sorry, didn't see you already filing your bug.

> @@ +177,5 @@
> > +    let frame = frameTable.data[index];
> > +    inflatedFrame = cache[index] = new InflatedFrame(stringTable[frame[LOCATION_SLOT]],
> > +                                                     frame[OPTIMIZATIONS_SLOT],
> > +                                                     frame[LINE_SLOT],
> > +                                                     frame[CATEGORY_SLOT]);
> 
> Signature for InflatedFrame that would be easier to read:
> 
> function InflatedFrame (frameTable, stringTable, index) {
>   // Indices for frame data
>   let { optimizations, line, category, location } = frameTable.schema;
>   let frame = frameTable.data[index];
>   this.location = stringTable[frame[location]];
>   this.optimizations = frame[optimizations];
>   this.line = frame[line];
>   this.category = frame[category];
>   // Would like isContent to still take an object with location and category
> properties, so we can use it in the future for other types of objects
>   this.isContent = isContent(this.location, this.category);
> }

I disagree on the naming here. The _SLOT suffix is important IMO as a hint about the kind of object being accessed.

> @@ +225,5 @@
> > + * @param number line
> > + * @param number category
> > + */
> > +function InflatedFrame(location, optimizations, line, category) {
> > +  this._set(location, optimizations, line, category);
> 
> this._set() should just be put in the constructor, unless it'll be called
> elsewhere (I don't see it anywhere else)

Oops, will move.

> @@ +218,3 @@
> >    }
> > +
> > +  this.optimizationSites = sites.sort((a, b) => b.samples - a.samples);;
> 
> All this is very confusing inside this constructor. Again, is this for
> performance reasons?
> 

The sorting is done in the constructor because JITOptimizations is now constructed lazily, when the side pane requests opts to be displayed. The histogram is no longer computed during call tree building.

> ::: browser/devtools/shared/profiler/tree-view.js
> @@ +141,2 @@
> >        if (this.visibleCells.allocations) {
> > +        let childrenAllocations = sum([c.allocations for (c of this.frame.calls)]);
> 
> Is this the old Gecko-only way for array comprehensions? We should use the
> new one
> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/
> Array_comprehensions
> 

Comprehensions were killed for ES6. I'd be very surprised if that new form referenced in MDN lives to be standardized in ES7. Would you rather I change it to a for loop?

> ::: browser/devtools/shared/widgets/FlameGraph.js
> 
> @@ +1176,4 @@
> >     * @return string
> >     */
> > +  _formatLabel: function (key, frame) {
> > +    let { functionName, fileName, line } = FrameUtils.parseLocation(key, frame.line);
> 
> Where's the idle case?

In the caller, when setting the text: property.
(In reply to Shu-yu Guo [:shu] from comment #36)
> > Are all these changes to parseLocation faster? By how much? This is pretty
> > unintuitive. Which of these changes are necessary for this patch? I'd rather
> > minimal changes here, and we can handle improving parseLocation and
> > isContent in bug 1160581
> > 
> 
> Numbers and patch are in bug 1160691. Sorry, didn't see you already filing
> your bug.

Great! That bug is just for nsIURI by the way so this is good

> > @@ +177,5 @@
> > > +    let frame = frameTable.data[index];
> > > +    inflatedFrame = cache[index] = new InflatedFrame(stringTable[frame[LOCATION_SLOT]],
> > > +                                                     frame[OPTIMIZATIONS_SLOT],
> > > +                                                     frame[LINE_SLOT],
> > > +                                                     frame[CATEGORY_SLOT]);
> > 
> > Signature for InflatedFrame that would be easier to read:
> > 
> > function InflatedFrame (frameTable, stringTable, index) {
> >   // Indices for frame data
> >   let { optimizations, line, category, location } = frameTable.schema;
> >   let frame = frameTable.data[index];
> >   this.location = stringTable[frame[location]];
> >   this.optimizations = frame[optimizations];
> >   this.line = frame[line];
> >   this.category = frame[category];
> >   // Would like isContent to still take an object with location and category
> > properties, so we can use it in the future for other types of objects
> >   this.isContent = isContent(this.location, this.category);
> > }
> 
> I disagree on the naming here. The _SLOT suffix is important IMO as a hint
> about the kind of object being accessed.

Those kinds of things can be written in the function documentation above it

> 
> > @@ +218,3 @@
> > >    }
> > > +
> > > +  this.optimizationSites = sites.sort((a, b) => b.samples - a.samples);;
> > 
> > All this is very confusing inside this constructor. Again, is this for
> > performance reasons?
> > 
> 
> The sorting is done in the constructor because JITOptimizations is now
> constructed lazily, when the side pane requests opts to be displayed. The
> histogram is no longer computed during call tree building.

Oh, nice -- this is definitely desirable as this is the edge case (people debugging JIT stuff), for now anyway.

> > ::: browser/devtools/shared/profiler/tree-view.js
> > @@ +141,2 @@
> > >        if (this.visibleCells.allocations) {
> > > +        let childrenAllocations = sum([c.allocations for (c of this.frame.calls)]);
> > 
> > Is this the old Gecko-only way for array comprehensions? We should use the
> > new one
> > https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/
> > Array_comprehensions
> > 
> 
> Comprehensions were killed for ES6. I'd be very surprised if that new form
> referenced in MDN lives to be standardized in ES7. Would you rather I change
> it to a for loop?

I'm sure you have more insight on if comprehensions will make it to ES7 -- so if it's unlikely from your perspective, let's just make this an ugly old for-loop. Generally we (devtools) don't want to do any Mozilla-specific JS that is not on a standards track. More of a grey area for things that are on ES* standards track, but still in proposal status.
No longer blocks: 1160691
(In reply to Jordan Santell [:jsantell] [@jsantell] from comment #34)
> Comment on attachment 8600475 [details] [diff] [review]
> 
> @@ +173,2 @@
> >  
> > +  addNewSite:
> 
> no no no no no labels. is this a performance thing? if it is, this should be
> heavily documented with a link to labels ->
> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/
> label
> 

I disagree. You're saying this version:

(1)

  for (let rawSite of rawSites) {
    let found = false;
    for (let i = 0; i < sites.length; i++) {
      if (sites[i].data === rawSite) {
        sites[i].samples++;
        found = true;
        break;
      }
      if (found) {
        continue;
      }
    }
    sites.push(new OptimizationSite(sites.length, rawSite));
  }

is cleaner than this version?

(2)

  addNewSite:
  for (let rawSite of rawSites) {
    for (let i = 0; i < sites.length; i++) {
      if (sites[i].data === rawSite) {
        sites[i].samples++;
        continue addNewSite;
      }
    }
    sites.push(new OptimizationSite(sites.length, rawSite));
  }
(In reply to Jordan Santell [:jsantell] [@jsantell] from comment #37)

> > > @@ +177,5 @@
> > > > +    let frame = frameTable.data[index];
> > > > +    inflatedFrame = cache[index] = new InflatedFrame(stringTable[frame[LOCATION_SLOT]],
> > > > +                                                     frame[OPTIMIZATIONS_SLOT],
> > > > +                                                     frame[LINE_SLOT],
> > > > +                                                     frame[CATEGORY_SLOT]);
> > > 
> > > Signature for InflatedFrame that would be easier to read:
> > > 
> > > function InflatedFrame (frameTable, stringTable, index) {
> > >   // Indices for frame data
> > >   let { optimizations, line, category, location } = frameTable.schema;
> > >   let frame = frameTable.data[index];
> > >   this.location = stringTable[frame[location]];
> > >   this.optimizations = frame[optimizations];
> > >   this.line = frame[line];
> > >   this.category = frame[category];
> > >   // Would like isContent to still take an object with location and category
> > > properties, so we can use it in the future for other types of objects
> > >   this.isContent = isContent(this.location, this.category);
> > > }
> > 
> > I disagree on the naming here. The _SLOT suffix is important IMO as a hint
> > about the kind of object being accessed.
> 
> Those kinds of things can be written in the function documentation above it
> 

No, I'm going to push back on this. The value those consts hold is the slot number. It's not the time itself, or the location itself, or the category itself. It is an immutable index.

frame[LOCATION_SLOT] is clear and self documenting, in that we are *getting* the location in frame, which lives in LOCATION_SLOT. I would not name the index to be something ambiguous, only to be cleared up by documentation above it.
Addressed comments.

Added more explanatory comments and fixed an uninverted call tree bug.

On my machine, all of Octane now takes about ~450ms to build an uninverted call
tree, ~800ms to build an inverted call tree, and ~1000ms to build the flame
graph.
Attachment #8600475 - Attachment is obsolete: true
Attachment #8600475 - Flags: review?(vporof)
Attachment #8600559 - Flags: review?(vporof)
Attachment #8600559 - Flags: review?(jsantell)
(In reply to Markus Stange [:mstange] from comment #25)
> Comment on attachment 8598324 [details] [diff] [review]
> Rewrite profiler JSON streaming.
> 
> Review of attachment 8598324 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This is really good! Sorry it took me so long to get to this.
> 
> I think it would have helped if you could have split the patch into more
> pieces, e.g. (1) convert to the other stream writer, (2) add Maybe<>s
> everywhere, (3) do the actual changes. But it's fine now.

Sorry about that, bad habit of mine. Rewrites are always a bit hard -- should be more incremental from now.

> 
> ::: tools/profiler/ProfileEntry.cpp
> @@ +606,5 @@
> > +
> > +  aWriter.StartArrayElement();
> > +  {
> > +    uint32_t index = 0;
> > +    uint32_t lastNonNullIndex = 0;
> 
> Please add a comment here that this is to avoid adding nulls at the end of
> the array, since we only need them if a non-null element follows.

Done.

> 
> @@ -406,5 @@
> >    while (readPos != mWritePos) {
> >      ProfileEntry entry = mEntries[readPos];
> >      if (entry.mTagName == 'T') {
> >        currentThreadID = entry.mTagInt;
> > -      hasCurrentTime = false;
> 
> Don't you need to replace this with a currentTime.reset()? Or did you
> determine that this line was unnecessary in the old code?

Good catch!

> 
> ::: tools/profiler/ProfileEntry.h
> @@ +100,5 @@
> > +  void WriteElement(mozilla::JSONWriter& aWriter, const char* aStr) {
> > +    aWriter.IntElement(GetIndex(aStr));
> > +  }
> > +
> > +  uint32_t GetIndex(const char* aStr);
> 
> I think it would be useful to add a comment that this inserts the string
> into the string table if it doesn't exist yet.

I've renamed all the GetIndex variants to GetOrAddIndex.

> 
> ::: tools/profiler/ProfilerBacktrace.h
> @@ +17,5 @@
> >    explicit ProfilerBacktrace(SyncProfile* aProfile);
> >    ~ProfilerBacktrace();
> >  
> > +  // ProfilerBacktraces' stacks are deduplicated in the context of the
> > +  // containing profile in which the backtrace is as a marker payload.
> 
> containing or contained?
> 

Containing. The idea is sync profiles' backtraces don't need their own stack/frame/string tables, and should use the outer profile's. I'll reword as "in the context of the profile that contains the backtrace as a marker payload".
(In reply to Shu-yu Guo [:shu] from comment #38)
> (In reply to Jordan Santell [:jsantell] [@jsantell] from comment #34)
> > Comment on attachment 8600475 [details] [diff] [review]
> > 
> > @@ +173,2 @@
> > >  
> > > +  addNewSite:
> > 
> > no no no no no labels. is this a performance thing? if it is, this should be
> > heavily documented with a link to labels ->
> > https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/
> > label
> > 
> 
> I disagree. You're saying this version:
> 
> (1)
> 
>   for (let rawSite of rawSites) {
>     let found = false;
>     for (let i = 0; i < sites.length; i++) {
>       if (sites[i].data === rawSite) {
>         sites[i].samples++;
>         found = true;
>         break;
>       }
>       if (found) {
>         continue;
>       }
>     }
>     sites.push(new OptimizationSite(sites.length, rawSite));
>   }
> 
> is cleaner than this version?
> 
> (2)
> 
>   addNewSite:
>   for (let rawSite of rawSites) {
>     for (let i = 0; i < sites.length; i++) {
>       if (sites[i].data === rawSite) {
>         sites[i].samples++;
>         continue addNewSite;
>       }
>     }
>     sites.push(new OptimizationSite(sites.length, rawSite));
>   }

Yes -- no one uses labels, this is only the second time I've ever seen them in years, making this harder to maintain, harder to contribute, and different than the rest of the devtools codebase, even MDN has a giant warning on that page, and unless there is some extreme performance benefits, we should not use them. In the case we absolutely need to use them, we still should have documentation linking to labels, but I do not want labels anywhere in this code
(In reply to Shu-yu Guo [:shu] from comment #38)
> is cleaner than this version?
> 
> (2)
> 
>   addNewSite:
>   for (let rawSite of rawSites) {
>     for (let i = 0; i < sites.length; i++) {
>       if (sites[i].data === rawSite) {
>         sites[i].samples++;
>         continue addNewSite;
>       }
>     }
>     sites.push(new OptimizationSite(sites.length, rawSite));
>   }

I'd like to suggest this (untested):

  for (let rawSite of rawSites) {
    if (!sites.some(site => {
      if (site.data === rawSite)
        site.samples++;
        return true;
      }
      return false;
    }) {
      sites.push(new OptimizationSite(sites.length, rawSite));
    }
  }
Comment on attachment 8600559 [details] [diff] [review]
Make the performance devtool handle the new profiler JSON format.

Review of attachment 8600559 [details] [diff] [review]:
-----------------------------------------------------------------

Way more digestible for me with the changes from before. Just a few comments below mostly for style. Victor did the original call tree stuff, so haven't looked over that for correctness, he'd be better for that

::: browser/devtools/shared/profiler/frame-utils.js
@@ +242,5 @@
> + * @param object frameTable
> + * @param object stringTable
> + */
> +function InflatedFrame(index, frameTable, stringTable) {
> +  const LOCATION_SLOT = frameTable.schema.location;

This is way more readable with the _SLOT constants being in the constructor

::: browser/devtools/shared/profiler/jit.js
@@ +106,2 @@
>   * @param {Array<RawOptimizationSite>} optimizations
>   * @param {number} optsIndex

Update docs here to match the new args/types, very strict about these docs because JIT stuff is all new to most people working on the tools here, myself included for when I haven't looked at it in awhile

@@ +156,5 @@
>   * Constructor for JITOptimizations. A collection of OptimizationSites for a frame.
>   *
>   * @constructor
> + * @param {Array<object>} rawSites
> + *                        Array of raw optimization sites.

Does this match the RawOptimizationSite type listed in the docs above? If so, use that type here. Also if any of the data in the docs in this file are no longer correct, it'll need to be updated -- again this is for me to understand how all the unfamiliar JIT stuff works so we can display it

@@ +164,4 @@
>   */
>  
> +const JITOptimizations = exports.JITOptimizations = function (rawSites, stringTable) {
> +  function maybeString(stringTable, index) {

Pure function: pull out to very bottom of the file

@@ +173,2 @@
>  
> +  nextRawSite:

My comments on labels in comment #38

::: browser/devtools/shared/profiler/tree-model.js
@@ +470,5 @@
>    }
>  };
> +
> +function copyFrameNode(node) {
> +  var newNode = new FrameNode(node.key, node, node.isMetaCategory);

let
Attachment #8600559 - Flags: review?(jsantell) → review+
Status: NEW → ASSIGNED
(In reply to Markus Stange [:mstange] from comment #43)
> I'd like to suggest this (untested):
> 
>   for (let rawSite of rawSites) {
>     if (!sites.some(site => {
>       if (site.data === rawSite)
>         site.samples++;
>         return true;
>       }
>       return false;
>     }) {
>       sites.push(new OptimizationSite(sites.length, rawSite));
>     }
>   }

Definitely cleaner, but I think a lot of these changes are going for performance (so using for loops over array methods), so while I'll make a lot of concessions on cleanliness and lack of functional magic for performance, I still will not be ok using a JavaScript label :)
Yeah, I'd be very curious about the performance result of that approach. I'd like to assume that it's fast because Array.some is self-hosted and because our JIT compilers can inline code across the functions, but maybe I'm expecting too much.
(In reply to Jordan Santell [:jsantell] [@jsantell] from comment #44)
> Comment on attachment 8600559 [details] [diff] [review]
> Make the performance devtool handle the new profiler JSON format.
> 
> Review of attachment 8600559 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Way more digestible for me with the changes from before. Just a few comments
> below mostly for style. Victor did the original call tree stuff, so haven't
> looked over that for correctness, he'd be better for that
> 
> ::: browser/devtools/shared/profiler/jit.js
> @@ +106,2 @@
> >   * @param {Array<RawOptimizationSite>} optimizations
> >   * @param {number} optsIndex
> 
> Update docs here to match the new args/types, very strict about these docs
> because JIT stuff is all new to most people working on the tools here,
> myself included for when I haven't looked at it in awhile

Okay, will do.

> 
> @@ +156,5 @@
> >   * Constructor for JITOptimizations. A collection of OptimizationSites for a frame.
> >   *
> >   * @constructor
> > + * @param {Array<object>} rawSites
> > + *                        Array of raw optimization sites.
> 
> Does this match the RawOptimizationSite type listed in the docs above? If
> so, use that type here. Also if any of the data in the docs in this file are
> no longer correct, it'll need to be updated -- again this is for me to
> understand how all the unfamiliar JIT stuff works so we can display it
> 

Ditto.

> @@ +164,4 @@
> >   */
> >  
> > +const JITOptimizations = exports.JITOptimizations = function (rawSites, stringTable) {
> > +  function maybeString(stringTable, index) {
> 
> Pure function: pull out to very bottom of the file
> 

Done.

> @@ +173,2 @@
> >  
> > +  nextRawSite:
> 
> My comments on labels in comment #38
> 

Fiiine, I'll change it.
(In reply to Markus Stange [:mstange] from comment #46)
> Yeah, I'd be very curious about the performance result of that approach. I'd
> like to assume that it's fast because Array.some is self-hosted and because
> our JIT compilers can inline code across the functions, but maybe I'm
> expecting too much.

For most cases, the performance would be the same, with Array.some having a small overhead. 

The problem with higher-order functions like "some" in general is that, because there's 1 copy, it can't get precise type information if your program passes in many different callback functions into it. Once the type info is diluted, it's harder for Ion to inline all the way down.

Put another way, higher order functions tend to be more type unstable in JS due to not entirely obvious reasons (like when they're called).

That said, in this particular case, I'm pretty sure it doesn't matter.

(In reply to Jordan Santell [:jsantell] [@jsantell] from comment #45)
> (In reply to Markus Stange [:mstange] from comment #43)
> > I'd like to suggest this (untested):
> > 
> >   for (let rawSite of rawSites) {
> >     if (!sites.some(site => {
> >       if (site.data === rawSite)
> >         site.samples++;
> >         return true;
> >       }
> >       return false;
> >     }) {
> >       sites.push(new OptimizationSite(sites.length, rawSite));
> >     }
> >   }
> 
> Definitely cleaner, but I think a lot of these changes are going for
> performance (so using for loops over array methods), so while I'll make a
> lot of concessions on cleanliness and lack of functional magic for
> performance, I still will not be ok using a JavaScript label :)

I'll admit premature optimization here. These arrays are small enough that the difference is negligible. I'll change it to using .some.
(In reply to Shu-yu Guo [:shu] from comment #48)
> The problem with higher-order functions like "some" in general is that,
> because there's 1 copy, it can't get precise type information if your
> program passes in many different callback functions into it. Once the type
> info is diluted, it's harder for Ion to inline all the way down.
> 
> Put another way, higher order functions tend to be more type unstable in JS
> due to not entirely obvious reasons (like when they're called).

This is really interesting, thanks for the explanation.
(In reply to Markus Stange [:mstange] from comment #49)
> (In reply to Shu-yu Guo [:shu] from comment #48)
> > The problem with higher-order functions like "some" in general is that,
> > because there's 1 copy, it can't get precise type information if your
> > program passes in many different callback functions into it. Once the type
> > info is diluted, it's harder for Ion to inline all the way down.
> > 
> > Put another way, higher order functions tend to be more type unstable in JS
> > due to not entirely obvious reasons (like when they're called).
> 
> This is really interesting, thanks for the explanation.

If you're interested in the general problem, I wrote about it here: http://rfrn.org/~shu/2013/03/20/two-reasons-functional-style-is-slow-in-spidermonkey.html
All right Jordan, I've changed that loop to:

  for (let rawSite of rawSites) {
    let existingSite = sites.find((existingSite) => existingSite.data === rawSite);
    if (existingSite) {
      existingSite.samples++;
    } else {
      sites.push(new OptimizationSite(sites.length, rawSite));
    }
  }
:D Like I said, I prefer functional style things, but all this is for optimization reasons, so that's fine. My only problem with that before was labels. Thanks Shu, this is turning out pretty good, and more of a reason we need to set up some kind of timing tests (Talos, or very least, a local thing) to ensure that any changes here do not affect performance significantly
I'll have a pass through this asap. Thanks for doing the initial review, Jordan.
Made the inflated frames cache live off a weak map instead of on the frame
table directly so as to not mess with saving/loading of profiles.
Attachment #8600559 - Attachment is obsolete: true
Attachment #8600559 - Flags: review?(vporof)
Attachment #8601224 - Flags: review?(vporof)
Comment on attachment 8601225 [details] [diff] [review]
Add adapter that deduplicates old, undeduplicated profiles in the frontend.

Review of attachment 8601225 [details] [diff] [review]:
-----------------------------------------------------------------

Just some non-JSy things could be written differently (more idiomatic), and overall, could use more comments. I don't feel too strongly about anything here, though, as I think legacy gecko scenarios are not going to be the most common, but definitely more comments, so we can iterate on this for speed if we need.

R+ with my comments addressed, and if victor gives it a thumbs up

::: browser/devtools/performance/modules/actors.js
@@ +110,5 @@
>    getProfile: Task.async(function *(options) {
>      let profilerData = yield (actorCompatibilityBridge("getProfile").call(this, options));
> +    // If the backend is not deduped, dedupe it ourselves, as rest of the code
> +    // expects a deduped profile.
> +    if (profilerData.profile.meta.version === 2) {

Is version 1 still around anywhere? What is version 1?

@@ +111,5 @@
>      let profilerData = yield (actorCompatibilityBridge("getProfile").call(this, options));
> +    // If the backend is not deduped, dedupe it ourselves, as rest of the code
> +    // expects a deduped profile.
> +    if (profilerData.profile.meta.version === 2) {
> +      RecordingUtils.deflateProfile(profilerData.profile);

We should probably add this to ./io.js as well for when we're importing old profiles with this version -- can handle in a follow up bug though if you CC me

::: browser/devtools/performance/modules/recording-utils.js
@@ +193,5 @@
> + * @param object profile
> + *               A profile with version 2.
> + */
> +exports.RecordingUtils.deflateProfile = function deflateProfile(profile) {
> +  function deflateStack(frames, uniqueStacks) {

Pure function -- pull out of the closure

@@ +238,5 @@
> +  function deflateMarkers(markers, uniqueStacks) {
> +    // Schema:
> +    //   [name, time, data]
> +    let deflatedMarkers = new Array(markers.length);
> +    for (let i = 0; i < markers.length; i++) {

for (let marker of markers) {

}

@@ +244,5 @@
> +      if (marker.data && marker.data.type === "tracing" && marker.data.stack) {
> +        marker.data.stack = deflateThread(marker.data.stack, uniqueStacks);
> +      }
> +
> +      deflatedMarkers[i] = [

You can just instantiate an array with the primitive `[]` and just call `push` here instead of creating the array upfront with empty elements, which is more idiomatic

@@ +275,5 @@
> +    };
> +  }
> +
> +  profile.threads = profile.threads.map(function (thread) {
> +    var uniqueStacks = new UniqueStacks();

let

@@ +282,5 @@
> +
> +  profile.meta.version = 3;
> +};
> +
> +function UniqueStacks() {

What's the reason for using a prototype for this? Would it make more sense just having objects and getStackTableWithSchema, and other methods just take the object?

@@ +286,5 @@
> +function UniqueStacks() {
> +  this._frameTable = [];
> +  this._stackTable = [];
> +  this.stringTable = [];
> +  this._frameHash = Object.create(null);

Use primitives `{}`

@@ +320,5 @@
> +UniqueStacks.prototype.getOrAddFrameIndex = function(frame) {
> +  // Schema:
> +  //   [location, implementation, optimizations, line, category]
> +  //
> +  // Don't bother with optimizations for deflating old profile data format to

Yeah, we can forget all about JIT data for older servers for the one release that has had them

@@ +333,5 @@
> +  // Super dumb.
> +  let hash = locationIndex + " ";
> +  hash += (implementationIndex || "") + " ";
> +  hash += (frame.line || "") + " ";
> +  hash += (frame.category || "");

`${locationIndex} ${implementationIndex||""} ${frame.line||""} ${frame.category||""}`
Attachment #8601225 - Flags: review?(jsantell) → review+
By the way, got an easy to run simple way to write A/B tests for performance with talos to compare different implementations, and also a way to wrap a function to offload onto a worker and compare that as well -- any other optimization stuff, let's talk about it
(In reply to Jordan Santell [:jsantell] [@jsantell] from comment #56)
> Comment on attachment 8601225 [details] [diff] [review]
> Add adapter that deduplicates old, undeduplicated profiles in the frontend.
> 
> Review of attachment 8601225 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Just some non-JSy things could be written differently (more idiomatic), and
> overall, could use more comments. I don't feel too strongly about anything
> here, though, as I think legacy gecko scenarios are not going to be the most
> common, but definitely more comments, so we can iterate on this for speed if
> we need.
> 
> R+ with my comments addressed, and if victor gives it a thumbs up
> 
> ::: browser/devtools/performance/modules/actors.js
> @@ +110,5 @@
> >    getProfile: Task.async(function *(options) {
> >      let profilerData = yield (actorCompatibilityBridge("getProfile").call(this, options));
> > +    // If the backend is not deduped, dedupe it ourselves, as rest of the code
> > +    // expects a deduped profile.
> > +    if (profilerData.profile.meta.version === 2) {
> 
> Is version 1 still around anywhere? What is version 1?
> 

Good question, I have no idea. It's not around AFAIK.

> @@ +111,5 @@
> >      let profilerData = yield (actorCompatibilityBridge("getProfile").call(this, options));
> > +    // If the backend is not deduped, dedupe it ourselves, as rest of the code
> > +    // expects a deduped profile.
> > +    if (profilerData.profile.meta.version === 2) {
> > +      RecordingUtils.deflateProfile(profilerData.profile);
> 
> We should probably add this to ./io.js as well for when we're importing old
> profiles with this version -- can handle in a follow up bug though if you CC
> me
> 

Done.

> 
> @@ +244,5 @@
> > +      if (marker.data && marker.data.type === "tracing" && marker.data.stack) {
> > +        marker.data.stack = deflateThread(marker.data.stack, uniqueStacks);
> > +      }
> > +
> > +      deflatedMarkers[i] = [
> 
> You can just instantiate an array with the primitive `[]` and just call
> `push` here instead of creating the array upfront with empty elements, which
> is more idiomatic
> 

We know exactly how big the final array is. Doing push is unnecessary and makes it less clear when eyeballing that the result array is in fact the same length.

> @@ +282,5 @@
> > +
> > +  profile.meta.version = 3;
> > +};
> > +
> > +function UniqueStacks() {
> 
> What's the reason for using a prototype for this? Would it make more sense
> just having objects and getStackTableWithSchema, and other methods just take
> the object?

Because it's a bunch of methods encapsulating shared mutable state, the tables.

> 
> @@ +286,5 @@
> > +function UniqueStacks() {
> > +  this._frameTable = [];
> > +  this._stackTable = [];
> > +  this.stringTable = [];
> > +  this._frameHash = Object.create(null);
> 
> Use primitives `{}`
> 

As discussed on IRC, Object.create(null) is preferred.
Comment on attachment 8601224 [details] [diff] [review]
Make the performance devtool handle the new profiler JSON format.

Review of attachment 8601224 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/devtools/performance/modules/recording-utils.js
@@ +40,5 @@
>  exports.RecordingUtils.offsetSampleTimes = function(profile, timeOffset) {
>    let firstThread = profile.threads[0];
> +  const TIME_SLOT = firstThread.samples.schema.time;
> +  let samplesData = firstThread.samples.data;
> +  for (let i = 0; i < samplesData.length; i++) {

Are `for .. of` loops slower than `for let i ..`? Why use an iterator here.

And if they're actually slower, why is life cruel?

::: browser/devtools/performance/views/details-js-call-tree.js
@@ +10,5 @@
>  
>    rerenderPrefs: [
>      "invert-call-tree",
> +    "show-platform-data",
> +    "flatten-tree-recursion"

Shouldn't the "flatten recursion" thing be fixed in a different bug? I don't really care that much though, this seems straight forward enough.

::: browser/devtools/shared/profiler/frame-utils.js
@@ +180,5 @@
> + * @return object
> + */
> +exports.getInflatedFrameCache = function getInflatedFrameCache(frameTable) {
> +  let inflatedCache = gInflatedFrameStore.get(frameTable);
> +  if (inflatedCache) {

!== undefined?

@@ +189,5 @@
> +  inflatedCache = new Array(length);
> +  // Pre-fill with nulls to ensure no holes.
> +  for (let i = 0; i < length; i++) {
> +    inflatedCache[i] = null;
> +  }

How about using Array.from() to ensure no holes?

inflatedCache = Array.from(Array(length));

@@ +204,5 @@
> + * @param object stringTable
> + */
> +exports.getOrAddInflatedFrame = function getOrAddInflatedFrame(cache, index, frameTable, stringTable) {
> +  let inflatedFrame = cache[index];
> +  if (inflatedFrame === null) {

...in which case you'll want to check for undefined here, I think.

@@ +245,5 @@
> +  this.location = stringTable[frame[LOCATION_SLOT]];
> +  this.optimizations = frame[OPTIMIZATIONS_SLOT];
> +  this.line = frame[LINE_SLOT];
> +  this.category = frame[CATEGORY_SLOT];
> +  this.isContent = isContent(this);

Can we compute this lazily? Or is it useful?

@@ +254,5 @@
> + * frames are always identified by location. Chrome frames are identified by
> + * location if content-only filtering is off. If content-filtering is on, they
> + * are identified by their category.
> + *
> + * @param object options

Add the return type to the doc.

::: browser/devtools/shared/profiler/jit.js
@@ +195,4 @@
>  
> +      types: data.types.map((t) => {
> +        return {
> +          typeset: t.typeset ? t.typeset.map((ty) => {

Have a maybeTypeset function here?

@@ +205,5 @@
> +          }) : undefined,
> +          site: stringTable[t.site],
> +          mirType: stringTable[t.mirType]
> +        };
> +      }),

There is a lot of indenting here, and it's somewhat hard to follow. Create separate smaller objects before then return a literal.

::: browser/devtools/shared/profiler/tree-view.js
@@ +142,5 @@
> +        let childrenAllocations = 0;
> +        let calls = this.frame.calls;
> +        for (let i = 0; i < calls.length; i++) {
> +          childrenAllocations += calls[i].allocations;
> +        }

Nit: more elegant to use `reduce` here if not slower (probably not).

::: browser/devtools/shared/widgets/FlameGraph.js
@@ +977,5 @@
>   */
>  let FlameGraphUtils = {
>    _cache: new WeakMap(),
>  
> +  createFlameGraphDataFromThread: function(thread, options = {}, out = []) {

Document this method.

@@ +993,1 @@
>      }

Nit: Array.from(Array(PALLETTE_SIZE)).map(e => []) is a more elegant one-liner.
Attachment #8601224 - Flags: review?(vporof) → review+
Attachment #8601225 - Flags: review?(vporof) → review+
Shu, if you can, make sure to upload the new versions for all those patches. I'd like to take another look at everything before landing.

Also, I assume we need to fix many tests nao.
(In reply to Victor Porof [:vporof][:vp] from comment #59)
> Comment on attachment 8601224 [details] [diff] [review]
> Make the performance devtool handle the new profiler JSON format.
> 
> Review of attachment 8601224 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: browser/devtools/performance/modules/recording-utils.js
> @@ +40,5 @@
> >  exports.RecordingUtils.offsetSampleTimes = function(profile, timeOffset) {
> >    let firstThread = profile.threads[0];
> > +  const TIME_SLOT = firstThread.samples.schema.time;
> > +  let samplesData = firstThread.samples.data;
> > +  for (let i = 0; i < samplesData.length; i++) {
> 
> Are `for .. of` loops slower than `for let i ..`? Why use an iterator here.
> 
> And if they're actually slower, why is life cruel?

They are not yet as optimized. Naively, for-of uses the iteration protocol, which, in addition to allocating an iterator, also allocates a result object each iteration. The right thing to do here is to optimize them in Ion, but they will always have some kind of fixed overhead over C-style loops.

Until Ion fully optimizes them, I'll continue using C-style for loops.

> 
> ::: browser/devtools/shared/profiler/frame-utils.js
> @@ +180,5 @@
> > + * @return object
> > + */
> > +exports.getInflatedFrameCache = function getInflatedFrameCache(frameTable) {
> > +  let inflatedCache = gInflatedFrameStore.get(frameTable);
> > +  if (inflatedCache) {
> 
> !== undefined?
> 

Good call.

> @@ +189,5 @@
> > +  inflatedCache = new Array(length);
> > +  // Pre-fill with nulls to ensure no holes.
> > +  for (let i = 0; i < length; i++) {
> > +    inflatedCache[i] = null;
> > +  }
> 
> How about using Array.from() to ensure no holes?
> 
> inflatedCache = Array.from(Array(length));
> 

That has the problem of being vulnerable to indexed properties on the prototype.

For example,

js> Array.prototype[0] = "foo"
"foo"
js> Array.from(Array(2))
["foo", (void 0)]

> @@ +245,5 @@
> > +  this.location = stringTable[frame[LOCATION_SLOT]];
> > +  this.optimizations = frame[OPTIMIZATIONS_SLOT];
> > +  this.line = frame[LINE_SLOT];
> > +  this.category = frame[CATEGORY_SLOT];
> > +  this.isContent = isContent(this);
> 
> Can we compute this lazily? Or is it useful?
> 

We can't compute isContent lazily, it's needed to compute the frame key for the call tree. When filtering platform frames we give leaf non-isContent frames a meta category like Graphics or Gecko or whatever.
Comment on attachment 8602968 [details] [diff] [review]
Make the memory stuff in the performance devtool synthesize the new profiler JSON format.

Review of attachment 8602968 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/devtools/performance/modules/recording-utils.js
@@ +136,5 @@
> +
> +    // Schema:
> +    //   [location]
> +    //
> +    // The only field a frame in an allocations-cum-profile will have is

Let's just call this an allocations profile

@@ +144,5 @@
> +    //   "functionDisplayName (source:line:column)"
> +    // Otherwise, it is
> +    //   "source:line:column"
> +    //
> +    // A static array is used to join to save memory on intermediate strings.

This is faster than `${frame.source}:${frame.line}:${frame.column}`? The Numbers will be cast to String regardless

@@ +158,3 @@
>  
> +    if (funcName) {
> +      locationConcatArray[0] = funcName;

Same thing, is this really that faster?
Attachment #8602968 - Flags: review?(jsantell) → review+
(In reply to Jordan Santell [:jsantell] [@jsantell] from comment #63)
> Comment on attachment 8602968 [details] [diff] [review]
> Make the memory stuff in the performance devtool synthesize the new profiler
> JSON format.
> 
> Review of attachment 8602968 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: browser/devtools/performance/modules/recording-utils.js
> @@ +136,5 @@
> > +
> > +    // Schema:
> > +    //   [location]
> > +    //
> > +    // The only field a frame in an allocations-cum-profile will have is
> 
> Let's just call this an allocations profile
> 
> @@ +144,5 @@
> > +    //   "functionDisplayName (source:line:column)"
> > +    // Otherwise, it is
> > +    //   "source:line:column"
> > +    //
> > +    // A static array is used to join to save memory on intermediate strings.
> 
> This is faster than `${frame.source}:${frame.line}:${frame.column}`? The
> Numbers will be cast to String regardless
> 
> @@ +158,3 @@
> >  
> > +    if (funcName) {
> > +      locationConcatArray[0] = funcName;
> 
> Same thing, is this really that faster?

It's mainly a space optimization, not a time one. The engine isn't smart enough to optimize away intermediate strings. So if you have |a + b + c|, where a, b, and c are strings, the engine will allocate some intermediate string a + b. Using Array.join gets around that.
Victor wanted to do another one-over.
Attachment #8603042 - Flags: review?(vporof)
(In reply to Shu-yu Guo [:shu] from comment #61)
> (In reply to Victor Porof [:vporof][:vp] from comment #59)
> > Comment on attachment 8601224 [details] [diff] [review]
> > Make the performance devtool handle the new profiler JSON format.
> > 
> > Review of attachment 8601224 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > ::: browser/devtools/performance/modules/recording-utils.js
> > @@ +40,5 @@
> > >  exports.RecordingUtils.offsetSampleTimes = function(profile, timeOffset) {
> > >    let firstThread = profile.threads[0];
> > > +  const TIME_SLOT = firstThread.samples.schema.time;
> > > +  let samplesData = firstThread.samples.data;
> > > +  for (let i = 0; i < samplesData.length; i++) {
> > 
> > Are `for .. of` loops slower than `for let i ..`? Why use an iterator here.
> > 
> > And if they're actually slower, why is life cruel?
> 
> They are not yet as optimized. Naively, for-of uses the iteration protocol,
> which, in addition to allocating an iterator, also allocates a result object
> each iteration. The right thing to do here is to optimize them in Ion, but
> they will always have some kind of fixed overhead over C-style loops.
> 
> Until Ion fully optimizes them, I'll continue using C-style for loops.
> 

This smells like premature optimization to me. And once Ion gets the improvements, it's very unlikely anyone will go through mozilla js code rewriting c-style for loops to for ... of where possible.

If there are no actual measurable performance benefits for not doing so, let's use for.. of. I believe Jordan measured that loop in a different bug, and it's less than 10 ms. I feel like c-style loops here are not going to save us anything.

But I'd be happy to be proven wrong.

> 
> > @@ +189,5 @@
> > > +  inflatedCache = new Array(length);
> > > +  // Pre-fill with nulls to ensure no holes.
> > > +  for (let i = 0; i < length; i++) {
> > > +    inflatedCache[i] = null;
> > > +  }
> > 
> > How about using Array.from() to ensure no holes?
> > 
> > inflatedCache = Array.from(Array(length));
> > 
> 
> That has the problem of being vulnerable to indexed properties on the
> prototype.
> 
> For example,
> 
> js> Array.prototype[0] = "foo"
> "foo"
> js> Array.from(Array(2))
> ["foo", (void 0)]
> 

Is this really ever going to happen? We never write that kind of code; we never modify prototypes, and this is just a module, so Array.prototype is going to be different.
Any performance critical code, as is the case here, should be written in C style JS code. The perf difference is at least 1 order of magnitude, 20X in this case.

tl;dr;

Try this in your JS shell:

100000 iterations 1024 elements;

for (var x of a ... 969.74ms
a.forEach(x)    ... 469.46ms
for (var i = 0; ... 41.85ms

1000000 iterations 128 elements;

for (var x of a ... 1337.56ms
a.forEach(x)    ... 610.11ms
for (var i = 0; ... 66.32ms

10000000 iterations 16 elements;

for (var x of a ... 2280.96ms
a.forEach(x)    ... 796.10ms
for (var i = 0; ... 104.57ms


var a = new Array(1024);
for (var i = 0; i < a.length; i++) {
  a[i] = 1;
}

// for of
function foo() {
  var s = 0;
  for (var x of a) {
    s += x;
  }
  return s;
}

var s = dateNow();
for (var i = 0; i < 100000; i++) {
  foo();
}
print("for (var x of a ... " + (dateNow() - s).toFixed(2) + "ms");

// for each
function car() {
  var s = 0;
  a.forEach(function (x) {
    s += x;
  });
  return s;
}

var s = dateNow();
for (var i = 0; i < 100000; i++) {
  car();
}
print("a.forEach(x)    ... " + (dateNow() - s).toFixed(2) + "ms");


// for i
function bar() {
  var s = 0;
  for (var i = 0; i < a.length; i++) {
    s += a[i];
  }
  return s;
}

var s = dateNow();
for (var i = 0; i < 100000; i++) {
  bar();
}
print("for (var i = 0; ... " + (dateNow() - s).toFixed(2) + "ms");
I'm fine with writing hot code in that style, as long as it's isolated, commented for the reasoning, and has benchmarks. I just don't know what the sample size in this case is, as this is a new feature that wasn't around before, but that's fine too
Comment on attachment 8603686 [details] [diff] [review]
Fix devtools tests to use the new profiler JSON format.

Review of attachment 8603686 [details] [diff] [review]:
-----------------------------------------------------------------

You're very brave, Shu.

::: browser/devtools/performance/test/browser_perf-jit-model-01.js
@@ +6,5 @@
>   * an OptimizationSiteProfile when adding optimization sites, like from the
>   * FrameNode, and the returning of that data is as expected.
>   */
>  
> +const { RecordingUtils } = devtools.require("devtools/performance/recording-utils");

Can just include this in the head.js so you don't have to add it to every test, but probably more work at this point to undo it
Attachment #8603686 - Flags: review?(jsantell) → review+
Comment on attachment 8603042 [details] [diff] [review]
Make the performance devtool handle the new profiler JSON format. Comments applied.

Review of attachment 8603042 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/devtools/shared/profiler/tree-model.js
@@ +85,5 @@
> +      // Insert the inflated frame into the call tree at the current level.
> +      let frameNode;
> +
> +      // Leaf nodes have fan out much greater than non-leaf nodes, thus the
> +      // use of a hash table. Otherwise, do linear search.

I'd like to see a comment here describing how this is incredibly hot code, thus the avoidance of simple methods like arr.find(), which does break early.

@@ +471,5 @@
>  };
> +
> +function copyFrameNode(node) {
> +  var newNode = new FrameNode(node.key, node, node.isMetaCategory);
> +  newNode._merge(node);

Calling private methods feels weird here, even though this function is implicitly a friend.

Having a .clone or .copy method on the FrameNode sounds better to me.
Attachment #8603042 - Flags: review?(vporof) → review+
Blocks: 1165361
Depends on: 1167230
Depends on: 1167895
Depends on: 1165859
Flags: qe-verify-
See Also: → 1168784
Attachment #8601224 - Attachment is obsolete: true
Attachment #8600470 - Attachment is obsolete: true
Attachment #8598324 - Flags: feedback?(kvijayan)
The patch that landed as 96474bfa4c61 (just uploading as an attachment so aurora approval can be requested)
Attachment #8611283 - Flags: review+
Comment on attachment 8596303 [details] [diff] [review]
Rewrite the JSAPI profiling API to use a FrameHandle, as to avoid multiple lookups in JitcodeGlobalTable.

Approval Request Comment
[Feature/regressing bug #]: 1167252, the new performance tool
[User impact if declined]: Won't ship the performance tool
[Describe test coverage new/current, TreeHerder]: There are try pushes in Bug 1167252 with all the patches needing uplift.
[Risks and why]: Requesting uplift for the accumulated changes in the performance tool since the 40 merge date, so these changes haven't had the full 6 weeks to bake.  Risks are generally contained within the devtools performance panel.
[String/UUID change made/needed]: None
Attachment #8596303 - Flags: approval-mozilla-aurora?
Comment on attachment 8598324 [details] [diff] [review]
Rewrite profiler JSON streaming.

Approval Request Comment
[Feature/regressing bug #]: 1167252, the new performance tool
[User impact if declined]: Won't ship the performance tool
[Describe test coverage new/current, TreeHerder]: There are try pushes in Bug 1167252 with all the patches needing uplift.
[Risks and why]: Requesting uplift for the accumulated changes in the performance tool since the 40 merge date, so these changes haven't had the full 6 weeks to bake.  Risks are generally contained within the devtools performance panel.
[String/UUID change made/needed]: None
Attachment #8598324 - Flags: approval-mozilla-aurora?
Comment on attachment 8601225 [details] [diff] [review]
Add adapter that deduplicates old, undeduplicated profiles in the frontend.

Approval Request Comment
[Feature/regressing bug #]: 1167252, the new performance tool
[User impact if declined]: Won't ship the performance tool
[Describe test coverage new/current, TreeHerder]: There are try pushes in Bug 1167252 with all the patches needing uplift.
[Risks and why]: Requesting uplift for the accumulated changes in the performance tool since the 40 merge date, so these changes haven't had the full 6 weeks to bake.  Risks are generally contained within the devtools performance panel.
[String/UUID change made/needed]: None
Attachment #8601225 - Flags: approval-mozilla-aurora?
Comment on attachment 8602968 [details] [diff] [review]
Make the memory stuff in the performance devtool synthesize the new profiler JSON format.

Approval Request Comment
[Feature/regressing bug #]: 1167252, the new performance tool
[User impact if declined]: Won't ship the performance tool
[Describe test coverage new/current, TreeHerder]: There are try pushes in Bug 1167252 with all the patches needing uplift.
[Risks and why]: Requesting uplift for the accumulated changes in the performance tool since the 40 merge date, so these changes haven't had the full 6 weeks to bake.  Risks are generally contained within the devtools performance panel.
[String/UUID change made/needed]: None
Attachment #8602968 - Flags: approval-mozilla-aurora?
Comment on attachment 8603042 [details] [diff] [review]
Make the performance devtool handle the new profiler JSON format. Comments applied.

Approval Request Comment
[Feature/regressing bug #]: 1167252, the new performance tool
[User impact if declined]: Won't ship the performance tool
[Describe test coverage new/current, TreeHerder]: There are try pushes in Bug 1167252 with all the patches needing uplift.
[Risks and why]: Requesting uplift for the accumulated changes in the performance tool since the 40 merge date, so these changes haven't had the full 6 weeks to bake.  Risks are generally contained within the devtools performance panel.
[String/UUID change made/needed]: None
Attachment #8603042 - Flags: approval-mozilla-aurora?
Comment on attachment 8603686 [details] [diff] [review]
Fix devtools tests to use the new profiler JSON format.

Approval Request Comment
[Feature/regressing bug #]: 1167252, the new performance tool
[User impact if declined]: Won't ship the performance tool
[Describe test coverage new/current, TreeHerder]: There are try pushes in Bug 1167252 with all the patches needing uplift.
[Risks and why]: Requesting uplift for the accumulated changes in the performance tool since the 40 merge date, so these changes haven't had the full 6 weeks to bake.  Risks are generally contained within the devtools performance panel.
[String/UUID change made/needed]: None
Attachment #8603686 - Flags: approval-mozilla-aurora?
Comment on attachment 8611283 [details] [diff] [review]
Fix nsIProfiler xpcshell tests to use the new profiler JSON format (96474bfa4c61)

Approval Request Comment
[Feature/regressing bug #]: 1167252, the new performance tool
[User impact if declined]: Won't ship the performance tool
[Describe test coverage new/current, TreeHerder]: There are try pushes in Bug 1167252 with all the patches needing uplift.
[Risks and why]: Requesting uplift for the accumulated changes in the performance tool since the 40 merge date, so these changes haven't had the full 6 weeks to bake.  Risks are generally contained within the devtools performance panel.
[String/UUID change made/needed]: None
Attachment #8611283 - Flags: approval-mozilla-aurora?
Note: I had verbal confirmation for these uplifts from Sylvestre even before he's flagged them as a+.  See https://bugzilla.mozilla.org/show_bug.cgi?id=1167252#c26
Comment on attachment 8598324 [details] [diff] [review]
Rewrite profiler JSON streaming.

Change approved to skip one train as part of the spring campaign.
Attachment #8598324 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8596303 [details] [diff] [review]
Rewrite the JSAPI profiling API to use a FrameHandle, as to avoid multiple lookups in JitcodeGlobalTable.

Change approved to skip one train as part of the spring campaign.
Attachment #8596303 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8611283 [details] [diff] [review]
Fix nsIProfiler xpcshell tests to use the new profiler JSON format (96474bfa4c61)

Change approved to skip one train as part of the spring campaign.
Attachment #8611283 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8603686 [details] [diff] [review]
Fix devtools tests to use the new profiler JSON format.

Change approved to skip one train as part of the spring campaign.
Attachment #8603686 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8603042 [details] [diff] [review]
Make the performance devtool handle the new profiler JSON format. Comments applied.

Change approved to skip one train as part of the spring campaign.
Attachment #8603042 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8602968 [details] [diff] [review]
Make the memory stuff in the performance devtool synthesize the new profiler JSON format.

Change approved to skip one train as part of the spring campaign.
Attachment #8602968 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8601225 [details] [diff] [review]
Add adapter that deduplicates old, undeduplicated profiles in the frontend.

Change approved to skip one train as part of the spring campaign.
Attachment #8601225 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.