Closed Bug 1172186 Opened 9 years ago Closed 9 years ago

Make the profiler build standalone

Categories

(Core :: Gecko Profiler, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox41 --- fixed

People

(Reporter: BenWa, Assigned: BenWa)

References

Details

Attachments

(6 files, 6 obsolete files)

      No description provided.
Attached patch WIP (obsolete) — Splinter Review
Gecko is still building, a few files now build standalone.
Assignee: nobody → bgirard
Status: NEW → ASSIGNED
Depends on: 858927
Depends on: 1172216
mStringToIndexMap started out as an std::map but got backed out in that form because on some mobile platform the standard library didn't correctly support it. Shu knows more.
Anyone have any ideas what we should use instead?
(In reply to Benoit Girard (:BenWa) from comment #3)
> Anyone have any ideas what we should use instead?

Originally they were map, but you really want a hash table, and unordered_map isn't available everywhere.

We can't use nsTHashtable? Can we not use any ns data structures?

I'd suggest even taking JS::HashTable and moving it to mfbt, it's better than nsTHashtable anyways.
Attached patch patch (obsolete) — Splinter Review
Still need to figure out the std::map stuff, otherwise it should be good to go.

There's a lot of refactoring opportunity but I figured it would be easier to keep this patch with no side effects and do the follow up later. I think the first thing to be refactored out will be the meta data.
Attachment #8616290 - Attachment is obsolete: true
Attachment #8616956 - Flags: review?(mstange)
Attached file js-profile
I use this script to run the profiler on octane like so:

% cd js/src/octane
% js-profile ../../../objdir-opt run.js
Using the script and the instrumentation patched I uploaded, I benchmarked
streaming the profile of the Octane benchmark suite inside js/src/octane with
this new patch vs the old one. I ran the suite 3 times each (the suite takes
quite a while) with Benoit's patch and on tip.

tip:     332 ms, 336 ms, 326 ms
patched: 928 ms, 891 ms, 944 ms

This is much too large of a regression to land. Streaming time is very
important for the performance of the Performance devtool.

I haven't investigated what's causing almost 3x slowdown, but my suspicion
would be std::map being tree-based and std::string copying.
Attached file perf-output
Dove in a bit deeper. I blocked StreamJSON on a key press and attached |perf --callgraph dwarf|.

This is the result of |perf report --no-children|. It's pretty gigantic but clearly identifies the top 2 cost centers: FrameKey::Hash and StackKey::Hash as called by operator< for use in std::map.
Flags: needinfo?(bgirard)
Can you remeasure this?

The old hash code was stringifing. This cache the results so should be much faster.
Flags: needinfo?(bgirard) → needinfo?(shu)
(In reply to Benoit Girard (:BenWa) from comment #11)
> Created attachment 8617391 [details] [diff] [review]
> Combined patch - cache Hash() results
> 
> Can you remeasure this?
> 
> The old hash code was stringifing. This cache the results so should be much
> faster.

Another 3 runs. For more apples-to-apples comparison, I also cached hashes on tip. Interestingly, caching the hash on tip makes no performance difference.

tip with hash caching: 337 ms, 332 ms, 336 ms
BenWa's patch:         584 ms, 551 ms, 541 ms

It's better, but still 1.6x slowdown.

Also, it's not right to cache the hash of UniqueStacks::StackKey up front. A StackKey's hash is already kind of cached, as it mutates everytime a new frame is appended to it -- see AppendFrame and mPrefixHash.
Flags: needinfo?(shu)
Attached file perf-tip.txt
The perf profile of tip.
Attached file perf-benwa.txt
The perf profile of BenWa's patch.
In the profile of BenWa's patch, because of inlining, I'm guessing the time spent in UniqueStacks::GetOrAddFrameIndex is time spent in std::map.

Comparing < of std::strings seem to take some time.

TBH I still think this is std::map being slow.
Attached patch patch v2 (obsolete) — Splinter Review
I had accidentally clobber some new code. This patch restores it. As well it has performance fixes for the std::map look ups.
Attachment #8616956 - Attachment is obsolete: true
Attachment #8616956 - Flags: review?(mstange)
Attachment #8617660 - Flags: review?(mstange)
If you have a chance to re-measure tomorrow that would be nice. Otherwise no worries.
Attachment #8617391 - Attachment is obsolete: true
Flags: needinfo?(shu)
(In reply to Benoit Girard (:BenWa) from comment #17)
> Created attachment 8617661 [details] [diff] [review]
> Combined patch - cache Hash() results
> 
> If you have a chance to re-measure tomorrow that would be nice. Otherwise no
> worries.

Is this rebased against today's m-c? I got some conflicts I had to hand fix for the previous patches.
No, but this here is rebased. Nothing of note changed here (just minor conflict with your parse error detection patch):

https://treeherder.mozilla.org/#/jobs?repo=try&revision=e1da787aed13
Attached patch Combined patch - v3 (obsolete) — Splinter Review
More changes at the bottom of the queue
Attachment #8617661 - Attachment is obsolete: true
Flags: needinfo?(shu)
Attached patch patch v3Splinter Review
Attachment #8617660 - Attachment is obsolete: true
Attachment #8620587 - Attachment is obsolete: true
Attachment #8617660 - Flags: review?(mstange)
Attachment #8621679 - Flags: review?(mstange)
Comment on attachment 8621679 [details] [diff] [review]
patch v3

Review of attachment 8621679 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/profiler/ProfileEntry.cpp
@@ +1154,3 @@
>  }
>  
> +Mutex* ThreadProfile::GetMutex()

Can you make this return a reference?

::: tools/profiler/ProfileEntry.h
@@ +510,5 @@
>    mozilla::UniquePtr<char[]> mSavedStreamedMarkers;
>    mozilla::Maybe<UniqueStacks> mUniqueStacks;
>  
>    PseudoStack*   mPseudoStack;
> +  Mutex*         mMutex;

Please make this a mozilla::UniquePtr.

::: tools/profiler/platform.cpp
@@ +79,5 @@
>  static int sUnwindStackScan;  /* max # of dubious frames allowed */
>  static int sProfileEntries;   /* how many entries do we store? */
>  
>  std::vector<ThreadInfo*>* Sampler::sRegisteredThreads = nullptr;
> +::Mutex* Sampler::sRegisteredThreadsMutex = nullptr;

Hmm. If CreateMutex returns a UniquePtr, this instance is going to be a little tricky. Can you check whether using a static UniquePtr adds any static constructors?

::: tools/profiler/platform.h
@@ +132,5 @@
> +    mMutex->Unlock();
> +  }
> +
> + private:
> +  Mutex* mMutex;

Why not make this a reference?

@@ +154,5 @@
>  
>    // Called on startup to initialize platform specific things
>    static void Startup();
>  
> +  static Mutex* CreateMutex(const char* aDesc);

Please make this return a mozilla::UniquePtr.

@@ +453,1 @@
>    nsCOMPtr<nsIThread> mThread;

I bet you'd love to take over bug 1128091.
Attachment #8621679 - Flags: review?(mstange) → review+
These edits made the code nicer. We could probably do a UniquePtr pass on the profiler code in a follow-up.

(In reply to Markus Stange [:mstange] from comment #24)
> ::: tools/profiler/platform.cpp
> @@ +79,5 @@
> >  static int sUnwindStackScan;  /* max # of dubious frames allowed */
> >  static int sProfileEntries;   /* how many entries do we store? */
> >  
> >  std::vector<ThreadInfo*>* Sampler::sRegisteredThreads = nullptr;
> > +::Mutex* Sampler::sRegisteredThreadsMutex = nullptr;
> 
> Hmm. If CreateMutex returns a UniquePtr, this instance is going to be a
> little tricky. Can you check whether using a static UniquePtr adds any
> static constructors?
> 

I'm assuming it's fine:
http://mxr.mozilla.org/mozilla-central/source/gfx/layers/client/TiledContentClient.cpp#474
Benoit, looks like you accidentally reverted the ARM SIGBUS fix:

https://hg.mozilla.org/integration/mozilla-inbound/diff/d1d45ce7cbf5/tools/profiler/ProfileEntry.h#l1.37
Flags: needinfo?(bgirard)
Opps, I'll do a follow up! Thanks for noticing.

Curious what got you to notice?
Flags: needinfo?(bgirard)
(In reply to Benoit Girard (:BenWa) from comment #29)
> Opps, I'll do a follow up! Thanks for noticing.
> 
> Curious what got you to notice?

I saw you pushed and wanted to see how bad the ifdefs looked like.
and what's your verdict?

It's not great but it's not terrible IMO. I plan to offset my <strikethrough>carbon</strikethrough> code footprint by doing some follow up refactoring and later eventually moving the ifdef code to separate files when possible.
Blocks: 1176263
(In reply to Benoit Girard (:BenWa) from comment #31)
> and what's your verdict?
> 
> It's not great but it's not terrible IMO. I plan to offset my
> <strikethrough>carbon</strikethrough> code footprint by doing some follow up
> refactoring and later eventually moving the ifdef code to separate files
> when possible.

I agree with that. It's kind of gross, but it's not subtly gross -- it's at least pretty explicit that it doesn't seem like we would end up introducing subtle bugs.
https://hg.mozilla.org/mozilla-central/rev/d1d45ce7cbf5
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: