Factor out the common parts of SamplerThread::Run()

RESOLVED FIXED in Firefox 55

Status

()

Core
Gecko Profiler
RESOLVED FIXED
10 months ago
8 months ago

People

(Reporter: njn, Assigned: jseward)

Tracking

Trunk
mozilla55
Points:
---

Firefox Tracking Flags

(firefox55 fixed)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

10 months ago
Once bug 1342306's patches land, all three platform-*.cpp files in the profiler will have a similar structure for SamplerThread::Run(), with considerable duplication. We should factor out the common parts into a single implementation in platform.cpp.

Bug 1344118 comment 2 (and the subsequent comment) has a couple of suggestions for tweaks that can be made in the code that needs to be factored out.
(Reporter)

Updated

10 months ago
Assignee: nobody → jseward
(Assignee)

Comment 1

9 months ago
Created attachment 8847256 [details] [diff] [review]
bug1344169-3.cset

WIP patch (not for review).  Believed to build and run correctly on
all 3 targets (Windows, MacOS, Linux/Android), but needs more testing.
(Reporter)

Comment 2

9 months ago
Comment on attachment 8847256 [details] [diff] [review]
bug1344169-3.cset

Review of attachment 8847256 [details] [diff] [review]:
-----------------------------------------------------------------

I did a quick scan and have a couple of comments.

- If you make my_pid and my_tid Linux-only members of SamplerThread, you won't have to pass them in to SuspendSampleAndResume(). (Also, it looks like the new code doesn't actually use my_tid.)

- In part 2 of bug 1346356 I merged SamplerThread::Join() into ~SamplerThread() because the were always called in tandem. Apologies for any conflicts this causes.
(Assignee)

Comment 3

9 months ago
Created attachment 8849949 [details] [diff] [review]
Factor out the common parts of SamplerThread::Run()

All three platform-*.cpp files have similar structure, most especially for
SamplerThread::Run(), with considerable duplication.  This patch factors out
the common parts into a single implementation in platform.cpp.

* The top level structure of class SamplerThread has been moved to
  platform.cpp.

* The class has some target-dependent fields, relating to signal handling and
  thread identity.

* There's a single implementation of Run() in platform.cpp.

* AllocPlatformData() and PlatformDataDestructor::operator() have also been
  commoned up and moved into platform.cpp.

* Time units in SamplerThread have been tidied up.  We now use microseconds
  throughout, except in the constructor.  All time interval field and variable
  names incorporate the unit (microseconds/milliseconds) for clarity.  The
  Windows uses of such values are scaled up/down by 1000 accordingly.

* The pre-existing MacOS Run() implementation contained logic that attempted
  to keep "to schedule" in the presence of inaccuracy in the actual sleep
  intervals.  This now applies to all targets.  A couple of comments on this
  code have been added.

* platform-{win32,macos,linux-android}.cpp have had their Run() methods
  removed, and all other methods placed in the same sequences, to the extent
  that is possible.

* In the Win32 and MacOS implementations, Thread::SampleContext has been
  renamed to Thread::SuspendSampleAndResumeThread as that better describes
  what it does.  In the Linux/Android implementation there was no such
  separate method, so one has been created.

* The three Thread::SuspendSampleAndResumeThread methods have been commented
  in such a way as to emphasise their identical top level structure.

* The point in platform.cpp where platform-{win32,macos,linux-android}.cpp are
  #included has been moved slightly earlier in the file, into the
  SamplerThread encampment, as that seems like a better place for it.
Attachment #8849949 - Flags: review?(n.nethercote)
(Assignee)

Updated

9 months ago
Attachment #8847256 - Attachment is obsolete: true
(Reporter)

Comment 4

9 months ago
Comment on attachment 8849949 [details] [diff] [review]
Factor out the common parts of SamplerThread::Run()

Review of attachment 8849949 [details] [diff] [review]:
-----------------------------------------------------------------

Very nice. Thank you for doing this.

Lots of nits, mostly about the formatting at the start of functions: the return type (including |static|) goes on one line, the function name and arguments on another (or multiple), the opening brace on another. I may have missed some, so please check through the whole thing again for any remaining ones.

There's one more bit of duplication that could be factored out into Run(), which is the declaration of the TickSample and the setting of its fields: threadInfo, timestamp, rssMemory, ussMemory. (ussMemory is slightly different on Linux/Android, so you'd have to #if it.) You'd then pass the TickSample in instead of the ThreadInfo. Given that this is a non-trivial change and this patch is super bitrot-prone, if you want to skip that change for now I'm happy to do it as a follow-up.

::: tools/profiler/core/platform-linux-android.cpp
@@ +108,5 @@
>    return syscall(SYS_tgkill, tgid, tid, signalno);
>  }
>  
> +static
> +void SleepMicro(int aMicroseconds)

Nit: |static void| on its own line.

@@ +136,5 @@
>  
>    MOZ_ASSERT(!rv, "nanosleep call failed");
>  }
>  
> +class PlatformData {

Nit: '{' on its own line.

@@ +137,5 @@
>    MOZ_ASSERT(!rv, "nanosleep call failed");
>  }
>  
> +class PlatformData {
> + public:

Nit: don't indent |public|. Same comment applies to numerous other classes in the patch.

@@ +138,5 @@
>  }
>  
> +class PlatformData {
> + public:
> +  PlatformData(int aThreadId)

Does this pass static analysis on TreeHerder? I would have thought you'd need |explicit| on this, because it's a single-argument constructor. Likewise for the Mac version. (The Windows version already has |explicit|).

@@ +224,2 @@
>  
> +static void SigprofHandler(int aSignal, siginfo_t* aInfo, void* aContext)

Nit: |static void| on its own line.

@@ +265,5 @@
> +
> +  errno = savedErrno;
> +}
> +
> +static void* ThreadEntry(void* aArg)

Nit: |static void*| on its own line.

@@ +340,2 @@
>  
> +void SamplerThread::Stop(PS::LockRef aLock) {

Nit: '{' on its own line.

@@ +349,5 @@
> +  sigaction(SIGPROF, &mOldSigprofHandler, 0);
> +}
> +
> +void SamplerThread::SuspendSampleAndResumeThread(
> +       PS::LockRef aLock, ThreadInfo* aThreadInfo, bool isFirstProfiledThread)

Should be |aIsFirstProfiledThread|. Likewise on the other two implementations and the declaration in platform.cpp.

Also, |void| on its own line, and then indent the argument by only 2 spaces.

@@ +353,5 @@
> +       PS::LockRef aLock, ThreadInfo* aThreadInfo, bool isFirstProfiledThread)
> +{
> +  // Only one sampler thread can exist at once.  So we expect to have complete
> +  // control over |sSigHandlerCoordinator|.
> +  MOZ_ASSERT(!sSigHandlerCoordinator);

This assertion is valid, but the comment is not quite true. Because each SamplerThread instance is joined and destroyed outside of gPSMutex being locked, it's possible for the profiler to be restarted and the next SamplerThread created before the old one is destroyed. *However*, the old sampler thread will never take a sample -- its main loop will exit as soon as it runs again.

It'll be correct if you change the comment to "Only one sampler thread can be sampling at once."

::: tools/profiler/core/platform-macos.cpp
@@ +42,5 @@
> +  return gettid();
> +}
> +
> +static
> +void SleepMicro(int aMicroseconds)

|static void| on the same line.

@@ +84,2 @@
>  
> +static void SetThreadName() {

|static void| on its own line; '{' on its own line.

@@ +95,3 @@
>  }
>  
> +static void* ThreadEntry(void* aArg) {

Ditto.

@@ +119,5 @@
> +{
> +  pthread_join(mThread, nullptr);
> +}
> +
> +void SamplerThread::Stop(PS::LockRef aLock) {

|void| and '{' on their own lines.

@@ +123,5 @@
> +void SamplerThread::Stop(PS::LockRef aLock) {
> +  MOZ_RELEASE_ASSERT(NS_IsMainThread());
> +}
> +
> +void SamplerThread::SuspendSampleAndResumeThread(

|void| on its own line.

@@ +148,5 @@
> +  // Suspend the samplee thread and get its context.
> +
> +  // We're using thread_suspend on OS X because pthread_kill (which is what
> +  // we're using on Linux) has less consistent performance and causes
> +  // strange crashes, see bug 1166778 and bug 1166808.

thread_suspend() is also a hell of a lot simpler...

::: tools/profiler/core/platform-win32.cpp
@@ +41,5 @@
> +  return GetCurrentThreadId();
> +}
> +
> +static
> +void SleepMicro(int aMicroseconds)

|static void| on its own line.

@@ +95,2 @@
>  
> +static unsigned int __stdcall ThreadEntry(void* aArg) {

|static unsigned int __stdcall| and '{' on their own lines.

@@ +141,2 @@
>  
> +void SamplerThread::Stop(PS::LockRef aLock)

|void| on its own line.

@@ +152,5 @@
> +  // the mActivityGeneration check, and so it won't make any more ::Sleep()
> +  // calls.
> +  if (mIntervalMicroseconds < 10 * 1000) {
> +    ::timeEndPeriod(mIntervalMicroseconds / 1000);
> +   }

Mis-indented closing brace -- 3 spaces!

@@ +155,5 @@
> +    ::timeEndPeriod(mIntervalMicroseconds / 1000);
> +   }
> +}
> +
> +void SamplerThread::SuspendSampleAndResumeThread(

|void| on its own line.

::: tools/profiler/core/platform.cpp
@@ +1607,5 @@
> +// BEGIN SamplerThread
> +
> +// The sampler thread controls sampling and runs whenever the profiler is
> +// active. It periodically runs through all registered threads, finds those
> +// that should be sampled, then pauses and samples them.

I would put this comment immediately above class SamplerThread's definition.

@@ +1627,5 @@
> +  ~SamplerThread();
> +
> +  // This runs on the sampler thread.  It suspends and resumes the samplee
> +  // threads.
> +  void SuspendSampleAndResumeThread(PS::LockRef aLock, ThreadInfo* aThreadInfo,

My brain keeps reading "Sample" as a noun in this name, when it's actually a verb. Is SuspendAndSampleAndResume() better?

@@ +1648,5 @@
> +  // The handle for the sampler thread.
> +  HANDLE mThread;
> +#endif
> +
> +#if defined(GP_OS_darwin) || defined(GP_OS_linux) || defined(GP_OS_android)

Changing this to an #elif makes it clearer that it's an alternative to the GP_OS_windows #if above. (You could even combine the two comments into a single comment above the #if that says "The OS-specific handle for the sampler thread.")

@@ +1680,5 @@
> +
> +
> +// This function is the sampler thread.  This implementation is used for all
> +// targets.
> +void SamplerThread::Run()

|void| on its own line.

@@ +1736,5 @@
> +
> +          isFirstProfiledThread = false;
> +        }
> +
> +#       if defined(USE_LUL_STACKWALK)

I generally don't indent #ifs like this unless they are within other #ifs, though I know the codebase is generally inconsistent.
Attachment #8849949 - Flags: review?(n.nethercote) → review+
Comment on attachment 8849949 [details] [diff] [review]
Factor out the common parts of SamplerThread::Run()

Review of attachment 8849949 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/profiler/core/platform-win32.cpp
@@ +47,5 @@
> +  if (aMicroseconds <= 0) {
> +    return;
> +  }
> +
> +  ::Sleep(aMicroseconds / 1000);

Please make sure we sleep for at least 1ms. ::Sleep only accepts integers, and ::Sleep(0) is very different from ::Sleep(1), and I'm worried that we'll freeze the entire application if we sample too frequently.

I've filed bug 1349801 to improve on this.
Comment on attachment 8849949 [details] [diff] [review]
Factor out the common parts of SamplerThread::Run()

Review of attachment 8849949 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/profiler/core/platform-linux-android.cpp
@@ +111,5 @@
> +static
> +void SleepMicro(int aMicroseconds)
> +{
> +  if (aMicroseconds <= 0) {
> +    return;

This early return also seems dangerous. I think it'd be safer to call sleep even with an interval of zero because it'll give other threads a chance to do something.
(Assignee)

Comment 7

9 months ago
(In reply to Markus Stange [:mstange] from comment #5 and comment #6)
Marcus, thanks for sanity checking the patch.  I hadn't considered the effect
of corner case times on SleepMicro, true.

What I could do for now is this:

* Mac and Linux/Android:

  aMicroseconds = std::max(0, aMicroseconds);
  // and then always sleep/nanosleep

* Windows:

  aMicroseconds = std::max(0, aMicroseconds);
  int aMilliseconds = aMicroseconds / 1000;
  aMilliseconds = std::max(1, aMilliseconds);
  // and then always ::Sleep

I think that would address your concerns.  Do we have a mfbt version
of min/max available, rather than the std:: one?  I didn't find one.

How does that sound?  Nick, are you happy to go with that?
Flags: needinfo?(mstange)
(Reporter)

Comment 8

9 months ago
I don't know much about the different timing functions on different OSes, so I'm happy to defer to mstange.
(Assignee)

Comment 9

9 months ago
(In reply to Nicholas Nethercote [:njn] from comment #4)
> Comment on attachment 8849949 [details] [diff] [review]
> Factor out the common parts of SamplerThread::Run()

Thank you for the review.  All nits fixed.  Some minor followup comments:

> > +class PlatformData {
> > + public:
> > +  PlatformData(int aThreadId)
>
> Does this pass static analysis on TreeHerder? I would have thought you'd need
> |explicit| on this, because it's a single-argument constructor. Likewise for
> the Mac version. (The Windows version already has |explicit|).

No, it doesn't.  I r?'d and tried treeherder in parallel and it found that
soon afterwards.

> > +  // Only one sampler thread can exist at once.  So we expect to have complete
> > +  // control over |sSigHandlerCoordinator|.
> > +  MOZ_ASSERT(!sSigHandlerCoordinator);
>
> This assertion is valid, but the comment is not quite true. [..]
> 
> It'll be correct if you change the comment to "Only one sampler thread can
> be sampling at once."

Ah, interesting.  Fixed.

> > +  void SuspendSampleAndResumeThread(PS::LockRef aLock, ThreadInfo* aThreadInfo,
> 
> My brain keeps reading "Sample" as a noun in this name, when it's actually a
> verb. Is SuspendAndSampleAndResume() better?

Ok.  SuspendAndSampleAndResume it is.

> > +#       if defined(USE_LUL_STACKWALK)
>
> I generally don't indent #ifs like this unless they are within other #ifs,
> though I know the codebase is generally inconsistent.

I indent it like this because I find the un-indented version interrupts the
normal if/loop/etc indentation of the code and makes it harder to scan for top
level structure -- particularly marked in this case.  But anyway, is
un-indented as that's more consistent with the rest of the code.

> Lots of nits, mostly about the formatting at the start of functions: [..]

Re-checked, found some leftovers.  I took the liberty of also fixing the same
for the existing Android-specific ReadProfilerVars et al at the end of
platform-linux-android.cpp.

> There's one more bit of duplication that could be factored out into Run(),
> [..]

Let's do that as a followup.
(Assignee)

Comment 10

9 months ago
(In reply to Julian Seward [:jseward] from comment #9)
> Re-checked, found some leftovers.  I took the liberty of also fixing the same
> for the existing Android-specific ReadProfilerVars et al at the end of
> platform-linux-android.cpp.

And also changed the formal parameter names for freeArray, readCSVArray
and ReadProfilerVars (the Android specific stuff) to use the leading-'a'
convention.
(In reply to Julian Seward [:jseward] from comment #7)
> (In reply to Markus Stange [:mstange] from comment #5 and comment #6)
> Marcus, thanks for sanity checking the patch.  I hadn't considered the effect
> of corner case times on SleepMicro, true.
> 
> What I could do for now is this:
> 
> * Mac and Linux/Android:
> 
>   aMicroseconds = std::max(0, aMicroseconds);
>   // and then always sleep/nanosleep
> 
> * Windows:
> 
>   aMicroseconds = std::max(0, aMicroseconds);
>   int aMilliseconds = aMicroseconds / 1000;
>   aMilliseconds = std::max(1, aMilliseconds);
>   // and then always ::Sleep
> 
> I think that would address your concerns.

Yes, this would address my concerns. Though the Windows version can be simplified to
int milliseconds = std::max(1, aMicroSeconds / 1000);

> Do we have a mfbt version
> of min/max available, rather than the std:: one?  I didn't find one.

No, std::min/max is the one that's used everywhere.

(IIRC we have a few header files that don't want to pay the cost of #including all of <algorithm>, so they have local definitions of their own versions of these functions, or even inlined them directly. But that's not a concern for us here.)
Flags: needinfo?(mstange)

Comment 12

9 months ago
Pushed by jseward@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ac9d819c57c6
Factor out the common parts of SamplerThread::Run().  r=n.nethercote.

Comment 13

9 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/ac9d819c57c6
Status: NEW → RESOLVED
Last Resolved: 9 months ago
status-firefox55: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
status-firefox54: affected → ---
You need to log in before you can comment on or make changes to this bug.