Closed Bug 1150252 Opened 9 years ago Closed 9 years ago

Switch MacOS sampler to use pthread_kill

Categories

(DevTools :: Performance Tools (Profiler/Timeline), defect)

x86_64
Windows 8.1
defect
Not set
normal

Tracking

(firefox40 fixed)

RESOLVED WONTFIX
Firefox 40
Tracking Status
firefox40 --- fixed

People

(Reporter: djvj, Assigned: djvj)

References

Details

Attachments

(1 file, 2 obsolete files)

The current implementation of the sampler thread for profiling on MacOS X uses Mach's thread_suspend() and thread_resume() functions to interrupt the sampled thread.

It was discovered through some tests (see: bug 1143598) that using pthread_kill to interrupt the sampler is faster by about 3x to 4x.

The interrupt mechanism on OSX should be switched from using suspend/resume to pthread_kill.

jrmuizel noted that the OSX sampler implementation _used_ to use pthread_kill, and that its usage may have been responsible for topcrashes seen in bug 721025.  However, there was no actual reproduction done which identified this as the issue, and it's more of a guess.

So I'm going to attempt to move the sampling mechanism back to pthread_kill, and if it ends up causing issues, it will be backed out.
Fixed the 32-bit build error: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2a7dbb7be1af

Re-ran all the 'dt' tests a bunch of times on the first try run (see comment 1).  Looks good for review.
Attached patch use-pthread-kill-on-osx.patch (obsolete) — Splinter Review
Updated patch for review.
Attachment #8587036 - Attachment is obsolete: true
Attachment #8587444 - Flags: review?(mstange)
Comment on attachment 8587444 [details] [diff] [review]
use-pthread-kill-on-osx.patch

Review of attachment 8587444 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with all comment addressed

::: tools/profiler/ProfileEntry.h
@@ +109,5 @@
>  
>    void addStoredMarker(ProfilerMarker* aStoredMarker);
>    void deleteExpiredStoredMarkers();
>  
> +  int writePos() const {

Is this used anywhere in the patch? I don't see it.

@@ +183,5 @@
>  
>    uint32_t bufferGeneration() const {
>      return mBuffer->mGeneration;
>    }
> +  ProfileBuffer *buffer() const {

Neither is this.

::: tools/profiler/platform-macos.cc
@@ +57,5 @@
>  
>  Sampler *SamplerRegistry::sampler = NULL;
>  
> +static mozilla::Atomic<ThreadProfile*> sCurrentThreadProfile;
> +static volatile bool sSignalHandlingDone;

Why did you choose not to make this a mozilla::Atomic<bool>? And why did you choose yield spin locking over a condition variable? Those reasons should probably make their way into code comments.

@@ +181,5 @@
> +}
> +
> +} // namespace
> +
> +static void ProfilerSignalThread(ThreadProfile *profile,

I can't parse the name of this function, but it's the same as in platform-linux.cc, so let's keep it the way it is.

@@ +287,5 @@
>            info->Profile()->GetThreadResponsiveness()->Update();
>  
>            ThreadProfile* thread_profile = info->Profile();
> +          sCurrentThreadProfile = thread_profile;
> + 

end-of-line whitespace

@@ +315,5 @@
>  
> +    MOZ_ASSERT(sSignalHandlingDone == false);
> +    pthread_kill(profiled_pthread, SIGPROF);
> +    while (!sSignalHandlingDone)
> +        sched_yield();

Indent is two spaces, and you need braces around this statement.

@@ +360,5 @@
> +  struct sigaction sa;
> +  sa.sa_sigaction = ProfilerSignalHandler;
> +  sigemptyset(&sa.sa_mask);
> +  sa.sa_flags = SA_RESTART | SA_SIGINFO;
> +  if (sigaction(SIGPROF, &sa, &old_sigprof_signal_handler_) != 0) {

Linux is restoring old_sigprof_signal_handler_ in Sampler::Stop, you're not.
Attachment #8587444 - Flags: review?(mstange) → review+
Comment on attachment 8587444 [details] [diff] [review]
use-pthread-kill-on-osx.patch

Review of attachment 8587444 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/profiler/ProfileEntry.h
@@ +205,5 @@
>    PlatformData*  mPlatformData;  // Platform specific data.
>    void* const    mStackTop;
>    ThreadResponsiveness mRespInfo;
>  
>    // Only Linux is using a signal sender, instead of stopping the thread, so we

Does this comment need updating?
Comments addressed.  Looks good on try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4a514ac9e9bf
Attachment #8587444 - Attachment is obsolete: true
Attachment #8589683 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/24238425bc91
Assignee: nobody → kvijayan
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 40
Depends on: 1166778
Depends on: 1166808
Kannan, can you take care of backing this out?
Flags: needinfo?(kvijayan)
I backed it out because it caused bug 1166778 and bug 1166808.
Flags: needinfo?(kvijayan)
Resolution: FIXED → WONTFIX
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: