Closed Bug 935092 Opened 7 years ago Closed 7 years ago

Make a way to get another thread's pseudo-stack

Categories

(Core :: Gecko Profiler, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla28
Tracking Status
firefox28 --- fixed

People

(Reporter: jchen, Assigned: jchen)

References

Details

(Whiteboard: [qa-])

Attachments

(2 files, 2 obsolete files)

The background hang detector in bug 909974 detects a transient hang after the fact, so the pseudo stack is no longer available by then. It is possible to have the last-seen stack if we keep a separate stack pointer (and only use non-copying labels)
Trivial patch to const-ify pseudo stack entry methods
Attachment #827509 - Flags: review?(bgirard)
This patch adds a max stack pointer so that we can get the last-seen pseudostack after the stack frames have been popped.
Attachment #827511 - Flags: review?(bgirard)
Comment on attachment 827509 [details] [diff] [review]
Const-ify profiler pseudostack entries (v1)

Review of attachment 827509 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/public/ProfilingStack.h
@@ +55,5 @@
>          return sp == nullptr;
>      }
>  
> +    uint32_t line() const volatile { JS_ASSERT(!js()); return idx; }
> +    JSScript *script() const volatile { JS_ASSERT(js()); return script_; }

ideally JSScript would also be constant

@@ +65,5 @@
>      void setStackAddress(void *aSp) volatile { sp = aSp; }
>      void setScript(JSScript *aScript) volatile { script_ = aScript; }
>  
>      // We can't know the layout of JSScript, so look in vm/SPSProfiler.cpp.
> +    JS_FRIEND_API(jsbytecode *) pc() const volatile;

and jsbytecode
Attachment #827509 - Flags: review?(bgirard) → review+
Comment on attachment 827511 [details] [diff] [review]
Add ability to get last-seen pseudostack (v1)

Review of attachment 827511 [details] [diff] [review]:
-----------------------------------------------------------------

I don't really think mMaxStackPointer will really be pointing to something useful after the hang. r- but feel free to convince me.

Why can't we have the hang detector catch the hang in the fact and take a snapshot (or even a profile) of the hang?
Attachment #827511 - Flags: review?(bgirard) → review-
So the hang detector defines two types of hangs - transient hangs and permanent hangs,

* Permanent hangs are more like hangs in the traditional sense: lasting longer than a few seconds; slow running code but quite possibly a deadlock. I agree that taking a snapshot or a profile is a good way of diagnosis.

* Transient hangs, on the other hand, are simply tasks that took longer than expected to run. In other words, we hang for a short time but eventually recover. The hang detector catches these hangs by measuring the interval between starting and ending a task, e.g.

> hangMonitor.NotifyActivity();
> do_task();
> hangMonitor.NotifyActivity();
> ...

While we do catch permanent hang during the hang, because of the mechanism above, we only catch transient hangs after the fact.

I do think there's value in the last-seen pesudostack. It indicates the task that ran for longer than expected time.

For example, if you have a thread that runs different tasks,

> void sub_task() {
>   profiler_label("subtask");
> }
> void run_task0() {
>   profiler_label("task0");
> }
> void run_task1() {
>   profiler_label("task1");
>   sub_task();
> }
> void run_task2() {
>   profiler_label("task2");
>   sub_task();
> }
>
> while (thread_running) {
>   hangMonitor.NotifyActivity();
>   switch (task_type) {
>   case 0: run_task0(); break;
>   case 1: run_task1(); break;
>   case 2: run_task2(); break;
>   }
>   hangMonitor.NotifyActivity();
> }

After each task runs, mMaxStackPointer will point to the last pseudo-stack from each task run,
For task 0, "task0"
For task 1: "task1" > "subtask"
For task 2: "task2" > "subtask"

This way, if the interval between task start and stop is too long, we can determine which task was last run and diagnose from there.
It's seems really fragile because there might be a label push new the end that replaces the useful labels with something useless near the end of the task. I do agree in some cases it could provide a bit of information but only if we're lucky.

Why should we treat permanent and transient hang differently? I'd rather build one good hang detector that works for both. If we don't notify the hang monitor of activity then we can just interrupt the main thread and take a snapshot.

Perhaps this discussion would best be had on vidyo.
After talking to BenWa, we've decided to use a mechanism similar to what the profiler uses -- asynchronously suspending a thread and taking the pseudo-stack. Patch coming soon.
Summary: Make last-seen peudo stack available → Make a way to get another thread's pseudo-stack
The ThreadStackHelper class implements platform-specific ways to suspend a thread and get the pseudostack. In the future, it may be expanded to have unwinding capabilities (so background hang monitor can get a real stack in case of very long hangs or deadlocks)
Attachment #828873 - Flags: review?(bgirard)
Comment on attachment 828873 [details] [diff] [review]
Add ThreadStackHelper to get a thread's pesudo-stack (v1)

Review of attachment 828873 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/threads/ThreadStackHelper.h
@@ +36,5 @@
> +
> +class ThreadStackHelperCommon
> +{
> +public:
> +  typedef Telemetry::HangHistogram::Stack Stack;

Not defined here, Telemetry::HangHistogram::Stack is defined elsewhere as mozilla::Vector<const char*, 8>, i.e. a vector of (const char*) with 8 inline entries.
Comment on attachment 828873 [details] [diff] [review]
Add ThreadStackHelper to get a thread's pesudo-stack (v1)

Review of attachment 828873 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/threads/ThreadStackHelper.cpp
@@ +53,5 @@
> +
> +void
> +ThreadStackHelper::Startup()
> +{
> +  if (sInitialized++) {

This means that this class can only be used by one thread other it races on this.

@@ +58,5 @@
> +    return;
> +  }
> +  if (::sem_init(&sSem, 0, 0)) {
> +    // Don't set sInitialized if sem_init failed
> +    sInitialized--;

MOZ_ASSERT(false);

If this fails the error isn't reported. How will ThreadStackHelper behave after this silent failure?

@@ +99,5 @@
> +  sCurrent = this;
> +  struct sigaction sigact = {};
> +  sigact.sa_sigaction = SigAction;
> +  sigemptyset(&sigact.sa_mask);
> +  sigact.sa_flags = SA_SIGINFO;

Bad things will happen if you don't also SA_RESTART here.

@@ +101,5 @@
> +  sigact.sa_sigaction = SigAction;
> +  sigemptyset(&sigact.sa_mask);
> +  sigact.sa_flags = SA_SIGINFO;
> +  if (::sigaction(SIGPROF, &sigact, &sOldSigAction)) {
> +    return;

MOZ_ASSERT(false)

@@ +103,5 @@
> +  sigact.sa_flags = SA_SIGINFO;
> +  if (::sigaction(SIGPROF, &sigact, &sOldSigAction)) {
> +    return;
> +  }
> +  ::pthread_kill(mThreadID, SIGPROF);

I would suggest using tgkill like the profiler. I believe it was for portability. In any case it's best to only rely on one instead of both.

@@ +105,5 @@
> +    return;
> +  }
> +  ::pthread_kill(mThreadID, SIGPROF);
> +  ::sem_wait(&sSem);
> +  aStack = Move(mStackBuffer);

I'd clear sCurrent here for safety.

Also your not removing your signal handler when your done. Either you add/remove it every time (I'd suggest this) or you set it up and remove it on init/shutdown.

@@ +130,5 @@
> +ThreadStackHelper::GetStack(Stack& aStack)
> +{
> +  if (!PrepareStackBuffer(aStack) ||
> +      !mInitialized) {
> +    return;

MOZ_ASSERT(false) on onward in the patch for all the failure conditions.

::: xpcom/threads/ThreadStackHelper.h
@@ +48,5 @@
> +  bool PrepareStackBuffer(Stack& aStack);
> +  void FillStackBuffer();
> +};
> +
> +#if defined(XP_UNIX)

IMO it's bad style to have ifdef around entire class when they are in fact cross platform. You should only ifdef around private fields that are platform specific.
Attachment #828873 - Flags: review?(bgirard) → review-
(In reply to Benoit Girard (:BenWa) from comment #10)
> Comment on attachment 828873 [details] [diff] [review]
> Add ThreadStackHelper to get a thread's pesudo-stack (v1)
> 
> Review of attachment 828873 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: xpcom/threads/ThreadStackHelper.cpp
> @@ +53,5 @@
> > +
> > +void
> > +ThreadStackHelper::Startup()
> > +{
> > +  if (sInitialized++) {
> 
> This means that this class can only be used by one thread other it races on
> this.

I think it'll be fine if we require calling Startup and Shutdown on the main thread. I added comments in ThreadStackHelper.h

> @@ +58,5 @@
> > +    return;
> > +  }
> > +  if (::sem_init(&sSem, 0, 0)) {
> > +    // Don't set sInitialized if sem_init failed
> > +    sInitialized--;
> 
> MOZ_ASSERT(false);
> 
> If this fails the error isn't reported. How will ThreadStackHelper behave
> after this silent failure?

An empty stack is returned in GetStack() if there's any error along the way.

> @@ +99,5 @@
> > +  sCurrent = this;
> > +  struct sigaction sigact = {};
> > +  sigact.sa_sigaction = SigAction;
> > +  sigemptyset(&sigact.sa_mask);
> > +  sigact.sa_flags = SA_SIGINFO;
> 
> Bad things will happen if you don't also SA_RESTART here.

Added the flag.

> @@ +103,5 @@
> > +  sigact.sa_flags = SA_SIGINFO;
> > +  if (::sigaction(SIGPROF, &sigact, &sOldSigAction)) {
> > +    return;
> > +  }
> > +  ::pthread_kill(mThreadID, SIGPROF);
> 
> I would suggest using tgkill like the profiler. I believe it was for
> portability. In any case it's best to only rely on one instead of both.

Changed to tgkill. Seems like pthreads would be more portable than Linux-specific tgkill, but maybe pthreads is not available everywhere.

> @@ +105,5 @@
> > +    return;
> > +  }
> > +  ::pthread_kill(mThreadID, SIGPROF);
> > +  ::sem_wait(&sSem);
> > +  aStack = Move(mStackBuffer);
> 
> I'd clear sCurrent here for safety.
> 
> Also your not removing your signal handler when your done. Either you
> add/remove it every time (I'd suggest this) or you set it up and remove it
> on init/shutdown.

sCurrent and the signal handler are cleared inside the handler.

> @@ +130,5 @@
> > +ThreadStackHelper::GetStack(Stack& aStack)
> > +{
> > +  if (!PrepareStackBuffer(aStack) ||
> > +      !mInitialized) {
> > +    return;
> 
> MOZ_ASSERT(false) on onward in the patch for all the failure conditions.

Added. I used MOZ_ALWAYS_TRUE in some cases.

> ::: xpcom/threads/ThreadStackHelper.h
> @@ +48,5 @@
> > +  bool PrepareStackBuffer(Stack& aStack);
> > +  void FillStackBuffer();
> > +};
> > +
> > +#if defined(XP_UNIX)
> 
> IMO it's bad style to have ifdef around entire class when they are in fact
> cross platform. You should only ifdef around private fields that are
> platform specific.

Okay I changed to using a single ThreadStackHelper declaration with ifdef'd fields and methods.
Attachment #828873 - Attachment is obsolete: true
Attachment #830940 - Flags: review?(bgirard)
(In reply to Jim Chen [:jchen :nchen] from comment #11)
> 
> I think it'll be fine if we require calling Startup and Shutdown on the main
> thread. I added comments in ThreadStackHelper.h

Let's have a main thread assert then

> 
> Changed to tgkill. Seems like pthreads would be more portable than
> Linux-specific tgkill, but maybe pthreads is not available everywhere.

The problem could be with android or maybe we just inherited tgkill from the v8 profiler. I don't remember. I rather only use one instance and if it causes problem we can patch both this and the profiler.
Comment on attachment 830940 [details] [diff] [review]
Add ThreadStackHelper to get a thread's pesudo-stack (v2)

Review of attachment 830940 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/threads/ThreadStackHelper.cpp
@@ +50,5 @@
> +  : mPseudoStack(mozilla_get_pseudo_stack())
> +  , mStackBuffer()
> +  , mMaxStackSize(mStackBuffer.capacity())
> +{
> +#if defined(XP_LINUX)

I don't mean to suggest that you should merge these however. I can leave that decision up to you for what you prefer. I'm more picky about the header.

::: xpcom/threads/ThreadStackHelper.h
@@ +33,5 @@
> + *
> + * Only non-copying labels are included in the stack, which means labels
> + * with custom text and markers are not included.
> + */
> +class ThreadStackHelper

IMO this is much easier to read.
Attachment #830940 - Flags: review?(bgirard) → review+
Attachment #827511 - Attachment is obsolete: true
https://hg.mozilla.org/mozilla-central/rev/18ba8691786d
https://hg.mozilla.org/mozilla-central/rev/288456164d94
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
Depends on: 942488
Depends on: 946817
Whiteboard: [qa-]
You need to log in before you can comment on or make changes to this bug.