Closed Bug 909403 Opened 11 years ago Closed 10 years ago

Starting the profiler with a signal is not signal safe

Categories

(Core :: Gecko Profiler, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: BenWa, Unassigned)

Details

(Keywords: perf, Whiteboard: [c=profiling p= s=2014.03.28 u=])

Currently we listen on SIGUSR1 to start the profiler. When we received the signal we start the profiler from the signal handler. This is unsafe and should be fixed.

(In reply to Dave Huseby [:huseby] from comment #78)
> IIRC, that's what signalfd is for: http://linux.die.net/man/2/signalfd
Keywords: perf
Whiteboard: [c= p= s= u=]
I think the correct approach for this is to use signalfd to create an fd that will become readable when SIGUSR1 is delivered to the process.  The main thread can then use async I/O (select or libuv) to start the profiler when the fd becomes readable.

The only downside is the potential lag between when the kernel delivers the signal and sets the readability flag, and the time that the main thread notices that the fd is readable.
Assignee: nobody → dhuseby
OS: Mac OS X → All
Hardware: x86 → All
Status: NEW → ASSIGNED
Whiteboard: [c= p= s= u=] → [c= p=3 s= u=]
Thought: do we want to be using signals for this at all?  If we're going to epoll a signalfd, could we epoll a fifo or Unix-domain socket instead?  This would allow accepting a non-singleton set of commands.

(Currently it's epoll, anyway; this is, I expect, wrapped in a nonzero number of layers of abstraction.)
Well the benefit of using signals has always been that it works when the applications is locked. When the application is hanging it's worth taking the risk of signals not being safe to get data out about the hang.

We should take into consideration that APIs are available across platform as well. Most of the profiler logic is platform neutral with a skim in platform.h/platform-*.cpp for the platform specific part.

If we don't care at all about starting the profiler while gecko is hanging, and quite fraquantly I think there's many issues now that prevent this from working, perhaps we should just remove the signal handler.

The proper way to talk to the profiler while the application is responding is via the RDP server.
(In reply to Benoit Girard (:BenWa) from comment #3)
> Well the benefit of using signals has always been that it works when the
> applications is locked. 

If that's what you want, why not just use the gcore command or kill -ABRT?  Both will force the process to coredump which you can then pick apart with gdb to figure out what's going on.

I'm more concerned with turning the profiler on/off very precisely from unit tests or some other external driver.  I would like to use the profiler to capture perf data during perf tests.  This will allow us to optimize better and/or detect perf regressions.
(In reply to Dave Huseby [:huseby] from comment #4)
> (In reply to Benoit Girard (:BenWa) from comment #3)
> > Well the benefit of using signals has always been that it works when the
> > applications is locked. 
> 
> If that's what you want, why not just use the gcore command or kill -ABRT? 
> Both will force the process to coredump which you can then pick apart with
> gdb to figure out what's going on.

It's fine for deadlock but for livelocks it's interesting to know what states the application is cycling between. But I don't think it's worth the man time to do this properly over other features.

> 
> I'm more concerned with turning the profiler on/off very precisely from unit
> tests or some other external driver.  I would like to use the profiler to
> capture perf data during perf tests.  This will allow us to optimize better
> and/or detect perf regressions.

Right, that's been the plan all along to support these use cases when designing the profiler. We're already doing this to some extend on eideticker and talos. If you want the entire run you can trivially MOZ_PROFILER_STARTUP, MOZ_PROFILER_SHUTDOWN=<profile>.

Dispatching to the main thread is fine given that starting and saving the profile can't easily be made signal safe.
Whiteboard: [c= p=3 s= u=] → [c=profiling p=3 s= u=]
because this isn't blocking anything I'm going to punt it in favor of addressing koi+ bugs right now.
Whiteboard: [c=profiling p=3 s= u=] → [c=profiling p=5 s= u=]
Assignee: dhuseby → nobody
I filed this bug thinking that it would a problem that we'd run into.  The bug hasn't manifest itself and the amount of work needed to fix the non-existent bug overwhelms any latent concern.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Whiteboard: [c=profiling p=5 s= u=] → [c=profiling p=5 s=2014.03.28 u=]
Whiteboard: [c=profiling p=5 s=2014.03.28 u=] → [c=profiling p= s=2014.03.28 u=]
You need to log in before you can comment on or make changes to this bug.