Closed
Bug 877522
Opened 11 years ago
Closed 9 years ago
Allow Linux perf_event sampling of context switches to get call stacks on B2G
Categories
(Firefox OS Graveyard :: General, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: jld, Assigned: jld)
References
Details
(Whiteboard: [perf-reviewed])
Attachments
(2 files)
This was looked at in bug 847592 (see bug 847592 comment 49 and the next few), but it needs its own bug.
To summarize, we'll sometimes want to know what threads are doing when they're not on the CPU: whether they're blocked (and, if so, on what) or runnable but not running because of scheduling decisions. The perf_event framework has an event type for probing context switches, which would give us this information, but it doesn't gather call stacks on ARM.
It looks as if at least part of what's missing is perf_arch_fetch_caller_regs[1], whose implementations on other platforms are relatively simple. That retrieves the necessary kernel registers, so something else may be needed to get the user stack as well.
[1] http://lxr.free-electrons.com/ident?i=perf_arch_fetch_caller_regs
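For reference, here is a minimal user-space sketch of the kind of request described above: sampling context switches through perf_event_open(2) and asking for a call chain with each sample. It assumes the software event PERF_COUNT_SW_CONTEXT_SWITCHES is the relevant event type, and the helper names make_ctxsw_attr and open_ctxsw_event are made up for illustration. It only shows how the event is requested; whether the kernel can fill in the call chain on ARM is exactly what this bug is about.

```c
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: describe an event that fires on every context
 * switch and records a call chain with each sample. */
struct perf_event_attr make_ctxsw_attr(void)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_SOFTWARE;
    attr.config = PERF_COUNT_SW_CONTEXT_SWITCHES;
    attr.sample_period = 1;            /* one sample per context switch */
    attr.sample_type = PERF_SAMPLE_TID | PERF_SAMPLE_TIME |
                       PERF_SAMPLE_CALLCHAIN;
    attr.disabled = 1;                 /* enable later via ioctl */
    return attr;
}

/* Hypothetical helper: open the event for one thread (pid) on one CPU
 * (cpu, or -1 for any). glibc has no wrapper for perf_event_open, so
 * this goes through syscall(2). Returns a file descriptor, or -1 with
 * errno set (for example if perf_event_paranoid forbids it). */
long open_ctxsw_event(pid_t pid, int cpu)
{
    struct perf_event_attr attr = make_ctxsw_attr();

    return syscall(__NR_perf_event_open, &attr, pid, cpu, -1, 0UL);
}
```

A caller would typically mmap the returned descriptor to read samples from the ring buffer and enable the event with ioctl(fd, PERF_EVENT_IOC_ENABLE, 0); that part is omitted here.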
Updated • 11 years ago (Assignee)
Summary: Allow perf_event sampling of context switches to get call chain → Allow Linux perf_event sampling of context switches to get call stacks on B2G
Updated • 11 years ago (Assignee)
Assignee: nobody → jld
Comment 1 • 11 years ago (Assignee)
Here's a patch against Linux master. Note that it hasn't been run past upstream yet, and it may not apply cleanly to other versions.
Comment 2 • 11 years ago (Assignee)
This one is for applying to the Geeksphone Keon kernel.
Updated • 11 years ago (Assignee)
Attachment #762996 -
Attachment description: Linux kernel patch (against 3.10.0-rc5+) to add perf_arch_caller_regs on ARM → Linux kernel patch (against 3.10.0-rc5+) to add perf_arch_fetch_caller_regs on ARM
Updated • 11 years ago
Status: NEW → ASSIGNED
Whiteboard: p= c= , → [p=profiling c=]
Updated • 11 years ago (Assignee)
Whiteboard: [p=profiling c=] → [p=5 c=]
Updated • 11 years ago
Whiteboard: [p=5 c=] → [p=5 c=profiling]
Updated • 11 years ago
Depends on: 904899
Whiteboard: [p=5 c=profiling] → [c=profiling s=2013.09.06 p=3]
Updated • 11 years ago (Assignee)
Whiteboard: [c=profiling s=2013.09.06 p=3] → [c=profiling s= p=3]
Comment 3 • 11 years ago (Assignee)
I got a proof-of-concept working Wednesday morning during the Oslo work week: https://people.mozilla.org/~bgirard/cleopatra/#report=619e178f5e8e20e78f9b07ab2dd4e5e0b7176345 Judging by the relatively large number of non-unwound samples, there are probably some bugs, but it works. The question is how this compares to the Gecko profiler. The Gecko profiler is gaining the ability to sample multiple threads in different processes and combine them into one profile (which was arguably perf's big advantage), and it should soon be able to trace past jitcode (which at this point perf seems unlikely ever to be able to do, at least not without even more kernel changes and special-case hacks).
I still think being able to tell why a thread is blocked (just switched out, stuck on IO, waiting for a lock) is extremely valuable. If we can get that information with the Gecko profiler then I'm happy.
Updated • 11 years ago
Whiteboard: [perf-reviewed]
Comment 5 • 9 years ago (Assignee)
This is probably not happening in any form. I think the Gecko profiler can get roughly equivalent information by now, and it's possible that the upstream Linux kernel has already taken care of this.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX