Open Bug 1805673 Opened 2 years ago Updated 2 years ago

Profiler: Incorrect Stack Chart on Linux in vdso image

Categories

(Core :: Gecko Profiler, defect, P3)

Unspecified
Linux
defect

Tracking

()

People

(Reporter: dshin, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

Attached image Example of broken trace

STR: Capture profile of bug 1755006 attachment 9266373 [details] on Linux (Ubuntu 20.04, 110.0a1)
Expected: All RefreshDriver tick present under proper event loop stack ()
Actual: Some ticks are presented right under XRE_InitChildProcess, breaking up one continuous invocation of tick into multiple pieces. See https://share.firefox.dev/3Wiy3nD

On Windows, the chart looks correct: https://share.firefox.dev/3WlX3dT

I was thinking that this could be an occurrence of bug 1434402, but I'm not so sure, because it looks like that the all samples that have the issue are in the Interpreter. But this may just mean that the JIT frames are missing? It's not clear to me, we clearly see Ion samples that have working stacks.

Blocks: 1307215
See Also: → 1434402

The bad samples have only one native address that is 0x7fffXXXXXXXX which on Linux I believe are data stack addresses rather than code. The remaining samples are all markers. I don't believe the JITs are involved at all since the Performance.now marker is on the stack.

I've seen issues recently where 3rd party libraries (like gtk) don't have frame pointers. Not sure if this related though.

These addresses might be in the vdso image. We probably need to tell Lul about the unwind information in the vdso.

Summary: Profiler: Incorrect Stack Chart on Linux → Profiler: Incorrect Stack Chart on Linux in vdso image

So I think this code needs to be special-cased for libraries whose name is [vdso]. We need to treat the bytes in our process memory from lib.GetStart() to lib.GetEnd() as a normal ELF file. The Lul ELF section loading code needs to support the case where we already have the bytes for the file in memory and they don't need to be (and can't be) loaded from the file system.

Oh, and it makes sense that the vdso issue would affect this synthetic "sleep" testcase; the vdso was basically invented to avoid syscall overhead for the gettimeofday function, which gets called during performance.now().

Hmm, it seems SharedLibraryInfo::GetInfoForSelf() does not currently assign a consistent name to the vdso image. The comment I linked in comment 4 says the vdso has an empty name, but it seems like it could also be called "linux-vdso.so.1" (as in the profile from bug 1805915 comment 7).

So it's probably best to fix SharedLibraryInfo::GetInfoForSelf() first. The vdso start address can be obtained with getauxval:

#include <sys/auxv.h>
void *vdso = (uintptr_t) getauxval(AT_SYSINFO_EHDR);

And then we can just find the existing entry in GetInfoForSelf() for it and adjust its name.

The severity field is not set for this bug.
:canova, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(canaltinova)
Severity: -- → S3
Flags: needinfo?(canaltinova)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: