Enable ForkServer during PGO
Categories
(Core :: Security: Process Sandboxing, defect, P2)
Tracking
()
People
(Reporter: gerard-majax, Assigned: gerard-majax)
References
(Depends on 1 open bug, Blocks 2 open bugs)
Details
Attachments
(1 file, 5 obsolete files)
From bug 1907883 we have been able to identify that forkserver during PGO profiling results in:
- much less *.profraw data files (~80 with forkserver VS ~180 without forkserver)
- less process being profiled (counted 42 processes without forkserver, 27 with forkserver)
It would be nice to understand why.
Assignee | ||
Comment 1•3 months ago
|
||
LVM Profile Error: Invalid profile data to merge
LLVM Profile Error: Profile Merging of file /home/alex/codaz/Mozilla/gecko-cinnabar/PGO_data/pgo_forkserver_4138445_random_2970392330336318723_0.profraw failed: Le fichier existe
LLVM Profile Error: Failed to write file "/home/alex/codaz/Mozilla/gecko-cinnabar/PGO_data/pgo_forkserver_4138445_random_2970392330336318723_0.profraw": Le fichier existe
Getting this when I try to run the instrumented build. 4138445
is the ForkServer PID, so likely LLVM profiling gets tricked on the PID of WebContent process after they came from ForkServer
Assignee | ||
Comment 2•2 months ago
|
||
Assignee | ||
Comment 3•2 months ago
|
||
Assignee | ||
Comment 4•2 months ago
|
||
Assignee | ||
Comment 5•2 months ago
|
||
Assignee | ||
Comment 6•2 months ago
|
||
Updated•2 months ago
|
Assignee | ||
Comment 7•2 months ago
|
||
Assignee | ||
Updated•2 months ago
|
Updated•2 months ago
|
Updated•2 months ago
|
Updated•2 months ago
|
Updated•2 months ago
|
Updated•1 month ago
|
Assignee | ||
Comment 8•1 month ago
|
||
https://treeherder.mozilla.org/jobs?repo=try&revision=324e0fda0a80879857dbc3ada3895ba8cd82a1a4
this shows the same amount of data for merged.profdata
as compared to previously:
$ tar tvf profdata.tar.xz
-rw-r--r-- worker/worker 106602448 2024-09-04 13:59 merged.profdata
With e.g., a mozilla-central run [https://treeherder.mozilla.org/jobs?repo=mozilla-central&searchStr=linux64%2Crun]:
$ tar tvf profdata.tar.xz
-rw-r--r-- worker/worker 106487816 2024-09-05 07:18 merged.profdata
Updated•29 days ago
|
Comment 10•29 days ago
|
||
Backed out as requested for working in the incorrect sandbox.
Assignee | ||
Updated•29 days ago
|
Comment 11•29 days ago
|
||
bugherder |
Updated•29 days ago
|
Comment 12•28 days ago
|
||
Backout merged to central: https://hg.mozilla.org/mozilla-central/rev/c681659e206a
Assignee | ||
Comment 13•14 days ago
|
||
So with the progress on bug 1917575, I rebased on top of that stack and I'm getting stuck into a kind of deadlock state at startup within updateFilemaneAfterFork
:
$ grep --color -E "0x7f957bb06c20|0x564b7954be10" pgo_pthread_atfork.log
[297869] [bool malloc_init_hard()] register _malloc_postfork_child@0x564b7954be10
[297869] [int CustomRegisterRuntime()] register updateFilenameAfterFork@0x7f957bb06c20
[297887] [void updateFilenameAfterFork()] running updateFilenameAfterFork@0x7f957bb06c20 START
[297904] [void updateFilenameAfterFork()] running updateFilenameAfterFork@0x7f957bb06c20 START
[297917] [void updateFilenameAfterFork()] running updateFilenameAfterFork@0x7f957bb06c20 START
Obviously _malloc_postfork_child
is registered before updateFilenameAfterFork
but does not get called before we call updateFilenameAfterFork
and I suspect this may be because how we handle interposition of pthread_atfork
and forwarding to the real atfork: the real_pthread_atfork
symbol is resolved after the malloc init (because searching the symbol triggers some malloc so we would deadlock), so the interposer registers and will replay once things allows.
But our interposer does not looks to be applied to the LLVM bypass code, so the pthread_atfork
it uses is directly the one from the glibc: ergo, even if malloc calls before, the actual moment the glibc really knows about it is after the registration of updateFilenameAfterFork
and the child handlers are called in order of registration.
Tricking by just using the pthread_atfork child handler to atexit(...)
with ...
a function that will call updateFilenameAfterFork
seems enough.
$ grep --color -E "0x7f7985506c40|0x5580a6468ec0" pgo_pthread_atfork.log
[291301] [bool malloc_init_hard()] register _malloc_postfork_child@0x5580a6468ec0
[291301] [int CustomRegisterRuntime()] register registerAtExit@0x7f7985506c40
[291320] [void registerAtExit()] running registerAtExit@0x7f7985506c40 START
[291320] [void registerAtExit()] running registerAtExit@0x7f7985506c40 END
[291320] [void _malloc_postfork_child()] running _malloc_postfork_child@0x5580a6468ec0 START
[291320] [void _malloc_postfork_child()] running _malloc_postfork_child@0x5580a6468ec0 END
[291444] [void registerAtExit()] running registerAtExit@0x7f7985506c40 START
[291444] [void registerAtExit()] running registerAtExit@0x7f7985506c40 END
[291444] [void _malloc_postfork_child()] running _malloc_postfork_child@0x5580a6468ec0 START
[291444] [void _malloc_postfork_child()] running _malloc_postfork_child@0x5580a6468ec0 END
[291563] [void registerAtExit()] running registerAtExit@0x7f7985506c40 START
[291563] [void registerAtExit()] running registerAtExit@0x7f7985506c40 END
[291563] [void _malloc_postfork_child()] running _malloc_postfork_child@0x5580a6468ec0 START
[291563] [void _malloc_postfork_child()] running _malloc_postfork_child@0x5580a6468ec0 END
[291571] [void registerAtExit()] running registerAtExit@0x7f7985506c40 START
[291571] [void registerAtExit()] running registerAtExit@0x7f7985506c40 END
[291571] [void _malloc_postfork_child()] running _malloc_postfork_child@0x5580a6468ec0 START
[291571] [void _malloc_postfork_child()] running _malloc_postfork_child@0x5580a6468ec0 END
[291574] [void registerAtExit()] running registerAtExit@0x7f7985506c40 START
[291574] [void registerAtExit()] running registerAtExit@0x7f7985506c40 END
[291574] [void _malloc_postfork_child()] running _malloc_postfork_child@0x5580a6468ec0 START
[291574] [void _malloc_postfork_child()] running _malloc_postfork_child@0x5580a6468ec0 END
[291718] [void registerAtExit()] running registerAtExit@0x7f7985506c40 START
[291718] [void registerAtExit()] running registerAtExit@0x7f7985506c40 END
[291718] [void _malloc_postfork_child()] running _malloc_postfork_child@0x5580a6468ec0 START
[291718] [void _malloc_postfork_child()] running _malloc_postfork_child@0x5580a6468ec0 END
[291910] [void registerAtExit()] running registerAtExit@0x7f7985506c40 START
[291910] [void registerAtExit()] running registerAtExit@0x7f7985506c40 END
[291910] [void _malloc_postfork_child()] running _malloc_postfork_child@0x5580a6468ec0 START
[291910] [void _malloc_postfork_child()] running _malloc_postfork_child@0x5580a6468ec0 END
Assignee | ||
Comment 14•14 days ago
|
||
I guess that contrary to within the interposer code, we want to dlsym(nullptr, "pthread_atfork")
here to find the interposed function and not the glibc one. This way we follow the same call order than the rest of the interposed calls and we are fine.
Description
•