Permafail Assertion failure: curThreadData, at /builds/worker/workspace/build/src/xpcom/build/IOInterposer.cpp:505 when Gecko 66 merges to Beta 2019-01-21
Categories
(Core :: Gecko Profiler, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr60 | --- | unaffected |
firefox64 | --- | unaffected |
firefox65 | --- | unaffected |
firefox66 | + | verified |
People
(Reporter: RaulG, Assigned: florian)
References
Details
(Keywords: regression)
Attachments
(1 file, 1 obsolete file)
2.35 KB,
patch
|
mstange
:
review+
|
Details | Diff | Splinter Review |
[Tracking Requested - why for this release]:
Central as Beta simulation: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d86a8f66f46ca3221d538868788a0c52fe14d5a1&searchStr=linux%2Cx64%2Cquantumrender%2Cdebug%2Ctest-linux64-qr%2Fdebug-gtest%2C%28gtest%29&selectedJob=224032521
Log link: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=224032521&repo=try&lineNumber=1208
Log snippet:
[task 2019-01-25T14:55:57.866Z] 14:55:57 INFO - Copy/paste: /builds/worker/workspace/build/venv/bin/mozinstall -h
[task 2019-01-25T14:55:58.078Z] 14:55:58 INFO - Reading from file tmpfile_stdout
[task 2019-01-25T14:55:58.079Z] 14:55:58 INFO - Output received:
[task 2019-01-25T14:55:58.079Z] 14:55:58 INFO - Usage: mozinstall [options] installer
[task 2019-01-25T14:55:58.080Z] 14:55:58 INFO - Options:
[task 2019-01-25T14:55:58.080Z] 14:55:58 INFO - -h, --help show this help message and exit
[task 2019-01-25T14:55:58.080Z] 14:55:58 INFO - -d DEST, --destination=DEST
[task 2019-01-25T14:55:58.083Z] 14:55:58 INFO - Directory to install application into. [default:
[task 2019-01-25T14:55:58.083Z] 14:55:58 INFO - "/builds/worker/workspace"]
[task 2019-01-25T14:55:58.083Z] 14:55:58 INFO - --app=APP Application being installed. [default: firefox]
[task 2019-01-25T14:55:58.083Z] 14:55:58 INFO - mkdir: /builds/worker/workspace/build/application
[task 2019-01-25T14:55:58.083Z] 14:55:58 INFO - Getting output from command: ['/builds/worker/workspace/build/venv/bin/mozinstall', '/builds/worker/workspace/build/installer.tar.bz2', '--destination', '/builds/worker/workspace/build/application']
[task 2019-01-25T14:55:58.083Z] 14:55:58 INFO - Copy/paste: /builds/worker/workspace/build/venv/bin/mozinstall /builds/worker/workspace/build/installer.tar.bz2 --destination /builds/worker/workspace/build/application
[task 2019-01-25T14:56:17.681Z] 14:56:17 INFO - Reading from file tmpfile_stdout
[task 2019-01-25T14:56:17.681Z] 14:56:17 INFO - Output received:
[task 2019-01-25T14:56:17.682Z] 14:56:17 INFO - /builds/worker/workspace/build/application/firefox/firefox
[task 2019-01-25T14:56:17.682Z] 14:56:17 INFO - Running post-action listener: _resource_record_post_action
[task 2019-01-25T14:56:17.683Z] 14:56:17 INFO - [mozharness: 2019-01-25 14:56:17.682988Z] Finished install step (success)
[task 2019-01-25T14:56:17.683Z] 14:56:17 INFO - [mozharness: 2019-01-25 14:56:17.683266Z] Running stage-files step.
[task 2019-01-25T14:56:17.683Z] 14:56:17 INFO - Running pre-action listener: _resource_record_pre_action
[task 2019-01-25T14:56:17.683Z] 14:56:17 INFO - Running main action method: stage_files
[task 2019-01-25T14:56:17.684Z] 14:56:17 INFO - Moving /builds/worker/workspace/build/tests/bin/plugins/gmp-clearkey to /builds/worker/workspace/build/application/firefox
[task 2019-01-25T14:56:17.685Z] 14:56:17 ERROR - shutil error: Destination path '/builds/worker/workspace/build/application/firefox/gmp-clearkey' already exists
[task 2019-01-25T14:56:17.685Z] 14:56:17 INFO - Moving /builds/worker/workspace/build/tests/bin/plugins/gmp-fake to /builds/worker/workspace/build/application/firefox
[task 2019-01-25T14:56:17.685Z] 14:56:17 INFO - Moving /builds/worker/workspace/build/tests/bin/plugins/gmp-fakeopenh264 to /builds/worker/workspace/build/application/firefox
Comment 1•5 years ago
|
||
It looks like these assertions are being hit off the main thread in a Gecko profiler unit test that calls functions on a background thread, so maybe this is due to the profiler. I think that the assert means that some IOInterposer TLS hasn't been set up on these threads, but I have no idea why this would start breaking.
Comment 3•5 years ago
|
||
Regression from bug 1522497? This is a beta simulation and the final merge of central to beta (for Gecko 66) is on Monday.
Florian, please investigate.
Updated•5 years ago
|
Assignee | ||
Comment 4•5 years ago
|
||
This patch adds ifdefs in the profiler code to match the behavior at https://searchfox.org/mozilla-central/rev/465dbfe030dfec7756b9b523029e90d48dd5ecce/xpcom/build/IOInterposer.h#259-261
Updated•5 years ago
|
Comment 5•5 years ago
•
|
||
Comment on attachment 9039144 [details] [diff] [review] Tentative patch Review of attachment 9039144 [details] [diff] [review]: ----------------------------------------------------------------- Interesting! I think I've understood what's happening here. On Nightly, IOInterposer::Init is always called before any thread starts that calls IOInterposer::RegisterCurrentThread. That means that IOInterposer::RegisterCurrentThread always succeeds and IOInterposer::UnregisterCurrentThread always finds data for the current thread. However, if IOInterposer::Init is not called before a thread calls IOInterposer::RegisterCurrentThread, then IOInterposer::RegisterCurrentThread will fail. Now, two things can happen: - Nothing calls IOInterposer::Init until the thread shuts down. Then IOInterposer::UnregisterCurrentThread won't do anything and all is fine. - Something calls IOInterposer::Init before the thread shuts down. Then IOInterposer::UnregisterCurrentThread sees that the interposer has been initialized, but it won't find any data for the current thread, and it'll assert. The profiler is now calling IOInterposer::Init. On Nightly, this happens: - If we're startup profiling, we will call IOInterposer::Init before any thread has called IOInterposer::RegisterCurrentThread. All is fine. - If we're not startup profiling, then IOInterposerInit has already called IOInterposer::Init. The profiler's call to IOInterposer::Init does nothing. Things are fine as well. But on release and beta, in the second case, IOInterposer::Init has not been called! So the profiler's call will initialize it, and we get into the state described above where the IOInterposer has been initialized between calls to IOInterposer::RegisterCurrentThread and IOInterposer::UnregisterCurrentThread. I think this is a problem with the IOInterposer API and not with the profiler. So we shouldn't work around it in the profiler. I'll attach an alternative patch.
Comment 6•5 years ago
|
||
Comment 7•5 years ago
|
||
However, this does mean that bug 1522497 had the unintended effect of actually giving us IO information on Beta + Release. We now actually make use of the IOInterposer on those channels, if the user starts the profiler, and has the "Main Thread IO" checkbox checked, whereas we just wouldn't have gotten any information on those channels in the past.
Comment 8•5 years ago
|
||
Comment on attachment 9039144 [details] [diff] [review] Tentative patch Review of attachment 9039144 [details] [diff] [review]: ----------------------------------------------------------------- Actually yes, let's land this as a quick workaround while the other patch gets reviewed. It fixes the problem and is harmless.
Comment 9•5 years ago
|
||
requesting checkin-needed for attachment 9039144 [details] [diff] [review], not for the phabricator patch.
Comment 10•5 years ago
|
||
Pushed by ccoroiu@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/21211736651c
ifdef out the profiler calls to IOInterposer on release and beta to match the IOInterposerInit behavior, r=mstange.
Comment 11•5 years ago
|
||
bugherder |
Assignee | ||
Updated•5 years ago
|
Comment 12•5 years ago
|
||
Verified fixed with latest beta simulation: https://treeherder.mozilla.org/#/jobs?repo=try&resultStatus=testfailed%2Cbusted%2Cexception%2Crunnable&revision=b8d3814c073f0624d75a5985476d21002dcf4721
Comment 13•5 years ago
|
||
Needinfo :mstange to make sure you don't forget to open a new bug for your other patch :)
Updated•5 years ago
|
Comment 14•1 year ago
|
||
It seems like a similar patch to the other patch eventually landed in bug 1705579.
Description
•