[meta] Make TSan (ThreadSanitizer) usable with Firefox
Categories
(Core :: Sanitizers, task)
Tracking
()
People
(Reporter: decoder, Unassigned)
References
(Depends on 49 open bugs)
Details
(Keywords: meta, sec-want)
Comment 1•6 years ago
Chiming in on this bug; please direct me to a more appropriate one if it exists.
I've noticed that tsan is currently running on the linux64 docker image. That image has been phased out for the most part on mozilla-central, with tests now running on the ubuntu1804-test docker image.
I have a push where I made the necessary changes to have tsan run on the ubuntu1804-test docker image, and I see some failures:
https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&tier=1%2C2%2C3&revision=e41b65c04476a324e12f6699f2bb7b6b3fbed695
xpcshell
- The X2 chunk reports a bunch of TEST-UNEXPECTED-FAIL that I assume are legitimate failures (i.e. not slow tests).
mochitest-plain
- A persistent failure of toolkit/content/tests/widgets/test_videocontrols.html can be seen in chunk 4.
- Multiple failures are observed in chunk 5; retriggers are underway.
I'd like to have all of tsan migrated over to use ubuntu1804 before the list of test suites grows. Should I skip all of the failing tests with skip-if = tsan?
Comment 2•2 months ago
I am not sure if this is the proper bug, but I thought it would be seen by many who try to fix TSAN issues.
Hi, I have been trying to detect and fix possible thread races in Thunderbird.
I use a TSAN build made with clang locally.
However, I use the ordinary GCC toolchain for memory checks using valgrind.
So in a sense I am using a mixed toolchain, with different object directories for different targets (ordinary, ASAN, TSAN, etc.).
With GCC-15 (and possibly GCC-14), I noticed the following type of warning in my local xpcshell test.
As you can see, such warnings are extremely useful for detecting a possible deadlock, which I may have introduced by inserting monitor calls too eagerly, without checking for cyclic locking, to fix TSAN issues found by the TSAN version run locally.
14:46.60 INFO xpcshell-maildir.toml:comm/mailnews/imap/test/unit/test_dontStatNoSelect.js | Starting endTest
14:46.60 INFO (xpcshell/head.js) | test endTest pending (2)
14:46.60 PASS endTest - [endTest : 113] 2 == 2
14:46.60 pid:2637614 {debug} SetSpec succeeded. : aSpec=imap://user@localhost:34595/folderstatus%3E/folder%201
14:46.60 pid:2637614 {debug} SetSpec succeeded. : aSpec=moz-nullprincipal:{f53034e1-92a5-481a-9bf7-3d8b0f72cdb0}
14:46.61 INFO (xpcshell/head.js) | test run_next_test 3 finished (2)
14:46.61 INFO (xpcshell/head.js) | test run_next_test 4 pending (2)
14:46.61 INFO (xpcshell/head.js) | test endTest finished (2)
14:46.61 INFO (xpcshell/head.js) | test run_next_test 4 finished (1)
14:46.61 INFO exiting test
14:46.63 pid:2637614 ###!!! ERROR: Potential deadlock detected:
14:46.63 pid:2637614 === Cyclical dependency starts at
14:46.63 pid:2637614 --- ReentrantMonitor : nsImapProtocol.mMonitor (currently acquired)
14:46.63 pid:2637614 calling context
Initializing stack-fixing for the first stack frame, this may take a while...
15:21.28 pid:2637614 #01: mozilla::ReentrantMonitor::Enter() (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/xpcom/threads/BlockingResourceBase.cpp:448)
15:21.28 pid:2637614 #02: nsImapProtocol::TellThreadToDie() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:1348)
15:21.28 pid:2637614 #03: nsImapProtocol::CreateNewLineFromSocket() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:5240)
15:21.29 pid:2637614 #04: nsImapServerResponseParser::GetNextLineForParser(char**) (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapServerResponseParser.cpp:99)
15:21.30 pid:2637614 #05: nsImapGenericParser::AdvanceToNextLine() (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapGenericParser.cpp:120)
15:21.31 pid:2637614 #06: nsImapGenericParser::AdvanceToNextToken() (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapGenericParser.cpp:95)
15:21.31 pid:2637614 #07: nsImapServerResponseParser::ParseIMAPServerResponse(char const*, bool, char*) (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapServerResponseParser.cpp:206)
15:21.31 pid:2637614 #08: nsImapProtocol::AuthLogin(char const*, nsTString<char16_t> const&, unsigned long) [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:6294)
15:21.31 pid:2637614 #09: nsImapProtocol::TryToLogon() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:8914)
15:21.31 pid:2637614 #10: mozilla::detail::RunnableFunction<nsImapProtocol::ProcessCurrentURL()::{lambda()#1}>::Run() (/NEW-SSD/moz-obj-dir/objdir-tb3/dist/include/nsThreadUtils.h:549)
15:21.31 pid:2637614 --- Next dependency:
15:21.31 pid:2637614 --- ReentrantMonitor : imapThreadDeath (currently acquired)
15:21.31 pid:2637614 calling context
15:21.31 pid:2637614 #01: mozilla::ReentrantMonitor::Enter() (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/xpcom/threads/BlockingResourceBase.cpp:450)
15:21.31 pid:2637614 #02: nsImapProtocol::CreateNewLineFromSocket() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:5239)
15:21.31 pid:2637614 #03: nsImapServerResponseParser::GetNextLineForParser(char**) (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapServerResponseParser.cpp:99)
15:21.31 pid:2637614 #04: nsImapGenericParser::AdvanceToNextLine() (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapGenericParser.cpp:120)
15:21.31 pid:2637614 #05: nsImapGenericParser::AdvanceToNextToken() (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapGenericParser.cpp:95)
15:21.31 pid:2637614 #06: nsImapServerResponseParser::ParseIMAPServerResponse(char const*, bool, char*) (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapServerResponseParser.cpp:206)
15:21.31 pid:2637614 #07: nsImapProtocol::AuthLogin(char const*, nsTString<char16_t> const&, unsigned long) [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:6294)
15:21.31 pid:2637614 #08: nsImapProtocol::TryToLogon() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:8914)
15:21.31 pid:2637614 #09: mozilla::detail::RunnableFunction<nsImapProtocol::ProcessCurrentURL()::{lambda()#1}>::Run() (/NEW-SSD/moz-obj-dir/objdir-tb3/dist/include/nsThreadUtils.h:549)
15:21.31 pid:2637614 #10: ??? (/NEW-SSD/moz-obj-dir/objdir-tb3/dist/bin/libxul.so + 0xbe488f3)
15:21.31 pid:2637614 === Cycle completed at
15:21.31 pid:2637614 --- ReentrantMonitor : nsImapProtocol.mMonitor (currently acquired)
15:21.31 pid:2637614 calling context
15:21.31 pid:2637614 #01: mozilla::ReentrantMonitor::Enter() (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/xpcom/threads/BlockingResourceBase.cpp:448)
15:21.31 pid:2637614 #02: nsImapProtocol::TellThreadToDie() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:1348)
15:21.31 pid:2637614 #03: nsImapProtocol::CreateNewLineFromSocket() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:5240)
15:21.31 pid:2637614 #04: nsImapServerResponseParser::GetNextLineForParser(char**) (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapServerResponseParser.cpp:99)
15:21.31 pid:2637614 #05: nsImapGenericParser::AdvanceToNextLine() (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapGenericParser.cpp:120)
15:21.31 pid:2637614 #06: nsImapGenericParser::AdvanceToNextToken() (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapGenericParser.cpp:95)
15:21.31 pid:2637614 #07: nsImapServerResponseParser::ParseIMAPServerResponse(char const*, bool, char*) (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapServerResponseParser.cpp:206)
15:21.31 pid:2637614 #08: nsImapProtocol::AuthLogin(char const*, nsTString<char16_t> const&, unsigned long) [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:6294)
15:21.31 pid:2637614 #09: nsImapProtocol::TryToLogon() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:8914)
15:21.31 pid:2637614 #10: mozilla::detail::RunnableFunction<nsImapProtocol::ProcessCurrentURL()::{lambda()#1}>::Run() (/NEW-SSD/moz-obj-dir/objdir-tb3/dist/include/nsThreadUtils.h:549)
15:21.31 pid:2637614 ###!!! Deadlock may happen NOW!
15:21.31 pid:2637614 \x07[Parent 2637614, IMAP] ###!!! ASSERTION: Potential deadlock detected:
15:21.31 pid:2637614 Cyclical dependency starts at
15:21.31 pid:2637614 ReentrantMonitor : nsImapProtocol.mMonitor (currently acquired)
15:21.31 pid:2637614 #01: mozilla::ReentrantMonitor::Enter() (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/xpcom/threads/BlockingResourceBase.cpp:448)
15:21.31 pid:2637614 #02: nsImapProtocol::TellThreadToDie() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:1348)
15:21.31 pid:2637614 #03: nsImapProtocol::CreateNewLineFromSocket() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:5240)
15:21.31 pid:2637614 #04: nsImapServerResponseParser::GetNextLineForParser(char**) (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapServerResponseParser.cpp:99)
15:21.31 pid:2637614 #05: nsImapGenericParser::AdvanceToNextLine() (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapGenericParser.cpp:120)
15:21.31 pid:2637614 #06: nsImapGenericParser::AdvanceToNextToken() (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapGenericParser.cpp:95)
15:21.31 pid:2637614 #07: nsImapServerResponseParser::ParseIMAPServerResponse(char const*, bool, char*) (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapServerResponseParser.cpp:206)
15:21.31 pid:2637614 #08: ???[/NEW-SSD/moz-obj-dir/objdir-tb3/dist/bin/libxul.so +0x
15:21.32 pid:2637614 #01: NS_DebugBreak.cold (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/xpcom/base/nsDebugImpl.cpp:511)
15:21.32 pid:2637614 #02: mozilla::BlockingResourceBase::CheckAcquire() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/xpcom/threads/BlockingResourceBase.cpp:279)
15:21.32 pid:2637614 #03: mozilla::ReentrantMonitor::Enter() (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/xpcom/threads/BlockingResourceBase.cpp:448)
15:21.32 pid:2637614 #04: nsImapProtocol::TellThreadToDie() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:1348)
15:21.32 pid:2637614 #05: nsImapProtocol::CreateNewLineFromSocket() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:5240)
15:21.32 pid:2637614 #06: nsImapServerResponseParser::GetNextLineForParser(char**) (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapServerResponseParser.cpp:99)
15:21.32 pid:2637614 #07: nsImapGenericParser::AdvanceToNextLine() (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapGenericParser.cpp:120)
15:21.32 pid:2637614 #08: nsImapGenericParser::AdvanceToNextToken() (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapGenericParser.cpp:95)
15:21.32 pid:2637614 #09: nsImapServerResponseParser::ParseIMAPServerResponse(char const*, bool, char*) (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapServerResponseParser.cpp:206)
15:21.32 pid:2637614 #10: nsImapProtocol::AuthLogin(char const*, nsTString<char16_t> const&, unsigned long) [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:6294)
15:21.32 pid:2637614 #11: nsImapProtocol::TryToLogon() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:8914)
15:21.32 pid:2637614 #12: mozilla::detail::RunnableFunction<nsImapProtocol::ProcessCurrentURL()::{lambda()#1}>::Run() (/NEW-SSD/moz-obj-dir/objdir-tb3/dist/include/nsThreadUtils.h:549)
15:21.32 pid:2637614 #13: ??? (/NEW-SSD/moz-obj-dir/objdir-tb3/dist/bin/libxul.so + 0xbe488f3)
15:21.32 pid:2637614 #14: nsImapProtocol::ProcessCurrentURL() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:2063)
15:21.32 pid:2637614 #15: nsImapProtocol::ImapThreadMainLoop() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:1607)
15:21.32 pid:2637614 #16: nsImapProtocol::RunImapThreadMainLoop() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:1124)
15:21.32 pid:2637614 #17: nsImapProtocolMainLoopRunnable::Run() (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/imap/src/nsImapProtocol.cpp:460)
15:21.32 pid:2637614 #18: nsThread::ProcessNextEvent(bool, bool*) [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/xpcom/threads/nsThread.cpp:1159)
15:21.32 pid:2637614 #19: NS_ProcessNextEvent(nsIThread*, bool) (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/xpcom/threads/nsThreadUtils.cpp:461)
15:21.32 pid:2637614 #20: mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/ipc/glue/MessagePump.cpp:299)
15:21.32 pid:2637614 #21: MessageLoop::Run() [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/ipc/chromium/src/base/message_loop.cc:344)
15:21.32 pid:2637614 #22: nsThread::ThreadFunc(void*) [clone .cold] (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/xpcom/threads/nsThread.cpp:375)
15:21.34 pid:2637614 #23: _pt_root (/NEW-SSD/NREF-COMM-CENTRAL/mozilla/nsprpub/pr/src/pthreads/ptthread.c:198)
15:21.37 pid:2637614 #24: ??? (/lib/x86_64-linux-gnu/libc.so.6 + 0x92b7b)
15:21.37 pid:2637614 #25: ??? (/lib/x86_64-linux-gnu/libc.so.6 + 0x1107b8)
15:21.37 pid:2637614 #26: ??? (???:???)
There are definitely some data race issues in the IMAP buffer handling code in Thunderbird, and it seems that
I inserted a monitor call inappropriately. This warning helps me track down the strange timeout issue which I found locally and on treeherder.
But I have absolutely no idea which compile option or which libraries used during linking were responsible for generating these runtime warnings.
I saw this type of message once while using gcc-14 about a year ago, but had not seen it again until about three days ago.
Now I realize that I may have mixed up ordinarily compiled objects (built with GCC for a non-TSAN build) and linked them with libraries that are meant for the TSAN build.
But if so, that accidental mistake may have somehow produced very nice runtime warning messages.
My question:
Has anyone seen similar messages before, and if so, can you recall how you compile and link the binaries?
I feel that this deadlock detection at runtime is SO USEFUL and ought to be part of regular testing.
TSAN does not seem to detect this, if I am not mistaken (but it may do so). I recently updated the clang support libraries using mach bootstrap,
because after a couple of months around December in which I could not develop and test, my local build failed and I had to do some serious code cleanup, with a few false starts, after Jan 10th. Then I noticed these nice runtime warning messages,
which I once saw and could not reproduce until a few days ago. This suggests perhaps updated C++/C runtime libraries (possibly for TSAN use, which I may have linked accidentally into my normal build).
If there is a better place to ask these questions, I will welcome any pointers to such places.
TIA
Comment 3•2 months ago
(In reply to ISHIKAWA, Chiaki from comment #2)
I am not sure if this is the proper bugzilla, but I thought this is seen by many who try to fix TSAN issues. ... [omission] ...
My question:
Has anyone seen similar messages before, and if so, can you recall how you compile and link the binaries?
Sorry for the noise.
After searching Google for "ASSERTION: Potential deadlock detected: Cyclical dependency starts at", I now realize that the particular message
I quoted in comment 2 comes from XPCOM code in mozilla-central, I mean firefox-main.
https://searchfox.org/firefox-main/source/xpcom/threads/BlockingResourceBase.cpp#211
So that check probably detected the dormant deadlock which I introduced in November last year and had not noticed until now.
The similar deadlock detection code in
https://searchfox.org/firefox-main/source/xpcom/threads/BlockingResourceBase.cpp#265
probably generated the message that I saw about a year ago, and after a code modification that problem must have disappeared.
This scenario fits my observation of the deadlock warning one year ago and again a few days ago. (It could be that the first code path detected the
error last year as well; I often lag behind source code updates so that I can chase a bug in a known source tree for an extended period.)
Now that I understand there is an XPCOM runtime check,
I can insert monitor calls rather eagerly to fix TSAN issues and let the runtime code detect potential deadlocks.
Not ideal, but very workable for fixing legacy Thunderbird IMAP code.
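For readers unfamiliar with this kind of check, here is a minimal sketch of the general idea behind a lock-order deadlock detector: record which lock was taken while another was held, and warn when a new acquisition would close a cycle. This is only an illustration in Python under my own assumptions; the class LockOrderChecker and its methods are hypothetical names and are not the BlockingResourceBase implementation, whose details may differ.

```python
from collections import defaultdict


class LockOrderChecker:
    """Hypothetical illustration of lock-order cycle detection.

    Not the XPCOM code: it only records acquisition order between named
    locks and reports when a new acquisition would close a cycle.
    """

    def __init__(self):
        # edges[a] contains b when some thread acquired b while holding a.
        self.edges = defaultdict(set)
        # held[thread_id] lists the locks that thread currently holds.
        self.held = defaultdict(list)

    def _path(self, start, goal, seen=None):
        """Return the chain of locks leading from start to goal, or None."""
        if seen is None:
            seen = set()
        if start == goal:
            return [start]
        seen.add(start)
        for nxt in sorted(self.edges[start]):
            if nxt not in seen:
                rest = self._path(nxt, goal, seen)
                if rest is not None:
                    return [start] + rest
        return None

    def acquire(self, thread_id, lock):
        """Record an acquisition; return the cycle it closes, or None."""
        cycle = None
        for holder in self.held[thread_id]:
            if holder == lock:
                continue  # re-entering the same monitor adds no ordering
            # The new edge holder -> lock closes a cycle if lock already
            # reaches holder through previously recorded edges.
            path = self._path(lock, holder)
            if path is not None and cycle is None:
                cycle = path + [lock]
            self.edges[holder].add(lock)
        self.held[thread_id].append(lock)
        return cycle

    def release(self, thread_id, lock):
        self.held[thread_id].remove(lock)


# One code path takes mMonitor and then imapThreadDeath ...
checker = LockOrderChecker()
checker.acquire("t1", "nsImapProtocol.mMonitor")
checker.acquire("t1", "imapThreadDeath")
checker.release("t1", "imapThreadDeath")
checker.release("t1", "nsImapProtocol.mMonitor")

# ... and a later one takes them in the opposite order.
checker.acquire("t2", "imapThreadDeath")
print(checker.acquire("t2", "nsImapProtocol.mMonitor"))
# ['nsImapProtocol.mMonitor', 'imapThreadDeath', 'nsImapProtocol.mMonitor']
```

Note that a check of this kind reports the inconsistent ordering even when no thread actually blocks during the run, which would explain why the warning can show up in a test that otherwise passes.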
One more point: the TSAN runtime library itself may introduce enough timing variance to mask or cause timing-related issues.
I checked my local log files and found that the message appears in a couple of log files produced by TSAN-build xpcshell test runs on April 2, 2025.
I ran three tests on that day; two log files recorded the potential deadlock and the other did not. I am not sure whether I fixed the problem by modifying the code. TSAN issues are hard to eliminate, but this runtime check for potential deadlocks is very useful.
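Checking which saved logs contain the warning can be scripted. The following is a rough sketch, assuming the logs sit in one directory and end in .log; the function name logs_with_marker, the directory layout, and the file pattern are my own assumptions, not anything from the build system.

```python
from pathlib import Path

# Text that appears in the warning quoted in comment 2.
MARKER = "Potential deadlock detected"


def logs_with_marker(log_dir, marker=MARKER, pattern="*.log"):
    """Return the names of log files under log_dir that contain marker."""
    hits = []
    for path in sorted(Path(log_dir).glob(pattern)):
        # Logs may contain stray bytes, so replace undecodable ones.
        text = path.read_text(encoding="utf-8", errors="replace")
        if marker in text:
            hits.append(path.name)
    return hits
```

Calling logs_with_marker on a directory of xpcshell logs returns the file names that recorded the warning, which makes it easier to compare runs from the same day.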
I was led to believe that this message was NOT produced in a TSAN build, but I was probably wrong. It was just a coincidence that I did not see it during TSAN testing (due to timing differences, or because the culprit had been fixed incidentally).
Again, I am sorry for the noise, but when I pasted the whole message lines into Google search nothing showed up; only after I trimmed the message lines did meaningful results appear.
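The trimming can be done mechanically. Below is a small sketch, assuming log lines shaped like the ones quoted in comment 2 (a "MM:SS.ss pid:N" prefix, decoration such as "###!!!", and "#NN:" stack frames); the function name searchable_message and the regular expressions are my own assumptions and may need adjusting for other log formats.

```python
import re

# "MM:SS.ss pid:NNN " prefix as seen in the log lines quoted in comment 2.
PREFIX = re.compile(r"^\s*\d+:\d+\.\d+\s+(?:pid:\d+\s+)?")
# Decoration that precedes the message text on some lines.
DECORATION = re.compile(r"^(?:###!!!|===|---)\s*")
# Stack-frame lines such as "#01: ...", which are specific to one build.
FRAME = re.compile(r"^#\d+:")


def searchable_message(log_lines):
    """Join the generic parts of a log excerpt into one search string."""
    kept = []
    for line in log_lines:
        text = PREFIX.sub("", line).strip()
        text = DECORATION.sub("", text)
        if not text or FRAME.match(text) or text == "calling context":
            continue
        kept.append(text)
    return " ".join(kept)


excerpt = [
    "14:46.63 pid:2637614 ###!!! ERROR: Potential deadlock detected:",
    "14:46.63 pid:2637614 === Cyclical dependency starts at",
    "15:21.28 pid:2637614 #01: mozilla::ReentrantMonitor::Enter() (...)",
]
print(searchable_message(excerpt))
# ERROR: Potential deadlock detected: Cyclical dependency starts at
```

Lines naming a specific monitor are kept by this sketch, so it is worth dropping those by hand before searching if the goal is to find where the generic message comes from.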