Fuzzing calls exit() which runs static destructors and can cause MemoryTelemetry to be used after free.
Categories
(Thunderbird :: Testing Infrastructure, task)
Tracking
(Not tracked)
People
(Reporter: jtracey, Assigned: pbone)
References
Details
Attachments
(4 files)
Since bug 1505277 is now wontfix, we should get Thunderbird fuzzing running on Mozilla infrastructure instead.
Comment 1•6 months ago
|
||
It would be good to describe what the next steps are.
As I understand it, fuzzing-capable binaries are now produced by Thunderbird CI.
I assume the next step is to have a task running that fetches those builds, and regularly executes them, captures the result and produces some kind of report.
What's necessary for that?
| Reporter | ||
Comment 2•6 months ago
|
||
It sounds like at least Fuzzfetch support is needed, but how the Mozilla fuzzing infra works beyond that, I don't know.
Comment 3•6 months ago
|
||
Once fuzzfetch support is added the fuzzing team can add the target to our fuzzing infrastructure.
| Reporter | ||
Comment 4•6 months ago
|
||
| Reporter | ||
Comment 5•6 months ago
|
||
Now that the PR is merged, what are the next steps here?
Comment 6•6 months ago
|
||
Thanks for the fuzzzfetch PR!
guided-fuzzing-daemon is used in automation to run the fuzzing targets. AFL and Libfuzzer support will be required. Have a look and let us know if you have any questions.
Comment 7•6 months ago
•
|
||
That said we may be able to get the libfuzzer working. I'll try and let you know.
Comment 8•6 months ago
|
||
We got the infrastructure changes in place but the target is crashing when running replay or corpus merge. These issues will need to be resolved before the target will run properly in automation. This can be reproduced locally by running:
replay (add a single 'a' to test.bin):
FUZZER=MimeDecoder ./thunderbird test.bin
corpus reduce:
FUZZER=MimeDecoder ./thunderbird -merge=1 corpus/ empty/
| Reporter | ||
Comment 9•5 months ago
|
||
I've been looking into this, but haven't made a ton of progress. Here is the output with ASAN's backtraces:
FUZZER=MimeDecoder ./thunderbird test.bin
*** You are running in headless mode.
Running Fuzzer tests...
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 1874693350
INFO: Loaded 2 modules (2478618 inline 8-bit counters): 14715 [0x77262b6534e0, 0x77262b656e5b), 2463903 [0x7726254725d0, 0x7726256cbe6f),
INFO: Loaded 2 PC tables (2478618 PCs): 14715 [0x77262b656e60,0x77262b690610), 2463903 [0x7726256cbe70,0x772627c64860),
./thunderbird: Running 1 inputs 1 time(s) each.
Running: test.bin
Executed test.bin in 47 ms
***
*** NOTE: fuzzing was not performed, you have only
*** executed the target code on a fixed set of inputs.
***
=================================================================
==1254668==ERROR: AddressSanitizer: heap-use-after-free on address 0x77a62da56df8 at pc 0x77260ed0525e bp 0x7ffe405b1630 sp 0x7ffe405b1628
READ of size 1 at 0x77a62da56df8 thread T0
#0 0x77260ed0525d in mozilla::MemoryTelemetry::Poke() /builds/worker/checkouts/gecko/xpcom/base/MemoryTelemetry.cpp:160:8
#1 0x77261083ee60 in XPCJSContext::AfterProcessTask(unsigned int) /builds/worker/checkouts/gecko/js/xpconnect/src/XPCJSContext.cpp:1539:28
#2 0x77260eee1311 in nsThread::ProcessNextEvent(bool, bool*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:1220:24
#3 0x77260eedb349 in NS_ProcessPendingEvents(nsIThread*, unsigned int) /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:426:19
#4 0x77260ec96887 in mozilla::AppShutdown::AdvanceShutdownPhaseInternal(mozilla::ShutdownPhase, bool, char16_t const*, nsCOMPtr<nsISupports> const&) /builds/worker/checkouts/gecko/xpcom/base/AppShutdown.cpp:407:5
#5 0x77261bd3fa2f in ~ScopedXPCOM /builds/worker/checkouts/gecko/tools/fuzzing/interface/harness/FuzzerTestHarness.h:82:7
#6 0x77261bd3fa2f in mozilla::_InitFuzzer::DeinitXPCOM() /builds/worker/checkouts/gecko/tools/fuzzing/interface/harness/FuzzerRunner.cpp:25:23
#7 0x7b262e6488a0 in __run_exit_handlers stdlib/exit.c:118:8
#8 0x7b262e64897d in exit stdlib/exit.c:148:3
#9 0x572ceef5b44c in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /builds/worker/checkouts/gecko/tools/fuzzing/libfuzzer/FuzzerDriver.cpp
#10 0x77261bd3f5fb in mozilla::FuzzerRunner::Run(int*, char***) /builds/worker/checkouts/gecko/tools/fuzzing/interface/harness/FuzzerRunner.cpp:87:13
#11 0x77261bc8bdba in XREMain::XRE_mainStartup(bool*) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:4817:35
#12 0x77261bc9759f in XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:6148:12
#13 0x77261bc98703 in XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:6246:21
#14 0x572ceee718b8 in do_main /builds/worker/checkouts/gecko/comm/mail/app/nsMailApp.cpp:218:22
#15 0x572ceee718b8 in main /builds/worker/checkouts/gecko/comm/mail/app/nsMailApp.cpp:414:16
#16 0x7b262e62a574 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#17 0x7b262e62a627 in __libc_start_main csu/../csu/libc-start.c:360:3
#18 0x572ceed8b9b8 in _start (/home/user/Documents/fuzzfetch/thunderbird-fuzzing3/thunderbird-bin+0xcf9b8) (BuildId: 814d5880744d72c1bafd6f9f161c20cec1ca34c2)
0x77a62da56df8 is located 88 bytes inside of 96-byte region [0x77a62da56da0,0x77a62da56e00)
freed by thread T0 here:
#0 0x572ceee2cab6 in free /builds/worker/fetches/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:51:3
#1 0x77260ed0436d in operator delete /builds/worker/workspace/obj-build/dist/include/mozilla/cxxalloc.h:64:10
#2 0x77260ed0436d in mozilla::MemoryTelemetry::Release() /builds/worker/checkouts/gecko/xpcom/base/MemoryTelemetry.cpp:122:1
#3 0x7b262e6488a0 in __run_exit_handlers stdlib/exit.c:118:8
#4 0x7b262e64897d in exit stdlib/exit.c:148:3
#5 0x572ceef5b44c in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /builds/worker/checkouts/gecko/tools/fuzzing/libfuzzer/FuzzerDriver.cpp
#6 0x77261bd3f5fb in mozilla::FuzzerRunner::Run(int*, char***) /builds/worker/checkouts/gecko/tools/fuzzing/interface/harness/FuzzerRunner.cpp:87:13
#7 0x77261bc8bdba in XREMain::XRE_mainStartup(bool*) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:4817:35
#8 0x77261bc9759f in XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:6148:12
#9 0x77261bc98703 in XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:6246:21
#10 0x572ceee718b8 in do_main /builds/worker/checkouts/gecko/comm/mail/app/nsMailApp.cpp:218:22
#11 0x572ceee718b8 in main /builds/worker/checkouts/gecko/comm/mail/app/nsMailApp.cpp:414:16
#12 0x7b262e62a574 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#13 0x7b262e62a627 in __libc_start_main csu/../csu/libc-start.c:360:3
#14 0x572ceed8b9b8 in _start (/home/user/Documents/fuzzfetch/thunderbird-fuzzing3/thunderbird-bin+0xcf9b8) (BuildId: 814d5880744d72c1bafd6f9f161c20cec1ca34c2)
previously allocated by thread T0 here:
#0 0x572ceee2cd54 in malloc /builds/worker/fetches/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:67:3
#1 0x572ceee823a5 in moz_xmalloc /builds/worker/checkouts/gecko/memory/mozalloc/mozalloc.cpp:51:15
#2 0x77260ed0463e in operator new /builds/worker/workspace/obj-build/dist/include/mozilla/cxxalloc.h:33:10
#3 0x77260ed0463e in mozilla::MemoryTelemetry::Get() /builds/worker/checkouts/gecko/xpcom/base/MemoryTelemetry.cpp:146:17
#4 0x77261083ee58 in XPCJSContext::AfterProcessTask(unsigned int) /builds/worker/checkouts/gecko/js/xpconnect/src/XPCJSContext.cpp:1539:5
#5 0x77260eee1311 in nsThread::ProcessNextEvent(bool, bool*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:1220:24
#6 0x77260eeeb828 in NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:461:10
#7 0x77260acbf476 in bool mozilla::SpinEventLoopUntil<(mozilla::ProcessFailureBehavior)1, mozilla::net::FuzzingStreamListener::waitUntilDone()::'lambda'()>(nsTSubstring<char> const&, mozilla::net::FuzzingStreamListener::waitUntilDone()::'lambda'()&&, nsIThread*) /builds/worker/workspace/obj-build/dist/include/mozilla/SpinEventLoopUntil.h:176:25
#8 0x77260d5b226f in waitUntilDone /builds/worker/workspace/obj-build/dist/include/mozilla/fuzzing/FuzzingStreamListener.h:23:5
#9 0x77260d5b226f in FuzzingMimeDecoder /builds/worker/checkouts/gecko/comm/mailnews/mime/fuzz/TestMimeFuzz.cpp:84:19
#10 0x77260d5b226f in LibFuzzerTestMimeDecoder(unsigned char const*, unsigned long) /builds/worker/checkouts/gecko/comm/mailnews/mime/fuzz/TestMimeFuzz.cpp:89:1
#11 0x572ceef6a346 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /builds/worker/checkouts/gecko/tools/fuzzing/libfuzzer/FuzzerLoop.cpp:619:13
#12 0x572ceef55b68 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /builds/worker/checkouts/gecko/tools/fuzzing/libfuzzer/FuzzerDriver.cpp:335:6
#13 0x572ceef5b2dd in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /builds/worker/checkouts/gecko/tools/fuzzing/libfuzzer/FuzzerDriver.cpp:871:9
#14 0x77261bd3f5fb in mozilla::FuzzerRunner::Run(int*, char***) /builds/worker/checkouts/gecko/tools/fuzzing/interface/harness/FuzzerRunner.cpp:87:13
#15 0x77261bc8bdba in XREMain::XRE_mainStartup(bool*) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:4817:35
#16 0x77261bc9759f in XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:6148:12
#17 0x77261bc98703 in XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:6246:21
#18 0x572ceee718b8 in do_main /builds/worker/checkouts/gecko/comm/mail/app/nsMailApp.cpp:218:22
#19 0x572ceee718b8 in main /builds/worker/checkouts/gecko/comm/mail/app/nsMailApp.cpp:414:16
#20 0x7b262e62a574 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#21 0x7b262e62a627 in __libc_start_main csu/../csu/libc-start.c:360:3
#22 0x572ceed8b9b8 in _start (/home/user/Documents/fuzzfetch/thunderbird-fuzzing3/thunderbird-bin+0xcf9b8) (BuildId: 814d5880744d72c1bafd6f9f161c20cec1ca34c2)
SUMMARY: AddressSanitizer: heap-use-after-free /builds/worker/checkouts/gecko/xpcom/base/MemoryTelemetry.cpp:160:8 in mozilla::MemoryTelemetry::Poke()
It's clear that the problem is the final drain of the event loop is still invoking the memory telemetry after it was already freed, but I don't understand why that's happening or how to prevent it.
Comment 10•5 months ago
|
||
Chris, do you know who might be a good person for us to ask about this?
Comment 11•5 months ago
|
||
I think :pbone is the one most likely to know how best to work out "Fuzzer's deinit of XPCOM is triggering the XPCJSContext to poke an uninit'd MemoryTelemetry reporter"
| Assignee | ||
Comment 12•5 months ago
|
||
Those stacks paint a clear picture. I'll fix it on Monday.
| Assignee | ||
Comment 13•5 months ago
|
||
It seems strange that the fuzzing code would call exit(), this is running destructors for static variables which have no defined execution order and causing the use after free:
#2 0x77260ed0436d in mozilla::MemoryTelemetry::Release() /builds/worker/checkouts/gecko/xpcom/base/MemoryTelemetry.cpp:122:1
#3 0x7b262e6488a0 in __run_exit_handlers stdlib/exit.c:118:8
#4 0x7b262e64897d in exit stdlib/exit.c:148:3
#5 0x572ceef5b44c in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /builds/worker/checkouts/gecko/tools/fuzzing/libfuzzer/FuzzerDriver.cpp
#6 0x77261bd3f5fb in mozilla::FuzzerRunner::Run(int*, char***) /builds/worker/checkouts/gecko/tools/fuzzing/interface/harness/FuzzerRunner.cpp:87:13
#7 0x77261bc8bdba in XREMain::XRE_mainStartup(bool*) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:4817:35
#8 0x77261bc9759f in XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:6148:12
I will fix this use after free but there may be others that get caused by calling exit() without shutting down XPCOM.
| Assignee | ||
Comment 14•5 months ago
|
||
(In reply to Justin Tracey [:jtracey] from comment #9)
I've been looking into this, but haven't made a ton of progress. Here is the output with ASAN's backtraces:
FUZZER=MimeDecoder ./thunderbird test.bin
Hi Justin, I can see this is how to run the test. Can you show me how to build for this test and if there's anything else I need to run it? Or is it best to upload some patches and ask you if you could test them?
Thanks.
Comment 15•5 months ago
|
||
:truber, I vaguely remember we had a similar problem before with libFuzzer calling exit() causing issues with our destructor order? how did we fix that?
Comment 16•5 months ago
|
||
There's _exit() which skips the destructors, I think. Although I remember there's some complication about it? Looking at AppShutdown::DoImmediateExit, on Windows we do something different. Honestly you could just call DoImmediateExit() and maybe that would be best.
| Reporter | ||
Comment 17•5 months ago
|
||
(In reply to Paul Bone [:pbone] from comment #14)
Hi Justin, I can see this is how to run the test. Can you show me how to build for this test and if there's anything else I need to run it? Or is it best to upload some patches and ask you if you could test them?
Thanks.
To run the test, you'd need to build thunderbird with fuzzing enabled (including the gtest build, as documented there). Submitting changes for me to try is fine too though.
(In reply to Andrew McCreight [:mccr8] from comment #16)
There's
_exit()which skips the destructors, I think. Although I remember there's some complication about it? Looking atAppShutdown::DoImmediateExit, on Windows we do something different. Honestly you could just callDoImmediateExit()and maybe that would be best.
We're only fuzzing on Linux for now, FWIW, though longer term that could change. If terminating more quickly is the solution, and ASAN is fine with it, I'm good with that, but I would still like to understand why we need that (and don't in Firefox fuzzing targets, at least from what I can see).
Comment 18•5 months ago
|
||
(In reply to Christian Holler (:decoder) from comment #15)
:truber, I vaguely remember we had a similar problem before with libFuzzer calling
exit()causing issues with our destructor order? how did we fix that?
We fixed this in bug 1695761 by patching a lot of the exit() calls in libfuzzer to return instead. It looks like there's a lot more that could be done.
| Assignee | ||
Comment 19•5 months ago
|
||
MemoryTelemetry was already ref counted, but it also had a static global
that was assumed to be never cleared/freed. This patch makes it
entirely reference counted and places a reference in TelemetryImpl so
that TelemetryImpl can always access it. It also requires threadsafe
reference counting since some of the memory telemetry is collected
off-thread.
| Assignee | ||
Comment 20•5 months ago
|
||
Until telemetry is initialised MemoryTelemetry will consume no memory.
| Assignee | ||
Comment 21•5 months ago
|
||
This shutdown observer flushes telemetry from content processes during
shutdown. It may have been created for memory telemetry but it's more
general.
| Assignee | ||
Comment 22•5 months ago
|
||
Comment 23•5 months ago
|
||
Comment 24•5 months ago
|
||
Comment 25•5 months ago
•
|
||
Backed out for causing mda failures @browser_encrypted_play_time_telemetry.js.
- Backout link
- Push with failures
- Failure Log @browser_encrypted_play_time_telemetry.js.
- Failure Log @browser_contentBlockingTelemetry.js
- Failure Log @browser_sanitization_events.js
| Reporter | ||
Comment 26•3 months ago
|
||
I just realized that only one commit in the stack was reverted, and the rest of the stack seems to be sufficient for our purposes. :tsmith, can we try fuzzing Thunderbird again?
Comment 27•3 months ago
|
||
Thanks for the ni? I have the fuzzer up and running in our infrastructure. I will log any issues that are reported.
| Reporter | ||
Comment 28•3 months ago
|
||
Excellent! I'm going to close this now, since everything in this component is complete. If there's lingering issues that still need to be fixed with MemoryTelemetry, that's probably best tracked in its own issue at this point. Thanks everyone for helping get this over the finish line!
Updated•2 months ago
|
Comment 29•2 months ago
•
|
||
Hello Tyson:
In comment 8 you said you were seeing crashes and fixing the crashes is necessary for the target to run properly in automation.
Early december crash fixes were landed, but they were backed out for causing other failures (see comment 25).
In comment 27 you said, that the fuzzer is correctly running.
Does that mean it is no longer necessary to fix the crashes? Or did the crashes disappear?
Thanks also for your offer to let us know about any issue that you'll see reported.
I'm wondering, is there a way for the Thunderbird team to have a look at logs to confirm the MIME fuzzer is executed as expected?
Is there any help needed from our side to keep fuzzing running?
Comment 30•2 months ago
|
||
(In reply to Kai Engert [:KaiE:] from comment #29)
In comment 8 you said you were seeing crashes and fixing the crashes is necessary for the target to run properly in automation.
Early december crash fixes were landed, but they were backed out for causing other failures (see comment 25).
See bug 2018949, but this does not seem to be affecting our automation runs.
In comment 27 you said, that the fuzzer is correctly running.
Does that mean it is no longer necessary to fix the crashes? Or did the crashes disappear?
Yes, it is running in automation but only runs for 20 minutes before OOMing. This is a separate issue, the memory usage seems to continue to grow over time, perhaps a memory leak in the target? In the short term we can give the machine more memory.
Thanks also for your offer to let us know about any issue that you'll see reported.
Check your access to: https://fuzzmanager.fuzzing.mozilla.org. If you can login we can grant access to Thunderbird reports, but there aren't any at the moment.
I'm wondering, is there a way for the Thunderbird team to have a look at logs to confirm the MIME fuzzer is executed as expected?
Nothing beyond checking locally at this time.
Is there any help needed from our side to keep fuzzing running?
Investigating if there is indeed a memory leak in the target and getting that resolved would allow for full fuzzing runs.
| Assignee | ||
Comment 31•2 months ago
|
||
Fixing this now. I see the problem. The observer was holding a weak reference to the telemetry class. it needs a strong reference to live long enough to get the content-child-shutdown message.
| Assignee | ||
Updated•2 months ago
|
| Assignee | ||
Updated•2 months ago
|
Comment 33•2 months ago
|
||
Comment 34•2 months ago
|
||
| bugherder | ||
https://hg.mozilla.org/mozilla-central/rev/c41276854344
https://hg.mozilla.org/mozilla-central/rev/0eac3bbf674a
https://hg.mozilla.org/mozilla-central/rev/335b94f24d1a
https://hg.mozilla.org/mozilla-central/rev/4621abdc51fb
Description
•