Heap-use-after-free race condition in blob objectURL loading
Categories
(Core :: Networking, defect, P2)
Tracking
()
People
(Reporter: attekett, Assigned: smaug)
References
Details
(4 keywords, Whiteboard: [necko-triaged][necko-priority-review][post-critsmash-triage][adv-main107+][adv-esr102.5+])
Crash Data
Attachments
(6 files)
88.51 KB,
text/html
|
Details | |
22.41 KB,
text/plain
|
Details | |
23.00 KB,
text/plain
|
Details | |
4.30 KB,
text/plain
|
Details | |
48 bytes,
text/x-phabricator-request
|
dmeehan
:
approval-mozilla-beta+
dmeehan
:
approval-mozilla-esr102+
tjr
:
sec-approval+
|
Details | Review |
219 bytes,
text/plain
|
Details |
Tested on:
OS: Ubuntu 22.04
Firefox:
:$ fuzzfetch --target firefox --os Linux --asan --fuzzing -n firefox
> Identified task: https://firefox-ci-tc.services.mozilla.com/api/index/v1/task/gecko.v2.mozilla-central.latest.firefox.linux64-fuzzing-asan-opt
> Task ID: IboI9AdYS7Owkt5NGknp5Q
> Rank: 1663450354
> Changeset: 09d4c0163454293e8431301cf5799d6ed96164f2
> Build ID: 20220917213234
> Downloading: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/IboI9AdYS7Owkt5NGknp5Q/artifacts/public/build/target.tar.bz2 (470.55MiB total)
Reprofile 'loader.html' contains a large HTML string that is converted to a blob and to an objectURL, which is loaded to an iframe. The HTML string has a script that changes top.location to point to document.location from within the iframe. There seems to be a race condition where the browser sometimes ends up to use-after-free when trying to loading the objectURL
as main page.
Easiest way to reproduce is by using ffpuppet and prefpicker. In my tests prefpicker was not mandatory, but tested that the issue also reproduces with fuzzing prefs.
$ ffpuppet -d -p ./prefs.js -u ./loader.html ./firefox/firefox
Few things to note:
* If you want to reproduce the issue by loading as a single file, you may need to reload the file couple of times. In my testing VM the issue reproduced 9/10 tries. If you are having
hard time reproducing, you can run python3 -m http.server on the reprofile directory. The reprofile already has redirection to top.location="http://127.0.0.1:8000/loader.html" for the http server.
Note that using the http server slows the process down and makes the issue slower to reproduce.
* Depending on timing the issue can reproduce as a heap-use-after-free with different stack traces, or as a ubsan "runtime error: applying non-zero offset 4096 to null pointer", few different
ASAN traces as an attachments. On my machine the used HTML string had high frequency for use-after-free.
* Outside of the script to load top.location, the content of the HTML string doesn't really matter. The race seems to be easier to reproduce with a larger and more complex HTML strings.
ASAN-trace summaries:
==469580==ERROR: AddressSanitizer: heap-use-after-free on address 0x6210006028a9 at pc 0x7fb55122eaee bp 0x7fb5221afcc0 sp 0x7fb5221afcb8
READ of size 1 at 0x6210006028a9 thread T19
#0 0x7fb55122eaed in encoding_rs::utf_8::convert_utf8_to_utf16_up_to_invalid::h9a48addeb16fe1c4 /builds/worker/checkouts/gecko/third_party/rust/encoding_rs/src/utf_8.rs:304:46
#1 0x7fb551306e92 in encoding_rs::handles::Utf16Destination::copy_utf8_up_to_invalid_from::h87b18c80f0f13c1c /builds/worker/checkouts/gecko/third_party/rust/encoding_rs/src/handles.rs:724:31
#2 0x7fb551306e92 in encoding_rs::utf_8::Utf8Decoder::decode_to_utf16_raw::hce13a9354eb98962 /builds/worker/checkouts/gecko/third_party/rust/encoding_rs/src/utf_8.rs:506:17
==470424==ERROR: AddressSanitizer: heap-use-after-free on address 0x62100060fadd at pc 0x7f86daf04c77 bp 0x7f86abe81cc0 sp 0x7f86abe81cb8
READ of size 16 at 0x62100060fadd thread T19
#0 0x7f86daf04c76 in core::intrinsics::copy_nonoverlapping::h7cef3b9efd772203 /builds/worker/fetches/rust/library/core/src/intrinsics.rs:2137:9
#1 0x7f86daf04c76 in encoding_rs::simd_funcs::load16_unaligned::hae46f61ffef24749 /builds/worker/checkouts/gecko/third_party/rust/encoding_rs/src/simd_funcs.rs:20:5
#2 0x7f86daf04c76 in encoding_rs::ascii::ascii_to_basic_latin_stride_neither_aligned::hc02e43498f051222 /builds/worker/checkouts/gecko/third_party/rust/encoding_rs/src/ascii.rs:799:24
#3 0x7f86daf04c76 in encoding_rs::ascii::ascii_to_basic_latin::hd2cb5a3ac8e1a55e /builds/worker/checkouts/gecko/third_party/rust/encoding_rs/src/ascii.rs:457:25
UBSAN-trace:
/builds/worker/checkouts/gecko/netwerk/base/nsBufferedStreams.cpp:440:27: runtime error: applying non-zero offset 4096 to null pointer
#0 0x7f38d08a857a in nsBufferedInputStream::ReadSegments(nsresult (*)(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*), void*, unsigned int, unsigned int*) /builds/worker/checkouts/gecko/netwerk/base/nsBufferedStreams.cpp:440:27
#1 0x7f38d2495600 in nsHtml5StreamParser::OnDataAvailable(nsIRequest*, nsIInputStream*, unsigned long, unsigned int) /builds/worker/checkouts/gecko/parser/html/nsHtml5StreamParser.cpp:1661:21
#2 0x7f38d24952ea in nsHtml5StreamListener::OnDataAvailable(nsIRequest*, nsIInputStream*, unsigned long, unsigned int) /builds/worker/checkouts/gecko/parser/html/nsHtml5StreamListener.cpp:91:9
#3 0x7f38d08a2caf in nsBaseChannel::OnDataAvailable(nsIRequest*, nsIInputStream*, unsigned long, unsigned int) /builds/worker/checkouts/gecko/netwerk/base/nsBaseChannel.cpp:876:28
Reporter | ||
Comment 1•2 years ago
|
||
Reporter | ||
Comment 2•2 years ago
|
||
Reporter | ||
Comment 3•2 years ago
|
||
Updated•2 years ago
|
Comment 4•2 years ago
|
||
I moved this to DOM: File because of the blob stuff, but looking at the ASan stacks, maybe it is a nsBufferedStream issue? It looks like the parser thread is trying to read from an nsBufferedStream that was deleted by the main thread.
Comment 5•2 years ago
|
||
Nika, you've looked at nsBufferedStream this year. Do you have any ideas? Thanks.
Comment 6•2 years ago
|
||
I think that this is a bug in the nsInputStreamPump. In general for an arbitrary nsIInputStream
, it is not safe to close the stream from a different thread than you are reading it from. When nsInputStreamPump
is retargeted off-main-thread though, we'll be reading from the stream on a different thread (in this case the HTML parser thread), and then we could end up cancelling that pump from the main thread.
We should make sure that when a nsInputStreamPump
is cancelled, we dispatch the actual event to CloseWithStatus(...)
the underlying mAsyncStream
to the target which the runnable has been retargeted to.
Comment 8•2 years ago
|
||
nsInputStreamPump looks like networking, so I'll move it there.
Updated•2 years ago
|
Comment 9•2 years ago
|
||
Olli, let me know if you are able to work on this. Thanks!
Comment 10•2 years ago
|
||
The severity field for this bug is set to S3. However, the bug is flagged with the sec-high
keyword.
:smaug, could you consider increasing the severity of this security bug?
For more information, please visit auto_nag documentation.
Assignee | ||
Comment 11•2 years ago
|
||
Yeah, I was thinking this might be an interesting bug to take a look :)
Assignee | ||
Comment 12•2 years ago
|
||
Now that I'm back in Finland, I'm planning to take a look at this.
Assignee | ||
Comment 13•2 years ago
|
||
I'm having hard time to reproduce.
Using python3 -m http.server doesn't seem to help, though I don't think anything is even accessed from the server.
I get a crash, but it is something like
==1662399==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address
...
==1662399==The signal is caused by a READ memory access.
Reporter | ||
Comment 14•2 years ago
|
||
Weird.
I'm traveling, by train, so I'm currently behind semi-stable internet, for the rest of the week I'll with very limited access to my machines.
Did a quick test with new build, out of three tries, 2 crashed, one uaf, one ubsan applying non-zero offset 4096 to null pointer.
OS: Ubuntu 22.04.1, 64-bit
FWIW, for me it also reproduces without ffpuppet, with empty profile directory.
:~/Downloads$ fuzzfetch --target firefox --os Linux --asan --fuzzing -n firefox
[2022-10-11 05:10:23] Identified task: https://firefox-ci-tc.services.mozilla.com/api/index/v1/task/gecko.v2.mozilla-central.latest.firefox.linux64-fuzzing-asan-opt
[2022-10-11 05:10:23] > Task ID: C7gynEqcRdir2e0aFVA_Kw
[2022-10-11 05:10:23] > Rank: 1665480728
[2022-10-11 05:10:23] > Changeset: 5cbd3d92a78c54b324b6009a25d196adaa8a669b
[2022-10-11 05:10:23] > Build ID: 20221011093208
[2022-10-11 05:10:25] > Downloading: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/C7gynEqcRdir2e0aFVA_Kw/artifacts/public/build/target.tar.bz2 (469.40MiB total)
[2022-10-11 05:10:54] .. still downloading (13.4%, 2.16MB/s)
[2022-10-11 05:11:32] .. still downloading (20.2%, 1.45MB/s)
[2022-10-11 05:12:02] .. still downloading (44.7%, 2.23MB/s)
[2022-10-11 05:12:33] .. still downloading (69.0%, 2.63MB/s)
[2022-10-11 05:13:03] .. still downloading (73.5%, 2.27MB/s)
[2022-10-11 05:13:35] .. still downloading (88.2%, 2.27MB/s)
[2022-10-11 05:14:05] .. still downloading (90.3%, 2.00MB/s)
[2022-10-11 05:14:36] .. still downloading (98.2%, 1.91MB/s)
[2022-10-11 05:14:43] .. downloaded (1.90MB/s)
[2022-10-11 05:14:43] .. extracting
[2022-10-11 05:17:06] Extracted into /Downloads/firefox
:~/Downloads$ ./firefox/firefox ./test.html
Crash Annotation GraphicsCriticalError: |[0][GFX1-]: glxtest: VA-API test failed: missing or old libva-drm library. (t=1.23202) [GFX1-]: glxtest: VA-API test failed: missing or old libva-drm library.
=================================================================
==105749==ERROR: AddressSanitizer: heap-use-after-free on address 0x6210006236a0 at pc 0x7f789ee87b17 bp 0x7f786fdb9cc0 sp 0x7f786fdb9cb8
READ of size 16 at 0x6210006236a0 thread T19
#0 0x7f789ee87b16 in core::intrinsics::copy_nonoverlapping::hac5a3c6acb4aab2b /builds/worker/fetches/rust/library/core/src/intrinsics.rs:2472:9
#1 0x7f789ee87b16 in encoding_rs::simd_funcs::load16_unaligned::heff1dcef0cd36c7a /builds/worker/checkouts/gecko/third_party/rust/encoding_rs/src/simd_funcs.rs:20:5
#2 0x7f789ee87b16 in encoding_rs::ascii::ascii_to_basic_latin_stride_neither_aligned::hc572d6c94b27a574 /builds/worker/checkouts/gecko/third_party/rust/encoding_rs/src/ascii.rs:799:24
#3 0x7f789ee87b16 in encoding_rs::ascii::ascii_to_basic_latin::h818f601a9356b7d9 /builds/worker/checkouts/gecko/third_party/rust/encoding_rs/src/ascii.rs:457:25
#4 0x7f789ee87b16 in encoding_rs::utf_8::convert_utf8_to_utf16_up_to_invalid::ha78a7c2c3884e3d8 /builds/worker/checkouts/gecko/third_party/rust/encoding_rs/src/utf_8.rs:241:17
#5 0x7f789ef60452 in encoding_rs::handles::Utf16Destination::copy_utf8_up_to_invalid_from::h8a3beb6ab24ab8f0 /builds/worker/checkouts/gecko/third_party/rust/encoding_rs/src/handles.rs:724:31
#6 0x7f789ef60452 in encoding_rs::utf_8::Utf8Decoder::decode_to_utf16_raw::hbdb7fbe2ca8bc21a /builds/worker/checkouts/gecko/third_party/rust/encoding_rs/src/utf_8.rs:506:17
Assignee | ||
Comment 15•2 years ago
|
||
ok, thanks, found a way to reproduce.
ffpuppet was confusing me, and I don't need that here.
Assignee | ||
Comment 16•2 years ago
|
||
FWIW, if the following unlock is removed, the crash is gone (but there are then warnings about potential deadlocks)
https://searchfox.org/mozilla-central/rev/e94c6cb9649bfe4e6a3888460f41bcd4fe30a6ca/netwerk/base/nsInputStreamPump.cpp#549
Comment 17•2 years ago
|
||
That's because mAsyncStream should be protected by mMutex.
Could you see if something like this works instead?
nsCOMPtr<nsIAsyncInputStream> tempStream = mAsyncStream;
RecursiveMutexAutoUnlock unlock(mMutex);
...
rv = mListener->OnDataAvailable(this, tempStream, mStreamOffset,
odaAvail);
MOZ_POP_THREAD_SAFETY
Assignee | ||
Comment 18•2 years ago
|
||
That wouldn't help since the issue is that we release the data on nsBufferedStream.
Not unlocking would help just because it would prevent one to trigger the cancel call which releases the data.
(and I tested it doesn't help).
I'm just trying to figure out some least regression risky way to keep the data alive long enough
Assignee | ||
Comment 19•2 years ago
|
||
Assignee | ||
Comment 20•2 years ago
|
||
Comment on attachment 9299876 [details]
Bug 1791314, some underlying streams prefer being closed on the target thread, r=valentin
Security Approval Request
- How easily could an exploit be constructed based on the patch?: I think the patch does pin point rather badly where the issue is.
- Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: Yes
- Which older supported branches are affected by this flaw?: all
- If not all supported branches, which bug introduced the flaw?: None
- Do you have backports for the affected branches?: Yes
- If not, how different, hard to create, and risky will they be?: The patch seem to apply cleanly to esr102 too
- How likely is this patch to cause regressions; how much testing does it need?: It is rather risky since it makes synchronous operation asynchronous.
- Is Android affected?: Yes
Updated•2 years ago
|
Comment 21•2 years ago
|
||
Comment on attachment 9299876 [details]
Bug 1791314, some underlying streams prefer being closed on the target thread, r=valentin
Approved to land and request uplift when ready.
Comment 22•2 years ago
|
||
some underlying streams prefer being closed on the target thread, r=valentin,necko-reviewers
https://hg.mozilla.org/integration/autoland/rev/05797d51070ceb06e74fd3c552a60c872b3834a9
https://hg.mozilla.org/mozilla-central/rev/05797d51070c
Comment 23•2 years ago
|
||
Please nominate this for Beta and ESR102 approval when you get a chance.
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Comment 24•2 years ago
|
||
I would like to verify this fix, but I can't seem to really get the reproduction steps.
Could you help clear them up?
- I assume, that I should install fuzzfetch first with command: pip install fuzzfetch
- Then I should install the right build with command: fuzzfetch --target firefox --os Linux --asan --fuzzing -n firefox
- Then I think I should be using ffpuppet and prefpicker. Is this the right command?
$ ffpuppet -d -p ./prefs.js -u ./loader.html ./firefox/firefox
How exactly should I be using this? Where do I get prefs.js file from? Which exactly are the "fuzzing prefs"? - Then what? What are the actual and expected results and how do I verify them?
Please mention anything that I might need to know to reproduce and verify this bug.
Thank you very much for the help!
Assignee | ||
Comment 25•2 years ago
|
||
I couldn't get that ffpuppet stuff etc working, but asan build + the testcase did crash (a child process).
Comment 26•2 years ago
|
||
The patch landed in nightly and beta is affected.
:smaug, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- If no, please set
status-firefox107
towontfix
.
For more information, please visit auto_nag documentation.
Assignee | ||
Comment 27•2 years ago
•
|
||
Comment on attachment 9299876 [details]
Bug 1791314, some underlying streams prefer being closed on the target thread, r=valentin
Beta/Release Uplift Approval Request
- User impact if declined: Security sensitive crash
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: Yes
- If yes, steps to reproduce: Load the testcase in an Asan build.
- List of other uplifts needed: None
- Risk to taking this patch: Medium
- Why is the change risky/not risky? (and alternatives if risky): This is a bit risky patch. It changes certain synchronous operations to be asynchronous
- String changes made/needed: NA
- Is Android affected?: Yes
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration:
- User impact if declined:
- Fix Landed on Version:
- Risk to taking this patch:
- Why is the change risky/not risky? (and alternatives if risky):
Comment 28•2 years ago
|
||
Comment on attachment 9299876 [details]
Bug 1791314, some underlying streams prefer being closed on the target thread, r=valentin
Approved for 107.0RC1
Comment 29•2 years ago
|
||
uplift |
Comment 30•2 years ago
|
||
Comment on attachment 9299876 [details]
Bug 1791314, some underlying streams prefer being closed on the target thread, r=valentin
Approved for 102.5esr.
Comment 31•2 years ago
|
||
uplift |
Comment 32•2 years ago
•
|
||
I have managed to reproduce a tab crash by loading the loader.html test case attached in an asan build on Ubuntu 22 in an affected asan Nightly v108.0a1 from 2022-10-25. I have to mention that this issue is intermittent. It occurs in about 5/10 reloads/retries.
Furthermore, I can confirm that the issue no longer occurs on Nightly v108.0a1 and Beta v107.0 (20221107141014) in more than 10 tries.
Leaving the general bug status as FIXED until the fix is pushed and verified on the ESR102 channel as well. Thanks!
Updated•2 years ago
|
Comment 33•2 years ago
|
||
I have also verified the fix on the Linux x64 asan build following the revision in comment 31. The tab crash is no longer reproducible in more than 10 tries.
Updated•2 years ago
|
Comment 34•2 years ago
|
||
Updated•2 years ago
|
Updated•2 years ago
|
Comment 36•2 years ago
|
||
Copying crash signatures from duplicate bugs.
Updated•2 years ago
|
Updated•8 months ago
|
Description
•