Closed Bug 1595786 Opened 5 years ago Closed 4 years ago

AddressSanitizer: heap-use-after-free FetchStreamReader::WriteBuffer / /builds/worker/fetches/llvm-project/llvm/projects/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cc:22:3 in __asan_memcpy

Categories

(Core :: DOM: Networking, defect, P2)

defect

Tracking

()

VERIFIED FIXED
mozilla73
Tracking Status
firefox-esr68 73+ fixed
firefox71 --- wontfix
firefox72 - wontfix
firefox73 --- verified
firefox74 --- verified

People

(Reporter: bc, Assigned: baku)

References

(Blocks 1 open bug, )

Details

(Keywords: csectype-uaf, sec-high, Whiteboard: [necko-triaged][post-critsmash-triage][adv-main73+r] [adv-esr68.5+r])

Attachments

(2 files)

Attached file asan report
  1. https://play.tidb.io/
  2. UAF in Linux/Windows Nightly opt
==19179==ERROR: AddressSanitizer: heap-use-after-free on address 0x7f1809666800 at pc 0x5633353e211a bp 0x7ffe9f8a0c90 sp 0x7ffe9f8a0458
READ of size 4096 at 0x7f1809666800 thread T0 (Web Content)
    #0 0x5633353e2119 in __asan_memcpy /builds/worker/fetches/llvm-project/llvm/projects/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cc:22:3
    #1 0x7f18a5eae46c in nsReadFromRawBuffer(nsIOutputStream*, void*, char*, unsigned int, unsigned int, unsigned int*) /builds/worker/workspace/build/src/xpcom/io/nsPipe3.cpp:1702:3
    #2 0x7f18a5eade91 in nsPipeOutputStream::WriteSegments(nsresult (*)(nsIOutputStream*, void*, char*, unsigned int, unsigned int, unsigned int*), void*, unsigned int, unsigned int*) /builds/worker/workspace/build/src/xpcom/io/nsPipe3.cpp:1671:12
    #3 0x7f18ac3585d7 in mozilla::dom::FetchStreamReader::WriteBuffer() /builds/worker/workspace/build/src/dom/fetch/FetchStreamReader.cpp:298:19
    #4 0x7f18ac357f13 in mozilla::dom::FetchStreamReader::OnOutputStreamReady(nsIAsyncOutputStream*) /builds/worker/workspace/build/src/dom/fetch/FetchStreamReader.cpp:201:12
    #5 0x7f18a5ec29c6 in nsOutputStreamReadyEvent::Run() /builds/worker/workspace/build/src/xpcom/io/nsStreamUtils.cpp:173:20

Given that the use is in DOM and the free is somewhere in JS, I think this is a DOM issue, at least at first.

Component: XPCOM → DOM: Networking
Group: core-security → dom-core-security
Summary: AddressSanitizer: heap-use-after-free /builds/worker/fetches/llvm-project/llvm/projects/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cc:22:3 in __asan_memcpy → AddressSanitizer: heap-use-after-free FetchStreamReader::WriteBuffer / /builds/worker/fetches/llvm-project/llvm/projects/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cc:22:3 in __asan_memcpy

The stacks look odd that the object was created and freed in JS but dereffed in DOM, but Tyson can reproduce it also. we'll try to catch it in Pernosco.

Flags: needinfo?(twsmith)

A Pernosco session is available here: https://pernos.co/debug/r-N5SqIkir8uF6LODHZXNg/index.html

It will expire in 7 days.

Flags: needinfo?(twsmith)

Hi Neha, looks like the Pernosco recording is going to expire soon. Is there anybody on your team who might be free to take a quick look?

Flags: needinfo?(nkochar)

Failing in StreamReader. Someone from Necko team might know?

Flags: needinfo?(nkochar) → needinfo?(nhnguyen)
Flags: needinfo?(nhnguyen)
Flags: needinfo?(kershaw)

I can't figure out why the object is freed in JS. In this case, FetchStreamReader and its members should be only accessible on main thread. I have no idea why and which object is freed on JS helper thread.

Tyson, could you share the STR? I'd like to reproduce this by myself.
Thanks.

Flags: needinfo?(kershaw) → needinfo?(twsmith)

(In reply to Kershaw Chang [:kershaw] from comment #6)

I can't figure out why the object is freed in JS. In this case, FetchStreamReader and its members should be only accessible on main thread. I have no idea why and which object is freed on JS helper thread.

I'm not too familiar with this code, but it looks like nsReadFromRawBuffer is touching memory passed in from FetchStreamReader::WriteBuffer(), which is a buffer derived from FetchStreamReader::mBuffer. mBuffer is a dictionary. It looks like the dictionary does not actually own the buffer memory itself, but instead contains a JSObject*. If that object gets collected, the actual underlying memory will get deleting on a background JS helper thread (assuming the object has a tracekind that allows that). I think the issue here is that FetchStreamReader's Trace method should call TraceDictionary on mBuffer. I'm a little surprised this lack of rooting hasn't shown up before.

Andrew is right. All uses of FetchReadableStreamReadDataArray need to be rooted. Bindings do that automatically, but this code is definitely not doing it.

It gets worse. Even if all this stuff were being properly traced there's a problem here. In FetchStreamReader::ResolvedCallback we grab the typed array, call ComputeLengthAndData on it, and then use that length and data pointer, possibly async (e.g. if we get NS_BASE_STREAM_WOULD_BLOCK from our pipe, we will do the rest of the reading async, from OnOutputStreamReady.

But by that point, even if we were tracing the underlying typed array object (and we're not), someone could have detached the arraybuffer (e.g. by postMessage-transferring it somewhere), no? And then the data pointer is likely invalid.

The only proper way to use one of the dom::TypedArray structs, and the only way the ES typed array structs are used in the spec, is to do whatever reading of the data you plan to do immediately on entry to your algorithm, without any JS running in the interim. In practical terms, there must be no JS running (and ideally no GC activity) between the ComputeLengthAndData call and using the data (in practice copying it).

Andrea, do you know who owns this code, by any chance?

Flags: needinfo?(amarchesini)

(comments 7 and 8 should really say "tracing" in place of "rooting".)

This makes me wonder if maybe TypedArray_base could be marked JS_HAZ_GC_POINTER. It won't catch the tracing problem, but it would prevent users from doing anything significant while it is live. I don't think it has a destructor, which would cause problems because then it would be considered 'live' to the end of its scope, and you might very well grab the data, copy it somewhere else, and then correctly never access the original pointer again. But perhaps this is impossible due to the specific usage patterns?

TypedArray_base inherits from SpiderMonkeyInterfaceObjectStorage which has JSObject* members, so adding JS_HAS_GC_POINTER won't tell the static analysis anything it doesn't already know. But this thing is stored in a member UniquePtr, so doesn't have stack lifetime, so the analysis assumes the consumer knows what they're doing...

So I poked a bit at making dictionaries that need rooting MOZ_STACK_CLASS, since on stack the static analysis would catch issues like this.

That compiles, modulo this one callsite, but only due to bug 1559226 (which means sequence<SomeDictionaryThatNeedsRooting> is allowed even if SomeDictionaryThatNeedsRooting is MOZ_STACK_CLASS).

And it looks like we do in fact have such sequences around, for PaymentDetailsModifier, PaymentMethodData, ReportingItem, PublicKeyCredentialDescriptor. So we could add the MOZ_STACK_CLASS thing for now, but would need to decide what to do once bug 1559226 is fixed or something.

(In reply to Kershaw Chang [:kershaw] from comment #6)

Tyson, could you share the STR? I'd like to reproduce this by myself.

I opened the link in an ASan build. Trying it now with the same build it does not crash. Maybe the page changed?

Flags: needinfo?(twsmith)

I spent a bit of time trying to modify test_readableStreams.html, but I wasn't able to reproduce any crash. (I was using a debug build, but I would expect that should misbehave even without ASan due to poisoning.) The basic idea is to drop all references to the buffer after it is passed into a stream, try to force a GC, then examine the value. As it stands, test_nonNativeStream and test_nonNativeStream_cache keep a gratuitous reference to the buffer (the former only looks at the length to check that the data from the stream is the same, and the latter doesn't use it at all). This code runs in three different context, so I'm not sure how to always force a GC.

Andrea, do you know who owns this code, by any chance?

I guess it's me. I take this bug.

Assignee: nobody → amarchesini
Flags: needinfo?(amarchesini)

The priority flag is not set for this bug.
:nhi, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(nhnguyen)
Priority: -- → P2
Whiteboard: [necko-triaged]
Flags: needinfo?(nhnguyen)
Group: dom-core-security → core-security-release
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla73

Looks like this landed without sec-approval. Should we uplift to 72 (we're in the last week before RC)? esr68?

:baku, please fill in the sec-approval form either way.

Flags: needinfo?(tom)
Flags: needinfo?(dveditz)
Flags: needinfo?(amarchesini)

Comment on attachment 9111888 [details]
Bug 1595786 - FetchStreamReader takes a copy of the received array buffer, r?bzbarsky

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: This crash is a UAF.
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No
  • Which older supported branches are affected by this flaw?: All
  • If not all supported branches, which bug introduced the flaw?: Bug 1128959
  • Do you have backports for the affected branches?: No
  • If not, how different, hard to create, and risky will they be?: Low risk. This code is about taking the whole content and save it in a buffer instead of consuming it when needed.
  • How likely is this patch to cause regressions; how much testing does it need?: low
Flags: needinfo?(amarchesini)
Attachment #9111888 - Flags: sec-approval?

The patch applies cleanly on beta and esr68 at least.

Can you also request uplift to beta and esr? Thanks.

Flags: needinfo?(amarchesini)
Flags: qe-verify+
Whiteboard: [necko-triaged] → [necko-triaged][post-critsmash-triage]

Boris, since you reviewed this patch would you be comfortable asking for uplift?

Since the patch is already on trunk, I'm trying to figure out if we're better off shipping without this fix on release until 73 on February 11, or taking the patch today in a new release candidate for 72 and 68.4 before they ship next week.

Flags: needinfo?(bzbarsky)

I am comfortable with taking this patch for uplift if we've checked how Chrome behaves in this situation and determined that it makes an immediate copy of the data at the point when we are now doing the copy. Doubly so if we add a test for the copying behavior, ideally under a separate bug.

If we have not done that checking, then the fact that this changes our web-behavior feels moderately risky to me, even though I am 90+% sure that the change makes us more compatible and probably 80%+ sure the change isn't an issue in practice.

My preference would be to do the checking I describe above and uplift, I think, but someone needs to do that checking.

Flags: needinfo?(bzbarsky)

Since time is short, I'd like to leave this patch out of the 72 RC build and try checking and uplift in a possible 72 dot release.

I would like to verify this fix, but I wasn't able to properly reproduce it. Which build type should I get and where exactly should I get it from?
Thank you!

Flags: needinfo?(bob)

I tried to reproduce using a beta asan opt build but failed to so do. I was able to reproduce with an opt asan windows build from the same day the bug was reported using https://firefox-ci-tc.services.mozilla.com/tasks/index/gecko.v2.mozilla-central.pushdate.2019.11.12.20191112094307.firefox/win64-asan-opt where I downloaded the target.zip. To run, I extracted the zip file then changed the exe and dll files to have the executable bit set and then loaded the url. It asan UAF'd me right away. You can adjust the taskcluster task index to point to the appropriate builds.

Flags: needinfo?(bob)

I have attempted reproducing this issue, but I wasn't able to.
I have opened the "https://play.tidb.io/" test page in both Nightly/Beta and both AsanOpt/AsanReporter taken from pushlogs before the fix. 19-12-2019 and 12-12-2019. Crashes, hangs, freezes are not observed and "AddressSanitizer: errors are not being seen in the logs.

I would like to verify the fix, but I am not able to reproduce it. Can you still reproduce this on an affected build? Please provide me with the link to a build that reproduces it. Thank you.

Flags: needinfo?(bob)

Yes, but can you reproduce it? I cannot reproduce it. If you can, you could help by verifying the fix.

Flags: needinfo?(bob)

(In reply to Bob Clary [:bc:] from comment #26)

I was able to reproduce with an opt asan windows build from the same day the bug was reported using https://firefox-ci-tc.services.mozilla.com/tasks/index/gecko.v2.mozilla-central.pushdate.2019.11.12.20191112094307.firefox/win64-asan-opt where I downloaded the target.zip.

Yes, I was able to reproduce it using that build and could not reproduce with recent builds. I'll call it verified.

Status: RESOLVED → VERIFIED
Flags: needinfo?(bob)

Marking flags correctly based on the comment above. Thank you.

Flags: qe-verify+

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #23)

I am comfortable with taking this patch for uplift if we've checked how Chrome behaves in this situation and determined that it makes an immediate copy of the data at the point when we are now doing the copy. Doubly so if we add a test for the copying behavior, ideally under a separate bug.

If we have not done that checking, then the fact that this changes our web-behavior feels moderately risky to me, even though I am 90+% sure that the change makes us more compatible and probably 80%+ sure the change isn't an issue in practice.

My preference would be to do the checking I describe above and uplift, I think, but someone needs to do that checking.

Hi Andrew, can you please help find someone to make sure this gets on someone's radar in the near term? There's been talk of trying to include this fix in a 72.x dot release and I don't think we can until this comment has been addressed. Even without the prospect of being in a dot release, we'd probably still want it for the ESR uplift and in case any follow-up work is needed for 73 still. Thanks!

Flags: needinfo?(overholt)

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #23)

I am comfortable with taking this patch for uplift if we've checked how Chrome behaves in this situation and determined that it makes an immediate copy of the data at the point when we are now doing the copy. Doubly so if we add a test for the copying behavior, ideally under a separate bug.

It looks like you tried at least a bit of test adjustment and couldn't make it fail (comment 22); are you able to write a test like Boris describes, baku?

If we have not done that checking, then the fact that this changes our web-behavior feels moderately risky to me, even though I am 90+% sure that the change makes us more compatible and probably 80%+ sure the change isn't an issue in practice.

IIRC we never found sites using Fetch in the real world that we could use for testing, right, Jason? Mike/Karl, do you know off-hand of any?

Flags: needinfo?(overholt)
Flags: needinfo?(miket)
Flags: needinfo?(kdubost)
Flags: needinfo?(jorendorff)

From the top of my head (emails), I don't recall any, but probably a search on github code would help at least discover if someone tried to use it in a library or a website

Flags: needinfo?(kdubost)

That's right.

Flags: needinfo?(jorendorff)

At this point this is going to ride to 73. We probably still want an uplift request for 68.5.

Comment on attachment 9111888 [details]
Bug 1595786 - FetchStreamReader takes a copy of the received array buffer, r?bzbarsky

Alright let's land this now to get some stability testing and request uplift.

Flags: needinfo?(tom)
Attachment #9111888 - Flags: sec-approval? → sec-approval+

This has been on m-c for about a month now and Beta since 73 merged there a few weeks ago, so I think we're reasonably good to go from a stability standpoint. I think we just need the ESR68 approval request.

Comment on attachment 9111888 [details]
Bug 1595786 - FetchStreamReader takes a copy of the received array buffer, r?bzbarsky

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration:
  • User impact if declined: A crash can occur.
  • Fix Landed on Version: 73
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This patch is stable in nightly. There are some concerns about the correctness from a spec standpoint, but this is definitely better than a crash.
  • String or UUID changes made by this patch:
Flags: needinfo?(amarchesini)
Attachment #9111888 - Flags: approval-mozilla-esr68?

Comment on attachment 9111888 [details]
Bug 1595786 - FetchStreamReader takes a copy of the received array buffer, r?bzbarsky

Fixes a Fetch sec bug, approved for 68.5esr.

Attachment #9111888 - Flags: approval-mozilla-esr68? → approval-mozilla-esr68+
Flags: needinfo?(dveditz)

Here's results from bigQuery for something like /fetch(['"]http/ in the response body from the most recent HTTP Archive run:

https://docs.google.com/spreadsheets/d/1FaYxhVUwjbZeSIfcJdgl0w-U9XtB4eaeO4rstiBMFhQ/edit#gid=795587681

(That's only 16,000 rows, link to full 13MB csv here https://drive.google.com/open?id=1DvJG09q0H2hKPhUAuoToNUV_Uc91TJhx).

Flags: needinfo?(miket)
Whiteboard: [necko-triaged][post-critsmash-triage] → [necko-triaged][post-critsmash-triage][adv-main73+r]
Whiteboard: [necko-triaged][post-critsmash-triage][adv-main73+r] → [necko-triaged][post-critsmash-triage][adv-main73+r] [adv-esr68.5+r]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: