Closed Bug 1442722 Opened 2 years ago Closed Last year

Assertion failure: point.canPeek(), at js/src/vm/StructuredClone.cpp:648 or various crashes with invalid free

Categories

(Core :: JavaScript Engine, defect, P1, critical)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla62
Tracking Status
firefox-esr52 61+ fixed
firefox-esr60 61+ fixed
firefox59 --- wontfix
firefox60 + wontfix
firefox61 + fixed
firefox62 + fixed

People

(Reporter: decoder, Assigned: sfink)

References

(Blocks 1 open bug)

Details

(4 keywords, Whiteboard: [adv-main61+][adv-esr52.9+][adv-esr60.1+][post-critsmash-triage])

Attachments

(16 files, 10 obsolete files)

79 bytes, application/octet-stream
Details
3.80 KB, patch
Details | Diff | Splinter Review
72 bytes, application/octet-stream
Details
6.09 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
1.25 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
18.92 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
2.43 KB, patch
sfink
: review+
Details | Diff | Splinter Review
1.73 KB, patch
sfink
: review+
Details | Diff | Splinter Review
2.92 KB, patch
sfink
: review+
Details | Diff | Splinter Review
26.42 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
64.34 KB, patch
Details | Diff | Splinter Review
32.70 KB, patch
baku
: review+
jorendorff
: review+
Details | Diff | Splinter Review
1.11 KB, patch
sfink
: review+
abillings
: sec-approval+
Details | Diff | Splinter Review
38.55 KB, patch
Details | Diff | Splinter Review
81.39 KB, patch
Details | Diff | Splinter Review
72.66 KB, patch
Details | Diff | Splinter Review
I've prototyped a new libfuzzer-based fuzzing target that uses dom::ipc::StructuredCloneData as its entry point to cover the StructuredCloneReader code that is browser-only. The target code has not landed on mozilla-central yet, if you need it to reproduce (in case stack + test attached do not suffice), please let me know.

The attached testcase crashes on mozilla-central revision 8a2584063e19+:

Backtrace:

==20305==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7ff2c6ba5f93 bp 0x7ffd5c2f2340 sp 0x7ffd5c2f21e0 T0)
==20305==The signal is caused by a WRITE memory access.
==20305==Hint: address points to the zero page.
    #0 0x7ff2c6ba5f92 in void DiscardTransferables<js::SystemAllocPolicy>(mozilla::BufferList<js::SystemAllocPolicy>&, JSStructuredCloneCallbacks const*, void*) js/src/vm/StructuredClone.cpp:675:5
    #1 0x7ff2c6bbe694 in JSAutoStructuredCloneBuffer::clear(JSStructuredCloneCallbacks const*, void*) js/src/vm/StructuredClone.cpp:2814:9
    #2 0x7ff2b8383031 in ~JSAutoStructuredCloneBuffer objdir-ff-libfuzzer/dist/include/js/StructuredClone.h:303:38
    #3 0x7ff2b8383031 in operator() objdir-ff-libfuzzer/dist/include/mozilla/UniquePtr.h:528
    #4 0x7ff2b8383031 in reset objdir-ff-libfuzzer/dist/include/mozilla/UniquePtr.h:343
    #5 0x7ff2b8383031 in operator= objdir-ff-libfuzzer/dist/include/mozilla/UniquePtr.h:313
    #6 0x7ff2b8383031 in Clear dom/base/StructuredCloneHolder.cpp:167
    #7 0x7ff2b8383031 in mozilla::dom::StructuredCloneHolder::~StructuredCloneHolder() dom/base/StructuredCloneHolder.cpp:259
    #8 0x7ff2b8384378 in ~StructuredCloneBlob dom/base/StructuredCloneBlob.cpp:29:1
    #9 0x7ff2b8384378 in mozilla::dom::StructuredCloneBlob::~StructuredCloneBlob() dom/base/StructuredCloneBlob.cpp:27
    #10 0x7ff2b8386728 in Release dom/base/StructuredCloneBlob.cpp:206:1
    #11 0x7ff2b8386728 in Release objdir-ff-libfuzzer/dist/include/mozilla/RefPtr.h:41
    #12 0x7ff2b8386728 in Release objdir-ff-libfuzzer/dist/include/mozilla/RefPtr.h:398
    #13 0x7ff2b8386728 in ~RefPtr objdir-ff-libfuzzer/dist/include/mozilla/RefPtr.h:79
    #14 0x7ff2b8386728 in mozilla::dom::StructuredCloneBlob::ReadStructuredClone(JSContext*, JSStructuredCloneReader*, mozilla::dom::StructuredCloneHolder*) dom/base/StructuredCloneBlob.cpp:117
    #15 0x7ff2b838cc0a in mozilla::dom::StructuredCloneHolder::CustomReadHandler(JSContext*, JSStructuredCloneReader*, unsigned int, unsigned int) dom/base/StructuredCloneHolder.cpp:1021:12
    #16 0x7ff2c6bb5f7c in JSStructuredCloneReader::startRead(JS::MutableHandle<JS::Value>) js/src/vm/StructuredClone.cpp:2358:25
    #17 0x7ff2c6b9ebca in JSStructuredCloneReader::read(JS::MutableHandle<JS::Value>) js/src/vm/StructuredClone.cpp:2596:10
    #18 0x7ff2c6b9e7ac in ReadStructuredClone(JSContext*, JSStructuredCloneData&, JS::StructuredCloneScope, JS::MutableHandle<JS::Value>, JSStructuredCloneCallbacks const*, void*) js/src/vm/StructuredClone.cpp:632:14
    #19 0x7ff2b8389799 in ReadFromBuffer dom/base/StructuredCloneHolder.cpp:344:8
    #20 0x7ff2b8389799 in mozilla::dom::StructuredCloneHolder::ReadFromBuffer(nsISupports*, JSContext*, JSStructuredCloneData&, JS::MutableHandle<JS::Value>, mozilla::ErrorResult&) dom/base/StructuredCloneHolder.cpp:324
    #21 0x7ff2c4a9b17b in FuzzingRunDomSC(unsigned char const*, unsigned long) dom/base/fuzztest/FuzzStructuredClone.cpp:53:10
    #22 0x5c0b67 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) tools/fuzzing/libfuzzer/FuzzerLoop.cpp:458:13
    #23 0x5c0d83 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long) tools/fuzzing/libfuzzer/FuzzerLoop.cpp:397:3
    #24 0x5c15ca in fuzzer::Fuzzer::MutateAndTestOne() tools/fuzzing/libfuzzer/FuzzerLoop.cpp:596:30
    #25 0x5c1787 in fuzzer::Fuzzer::Loop() tools/fuzzing/libfuzzer/FuzzerLoop.cpp:624:5
    #26 0x5bafd1 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) tools/fuzzing/libfuzzer/FuzzerDriver.cpp:744:6
    #27 0x7ff2c37b2bda in mozilla::FuzzerRunner::Run(int*, char***) tools/fuzzing/interface/harness/FuzzerRunner.cpp:60:10
    #28 0x7ff2c36f2dd4 in XREMain::XRE_mainStartup(bool*) toolkit/xre/nsAppRunner.cpp:3865:35
    #29 0x7ff2c37029de in XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) toolkit/xre/nsAppRunner.cpp:4793:12
    #30 0x7ff2c3704112 in XRE_main(int, char**, mozilla::BootstrapConfig const&) toolkit/xre/nsAppRunner.cpp:4900:21
    #31 0x519f50 in do_main browser/app/nsBrowserApp.cpp:231:22
    #32 0x519f50 in main browser/app/nsBrowserApp.cpp:304
    #33 0x7ff2d897182f in __libc_start_main /build/glibc-bfm8X4/glibc-2.23/csu/../csu/libc-start.c:291
    #34 0x4235d8 in _start (objdir-ff-libfuzzer/dist/bin/firefox+0x4235d8)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV js/src/vm/StructuredClone.cpp:675:5 in void DiscardTransferables<js::SystemAllocPolicy>(mozilla::BufferList<js::SystemAllocPolicy>&, JSStructuredCloneCallbacks const*, void*)
==20305==ABORTING


This bug itself might not be security-sensitive, but the fuzzing efforts around this target in general are, until we have found and fixed the most common bugs, so I would request to keep this concealed for the moment.
Attached file Testcase
Steve, do you think this might be a problem with the JS engine or rather a DOM problem? Trying to figure out who should look into this first.
Flags: needinfo?(sphink)
Mutated data should never reach DiscardTransferables. So first, I guess I'd like to verify that this is being read in DifferentProcess scope.

Our protections around that aren't all that convincing, though, so my first guess is the JS engine.

I tried loading the test case's binary data in as a structured clone, but couldn't get a crash. Do you have a way of running test cases that crash your DOM runner with the JS shell runner?
Flags: needinfo?(sphink)
Attached patch scdom.patchSplinter Review
The entry point for the attached testcase is in dom/ipc/StructuredCloneData.cpp:

https://searchfox.org/mozilla-central/rev/efce4ceddfbe5fe4e2d74f1aed93894bada9b273/dom/ipc/StructuredCloneData.cpp#436

This is marked here

https://searchfox.org/mozilla-central/rev/efce4ceddfbe5fe4e2d74f1aed93894bada9b273/dom/ipc/StructuredCloneData.cpp#42

as being DifferentProcess. So I assume the scope is declared properly. Maybe that is lost somewhere (which would be bad).

In order to run the test, you can use this patch. Apply it to mozilla-central, then build with clang and --enable-fuzzing so you get a fuzzing build instead of a regular one. Then build gtests (./mach gtest dontruntests).

Once you have that, it is easy to run this and other tests, by doing:

MOZ_RUN_GTEST=1 LIBFUZZER=1 FUZZER=StructuredCloneReaderDOM objdir/dist/bin/firefox -handle_segv=0 -handle_bus=0 -handle_abrt=0 test.bin


Let me know if you need help to reproduce this.
Flags: needinfo?(sphink)
As Steve pointed out in comment 3, we reach DiscardTransferables although we should not be able to get there when we are in DifferentProcess scope. I only realized how critical this could be after looking at the code and tracing back when/how this is actually called.

Based on the code, I managed to modify the testcase to achieve an arbitrary free. You can probably do more with this even (you can e.g. get back into the DOM code and have delete being called on an arbitrary pointer, which should allow arbitrary code execution).

I took the existing testcase, made the buffer larger so it wouldn't die on the canPeek check, then added all the data as the function expects them (SCTAG_TRANSFER_MAP_HEADER tag, some data, numTransferables set to 1, ownership set to JS::SCTAG_TMO_ALLOC_DATA and content set to 0xdeadbeefdeadbeef because I really like dead beef).

With this setup, you end here:

https://searchfox.org/mozilla-central/rev/33cc9e0331da8d9ff3750f1e68d72d61201176cb/js/src/vm/StructuredClone.cpp#706

with content being fully controllable. You can also set ownership to something else and go through the callback at 

https://searchfox.org/mozilla-central/rev/33cc9e0331da8d9ff3750f1e68d72d61201176cb/js/src/vm/StructuredClone.cpp#710

to end up back in the DOM in one of the freeTransfer callbacks. You should then end up here:

https://searchfox.org/mozilla-central/rev/33cc9e0331da8d9ff3750f1e68d72d61201176cb/dom/base/StructuredCloneHolder.cpp#1271

with all of the values controllable, so you can have your pointer being used as OffscreenCanvasCloneData and have it deleted.

I assume all of this is possible because there are insufficient scope checks. The freeTransfer callbacks are a formidable target and probably need to be guarded as well as all Transferable code that isn't supposed to be reachable here. See also bug 1444056 about this.
Flags: needinfo?(sphink)
Keywords: sec-othersec-high
Great work, Christian. Steve, can you take this bug?
Flags: needinfo?(sphink)
Priority: -- → P1
Wow, that's a nice way to drive this home! Thanks.
Assignee: nobody → sphink
Fwiw I was able to tune libfuzzer to hit this bug on its own using value profiling (https://llvm.org/docs/LibFuzzer.html#value-profile) for the code in js/src/vm/StructuredClone.cpp. The code has some hardcoded constants that caused the fuzzer to not get further in that area.
Summary: Assertion failure: point.canPeek(), at js/src/vm/StructuredClone.cpp:648 → Assertion failure: point.canPeek(), at js/src/vm/StructuredClone.cpp:648 or various crashes with invalid free
I've been poking at this stuff for a while now. It seems like there's a fundamental problem -- the clone byte data contains a directive as to whether there are Transferables that might need to be freed, as well as the pointer values that would be freed. On creation, we can easily ensure that none of this happens for DifferentProcess clones. But in practice, the data is held in a JSStructuredCloneData object (if you're lucky; there's one place that Borrows the underlying BufferList), and that is just dumb data.

Which doesn't really make sense in the first place -- the data can have various interpretations in different contexts. What's the point of having a holder class that doesn't know anything about the semantics? At this point, we have BufferList for that.

At the very least, JSStructuredCloneData ought to know how to properly and safely release the data it contains. But that requires knowledge of the scope. DifferentProcess data should not believe the bytes if they say they have pointers to be freed.

Unfortunately, our current crappy API actively works against this. You have to create a temporary JSStructuredCloneData in various places independently of what is going to be stored in it. It's dumb data. With pointers. That can be corrupted on the wire.

I attempted a spot fix for this bug by passing the scope into DiscardTransferables in the 3 different places that do it. But it turns out that we don't always know the scope.

So I've been building up this stack of refactoring patches to try to reshuffle things in a way that makes more sense to me.

This first patch is just to allow the fuzzers to create buffers of the problematic sort -- synthetic buffers that claim to contain Transferable data. The point of this bug is to make that harmless.
Attachment #8964468 - Flags: feedback?(jorendorff)
It doesn't matter too much, but it's more clear if discardTransferables only sees the freeTransfer callback and not all the rest.
Attachment #8964469 - Flags: feedback?(jorendorff)
Right now, there's JSStructuredCloneWriter that knows the semantics of the clone data, an SCOutput that knows how to write JSObjects and ArrayBufferObjects and things, and a BufferList in the SCOutput to hold the bytes.

We can pull together the outer JSStructuredCloneWriter's callbacks etc. and combine them with the BufferList, in the form of a now-ensmartified JSStructuredCloneData. This goes onto the SCOutput, making it an interface for writing semantic data.

This removes the loose fields from JSStructuredCloneWriter and allows using move construction to eliminate some code.
Attachment #8964471 - Flags: feedback?(jorendorff)
This is just another symptom of storing the semantics separately from the data, and will interfere with further refactoring.
Attachment #8964472 - Flags: feedback?(jorendorff)
It's not a great comment, but it's something.
Attachment #8964473 - Flags: feedback?(jorendorff)
We do copy structured clones, but nobody uses this API. Kill it.
Attachment #8964474 - Flags: feedback?(jorendorff)
Ok, here's a bigger change. It's hard to encapsulate data with semantics when all of the users are just reaching through to manipulate the superclass anyway. This switches to containing a BufferList instead of being one, and exposing only the interface (currently) needed.

I put this much of the stack up on try, and it failed all over the place but the only relevant failures appear to be where I broke rust and fuzzing builds. (I have fixes for those, I think, but I still need to separate them from the next patch.)

https://treeherder.mozilla.org/#/jobs?repo=try&revision=01685677e1a11f7300d2b9edb1547fdb3b1c0122
Attachment #8964475 - Flags: feedback?(jorendorff)
Attached patch Move scope into JSStructuredData (obsolete) — Splinter Review
This is the main thing that all these patches have been leading up to. (Well, other than fixing the bug; haven't done that yet.) And it's also why I'm posting this series -- I'm in too deep.

The whole point here is that in order for JSStructuredCloneData to correctly and safely dispose of its own contents, it needs to know the scope. Because you could have DifferentProcess data that has been hacked on its way out of the content process, and the recipient might not track the scope and so end up believing it to be SameProcessSameThread or something. Which means that it's up to the user to set the right scope, and preferably as soon as possible. (I attempted to enforce it during construction of the JSStructuredCloneData, but that looks like it's too early for some uses, so I also allow the possibility of setting it to Unassigned and then initializing it to the proper scope before putting any data in.)

But I don't know if this is a feasible path or not. In this patch, I guessed at a whole bunch of scopes. Some of them didn't seem to have a clear answer, but I wasn't sure, so I set to Unassigned (which will assert at runtime if they get used.) I would like somebody (probably :baku) to look and see whether this approach is even going to work, or if I'm headed in completely the wrong direction.

This patch compiles, but has two scopes that I am certain are incorrect (set to Unassigned and commented with FIXME in the patch), and I am highly doubtful the rest are correct. I'm hoping somebody can clue me in before I spend even more time on this stuff.

Thanks!
Attachment #8964476 - Flags: feedback?(jorendorff)
Attachment #8964476 - Flags: feedback?(amarchesini)
Comment on attachment 8964476 [details] [diff] [review]
Move scope into JSStructuredData

Review of attachment 8964476 [details] [diff] [review]:
-----------------------------------------------------------------

sfink, can you tell me why we cannot simply use the mScope here and hide the passing of the parameter in the implementation of Write() ?
This approach seems fragile to me, because we need to check every Write(). The scope should not change and it should be known from the CTOR.

::: dom/base/nsFrameMessageManager.cpp
@@ +556,5 @@
>    // First try to use structured clone on the whole thing.
>    JS::RootedValue v(aCx, aValue);
>    JS::RootedValue t(aCx, aTransfer);
>    ErrorResult rv;
> +  aData.Write(aCx, v, t, JS::StructuredCloneScope::SameProcessDifferentThread, rv);

This must be DifferentProcess as any use of StructuredCloneData.

@@ +590,5 @@
>    JS::Rooted<JS::Value> val(aCx, JS::NullValue());
>    NS_ENSURE_TRUE(JS_ParseJSON(aCx, static_cast<const char16_t*>(json.get()),
>                                json.Length(), &val), false);
>  
> +  aData.Write(aCx, val, JS::StructuredCloneScope::SameProcessDifferentThread, rv);

This must be DifferentProcess.

@@ +1224,5 @@
>        if (aRetVal) {
>          ErrorResult rv;
>          StructuredCloneData* data = aRetVal->AppendElement();
> +        data->Write(cx, rval,
> +                    JS::StructuredCloneScope::SameProcessDifferentThread, rv);

This must be DifferentProcess.

::: dom/ipc/StructuredCloneData.cpp
@@ +39,5 @@
>  StructuredCloneData::StructuredCloneData(TransferringSupport aSupportsTransferring)
>    : StructuredCloneHolder(StructuredCloneHolder::CloningSupported,
>                            aSupportsTransferring,
>                            StructuredCloneHolder::StructuredCloneScope::DifferentProcess)
> +  , mExternalData(JS::StructuredCloneScope::Unassigned)

The cope here must be DifferentProcess.

@@ +111,3 @@
>                             ErrorResult &aRv)
>  {
> +  Write(aCx, aValue, JS::UndefinedHandleValue, aScope, aRv);

here the scope must be mScope.

::: dom/ipc/StructuredCloneData.h
@@ +189,5 @@
>              ErrorResult &aRv);
>  
>    void Write(JSContext* aCx,
>               JS::Handle<JS::Value> aValue,
> +             JS::StructuredCloneScope aScope,

I don't like this. The scope must be already set when Write() is called.

::: dom/ipc/TabChild.cpp
@@ +231,5 @@
>                        &json)) {
>          ErrorResult rv;
> +        // FIXME: This is just to get things to compile. I don't know what the
> +        // correct scope is here.
> +        data.Write(cx, json, JS::StructuredCloneScope::Unassigned, rv);

This must be DifferentProcess.

::: dom/messagechannel/MessagePort.cpp
@@ +368,5 @@
>    }
>  
> +  // FIXME! I don't know if we know what the proper scope is here? I'm putting
> +  // one in just to find remaining compile errors.
> +  data->Write(aCx, aMessage, transferable, JS::StructuredCloneScope::Unassigned, aRv);

This must be always DifferentProcess
Attachment #8964476 - Flags: feedback?(amarchesini) → feedback-
(In reply to Andrea Marchesini [:baku] from comment #17)
> Comment on attachment 8964476 [details] [diff] [review]
> Move scope into JSStructuredData
> 
> Review of attachment 8964476 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> sfink, can you tell me why we cannot simply use the mScope here and hide the
> passing of the parameter in the implementation of Write() ?
> This approach seems fragile to me, because we need to check every Write().
> The scope should not change and it should be known from the CTOR.

Because it looked to me like there were cases where it isn't known at construction time. But from your response, it looks like I probably got those wrong. Which is awesome; passing it in via Write() was a hack.

I'll make your updates and then go back and see if I still think there are cases where you don't know at construction time. I would much, much prefer to be able to have this set during construction.

(One case that comes to mind is where there was an nsTArray of JSStructuredCloneData, or perhaps one of the several container classes, and it wanted a zero-arg constructor. But maybe that case is now always DifferentProcess, or if not, perhaps fixable by defaulting to Unassigned and using initScope.)
Flags: needinfo?(sphink)
Comment on attachment 8964475 [details] [diff] [review]
Use delegation rather than inheritance for the BufferList in JSStructuredCloneData

Oops, I wanted feedback from baku on this one too, since it messes with the public API quite a bit and has the ForEachDataChunk weirdness in it. I guess I'd like an opinion as to whether you're ok with that.
Attachment #8964475 - Flags: feedback?(amarchesini)
Ok, so the other one I couldn't figure out is MessagePort.cpp (MessagePort::Initialize). Is the scope known there?
Flags: needinfo?(amarchesini)
(In reply to Steve Fink [:sfink] [:s:] from comment #20)
> Ok, so the other one I couldn't figure out is MessagePort.cpp
> (MessagePort::Initialize). Is the scope known there?

MessagePort uses DifferentProcess, always.
Flags: needinfo?(amarchesini)
Comment on attachment 8964475 [details] [diff] [review]
Use delegation rather than inheritance for the BufferList in JSStructuredCloneData

Review of attachment 8964475 [details] [diff] [review]:
-----------------------------------------------------------------

looks good to me.

::: js/public/StructuredClone.h
@@ +370,5 @@
>  
> +  public:
> +    // The constructor must be infallible but SystemAllocPolicy is not, so both
> +    // the initial size and initial capacity of the BufferList must be zero.
> +    explicit JSStructuredCloneData()

no explicit here.
Attachment #8964475 - Flags: feedback?(amarchesini) → feedback+
Attachment #8964468 - Flags: feedback?(jorendorff) → feedback+
Comment on attachment 8964469 [details] [diff] [review]
Restrict discardTransferables to only the freeTransfer callback

Review of attachment 8964469 [details] [diff] [review]:
-----------------------------------------------------------------

I don't really care about this either way.
Attachment #8964469 - Flags: feedback?(jorendorff) → feedback+
Comment on attachment 8964471 [details] [diff] [review]
Bulk up SCOutput by changing it from storing a bare BufferList to a full JSStructuredCloneData

Review of attachment 8964471 [details] [diff] [review]:
-----------------------------------------------------------------

Good change.
Attachment #8964471 - Flags: feedback?(jorendorff) → feedback+
Comment on attachment 8964472 [details] [diff] [review]
Remove unused alternate callback option to JSAutoStructuredCloneBuffer::clear

Review of attachment 8964472 [details] [diff] [review]:
-----------------------------------------------------------------

Good grief.
Attachment #8964472 - Flags: review+
Attachment #8964472 - Flags: feedback?(jorendorff)
Attachment #8964472 - Flags: feedback+
Attachment #8964473 - Flags: review+
Attachment #8964473 - Flags: feedback?(jorendorff)
Attachment #8964473 - Flags: feedback+
Attachment #8964474 - Flags: review+
Attachment #8964474 - Flags: feedback?(jorendorff)
Attachment #8964474 - Flags: feedback+
Comment on attachment 8964475 [details] [diff] [review]
Use delegation rather than inheritance for the BufferList in JSStructuredCloneData

Review of attachment 8964475 [details] [diff] [review]:
-----------------------------------------------------------------

ok... It's really too bad this can't be more opaque than that, but I'll take what I can get...

::: dom/base/StructuredCloneBlob.cpp
@@ +172,5 @@
>    }
>  
>    aHolder->BlobImpls().AppendElements(BlobImpls());
>  
> +  data.ForEachDataChunk([&](const char* aData, size_t aSize) {

This doesn't propagate failure.
Attachment #8964475 - Flags: feedback?(jorendorff) → feedback+
Comment on attachment 8964476 [details] [diff] [review]
Move scope into JSStructuredData

Review of attachment 8964476 [details] [diff] [review]:
-----------------------------------------------------------------

I think the basic approach is a good step.
Attachment #8964476 - Flags: feedback?(jorendorff) → feedback+
(In reply to Jason Orendorff [:jorendorff] from comment #23)
> 
> I don't really care about this either way.

Yep, you're right, since at the end of this series I undo this, or perhaps make it irrelevant. I'll splice it out of the series.
Attachment #8964469 - Attachment is obsolete: true
At this point, discardTransferable always go through to the JSStructuredCloneData, so there's no need for a weird static function to do the work. Splitting out a pure refactoring patch so that the next patch is obvious.
Attachment #8965859 - Flags: review?(jorendorff)
And for the whole point of this series -- if you're discarding a DifferentProcess clone buffer, you'd better not try to free any pointers or anything.
Attachment #8965860 - Flags: review?(jorendorff)
Since this approach is looking viable, I'm going to request review for most of the stack. The scope declarations aren't quite right, though, so I haven't managed to get them through try. I suspect that this is because indexedDB is either still using SameProcessSameThread somewhere, or it is reading that scope off of disk (bug 1434308). But either way, this patch stack is unfortunately incomplete without it, and I'll be out for a week.

  https://treeherder.mozilla.org/#/jobs?repo=try&revision=f96e8666f759862061e5be6bf5461124c64ef497&filter-tier=1

Failing with

  Assertion failure: scope_ == scope, at /builds/worker/workspace/build/src/obj-firefox/dist/include/js/StructuredClone.h:420

 1  mozilla::dom::indexedDB::ObjectStoreAddPutParams::ObjectStoreAddPutParams [IPCMessageUtils.h:f96e8666f759862061e5be6bf5461124c64ef497 : 79 + 0x5]
 2  mozilla::dom::IDBObjectStore::AddOrPut [PBackgroundIDBSharedTypes.h: : 2885 + 0x5]
 3  mozilla::dom::IDBObjectStoreBinding::put [IDBObjectStore.h:f96e8666f759862061e5be6bf5461124c64ef497 : 194 + 0x2d]
 4  mozilla::dom::GenericBindingMethod [BindingUtils.cpp:f96e8666f759862061e5be6bf5461124c64ef497 : 3032 + 0x9]
 5  js::CallJSNative [JSContext-inl.h:f96e8666f759862061e5be6bf5461124c64ef497 : 290 + 0x6]
Attachment #8965897 - Flags: review?(jorendorff)
Attachment #8964468 - Attachment is obsolete: true
(I have passing try pushes for all of this up to the scope moving patch.)
Attachment #8965899 - Flags: review?(jorendorff)
Attachment #8964471 - Attachment is obsolete: true
Attachment #8964472 - Attachment is obsolete: true
Attachment #8964473 - Attachment is obsolete: true
Attachment #8964474 - Attachment is obsolete: true
Yeah, more encapsulation would be better, but I guess this is where we are now.

I may kill off JSAutoStructuredCloneBuffer at some point, or at least make it trivial. That may allow trimming down the interface here further.
Attachment #8965903 - Flags: review?(jorendorff)
Attachment #8964475 - Attachment is obsolete: true
Attached patch Move scope into JSStructuredData (obsolete) — Splinter Review
Not requesting review on this patch yet. It's the problematic one.
Attachment #8964476 - Attachment is obsolete: true
Comment on attachment 8965900 [details] [diff] [review]
Remove unused alternate callback option to JSAutoStructuredCloneBuffer::clear

Carrying over r+ on some simple patches.
Attachment #8965900 - Flags: review+
Attachment #8965901 - Flags: review+
Attachment #8965902 - Flags: review+
In the unlikely event that somebody wants to poke at this in the next week, here's a rollup.
Comment on attachment 8965859 [details] [diff] [review]
Move DiscardTransferables back into JSStructuredCloneData (pure refactor)

Review of attachment 8965859 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8965859 - Flags: review?(jorendorff) → review+
Attachment #8965860 - Flags: review?(jorendorff) → review+
Attachment #8965897 - Flags: review?(jorendorff) → review+
Comment on attachment 8965903 [details] [diff] [review]
Use delegation rather than inheritance for the BufferList in JSStructuredCloneData

Review of attachment 8965903 [details] [diff] [review]:
-----------------------------------------------------------------

r=me for the js/ parts.

::: js/public/StructuredClone.h
@@ +400,5 @@
> +
> +    size_t Size() const { return bufList_.Size(); }
> +
> +    Iterator Start() { return bufList_.Iter(); }
> +    const Iterator Start() const { return bufList_.Iter(); }

I'm looking askance at this `const Iterator` return type. If the caller just does

    auto iter = cloneData.Start();

then we lose the constness implicitly, right? In practice the two methods are the same.

I don't want the code to look more solid and reasonable than it is. So I'd prefer a single `Start()` method here, `const` if necessary, and add the second signature "later" (if/when BufferList ever gets a proper ConstIter type).

@@ +414,5 @@
> +    bool WriteBytes(const char* data, size_t size) {
> +        return bufList_.WriteBytes(data, size);
> +    }
> +
> +    bool WriteBytes(Iterator& iter, const char* data, size_t size) const {

`const` here doesn't seem right.
Attachment #8965903 - Flags: review?(jorendorff) → review+
Comment on attachment 8965899 [details] [diff] [review]
Bulk up SCOutput by changing it from storing a bare BufferList to a full JSStructuredCloneData

Review of attachment 8965899 [details] [diff] [review]:
-----------------------------------------------------------------

Great.
Attachment #8965899 - Flags: review?(jorendorff) → review+
(In reply to Jason Orendorff [:jorendorff] from comment #41)
> @@ +414,5 @@
> > +    bool WriteBytes(const char* data, size_t size) {
> > +        return bufList_.WriteBytes(data, size);
> > +    }
> > +
> > +    bool WriteBytes(Iterator& iter, const char* data, size_t size) const {
> 
> `const` here doesn't seem right.

In current code, the write is https://searchfox.org/mozilla-central/source/dom/indexedDB/ActorsParent.cpp#26191 into an data pointer grabbed on the previous line, out of an iterator from https://searchfox.org/mozilla-central/source/dom/indexedDB/ActorsParent.cpp#26184 which comes from a const JSStructuredCloneData& at https://searchfox.org/mozilla-central/source/dom/indexedDB/ActorsParent.cpp#26138 ...and I could go on.

But anyway, the const Iter() method returns a non-const thing at https://searchfox.org/mozilla-central/source/mfbt/BufferList.h#293 -- IterImpl takes a const BufferList& at https://searchfox.org/mozilla-central/source/mfbt/BufferList.h#176 and produces a non-const char* at https://searchfox.org/mozilla-central/source/mfbt/BufferList.h#182 via https://searchfox.org/mozilla-central/source/mfbt/BufferList.h#54 which, if I'm following all this correctly, means that BufferList and its Segments can be const containers of mutable pointers just like const std::vector<T*>, and you can have const iterators that give you access to those mutable pointers. And all of the IPDL stuff seems to be passing around more const containers of mutable data.

Which I think means that this *has* to be const. Unless I'm confused, this is the same as what you pointed out with "if/when BufferList ever gets a proper ConstIter type"?

In the new code, the JSStructuredCloneData is const, so we can't modify the data pointers it's hanging onto, but we can freely modify the data pointed to by those pointers, and that's what we're doing here. Which is all kind of awful, and perhaps I should make a ConstJSStructuredCloneData, but I wouldn't use it here.

Hm... I wonder if WriteBytes should be on the Iterator, since it updates the iterator position. (I wondered why it was taking a non-const Iterator&, before trying it and realizing that that's kind of the point of this -- write at the iterator position and advance past the written data.) But that's a direct BufferList::IterImpl, so I'd need to add it there... bleh.

It's kind of weird that I have a non-const WriteBytes that appends to the whole BufferList, and a const WriteBytes that is used to update existing storage within the BufferList. I think I'd better rename one.
It seems like I'll need to do more work here before I can land, due to IndexedDB scope weirdness, so I'm forking off (public) bug 1455071 to land the simpler refactoring stuff now.
From comment 44 it sounds like this is wontfix for 60.
\o/

Finally managed to get this happy. https://treeherder.mozilla.org/#/jobs?repo=try&revision=c65f0b3d6932c5ef6ce5f7af6c82a37e42927765

baku, there may be some places in this patch where we could know the scope earlier. Please let me know if you see any.

Also note that the usage of JSAutoStructuredCloneBuffer gets a little bit more awkward in this patch, but that's ok because I'm hoping to remove that class entirely in a later patch (which is in my stack after the security bug is fixed.)
Attachment #8973857 - Flags: review?(jorendorff)
Attachment #8973857 - Flags: review?(amarchesini)
Attachment #8965904 - Attachment is obsolete: true
Attachment #8973857 - Flags: review?(amarchesini) → review+
Comment on attachment 8973857 [details] [diff] [review]
Move scope into JSStructuredData

Review of attachment 8973857 [details] [diff] [review]:
-----------------------------------------------------------------

Great. Show me more!

::: dom/base/nsFrameMessageManager.cpp
@@ +895,5 @@
>        }
>  
>        if (aRetVal) {
>          StructuredCloneData* data = aRetVal->AppendElement();
> +        data->InitScope(JS::StructuredCloneScope::DifferentProcess);

I think this use of InitScope can be removed by writing to a local StructuredCloneData first and then doing `aRetVal->AppendElement(Move(dat))`.

Then I think InitScope can be deleted.

(Follow-up patch would be fine.)

::: ipc/glue/IPCMessageUtils.h
@@ +68,5 @@
>  
>  struct SerializedStructuredCloneBuffer final
>  {
> +  SerializedStructuredCloneBuffer()
> +   : data(JS::StructuredCloneScope::Unassigned)

I'd be happiest if Unassigned could be removed. It looks as though SerializedStructuredCloneBuffer is used exclusively for IPC, so--very naively--I would guess DifferentProcess is the only right answer here. Worth a try?

If that doesn't work, consider making `data` a Maybe?

::: js/public/StructuredClone.h
@@ +396,5 @@
>  
> +    // The (address space, thread) scope within which this clone is valid. Note
> +    // that this must be either set during construction, or start out as
> +    // Unassigned and transition once to something else.
> +    JS::StructuredCloneScope scope_;

\o/ At last!

@@ +429,5 @@
>          , ownTransferables_(OwnTransferablePolicy::NoTransferables)
>      {}
> +    MOZ_IMPLICIT JSStructuredCloneData(BufferList&& buffers)
> +        : JSStructuredCloneData(mozilla::Move(buffers), JS::StructuredCloneScope::Unassigned)
> +    {}

Definitely remove this signature. I don't think there are any call sites. If there are, explicit is better than implicit for this half-initialized-state option...

@@ +450,5 @@
> +
> +    void initScope(JS::StructuredCloneScope scope) {
> +        MOZ_ASSERT(Size() == 0, "initScope() of nonempty JSStructuredCloneData");
> +        if (scope_ != JS::StructuredCloneScope::Unassigned)
> +            MOZ_ASSERT(scope_ == scope, "Cannot change scope after it has been initialized");

lol hideous :D

Do we really call initScope() multiple times in places where we want to accept that? Sounds awful. If not, instead assert that `scope_ == Unassigned`. (but actually, tear down this code)
Attachment #8973857 - Flags: review?(jorendorff) → review+
Attachment #8965860 - Attachment is obsolete: true
Comment on attachment 8976679 [details] [diff] [review]
discardTransferables is a no-op for DifferentProcess clones,

Note that I've been cheating and landing the patches here without getting sec-approval. But they're all just refactoring to make the actual fix possible; I could have easily put them in a separate sec-none bug.

Requesting sec-approval for the actual fix:

[Security approval request comment]
> How easily could an exploit be constructed based on the patch?

It provides an idea of where to attack. To actually exploit this, you'd need to first compromise a content process, then get moderately clever with what it sends back to the parent process.

> Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

Not any more. After reading this question, I updated the patch to remove the helpful suggestion of how to exploit it from my comment. Carrying over r+.

> Which older supported branches are affected by this flaw?

all

> Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

I expect them to be moderately annoying; the refactoring patches touch enough code that there will be minor fixups required. Once they pass compile + test, they should not be very risky at all.

> How likely is this patch to cause regressions; how much testing does it need?

Moderately. Our existing test suite is pretty good at exercising this stuff, though, so I would expect any issues to arise pretty quickly.
Attachment #8976679 - Flags: sec-approval?
Attachment #8976679 - Flags: review+
(In reply to Steve Fink [:sfink] [:s:] (PTO June 31) from comment #49)
> Comment on attachment 8976679 [details] [diff] [review]
> discardTransferables is a no-op for DifferentProcess clones,
> 
> Note that I've been cheating and landing the patches here without getting
> sec-approval. But they're all just refactoring to make the actual fix
> possible; I could have easily put them in a separate sec-none bug.

Sigh.

> > Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
> 
> I expect them to be moderately annoying; the refactoring patches touch
> enough code that there will be minor fixups required. Once they pass compile
> + test, they should not be very risky at all.

I'm giving sec-approval+ for trunk. We should get this on beta and ESR60 for sure. I'd *like* this on ESR52 but if it winds up being too difficult or potentially destabilizing, we could survive without it there.
Attachment #8976679 - Flags: sec-approval? → sec-approval+
Rollup for beta landing. I've been slow enough that the stack of prerequisites already landed, so this is very straightforward.
(In reply to Al Billings [:abillings] from comment #50)
> (In reply to Steve Fink [:sfink] [:s:] (PTO June 31) from comment #49)
> > Comment on attachment 8976679 [details] [diff] [review]
> > discardTransferables is a no-op for DifferentProcess clones,
> > 
> > Note that I've been cheating and landing the patches here without getting
> > sec-approval. But they're all just refactoring to make the actual fix
> > possible; I could have easily put them in a separate sec-none bug.
> 
> Sigh.

Oops. My apologies; that description was incorrect. I already *did* split out and land the non-security sensitive patches into bug 1455071, but neglected to obsolete the relevant patches on this bug. Let me know if you still think I botched this process, but I *think* I got it right. (This time, anyway.) I can obsolete the changes I moved out of this bug, if it would make things more clear.

I now have an esr60 backport that compiles, but I'm running it through try before posting: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5ba25b8cdd20fdeeeacbac632aec210f6976b5fb

It includes bug 1455071, bug 1433642, bug 1456604 and its fixup bug 1458320, as well as the security patches from this bug.

I haven't attempted esr52 yet.
Comment on attachment 8979402 [details] [diff] [review]
rollup patch for beta

Approval Request Comment
[Feature/Bug causing the regression]: structured cloning (from very early on)
[User impact if declined]: enabling security breach escalation
[Is this code covered by automated tests?]: correct usage is. Incorrect usage, like what this guards against, is not, other than some fuzz tests that are not landed in the tree.
[Has the fix been verified in Nightly?]: It's in Nightly, hasn't broken anything, and fixes a fuzz test
[Needs manual test from QE? If yes, steps to reproduce]: Probably not. It would require some very specific domain knowledge to mutate buffers in just the right way. decoder has a fuzzer that is suitable for testing this, but he already has some known failing tests to compare against.
[List of other uplifts needed for the feature/fix]: for beta, only this patch is needed
[Is the change risky?]: Moderately.
[Why is the change risky/not risky?]: It changes quite a bit of code, though the changes should not result in any behavioral differences. And we have pretty good testing of the proper usage.
[String changes made/needed]: None.
Attachment #8979402 - Attachment description: Move scope into JSStructuredData, → rollup patch for beta
Attachment #8979402 - Flags: approval-mozilla-beta?
Comment on attachment 8979402 [details] [diff] [review]
rollup patch for beta

Fixes a sec-high and well-covered by tests. Approved for 61.0b8.
Attachment #8979402 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Also, I took a look at your ESR60 Try push - only the SM(rust) failure looked relevant to your changes.
[Approval Request Comment]
User impact if declined: opportunity for security breach escalation from content -> parent process
Fix Landed on Version: 62
Risk to taking this patch (and alternatives if risky): It changes a bunch of code. Profiles and IndexedDB storage use this stuff. The only alternative is to leave it unfixed.
String or UUID changes made by this patch: None

Passes try (including the SM(rust) job that failed on the first attempt.)
Attachment #8980395 - Flags: approval-mozilla-esr60?
I have a backport to esr52 that compiles and passes JS tests, but fails to even build with -Werror (a minor thing, but just an indicator of status.) It required quite a bit more work than esr60 did. The underlying BufferList had a slightly different API back then.

RyanVM, the changes in these patches span 4 different bugs. Is it better to do a single rollup, a rollup per-bug, or leave it as separate patches? (Given that it looks like you just landed on beta as a single rollup, I'm guessing you prefer that.)
for comment 59
Flags: needinfo?(ryanvm)
A single roll-up is fine.
Flags: needinfo?(ryanvm)
Comment on attachment 8980395 [details] [diff] [review]
backport for esr60

sec-high fix, approved for 60.1esr
Attachment #8980395 - Flags: approval-mozilla-esr60? → approval-mozilla-esr60+
[Approval Request Comment]
Same as for esr60
Attachment #8981586 - Flags: approval-mozilla-esr52?
Comment on attachment 8981586 [details] [diff] [review]
backport rollup for esr52

Approved for ESR 52.9.
Attachment #8981586 - Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
Whiteboard: [adv-main61+][adv-esr52.9+][adv-esr60.1+]
Flags: qe-verify-
Whiteboard: [adv-main61+][adv-esr52.9+][adv-esr60.1+] → [adv-main61+][adv-esr52.9+][adv-esr60.1+][post-critsmash-triage]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.