Closed Bug 1264613 Opened 3 years ago Closed 3 years ago

Crash in js::GetBuiltinClass with web workers, typed arrays, console.log

Categories

(Core :: JavaScript Engine, defect, critical)

x86_64
Linux
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla49
Tracking Status
firefox45 --- wontfix
firefox46 --- wontfix
firefox47 --- verified
firefox48 --- verified
firefox49 --- verified
firefox-esr38 --- unaffected
firefox-esr45 --- affected

People

(Reporter: pipcet, Assigned: pipcet)

References

Details

(5 keywords, Whiteboard: [sg:dos])

Crash Data

Attachments

(4 files, 2 obsolete files)

Attached file crashing worker script
This worker script reliably crashes Firefox:

function f(global)
{
    this.global = global;
    this.heap = new ArrayBuffer(0);
    this.HEAP8 = new Int8Array(this.heap);
    g = this;
}

console.log(new f(this));

(saved as worker.js, invoked via <script>new Worker("worker.js")</script> on a local file system).

g is a global variable. This crashes only in worker scope, only when the ArrayBuffer and Int8Array are both present, only when the global variable is assigned to, and only when the console.log statement is present; I haven't had time yet to track this down, so I don't know which component the bug really is in.
CR FF48:
https://crash-stats.mozilla.com/report/index/8628969f-bfc3-4f73-94a8-b0fd82160414
Severity: normal → critical
Status: UNCONFIRMED → NEW
Crash Signature: [@ js::GetBuiltinClass ]
Ever confirmed: true
Summary: Crash with web workers, typed arrays, console.log → Crash in js::GetBuiltinClass with web workers, typed arrays, console.log
Seems to be a JS crash:


Frame 	Module 	Signature 	Source
0 	xul.dll 	js::GetBuiltinClass(JSContext*, JS::Handle<JSObject*>, js::ESClassValue*) 	js/src/jsfriendapi.cpp
1 	xul.dll 	`anonymous namespace'::TypedArrayObjectTemplate<signed char>::fromBufferWithProto 	js/src/vm/TypedArrayObject.cpp
2 	xul.dll 	JSStructuredCloneReader::readTypedArray(unsigned int, unsigned int, JS::MutableHandle<JS::Value>, bool) 	js/src/vm/StructuredClone.cpp
3 	xul.dll 	JSStructuredCloneReader::startRead(JS::MutableHandle<JS::Value>) 	js/src/vm/StructuredClone.cpp
4 	xul.dll 	JSStructuredCloneReader::read(JS::MutableHandle<JS::Value>) 	js/src/vm/StructuredClone.cpp
5 	xul.dll 	ReadStructuredClone(JSContext*, unsigned __int64*, unsigned int, JS::MutableHandle<JS::Value>, JSStructuredCloneCallbacks const*, void*) 	js/src/vm/StructuredClone.cpp
6 	xul.dll 	JS_ReadStructuredClone(JSContext*, unsigned __int64*, unsigned int, unsigned int, JS::MutableHandle<JS::Value>, JSStructuredCloneCallbacks const*, void*) 	js/src/vm/StructuredClone.cpp
7 	xul.dll 	JSAutoStructuredCloneBuffer::read(JSContext*, JS::MutableHandle<JS::Value>, JSStructuredCloneCallbacks const*, void*) 	js/src/vm/StructuredClone.cpp
8 	xul.dll 	mozilla::dom::StructuredCloneHolderBase::Read(JSContext*, JS::MutableHandle<JS::Value>) 	dom/base/StructuredCloneHolder.cpp
9 	xul.dll 	mozilla::dom::ConsoleCallDataRunnable::ProcessCallData(JSContext*) 	dom/base/Console.cpp
10 	xul.dll 	mozilla::dom::ConsoleCallDataRunnable::RunConsole(JSContext*, nsPIDOMWindowOuter*, nsPIDOMWindowInner*) 	dom/base/Console.cpp
Component: DOM: Workers → JavaScript Engine
I've looked into this some more, mostly in StructuredClone.cpp, and it seems the problem is the way that the worker scope global object turns into a string ("[object DedicatedWorkerGlobalScope]") during the structured cloning serialization. That appears to confuse our object counts, and a backreference to the ArrayBuffer (which has been constructed at this point) is misinterpreted as a backreference to the TypedArray which is being constructed; but that's a placeholder value (and the wrong object anyway), so everything goes wrong.

It's a theory, at least.

That would suggest that an easy fix might be to make dom/base/Console.cpp:CustomWriteHandler write a string object rather than a string value, if I understand correctly. I'll try that next.
As confirmation, this worker script:

o = {};
p = {};

console.log([o, this, p, this]);

produces:

Array [ Object, "[object DedicatedWorkerGlobalScope]", Object, Object ]

when it should produce:

Array [ Object, "[object DedicatedWorkerGlobalScope]", Object, "[object DedicatedWorkerGlobalScope]" ]


So I'm pretty convinced now that the bug is that when JSStructuredCloneWriter::startWrite calls callbacks->write for an object, the callback must write an object since startObject has already assigned a numeric id for backreferences to this object. The callback for DedicatedWorkerGlobalScope fails to do this and writes a string instead, rendering all further object backreferences in the same serialization packet invalid.

I'm not sure how many custom write handlers there are or are expected to be and whether it's sufficient to review them once, or whether it's necessary to add an extra check for this (in debug mode, at least?) or rename the ->write method to something like ->writeObject to clarify that you're not allowed to write a string instead.
So here's a first attempt at fixing this bug. I think the bug is going to reappear and cause hard-to-track-down issues unless we change the API to make it explicit whether a serialized object will deserialize as an object or as a non-object value.

Serializing an object as a non-object value is something we want to do, particularly for our implementation of console.log, so removing that feature seems like a bad idea. Similarly, writing String objects instead of strings works, but would need further changes to the console logging code to (imperfectly) distinguish objects-serialized-as-Strings and actual String objects.

This patch changes the API, but does not change the serialized data stream format. Invalid serialization packets caused by the previous code will remain invalid and cause errors or invalid data, but no longer crashes, upon deserialization.

The API change consists of an additional bool *wroteObject argument to those methods that write serialized data; if such a method returns true, it MUST set *wroteObject to true if the serialized data will deserialize as an object or to false if the serialized data will deserialize as a non-object JS value.  Existing methods set *wroteObject to true with the exception of the Console.cpp
serialization method which sets it to false if the object is serialized as a string.

Any comments would be much appreciated. In particular, is there a better way of doing all this? And should we let *wroteObject default to true, which would simplify most users of this code but, IMHO, make it more likely for the bug to reappear when a user decides to write non-object values?
Rediff against current inbound tree.
Attachment #8742065 - Attachment is obsolete: true
Comment on attachment 8742135 [details] [diff] [review]
0001-Bug-1264613-Allow-object-to-nonobject-serialization.patch

Review of attachment 8742135 [details] [diff] [review]:
-----------------------------------------------------------------

r=me on the js/src parts. This looks fine to me and the analysis makes sense.

pipcet, did you mean to put this up for review?
Attachment #8742135 - Flags: review+
Thanks!

> pipcet, did you mean to put this up for review?

If the analysis makes sense to you, I think this should go in, yes. Do I set review back to ? for the non-js/src parts?
Comment on attachment 8742135 [details] [diff] [review]
0001-Bug-1264613-Allow-object-to-nonobject-serialization.patch

Review of attachment 8742135 [details] [diff] [review]:
-----------------------------------------------------------------

Setting r?bz for the non-js/src parts.
Attachment #8742135 - Flags: review?(bzbarsky)
Instead changing all of this with an extra param that is quite confusing about the meaning, what about if manage the 'anomaly' differently. I propose this:

Add an extra method in the JS clone API such as:

void JS_ObjectNotWrittenByStructuredClone() or a better name.
This method can be called just during the write callback in this way:
... somewhere in the context definition:

enum {
  NOT_IN_CALLBACK;
  IN_CALLBACK;
  OBJECT_NOT_WRITTEN;
} objectWritten;

... then:

context()->objectWritten = IN_CALLBACK;
if (!callbacks->write(context(), this, obj, closure))
    return false;

if (context()->objectWritten == OBJECT_NOT_WRITTEN)
    memory.remove(memory.lookup(obj));

context()->objectWritten = NOT_IN_CALLBACK;

It seems cleaner to me.
Comment on attachment 8742135 [details] [diff] [review]
0001-Bug-1264613-Allow-object-to-nonobject-serialization.patch

Review of attachment 8742135 [details] [diff] [review]:
-----------------------------------------------------------------

Consider what I propose in my latest comment.

::: dom/base/Console.cpp
@@ +531,5 @@
>  
>    virtual bool CustomWriteHandler(JSContext* aCx,
>                                    JSStructuredCloneWriter* aWriter,
> +                                  JS::Handle<JSObject*> aObj,
> +                                  bool *wroteObject) override

Then...
1. aWroteObject here and everywhere else.
2. bool*<space> and not bool<space>*
Attachment #8742135 - Flags: review?(bzbarsky) → review-
Group: dom-core-security
(In reply to Andrea Marchesini (:baku) from comment #11)
> Instead changing all of this with an extra param that is quite confusing
> about the meaning, what about if manage the 'anomaly' differently.

This is just speculation on my part, but I think having a "confusing" parameter makes it more likely that people will actually check they're using it correctly. Merely documenting that it's a requirement failed once before...

> I propose
> this:
> 
> Add an extra method in the JS clone API such as:
> 
> void JS_ObjectNotWrittenByStructuredClone() or a better name.
> This method can be called just during the write callback in this way:
> ... somewhere in the context definition:
> 
> enum {
>   NOT_IN_CALLBACK;
>   IN_CALLBACK;
>   OBJECT_NOT_WRITTEN;
> } objectWritten;
> 
> ... then:
> 
> context()->objectWritten = IN_CALLBACK;
> if (!callbacks->write(context(), this, obj, closure))
>     return false;
> 
> if (context()->objectWritten == OBJECT_NOT_WRITTEN)
>     memory.remove(memory.lookup(obj));
> 
> context()->objectWritten = NOT_IN_CALLBACK;
> 
> It seems cleaner to me.

Can write callbacks be called reentrantly, though? That would clobber a value that's only stored once per context(), and it's unclear which object would not be written.

How about something like

void JS_ObjectNotWrittenByStructuredClone(JSStructuredCloneWriter *w, HandleObject obj)
{
    w->memory.remove(w->memory.lookup(obj));
}

(I'd still prefer the extra argument, precisely because it requires all users to figure out what the parameter is for and what to return in it, but that's just a matter of opinion).
In terms of a security rating is this issue always going to result in a NULL deference? Or could we see something worse?
I've gone ahead and implemented the simple version of the method. It'd be great if you could have a look, and thanks for the previous review!
Attachment #8742135 - Attachment is obsolete: true
Attachment #8742926 - Flags: review?(amarchesini)
Attachment #8742926 - Flags: review?(amarchesini) → review+
(In reply to Tyson Smith [:tsmith] from comment #15)
> In terms of a security rating is this issue always going to result in a NULL
> deference? Or could we see something worse?

No definite answer, but I've had a look and can't see any immediate way to cause a buffer overflow or anything like that. The pointer we dereference is the payload of an undefined Value, so it will always be NULL. We do produce incorrect deserialized data in some cases, but this is limited to the Console API.
Thanks for the review. May I ask how to continue at this point? I don't have commit access or anything, and I'm not sure how the procedure differs for security-relevant bugs.
For security bugs, the next step is asking for sec-approval to check in.  At least for critical/high ones... (which this one might be; it's still untriaged, apparently).

So load https://bugzilla.mozilla.org/attachment.cgi?id=8742926&action=edit and then toggle the "sec‑approval" dropdown under "Flags" to "?" and fill in the questionnaire.
Comment on attachment 8742926 [details] [diff] [review]
0001-Bug-1264613-Allow-object-to-nonobject-serialization.patch

[Security approval request comment]
How easily could an exploit be constructed based on the patch?

Exploits limited to causing a NULL pointer dereference can be constructed easily based on the patch.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

The check-in comment effectively describes how to exploit the security problem to cause a NULL pointer dereference.

Which older supported branches are affected by this flaw?

I tested Firefox 45 ESR, Firefox 46, Firefox 47, and all were affected.

If not all supported branches, which bug introduced the flaw?

Bug 1184541.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

No. I expect them to be easy to create, effectively identical to this patch, and low-risk.

How likely is this patch to cause regressions; how much testing does it need?

I consider this patch relatively low-risk since it is small and affects only cases for which the previous code was clearly in error.
Attachment #8742926 - Flags: sec-approval?
We need a security rating on this before we even know if sec-approval applies.

If this is a simple null deref, that isn't that bad. It appears that the thinking is that it is only that. Is that correct?

Adding Dan for opinions.
We don't need to treat this as a sec bug -- land away!
Blocks: 1184541
Group: dom-core-security
Flags: needinfo?(dveditz)
Whiteboard: [sg:dos]
Attachment #8742926 - Flags: sec-approval?
I just verified the patch still applies and appears to work, but I'm not sure what the next step is. Sorry about that.
The next step is someone checks in the patch.  You can request the sheriffs to do it by adding "checkin-needed" to the keywords field.  Then once it gets checked in, we should probably request approval for the relevant supported branches that this needs to be backported to.
Keywords: checkin-needed
Andrea, can you nominate for branch stuff as needed?
Flags: needinfo?(amarchesini)
https://hg.mozilla.org/mozilla-central/rev/0504bc4398f6
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
Assignee: nobody → pipcet
Flags: needinfo?(amarchesini)
Comment on attachment 8742926 [details] [diff] [review]
0001-Bug-1264613-Allow-object-to-nonobject-serialization.patch

Approval Request Comment
[Feature/regressing bug #]: Console API in workers
[User impact if declined]: A crash
[Describe test coverage new/current, TreeHerder]: none
[Risks and why]: none: the patch informs the JSStructuredClone algorithm that an object has not been written/serialized, so that its pointer is removed from a hashtable used to keep objects alive. This is needed for Console API in workers because we stringify objects instead serialize them somehow.
[String/UUID change made/needed]: none
Attachment #8742926 - Flags: approval-mozilla-aurora?
Comment on attachment 8742926 [details] [diff] [review]
0001-Bug-1264613-Allow-object-to-nonobject-serialization.patch

Crash fix, pretty low volume but this issue has potential to cause other problems, so it seems worth uplifting as far as aurora.
Attachment #8742926 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Hi Andrea, should we consider uplifting to Beta47 as well? Or is this risky and should be a wontfix for beta?
Flags: needinfo?(amarchesini)
Comment on attachment 8742926 [details] [diff] [review]
0001-Bug-1264613-Allow-object-to-nonobject-serialization.patch

Approval Request Comment
[Feature/regressing bug #]: Console API in workers
[User impact if declined]: a crash
[Describe test coverage new/current, TreeHerder]: none
[Risks and why]: This patch is stable on m-i/m-c and it's not too complex. I don't see big risk to uplift to beta.
[String/UUID change made/needed]: none
Flags: needinfo?(amarchesini)
Attachment #8742926 - Flags: approval-mozilla-beta?
Comment on attachment 8742926 [details] [diff] [review]
0001-Bug-1264613-Allow-object-to-nonobject-serialization.patch

Crash fix, Beta47+
Attachment #8742926 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flags: qe-verify+
Reproduced with 48.0a1 from 2016-04-18, under Windows 10 64-bit ⇒ bp-21360e17-fdd7-4594-b348-79dd12160524
No crashes encountered with 47 beta 8 (Build ID: 20160523113146) nor with latest Nightly 49.0a1, across platforms [1].
No new crash reports in Socorro after this fix in the last 14 days [2]. 

[1] Windows 10 64-bit, Mac OS X 10.11.1 and Ubuntu 12.04 64-bit
[2] https://crash-stats.mozilla.com/report/list?range_unit=days&range_value=14&signature=js%3A%3AGetBuiltinClass#tab-reports
Status: RESOLVED → VERIFIED
Also confirming the fix for Fx 48.0b2, build ID: 20160620091522.
Flags: qe-verify+
Crash volume for signature 'js::GetBuiltinClass':
 - nightly (version 50): 2 crashes from 2016-06-06.
 - aurora  (version 49): 0 crash from 2016-06-07.
 - beta    (version 48): 41 crashes from 2016-06-06.
 - release (version 47): 494 crashes from 2016-05-31.
 - esr     (version 45): 7 crashes from 2016-04-07.

Crash volume on the last weeks:
             Week N-1   Week N-2   Week N-3   Week N-4   Week N-5   Week N-6   Week N-7
 - nightly          0          1          1          0          0          0          0
 - aurora           0          0          0          0          0          0          0
 - beta            13          7          8          3          3          2          1
 - release        135        124        146         33          4          6          3
 - esr              1          3          1          0          0          0          0

Affected platforms: Windows, Linux
You need to log in before you can comment on or make changes to this bug.