Closed Bug 636078 Opened 13 years ago Closed 13 years ago

Crash [@ TypedArrayTemplate<double>::copyFrom] with fake length attribute

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- final+

People

(Reporter: jruderman, Assigned: Waldo)

References

Details

(Keywords: crash, testcase, Whiteboard: [sg:critical?][hardblocker][has patch][can land], fixed-in-tracemonkey)

Crash Data

Attachments

(3 files)

Debug-only?
Attached file stack trace
Confirmed in 4.0b11 on Win7. Needs security analysis at least to see if it should block. Anyone have spare cycles?
blocking2.0: --- → ?
Assignee: general → jwalden+bmo
Status: NEW → ASSIGNED
OS: Mac OS X → All
Hardware: x86_64 → All
If it turns out to not be sg:crit, please renominate so we can re-evaluate blocking.
blocking2.0: ? → final+
Whiteboard: [sg:critical?] → [sg:critical?][hardblocker]
(Update: it is.)
So, fundamentally the problem is a typed array has four components:

* the typed array object
* the TypedArray* for it as its private
* the ArrayBuffer object for the TypedArray*
* the ArrayBuffer* for the array buffer object as its private

We create the privates before the objects sometimes.  And here, we keep a pointer to the ArrayBuffer object in the TypedArray* before we associate the TypedArray* with the typed array object.  With the way the algorithms are structured now, the buffer object pointer doesn't live on the stack for the whole time the typed array's being constructed -- it gets squirreled away in the TypedArray* and nowhere else for a bit.  And during that we can do stuff that can GC -- getting properties here, or creating object pointers, and other times.

Fixing all that is a non-trivial task, at least not to do it in a way that is convincingly correct.  You could conceivably auto-root the buffer object.  Sometimes.  Maybe.  But to be sure you get all of them?  Much harder, much more room for error.

So, much as it pains me, I'm posting a patch that adjusts the organization to never create the intermediate pointers before the actual objects.  It's a bit big.  It might still have some bugs.  But I am much more confident in its correctness as regards this particular issue than I would be with some sort of expedient and only half-principled hack.
Attached patch PatchSplinter Review
This passes shell tests on my machine.  I develop on a Linux machine without a supported graphics card, so I can't run the WebGL tests locally.  I'll copy the patch to another machine and do some running there as well.
Attachment #514711 - Flags: review?(vladimir)
(Unrelated aside: Having worked on this patch, I can much better understand/appreciate Brendan's, and apparently ECMA's, dislike for the type-punning that is intrinsically part of typed arrays as views on untyped buffers.)
I should note, in case it's not entirely obvious, that I removed some report-OOM code because cx->malloc, cx->calloc, and cx->create all already report OOM on failure, so those were double-reports.
content/canvas/test/webl runs and passes on a Windows machine with the patch.
Attachment #514711 - Flags: review?(jorendorff)
Comment on attachment 514711 [details] [diff] [review]
Patch

Looks ok to me (sorry!), but should get a review from jorendorff as well -- some of this has been swapped out of my head for the most part.

(Note that this would get a lot simpler with andreas' work to remove the separate ArrayBuffer object, as well as removing the private objects.)

The test passing is a better indication than webgl specifically, because most webgl apps' usage of typed arrays is pretty simple.  You could try running https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/sdk/tests/conformance/array-unit-tests.html -- I don't think it depends on creating a webgl context, just exercises typed arrays some more.
Attachment #514711 - Flags: review?(vladimir) → review+
Whiteboard: [sg:critical?][hardblocker] → [sg:critical?][hardblocker][has patch]
So what's left to do here. Are we still waiting on another review?
Comment on attachment 514711 [details] [diff] [review]
Patch

In TypedArrayTemplate::init:
>+        if (!obj || !copyFrom(cx, obj, other, len))
>+            return false;
>+        return obj;

The return type is bool, so just
    return obj && copyFrom(...);

or if you think it's clearer:
    if (!obj)
        return false;
    return copyFrom(...);

>-    bool
>-    copyFrom(JSContext *cx, TypedArray *tarray, jsuint offset = 0)
>+    static bool
>+    copyFrom(JSContext *cx, JSObject *thisTypedArrayObj, TypedArray *tarray, jsuint offset = 0) // XXX

Why the XXX?

r=me with those nits addressed.
Attachment #514711 - Flags: review?(jorendorff) → review+
The fundamental mistake that led to this bug is how TypedArray is trying to be a normal, typical C++ class, and it just happens to be something you can attach to a JSObject. That's not the best way to do things. GC bugs are a significant issue and there are others. So Waldo's changes make that a lot less true.

The code with the patch feels a little messy, sort of halfway rehabilitated, but we can worry about it later.
Whiteboard: [sg:critical?][hardblocker][has patch] → [sg:critical?][hardblocker][has patch][can land]
I added the XXX because it was unclear to me, from the code I'd read, if offset was ever actually passed as optional, to recheck that.  It's always passed, so keeping it optional just makes things more complicated -- removed the optional and the comment.

I looked for the first code.  Did you mean createTypedArrayWithOffsetLength rather than init?  (The method returns false a bunch of places where it really should be returning NULL, which I've fixed locally.)  Since that method returns JSObject* I'm inclined to do the s/false/NULL/ thing and leave it as-is.

The last sentence of comment 13 is correct.  The way we range-check byte offsets and lengths and all that, particularly in createTypedArrayWithOffsetLength, requires an ArrayBuffer*.    Probably we shouldn't be pulling that out, then ignoring it to re-pull it out to create the new typed array viewing it.  Other stuff has similar weirdnesses.  But it'd be more change to reorganize this all further, more chance for mistake (since most of it is bounds-checking arithmetic and all that), so it stays a bit odd for now.

I'll land this in the next half hour or so, after I head into the office, assuming I can get a clarification on the second paragraph here.
This patch is rather large to take at this point in the game, and we seem to be on the fence about whether it is worth taking. We'd certainly not take a patch this size if this weren't a security bug.

Do Jason, Brendan, or anyone else think it would be worth seeing this spot-fixed with a patch using Anchors? Jeff indicated that it would be about 30 minutes to 1 hour of work, but didn't seem very positive about the approach.
For good measure, I went over this again with a fine-toothed comb just to see if a look a couple days later, with the patch flushed somewhat from memory, pointed out anything.  I specifically compared old control flows to new.  In other words, I didn't try to read the new code for behaving correctly, but rather I read the new code paths with an eye to whether they differed from the old.  I couldn't find any differences in doing so.

In the interest of doing less unbounded fence-sitting, plus maybe getting this out for people to pound on (Gary?  ;-) ), I'm going to push this fully-reviewed patch to TM later tonight, unless I get different feedback along the lines in comment 15.  (Concerning the question in comment 14, I think this is due to the name of the function changing pre-patch to post-patch in the hunk jorendorff meant.)  This will get it in a TM nightly for people to use, and it'll more easily get it in front of Gary for any testing he can do.  We can always back out, revert, whatever, if argument after the fact decides to do that.  In the meantime the extra data may well be helpful.
There's no zero-risk alternative. I'll review the patch over the weekend, but if it has jorendorff's r+ then you're good to go. What seems rookie-ish is having to rewrite the code to understand it. The endgame trick is to understand it even if it's not factored cleanly. If the refactoring is the only way to fix it, great. If there's a smaller patch, then by a crude metric of lines of code touched or rewritten correlated to new bugs, the smaller patch can (not must) win in the end of release game.

Sayrer and I pushed for an understanding of the code as it was and a minimal patch and got nothing. So did Jason. Now that the cost of refactoring has been sunk it seems kind of jerky and possibly wasteful to question what happened. If nothing goes wrong, everyone wins. If your changes introduce a regression, then "I told you so" is cold comfort for us.

Consider dmandelin's patching jspcre. He was not about to rewrite that mess, and rightly so. But to fix a blocker he didn't need to.

Reading and understanding a finite control flow graph and one-file codebase, in order to propose (not decide on) a minimal change, is a necessary skill. I think you have the brainpower for it. That you couldn't or wouldn't do it, and instead refactored, is a general concern. But if reviewers are on the fence and I haven't had time to review, I won't stand in the way of this patch. I would appreciate a serious response to this point, though -- not just the silence we your peers got, in person and on IRC.

/be
And btw, I go too far in my cleanups, even lately (see bug 635548), so this is not me being perfect and judging others by a different standard. No one is that perfect. You don't want to walk that tightrope. Thus the general rule to minimize patch scope in the end-game. Another recent example: bug 635200.

/be
I changed the manner in which typed arrays and array buffers are allocated internally, but most logic is untouched, unaltered, and unmoved.  The allocation-manner bits are rewrite in a sense.  The rest does the least it can to accommodate that (modulo some of that OOM-fixing, although it wasn't really the point of contention).

Regarding understanding finite control flow...I try to err on the side of believing I know and truly understand less than I think I do.  More so than you do as regards me, it seems.  I think there's more risk of my misreading and not anchoring enough places than of getting logical flow in a larger patch right.  And I don't see an easy verification step like gczeal here, because when I ran this (even with gzceal(2)) things didn't go immediately awry on the first iteration of getting properties to set in the new typed array, for reasons I didn't understand then and don't understand now.

Minimal is good this late (cf. my bug 635200 suggestion).  Yet if the code doesn't let you easily verify minimal-fix correctness at a glance, there's tension between targeted-but-not-clear and broader-but-clear.  I also lean more toward clear than others do, even late, it seems.

Do you consider this a serious response?
For extra surety I did a mini-run on try on Linux64 debug, and it came back green as expected.  With that in hand I pushed to TM:

http://hg.mozilla.org/tracemonkey/rev/8323a963fd6c
Whiteboard: [sg:critical?][hardblocker][has patch][can land] → [sg:critical?][hardblocker][has patch][can land][fixed-in-tracemonkey]
Whiteboard: [sg:critical?][hardblocker][has patch][can land][fixed-in-tracemonkey] → [sg:critical?][hardblocker][has patch][can land], fixed-in-tracemonkey
(In reply to comment #16)
> In the interest of doing less unbounded fence-sitting, plus maybe getting this
> out for people to pound on (Gary?  ;-) ), I'm going to push this fully-reviewed
> patch to TM later tonight, unless I get different feedback along the lines in
> comment 15.  (Concerning the question in comment 14, I think this is due to the
> name of the function changing pre-patch to post-patch in the hunk jorendorff
> meant.)  This will get it in a TM nightly for people to use, and it'll more
> easily get it in front of Gary for any testing he can do.  We can always back
> out, revert, whatever, if argument after the fact decides to do that.  In the
> meantime the extra data may well be helpful.

Pounding on the updated TM repo changeset which has this patch now... First glance seems to be ok but I can't guarantee about the future (as always). :)
http://hg.mozilla.org/mozilla-central/rev/8323a963fd6c
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Depends on: 637643
This patch apparently broke subarray().  See bug 637643.

We need more typed array regression tests, clearly.  :(
Crash Signature: [@ TypedArrayTemplate<double>::copyFrom]
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: