Need recursive support for structured clones

RESOLVED FIXED

Status

Core :: JavaScript Engine
Reported: 7 years ago
Last resolved: 6 years ago

People

(Reporter: Ben Turner (not reading bugmail, use the needinfo flag!), Assigned: khuey)

Tracking

Trunk
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 attachment, 2 obsolete attachments)

http://www.w3.org/Bugs/Public/show_bug.cgi?id=10878

The structured clone algorithm has been updated to support recursive objects; we need to make our implementation support that.
I think we should fix this for Firefox 4, but I don't know that I can claim this absolutely must block. I'd approve a patch that adds this, even if it's arguably a feature, but it's adding a feature that was recently added to the HTML5 spec, and it's purely an additive change.
Created attachment 535756 [details] [diff] [review]
Patch

Woo!  My first Spidermonkey patch.  Please don't be too rough :-)

This relies on the fact that if something is in "memory", then it will be in the Writer's array 'objs', and we can calculate and store an offset into that array.  Then, when we deserialize, since we preserve order, we can use that offset to find our object in the Reader's array 'objs'.

I moved the failing tests from check-errors.js to check-complex-object.js, except for the one that causes stack exhaustion in the recursive comparison routine :-(  There are also some tests added in the DOM postMessage structured clone tests.
Assignee: general → khuey
Status: NEW → ASSIGNED
Attachment #535756 - Flags: review?(jorendorff)
Nice work. This looks good.

Go ahead and rename MemorySet to MemoryMap (or just Memory),
since it's no longer a HashSet.

Instead of subtracting from objs.length(), just store the offset. So here:
>        return out.writePair(SCTAG_RECURSIVE_OBJECT, objs.length() - p->value);
instead just write p->value. And instead of these two lines:
>        JSObject *obj = &objs[objs.length() - data].toObject();
>        vp->setObject(*obj);
just say:
>        *vp = objs[data];

Don't throw away the test for "memory" containing lots of objects!
checkEq being recursive is not great; also it is not very picky (there
are several things it doesn't check). Below I've written a function
that's super picky and uses its own stack rather than the JS
stack. Please review it and use it to replace checkEq.

clone-complex-object.js has a mode-line that sets c-basic-offset to 4,
but then all the code in there is indented two spaces. Please make sure
it's all indented 4 spaces, as usual for JS test suite code.

>  assertEq(true, checkEq(a, deserialize(serialize(a))));
Here swap the two arguments to assertEq, please. The first arg is the actual
value and the second one is the expected value.

r=me with those changes.



function isClone(a, b) {
    var stack = [[a, b]];
    var memory = new WeakMap();
    var rmemory = new WeakMap();

    while (stack.length > 0) {
        var pair = stack.pop();
        var x = pair[0], y = pair[1];
        if (typeof x !== "object" || x === null) {
            // x is primitive.
            if (x !== y)
                return false;
        } else if (memory.has(x)) {
            // x is an object we have seen before in a.
            if (y !== memory.get(x))
                return false;
            assertEq(rmemory.get(y), x);
        } else {
            // x is an object we have not seen before.
            // Check that we have not seen y before either.
            if (rmemory.has(y))
                return false;

            // x and y must be of the same [[Class]].
            var xcls = Object.prototype.toString.call(x);
            var ycls = Object.prototype.toString.call(y);
            if (xcls !== ycls)
                return false;

            // This function is only designed to check Objects and Arrays.
            assertEq(xcls === "[object Object]" || xcls === "[object Array]",
                     true);

            // Compare objects.
            var xk = Object.keys(x), yk = Object.keys(y);
            if (xk.length !== yk.length)
                return false;
            for (var i = 0; i < xk.length; i++) {
                // We must see the same property names in the same order.
                if (xk[i] !== yk[i])
                    return false;

                // Put the property values on the stack to compare later.
                stack.push([x[xk[i]], y[yk[i]]]);
            }

            // Record that we have seen this pair of objects.
            memory.set(x, y);
            rmemory.set(y, x);
        }
    }
    return true;
}
Attachment #535756 - Flags: review?(jorendorff) → review+
Comment on attachment 535756 [details] [diff] [review]
Patch

khuey took the new isClone function, and it detected a bug in the algorithm!

Setting r-, just as a reminder.
Attachment #535756 - Flags: review+ → review-
Created attachment 539389 [details] [diff] [review]
Patch

This implements the alternative version we talked about on IRC.  The problem with the earlier patch is that it doesn't handle shared references that don't form a cycle, i.e. an object graph that is a DAG but not a tree.

e.g. b = {foo: bar}; a = [b, b, b]; results in a being cloned as an array of three distinct objects.

The earlier patch relied on every reference to a duplicate object being a reference to an object on the serialization stack, which is not the case here.  Instead, we must identify every object we serialize uniquely.
Attachment #535756 - Attachment is obsolete: true
Attachment #539389 - Flags: review?(jorendorff)
Created attachment 539390 [details] [diff] [review]
Patch

Right patch this time.
Attachment #539389 - Attachment is obsolete: true
Attachment #539390 - Flags: review?(jorendorff)
Attachment #539389 - Flags: review?(jorendorff)
Comment on attachment 539390 [details] [diff] [review]
Patch

Review of attachment 539390 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the comments below addressed.

::: js/src/jsclone.cpp
@@ +462,5 @@
>             out.writeBytes(abuf->data, abuf->byteLength);
>  }
>  
>  bool
> +JSStructuredCloneWriter::startObject(JSObject *obj, uint64_t *objCount)

Instead of manually maintaining objCount and passing the pointer around, how about using memory.count()? That should make the patch smaller.

@@ +470,5 @@
> +    /* Handle cycles in the object graph. */
> +    CloneMemory::AddPtr p = memory.lookupForAdd(obj);
> +    if (p)
> +        return out.writePair(SCTAG_BACK_REFERENCE_OBJECT, p->value);
> +    if (!memory.add(p, obj, (*objCount)++))

When you do this, check that memory.count() actually fits in 32 bits. Yeah, I know. Unlikely in the extreme. Go ahead and do it.

If we do overflow, do this:
    JS_ReportErrorNumber(cx, js_GetErrorMessage, NULL, JSMSG_NEED_DIET,
                         "object graph to serialize");
    return false;

@@ +781,5 @@
>          break;
>        }
>  
> +      case SCTAG_BACK_REFERENCE_OBJECT: {
> +        JS_ASSERT(data < allObjs.length());

Don't assert that the input is valid. Instead, check at runtime that it is correct, and if it's bogus, report an error (search for JSMSG_SC_BAD_SERIALIZED_DATA in this file to see examples).

::: js/src/jsclone.h
@@ +133,5 @@
>  
>      // Stack of objects with properties remaining to be read.
>      js::AutoValueVector objs;
>  
> +    // Stack of all objects read during this deserialization

Nit: It isn't really a stack, so say "vector" I guess.

Even smaller nit: The other comments have periods, so add one here.
Attachment #539390 - Flags: review?(jorendorff) → review+
Comments addressed and changes made:

http://hg.mozilla.org/tracemonkey/rev/16b6d006a3fe
Whiteboard: fixed-in-tracemonkey
cdleary-bot mozilla-central merge info:
http://hg.mozilla.org/mozilla-central/rev/16b6d006a3fe
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
So, does this mean we have to add code to indexedDB to ensure that downlevel browsers reading an uplevel database don't mess things up?
You shouldn't ... this should be a purely additive change.
So what happens when a FF6 browser opens a database containing data with those additions? I.e. if a user downloads FF7, creates an indexedDB database with some cyclic data, then decides to switch back to FF6 and tries to open that database?
Ah, I misread.  Upgrading should be no trouble at all.  A downgrade will throw because the JS engine won't recognize the tag.
Oops, yeah, you should have bumped JS_STRUCTURED_CLONE_VERSION when doing this. Then we would have hit this path:

https://mxr.mozilla.org/mozilla-central/source/dom/indexedDB/IDBFactory.cpp#967
Will we update that field on any writes to any table in the database?

I.e. what happens if the user does the following:

1. Install FF6
2. Create a indexedDB database
3. Install FF7
4. Write a single value into the database which uses cyclic data
5. Install FF6
6. Attempt to read said value.
That would be here:

https://mxr.mozilla.org/mozilla-central/source/dom/indexedDB/IDBFactory.cpp#972

NS_NOTYETIMPLEMENTED :)
(In reply to comment #14)
> Oops, yeah, you should have bumped JS_STRUCTURED_CLONE_VERSION when doing
> this. Then we would have hit this path:
> 
> https://mxr.mozilla.org/mozilla-central/source/dom/indexedDB/IDBFactory.cpp#967

Is that really desirable?  Either:

1) We leave the version as is.  Downgrades fail if and only if the value read uses the functionality here.

2) We bump the version.  All downgrades fail.

1 seems like a better solution to me than 2 ...
(In reply to comment #16)
> https://mxr.mozilla.org/mozilla-central/source/dom/indexedDB/IDBFactory.cpp#972
> 
> NS_NOTYETIMPLEMENTED :)

Cool, so who takes care of implementing that? ;-)

(In reply to comment #17)
> > https://mxr.mozilla.org/mozilla-central/source/dom/indexedDB/IDBFactory.cpp#967
> 
> Is that really desirable?  Either:
> 
> 1) We leave the version as is.  Downgrades fail if and only if the value
> read uses the functionality here.
> 
> 2) We bump the version.  All downgrades fail.

Option 1 is a bit of Russian roulette, as you're relying on the page dealing properly with a random read failure without putting the database into an inconsistent state.
The indexedDB API certainly helps keep people from getting into an inconsistent state, but there's still a decent chance that they will.
(In reply to comment #18)
> (In reply to comment #17)
> > > https://mxr.mozilla.org/mozilla-central/source/dom/indexedDB/IDBFactory.cpp#967
> > 
> > Is that really desirable?  Either:
> > 
> > 1) We leave the version as is.  Downgrades fail if and only if the value
> > read uses the functionality here.
> > 
> > 2) We bump the version.  All downgrades fail.
> 
> Option 1 is a bit of Russian roulette as you're relying on that the page
> deals properly with a random read failing without putting the database in an
> inconsistent state.
> The indexedDB API certainly helps with making people not get into an
> inconsistent state, but there's still a decent chance that they will.

I think intentionally breaking everyone is worse than breaking a few people in an edge case, but in the end it's not my call.
The difference is also in the type of breakage: corrupted data and potential data loss, versus nothing working until you upgrade to FF7 again, at which point everything works.