Gecko not booting anymore due to "bad serialized structured data (unhandled typed array element type)"

VERIFIED FIXED in Firefox 35

Status

()

defect
--
blocker
VERIFIED FIXED
5 years ago
4 years ago

People

(Reporter: gerard-majax, Assigned: lth)

Tracking

({regression, smoketest})

unspecified
mozilla35
ARM
Gonk (Firefox OS)
Points:
---

Firefox Tracking Flags

(blocking-b2g:2.2+, firefox35 fixed, b2g-v2.2 fixed)

Details

Attachments

(1 attachment, 2 obsolete attachments)

(Reporter)

Description

5 years ago
With current master, building for JB for my ZR:

> E/GeckoConsole(  256): [JavaScript Error: "bad serialized structured data (unhandled typed array element type)"]
> E/GeckoConsole(  256): [JavaScript Error: "TypeError: aTxn.result is undefined" {file: "resource://gre/modules/AlarmDB.jsm" line: 136}]
> E/GeckoConsole(  256): [JavaScript Error: "uncaught exception: 2147500033"]
> E/GeckoConsole(  256): [JavaScript Error: "bad serialized structured data (truncated)"]
> E/GeckoConsole(  256): [JavaScript Error: "TypeError: event.target.result is undefined" {file: "resource://gre/modules/SettingsRequestManager.jsm" line: 333}]

This makes the device completely not working, Gecko boot looping.
Flags: needinfo?(kyle)
(Reporter)

Comment 1

5 years ago
> $ git revert --no-commit ce3b8256891f125592e71d4cc169f0e7a1ab918f
> $ git revert --no-commit 10c38bbb147c389bfb104a71b5c930259a7e9e96
> $ git revert --no-commit b979b1352934444bd6f64d5c6132ec22cc094ecb

I'm trying with reverting those latest changes to dom/settings
(Reporter)

Comment 2

5 years ago
(In reply to Alexandre LISSY :gerard-majax from comment #1)
> > $ git revert --no-commit ce3b8256891f125592e71d4cc169f0e7a1ab918f
> > $ git revert --no-commit 10c38bbb147c389bfb104a71b5c930259a7e9e96
> > $ git revert --no-commit b979b1352934444bd6f64d5c6132ec22cc094ecb
> 
> I'm trying with reverting those latest changes to dom/settings

So, no luck. But trashing /data/b2g and /data/local helps.
(Reporter)

Updated

5 years ago
Summary: Gecko not booting anymore, probably settings-related → Gecko not booting anymore due to "bad serialized structured data (unhandled typed array element type)"
(Reporter)

Comment 3

5 years ago
Reverting bug 1054882 fixes the issue for me.
Depends on: 1054882
(Assignee)

Comment 4

5 years ago
Attachment #8490697 - Flags: feedback?(lissyx+mozillians)
(Reporter)

Comment 5

5 years ago
Comment on attachment 8490697 [details] [diff] [review]
bug1068539-StructuredClone.diff

After applying this, my device is able to boot again. I confirmed by making sure that after reverting the backout and prior to apply the proposed patch, the device was reproducing the issue.
Attachment #8490697 - Flags: feedback?(lissyx+mozillians) → feedback+
Flags: needinfo?(kyle)
(Assignee)

Comment 6

5 years ago
Analysis (excerpted from https://bugzilla.mozilla.org/show_bug.cgi?id=1054882#c39 and later):

The problem is that the END_OF_KEYS in the list of type designators in StructuredClone.cpp must have a stable value, or deserialization will not work - END_OF_KEYS is written into the serialized form.  END_OF_KEYS need not come last in the list of type designators; no use of it depends on that.  My mistake was not realizing that.

By adding the new key for SHARED_TYPED_ARRAY_OBJECT after END_OF_KEYS the correct deserialization behavior on existing content is restored.
(Assignee)

Updated

5 years ago
Component: General → JavaScript Engine
Product: Firefox OS → Core
(Assignee)

Updated

5 years ago
Attachment #8490697 - Flags: review?(jorendorff)
(Assignee)

Updated

5 years ago
Assignee: nobody → lhansen

Comment 7

5 years ago
Couldn't you also bump XDR_BYTECODE_VERSION? I don't think we need to worry about supporting old serialized scripts.
(Assignee)

Comment 8

5 years ago
(In reply to Michael Wu [:mwu] from comment #7)
> Couldn't you also bump XDR_BYTECODE_VERSION? I don't think we need to worry
> about supporting old serialized scripts.

I don't know (though I suppose Alexandre does).  I confess I'm pretty much a novice when it comes to the ecosystem around the JS engine.
Yeah, the xdr version is there so serialization doesn't need to be backwards compatible (which would be unworkable in the long run).
(In reply to Lars T Hansen [:lth] from comment #8)
> (In reply to Michael Wu [:mwu] from comment #7)
> > Couldn't you also bump XDR_BYTECODE_VERSION? I don't think we need to worry
> > about supporting old serialized scripts.
> 
> I don't know (though I suppose Alexandre does).  I confess I'm pretty much a
> novice when it comes to the ecosystem around the JS engine.

Anytime anything changes about the serialized representation of scripts or objects, the xdr version needs to be bumped, which will cause compiled scripts which have been cached on disk to be thrown out.
(Assignee)

Comment 11

5 years ago
Alternate (and probably better) fix: change the XDR serial number.
Attachment #8490722 - Flags: feedback?(lissyx+mozillians)
(Reporter)

Comment 12

5 years ago
Comment on attachment 8490722 [details] [diff] [review]
bug1068539-StructuredClone.diff

This one does not work :(

> E/GeckoConsole(  257): [JavaScript Error: "bad serialized structured data (truncated)"]
Attachment #8490722 - Flags: feedback?(lissyx+mozillians) → feedback-
(Assignee)

Comment 13

5 years ago
(In reply to Alexandre LISSY :gerard-majax from comment #12)
> Comment on attachment 8490722 [details] [diff] [review]
> bug1068539-StructuredClone.diff
> 
> This one does not work :(
> 
> > E/GeckoConsole(  257): [JavaScript Error: "bad serialized structured data (truncated)"]

So that is interesting.  This patch did not rearrange the tags like the previous patch did, it just incremented the serial number.  That we still fail - with a different error message - suggests perhaps that somebody is not falling back properly when the serial number changes, or does not properly check the serial number.  (The error message is still from StructuredClone.cpp, representing read-past-end problems.)
Structured clone versioning is different than XDR versioning ...
(In reply to Lars T Hansen [:lth] from comment #13)
> (In reply to Alexandre LISSY :gerard-majax from comment #12)
> > Comment on attachment 8490722 [details] [diff] [review]
> > bug1068539-StructuredClone.diff
> > 
> > This one does not work :(
> > 
> > > E/GeckoConsole(  257): [JavaScript Error: "bad serialized structured data (truncated)"]
> 
> So that is interesting.  This patch did not rearrange the tags like the
> previous patch did, it just incremented the serial number.  That we still
> fail - with a different error message - suggests perhaps that somebody is
> not falling back properly when the serial number changes, or does not
> properly check the serial number.  (The error message is still from
> StructuredClone.cpp, representing read-past-end problems.)

Hmm, I guess the problem here isn't related to XDR, and we're serializing objects somewhere using the structured clone mechanism by itself.  Looking at StructuredClone.h it seems like structured cloning needs to be backwards compatible, which seems crazy to me and there isn't any comment explicating this or why this design choice was made.  Anyways, there's a SCTAG_DO_NOT_USE_1 which you can probably reuse for SCTAG_SHARED_TYPED_ARRAY_OBJECT.
(In reply to Brian Hackett (:bhackett) from comment #15)
> (In reply to Lars T Hansen [:lth] from comment #13)
> > (In reply to Alexandre LISSY :gerard-majax from comment #12)
> > > Comment on attachment 8490722 [details] [diff] [review]
> > > bug1068539-StructuredClone.diff
> > > 
> > > This one does not work :(
> > > 
> > > > E/GeckoConsole(  257): [JavaScript Error: "bad serialized structured data (truncated)"]
> > 
> > So that is interesting.  This patch did not rearrange the tags like the
> > previous patch did, it just incremented the serial number.  That we still
> > fail - with a different error message - suggests perhaps that somebody is
> > not falling back properly when the serial number changes, or does not
> > properly check the serial number.  (The error message is still from
> > StructuredClone.cpp, representing read-past-end problems.)
> 
> Hmm, I guess the problem here isn't related to XDR, and we're serializing
> objects somewhere using the structured clone mechanism by itself.  Looking
> at StructuredClone.h it seems like structured cloning needs to be backwards
> compatible, which seems crazy to me and there isn't any comment explicating
> this or why this design choice was made.  Anyways, there's a
> SCTAG_DO_NOT_USE_1 which you can probably reuse for
> SCTAG_SHARED_TYPED_ARRAY_OBJECT.

Serialized structured clones are persisted in IndexedDB.
(Assignee)

Comment 17

5 years ago
(In reply to Brian Hackett (:bhackett) from comment #15)
> 
> Anyways, there's a SCTAG_DO_NOT_USE_1 which you can probably reuse for
> SCTAG_SHARED_TYPED_ARRAY_OBJECT.

Reusing a value whose name is "do not use" makes me more nervous than adding a new value where no sign prohibits it...  Given that SCTAG_END_OF_KEYS is private to this cpp file and is only ever used for its tag value, never to size an array (I'll double check that but that's how it looked on first reading), is there any particular reason we can't add new tags after it?

Old code wouldn't be able to read structures serialized under the new tag whether it's at the end or somewhere in the middle using a previously dead tag.

(Anyway the problem backwards/forwards compatibility and/or versioning needs to be fixed somehow.  The comment in StructuredClone.h is wrong because somebody has recently(?) added Set and Map to the types handled by the engine.)
When Map and Set were added the structured clone version was correctly bumped.

http://hg.mozilla.org/mozilla-central/rev/225fa7edfb16

And yes, it appears END_OF_KEYS does not need to be the last value for any reason.
(In reply to Lars T Hansen [:lth] from comment #17)
> Reusing a value whose name is "do not use" makes me more nervous than adding
> a new value where no sign prohibits it...  Given that SCTAG_END_OF_KEYS is
> private to this cpp file and is only ever used for its tag value, never to
> size an array (I'll double check that but that's how it looked on first
> reading), is there any particular reason we can't add new tags after it?

I don't know, but adding keys after SCTAG_END_OF_KEYS isn't all that great either.  Is structured cloning of shared typed arrays actually needed or spec'ed anywhere?

> (Anyway the problem backwards/forwards compatibility and/or versioning needs
> to be fixed somehow.  The comment in StructuredClone.h is wrong because
> somebody has recently(?) added Set and Map to the types handled by the
> engine.)

Bug 1036136, though reading that bug doesn't help my understanding much.
So was the change actually not backwards compatible? Or did we just pick the wrong number for the tag value?
Flags: needinfo?(evilpies)
(Assignee)

Comment 21

5 years ago
(In reply to Brian Hackett (:bhackett) from comment #19)
> (In reply to Lars T Hansen [:lth] from comment #17)
> > Reusing a value whose name is "do not use" makes me more nervous than adding
> > a new value where no sign prohibits it...  Given that SCTAG_END_OF_KEYS is
> > private to this cpp file and is only ever used for its tag value, never to
> > size an array (I'll double check that but that's how it looked on first
> > reading), is there any particular reason we can't add new tags after it?
> 
> I don't know, but adding keys after SCTAG_END_OF_KEYS isn't all that great
> either.  Is structured cloning of shared typed arrays actually needed or
> spec'ed anywhere?

It's specified in the draft spec for shared memory and atomics.  Whether it's needed is slightly speculative obviously but I think it will be.

(Said spec: https://docs.google.com/document/d/1NDGA_gZJ7M7w1Bh8S0AoDyEqwDdRh4uSoTPSNn77PFk/edit?usp=sharing)

We could back out that functionality narrowly right now to unblock FFOS, and maybe that's the right thing, but Kyle pointed to (comment #18) the version number for structured clone which I obviously neglected to modify; perhaps there is a solution there.
(Assignee)

Comment 22

5 years ago
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #20)
> So was the change actually not backwards compatible? Or did we just pick the
> wrong number for the tag value?

The original change was backwards incompatible because a tag got reassigned accidentally.

The proposed fix makes the code backwards compatible though it will be possible for the encoder to produce structures that old code cannot read (because there will be a new tag).
(Assignee)

Updated

5 years ago
Attachment #8490722 - Attachment is obsolete: true
(In reply to Lars T Hansen [:lth] from comment #22)
> The proposed fix makes the code backwards compatible though it will be
> possible for the encoder to produce structures that old code cannot read
> (because there will be a new tag).

And bumping the structured clone version ensures that it will give and throw an error if it tries to.
(Assignee)

Comment 24

5 years ago
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #23)
> (In reply to Lars T Hansen [:lth] from comment #22)
> > The proposed fix makes the code backwards compatible though it will be
> > possible for the encoder to produce structures that old code cannot read
> > (because there will be a new tag).
> 
> And bumping the structured clone version ensures that it will give and throw
> an error if it tries to.

And if I bump the version number and make changes to indexedDB along the lines of what evilpies did then we should be fairly safe, yes?  My preference would be to stick to the backward-compatibleness of my first proposed patch, so as not to introduce ripple effects into the indexedDB code.
Right.  Making the change backwards compatible and doing what evilpies did to the structured clone version/IDB stuff is the right move imo.  Along with adding some comments in the js engine.
Oh, sorry. I misread. evilpies did the right thing.
Flags: needinfo?(evilpies)
(Assignee)

Updated

5 years ago
Attachment #8490697 - Attachment is obsolete: true
Attachment #8490697 - Flags: review?(jorendorff)
(Assignee)

Comment 27

5 years ago
This bumps the structured clone version number while keeping the serialized format stable and backwards compatible, and propagates the bump in version number into the indexedDB code, following evilpie's patch for Set and Map.

(There's not been much I've been able to do to test this locally.)
Attachment #8490775 - Flags: review?(khuey)
Attachment #8490775 - Flags: review?(jorendorff)
Comment on attachment 8490775 [details] [diff] [review]
bug1068539-StructuredClone.diff

Review of attachment 8490775 [details] [diff] [review]:
-----------------------------------------------------------------

r=me on the IDB bits.
Attachment #8490775 - Flags: review?(khuey) → review+
(Assignee)

Comment 29

5 years ago
Does this little adventure suggests overly light test coverage in the area of indexedDB?  Or (more likely) test data that is not using the latest encoding?  (Encodings before the latest, which introduced the END marker, would not trigger this bug.)
Flags: needinfo?(bent.mozilla)
IndexedDB doesn't have automated migration tests that test that earlier databases work in the current code, unfortunately.
Comment on attachment 8490775 [details] [diff] [review]
bug1068539-StructuredClone.diff

Review of attachment 8490775 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/public/StructuredClone.h
@@ +68,5 @@
> +// This callback must first use the JS_WriteUint32Pair API to write an object
> +// header, passing a value greater than JS_SCTAG_USER to the tag parameter.
> +// Then it can use the JS_Write* APIs to write any other relevant parts of
> +// the value v to the writer w. closure is any value passed to the
> +// JS_WriteStructuredCLone function.

Would you mind fixing the capitalization error on the last line here while you're passing through?

::: js/src/vm/Xdr.h
@@ +27,5 @@
>   * this wiki page:
>   *
>   *  https://developer.mozilla.org/en-US/docs/SpiderMonkey/Internals/Bytecode
>   */
> +static const uint32_t XDR_BYTECODE_VERSION = uint32_t(0xb973c0de - 184);

Don't bump this one, since you aren't making any changes to the XDR format.
Attachment #8490775 - Flags: review?(jorendorff) → review+
https://hg.mozilla.org/mozilla-central/rev/98dd1d0aba48
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
[Blocking Requested - why for this release]:
Functional regression that breaks smoke tests.
blocking-b2g: --- → 2.2?
Keywords: smoketest
Flags: needinfo?(bent.mozilla)
Bug does NOT repro on todays build on 2.2 Flame 319MB

I am able to boot after OTA, and manual flashing, verifying fixed.

Enviromental Variables:
----------------------------------------
Device: Flame 2.2 Master
BuildID: 20140919040205
Gaia: bc2da39ccd2b82b67773f10c8da8fc8eedc483ab
Gecko: c8e325eee9e1
Version: 35.0a1 (2.2 Master)
Firmware: V123
User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0
Status: RESOLVED → VERIFIED
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(pbylenga)
blocking-b2g: 2.2? → 2.2+
You need to log in before you can comment on or make changes to this bug.