Closed Bug 505656 Opened 12 years ago Closed 11 years ago

Bookmarks backup JSON file is not valid JSON (includes trailing commas)

Categories

(Firefox :: Bookmarks & History, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Firefox 3.7a5
Tracking Status
blocking1.9.2 --- -
status1.9.2 --- wontfix
blocking1.9.1 --- -
status1.9.1 --- wontfix

People

(Reporter: aaronecay, Assigned: Waldo)

References

Details

Attachments

(4 files)

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre) Gecko/20090721 Minefield/3.6a1pre
Build Identifier: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre) Gecko/20090721 Minefield/3.6a1pre

The bookmarks backup JSON file created daily in the profile directory is not valid JSON.  It contains a trailing comma, which makes JSON parsers barf, including http://code.google.com/p/json-framework/ (Mac OS X/Cocoa) and http://www.jsonlint.com/ (Web).  A similar bug was reported for the built-in JSON serialization facility: https://bugzilla.mozilla.org/show_bug.cgi?id=426718 .  That bug is no longer reproducible (i.e. fixed?) on the latest nightly.  However the bookmarks backup code still uses manual JSON generation.  

The problem function appears to be serializeNodeAsJSONToOutputStream at mozilla/toolkit/components/places/src/utils.js:1490 .  I believe that the offending comma is written by line 1626 of that file.

Reproducible: Always

Steps to Reproduce:
1. Create a new profile
2. Remove all bookmarks via the bookmark manager
3. Choose "Backup" from the bookmark manager toolbar
Actual Results:  
An invalid JSON file is produced.  I will attempt (as I've never used bugzilla before) to attache this file to this report

Expected Results:  
A valid JSON file is produced.
Note the comma which is the third-from-last character in the file.
Attachment #389864 - Attachment mime type: application/octet-stream → text/plain
Presumably the tail of the children are all in aExcludeItems.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Bug 451915 - move Firefox/Places bugs to Firefox/Bookmarks and History. Remove all bugspam from this move by filtering for the string "places-to-b-and-h".

In Thunderbird 3.0b, you do that as follows:
Tools | Message Filters
Make sure the correct account is selected. Click "New"
Conditions: Body   contains   places-to-b-and-h
Change the action to "Delete Message".
Select "Manually Run" from the dropdown at the top.
Click OK.

Select the filter in the list, make sure "Inbox" is selected at the bottom, and click "Run Now". This should delete all the bugspam. You can then delete the filter.

Gerv
Component: Places → Bookmarks & History
QA Contact: places → bookmarks
This is still an issue in Firefox 3.5.5. Exporting my bookmarks will result in a JSON stream that ends in "}]},]}", which is clearly invalid since JSON does not permit trailing commas in arrays.
I'm looking to fix this error in JSON.parse, but this bug prevents that right now.
Blocks: 564621
Summary: Bookmarks backup JSON file is not valid JSON → Bookmarks backup JSON file is not valid JSON (includes trailing commas)
This, once fixed, should be backported to 1.9.2, since otherwise when trunk bug 564621 will be fixed and users will upgrade, they will be without a valid backup. For the same reason this should probably land on trunk at least a week before bug 564621.
Tentatively taking, have written a strawman patch and pushed to try, will see how that goes...
Assignee: nobody → jwalden+bmo
Attached patch Proto-patchSplinter Review
This doesn't seem to fail any tinderbox tests, maybe, but this is against TM, which currently has some unrelated test bustage, so it's hard to say whether it's all good or not:

http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1273533148.1273540606.12415.gz&fulltext=1

Once that bustage gets fixed I'll get back to this.  Also this may need some migration-adding code (or maybe JSON-strictifying patch needs it), haven't considered that yet.
Comment on attachment 444542 [details] [diff] [review]
Proto-patch

Try has nothing but love for this.

I think this can go in without needing any extra migration code -- only the require-valid-JSON-input part may need some.  Please correct me if I'm mistaken.
Attachment #444542 - Flags: review?(mak77)
Comment on attachment 444542 [details] [diff] [review]
Proto-patch

>diff --git a/toolkit/components/places/src/PlacesUtils.jsm b/toolkit/components/places/src/PlacesUtils.jsm
>--- a/toolkit/components/places/src/PlacesUtils.jsm
>+++ b/toolkit/components/places/src/PlacesUtils.jsm
>@@ -1508,28 +1508,18 @@ var PlacesUtils = {
>       }
>     }
> 
>-    function writeScalarNode(aStream, aNode) {
>-      // serialize to json
>-      var jstr = PlacesUtils.toJSONString(aNode);
>-      // write to stream
>-      aStream.write(jstr, jstr.length);
>+    function translateScalarNode(aNode) {
>+      return aNode;
>     }

I know you're not here to fix all of our crappy code (even if I'd actually appreciate that!), but this function is now pointless, can you just kill it and replace calls to it with just "node" please?

It looks good and is the same thing i was planning to do for this bug, before giving final review I just want to do a couple manual tests today.
regarding migration, the only thing I ask you is to land on 1.9.2, so that when we will upgrade in future users will have valid json backups.
Comment on attachment 444542 [details] [diff] [review]
Proto-patch

ok, works fine afaict. We have tests for bookmarks backup/restore, thus not requiring a new one.
Attachment #444542 - Flags: review?(mak77) → review+
Comment on attachment 444542 [details] [diff] [review]
Proto-patch

>diff --git a/toolkit/components/places/src/PlacesUtils.jsm b/toolkit/components/places

>-    function serializeNodeToJSONStream(bNode, aIndex) {
>+    function translateNodeToObject(bNode, aIndex) {
>       var node = {};
> 
>       // set index in order received
>@@ -1597,14 +1584,14 @@ var PlacesUtils = {
>       }
> 
>       if (!node.feedURI && node.type == PlacesUtils.TYPE_X_MOZ_PLACE_CONTAINER)
>-        writeComplexNode(aStream, node, bNode);
>-      else
>-        writeScalarNode(aStream, node);
>-      return true;
>+        return translateComplexNode(node, bNode);
>+      return translateScalarNode(node);
>     }

So...looking at this again during a 192 port, I'm wondering if this is quite what it should be.  Specifically, translateNodeToObject is used as though it always returns a node to add -- but in some cases, out of the context in the patch.  Here's full context:

> -    function serializeNodeToJSONStream(bNode, aIndex) {
> +    function translateNodeToObject(bNode, aIndex) {
>        var node = {};
>  
>        // set index in order received
>        // XXX handy shortcut, but are there cases where we don't want
>        // to export using the sorting provided by the query?
>        if (aIndex)
>          node.index = aIndex;
>  
>        addGenericProperties(bNode, node);
>  
>        var parent = bNode.parent;
>        var grandParent = parent ? parent.parent : null;
>  
>        if (PlacesUtils.nodeIsURI(bNode)) {
>          // Tag root accept only folder nodes
>          if (parent && parent.itemId == PlacesUtils.tagsFolderId)
>            return false;
>          // Check for url validity, since we can't halt while writing a backup.
>          // This will throw if we try to serialize an invalid url and it does
>          // not make sense saving a wrong or corrupt uri node.
>          try {
>            PlacesUtils._uri(bNode.uri);
>          } catch (ex) {
>            return false;
>          }
>          addURIProperties(bNode, node);
>        }
>        else if (PlacesUtils.nodeIsContainer(bNode)) {
>          // Tag containers accept only uri nodes
>          if (grandParent && grandParent.itemId == PlacesUtils.tagsFolderId)
>            return false;
>          addContainerProperties(bNode, node);
>        }
>        else if (PlacesUtils.nodeIsSeparator(bNode)) {
>          // Tag root accept only folder nodes
>          // Tag containers accept only uri nodes
>          if ((parent && parent.itemId == PlacesUtils.tagsFolderId) ||
>              (grandParent && grandParent.itemId == PlacesUtils.tagsFolderId))
>            return false;
>  
>          addSeparatorProperties(bNode, node);
>        }
>  
>        if (!node.feedURI && node.type == PlacesUtils.TYPE_X_MOZ_PLACE_CONTAINER)
> -        writeComplexNode(aStream, node, bNode);
> -      else
> -        writeScalarNode(aStream, node);
> -      return true;
> +        return translateComplexNode(node, bNode);
> +      return node;
>      }
>  
>      // serialize to stream
> -    serializeNodeToJSONStream(aNode, null);
> +    var repr = translateNodeToObject(aNode, null);
> +    var json = JSON.stringify(repr);
> +    aStream.write(json, json.length);
>    },

So right now, if we hit any of these return false statements, we'll insert some sort of property that's semi-bogus, with a "random" boolean value of false.  Maybe it makes more sense for translateNodeToObject to instead be:

> function translateNodeToObject(bNode, aIndex, base, property) {
>   if (shouldAddNode(bNode, aIndex))
>     base[property] = node(bNode, aIndex);
> }

...assuming something like a shouldAddNode method that determines whether or not we'd hit a return-false case and a node that computes the node that would be assigned each exist.

Does this seem like a good change to make, perhaps in a patch atop the one here?  The falses seem worrisome to me at the very least from a code-structuring point of view, even if it might be the case that they won't usually be hit for some reason.
Attachment #444542 - Flags: feedback?(mak77)
Yes, I guess it would be bad to have boolean values instead of expected objects, those return false things were put there because of corruptions we had in the past (that were interrupting restores). As of today I'd expect all bugs regarding that kind of things being fixed, but still having some sort of "pass-check" before adding a node sounds useful for future.
I'd rename shouldAddNode to canAddNode, we want all nodes, but we could not be able to add some.
And translateNodeToObject would now also set the node in the base object so the name is ambiguous, something like transplantNodeToObject, or if you have better ideas. Also the last assignment is var repr = translateNodeToObject(aNode, null); in this case there is no property, so I guess you should check if property === undefined then directly assign to the base object

So far canAddNode should check that:
- if node is a uri node, uri should be a valid nsIURI
- if node.parent is tags folder, node should be a folder node
and in future we could add other checks.
Attachment #444542 - Flags: feedback?(mak77)
hm actually the tags part is a bit more complex.
- if (node.parent.itemId == PlacesUtils.tagsFolderId) then node must be a folder node
if (node.parent.parent.itemId == PlacesUtils.tagsFolderId) then node must be a uri node
this is because tags root is a folder containing folders, and each of these can only be uris
s/can only be uris/can only contain uris/
The strawman from comment 13 requires a weird disconnection of validity-checking and conversion code -- likely bug-prone, I think.  The current setup is much more natural.  A few tweaks to it, as done in this patch, produces a much more natural final result, and as a side bonus it makes the necessary changes much smaller.

Try server still does well with this.
Attachment #449348 - Flags: review?(mak77)
Comment on attachment 449348 [details] [diff] [review]
Patch atop previous: a (better, I think) way to cleanly omit unserializable nodes

>diff --git a/toolkit/components/places/src/PlacesUtils.jsm b/toolkit/components/places/src/PlacesUtils.jsm

>+    function appendConvertedComplexNode(aNode, aSourceNode, aArray) {
>       var repr = {};
> 
>       for (let [name, value] in Iterator(aNode))
>@@ -1530,16 +1526,21 @@ var PlacesUtils = {
>           var childNode = aSourceNode.getChild(i);
>           if (aExcludeItems && aExcludeItems.indexOf(childNode.itemId) != -1)
>             continue;
>-          children.push(translateNodeToObject(aSourceNode.getChild(i), i));
>+
>+          // This conceivably might fail to append, but it shouldn't cause the
>+          // overall conversion to fail.  Should we maybe record a failure here
>+          // somehow, rather than just swallowing completely?
>+          appendConvertedNode(aSourceNode.getChild(i), i, children);

as I said, I expect this code to fail only in Firefox 3.0 alphas/betas. PlacesDBUtils is also taking care of many of these corruptions. Thus I think there is not even the need for a verbose comment, I'd remove the "Should we maybe..." part.

It's a bit hard to put toghether 2 patches and figure out the final result, but looks like this is mostly retaining the old behavior and let methods do the addition rather than doing it on return. Sounds fine.
Attachment #449348 - Flags: review?(mak77) → review+
http://hg.mozilla.org/mozilla-central/rev/f21132993dc2

The 192 backport was straightforward; I'll post that patch for approval (after it's baked a bit, of course) in a second.
Status: NEW → RESOLVED
Closed: 11 years ago
OS: Mac OS X → All
Hardware: x86 → All
Resolution: --- → FIXED
Target Milestone: --- → Firefox 3.7a5
Attached patch 192 backportSplinter Review
The backport to 192 was pretty easy -- just deal with a different file name and manually, but straightforwardly, merge a few hunks around a little change at the edges.
Attachment #450544 - Flags: review+
Attachment #450544 - Flags: approval1.9.2.5?
Comment on attachment 450544 [details] [diff] [review]
192 backport

The 192 patch also applies cleanly to 191.

A small part of me wants to do a 190 patch, but 1) that involves cvs, and 2) sessionstore dropped 3.0 compatibility recently, so the ship has already sailed on not preserving compatibility with 3.0-era profiles.
Attachment #450544 - Flags: approval1.9.1.11?
I asked a comment about this via email. We likely won't be taking this for .6/.11
I just got to responding to this email a few minutes ago.  :-)  I'm not worried about missing a particular very-soon point release, just that it not miss one without good cause.
blocking1.9.2: --- → needed
Attachment #450544 - Flags: approval1.9.2.7?
Attachment #450544 - Flags: approval1.9.2.5?
Attachment #450544 - Flags: approval1.9.1.12?
Attachment #450544 - Flags: approval1.9.1.11?
We'll take this next time rather than squeeze it in on top of a groaning QA team, setting to "needed" so we don't forget it.
blocking1.9.1: --- → needed
We need it on 1.9.2 for sure, I'm not worried about where, even if it could hit FX4.0beta1 users, but beta users should be ready to possible issues. Any release before FX4RC1 should work imo.
(If you got the previously-mentioned email, move along, nothing new in this comment...)

For purposes of *this* bug it's not too important when.  As I noted in that email, for the purposes of bug 564621 (which this blocks), it's important it get in sooner rather than later (to the extent feasible given schedule weirdness of 3.6.4 being super-long), because tightening the strictness/correctness of JSON.parse could very easily have an effect on websites.  It's more likely such effects will be discovered and fixed the longer that change is in mozilla-central.
A couple of things:

#1 - won't you have to support/munge trailing commas in 4.0 final anyway? What if a user is running 3.0-3.6.7 and then installs 4.0 on top? It seems like if you have to work around this in 4.0 anyway there is really no benefit to taking it

#2 - There doesn't seem to be a test to make sure this does what it says or doesn't regress later
Upgrading from 3.0 to 4.0 is already unsupported, and other profile-format changes would make applying this fix there not fix all the possible problems they might encounter anyway.

The point of applying this to 3.5.x and 3.6.x, x being whichever release includes this change, is so that people upgrading from there to 4.0 will have their profiles adjusted properly.  People with 3.{5,6}.y installs where y < x are expected to upgrade through 3.{5,6}.z where z >= x, to get the fix this way.

The test is that JSON.parse will complain if it's given a bookmarks backup that includes a trailing comma.  Further, I was told in comment 12, we already have tests that exercise this code.
(In reply to comment #28)
> People with 3.{5,6}.y installs where y < x are expected to upgrade
> through 3.{5,6}.z where z >= x, to get the fix this way.

That's certainly the normal path, but it's not universal. Right now we've still got ~10M Firefox 3.0.x users we're trying to upgrade to 3.6 (skipping 3.5). Someday it'll be roughly that many unsupported 3.5 users we'll be trying to move to 4.0 (rather than 3.6.x which by that time will be nearing EOL itself). Will we remember this bug and force them through multiple upgrade steps to avoid bookmark loss? I wouldn't count on it, and then it's a huge support headache.

You're going to have to handle upgrading both formats.
And, if you are handling upgrading with the trailing commas, isn't it easier to just assume everyone < 4.0 has the trailing commas instead of dealing with both cases?
Sigh.  I'll look into writing some sort of migration code, I guess, just let me gouge my eyes out first.  :-\
Blocks: 582077
No longer blocks: 564621
I understand it's not ideal, but I see no way around it as we do not require users to be on the latest security update before updating to a new major version :-/.

Also, if we did take this on the branch it would be a bit of a support nightmare, as most SUMO people will be on the latest (with the fix) and unable to reproduce. I have a feeling this would become a hard-to-reproduce "phantom" bug.
Attachment #450544 - Flags: approval1.9.2.9?
Attachment #450544 - Flags: approval1.9.2.9-
Attachment #450544 - Flags: approval1.9.1.12?
Attachment #450544 - Flags: approval1.9.1.12-
Duplicate of this bug: 586207
blocking1.9.1: needed → -
blocking1.9.2: needed → -
should I post a new bug? there is a different issue but related to the same subject.

the structure of the JSON is not valid in that the first entry's "children":[] doesn't have a close ]

I was using notepad++'s JSON  Viewer's JSON Format to reformat the JSON file to a readable format.
I also checked the non-formatted version and the end curly and square brackets have same problem.
the rest of the first/main JSON entry is missing from the JSON export.

is there anything else missing after this truncated main entry? besides maybe 
    ]
    }
} 
?

how is this supposed to end?
my last comment is for 24.0.1
excuse me, ff 24.0
Works for me. Anyway, please file a new bug.
(In reply to Jim Michaels from comment #34)
> should I post a new bug? there is a different issue but related to the same
> subject.
> 
> the structure of the JSON is not valid in that the first entry's
> "children":[] doesn't have a close ]

it's very likely unrelated to this bug, sounds more like a bug in the module that writes the file.
I need to correct myself, it is missing at least ]} not ]}}
You need to log in before you can comment on or make changes to this bug.