Closed Bug 622657 Opened 12 years ago Closed 7 years ago

Bookmark sync: NS_ERROR_MALFORMED_URI in PlacesUtils.bookmarks.getItemType()

Categories

(Firefox :: Sync, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla40
Tracking Status
firefox40 --- fixed

People

(Reporter: philikon, Assigned: markh)

References

Details

Attachments

(3 files, 2 obsolete files)

A user reported this, and it's been reported as part of bug 602956 comment 3 as well:

2011-01-01 22:44:28     Service.Main         INFO       In sync().
2011-01-01 22:44:28     Net.Resource         DEBUG      GET success 200 https://phx-sync098.services.mozilla.com/1.0//info/collections
2011-01-01 22:44:28     Service.Main         DEBUG      Fetching global metadata record
2011-01-01 22:44:28     Service.Main         DEBUG      Clearing cached meta record. metaModified is undefined, setting to 1293906137.5
2011-01-01 22:44:28     Net.Resource         DEBUG      GET success 200 https://phx-sync098.services.mozilla.com/1.0//storage/meta/global
2011-01-01 22:44:28     Service.Main         DEBUG      Weave Version: 1.6.1 Local Storage: 5 Remote Storage: 5
2011-01-01 22:44:28     Service.Main         INFO       Sync key is up-to-date: no need to upgrade.
2011-01-01 22:44:28     Service.Main         DEBUG      Fetching and verifying -- or generating -- symmetric keys.
2011-01-01 22:44:28     Service.Main         INFO       Testing info/collections: {"clients":1293925144.17,"crypto":1293906118.87,"keys":1291339841.4,"meta":1293906137.5,"bookmarks":1293906136.52,"prefs":1293925173.55}
2011-01-01 22:44:28     CollectionKeys       INFO       Testing for updateNeeded. Last modified: 1293906118.87
2011-01-01 22:44:28     Engine.Clients       DEBUG      First sync, uploading all items
2011-01-01 22:44:28     Engine.Clients       INFO       1 outgoing items pre-reconciliation
2011-01-01 22:44:28     Collection           DEBUG      GET success 200 https://phx-sync098.services.mozilla.com/1.0//storage/clients?full=1
2011-01-01 22:44:28     Engine.Clients       INFO       Records: 2 applied, 0 reconciled.
2011-01-01 22:44:28     Engine.Clients       INFO       Uploading all of 1 records
2011-01-01 22:44:28     Collection           DEBUG      POST Length: 277
2011-01-01 22:44:28     Collection           DEBUG      POST success 200 https://phx-sync098.services.mozilla.com/1.0//storage/clients
2011-01-01 22:44:28     Engine.Clients       DEBUG      Total (ms): sync 145, processIncoming 60, uploadOutgoing 84, syncStartup 0, syncFinish 0, reconcile 0, createRecord 0, syncCleanup 0
2011-01-01 22:44:28     Service.Main         INFO       Updating enabled engines: 2 clients.
2011-01-01 22:44:28     Service.Engines      DEBUG      Could not get engine: clients
2011-01-01 22:44:28     Engine.Bookmarks     DEBUG      First sync, uploading all items
2011-01-01 22:44:28     Engine.Bookmarks     INFO       1283 outgoing items pre-reconciliation
2011-01-01 22:44:30     Engine.Bookmarks     DEBUG      Total (ms): sync 1402, processIncoming 514, syncStartup 797, createRecord 403, isEqual 403, reconcile 403, syncCleanup 21
2011-01-01 22:44:30     Service.Main         DEBUG      bookmarks failed: NS_ERROR_MALFORMED_URI JS Stack trace: Channel_onStopRequest([object XPCWrappedNative_NoHelper],null,2152398858)@resource.js:438 < Res__request(...)@resource.js:357 < Res_get()@resource.js:376 < SyncEngine__processIncoming()@engines.js:511 < _processIncoming()@bookmarks.js:212 < ()@engines.js:203 < SyncEngine__sync()@engines.js:768 < wrappedSync(null)@util.js:168 < runInBatchMode([object Object],null)@:0 < batchedSync()@util.js:174 < ()@engines.js:203 < WrappedNotify()@util.js:147 < Engine_sync()@engines.js:213 < WeaveSvc__syncEngine([object Object])@service.js:1738 < ()@service.js:1624 < WrappedNotify()@util.js:147 < WrappedLock()@util.js:119 < WrappedCatch()@util.js:97 < sync()@service.js:1529 < (5)@sync.js:380
2011-01-01 22:44:30     Engine.Prefs         INFO       0 outgoing items pre-reconciliation
2011-01-01 22:44:30     Engine.Prefs         INFO       Records: 0 applied, 0 reconciled.
2011-01-01 22:44:30     Engine.Prefs         DEBUG      Total (ms): sync 3, processIncoming 1, uploadOutgoing 0, syncStartup 1, syncFinish 0, getChangedIDs 0, syncCleanup 0
2011-01-01 22:44:30     Service.Main         DEBUG      Exception: Some engines did not sync correctly No traceback available

Will try to get more information on where it fails. It seems weird that places could have given us malformed URIs. But we should probably just be careful when passing values (URIs, places ids, etc.) into places at all times.
Duplicate of this bug: 636324
Duplicate of this bug: 889434
Duplicate of this bug: 871896
Sync becomes annoying when something breaks, the error prompt keeps popping up way too often.
I used this script (modified from bug 636324) to find the problem bookmark.

Components.utils.import("resource://services-sync/service.js");
Components.utils.import("resource://gre/modules/PlacesUtils.jsm");
let bme = Service.engineManager.get("bookmarks")
let bms = bme._store;
let ids = bms.getAllIDs();
for (let guid of Object.keys(ids)) {
  let i = bms.idForGUID(guid);
  try {
    if (PlacesUtils.bookmarks.getItemType(i) == PlacesUtils.bookmarks.TYPE_BOOKMARK) {
      PlacesUtils.bookmarks.getBookmarkURI(i);
    }
  } catch (e) {
    console.log("Problem with moz_bookmarks id", i, "guid", guid);
  }
}

I found the guid in moz_bookmarks (in places.sqlite), but couldn't join with moz_places.
The foreign key was broken:

select * from moz_bookmarks where fk is not null and not exists (select * from moz_places where moz_places.id = moz_bookmarks.fk);

This fixed it:

delete from moz_bookmarks where fk is not null and not exists (select * from moz_places where moz_places.id = moz_bookmarks.fk) limit 1;
(In reply to Gummi from comment #4)
> I found the guid in moz_bookmarks (in places.sqlite), but couldn't join with
> moz_places.
> The foreign key was broken:

did you ever use an external tool or an add-on to delete entries from places.sqlite or optimize it?
Attached file 8787289.txt
Here's a more recent log via sumo.  It seems like the bookmarks engine could ignore the problem bookmark quite easily here.
Most identifiable stack traces go via _createRecord.  It seems worth catching exceptions here, ignoring the incoming record.  I can't see an obvious way to test this for the bookmarks engine, but one concern is that the underlying failure seems to be building the _guidMap, and it's not obvious the engine will recover correctly in this case - but OTOH, it seems unlikely it would behave worse!

Richard, what do you think?
Attachment #8565179 - Flags: feedback?(rnewman)
Comment on attachment 8565179 [details] [diff] [review]
Handle and ignore exceptions in _createRecord

Review of attachment 8565179 [details] [diff] [review]:
-----------------------------------------------------------------

I agree with the overall approach of trying not to fail when local corruption is unrecoverable.

I don't think try..catch in createRecord is the answer.

Let's take bookmarks as an example. createRecord -> _createRecord -> _mapDupe -> _guidMap.

_guidMap will currently throw, once per sync, in certain cases of Places corruption. It's not caught (at this level, at least) so it aborts bookmark sync and ultimately shows an error.

If we wrap in createRecord, it'll throw and be caught *once per incoming record*. Each will fail to apply. The sync will continue and do nothing.

I think a better fix would be something like:

* In _buildGUIDMap, catch these particular kinds of exception.
* Delete the failing row from the local DB. There's nothing we can do with it, and the DB isn't recoverable with that row in it.
* Try to build the map again. Lather, rinse, repeat.

Alternatively -- and probably more sustainably -- we should push this logic down into Places. getItemType, getBookmarkURI, etc. should never throw when provided suitable input, and Places should fix itself as appropriate.
Attachment #8565179 - Flags: feedback?(rnewman) → feedback-
Thanks, and yes, you are correct.

(In reply to Richard Newman [:rnewman] from comment #8)

> Alternatively -- and probably more sustainably -- we should push this logic
> down into Places. getItemType, getBookmarkURI, etc. should never throw when
> provided suitable input, and Places should fix itself as appropriate.

Mak, how do you feel about that?
Flags: needinfo?(mak77)
We are in the process of moving to the new bookmarks API, so I don't think it's the right time to touch the old APIs... We should at a maximum think how we can improve things in the new one.

The other fact is that a good part of the Sync bookmarks engine sucks (sorry, this hurts). Not your fault, the whole _guidMap thing is terrible, it generates a huge load of IO traffic to go through the whole bookmarks multiple times when it shouldn't: it fetches all of the ids, then it fetches itemType, uri, title, index, parent etc _for each id_, when it already had all of this data when fetching ids. But I'm risking to go OT. I just wanted to underline guidMap should be rewritten  regardless of what we do here or in the new API.

But mostly, I don't understand what is "suitable input", a good API should never try to apply wrong input and never trust the caller.
So my answer here depends on what's suitable input, we'll never insert an invalid uri. We can correct some input(like position) and the new API does that already (I think also the old one somehow).

I'm open to make the API correct any input that won't break coherence and shouldn't be considered a developer mistake. But we won't correct input that could hide caller's bugs.
Flags: needinfo?(mak77)
PS: it would be very easy to write "never throwing" wrappers for the given methods.
(In reply to Marco Bonardo [::mak] -- catching up, delayed answers -- from comment #11)
> PS: it would be very easy to write "never throwing" wrappers for the given
> methods.

Yeah - which is what my patch did, but the problem is exactly as you say (and Richard pointed out in his f-) - the engine sucks.

The problem *does* seem to be invalid data or a bug in bookmarks though.  The attachment in comment 6 shows |PlacesUtils.bookmarks.getItemType(id)| throwing with NS_ERROR_MALFORMED_URI. This is after we've queried for the id via the guid - so the record exists, it just can't be accessed this way. This method shouldn't throw for a valid existing ID.
(In reply to Mark Hammond [:markh] from comment #12)
> The problem *does* seem to be invalid data or a bug in bookmarks though. 
> The attachment in comment 6 shows |PlacesUtils.bookmarks.getItemType(id)|
> throwing with NS_ERROR_MALFORMED_URI. This is after we've queried for the id
> via the guid - so the record exists, it just can't be accessed this way.
> This method shouldn't throw for a valid existing ID.

The fact is that none of our APIs allows to insert an invalid type, the only way that can happen is direct editing the db with queries (either done directly by the user or by an add-on), that is clearly unsupported bypass of our APIs.

What should getItemType return if it's invalid and it can't throw? Any code expects whatever you get from getItemType to be a valid item type, if it's not, you can't even tell if it's a bookmark, a folder or a separator, so throwing appears as the right solution.
If we'd make it return TYPE_UNKNOWN, it could potentially cause a bunch of very hard to find failures in other code that expects to handle that bookmark in some way. It would be very regression prone and not something we can consider for the old API, cause it's a compatibility breaking change that can introduce subtle bugs.

The new API doesn't have these single getters, you just get a bookmark object from a guid, url, or position. So that should work much better and disallow these footguns.
To sump up, the bug here is that this code should not use getItemType at all, getAllIds already knows which type this item is, we should not try to workaround this mistake imo.
(In reply to Marco Bonardo [::mak] -- catching up, delayed answers -- from comment #13)

> The fact is that none of our APIs allows to insert an invalid type, the only
> way that can happen is direct editing the db with queries (either done
> directly by the user or by an add-on), that is clearly unsupported bypass of
> our APIs.

We've seen plenty of users who've somehow suffered DB corruption such that getItemType returns bookmark, but getBookmarkURI throws NS_ERROR_MALFORMED_URI. See Comment 4 for one such foreign key violation.

I think we've also seen other accessors throw.


> What should getItemType return if it's invalid and it can't throw? Any code
> expects whatever you get from getItemType to be a valid item type, if it's
> not, you can't even tell if it's a bookmark, a folder or a separator, so
> throwing appears as the right solution.

My proposal: Places should delete that row from the DB, run through a recovery process, then throw.
So clearly we aren't going to get any help from bookmarks to deal with obvious errors in the bookmarks store :(  How about this sync patch then?
Attachment #8565179 - Attachment is obsolete: true
Attachment #8573531 - Flags: feedback?(rnewman)
Comment on attachment 8573531 [details] [diff] [review]
0001-Bug-622657-catch-invalid-bookmark-items-and-delete-t.patch

Review of attachment 8573531 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/sync/modules/engines/bookmarks.js
@@ +251,5 @@
> +      try {
> +        itemType = PlacesUtils.bookmarks.getItemType(id);
> +      } catch (ex) {
> +        this._log.warn("Deleting invalid bookmark record with id", id);
> +        PlacesUtils.bookmarks.removeItem(id);

Is it possible (might require code reading) for this to throw?
Attachment #8573531 - Flags: feedback?(rnewman) → feedback+
(In reply to Richard Newman [:rnewman] from comment #16)
> Is it possible (might require code reading) for this to throw?

If it is, it will be our fault anyway ;) But yes, it is - https://dxr.mozilla.org/mozilla-central/source/toolkit/components/places/nsNavBookmarks.cpp#585. So we should just try/catch there and continue regardless - sound good?

Note that as you mentioned in comment 14, I suspect this will just get us past one error but end up with others later anyway, and it's almost impossible to test as the places corruption could take many forms (and I expect in some cases it will be *impossible* to recover from without help from bookmarks) but one step at a time.
(In reply to Richard Newman [:rnewman] from comment #14)
> > What should getItemType return if it's invalid and it can't throw? Any code
> > expects whatever you get from getItemType to be a valid item type, if it's
> > not, you can't even tell if it's a bookmark, a folder or a separator, so
> > throwing appears as the right solution.
> 
> My proposal: Places should delete that row from the DB, run through a
> recovery process, then throw.

We take corrective steps on idle-daily, but due to idle bugs and short sessions, unfortunately many users don't hit that.
I think your suggestion has a good hint, we could add probes detecting bad coherence and force a check asap rather than waiting idle.
The only problem is that, as I said, the old API will soon be deprecated and I'm not sure the new one suffers so much of coherence issues. But surely as soon as we hit something like that with the new API we'll take the above path.
note also that Drew is mentoring bug 522668 that will basically add a nicer way to fix db coherence from about:support. That could somehow help support.
Going through some old patches and found this. While we do want something better to handle bookmarks corruption, I think this is still an improvement - it should completely solve the problem for some users, and should still improve the behaviour for users with significant corruption at the cost of extra log spew. This is the same patch as before, although we catch and log failure to delete the invalid bookmark.
Attachment #8573531 - Attachment is obsolete: true
Attachment #8597759 - Flags: review?(rnewman)
Comment on attachment 8597759 [details] [diff] [review]
0001-Bug-622657-catch-invalid-bookmark-items-and-delete-t.patch

Review of attachment 8597759 [details] [diff] [review]:
-----------------------------------------------------------------

This'll throw out a lot of logspew on a bad DB, but at least most of the time it should only do so once!

::: services/sync/modules/engines/bookmarks.js
@@ +258,5 @@
> +          this._log.warn("Failed to delete invalid bookmark", ex);
> +        }
> +        continue;
> +      }
> +      switch () {

I think you want

  switch (itemType) {
Attachment #8597759 - Flags: review?(rnewman) → review+
(In reply to Richard Newman [:rnewman] from comment #21)
> I think you want
> 
>   switch (itemType) {

Doh! Thanks.
Assignee: nobody → mhammond
Status: NEW → ASSIGNED
note I found a bug in Places that caused us to not run maintenance for a long time. This could have made the coherency situation worse. It should now be fixed from 39 on. I am waiting for telemetry to be 100% sure.
https://hg.mozilla.org/mozilla-central/rev/4715ff680f36
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
I started getting this error a few days ago in all my nightly installs. It happened after I ran an old version of nightly+sync that asked to switch to firefox accounts (it was old). Had to install sqlite manager and follow instructions from comment #4. The sync error toolbar was *very* annoying.

Maybe it's not that important but just wanted to share.
(In reply to jorge alves from comment #27)
> I started getting this error a few days ago in all my nightly installs.

Thanks - that probably isn't *exactly* the same as the logs we have as we (think we) fixed that in 40 (nightly is now 42). Could you please share your logs?
Flags: needinfo?(jag.alves)
Attached file error logs
Sure. Added an attachment
Flags: needinfo?(jag.alves)
Summary: Bookmark sync: NS_ERROR_MALFORMED_URI → Bookmark sync: NS_ERROR_MALFORMED_URI in PlacesUtils.bookmarks.getItemType()
(In reply to jorge alves from comment #29)
> Sure. Added an attachment

Awesome, thanks! I opened bug 1182366.
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox
You need to log in before you can comment on or make changes to this bug.