Closed Bug 626341 Opened 13 years ago Closed 13 years ago

adding/sorting/syncing bookmarks screws sort order if position column is out of sync

Categories

(Firefox :: Bookmarks & History, defect)

x86
Windows XP
defect
Not set
normal

Tracking

()

VERIFIED FIXED
Firefox 4.0b12

People

(Reporter: cpuidle, Assigned: mak)

References

Details

(Keywords: regression)

Attachments

(1 file, 4 obsolete files)

User-Agent:       Mozilla/5.0 (Windows NT 5.1; rv:2.0b9) Gecko/20100101 Firefox/4.0b9
Build Identifier: Mozilla/5.0 (Windows NT 5.1; rv:2.0b9) Gecko/20100101 Firefox/4.0b9

Using drag & drop to sort bookmarks in bookmarks menu and bookmarks library does:
- either not have any effect: after dropping the item it is back where it was previously
- or confusing: drag & drop in bookmarks library produces different order than in bookmarks menu (e.g. use bm lib to drag item to first position-> shows up in bm menu on 1st position-> on reopening bm lib item is no longer post pos 1)

This behaviour has imho been introduced with beta 8, potentially beta 7.

Reproducible: Always

Steps to Reproduce:
1. open bm menu
2. drag item to target position
3. release item
4. item ends in different position
Actual Results:  
item not aligned to desired position

Expected Results:  
item at position where it was dropped
Summary: Drag & Drop sorting in Bookmarks menu and bookmarks library broken → Drag & Drop sorting in Bookmarks menu and bookmarks library screws sort order
As my places file is pretty big (30mb) I've retried with a clean profile. Apparently a clean profile does not show this problem.

Best regards,
Andreas
just to clarify situation, do you use Sync?

I have seen another report, to me looks like somehow the position column in our bookmarks table goes out of sync
Good thought. I did not use sync on the new profile when testing but I do on my main profile where the problem occurs. If helpful I could test sorting with sync disabled to see if it works locally.
I could also provide places.sqlite and/or output of queries against that.
This should detect parent folders where position field does not have contiguous values. In a clean db it should not return anything.

SELECT parent FROM 
       (SELECT parent, (SUM(position) - (count(*) * (count(*) - 1) / 2)) AS diff 
FROM moz_bookmarks
GROUP BY parent)
WHERE diff <> 0
uops, I indented it badly, but it should be correct regarding parentheses, let me know if it does not work.
I'm using today's nightly (2011-01-17) and this problem still occurs. I tried disabling Firefox Sync and restarting the browser, but that didn't fix it.

When I try to drag and drop bookmarks in the toolbar or library window, there's often no affect, or else it moves it to the wrong location.
I don't think the cause is having Sync enabled, but it's possible Sync concurred to the cause if it added bookmarks skipping position values. but I think we fixed this in the past in the backend, so I'm mostly asking if Sync was used to have a clear image of possible causes. So disabling Sync is not expected to have positive effects.
In reply to comment 4: query shows up 2 and 3, which is bookmarks toolbar and menu.

The following query

SELECT * 
FROM moz_bookmarks
WHERE parent in (2,3)
ORDER BY parent, position

Shows that duplicate and missing values occur in the position column.

One more idea regarding causes: I remember that sometime during 4b8 I received a message that sync was outdated and installed a build 1.6b5 which was disabled with the 4b9 update. Might that be the culprit?

For now, is there any clever SQL to update the position numbers that I could try?
This js code should do the trick, I extracted the above query from it, it's untested (That's why it's commented out), I plan to enable it as soon as I can write a test for it.
Should not be hard to build a query from that code, but that's at your own risk, so make a backup first.

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/src/PlacesDBUtils.jsm#329

If you can give any steps to reproduce the bug starting from a clean profile,, that would be appreciated too.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Drag & Drop sorting in Bookmarks menu and bookmarks library screws sort order → Drag & Drop sorting in menu and library screws sort order if position column is out of sync
Blocks: 560198
In reply to comment 9: Query tested- works like a charm- here the example for parent = 3:

UPDATE moz_bookmarks 
SET position = 
(SELECT ((SELECT count(*) FROM moz_bookmarks WHERE parent = 3) - (SELECT count(*) FROM moz_bookmarks WHERE parent = 3 AND ROWID >= b.ROWID)) FROM moz_bookmarks b WHERE parent = 3 AND id = moz_bookmarks.id) 
WHERE parent = 3
I'm unable to reproduce how this happend.
WFM if the code in comments can be shipped as part of FF4.
I can trace it back to starting with a clean profile and importing bookmarks from a Google Chrome html export.
I've not been able to reproduce exporting/importing my quite large FF bookmarks. Wondering what the difference is to Chrome?
I exported from Chrome and Firefox and compared the HTML. It's almost the same, but there are a few differences.

Each bookmark is stored inside an anchor tag with several custom attributes. The bookmarks file exported from Firefox includes "ICON_URI" and "LAST_MODIFIED" attributes in the anchor tag, while Chrome does not. In Firefox, some a tags include LAST_CHARSET="UTF-8".

All in all, Chrome's html seems quite a bit simpler than Firefox's.
hm, sigh, this worries me
http://mxr.mozilla.org/mozilla-central/source/services/sync/modules/engines/bookmarks.js#836

Sync should really NOT use setItemIndex, this API is dangerous (see bug 539517)
Blocks: 615285
Keywords: regression
Assignee: nobody → mak77
Status: NEW → ASSIGNED
Philipp, is _orderChildren guaranteed to be invoked for ALL children in a folder, or only for some of them?
(In reply to comment #15)
> Sync should really NOT use setItemIndex, this API is dangerous (see bug 539517)

What should we be using instead? moveItem?

(In reply to comment #16)
> Philipp, is _orderChildren guaranteed to be invoked for ALL children in a
> folder, or only for some of them?

I can think of edge cases where _orderChildren will not call setItemIndex for all items in a folder (e.g. user adds a bookmark to a folder on one machine and at the same time a different bookmark to the same folder on another machine, then sync).

So I'm guessing the problem is that setItemIndex doesn't update all indices if you don't call it for all items in a folder? I thought it did that but I may have misread the source back then.
(In reply to comment #17)
> So I'm guessing the problem is that setItemIndex doesn't update all indices if
> you don't call it for all items in a folder? I thought it did that but I may
> have misread the source back then.

Yes, setItemIndex is a sucky method, it's faster than moveItem (we use it for sorting) but can cause any sort of issues if it doesn't sort all of the children of a folder, because it's point-n-shoot like Duck Hunt game.
Unfortunately we missed the train to drop it, but I'd guess we can still mark it as deprecated now and fix sorting in FX5.

I'm making a patch where Sync uses moveItem and our maintenance code will fix the broken indices.
Attached patch patch v1.0 (obsolete) — Splinter Review
The reordering is even better than the original approach.

Philipp, I'm pretty sure the parent could already be known and cached when we are in _orderChildren (at least it was known to the record), but orphan reparenting is causing me a headache, thus I went through the large door.  Suggestions?
Attachment #512571 - Flags: feedback?(philipp)
Comment on attachment 512571 [details] [diff] [review]
patch v1.0

We should really have a separate bug for the Sync portion here. Filed bug 634401, please attach a separate patch there.

>diff --git a/services/sync/modules/engines/bookmarks.js b/services/sync/modules/engines/bookmarks.js
>--- a/services/sync/modules/engines/bookmarks.js
>+++ b/services/sync/modules/engines/bookmarks.js
>@@ -825,6 +825,7 @@ BookmarksStore.prototype = {
>       // Reorder children according to the GUID list. Gracefully deal
>       // with missing items, e.g. locally deleted.
>       let delta = 0;
>+      let parent = 0;

Nit: That default value makes it look like we could actually end up with parent = 0. I would suggest 'null' here.

As you say, we might save the parent id in _childrenToOrder to avoid having to look it up here. Unless of course you think the risk that the parent id becomes invalid between the time _childrenToOrder is populated and _orderChildren() is executed is too high. That window is there, but it's super small and it would require the user to delete the folder or restore from backup that very instant, right?

Any kind of unnecessary DB access hurts us a lot on mobile, but we've had some problems with caching places ids in the past (granted, we were caching them over the course of an entire Firefox session, not just for a small window during a sync.) If I need to make a call, I'd like us to err on the safe side, but put a comment in that this is a known point of inefficiency.
Attachment #512571 - Flags: feedback?(philipp) → feedback+
(In reply to comment #20)
> Nit: That default value makes it look like we could actually end up with parent
> = 0. I would suggest 'null' here.

what about -1? we use it as a invalid id value in a lot of other places, null is fine too.

> As you say, we might save the parent id in _childrenToOrder to avoid having to
> look it up here. Unless of course you think the risk that the parent id becomes
> invalid between the time _childrenToOrder is populated and _orderChildren() is
> executed is too high. That window is there, but it's super small and it would
> require the user to delete the folder or restore from backup that very instant,
> right?

No idea, this code confuses me, I suppose it's pretty hard that something wrong happens but I see Sync is doing lots of reparenting and such... Ideally the information we are looking for could be in page cache and performances should not suffer that much.

> If I need to make a call, I'd like us to err on the safe side,
> but put a comment in that this is a known point of inefficiency.

most likely the best call.
Summary: Drag & Drop sorting in menu and library screws sort order if position column is out of sync → Sorting bookmarks screws sort order if position column is out of sync
This is a low risk patch that adds a task in PlacesDBUtils (These tasks run once a week on idle) to fix bogus bookmarks positions. This can be caused by add-ons, or in this case probably caused by Sync bug 634401.

The effect is that any kind of bookmarks positioning (full folder, drag&drop, copy&paste) in any part of the UI could cause the moved or created bookmark to appear at random positions.
Plus Sync could emphasize the effect syncing these movements to other devices.

The risk is just that the queries are wrong, but the test is here to ensure that. The test itself uses random positions values on 30 bookmarks, so that each time it runs, it will test different conditions. If something goes wrong we'll notice it easily with the number of test runs we do.

PS: while I can avoid the createStatement -> createAsyncStatement change, it is an efficiency change that has no risks involved.

Since I guess reviewing this query could be hard, what it does is:
- Get all parents where triangular calculation is wrong (this means that children have bogus position inside that parent)
- For each parent, calculate the new position of each bookmark
- The new position of a bookmark is the sum of the number of bookmarks with a minor position value and the number of bookmarks with the same position value but a minor ROWID (unique id) (clearly all of this in the same folder)
So this is like doing a SELECT WHERE parent = X ORDER BY position ASC, ROWID ASC
Attachment #512571 - Attachment is obsolete: true
Attachment #512682 - Flags: review?(dietrich)
Flags: in-testsuite?
Summary: Sorting bookmarks screws sort order if position column is out of sync → adding/sorting/syncing bookmarks screws sort order if position column is out of sync
Comment on attachment 512682 [details] [diff] [review]
Fix bogus bookmarks positions with PlacesDBUtils

comments addressed on irc, r=me.
Attachment #512682 - Flags: review?(dietrich)
Attachment #512682 - Flags: review+
Attachment #512682 - Flags: approval2.0+
Attached patch for checkin (obsolete) — Splinter Review
I've found a small query improvement, just to be paranoid I've run the test 1000 times on a loop (the test is randomized so this gives more coverage), ready to go.
Attachment #512682 - Attachment is obsolete: true
http://hg.mozilla.org/mozilla-central/rev/753b5de51ba1
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Flags: in-testsuite? → in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → Firefox 4.0b12
backed out because ts_places_generated_max started to crash in libmozsqlite, without any apparent reason...

http://hg.mozilla.org/mozilla-central/rev/a96f4d673e66
http://hg.mozilla.org/mozilla-central/rev/299667917db6

Maybe a threadlock?
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297949248.1297951208.28554.gz

Thread 0 (crashed)
 0  linux-gate.so + 0x424
    eip = 0x00977424   esp = 0xbfbc4d34   ebp = 0xbfbc4d68   ebx = 0xb2774788
    esi = 0x00000000   edi = 0x00000000   eax = 0xfffffffc   ecx = 0x00000080
    edx = 0x00000002   efl = 0x00000202
    Found by: given as instruction pointer in context
 1  libmozsqlite3.so!pthreadMutexEnter [sqlite3.c : 17004 + 0xa]
    eip = 0x003af3ed   esp = 0xbfbc4d70   ebp = 0xbfbc4d78
    Found by: previous frame's frame pointer
 2  libmozsqlite3.so!sqlite3_mutex_enter [sqlite3.c : 16244 + 0x8]
    eip = 0x00339772   esp = 0xbfbc4d80   ebp = 0xbfbc4d88   ebx = 0x003bdc58
    Found by: call frame info
 3  libmozsqlite3.so!sqlite3_extended_result_codes [sqlite3.c : 106979 + 0xa]
    eip = 0x00342b45   esp = 0xbfbc4d90   ebp = 0xbfbc4da8   ebx = 0x003bdc58
    Found by: call frame info
 4  0x17bcd18
    eip = 0x017bcd19   esp = 0xbfbc4db0   ebp = 0x00000000   ebx = 0x02179d1c
    esi = 0xbfbc4e10
    Found by: call frame info

Thread 7
 0  libmozsqlite3.so + 0x50f7
    eip = 0x003390f7   esp = 0xa82feb1c   ebp = 0xa82feb48   ebx = 0x003bdc58
    esi = 0x00000001   edi = 0xa9d2a238   eax = 0x00000063   ecx = 0x00000004
    edx = 0x00000001   efl = 0x00000282
    Found by: given as instruction pointer in context
 1  libmozsqlite3.so!sqlite3VdbeExec [sqlite3.c : 62028 + 0x17]
    eip = 0x003a4dab   esp = 0xa82feb50   ebp = 0xa82fef58
    Found by: previous frame's frame pointer
 2  libmozsqlite3.so!sqlite3_step [sqlite3.c : 58704 + 0x7]
    eip = 0x0038b7cb   esp = 0xa82fef60   ebp = 0xa82ff0f8   ebx = 0x003bdc58
    esi = 0xb276a808   edi = 0xb276a808
    Found by: call frame info
 3  0x17bce58
    eip = 0x017bce59   esp = 0xa82ff100   ebp = 0x00000000   ebx = 0x02179d1c
    esi = 0xae7dd7a8   edi = 0xae7dd7a8
    Found by: call frame info
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
So here is what happens in ts_places:
- Firefox starts
- Talos boxes are always idle so idle-daily is fired
- Maintenance runs
- the new repositioning query runs
- startup continues

Now the problem is that the max dirty database has 140 000 entries (all of them) with position set to -1... So the query has to reposition all of them.
When startup continues it tries to run a synchronous query and the main-thread enters a mutex.
The repositining query takes a long time to reposition 140 000 entries, and the process gets killed before it can finish.

I have to fix the crazy script that generates the dirty db, and generate a new db with proper positions.
Attached patch alternative queries (obsolete) — Splinter Review
Provided that I'll file a bug to fix the dirty profile creation, and a bug to fix a potential memory corruption bug in idle service, these new queries are able to fix the position for 140 000 bookmarks without freezing, in just 3-4 seconds rather than infinite minutes.
Before proceding with these I still want to get a tryserver run and run my test in loop other 1000 times.
I'd say try was largely positive
http://tbpl.mozilla.org/?tree=MozillaTry&rev=9a5a821802c6

But I still have an idea to improve the test and make it even better.
Attached patch for checkinSplinter Review
Now with a harder test, did run it 1500 times, all pass, tryserver approved.
I don't think there is any value in re-asking review here, the touched code did not change, nor the structure of the patch, the test was just coalesced with helpers to test 2 folders at the same time. Reviewing changed queries is not that interesting unless you are a databases lover.

I'll push this and then file the bugs I've found while investigating the issue.
Attachment #512935 - Attachment is obsolete: true
Attachment #513677 - Attachment is obsolete: true
http://hg.mozilla.org/mozilla-central/rev/51b61720bc35
Status: REOPENED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
Depends on: 635474
Is there any way to fix this?
The problem infected my bookmarks toolbar.

Even if I export HTML and reimport it I still have the bug (and in safe mode).

How can I get an uncorrupted bookmark toolbar back.

I can't delete everything - firefox won't let me.
Now I have a junk "mobile bookmarks" folder.  How do I get rid of that?  It wasn't there before (and ended up duplicating a different non-mobile folder)
Sync will apparently spread the breakage across all the computers.

I also tried upgrading to firefox 4 with one of them which may have started the problem.

So now I have 5 computers with broken bookmarks and no way to fix or restore them.
(In reply to comment #39)
> I also tried upgrading to firefox 4 with one of them which may have started the
> problem.

most likely it could have fixed the problem on that specific computer, while others are still broken. I'd suggest upgrading to Firefox 4 wherever you can.
Firefox 4 didn't fix it, it probably caused it.

And too many things don't work yet as far as extensions go for me to use 4, so I won't be upgrading for a while.
The problem is that currently your Sync profile has bogus data due to the bug that was in Sync, Firefox 4 fixes that data locally, but then Sync has to fetch the changes, so if it appears to cause the bug it's just because of bogus remote data, the FF4 profile is indeed correct. You could probably, with the Firefox 4 profile do a Reset in sync, and reupload all data.
Consider looking for alternatives for the incompatible extensions, add-on authors had a lot of time to fix those, and most of the extensions are compatible already, some could just be unmaintained.
Not sure I can follow this. Latest when all clients are FF4 the problem should be solved.

OR- do we need the FF4 fix on the central sync data store?
FF can only fix local data, it cannot fix remote data, the only way to fix remote data would be to resync all position values from local to remote, but only for FF4 installs (3.6 is currently unable to fix broken positions).
This is that kind of bug that has a tail and disappears after some time of use of a new version. You can speed up that time by resetting your remote data from a fixed profile.
Most of the extensions are probably going to be compatible, but I;m not going through dozens looking for betas.  I would rather go to Chrome than Firefox 4 right now given the extensions I use.  Even many common ones, unless it is now a completely different extensions.

What I managed to do is use a backup set of bookmarks with sync disconnected and restored that and told sync to use that as the master.  I also took the same backup file manually to other computers.

It says "resolved fixed" - is it actually fixed on both 3.x and 4.0 so if I resync I won't just get it doing it again?  Is there something I can do with mysql manually to fix it (one of the extensions that doesn't work in 4.0)
3.6 is going to be around for a while - the linux distros if nothing else will take a while to switch to 4.0.  With sync spreading the breakage this should be important enough to implement a fix and send it out with the next 3.6 update.  There may also be people with mixed 3.6 and 4.0 for the same reason.  Maybe their Mac is upgraded but not Linux.

If there are still versions out there that can cause the bug, you need to have one that fixes it.

I don't think it is acceptable to have DATA DESTROYING bugs left unfixed or without a repair mechanism when one is possible.
If you use firefox 4 or latest sync version in previous versions of the browser, after resetting data the bug should not happen anymore since both have the fix.
AFTER resetting the data?  If the data is not reset, it will contaminate the other sync clients, and apparently recontaminate things after the reset - unless everything is reset manually.  I was up to date on all the platforms but it still might be broken.

I don't think you can guarantee that 3.6 with sync can ever be fixed as you cannot be sure if any reset source isn't contaminated.  If so, how?

I asked about backups and restores - I really don't know.  Some of the restores I tried seemed contaminated.

Even if I run firefox 4 again, it might fix things, but it won't tell me that it had to fix things so I would know to push that version.
So I have the three computers, mac, pc, and linux, and manually restored the bookmarks and synced, and the mac today had a scrambled bookmark toolbar.


What do I need to do to fix this?  I did run firefox 4.  It didn't help.  Everything is up to date as of yesterday, I synced, so either it is still broken or syncing doesn't work even if I tell it to completely replace.

Nor has anyone said how to detect when something is wrong.  The positions look normal for a long while.

This is neither resolved nor fixed, at least for 3.6.
Doesn't look like you did a Sync Reset (reupload all data on the server from the Firefox 4 profile).
I didn't do that but I didn't think I needed to do so explicitly, nor is it a trivial thing to download, run firefox 4 hwoever many times to force the sync, then go back to 3.6.

I did reset sync with an old copy that should have been undamaged under 3.6.

IS THERE ANY WAY TO TEST FOR A DAMAGED BOOKMARK DATABASE?

I ran mmsql against the one I force-synced and the positions for the bookmark toolbar entries were from 0 to the last one without gaps and correct.

So what happens if I do the 4.0 reset and back under 3.6 it scrambles again?
On 3.6 you can force a "replace local data with remote" after on 4.0 you did a "replace remote data with local".
The database must only have adjacent position values from 0 up.
Nor 4.0 nor 3.6 (with latest Sync version) can scramble data, if any data is scrambled it's because that data is scrambled in the cloud, reset as above, or it will fix by itself after some time. I'm sorry but I have no magic wand for fixing remote data.
I just did exactly what you said - ran firefox 4 on my windows machine, the bookmarks were correct, told it to reset sync from that local copy to the cloud (where the order was correct), and now I'm getting scrambled bookmarks on my 3.6 toolbar.

So firefox 4 either didn't repair of is still trashing the bookmarks.
Update - after the sync completed - I don't know when, the bookmark toolbar unscrambled, but it did that before, and stayed ok until I added or removed a bookmark.

When I did the sync, it didn't have the names of the other computers listed - I've seen duplicate names or missing names or wrong names.  Each of 5 computers has different names.
And of course, "Sync encountered an error while syncing: Unknown Error" on one of my 3.6.  It doesn't know what error happened?  It is better that firefox 4 (at least my copy on windows) that gives no indication sync is running.  No animated icon, no way of checking status, just click OK and pray that it completed or did something.  Maybe it is related to the problem.  But without any information it is really hard to tell.
At this point I don't think your report has anything to do with this bug, please file a bug against Mozilla Services/Sync with a brief description of your experience.
This bug is only handling a local database corruption, adding more comments is not going to do anything useful here.
Verified on: 
Build identifier: Mozilla/5.0 (Windows NT 5.1; rv:2.2a1pre) Gecko/20110405 Firefox/4.2a1pre
Status: RESOLVED → VERIFIED
Blocks: 487805
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: