Closed Bug 1310554 Opened 5 years ago Closed 5 years ago

updateChildIndex in PlacesSyncUtils is extremely inefficient

Categories

(Firefox :: Sync, defect, P1)

defect

Tracking

()

RESOLVED FIXED
Firefox 52
Tracking Status
firefox50 --- unaffected
firefox51 --- fixed
firefox52 --- fixed

People

(Reporter: markh, Assigned: lina)

References

Details

Attachments

(1 file, 1 obsolete file)

If you have 10000 bookmarks in a single folder, applying those records takes many hours, during which Firefox is completely unresponsive. Instrumenting PlacesSyncUtils shows this is called 10000 times, with each call taking around 5-10 seconds.
n^3 FTL. :-(
Assignee: nobody → kcambridge
Status: NEW → ASSIGNED
Priority: -- → P1
Comment on attachment 8801913 [details]
Bug 1310554 - Improve `PlacesSyncUtils.bookmarks.reorder` performance.

https://reviewboard.mozilla.org/r/86510/#review85442

Apart from the temp table name (which may not even be an actual issue), this looks great to me. I'll r+ it now regardless, as Mak's r+ is the one you care about ;)

::: toolkit/components/places/Bookmarks.jsm:1194
(Diff revision 1)
>  
> +      // To do the update in a single step, we build a temporary
> +      // (guid, position) sorting table.  We then use count() in the sorting
> +      // table to avoid skipping values when no more existing GUIDs have been
> +      // provided.
> +      yield db.executeCached(

Even though we are in a transaction here, is there a risk there will be 2 reorders (on different folders) executing concurrently causing an issue with the temp table name? Sync will not do that, and I can't really imagine that happening in practice, but it seems possible. Google tells me the temp tables are per session rather than per transaction.

::: toolkit/components/places/tests/unit/test_sync_utils.js:176
(Diff revision 1)
>    yield PlacesUtils.bookmarks.eraseEverything();
>  });
>  
> +add_task(function* test_order_reverse() {
> +  do_print("Insert bookmark");
> +  // Inserting a batch of 3k bookmarks using `PlacesUtils.bookmarks.insert` is

ISTM that this test should really be in test_bookmarks_reorder.js (ie, testing the Bookmarks.jsm changes) rather than doing so indirectly via PlacesSyncUtils. In either case, I'd think just testing `Math.floor(SQLITE_MAX_VARIABLE_NUMBER / 2) * 3 + 1` (or maybe a few boundary conditions around the batch size) would be enough, rather than an arbitrary 3k given the perf hit you mention in the comments.
Attachment #8801913 - Flags: review?(markh) → review+
(In reply to Mark Hammond [:markh] from comment #3)
> Google tells me the temp
> tables are per session rather than per transaction.

Temp tables are bound to the connection, 2 different connections don't see each others' temp tables.
They are alive as far as the connection is alive or they get dropped.
Comment on attachment 8801913 [details]
Bug 1310554 - Improve `PlacesSyncUtils.bookmarks.reorder` performance.

https://reviewboard.mozilla.org/r/86510/#review85826

Just some questions, as usual!
How much of the win comes from the SQL changes and how much from code removal in PlacesSyncUtils? Off-hand the 2 query approaches should be doing the same exact thing, before and after, I'd expect the WITH to be more efficient cause it can create and destroy the temporary view without an overhead. If the SQL really matters, I think we should contact drh and ask him why the WITH is so less efficient than a plain TEMP table. They can usually give us answers in a few hours (you can contact support@sqlite.org from your mozilla.com address).

::: toolkit/components/places/Bookmarks.jsm:1159
(Diff revision 1)
>  
>  ////////////////////////////////////////////////////////////////////////////////
>  // Reorder implementation.
>  
>  function reorderChildren(parent, orderedChildrenGuids) {
>    return PlacesUtils.withConnectionWrapper("Bookmarks.jsm: updateBookmark",

This label looks wrong, could you please fix it?

::: toolkit/components/places/Bookmarks.jsm:1209
(Diff revision 1)
> +          params.push(children[startIndex + i].guid, startIndex + i);
> +        }
> +        yield db.execute(
> +          `INSERT INTO moz_bookmarks_reorder_sorting(g, p)
> +           VALUES ${new Array(valuesPerChunk).fill("(?, ?)").join()}
> +          `, params);

binding has a cost. since in this case we completely control the values, and it's safe values, there is no injection risk, so we could just build the VALUES string directly and avoid chunking. Or maybe not, there is a limit to the number of values that is 2000. we could increase it by modifying the SQLITE_MAX_COLUMN define, but to which value? we have some (0.5%) users with more than 8000 bookmarks, likely due to add-ons. but we don't have telemetry about max number of bookmarks in a folder, maybe it could be useful to add it!

A separate util to do this chunking could be generally useful, maybe it could be a generator what one can loop until done.

Btw, I'm surprised that a temp table is faster than WITH, since in the end WITH just creates a view, and looks like the 2 aproaches are doing the same exact thing. Do you have some measurements about this? We may want to ask Sqlite team about it.
Attachment #8801913 - Flags: review?(mak77)
(In reply to Kit Cambridge [:kitcambridge] from comment #2)
> * Instead of using a CTE for the `sorting` table, we build up a
>   temporary table, chunking insertions to avoid exceeding the bound
>   param limit. We then use this table to update positions in
>   `moz_bookmarks`. The `UPDATE` query is slow (reversing 10k bookmarks
>   takes 3m41s on my machine), but doesn't block the main thread, and
>   doesn't cause slow script warnings.

I'm sorry, I didn't see the comment due to bugzilla, thanks Mark for letting me notice this.
Do you have a stack of the segfault? We should report that to Sqlite, it may be an intrinsic limit of CTE or a real bug, they should know about it.

I'm still a bit worried we'll end up blocking the connection for 4 minutes. it's true we are in another thread, but everything on the connection is serialized, and we (unfortunately) still have synchronous queries, and many are used in the UI. It's enough that something like the star button asks for the bokmarked status of a page and we are blocking everything.

Is there a way to chunk the whole reorder?
The course of action I'd suggest is to collect the segfault stack, contact sqlite team by mail (we also have a phone number if you want) and ask why it segfaults and if there's a clear limit to CTE, or if the limit is just OOM or such.
Depending on the answer, we may keep the original query, or use temp table, but in both cases we need a chunking strategy. in the temp table case, the chunking could happen in a followup.
But first of all we need that additional information to take an informed decision.
Hey Mak,

I spent some time trying to get a reproducible crash for the SQLite team. TL;DR: I'm not actually sure it's SQLite's fault. I've only been able to reproduce it on OS X. My Linux VM runs into the 5 minute timeout and kills the test for 10k bookmarks, but it *can* reorder 8k (in 3m33s) without crashing. I suspect it's not a limitation of the CTE.

On OS X, I'm only able to reorder about 5415 bookmarks at a time. 5418 is the maximum I've seen so far, but that can also crash if I stress my system while the test runs. Here's the crash dump: https://gist.github.com/kitcambridge/1acd78d4bf855adfe47471395c9e23fb

Looks like a lot of calls to `sqlite3SelectDup`. A query of 5415 bookmarks comes out to 123806 bytes, which is well below SQLITE_MAX_SQL_LENGTH = 1000000.

Should I still contact the SQLite team, even though that dump has a lot of Mozilla-specific stuff? I can also write a toy C program that builds up a large query and executes it directly with sqlite3_exec, to see if I can find a smaller test case.
Flags: needinfo?(mak77)
Another interesting (maybe related) point: something seems to starve the event loop on OS X. I added a `setInterval` call that prints a message every 5 seconds before calling `reorder`. Here's the output I see: https://gist.github.com/kitcambridge/d8b25adc830921e2ef497a3e31f0a4f4 I'm not sure why nothing is printed between 0:32 and 0:47. This doesn't happen on Linux; that prints "Still alive..." every 5-6 seconds. 

Of course, even on Linux, this query will still block synchronous queries waiting on that transaction. So we probably need a different one for sorting. :-(
Attachment #8801913 - Attachment is obsolete: true
I think removing unnecessary logic from PlacesSyncUtils and using a map lookup is a win, so splitting that out into a separate patch that we can uplift. Let's figure out what to do about the sorting query in a follow-up.

Bug 1293365 needs Aurora uplift, too. Mak, would you mind requesting that, or should I?

(In unrelated news, I got the 10k test working on Linux by bumping http://searchfox.org/mozilla-central/rev/d96317a351af8aa78ab9847e7feed964bbaac7d7/testing/xpcshell/runxpcshelltests.py#50. Disk I/O in my VM is painfully slow).
Thread 19 Crashed:: mozStorage #1
0   libsystem_malloc.dylib          0x00007fff8e2006c4 tiny_malloc_from_free_list + 8
1   libsystem_malloc.dylib          0x00007fff8e1ff705 szone_malloc_should_clear + 292
2   libsystem_malloc.dylib          0x00007fff8e1ff5a1 malloc_zone_malloc + 71
3   libnss3.dylib                   0x000000010a0ac619 sqlite3MemMalloc + 25 (sqlite3.c:20388)
4   libnss3.dylib                   0x0000000109fd4a1e sqlite3Malloc + 334 (sqlite3.c:24067)
5   libnss3.dylib                   0x0000000109ff5966 dbMallocRawFinish + 22 (sqlite3.c:24425)
6   libnss3.dylib                   0x000000010a07508e exprDup + 110 (sqlite3.c:90350)
7   libnss3.dylib                   0x000000010a075900 sqlite3ExprListDup + 224 (sqlite3.c:90488)
8 libnss3.dylib 0x000000010a075577 sqlite3SelectDup + 87 (sqlite3.c:90578)

Looks like a crash in malloc, could be OOM?
Honestly I'd still heads-up the Sqlite team, sending them the same dump and explaining that this doesn't happen when using a temp table, but happens when using a CTE. They could be able to optimize something there and we'd benefit from such a fix in general.
On our side, we must figure out a way to do all the sorting in chunks (1000 at a time or such), one way or another.

(In reply to Kit Cambridge [:kitcambridge] from comment #11)
> Bug 1293365 needs Aurora uplift, too. Mak, would you mind requesting that,
> or should I?

feel free to, I have nothing against uplifts :)
Flags: needinfo?(mak77)
> (In reply to Kit Cambridge [:kitcambridge] from comment #11)
> > Bug 1293365 needs Aurora uplift, too. Mak, would you mind requesting that,
> > or should I?

Actually, that bug states it is fixed in Firefox 51 already?
Comment on attachment 8802785 [details]
Bug 1310554 - Simplify `BookmarkSyncUtils.order` and use map lookups in `Bookmarks.reorder`.

https://reviewboard.mozilla.org/r/87082/#review86200
Attachment #8802785 - Flags: review?(mak77) → review+
Pushed by kcambridge@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3b93af82598a
Simplify `BookmarkSyncUtils.order` and use map lookups in `Bookmarks.reorder`. r=mak
(In reply to Marco Bonardo [::mak] from comment #13)
> Actually, that bug states it is fixed in Firefox 51 already?

Off-by-one error. :-) I got mixed up and thought Aurora was 50. Will email the SQLite team today with the log.
I had a look at the query on the train today. At first glance, it's not clear to me why COUNT(*) or the self-join are needed. This query seems to do the trick, and takes about 10s to reorder 10k bookmarks:

    WITH sorting(g, p) AS (
      VALUES ${valuesTable}
    )
    UPDATE moz_bookmarks SET position = IFNULL(
      (SELECT p FROM sorting WHERE g = guid),
      -position
    )
    WHERE parent = :parentId;

If we use IDs instead of GUIDs, and an integer primary key on the temp table, the query is lightning-fast. This takes ~10ms to reorder 10k bookmarks, and ~125ms to reorder 40k bookmarks:

    CREATE TEMP TABLE moz_bookmarks_reorder_sorting(i INTEGER PRIMARY KEY, p INTEGER);
    UPDATE moz_bookmarks SET position = IFNULL(
      (SELECT p FROM moz_bookmarks_reorder_sorting WHERE i = id),
      -position
    )
    WHERE parent = :parentId;

Do you think that could work, or do we need the self-join?
Flags: needinfo?(mak77)
(In reply to Kit Cambridge [:kitcambridge] from comment #17)
> takes about 10s to reorder 10k bookmarks.

That should be 10s to reorder *5k bookmarks*. I can't reorder 10k with the CTE yet, because OS X.
The comment explaining the count need sucks indeed, the reason was that between the time we build our nice Map and the time the query runs, some other API call could remove one of the guids that are in the map.
Remember we have 2 APIs and also that the async API is not serialized, thus it can happen.

So supposing your Map is [ A: 0, B: 1, C: 2 ] and someone removes B after you built the map but before the query runs, your final result would have an hole at position 1, cause you would set A to 0 and C to 2.

Now, the problem is that even the current join doesn't save us, because it should join on moz_bookmarks, not on the sorting table itself!
So this is broken and we must fix it somehow. It could even be that we can run the simplified query as you suggest and then run a query to check there are no holes (for example http://searchfox.org/mozilla-central/rev/2e7511ed5af5497cbd7e2fa5c043efdc1751b34a/toolkit/components/places/PlacesDBUtils.jsm#567), then only activate special handling if there are holes. or we could change this to a recursive query. It's complex. If we can find a solution that runs a cheaper query, we should investigate it.

Second problem, while I was looking at your suggestion, it came to my mind that also -position is wrong, because someone could insert at index 0, and -0 is still 0. So we could end up with 2 items at position 0. it should be (-position - 1)

Sigh. If you have any time to help fixing these issues in follow-ups, I'll appreciate it a lot, atm I really can't spare any time on fixing this :(
Flags: needinfo?(mak77)
https://hg.mozilla.org/mozilla-central/rev/3b93af82598a
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 52
Comment on attachment 8802785 [details]
Bug 1310554 - Simplify `BookmarkSyncUtils.order` and use map lookups in `Bookmarks.reorder`.

Approval Request Comment
[Feature/regressing bug #]: Bug 1274108.
[User impact if declined]: Browser becomes completely unresponsive when syncing folders with a large number of bookmarks.
[Describe test coverage new/current, TreeHerder]: Covered by existing tests.
[Risks and why]: Low risk. This patch removes dead code, and changes an array lookup to a map.
[String/UUID change made/needed]: None.
Attachment #8802785 - Flags: approval-mozilla-aurora?
Comment on attachment 8802785 [details]
Bug 1310554 - Simplify `BookmarkSyncUtils.order` and use map lookups in `Bookmarks.reorder`.

Fix a performance issue related to syncing bookmarks. Take it in 51 aurora.
Attachment #8802785 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Needs rebasing for Aurora.
Flags: needinfo?(kcambridge)
You need to log in before you can comment on or make changes to this bug.