Closed
Bug 583821
Opened 14 years ago
Closed 14 years ago
Firefox Sync full syncs are really slow after SQLite3.7.0 upgrade
Categories
(Toolkit :: Storage, defect)
Toolkit
Storage
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
blocking2.0 | --- | beta4+ |
People
(Reporter: dholbert, Unassigned)
References
Details
(Keywords: regression, relnote)
STEPS TO REPRODUCE: 0. Be using a profile that has lots of history (> 5000 entries) and is set up with latest Firefox Sync "experimental version" (I'm using 1.4.3) 1. Start Firefox 2. Tools menu | Sync | Preferences 3. (optional) For simplicity, tell Sync to just synchronize your history (Minefield will: Use my custom settings" in lowest drop-down menu 4. Click "Manage Account" and then "Reset Sync" 5. Choose the bottom option, "Replace all other devices with your local data" 6. Click Next, Next, and then trigger a sync. ACTUAL RESULTS: The sync hangs for at least a minute on my machine, apparently while querying my local history. REGRESSION RANGE (using .txt files in nightly download dirs): 20100723031340 58101a16aff7 20100724030057 30239e4cebd8 http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=58101a16aff7&tochange=30239e4cebd8 This includes both a places change related to history (Bug 580228) as well as a sqlite upgrade (Bug 581000). Filing in Places right now, but it may very well be a sqlite/storage issue.
Reporter | ||
Comment 1•14 years ago
|
||
Note that Firefox 4.0 b2 isn't affected by this -- IMHO this should block 4.0b3, or else it'll be a major regression from one beta to the next for Weave users.
blocking2.0: --- → ?
Reporter | ||
Comment 2•14 years ago
|
||
A few clarifications on comment 0: - I believe this affects all "full" syncs - I initially discovered it without any "Reset Sync" required. I only include that in my STR because that's a surefire way to get a full sync, and also because it's what I did to find the regression range. - The hang here locks up the UI and spikes the CPU. (whereas in nightlies before this regressed, the CPU usage goes up, but the UI remains responsive. That'd be the 'expected behavior', I guess.)
Comment 3•14 years ago
|
||
What operating system are you on?
Reporter | ||
Comment 4•14 years ago
|
||
I've seen this on both Ubuntu 10.04 & 10.10a2. This also might be a dupe of (or related to) Bug 563538. (that's another hang that's marked as blocking Bug 580228, which is the history/places change in this bug's regression range) FWIW, here's the chunk of about:sync-log around the hang (notice that 1:16 elapses before we figure out "5000 outgoing items pre-reconciliation" -- that's the hang): >2010-08-02 11:57:15 Engine.History DEBUG Resetting history last sync time >2010-08-02 11:57:15 Net.Resource DEBUG PUT Length: 563 >2010-08-02 11:57:15 Net.Resource DEBUG PUT success 200 https://phx-sync064.services.mozilla.com/1.0/dholbert/storage/crypto/history >2010-08-02 11:57:15 Engine.History DEBUG First sync, uploading all items >2010-08-02 11:58:31 Engine.History INFO 5000 outgoing items pre-reconciliation >2010-08-02 11:58:31 Collection DEBUG GET success 200 https://phx-sync064.services.mozilla.com/1.0/dholbert/storage/history?full=1
Comment 5•14 years ago
|
||
(In reply to comment #4) > FWIW, here's the chunk of about:sync-log around the hang (notice that 1:16 > elapses before we figure out "5000 outgoing items pre-reconciliation" -- that's > the hang): > >2010-08-02 11:57:15 Engine.History DEBUG Resetting history last sync time > >2010-08-02 11:57:15 Net.Resource DEBUG PUT Length: 563 > >2010-08-02 11:57:15 Net.Resource DEBUG PUT success 200 https://phx-sync064.services.mozilla.com/1.0/dholbert/storage/crypto/history > >2010-08-02 11:57:15 Engine.History DEBUG First sync, uploading all items > >2010-08-02 11:58:31 Engine.History INFO 5000 outgoing items pre-reconciliation > >2010-08-02 11:58:31 Collection DEBUG GET success 200 https://phx-sync064.services.mozilla.com/1.0/dholbert/storage/history?full=1 Mardak - does this tell you anything useful?
Comment 6•14 years ago
|
||
(In reply to comment #4) > This also might be a dupe of (or related to) Bug 563538. (that's another hang > that's marked as blocking Bug 580228, which is the history/places change in > this bug's regression range) Note that that bug has only been seen by people on OS X so far. Knowing that you are on Linux and seeing it would be interesting.
Comment 7•14 years ago
|
||
Assuming this is latest-weave, it's doing a query SELECT url FROM moz_places_view: http://hg.mozilla.org/services/fx-sync/file/eaffcccd1744/services/sync/modules/engines/history.js#l137 Older versions used history query api and that was locking up the browser, so we switched to sql.
Comment 8•14 years ago
|
||
So that change was made in 1.4a1pre (if I'm reading things correctly) which means dholbert was running that code. Is now a bad time to point out to you that queries against the view, especially with a WHERE clause are going to be *very* slow? As in, create a new temporary table containing the results of the view query, and then querying that (which has no indexes). If you can avoid it, you really /really/ *really* should.
Comment 9•14 years ago
|
||
I should also add that it's possible that the SQLite upgrade regressed querying the view, but we've had a policy in places that we NEVER select from it because it has bad performance. I'd hate to have to start caring about view performance because weave uses it (not to mention that it's likely going away in bug 552023).
Comment 10•14 years ago
|
||
So while this may have been a result of the SQLite upgrade, queries on the view have always been slow; so much so that places specifically forbids using them. Since we are turning on the sync UI in b3 (plan at least), this should block beta 4.
blocking2.0: ? → beta4+
Component: Places → Sync
Keywords: relnote
Product: Toolkit → Weave
QA Contact: places → sync
Version: Trunk → unspecified
Comment 12•14 years ago
|
||
(Note that this bug is now marked as blocking beta4+ even though it doesn't show up in the UI that way. It does show up in queries for beta 4 blockers.)
Reporter | ||
Comment 13•14 years ago
|
||
(In reply to comment #10) > So while this may have been a result of the SQLite upgrade I'm marking this as a regression from that bug (bug 581000), for bookkeeping's sake (even though it sounds like the underlying issue is that Sync shouldn't have been using views in the first place). Also: the SQLite upgrade bug was just backed out for investigation of issues unrelated to this bug. If that backout lasts through the b3 tagging, then this bug won't affect the 4.0b3 release after all.
Blocks: SQLite3.7.0.1
Comment 14•14 years ago
|
||
(In reply to comment #13) > Also: the SQLite upgrade bug was just backed out for investigation of issues > unrelated to this bug. If that backout lasts through the b3 tagging, then this > bug won't affect the 4.0b3 release after all. Assuming that SQLite was the cause of this problem. And it will not be relanding in time for b3.
Reporter | ||
Comment 15•14 years ago
|
||
(In reply to comment #14) > Assuming that SQLite was the cause of this problem. FWIW, I just confirmed that it was indeed the cause, by testing an m-c build from just after the sqlite-backout vs. an m-c build with the backout stripped off. (The backout build was fine, and the sans-backout build was affected.)
Reporter | ||
Updated•14 years ago
|
Summary: Firefox Sync full syncs are really slow since 201007240 m-c nightly → Firefox Sync full syncs are really slow after SQLite3.7.0 upgrade
Comment 16•14 years ago
|
||
(In reply to comment #10) > So while this may have been a result of the SQLite upgrade, queries on the view > have always been slow; so much so that places specifically forbids using them. > Since we are turning on the sync UI in b3 (plan at least), this should block > beta 4. Okay, so what's the recommended alternative to querying the view? It's not fast, but it's correct, data-wise, and doesn't cause hangs.
Comment 17•14 years ago
|
||
(In reply to comment #16) > Okay, so what's the recommended alternative to querying the view? It's not > fast, but it's correct, data-wise, and doesn't cause hangs. https://wiki.mozilla.org/Places/Places_SQL_queries_best_practices or wait for me to kill temp tables... I'm still fighting a couple oranges though.
Comment 18•14 years ago
|
||
(In reply to comment #17) > or wait for me to kill temp tables... I'm still fighting a couple oranges > though. Except that that won't help them exactly because they still support Firefox 3.5 and Firefox 3.6.
Comment 19•14 years ago
|
||
(In reply to comment #17) > https://wiki.mozilla.org/Places/Places_SQL_queries_best_practices That says not to query the view tables, except that querying the actual tables caused bug 575423. Of course the fix doesn't have a test so there's no way to tell for sure whether it actually solves the problem at hand. :/ (In reply to comment #18) > Except that that won't help them exactly because they still support Firefox 3.5 > and Firefox 3.6. We could have conditional code so that we do the right thing on Firefox 3.5/3.6 and 4.
Comment 20•14 years ago
|
||
(In reply to comment #19) > (In reply to comment #17) > > https://wiki.mozilla.org/Places/Places_SQL_queries_best_practices > > That says not to query the view tables and 5 rows below it says "When using SELECT we must always be sure to query both tables"[...]"To query both tables we can use a special SQL construct"[...]Example.
Comment 21•14 years ago
|
||
Hey look, the blocking flag cannot be changed. Is this fixed now that weave is not querying on the view?
Comment 22•14 years ago
|
||
Is 3.7.0 landed?
Comment 23•14 years ago
|
||
3.7.0.1 landed.
Comment 24•14 years ago
|
||
see bug 581000 for 3.7.0.1
Updated•14 years ago
|
Assignee: nobody → mitcho
Comment 25•14 years ago
|
||
Oops, wrong tab! Reassigning to mconnor to resolve for b4; I think we backed out SQLite, though.
Assignee: mitcho → mconnor
Comment 26•14 years ago
|
||
(In reply to comment #25) > Oops, wrong tab! Reassigning to mconnor to resolve for b4; I think we backed > out SQLite, though. And then landed 3.7.0.1 though. I believe that Sync not querying on the views (bug 583852) fixed this, but somebody should confirm that (comment 21)
Reporter | ||
Comment 27•14 years ago
|
||
I'll try to confirm. brb.
Reporter | ||
Comment 28•14 years ago
|
||
Nope, still broken on a full initial sync (using STR from comment 0). Hangs on the same step as before: >2010-08-16 14:49:55 Engine.History DEBUG First sync, uploading all items >2010-08-16 14:51:20 Engine.History INFO 5000 outgoing items pre-reconciliation That's almost 90-seconds of browser-hanging pain. I'm using Sync 1.4.4 with today's nightly: Mozilla/5.0 (X11; Linux x86_64; rv:2.0b4pre) Gecko/20100816 Minefield/4.0b4pre
Reporter | ||
Comment 29•14 years ago
|
||
(In reply to comment #28) > I'm using Sync 1.4.4 with today's nightly FWIW, I tried reproducing with 1.4.4 & an old nightly shortly after this regressed (20100727), but my sync failed in that case (retried a few times), with errors like: 2010-08-16 14:48:37 Engine.Bookmarks DEBUG Total (ms): sync 4377, processIncoming 35, uploadOutgoing 1310, syncStartup 3025, syncFinish 1, createRecord 398, resetLastSync 0, resetClient 1 2010-08-16 14:48:37 RecordMgr DEBUG Failed to import record: Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIHttpChannel.requestMethod] Stack trace: Res__request()@resource://services-sync/resource.js:205 < Res_get()@resource://services-sync/resource.js:293 < RecordMgr_import()@resource://services-sync/base_records/wbo.js:119 < RecordMgr_get()@resource://services-sync/base_records/wbo.js:142 < SyncEngine__syncStartup()@resource://services-sync/engines.js:352 < anonymous()@resource://services-sync/engines.js:196 < SyncEngine__sync()@resource://services-sync/engines.js:744 < anonymous()@resource://services-sync/engines.js:196 < WrappedNotify()@resource://services-sync/util.js:115 < Engine_sync()@resource://services-sync/engines.js:206 < WeaveSvc__syncEngine()@resource://services-sync/service.js:1455 < anonymous()@resource://services-sync/service.js:1404 < WrappedNotify()@resource://services-sync/util.js:115 < WrappedLock()@resource://services-sync/util.js:87 < WrappedCatch()@resource://services-sync/util.js:66 < sync()@resource://services-sync/service.js:1289 < anonymous()@sync.js:326 This is all with my "normal" browsing profile. I wonder if the error is due to a recent change in places that makes an older nightly barf? In any case, though, the hanging persists in today's nightly with the most recent "beta channel" version of weave, as noted in comment 28.
Comment 30•14 years ago
|
||
I wonder if turning off automatic_index would help... http://www.sqlite.org/pragma.html#pragma_automatic_index (shooting in the dark here)
Reporter | ||
Comment 31•14 years ago
|
||
sdwilsh/mconnor asked me to try "stock" mozilla-central, without the Sync addon installed, to see if that helps. It doesn't. :(
Comment 32•14 years ago
|
||
Okay, so it's not views, since everywhere has that fixed. This looks like a SQLite regression at this point.
Comment 33•14 years ago
|
||
I guess our approach here for the beta is to back out the SQLite upgrade. Can I get some commitment from the Sync team to help me identify the cause of the regression *this week* so I can report it to the SQLite folks so it can hopefully get fixed in time for 3.7.1?
Comment 34•14 years ago
|
||
Whatever help you need from us you'll get.
Comment 35•14 years ago
|
||
(and if this doesn't get backed out by 8 PM PDT, I guess I can do that)
Comment 36•14 years ago
|
||
SQLite ticket http://www.sqlite.org/src/info/13f033c865 describes a performance issue in the query planner that was introduced in version 3.7.0 but which has now been fixed. Could this be your problem? A pre-release snapshot of SQLite 3.7.1 is available at http://www.sqlite.org/draft/download.html and the performance regression of ticket [13f033c865] is fixed in that snapshot. Can yall do a test build using the snapshot and see if that clears your problem?
Reporter | ||
Comment 37•14 years ago
|
||
sdwi
Reporter | ||
Comment 38•14 years ago
|
||
(d'oh, accidentally posted early) sdwilsh: I'd be happy to test a tryserver / custom-patched build with the beta sqlite upgrade that Dr H. referenced above, if you can let me know what I need to do to upgrade.
Reporter | ||
Comment 39•14 years ago
|
||
I'm making a local build with the sqlite code from comment 36 and a version-rev in configure.in (don't know if that's required), to match the patches in bug 581000...
Reporter | ||
Comment 40•14 years ago
|
||
Yay - that upgrade fixes the issue, in my local build! So, IIUC, all SQLite versions 3.7.0* are affected by this, and 3.7.1 isn't release-ready yet -- should we back out the 3.7.0.1 upgrade and wait until 3.7.1 is ready?
Comment 41•14 years ago
|
||
FWIW, we are aiming for a 3.7.1 release early next week - Monday or Tuesday - 2010-08-23 or 2010-09-24.
Reporter | ||
Updated•14 years ago
|
Assignee: mconnor → nobody
Component: Sync → Storage
Product: Weave → Toolkit
QA Contact: sync → storage
Version: unspecified → Trunk
Comment 42•14 years ago
|
||
sdwilsh: backout sqlite or go to beta with the slowness; I think we want to back it out, right?
Reporter | ||
Comment 43•14 years ago
|
||
beltzner: sounds like it, based on sdwilsh's comment 33 & 34. Sounds like sdwilsh will be AFK until 8pm or later based on comment 34. In the interests of not clashing with other blocker-landings later tonight, I'll go ahead and back out bug 581000 right now.
Reporter | ||
Comment 44•14 years ago
|
||
Backed out the sqlite upgrade. (see bug 581000 comment 31) We're back to version 3.6.23.1 for now. Should I resolve this as FIXED by the backout? (with the understanding that the next sqlite version to land will be 3.7.1-based?)
Comment 45•14 years ago
|
||
Yes, that seems appropriate.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Comment 46•14 years ago
|
||
(In reply to comment #42) > sdwilsh: backout sqlite or go to beta with the slowness; I think we want to > back it out, right? Yes, backout is the right thing to do here. I couldn't get to it until about now, so thanks dholbert for doing this! dholbert - did your own trunk build without the test SQLite show the same issue? Just making sure we've covered all our bases here :)
Reporter | ||
Comment 47•14 years ago
|
||
Yes, the trunk build from comment 40 hits this bug if I remove the 3.7.1 SQLite prerelease.
Comment 48•14 years ago
|
||
Awesome! Thanks a ton!
You need to log in
before you can comment on or make changes to this bug.
Description
•