Closed
Bug 450104
Opened 17 years ago
Closed 17 years ago
insertion of data is extremely slow compared to other platforms
Categories
(Calendar :: Provider: Local Storage, defect)
Calendar
Provider: Local Storage
Tracking
(Not tracked)
RESOLVED
FIXED
0.9
People
(Reporter: WSourdeau, Assigned: dbo)
References
Details
Attachments
(1 file, 2 obsolete files)
6.33 KB,
patch
|
Fallen
:
review+
|
Details | Diff | Splinter Review |
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; fr-FR; rv:1.9.0.1) Gecko/2008071420 Iceweasel/3.0.1 (Debian-3.0.1-1)
Build Identifier: 0.9pre
The insertion of data in a cache calendar (based on the storage calendar) is extremely slow on Windows. A work-around for this would be to use transactions when updating more than one event at a time.... Which leads to another bug: concurrent access to the same database file, which SQLite documentation discourages... In effect, when trying to work-around bug #1 here, the addition of transaction code for batch updates trigger exceptions. This can be explained by the fact that the db file is locked when a transaction is ongoing and all cached calendars access the same file... Proposed work-around to bug #2: use a different cache file per calendar.
Reproducible: Always
Steps to Reproduce:
1.
2.
3.
Reporter | ||
Updated•17 years ago
|
Flags: blocking-calendar0.9?
Reporter | ||
Comment 1•17 years ago
|
||
This patch fixes both issues.
Two problems left though: deleting calendar files does not seem to be achievable. A solution is left to be found.
For no apparent reason, the methods using mDBTwo now have to use mDB as well, otherwise some exceptions are generated on Windows, possibly because of the file locking.
Attachment #334014 -
Flags: review?(daniel.boelzle)
Reporter | ||
Comment 2•17 years ago
|
||
I forgot... If we could separate the local calendars in different files, it would probably help remove all the references to "cal_id".
Comment 3•17 years ago
|
||
I'm investigating too with this issue...
I've modified the CalDav calendar, now it create a distinct cache per calendar...
With this patch... if only one calendar is loaded with 4 events inside :
prepareBatchUpdate...
finishBatchUpdate... 145
it takes only 145ms to modify/store an event
If i load (enable) one calendar with ~1000 events inside it takes about 600ms to modify/store an event:
prepareBatchUpdate...
finishBatchUpdate... 586
and suprisingly the other calendar (with 4 event) is slow too... ~600ms to modify the event too...
Comment 4•17 years ago
|
||
I've put some profiling code in the logs: here's the result
numbers are milliseconds afer prepareBatch() with my ~1000 event calendar
begin flushItem 15
end flushItem 73 (58 ms OK)
before notifyOperationComplete setMetaData... 74
before notify observers... 75
after notify observers... 685 (600 ms NOT OK)
Setting Metadata 602e8f45-ade7-49c1-9582-323654babbc9, /calendars/public/f4e49374-eae7-427b-a261-f0caeb2e1615/calendar/602e8f45-ade7-49c1-9582-323654babbc9.ics, "633546706504026010"
begin setMetaData... 686
end setMetaData... 687
begin finishBatchUpdate... 687
end finishBatchUpdate... 1301
the problem is this line :
this.observers.notify("onModifyItem", [modifiedItem, aOldItem]);
Comment 5•17 years ago
|
||
I solved the issue, i'm only doing batch updates during the firstImport... and when i'm doing batchupdates i do :
if (this.mInBatchUpdate === false)
this.observers.notify("onAddItem", [aItem]);
it used to takes 5min to load my 1000 event calendar, now it only takes 40s
Reporter | ||
Comment 6•17 years ago
|
||
Hi Hubert, the next loads should event be faster...
Comment 7•17 years ago
|
||
I don't understand your comment, because the next loads, items are cached...
is there an IRC place where we could discuss on this issue?
Comment 8•17 years ago
|
||
#calendar on irc.mozilla.org is always a good place for these kinds of discussion.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Reporter | ||
Comment 9•17 years ago
|
||
(In reply to comment #7)
> I don't understand your comment, because the next loads, items are cached...
> is there an IRC place where we could discuss on this issue?
>
I am connected, and trying to reach you privately and in French...
Comment 10•17 years ago
|
||
(In reply to comment #4)
> the problem is this line :
> this.observers.notify("onModifyItem", [modifiedItem, aOldItem]);
I don't see where you tell the observers that you are entering batch mode. Observers should take that into account. For example, they should not redraw the view, because the batch mode says that another change is about to happen.
If you do notify observers, it would be interesting to find out which observer is so slow.
Reporter | ||
Comment 11•17 years ago
|
||
Would invoking this.startBatch from prepareBatchUpdate do the trick? (and its corollary: this.endBatch from finishBatchUpdate)
Comment 12•17 years ago
|
||
afaik, yes, that would do. But feel free to experiment :)
Assignee | ||
Comment 13•17 years ago
|
||
Comment on attachment 334014 [details] [diff] [review]
proposed patch
> /**
>+ * Hooks to trigger potential optimisations prior to a large
>+ * amount of updates.
>+ */
>+ void prepareBatchUpdate();
>+ void finishBatchUpdate();
I am not yet convinced whether we should add new API or just open/close the transaction in calICalendar::startBatch() resp. endBatch(). I think those have a similar meaning.
I never saw documentation how multiple startBatch/endBatch calls should be handled, and what happens if someone lacks an endBatch().
I can imagine we handle this like recursive/non-blocking mutexes, i.e. do ref-counting to control when the opened transaction needs to be closed (by the last endBatch).
This needs to be resolved up front IMO.
> case "storage":
> var file = getCalendarDirectory();
>- file.append("cache.sqlite");
>+ file.append("cache" + this.id + ".sqlite");
> var uri = getIOService().newFileURI(file);
> uri.spec += ("?id=" + this.id);
> cachedCalendar.uri = uri;
I don't like that, because it leaves dangling files around. Is it really performance-critical to have multiple files?
>+ this.mIsCacheDB = false;
>+ this.mInBatchUpdate = false;
I think this needs to be a counter rather than a bool like suggested above.
>- for (var i in this.mDeleteEventExtras) {
>- this.mDeleteEventExtras[i].execute();
>- this.mDeleteEventExtras[i].reset();
>+ if (this.mIsCacheDB) {
>+ var file = this.mDB.databaseFile;
>+ delete this.mDB;
>+// TODO: how do we remove that file?
> }
Ok, here should be the cleanup. Erasing the file is no problem via nsIFIle. What I don't like is the cacheDB flag. I suggest to do usual table cleanup, then test whether there are any events/todos in the DB in case it's not the profile DB (because the profile-DB is the only one that is shared with calendar registration).
Slightly off-topic: It would have been advantageous if we would have shifted calendar registration to the prefs already, sigh :/
>+ finishBatchUpdate: function stor_finishBatchUpdate() {
>+ if (this.mInBatchUpdate) {
>+ this.mDB.commitTransaction();
>+ this.mDB.executeSimpleSQL("VACUUM");
Is VACUUM expensive? If we go with startBatch/endBatch, we need to test how often those are called, and if VACUUM hurts. We could also opt for doing a timer-driven VACUUM instead.
Comment 14•17 years ago
|
||
> Is VACUUM expensive?
It's very slow, see <http://developer.mozilla.org/en/Storage/Performance#Vacuuming_and_zero-fill>. In Firefox it's disabled for that reason <http://mxr.mozilla.org/mozilla/source/toolkit/components/places/src/nsNavHistory.cpp#3998> and <http://mxr.mozilla.org/mozilla/source/toolkit/components/places/src/nsNavHistory.cpp#4664>.
Comment 15•17 years ago
|
||
(In reply to comment #13)
> Is VACUUM expensive?
Yes. Stefan already supplied the relevant links.
BTW bug 352976 already asks for the ability to compress your database by using the VACUUM command. From an UX perspective I would recommend to implement a timer-based VACUUM, e.g. perform the VACUUM command after 5/10/30 minutes of application-inactivity.
Assignee | ||
Comment 16•17 years ago
|
||
This won't block 0.9, but is of course welcome if landing in time.
Flags: wanted-calendar0.9+
Flags: blocking-calendar0.9?
Flags: blocking-calendar0.9-
Reporter | ||
Comment 17•17 years ago
|
||
(In reply to comment #13)
> (From update of attachment 334014 [details] [diff] [review])
> > /**
> >+ * Hooks to trigger potential optimisations prior to a large
> >+ * amount of updates.
> >+ */
> >+ void prepareBatchUpdate();
> >+ void finishBatchUpdate();
>
> I am not yet convinced whether we should add new API or just open/close the
> transaction in calICalendar::startBatch() resp. endBatch(). I think those have
> a similar meaning.
I agree. I will change that.
> I don't like that, because it leaves dangling files around. Is it really
> performance-critical to have multiple files?
Not performance-critical, but with SQLite you can only have one transaction per database. When you start a batch or multiple cached calendars, they overlap and triggers exceptions. This is why this is needed. The other alternative would be to block until the batch updates have ended, but it becomes tricky to handle and is probably less elegant as well as less efficient.
> >+ this.mInBatchUpdate = false;
> I think this needs to be a counter rather than a bool like suggested above.
(see my answer above)
> >+ if (this.mIsCacheDB) {
> >+ var file = this.mDB.databaseFile;
> >+ delete this.mDB;
> >+// TODO: how do we remove that file?
> > }
>
> Ok, here should be the cleanup. Erasing the file is no problem via nsIFIle.
I never managed to do that because the file handle never seems to be released.
> What I don't like is the cacheDB flag. I suggest to do usual table cleanup,
> then test whether there are any events/todos in the DB in case it's not the
> profile DB (because the profile-DB is the only one that is shared with calendar
> registration).
That's why I suggested to have one local calendar per file as well, to cleanup all of that.
Reporter | ||
Comment 18•17 years ago
|
||
(In reply to comment #15)
> (In reply to comment #13)
> > Is VACUUM expensive?
>
> Yes. Stefan already supplied the relevant links.
>
> BTW bug 352976 already asks for the ability to compress your database by using
> the VACUUM command. From an UX perspective I would recommend to implement a
> timer-based VACUUM, e.g. perform the VACUUM command after 5/10/30 minutes of
> application-inactivity.
Ok, I will remove those calls then.
Updated•17 years ago
|
Assignee: nobody → WSourdeau
OS: Windows XP → All
Hardware: PC → All
Updated•17 years ago
|
Status: NEW → ASSIGNED
Reporter | ||
Comment 19•17 years ago
|
||
(In reply to comment #17)
> (In reply to comment #13)
> > >+ this.mInBatchUpdate = false;
> > I think this needs to be a counter rather than a bool like suggested above.
>
Hi Daniel,
I think I have understood your suggestion. I am going to do some tests and probably resubmit a patch with it.
Reporter | ||
Comment 20•17 years ago
|
||
Now... the problem is that the value of the counter is not propagated to all instances of the storage calendar. We would thus need a proxy service to keep this value global, wouldn't we?
Assignee | ||
Comment 21•17 years ago
|
||
Why do you need to propagate that counter across multiple instances? My suggestion was about allowing multiple startBatch() calls, that the last endBatch() call commits the transaction.
Reporter | ||
Comment 22•17 years ago
|
||
(In reply to comment #21)
> Why do you need to propagate that counter across multiple instances? My
> suggestion was about allowing multiple startBatch() calls, that the last
> endBatch() call commits the transaction.
>
Because each cached calendar has its own instance of the storage calendar (pointing to the same file). Since only one transaction can happen per file at the same time, the amount of transactions in progress at the same time needs to be taken into account.
Assignee | ||
Comment 23•17 years ago
|
||
Comment on attachment 334014 [details] [diff] [review]
proposed patch
r- for now, based on previous comments; I appreciate an improved patch :)
Attachment #334014 -
Flags: review?(daniel.boelzle) → review-
Comment 24•17 years ago
|
||
(In reply to comment #20)
> Now... the problem is that the value of the counter is not propagated to all
> instances of the storage calendar. We would thus need a proxy service to keep
> this value global, wouldn't we?
I'm not sure if this is good style or not, I've just noticed by error that it seems to work:
if you set an object on the prototype, then the counter should be static, i.e:
function foo() {
this.staticThing.value++;
dump(this.staticThing.value + "\n");
}
foo.prototype = {
staticThing: {}
};
var f = new foo();
var g = new foo();
Should produce:
1
2
Assignee | ||
Comment 25•17 years ago
|
||
We're heading for 0.9 RC1 the next days. Wolfgang, if you've an updated patch, please submit it.
Comment 26•17 years ago
|
||
Comment on attachment 334014 [details] [diff] [review]
proposed patch
>- // For the extra-item data, note that we use mDBTwo, so that
>+ // For the extra-item data, note that we use mDB, so that
> // these can be executed while a selectItems is running!
I see you are getting rid of mDBTwo. Do you have a reliable source that this workaround is not needed anymore? The reason it was introduced back in the days is that a storage connection could only hold one result set at a time. In a getItems() call, stepping through mSelectNonRecurringEventsByRange calls getAdditonalData, which in turn does extra databse queries. If this query is not done against a different database connection reference, then the first set of results is thrown away.
If it turns out not to be needed anymore please instead of just changing the comments remove those that are meant to explain the above noted phenomenon.
Reporter | ||
Comment 27•17 years ago
|
||
This patch implements what has been suggested recently: mostly the implementation of startBatch() and endBatch() and the use of a counter to keep track of the amount of transactions.
Attachment #334014 -
Attachment is obsolete: true
Attachment #337589 -
Flags: review?
Reporter | ||
Comment 28•17 years ago
|
||
Attachment #337589 -
Flags: review? → review?(philipp)
Reporter | ||
Comment 29•17 years ago
|
||
I seem to have reproduced the problem regarding the NS_ERROR_FAILURE occuring when more than 2 transactions are happening, during the execution of certain selects... It seems to happen only when an exception occurs at a different stage in the workflow. I cannot yet explain the how of the why, but when no exception occur during a refresh, the transactions are processed properly.
My guess is that the latest patch is thus valid and ready to be applied after removing the LOG lines, but that other sentitive code (especially in the caldav provider) should be fixed.
Assignee | ||
Updated•17 years ago
|
Attachment #337589 -
Flags: review?(philipp) → review-
Assignee | ||
Comment 30•17 years ago
|
||
Comment on attachment 337589 [details] [diff] [review]
updated patch
> flushItem: function (item, olditem) {
> ASSERT(!item.recurrenceId, "no parent item passed!", true);
>
>- this.mDB.beginTransaction();
>+ if (batchCount == 0) {
>+ this.mDB.beginTransaction();
>+ }
> try {
> this.deleteItemById(olditem ? olditem.id : item.id, true /* hasGuardingTransaction */);
> this.writeItem(item, olditem);
>- this.mDB.commitTransaction();
>+ if (batchCount == 0) {
>+ this.mDB.commitTransaction();
>+ }
> } catch (e) {
> dump("flushItem DB error: " + this.mDB.lastErrorString + "\n");
> Components.utils.reportError("flushItem DB error: " +
> this.mDB.lastErrorString);
>- this.mDB.rollbackTransaction();
>+ if (batchCount == 0) {
>+ this.mDB.rollbackTransaction();
>+ }
It may happen that js context switches occur in between, or startBatch/endBatch calls. This would lead to wrong start/commit transaction calls. We should guard/count all the time, also in deleteItemById.
>+ startBatch: function stor_startBatch() {
>+// this.__proto__.__proto__.startBatch.apply(this, arguments);
>+ LOG("startBatch");
>+ LOG("batchCount incremented to " + (batchCount + 1));
>+ if (batchCount == 0) {
>+ this.mDB.beginTransaction();
>+ }
>+ batchCount++;
>+ },
Missing call of onStartBatch.
>+ endBatch: function stor_startBatch() {
>+// this.__proto__.__proto__.endBatch.apply(this,
same here
Assignee | ||
Comment 31•17 years ago
|
||
This patch fixes the above mentioned, counting on all transactions to properly control transaction flow and adds counting to base provider's startBatch/endBatch.
It's quite important that we properly guard startBatch/endBatch calls now (calDavProvider.js!), else the provider would get stuck in an uncommitted transaction.
Assignee: WSourdeau → daniel.boelzle
Attachment #337589 -
Attachment is obsolete: true
Attachment #337670 -
Flags: review?(philipp)
Updated•17 years ago
|
Attachment #337670 -
Flags: review?(philipp) → review+
Comment 32•17 years ago
|
||
Comment on attachment 337670 [details] [diff] [review]
alternative patch
r=philipp
I'll check this in together with my upcoming patch that also takes care of guarding the batch transactions.
Comment 33•17 years ago
|
||
Checked in on HEAD and MOZILLA_1_8_BRANCH
-> FIXED
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Target Milestone: --- → 0.9
Comment 34•17 years ago
|
||
Did you fix the original issue? I tested with recreating the cache after startup and didn't see a difference in creating a 880 kb cache-file, both the 07-09 build and the 10-09 build took about a minute to create the cache (TB being unresponsive after starting up).
Comment 35•17 years ago
|
||
I already said before that transaction just speed up a little bit the process...
But the process is mainly slown down by
this.observers.notify("onAddItem", [aItem]);
on the storage calendar...
Assignee | ||
Comment 36•17 years ago
|
||
Hubert, so the cause is actually that the views don't optimize for onStartBatch/onEndBatch notifications, right?
Comment 37•17 years ago
|
||
I think we don't have to call the notify("onAddItem", [aItem]);
while we are first refreshing the calendar (we call the onLoad after)
Comment 38•17 years ago
|
||
We should call onAddItem. That's how it's always been, and some observers might depend on it. I can think of good reasons for that. For example, observers that are very fast. Then can just do their work in onAddItem. That might be faster then starting from zero in onLoad.
What should be done is checking all the current observers to find which one is slow, and if it does honour the onStartBatch calls.
Reporter | ||
Comment 39•17 years ago
|
||
I have digged into the code (latest nightly: 2008-09-12) and everything works fine as far as the notifications are concerned:
Both the month and multiday views use a batch counter, which forces onAddItem, onModifyItem and onDeleteItem to return directly when not 0. So the problem that Hubert mentions does no longer occur.
You need to log in
before you can comment on or make changes to this bug.
Description
•