Closed Bug 366177 Opened 17 years ago Closed 16 years ago

Memory cache for storage provider

Categories

(Calendar :: Provider: Local Storage, defect)

defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: jminta, Assigned: dbo)

References

(Blocks 1 open bug)

Details

(Keywords: perf)

Attachments

(2 files, 6 obsolete files)

As discussed on mda.calendar, implementing a memory cache for the storage provider can dramatically improve performance for large calendars, and is a pre-req for some caching models.  This bug is for adding such a cache to the storage provider.
Flags: blocking-calendar0.5?
Attached patch add memory caching (obsolete) β€” β€” Splinter Review
Parts of this patch were included in bug 344401, which was rejected on other grounds.  The main difference here is that the cache in that patch was unlimited, while this one is capped by a preference.
Assignee: nobody → jminta
Status: NEW → ASSIGNED
Attachment #250713 - Flags: second-review?(mvl)
Attachment #250713 - Flags: first-review?(lilmatt)
Comment on attachment 250713 [details] [diff] [review]
add memory caching

>Index: calendar/providers/storage/calStorageCalendar.js
>===================================================================
> function calStorageCalendar() {
>     this.wrappedJSObject = this;
>     this.mObservers = new Array();
>-    this.mItemCache = new Array();
>+    this.mItemCache = new Object();
>+    this.mCacheIndex = new Array();
> }
I feel that the index variable should have the same root name as the cache object if the two are related (ie: mItemCacheIndex or mItemCacheIdx).  It helps remove any question regarding whether or not the two are related.

>         // helper function to handle converting a row to an item,
>         // expanding occurrences, and queue the items for the listener
>-        function handleResultItem(item, flags, theIID) {
>-            self.getAdditionalDataForItem(item, flags);
>-            item.makeImmutable();
>+        function handleResultItem(item, flags, getData, theIID) {
>+            if (getData) {
>+                self.getAdditionalDataForItem(item, flags);
>+                self.addToCache(item);
>+            }
>+
>+            if (item.isMutable) {
>+                item.makeImmutable();
>+            }
Please add what "getData" is for to the function's comment, and add a comment above the if (item.isMutable) about why we're making the mutable item immutable.

>+                    if (this.mItemCache[row.id]) {
>+                        resultItems.push({item: this.mItemCache[row.id], done: true});
>+                        continue;
>+                    }
>+
A comment would be nice here as well.

>             // process the events
>             for each (var evitem in resultItems) {
>-                count += handleResultItem(evitem.item, evitem.flags, Ci.calIEvent);
>+                count += handleResultItem(evitem.item, evitem.flags, !evitem.done, Ci.calIEvent);
Move !evitem.done and Ci.calIEvent to the next line.

>             while (this.mSelectTodosByRange.step()) {
>                 var row = this.mSelectTodosByRange.row;
>+                if (this.mItemCache[row.id]) {
>+                    if (!item.isCompleted && !wantNotCompletedItems)
>+                        continue;
>+                    if (item.isCompleted && !wantCompletedItems)
>+                        continue;
>+                    resultItems.push({item: this.mItemCache[row.id], done: true});
>+                    continue;
>+                }
>+
Those if...continue statements need curly braces.

>+                    if (this.mItemCache[row.id]) {
>+                        if (!item.isCompleted && !wantNotCompletedItems)
>+                            continue;
>+                        if (item.isCompleted && !wantCompletedItems)
>+                            continue;
>+                        resultItems.push({item: this.mItemCache[row.id], done: true});
>+                        continue;
>+                    }
>+
Ugh. Repeated code. Same curly brace fix here.  If you can come up with some clever way to not have to repeat this block twice, bonus points for you.

>             // process the todos
>             for each (var todoitem in resultItems) {
>-                count += handleResultItem(todoitem.item, todoitem.flags, Ci.calITodo);
>+                count += handleResultItem(todoitem.item, todoitem.flags, !todoitem.done, Ci.calITodo);
Move !todoitem.done and Ci.calITodo to the next line.

> 
>+    addToCache: function stor_addToCache(aItem) {
I'd like to name this function addToItemCache to be consistent with the actual object's name.

>+        var maxCache = getPrefSafe("calendar.cache.size", 500);
Perhaps we can rename this to "calendar.providers.storage.itemCacheSize" or something.  Unless I'm missing something, since it's storage-only, calendar.cache.size seems a bit too general, especially with the offline work you guys are talking about coming up.


r=lilmatt with those addressed
Attachment #250713 - Flags: first-review?(lilmatt) → first-review+
Target Milestone: --- → Sunbird 0.5
Comment on attachment 250713 [details] [diff] [review]
add memory caching

>Index: calendar/providers/storage/calStorageCalendar.js
>-    this.mItemCache = new Array();
>+    this.mItemCache = new Object();
>+    this.mCacheIndex = new Array();

Why the need for two objects? You can use strings as key in an array or object, right?

>@@ -625,19 +626,25 @@ calStorageCalendar.prototype = {
>+        function handleResultItem(item, flags, getData, theIID) {

The meaning of the getData parameter is not clear to me.

>+            if (getData) {
>+                self.getAdditionalDataForItem(item, flags);
>+                self.addToCache(item);
>+            }
You are adding a getAdditionalDataForItem() here. Should you not remove it somewhere else (maybe by putting it an if(!getData) ?

>+            if (item.isMutable) {
>+                item.makeImmutable();
>+            }
Can mutable items end up in the cache? makeImmutable is a pretty costly call, so you sheald really cache the result.

>             while (this.mSelectEventsByRange.step()) {
>                 var row = this.mSelectEventsByRange.row;
>+                if (this.mItemCache[row.id]) {
You are not even using the index here...
>+                    resultItems.push({item: this.mItemCache[row.id], done: true});
>+                    continue;

>@@ -705,28 +716,33 @@ calStorageCalendar.prototype = {

>+                    if (this.mItemCache[row.id]) {
>+                        resultItems.push({item: this.mItemCache[row.id], done: true});

I would prefer to call |done| something more descriptive, like hasAllProperties (or whatever it really means)

>@@ -743,16 +759,25 @@ calStorageCalendar.prototype = {

>             while (this.mSelectTodosByRange.step()) {
>                 var row = this.mSelectTodosByRange.row;
>+                if (this.mItemCache[row.id]) {
>+                    if (!item.isCompleted && !wantNotCompletedItems)
>+                        continue;
I don't see item getting declared here...
>+                    if (item.isCompleted && !wantCompletedItems)
>+                        continue;
>+                    resultItems.push({item: this.mItemCache[row.id], done: true});
>+                    continue;
>+                }
>+
>                 var flags = {};
>                 var item = this.getTodoFromRow(row, flags);
Ah, there it is. A bit late :)

>                 flags = flags.value;
> 
>                 if (!item.isCompleted && !wantNotCompletedItems)
>                     continue;
>                 if (item.isCompleted && !wantCompletedItems)
>                     continue;

Can you somehow not copy this set of ifs?

>@@ -769,16 +794,25 @@ calStorageCalendar.prototype = {
>+                    if (this.mItemCache[row.id]) {
>+                        if (!item.isCompleted && !wantNotCompletedItems)
>+                            continue;

As above.

>@@ -791,17 +825,17 @@ calStorageCalendar.prototype = {

>+    addToCache: function stor_addToCache(aItem) {
>+        var maxCache = getPrefSafe("calendar.cache.size", 500);
Add this pref to the default prefs file. Then we can one day get rid of the defaults being spread all over the codebase.

>@@ -2027,17 +2076,21 @@ calStorageCalendar.prototype = {
>-        delete this.mItemCache[aID];
>+        var index = this.mCacheIndex.indexOf(aID);
>+        if (index != -1) {
>+            this.mCacheIndex.splice(index, 1);
>+            delete this.mItemCache[aID];
>+        }

For cleanness, this wants to be a function, deleteFromCache as complement to addToCache.



This is just a code review, I still need to look at the architectural consequences.
In normal calendar usage, how noticeable is this cache?
Comment on attachment 250713 [details] [diff] [review]
add memory caching

>             while (this.mSelectEventsByRange.step()) {
>                 var row = this.mSelectEventsByRange.row;
>+                if (this.mItemCache[row.id]) {
>+                    resultItems.push({item: this.mItemCache[row.id], done: true});
>+                    continue;
>+                }
>                 var flags = {};
>                 var item = this.getEventFromRow(row, flags);
>                 flags = flags.value;
> 
>                 resultItems.push({item: item, flags: flags});
> 
>                 if (asOccurrences && flags & CAL_ITEM_FLAG_HAS_RECURRENCE)
>                     handledRecurringEvents[row.id] = true;

Here, you add items to the resultset, without considering the occurrences. The flag might not be the same for all calls.
Maybe you need to add the call to the cache in getEventFromRow(), instead of here.
(In reply to comment #5)
> Here, you add items to the resultset, without considering the occurrences. The
> flag might not be the same for all calls.
I don't understand what's wrong with this.  The |flags| are merely data for getAdditionalDataForItem, so that it knows where there are attendees, etc.  Since we already have this additional data in the cache version, the flags aren't necessary.

If you're referencing the arguments passed to getItems, those are factored into the SELECT statement, so as long as the id comes back from that statement, we should be legitimately returning the event.

What am I misunderstanding here?

I was indeed talking about the asOccurences flag. The ID's might be in the select call, but the cache doesn't store if the item is the parent or one of the occurences. So from what i can see, the cache can not return the right item.
jminta: does this still need review, or are the problems i though there are real problems that need a fix?
Comment on attachment 250713 [details] [diff] [review]
add memory caching

Canceling review request.  This needs some cleanup
Attachment #250713 - Flags: second-review?(mvl)
This doesn't block 0.5.
Flags: blocking-calendar0.5? → blocking-calendar0.5-
Not going to make the 0.5 train.
Target Milestone: Sunbird 0.5 → ---
Flags: blocking-calendar0.7?
Re-assigning my bugs to nobody@mozilla.org due to recent developments.
Assignee: jminta → nobody
Status: ASSIGNED → NEW
not blocking 0.7
Flags: blocking-calendar0.7? → blocking-calendar0.7-
Flags: wanted-calendar0.8?
Version: Trunk → unspecified
It would be good to have this in 0.8
Flags: wanted-calendar0.8?
Flags: wanted-calendar0.8+
Flags: blocking-calendar0.7-
Flags: blocking-calendar0.5-
Daniel, would this help us for our upcoming offline support story?
Flags: blocking-calendar0.8?
IMO only slightly related. Offline mode will push more calendars into storage land (and some are likely large), but this is IMO not blocking.
Flags: blocking-calendar0.8? → blocking-calendar0.8-
Attached patch caching (obsolete) β€” β€” Splinter Review
This patch implements lookup caches. Due to the fact that the recurring items are iterated on every query, they are read out once into the cache. The only database query left is for single items.
However this gets us better performance, I am still a bit unsure whether reading out the whole database once at startup (and building up more sophisticated memory lookup tables) would perform better.
It passes sebo's provider unit tests. Everybody is invited to test and further improve this patch.
Assignee: nobody → daniel.boelzle
Attachment #250713 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #297788 - Flags: review?(sebo.moz)
Blocks: 412914
Attachment #297788 - Attachment is obsolete: true
Attachment #297788 - Flags: review?(sebo.moz)
Attached patch caching - v2 (obsolete) β€” β€” Splinter Review
Attachment #299198 - Flags: review?(sebo.moz)
Comment on attachment 299198 [details] [diff] [review]
caching - v2

I don't see where there's any cache expiration in this patch.  The patch I wrote ensured that over time, events that had not been retrieved in awhile were deleted, preventing memory-bloat.  Did I miss something or are you going to hold every item in memory for the lifetime of the app?
(In reply to comment #19)
> deleted, preventing memory-bloat.  Did I miss something or are you going to
> hold every item in memory for the lifetime of the app?
No, you haven't missed that: I simply don't think it's relevant on modern computers, as it isn't for ics/memory. I'd build up even more memory tables for lookup if that would speed up the queries.
Whats the rough estimate of memory that an item needs? Is it different for recurring items?
I'd guess about 1kb. No big difference between recurring / non-recurring ones since only the unexpanded (parent) items are cached.
If the cache just allocates new memory and doesn't release it until application shutdown or e.g. upon minimizing the application users will call this a memory leak. Just look at the firefox-leaks-memory-no-it-just-caches discussion every time a new firefox build is released.
I think this discussion runs into a wrong direction: Deleted items are of course removed from the memory cache and modified items get updated. However, I am *not* convinced that releasing items (that are in use) from the cache is relevant (e.g. constrained by a timeout, fixed number of items, LRU or whatever).
>+            // Note: I fixed the "if (asOccurrences)... code" which looks incorrect to me:
>+            // Even if ITEM_FILTER_CLASS_OCCURRENCES is not specified, we need to return those parent
>+            // items which DTSTART/DTEND doesn't explicitly fall, but their expand series falls
>+            // into the demanded range.
>+            // To me ITEM_FILTER_CLASS_OCCURRENCES just states that recurring items are to be returned
>+            // "expanded" or not.

Whats the use-case for returning parents that are outside of the requested range?
(In reply to comment #25)
Thinking about that filter more, you are probably right. But that would need to be changed in calMemoryCalendar as well!? A unit test would also be needed since the  current behaviour is changing. All in all this is really another bug...
(In reply to comment #26)
> (In reply to comment #25)
> Thinking about that filter more, you are probably right. But that would need to
> be changed in calMemoryCalendar as well!?
No, I changed storage to behave the same way as memory (and other providers) work, i.e. take the recurring items into account regardless of whether ITEM_FILTER_CLASS_OCCURRENCES is set or not.
Storage has entered that check iff ITEM_FILTER_CLASS_OCCURRENCES has been set, but IMHO ITEM_FILTER_CLASS_OCCURRENCES only specifies whether the result is to be passed in its expanded form (or not).
(In reply to comment #27)
> No, I changed storage to behave the same way as memory (and other providers)
> work, i.e. take the recurring items into account regardless of whether
> ITEM_FILTER_CLASS_OCCURRENCES is set or not.
Probably, I just need more sleep, but I don't see the memory calendar evaluating the occurences in any case [1].

[1] http://mxr.mozilla.org/mozilla1.8/source/calendar/providers/memory/calMemoryCalendar.js#411
Oh yes, I see. You're totally right, I think I need more sleep... ;)
Attached patch caching - v3 (obsolete) β€” β€” Splinter Review
Cleaned out changes w.r.t. ITEM_FILTER_CLASS_OCCURRENCES, will be handled in bug 416975.
Attachment #299198 - Attachment is obsolete: true
Attachment #303037 - Flags: review?(sebo.moz)
Attachment #299198 - Flags: review?(sebo.moz)
Attached file test calendar β€”
Unfortunately the patch seems to break the app. I will attach a test calendar where I saw the following problem with the patch applied:
- If in unifinder "All Events" is selected, no events are shown
- I could not export the calendar
I did see an error in the beginning but cannot get it now, so I hope, you will figure out the reason without this error message. I recall that startRange being null has been a problem for function calculateDates in calRecurrenceInfo.js.
I hope you can reproduce this behaviour...

Tested with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9b4pre) Gecko/2008020823 Calendar/0.6a1
Comment on attachment 303037 [details] [diff] [review]
caching - v3

Just two further comments on the patch:
> 
>             // first get non-recurring todos and recurring todos that happen
That comment needs to be changed.

>-    getEventFromRow: function (row, flags) {
>-        var item = createEvent();
>+    cacheItem: function stor_cacheItem(item) {
>+        this.mItemCache[item.id] = item;
>+        if (item.recurrenceInfo) {
>+            if (isEvent(item)) {
>+                this.mRecEventCache[item.id] = item;
>+            } else {
>+                this.mRecTodoCache[item.id] = item;
>+            }
>+        }
>+    },

It would be nice if we could avoid double caching.
How about querying all three caches in the places where needed?
Attached patch caching - v4 (obsolete) β€” β€” Splinter Review
(In reply to comment #32)
> It would be nice if we could avoid double caching.
> How about querying all three caches in the places where needed?
I think double caching is fine here, since the item is not cloned into each cache. Changing the item in one cache also changes it in both other caches.

This patch fixes what you are seeing, the master items were not correctly returned.

I don't quite know how to understand bug 416975 but I removed a comment regarding this bug since recurring items have to be taken into account also for open ranges, and have been before.

I hope I didn't break any range specific stuff, i.e recurring events should not be returned if not in the range. A simple check using the filter on the unifinder seemed to work, but it would be great if you could also test this.
Attachment #303037 - Attachment is obsolete: true
Attachment #304016 - Flags: review?(sebo.moz)
Attachment #303037 - Flags: review?(sebo.moz)
Comment on attachment 304016 [details] [diff] [review]
caching - v4

As discussed on IRC, there are some flaws in function handleResultItem.
Attachment #304016 - Flags: review?(sebo.moz) → review-
Attached patch caching - v5 (obsolete) β€” β€” Splinter Review
This patch passes all unit tests.

It is ok to go through all recurring items, since in handleResultItems getOccurrences() will return an empty array if the item is totally out of range.

I also put in the checkIfInRange() function for non-recurring items
Attachment #304016 - Attachment is obsolete: true
Attachment #304242 - Flags: review?(sebo.moz)
Attached patch caching - v6 β€” β€” Splinter Review
Thats, what I had in mind. Checking all non-recurring items for the range is not needed since the database-query is doing that. I think the rest is ok. I didn't find time to test whether this actually speeds things up or not.
Attachment #304275 - Flags: review?(philipp)
Attachment #304242 - Attachment is obsolete: true
Attachment #304242 - Flags: review?(sebo.moz)
Comment on attachment 304275 [details] [diff] [review]
caching - v6

Looks good. I have tested this and it seems performance is increased.


r=philipp
Attachment #304275 - Flags: review?(philipp) → review+
Checked in on HEAD and MOZILLA_1_8_BRANCH

-> FIXED
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Target Milestone: --- → 0.8
Depends on: 418657
Status: RESOLVED → VERIFIED
Flags: wanted-calendar0.8+
Flags: blocking-calendar0.8-
You need to log in before you can comment on or make changes to this bug.