Closed Bug 1246655 Opened 8 years ago Closed 7 years ago

Excessive orphan-nodes in DOM cause performance degradation and memory leak

Categories

(Calendar :: Calendar Frontend, defect)

Lightning 4.0.5.2
defect
Not set
major

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: mozilla.pkoch, Unassigned)

References

Details

(Keywords: perf, testcase)

Attachments

(14 files)

User Agent: Mozilla/5.0 (Windows NT 5.1; rv:43.0) Gecko/20100101 Firefox/43.0
Build ID: 20160105164030

Steps to reproduce:

1) Create a new Thunderbird profile and exit Thunderbird
2) Replace prefs.js with the attached version
3) Start Thunderbird, skip creation of mail account, open Calendar tab, should show 10 Calendars
4) Import c01.ics, ... , c10.ics into Calendar 01..10, all events are in first week of february 2016
5) Open error console and evaluate openDialog('about:memory');
6) in the about:memory-windows "Measuer and save..." a memory report into memory-report-01.json.gz
7) uncheck all calendars starting with calendar 01, then check all calendars again starting with calendar 01. When (un)checking do not wait until checkbox changes its state but click on all 10 checkboxes one per second.
8) repeat step 7) for a couple of times.
9) "Measure and save..." memory-report memory-report-02.json.gz and compare that with report-01


Actual results:

In my case repeating step 7 for 10 times raises memory consumption of Thunderbird (measured with tas manager) from 190MB to 500MB. Also checking calendars is slow (5 seconds) and unchecking is even slower (30 seconds)
Memory report clearly shows that orphan nodes in DOM are using more and more memory.


Expected results:

Orphan nodes should be garbage collected, this should free memory and speed up DOM-operations.
Attached file Data for Calendar 01 β€”
Attached file Data for Calendar 02 β€”
Attached file Data for Calendar 03 β€”
Attached file Data for Calendar 04 β€”
Attached file Data for Calendar 05 β€”
Attached file Data for Calendar 06 β€”
Attached file Data for Calendar 07 β€”
Attached file Data for Calendar 08 β€”
Attached file Data for Calendar 09 β€”
Attached file Data for Calendar 10 β€”
Does it happen the same when you display the events on month or multiweek views instead of week view?
(In reply to Decathlon from comment #14)
> Does it happen the same when you display the events on month or multiweek
> views instead of week view?

After a fresh start of Thunderbird it does not happen when using the month or multiweek view no matter what month or what intervall of weeks is shown.

It also does not happen after a fresh start when using the single week view and the week shown is different from the first week of february (that's where all the events are). Same is true for the day view. It does only happen when a day between february 1 and 5 is shown.

"Does not happen" means clicking the calendars is very responsive and no memory leaking is happening. Checking a calendar needs no longer than a second and unchecking a calender feels even faster (in contrast to the problem where unchecking lasts a lot longer than checking).

If I (un)check calendars while using the week view with the first week of february Thunderbird is slow und memory usage increases. If I switch the view or change to a different week, memory usage does not increase further but slowlyness stays.

So the problem seems to be caused by the week or day view but only if events from different calendars are shown. If after a fresh startup no events are shown or if events are shown in multiweek or month views then everything works as expected. After the week view has been used multiview and month view seem to suffer from the problems the week view has caused.
Attachment #8716961 - Attachment mime type: text/calendar → text/plain
Keywords: perf
I did some profiling, i.e.:

1) Started remote Debugging session between Firefox and Thunderbird
2) Enabled all 10 calendars
3) started profiling
4) unchecked all 10 calandars
5) stopped profiling
6) stored profiling data into profile.json

There are lots of function calls with duration less than 1 millisecond and there seems to be a function that's called once per unchecked calendar. It needs between 190 and 5500 ms depending on how many calendars are still checked. When unchecking the first 3 calendars this functions needs approx. 3400 ms. When unchecking calendar 04 it even needs 5500 ms and from calendar 05 to 10 execution time goes down from 3100 ms to 190 ms.
Attached file Profiling data β€”
removeCalendar: function cCC_removeCalendar(aCalendar) {
   let id = aCalendar.id;
   let newCalendars = this.mCalendars.filter(function(calendar) { return calendar.id != id; });
   if (newCalendars.length != this.mCalendars) {
       this.mCalendars = newCalendars;
       if (this.mPrefPrefix) {
           aCalendar.deleteProperty(this.mActivePref);
           aCalendar.deleteProperty(this.mDefaultPref);
       }
       aCalendar.removeObserver(this.mObserverHelper);
       this.mCompositeObservers.notify("onCalendarRemoved", [aCalendar]);
   }
}

The profiling data made me believe that's there might be a problem with function cCC_removeCalendar(), so I had a look at it. I never saw the Lightning source before, I know only little about JavaScript, so foregive me if this is nonsense. But (newCalendars.length != this.mCalendars) compares an integer with a list, so that's always true. How about: newCalendars.length != this.mCalendars.length
(In reply to Peter Koch from comment #18)

> calendar.id != id; });
>    if (newCalendars.length != this.mCalendars) {
>        this.mCalendars = newCalendars;
> 
> The profiling data made me believe that's there might be a problem with
> function cCC_removeCalendar(), so I had a look at it.

Indeed, this is an error, not for javascript but the meaning of the code is clear. If you can't provide a patch I will do it on your behalf.

Unfortunately I think it doesn't affect the performance issue reported in this bug.

I did some test with TB 47.0a1 and LG 4.9a1.
When TB started it took ~150 MB;
then i did s.t.r. 7 and 8 many times (more than two or three) and finally the memory went to over 500MB;
when I stopped to activate/deactivate calendars, after a few tens of seconds, the memory became  ~300MB-320MB.

As far as I can see orphan-nodes is not that big:

16.23 MB (100.0%) -- window-objects
β”œβ”€β”€10.35 MB (63.75%) -- dom
β”‚  β”œβ”€β”€β”€9.21 MB (56.73%) ── orphan-nodes
β”‚  β”œβ”€β”€β”€0.68 MB (04.17%) ── element-nodes
β”‚  β”œβ”€β”€β”€0.46 MB (02.80%) ── other
β”‚  └───0.01 MB (00.05%) ++ (4 tiny)
β”œβ”€β”€β”€4.65 MB (28.63%) -- layout
β”‚   β”œβ”€β”€1.31 MB (08.07%) ── style-structs
β”‚   β”œβ”€β”€1.27 MB (07.84%) ── frames
β”‚   β”œβ”€β”€0.69 MB (04.27%) ── pres-shell
β”‚   β”œβ”€β”€0.56 MB (03.45%) ── style-contexts
β”‚   β”œβ”€β”€0.38 MB (02.36%) ── style-sets
β”‚   β”œβ”€β”€0.26 MB (01.62%) ── rule-nodes
β”‚   └──0.16 MB (01.01%) ++ (3 tiny)
β”œβ”€β”€β”€1.22 MB (07.54%) ── style-sheets
└───0.01 MB (00.08%) ── property-tables

What is big is: 

166.93 MB (100.0%) -- decommitted
β”œβ”€β”€166.33 MB (99.64%) -- js-non-window
β”‚  β”œβ”€β”€160.33 MB (96.05%) ── gc-heap/decommitted-arenas
β”‚  └────6.00 MB (03.59%) ── runtime/gc/nursery-decommitted
└────0.60 MB (00.36%) ++ workers/workers(chrome)/worker(resource://gre/modules/osfile/osfile_async_worker.js, 0x1174e000)


Calendars with events like yours generate many boxes in the view other than the calendar event ones (try to see with DOM inspector). When changing elements in the view, the function "relayout" inside calendar-multiday-view.xml [1] is responsible for the 95% of the work and maybe there's something to improve or to correct, but maybe does the garbage collector also have a little fault about memory leaks?


For example, about improvements, at first glance the relayout function deletes and redraws the grid inside a column (the lines of the hours) every time an element is added or deleted from a calendar (see first lines of function "clear" [2]) and this could be avoided in that particular case if it doesn't cause side effects but, again, I don't think it would change performance for a case like this.


[1] http://mxr.mozilla.org/comm-central/source/calendar/base/content/calendar-multiday-view.xml#631
[2] http://mxr.mozilla.org/comm-central/source/calendar/base/content/calendar-multiday-view.xml#615
(In reply to Decathlon from comment #19)

> Indeed, this is an error, not for javascript but the meaning of the code is
> clear. If you can't provide a patch I will do it on your behalf.

Please do so - I have no idea how to provide patches

> Unfortunately I think it doesn't affect the performance issue reported in
> this bug.

The whole routine looks strange to me. Its purpose is to remove one element from the mCalendars-array. But instead of removing one element it produces a new array containing all elements of mCalendar except the one that is to be removed. If the filter()-function produces deep copies of all elements of mCalendars except one, then this would need a lot of memory. And this.mCalendars=newCalendars replaces the old list with the new one, releasing all the memory of the old list.

I don't know wether I interpret the profiling data correctly but it seems like cCC_removeCalendar() has a maximum execution time of 5500 ms

> I did some test with TB 47.0a1 and LG 4.9a1.
> When TB started it took ~150 MB;
> then i did s.t.r. 7 and 8 many times (more than two or three) and finally
> the memory went to over 500MB;
> when I stopped to activate/deactivate calendars, after a few tens of
> seconds, the memory became  ~300MB-320MB.

That's exactly what I'm observing.

> As far as I can see orphan-nodes is not that big:
> β”‚  β”œβ”€β”€β”€9.21 MB (56.73%) ── orphan-nodes

That's different from myy observations. Have a look at report-02. 104MB orphan-nodes and only 36MV decommited. Of course that's an older version.

Can you also look at the profiling data and confirm that in my case cCC_removeCalendar() is the routine that consumes most of the execution time (and not relayouting). 

> Calendars with events like yours generate many boxes in the view other than
> the calendar event ones (try to see with DOM inspector). When changing
> elements in the view, the function "relayout" inside
> calendar-multiday-view.xml [1] is responsible for the 95% of the work and
> maybe there's something to improve or to correct, but maybe does the garbage
> collector also have a little fault about memory leaks?
> 
> For example, about improvements, at first glance the relayout function
> deletes and redraws the grid inside a column (the lines of the hours) every
> time an element is added or deleted from a calendar (see first lines of
> function "clear" [2]) and this could be avoided in that particular case if
> it doesn't cause side effects but, again, I don't think it would change
> performance for a case like this.

If relayouting was so expensive then Lightning should not do it over and over again. In our case we want to quickly disable multiple calendars. If I uncheck the first calendar out of 10 calendars Lihtning starts to relayout the remaining 9 calendars. If I then uncheck another calendar while Lightning is still busy it should immedeately stop relayouting and start again now relayouting the remaining 8 calendars.

Either relayouting is started within a separate thread that can be canceled when another calendar is (un)checked. Or the relayouting-routine should peek into the message queue and stops prematurely if an (un)check-event on another calender was detected.

In the first place relayouting of 50 events should not take so long. I will try to understand the algorithm behind that routine. I'm sure I cannot do the coding of an improved version but maybe I can provide some ideas how to improve the algorithm.
(In reply to Peter Koch from comment #20)

> I don't know wether I interpret the profiling data correctly but it seems
> like cCC_removeCalendar() has a maximum execution time of 5500 ms
>

Actually all that time is caused by the relayout function, that indeed is called many times when removing a calendar (I think once for every event of that calendar in the view).

The function removeCalendar actives a notification "onCalendarRemoved" and then the sequence is:
onCalendarRemoved (calendar-base-view.xml) -> deleteItemsFromCalendar (caledar-multiday-view.xml) -> doDeleteItem -> deleteEvent -> (internalDeleteEvent) -> relayout.
doDeleteItem is called once for each item and when it ends, the control comes back to removeCalendar.

For sure optimizations are possible because that code is old but it is quite fragile as well. Something similar had been done for month/multiweek view for particular cases, but here is more complex.

I've sent you an email.
perhaps similar to bug 753343?

Do you also see the symptoms of Bug 441710 - compose and/or UI occasionally freezes or is slow?
Severity: normal → major
Flags: needinfo?(mozilla.pkoch)
See Also: → 1246390
We haven't heard from you in a while, therefore I am closing this bug for now. We can reopen if there are specific actions that can be taken. If there are some functions that can be throttled we should identify those, otherwise I think it would be great if someone can step up and work on a patch.
Status: UNCONFIRMED → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE
Flags: needinfo?(mozilla.pkoch)

Just realized this has a testcase. Given the calendar changes of the last couple years, is testcase worth revisiting?

Peter had also opened bug 1246390

Flags: needinfo?(geoff)
Keywords: testcase
Flags: needinfo?(geoff)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: