Closed Bug 387014 Opened 13 years ago Closed 11 years ago

While Lightning reloads remote calendars, all Thunderbird windows are unresponsive.

Categories

(Calendar :: Lightning Only, defect, major)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Chris, Assigned: dbo)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
Build Identifier: Lightning 0.5 (build 2007062404)

Using Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.4) Gecko/20070604 Thunderbird/2.0.0.4 ID:2007060411 and Lightning 0.5 (build 2007062404)

I have several remote calendars which are set to reload every X minutes.  When that auto-reload takes place, every Thunderbird window hangs and is unresponsive until the reload has completed.  If the remote server is slow to respond, Thunderbird can be locked up for several seconds.  In the compose message window, while writing an email, no characters will appear while typing until the reload has finished.

Example slow loading remote calendar.
http://www.wunderground.com/auto/ical/OH/Cleveland.ics?units=both

Reproducible: Always

Steps to Reproduce:
1. Wait for auto remote calendar reload
2. Thunderbird windows hang
Actual Results:  
Once the reload is complete, Thunderbird performs as usual

Expected Results:  
Reloads should be done in the background asynchronously and not affect the user interaction with Thunderbird.
More info:

Add quite a few remote calendars.  Thunderbird will hang while Lightning reloads all the calendars starting with the first and continuing until the last one has finished reloading.
We have this problem refreshing data from a single on-campus calendar.  The calendar is part of a Zimbra server.  It can take from 20-60 seconds to refresh the data depending on how much calendar data the person has, and Thunderbird is completely locked up until the refresh is complete.
It would be nice if a solution could be found for this, but I'm guessing that until we have some kind of caching mechanism this can't be resolved. Is this bug blocked by bug 366177 or is it a duplicate?
Severity: normal → major
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: wanted-calendar0.8?
This bug is not related to bug 366177. That one is about storage calendars, while this one is about remote calendars.
I think that it's possible that this bug is not directly related to local vs. remote calendars.  In month view for example, it always takes a long time to switch between months with either local or remote calendars (calculating recurrences?).  So when calendars are reloaded every N minutes, I think that it's possible that Thunderbird is unresponsive because of the general slowness of re-drawing the view rather than the time that it takes to download a remote calendar from the server.
Pete, if I remember correctly (with Lightning 0.5, and I have relocated all remote calendars locally since), Thunderbird would also hang if you only have mail-related windows open at the time of a calendar reload. I do not have any such delays now that I hold my calendars on the local disk.
Also, since it's possible that some Lightning users might use the "World Weather+ 2.1" extension, I should say that it causes Thunderbird to be unresponsive for a few seconds (100% CPU) right before it refreshes the current temperature.  It might be completely unrelated to this bug, but I wanted to mention this.
I can attest that this is not due to any other extensions being installed.  This happens every time with only TB 2.0.0.9 and Lightning .7 or .8pre (nightly).  This is not the 2-3 second unresponsiveness of the local calendar, this is a 20-60 second complete hang on a remote calendar refresh.  I have it set to refresh every 55 minutes, and I don't have a ton on my calendar, but it still takes 20-30 seconds for my calendar to finish and TB is not usable during that time frame.  A colleague uses his calendar much more (probably 2-3x as much data) and TB will hang for 40-60 seconds for him.  Both systems are under a year old.  This is syncing to a campus Zimbra Collaborative Suite server using iCal.

Setting wanted0.8- as the main Calendar developers will not devote any time to
this in the 0.8 timeframe. Patches are, of course, always welcome.
Flags: wanted-calendar0.8? → wanted-calendar0.8-
I get the same behavior with Thunderbird 2.0.0.9 and Lightning 0.7 on both Solaris and Linux (Ubuntu 7.10).  All Thunderbird functionality freezes up for 20-30 seconds at a time while it's fetching remote calendars.  It does not only happen when another window is open (like a window composing a new message) as someone said in comment #6.  I see the freeze even if there is only one Thunderbird window open.  When it starts to fetch the remote calendars I see everything disappear from the today pane.  Then while it is updating, I cannot change from one message to another in my inbox, can't even scroll down in the message that was already visible.  If I happen to have a compose window open when it starts updating calendars, that freezes up too.
> (comment #10) It does not only happen when another window is open ... as
> someone said in comment #6.  I see the freeze even if there is only one
> Thunderbird window open.

To clarify my comment #6: I observed this with only the main 3-pane window open in Thunderbird, no need for any additional (compose, etc.) windows to be active, or to view the calendar at that time.
Flags: wanted-calendar0.9?
I fear a design flaw is hitting us hard here: The autoreload timers in calCalendarManager.js refresh every single registered calendar one by one. Every one of those leads to an onLoad event in composite which forwards it to the views/etc, because the composite calendar (being a unity of all calendars logically) has been refreshed. Since the views/etc trigger getItems calls on composite in their onLoad handler, this leads to unnecessary traffic.
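To make the cost of this cascade concrete, here is a minimal sketch of the pattern described above. `ToyCalendar`, `ToyComposite` and `simulateAutoReload` are hypothetical stand-ins invented for illustration; they are not the real calICalendar or composite calendar code, and the only thing the sketch tries to show is how the number of provider queries grows.

```javascript
// Toy model only: these classes are hypothetical stand-ins, not the real
// calICalendar / composite calendar implementations.
class ToyCalendar {
  constructor(name) {
    this.name = name;
    this.queries = 0; // how often getItems() hit this calendar
  }
  getItems() {
    this.queries++;
    return [`${this.name}-event`];
  }
}

class ToyComposite {
  constructor(calendars) {
    this.calendars = calendars;
    this.observers = [];
  }
  addObserver(observer) {
    this.observers.push(observer);
  }
  // A composite query fans out to every contained calendar.
  getItems() {
    return this.calendars.flatMap((cal) => cal.getItems());
  }
  // Called once per contained calendar that finished reloading.
  calendarReloaded() {
    for (const observer of this.observers) {
      observer.onLoad(this);
    }
  }
}

function simulateAutoReload(calendarCount) {
  const calendars = [];
  for (let i = 0; i < calendarCount; i++) {
    calendars.push(new ToyCalendar(`cal${i}`));
  }
  const composite = new ToyComposite(calendars);
  // The "view" re-reads everything whenever the composite reports onLoad.
  composite.addObserver({ onLoad: (cal) => cal.getItems() });
  // The timer refreshes every registered calendar, one by one.
  for (const _cal of calendars) {
    composite.calendarReloaded();
  }
  // Total number of provider queries caused by one auto-reload cycle.
  return calendars.reduce((sum, cal) => sum + cal.queries, 0);
}

console.log(simulateAutoReload(3)); // 9
```

In this toy model one reload cycle over N calendars triggers N onLoad notifications, each of which queries all N calendars, so the provider work grows quadratically with the number of calendars.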
Flags: wanted-calendar0.9?
Flags: wanted-calendar0.9+
Flags: wanted-calendar0.8-
The firing of multiple onloads is on purpose: the views shouldn't wait for the last calendar to finish before redrawing the contents.
But the views should be smarter in handling the multiple onloads. Maybe, in order to do that, the events shouldn't come from the composite, but from the individual calendars.

Also, I'm not sure if the multiple onloads are the only problem. Comment 2 indicates that there are also problems with just a single calendar. The first thing to look into would be to decrease load times, but that has its limits. I think the load should somehow be done in the background. I'm not sure yet how. (Can't use a thread, since most of the time is spent in the UI, something that can't be modified from a background thread.)
(In reply to comment #13)
> The firing of multiple onloads is on purpose: the views shouldn't wait for the
> last calendar to finish before redrawing the contents.
IMHO a poor solution --- because of the unnecessary load it causes for multiple calendars.

> But the views should be smarter in handling the multiple onloads. Maybe, in
> order to do that, the events shouldn't come from the composite, but from the
> individual calendars.
IMO wrong bending of that API: If I register an observer at the composite calendar, I don't expect to get onLoad events from different calendars, but only the one I've registered to.

More than once we've run into problems with the refresh mimic (I remember a last-minute alarm refresh regression I need to do for 0.5). We should thoroughly revise this instead of putting more constraints into the existing API.

I think the core problem is the composite (multiplexing 1:N) calendar. Notifications only work well in one direction in this model.
Yes, I remember talking about this lately. I think we should use a different set of listeners for the composite calendar, making it not so strictly implement calICalendar.

The different listeners could notify onLoad with the specific calendar that has finished loading, in an ideal world telling the view to only refresh events from that particular calendar.
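A rough sketch of what such a separate listener set might look like follows. Everything here is hypothetical: `SketchComposite`, `addCompositeListener` and `onCalendarLoaded` are names invented for illustration and do not exist in the calendar code base; the sketch only shows a view refreshing the items of the one calendar that was reloaded.

```javascript
// Hypothetical sketch: none of these names exist in the real calendar code.
class SketchCalendar {
  constructor(name, items) {
    this.name = name;
    this.items = items;
    this.queries = 0; // how often getItems() hit this calendar
  }
  getItems() {
    this.queries++;
    return [...this.items];
  }
}

class SketchComposite {
  constructor(calendars) {
    this.calendars = calendars;
    this.listeners = [];
  }
  // Separate from the regular observer list, so plain observers of the
  // composite never see notifications about a contained calendar.
  addCompositeListener(listener) {
    this.listeners.push(listener);
  }
  // Tell listeners which contained calendar has finished loading.
  calendarReloaded(calendar) {
    for (const listener of this.listeners) {
      listener.onCalendarLoaded(calendar);
    }
  }
}

class SketchView {
  constructor() {
    this.itemsByCalendar = new Map();
  }
  // Replace only the items belonging to the reloaded calendar.
  onCalendarLoaded(calendar) {
    this.itemsByCalendar.set(calendar.name, calendar.getItems());
  }
  allItems() {
    return [...this.itemsByCalendar.values()].flat();
  }
}

function demoPerCalendarReload() {
  const work = new SketchCalendar("work", ["standup"]);
  const home = new SketchCalendar("home", ["dentist"]);
  const composite = new SketchComposite([work, home]);
  const view = new SketchView();
  composite.addCompositeListener(view);
  composite.calendarReloaded(work);
  composite.calendarReloaded(home);
  return { items: view.allItems(), queries: work.queries + home.queries };
}

const reloadResult = demoPerCalendarReload();
console.log(reloadResult.items.join(", ")); // standup, dentist
console.log(reloadResult.queries);          // 2
```

With this shape each reloaded calendar is queried once, so a cycle over N calendars costs N provider queries in the sketch. Whether such a listener can coexist cleanly with the existing observer contract is exactly the open question in this discussion.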
If I understand you correctly, this would also break the listener mimic: I think it's invalid that composite fires onLoad(someContainedCalendar != this) to its observers.
Can we set this as blocking the 1.0 release?  The option isn't available for me to do so.
Flags: wanted-calendar0.9+ → wanted-calendar1.0+
I have the same problem.

Is it possible to make the window refresh only if the calendars have been modified?

Where is the function that draws the window?

Thanks
Blocks: 441710
Duplicate of this bug: 485152
Depends on: 490309
IMO the main problem is that the UI thread is too often blocked by long-running code scanning for events and todos in the providers. I've spent some time investigating how to split that off into a separate thread, but ultimately failed.
It seems almost impossible to create and call our core objects calItemBase/calEvent/calTodo on a separate thread without issues. Besides some wrapper problems (e.g. checks like item.parentItem != item), I most often ran into deadlocks inside the js engine.
Nevertheless, putting the native ics parsing part into a separate thread works and I'll come up with a patch for that (will be covered by bug 490309).

Moreover, what we should do is process pending UI events, to keep on painting and be responsive (although this won't make the app faster). This patch keeps processing pending events of the current thread while processing events in the provider code. I won't request review for this simple patch yet, because I am encountering occasional problems, and run into borked views or see exceptions like the one stated in bug 487205:

Error: uncaught exception: [Exception... "Component returned failure code:
0x80004003 (NS_ERROR_INVALID_POINTER)"  nsresult: "0x80004003
(NS_ERROR_INVALID_POINTER)"  location: "JS frame ::
chrome://lightning/content/messenger-overlay-sidebar.js :: swapPopupMenus ::
line 507"  data: no]

I suspect our code is not re-entrant which becomes apparent on startup (maybe also the cause for bug 482152).
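The idea of processing pending events during long-running provider work can be illustrated with a toy model. `ToyEventQueue` below is a hypothetical single-threaded queue written for this sketch; it is not the XPCOM thread API, and the thread does not show which calls the actual patch uses.

```javascript
// Toy single-threaded event queue; a hypothetical stand-in, not the real
// XPCOM thread API.
class ToyEventQueue {
  constructor() {
    this.queue = [];
  }
  dispatch(fn) {
    this.queue.push(fn);
  }
  hasPendingEvents() {
    return this.queue.length > 0;
  }
  processNextEvent() {
    const fn = this.queue.shift();
    if (fn) fn();
  }
}

// Simulated provider scan: one log entry per item processed.
function scanItems(thread, items, log, keepProcessing) {
  for (const item of items) {
    log.push(`scan ${item}`);
    if (keepProcessing) {
      // Drain pending UI events between units of provider work.
      while (thread.hasPendingEvents()) {
        thread.processNextEvent();
      }
    }
  }
}

function demoScan(keepProcessing) {
  const thread = new ToyEventQueue();
  const log = [];
  // A UI event (e.g. a keystroke) is already waiting when the scan starts.
  thread.dispatch(() => log.push("keystroke"));
  scanItems(thread, ["a", "b", "c"], log, keepProcessing);
  // Whatever is still queued runs only after the scan has returned.
  while (thread.hasPendingEvents()) {
    thread.processNextEvent();
  }
  return log;
}

console.log(demoScan(false).join(", ")); // scan a, scan b, scan c, keystroke
console.log(demoScan(true).join(", "));  // scan a, keystroke, scan b, scan c
```

The same sketch also hints at the re-entrancy concern: an event handled in the middle of the scan runs while the scan's data structures are only partly updated, so any handler that touches them has to tolerate that state.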
Assignee: nobody → dbo.moz
Status: NEW → ASSIGNED
*I'd appreciate further testing on the proposed patch, especially on different platforms.* The borked views problem suddenly seems to have gone away for me, though I don't know the exact reason; maybe it's caused by different network latencies.
After applying this patch:

1) After clicking Reload in the context menu of the calendar list, I can now switch to the mail tab much more quickly.  It switches during the reload whereas before the patch it waited until the reload was done.  So that's good.

2) Unrelated to Reloading calendars:  Whenever I switch from the mail tab to the calendar tab, Lightning computes and shows all the events in month view, clears them from the view, then shows them again.  I'm not sure if the patch causes this or if Lightning has always done this (but was hidden from the user because Thunderbird waited to switch to the calendar tab until the composite was completely drawn).  I see the same thing when I change months.  Maybe Lightning redraws the view for each calendar?

3) Regarding bug 482152 ("this.mParentItem is null when start Thunderbird"), I can no longer reproduce it, with or without this patch.

WinXP / TB3.0b2 / Lightning 1.0pre 2009-05-01
Calendars are all local except for one http ics calendar.
Comment on attachment 374759 [details] [diff] [review]
keep on processing events [checked in]

The idea of the patch is nice, it would make lightning 'feel' much more responsive.
But one thing I worry about is what happens when an event gets fired that also wants to do some operation on calendar data. For example, what happens when the user deletes an item from a calendar while that calendar is still (re)loading? Or when the user moves an event?
I think that it will work, because all calendar operations already take async operations into account, but I'm not sure of that.
(In reply to comment #22)
> 2) Unrelated to Reloading calendars:  Whenever I switch from the mail tab to
> the calendar tab, Lightning computes and shows all the events in month view,
> clears them from the view, then shows them again.  I'm not sure if the patch
> causes this or if Lightning has always done this (but was hidden from the user
> because Thunderbird waited to switch to the calendar tab until the composite
> was completely drawn).  I see the same thing when I change months.  Maybe
> Lightning redraws the view for each calendar?
I think this should be covered in a separate bug. It may well be that this patch uncovers other flaws.

thanks for testing!

(In reply to comment #23)
> But, one thing I worry about is what happens when an event gets fired that also
> wants to do some operation on calendar data. For example, what happens when the
> user deletes an item from a calendar when that calendar is still (re)loading?
> Or when the user moves an event?
> I think that it will work, because all calendar operations already take async
> operations into account, but I'm not sure of that.
Exactly what I wrote at the end of comment #20: I fear this patch might uncover bugs w.r.t. re-entrancy. We need to find those anyway, so this might be a good start.

I appreciate further testing, and will ask for review in the meantime. Anyway, the changes are minimal; we could always easily switch back to the former behaviour.
Attachment #374759 - Flags: review?(philipp)
OS: Windows XP → All
Hardware: x86 → All
Comment on attachment 374759 [details] [diff] [review]
keep on processing events [checked in]

Looks good, r=philipp
Attachment #374759 - Flags: review?(philipp) → review+
Comment on attachment 374759 [details] [diff] [review]
keep on processing events [checked in]

pushed: <http://hg.mozilla.org/comm-central/rev/724a2ac69c25>

leaving open for now for further evaluation and patches
Attachment #374759 - Attachment filename: process-events → process-events [checked in]
Attachment #374759 - Attachment description: keep on processing events → keep on processing events [checked in]
Attachment #374759 - Attachment filename: process-events [checked in] → process-events
Looks like the checkin regressed Bug 492192.
The checkin <http://hg.mozilla.org/comm-central/rev/724a2ac69c25> seems to be responsible for multiple breakage in Lightning:

1. category colors are not displayed;

2. last selected calendar view is not saved across restarts - defaults to "Day";

3. can't open "Properties" from the calendar context menu;

4. calendar icon on the tab is missing unless switching the view, closing the tab and opening the lightning tab again;

This list might be incomplete. The corresponding error in the Error Console:

Error: view is null
Source File: chrome://calendar/content/calendar-views.js
Line: 363

Reversing the patch <https://bugzilla.mozilla.org/attachment.cgi?id=374759> resolves the issue.

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1b5pre) Gecko/20090509 Lightning/1.0pre Thunderbird/3.0b3pre

hg identify
d6977a2fb36c tip

Shall I file a new bug on these regressions?
Yes, please file new bugs on these. However, I can't reproduce your findings in my current nightly setup (having several wcap, ics, local calendars configured), so please add detailed information about your setup. I can experience the flickering mentioned in comment #22.
(In reply to comment #29)
> Yes, please file new bugs on these. However, I can't reproduce your findings
> in my current nightly setup [...].

Unfortunately for me, the problem is linked to my local calendar. Can't reproduce the issues with a new profile or with my old profile and an almost empty local.sqlite. But they strike me immediately again, if I import the local calendar previously exported to *.ics.

wc -l calendar_2009-05-09.ics 
15639 calendar_2009-05-09.ics

if it matters. Will investigate later.
Got it. I see actually the same issue as described in comment #27 but in Lightning, just the symptoms are much worse.

STR:

1. Backup your local calendar database

2. Import <http://www.mozilla.org/projects/calendar/caldata/GermanHolidays.ics> (File / Import Calendar...) into the local calendar

3. Restart Thunderbird

Actual results: see comment #28.

I hope that a fix for Bug 492192 will fix this as well.
(In reply to comment #28)
I'm seeing the exact same issues in Lightning: no calendar colors, only one calendar (of 2 or 3) listed in the tree, starting in day view, no lightning tab title and icon, everything disabled in context menu. I use only Google GData Provider calendars, no local ones.

I'm not sure bug 492192 is the same, since the error message is different:
"Error: item.calendar.name is null
Source File: chrome://calendar/content/calendar-view-core.xml
Line: 243"

vs.

"Error: view is null
Source File: chrome://calendar/content/calendar-views.js
Line: 363"

Anyhow, a checkin with so many regressions must be backed out and fixed before resubmitting. That's at least what I expect from Firefox and Thunderbird, I hope Calendar doesn't leave the product broken much longer.
I wouldn't count this as many regressions for this specific case. If something goes wrong while loading the window, then all these symptoms you describe show up at once.

Anyway, I'm sure Daniel is working on finding out how to fix this. We were aware that this bug might cause regressions, but given they are hard to identify we decided not to checkin/backout/checkin/backout/... until all cases are found.
The problem seems to be that chrome startup does not expect to halt onload with the UI thread processing further events of its queue, at least not over a longer period, e.g. on larger calendars. This is the case for all synchronous providers (storage, memory), because those don't return from getItems until all data is retrieved. Decoupling getItems from chrome onload solves the problem for me. Further testing wanted, of course. Thanks!
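As an illustration of what decoupling getItems from the window's onload means, here is a small sketch under stated assumptions. `ToyMainThread`, `postpone` and `simulateStartup` are hypothetical names invented for this example; the runnable is just an object with a `run` method, and the real patch may be structured differently.

```javascript
// Hypothetical stand-in for the main thread's event queue.
class ToyMainThread {
  constructor() {
    this.queue = [];
  }
  dispatch(runnable) {
    this.queue.push(runnable);
  }
  // Run queued runnables in order, as the event loop would later on.
  drain() {
    while (this.queue.length > 0) {
      this.queue.shift().run();
    }
  }
}

// Queue func to run later instead of calling it right away.
function postpone(thread, func) {
  thread.dispatch({ run: func });
}

function simulateStartup(postponed) {
  const thread = new ToyMainThread();
  const log = [];
  const getItems = () => log.push("getItems");

  // Window onload handler.
  log.push("onload begin");
  if (postponed) {
    postpone(thread, getItems); // onload returns before the data is read
  } else {
    getItems(); // synchronous provider: onload is held up until it returns
  }
  log.push("onload end");

  // The event loop picks up postponed work after onload has finished.
  thread.drain();
  return log;
}

console.log(simulateStartup(false).join(", ")); // onload begin, getItems, onload end
console.log(simulateStartup(true).join(", "));  // onload begin, onload end, getItems
```

In the postponed variant the onload handler completes before any calendar data is retrieved, which is the property the comment above is after for synchronous providers such as storage and memory.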
Attachment #376988 - Flags: review?(philipp)
Oops. Strictly speaking, making storage/memory work asynchronously violates calISyncCalendar. However, since only calICalendar::getItems is working asynchronously, this should not have an effect on caching (read calls like getItems are simply forwarded to the cache calendar, while modifying calls must not behave asynchronously).
Comment on attachment 376988 [details] [diff] [review]
postponing getItems calls [checked in]

>+        let worker = { // nsIRunnable:
>+            run: function worker_run() {
>+                func();
>+            }
>+        };
This can be shortened:
let worker = { run: func };


We should be a bit careful with calISyncCalendar. We don't want to rely too much on implementation details, so we should either find a different solution or assume that storage and memory are also asynchronous. Maybe we can get away with renaming to calISyncWriteCalendar or such.

r=philipp anyway, I'll leave this to your discretion.
Attachment #376988 - Flags: review?(philipp) → review+
(In reply to comment #35)
> Oops. Strictly speaking making storage/memory work asynchronously violates
> calISyncCalendar.

What if you leave those calendars as they were (thus sync), and make the UI spin the eventloop at times? I always had the feeling that the slowness was not in the storage provider, but in the drawing of eventboxes etc. At least, that is what I got out of my profiling (granted, that was quite some time ago, things might have changed)
(In reply to comment #34)
> Created an attachment (id=376988) [details]
> postponing getItems calls
> 
> Further testing wanted, of course. Thanks!

Can confirm that the patch solves the issues mentioned in comment #28 and in comment #32. Additionally, the flickering when switching to calendar tab has gone.

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1b5pre) Gecko/20090513 Lightning/1.0pre Thunderbird/3.0b3pre

built from 8d3b009b43d7. Thanks a lot!
(In reply to comment #37)
> What if you leave those calendars as they were (thus sync), and make the UI
> spin the eventloop at times? I always had the feeling that the slowness was not
> in the storage provider, but in the drawing of eventboxes etc.

Original submitter here.  The initial reason I added this bug was to stop the UI from locking up my EMail Compose window while the calendars reloaded/redrew.  So, cast my vote for whatever allows my keystrokes to be displayed as I type them while the calendars are reloading.  This suggestion sounds (to me) as if to say that the UI should get locked for reloading/redrawing.  This deficiency is why I still need to run Sunbird as a separate app rather than the Lightning Add-on.  Apologies in advance if I misunderstood the intent of this comment.
I opted to rename the interface to calISyncWriteCalendar and pushed to comm-central <http://hg.mozilla.org/comm-central/rev/b84ee5636318>.

(In reply to comment #37)
> (In reply to comment #35)
> > Oops. Strictly speaking making storage/memory work asynchronously violates
> > calISyncCalendar.
> 
> What if you leave those calendars as they were (thus sync), and make the UI
> spin the eventloop at times? I always had the feeling that the slowness was not
> in the storage provider, but in the drawing of eventboxes etc. At least, that
> is what I got out of my profiling (granted, that was quite some time ago,
> things might have changed)
This depends. There's definitely a fair bit of the story in the providers for large calendars. Nevertheless, it makes sense to shed light on the UI part, too. Numbers/evaluation appreciated...
Target Milestone: --- → 1.0
Attachment #376988 - Attachment description: postponing getItems calls → postponing getItems calls [checked in]
(In reply to comment #40)
Should the UUID be changed if the interface gets a completely new name?
(In reply to comment #22)
...
> 2) Unrelated to Reloading calendars:  Whenever I switch from the mail tab to
> the calendar tab, Lightning computes and shows all the events in month view,
> clears them from the view, then shows them again.  I'm not sure if the patch
> causes this or if Lightning has always done this (but was hidden from the user
> because Thunderbird waited to switch to the calendar tab until the composite
> was completely drawn).  I see the same thing when I change months.  Maybe
> Lightning redraws the view for each calendar?

follow-up bug 493730 filed
I am a bit unsure whether this bug is sufficiently fixed. Opinions?
I can test on Linux, Solaris, and Windows if anybody has .xpi files built with these fixes in them.  I unfortunately do not have facilities to build them myself...
(In reply to comment #45)
Lightning nightly builds for Thunderbird 3.0b3pre are available from <http://ftp.mozilla.org/pub/mozilla.org/calendar/lightning/nightly/latest-trunk/>
While Lightning still "slowly" redraws views (with calendars that have lots of recurring events), I think the specific issue of this bug is fixed because I can do other things in Thunderbird (e.g. typing in a composition window) while Lightning is reloading/drawing the view.

Yes, there are some small delays from when I type the keys to when they appear in the composition window, but no keystrokes are lost and I think it's no longer true that "Thunderbird windows are unresponsive."

Ideally Lightning wouldn't use 100% of the CPU during reloading or at least not affect the composition window at all, but I'm not sure that this can be fixed in the Lightning code (e.g. it might be a limitation of the Mozilla core or the OS handling multiple threads in the single Thunderbird process, or just that my CPU is too slow!).

Of course I speak only about the behavior on my own XP computer and TB3.0b2.
I agree with Comment #47.  The minor delay while typing in the composition window is quite acceptable.  As the reporter, in my opinion this issue is resolved.
Thanks; resolving FIXED per comment #47 and #48.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
I know that Lightning doesn't use Java, but for what it's worth, I noticed in my Java notes that you can set the priority of threads (as opposed to the whole process).  If that's possible in the Lightning code then maybe the priority of the composition window's thread could be increased or the priority of the calendar reloading decreased.

Theoretically it might make typing in the composition window smoother during reloads, and shouldn't slow the reloading when the user has manually started a reload and thus isn't using the composition window.  Just a thought.
Most of the time only the main (UI) thread is really involved. I've investigated using more threads for calendar extensively, but apart from ical parsing (bug 490309), I refrain from using more threads. They all lead to one problem or another, most often deadlocks. Thus the patch for this bug just postpones calendar work regularly in case there are pending events on the main/UI thread, which increases responsiveness.
I have a request that is not directly related to this issue but that the patch has highlighted.
I've done some testing with the Musicians' Birthdays calendar on my slow system (Celeron 1.3 GHz, 512 MB). Switching from e.g. November to December takes about 17 seconds, but Lightning spends the first 5 seconds updating the minimonth calendar in the left side panel; only after that does it update the month view and start to show events in the view. Without the patch from this bug it was impossible to notice this behavior.

Since 5/17 is ~30% of the total and, with calendars bigger than the one I used, this operation could take even more time, could you consider postponing the minimonth update until after the view update is completed?
The total time would be the same, but information about the events, which is more important than the free/busy status, would come first, and the user could decide to switch to another month without waiting for the month view to complete (now, with this patch, you don't have to wait for the complete update if what you are looking for is already shown).
This could be useful on slow systems or with many large calendars.
Unfortunately it's not easy to set the order of the getItems() calls.
I'm unsure why this is marked resolved fixed, as I am suffering from that problem with a relatively recent nightly (20100625031746).
These bugs are likely targeted at Lightning 1.0b1, not Lightning 1.0. If this change was done in error, please adjust the target milestone to its correct value. To filter on this bugspam, you can use "lightning-10-target-move".
Target Milestone: 1.0 → 1.0b1
I believe your change is an error, Philipp: the bug still occurs in Lightning 1.0. Also, it would be great if someone could reopen it because as Shaya mentioned, it is not fixed (I can also observe this). Sadly, I seem to not have the appropriate permissions to do so myself.
natanji, please file a new bug report on your problem, as this issue was closed more than a year ago.
I believe someone else already beat me to it: https://bugzilla.mozilla.org/show_bug.cgi?id=576017

It would be great if other people who watch this could see if they can confirm the behaviour described by Fred there. Basically, once the cache feature is enabled in Lightning, every calendar refresh creates a lot of disk I/O because multiple cache files are created and deleted in quick succession.
I'm still experiencing this exact bug very prominently with TB 15.0.1, Lightning 1.7 on Linux Mint 13 x86_64 kernel 3.2.0-31. Since I'm not using caching it's definitely not #576017.

I have 27 remote calendars, about half of them read-write via CalDAV (google calendar), the other half read-only iCal. Every 30min when they're periodically reloaded TB becomes completely unresponsive for > 1min!