Closed Bug 1502923 Opened Last year Closed 2 months ago

Thunderbird processing / "Not Responding" 5 to 30 minutes at startup with 63.0beta + Lightning 6.5 and 64.0b1 + Lightning 6.6 after upgrade, and also "Unresponsive Script" - Performance issues in unifinder.js

Categories

(Calendar :: General, defect, major)

Unspecified
All
defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: richard.leger, Assigned: richard.leger)

References

(Blocks 2 open bugs, )

Details

(Keywords: perf, Whiteboard: [not regression])

Attachments

(20 files)

6.23 KB, image/png
Details
34.00 KB, image/png
Details
64.17 KB, image/png
Details
3.59 MB, image/jpeg
Details
35.13 KB, image/png
Details
36.15 KB, image/png
Details
109.51 KB, image/png
Details
14.47 KB, image/png
Details
4.65 MB, application/x-zip-compressed
Details
1.52 MB, application/x-zip-compressed
Details
463 bytes, text/plain
Details
463 bytes, patch
Details | Diff | Splinter Review
909 bytes, patch
Details | Diff | Splinter Review
1.72 KB, patch
darktrojan
: review+
Details | Diff | Splinter Review
3.10 KB, patch
darktrojan
: review+
Details | Diff | Splinter Review
5.59 KB, patch
Details | Diff | Splinter Review
3.68 KB, patch
pmorris
: review+
Details | Diff | Splinter Review
64.58 KB, application/pdf
Details
5.39 KB, application/x-zip-compressed
Details
10.84 KB, text/plain
Details
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:62.0) Gecko/20100101 Firefox/62.0

Steps to reproduce:

Upgraded from TB 60.0b11 to 63.0b1 with Lightning 6.5 add-on last updated 29/10  (and then to 64.0b1 with Lightning 6.6 add-on last updated 29/10) let the application restart when prompted...


Actual results:

The application take ages to start and keep processing/not responding ad vitam aeternam... it is so slow that application is simply unusable... though if you try to open menu after long wait it would do it so the UI is not completely frozen. In safe mode or with just Ligthning add-on disabled (that is the only one active)... such issue does not occur.


Expected results:

After starting the application, it should be usable within seconds... in safe-mode it take about 20-40 seconds to be usable.
FYI, boot-up performance of TB with Calendar has always been bad, has worthen then improved and then worthen again till today it come up to the point where TB can no longer be used!

It would be great if start-up performance with the lightning plug-in could be addressed somehow. I know some of the issue is linked to the mono-processing architecture of TB... but then maybe lightning processing (e.g calendar sync, populate UI widget) may be delayed after boot-up for a while till others task are completed by TB before sync... part of the delay is calendar sync and populating the UI with calendar and task items information...
Severity: normal → major
Component: Untriaged → General
Keywords: perf
Product: Thunderbird → Calendar
Version: 63 → unspecified
Event if I activate Calendar logging, I can't see Lightning logs in the Developer Tools > Error Console... because itself it is not responding as well... nothing shows up... even after a long wait...
After restart of TB I immediately tried to access the Calendar tab view and managed to turned off the three active CalDav Network calendars (out of 6 - 3 were already turned off). 

That seems to have stopped the processing after a while. None of the Calendars are set with offline mode enabled as it has caused issues in the past... some items keep being populated in the view somehow for a short while then it stopped.

After a restart of TB all calendars are gone!!! See attached...
The calendar list was indeed folded, I did not noticed at first that the arrow (in front of Calendar) was horizontal, clicking on it unfolded the list of calendar so they have not disappeared were just hidden... all appear disabled as expected...
OS: Unspecified → Windows 10
Summary: Thunderbird 63.0beta processing / "Not Responding" with Lightning 6.5 after upgrade → Thunderbird processing / "Not Responding" with 63.0beta + Lightning 6.5 and 64.0b1 + Ligthning 6.6 after upgrade
Some additional info about issue in TB 64.0b1 with Lightning 6.6 (29 Oct)...

1.If I open calendar view immediately after TB UI opens, I can see the list of Calendars, and they seems to be well populated in the view....

2.If I keep the Calendar tab view opened and restart TB, I will automatically return to the Calendar tab view but no Calendar appear listed in the left column, and while some calendar items appears in the week view, items are transparent and not colored formatted as usual.

If I close the Calendar view, then re-opened TB and wait a long time it finish processing... if I open Calendar tab view, I get the same result as in .2

Basically the only way I can see my Calendars now is to close Calendar view and reopened TB and then immediately go into the Calendar view as soon as possible.

There are a lots of errors appearing in the Error Console as well... see attached... if that can help.
What other addons have you installed? Does this happen also if you disable _all_ of them and start with only Lightning enabled?

Can you also describe your calendar setup (which types/providers, offline support enabled/disabled)?

Have you made sure that your UI filters (dropddown in the list above the calendar grid, context menu of the task tree in the today pane are not set too broad? In general, these filter setting should limit the timespan to look up reasonable short - unlimited filters like all event or all future events are known to cause similar issues.

Most of the errors are not from Lightning and those that are, are fine (re unknown timezones). Can you post the entire startup log not just limited to errors (you can also just select and copy the content of the error console now, you're not forced to use screenshots anymore)?
Flags: needinfo?(richard.leger)
My point 1. describe in previous comment was no longer working for a while until the computer was rebooted and then this workaround worked again!

(In reply to [:MakeMyDay] from comment #6)
> What other addons have you installed? Does this happen also if you disable
> _all_ of them and start with only Lightning enabled?

Yes.
 
> Can you also describe your calendar setup (which types/providers, offline
> support enabled/disabled)?

Network calendars (calDav), provider Fabit using Davical calendar system running on Ubuntu server 16.04.
Offline support disabled
I even now disabled the Reminders but still same issue.
Calendars are set read/write

Out of 6 calendars only two turned ON one for calendar items and another one used for tasks. The reason it is setup this way is because I share my calendar with other users but not my tasks.
 
> Have you made sure that your UI filters (dropddown in the list above the
> calendar grid, context menu of the task tree in the today pane are not set
> too broad? In general, these filter setting should limit the timespan to
> look up reasonable short - unlimited filters like all event or all future
> events are known to cause similar issues.

I always set the UI filter to "Events in Current View" because of the performance issue encounter with broader filter (would be nice to have this issue sorted at some point as well but may not be possible while TB is in monolithic architecture :-)

I also tried to change the filter to "Currently Selected Day", which adjust accordingly the list in the list item view.

As for the Calendar view it is set on Week and I cannot switch to Day or Month view. When I click nothing happens.

This is the case when no calendar are listed in the left column.

If I manage as describe in my .1 above to access my list of Calendar and proper view, then I can switch between Day and Month view...

If not linked to Ligtning directly maybe the problem lie in the User Interface and the calendar widget view (left column and calendar view part...) items not properly populated in the views...

> Most of the errors are not from Lightning and those that are, are fine (re
> unknown timezones). Can you post the entire startup log not just limited to
> errors 

Would you explain what statup log you need and how I can get them for you?

>(you can also just select and copy the content of the error console
> now, you're not forced to use screenshots any more)?

I have done the screenshot because the interface did not allow me to copy/paste the content of the error/console... which itself sometime to time who "Not Responding"...
Flags: needinfo?(richard.leger)
This is the error console content when the Calendar view load fine (Calendars listed, items formatted ok, can switch between Day,Week,Month views), in that case I can copy/paste:

WebExtensions: Loading packed extension from C:\Users\richard\AppData\Roaming\Thunderbird\Profiles\cnant748.default\extensions\{e2fda1a4-762b-4020-b5ad-a41df1933103}.xpi ext-legacy.js:73
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:13
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:14
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:15
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:16
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:17
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:18
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:19
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:20
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:21
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:22
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:23
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:24
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:25
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:26
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:27
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:28
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:29
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:30
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:31
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:32
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:33
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:34
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:35
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:36
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:37
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:38
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:39
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:40
Ignoring unrecognized chrome manifest directive 'overlay'. chrome.manifest:41
Ignoring unrecognized chrome manifest directive 'style'. chrome.manifest:46
Ignoring unrecognized chrome manifest directive 'style'. chrome.manifest:47
Ignoring unrecognized chrome manifest directive 'style'. chrome.manifest:48
Ignoring unrecognized chrome manifest directive 'style'. chrome.manifest:49
Ignoring unrecognized chrome manifest directive 'style'. chrome.manifest:50
Ignoring unrecognized chrome manifest directive 'style'. chrome.manifest:51
Ignoring unrecognized chrome manifest directive 'style'. chrome.manifest:52
Ignoring unrecognized chrome manifest directive 'style'. chrome.manifest:53
Ignoring unrecognized chrome manifest directive 'style'. chrome.manifest:54
WebExtensions: Loading add-on preferences from  C:\Users\richard\AppData\Roaming\Thunderbird\Profiles\cnant748.default\extensions\{e2fda1a4-762b-4020-b5ad-a41df1933103}.xpi ext-legacy.js:106
WebExtensions: Firing profile-after-change listeners for {e2fda1a4-762b-4020-b5ad-a41df1933103} ext-legacy.js:110
Lightning: [calTimezoneService] Loading resource://calendar/timezones/zones.json calTimezoneService.js:73
Lightning: [calTimezoneService] Timezones version 2.2018e loaded calTimezoneService.js:124
TypeError: control.radioChildConstructed is not a function[Learn More]
radio.xml:29:13
Use of Mutation Events is deprecated. Use MutationObserver instead. calendar-widgets.xml:272:20
Lightning: CalDAV: send: <?xml version="1.0" encoding="UTF-8"?>
<D:propfind xmlns:D="DAV:" xmlns:CS="http://calendarserver.org/ns/" xmlns:C="urn:ietf:params:xml:ns:caldav"><D:prop><D:resourcetype/><D:owner/><D:current-user-principal/><D:supported-report-set/><C:supported-calendar-component-set/><CS:getctag/></D:prop></D:propfind>
calDavCalendar.js:1708
Lightning: CalDAV: Error without status on initial PROPFIND for calendar Richard's Calendar calDavCalendar.js:1718
Lightning: There has been an error reading data for calendar: Richard's Calendar.  However, this error is believed to be minor, so the program will attempt to continue. Error code: DAV_ERROR_BASE. Description: The resource at https://*******.*******.uk/davical/caldav.php/richard/home is either not a DAV collection or not available calCalendarManager.js:970
Lightning: There has been an error reading data for calendar: Richard's Calendar.  However, this error is believed to be minor, so the program will attempt to continue. Error code: READ_FAILED. Description: calCalendarManager.js:970
Lightning: CalDAV: Error without status on initial PROPFIND for calendar Richard's Tasks calDavCalendar.js:1718
Lightning: There has been an error reading data for calendar: Richard's Tasks.  However, this error is believed to be minor, so the program will attempt to continue. Error code: DAV_ERROR_BASE. Description: The resource at https://*****.******.co.uk/davical/caldav.php/richard/tasks is either not a DAV collection or not available calCalendarManager.js:970
Lightning: There has been an error reading data for calendar: Richard's Tasks.  However, this error is believed to be minor, so the program will attempt to continue. Error code: READ_FAILED. Description: calCalendarManager.js:970
Lightning: Start loading of locale dependent preference default values... calendar-chrome-startup.js:255
Lightning: Loading of locale sensitive preference default values completed. calendar-chrome-startup.js:275
1541070306294	Telemetry::CoveragePing	ERROR	no endpoint base set Log.jsm:684
Error while calling actor 'parentProcessTarget's method 'ensureCSSErrorReportingEnabled' Argument 1 of InspectorUtils.getAllStyleSheets is not an object. protocol.js:1041:5
writeError
resource://devtools/shared/protocol.js:1041:5
handler/</<
resource://devtools/shared/protocol.js:1235:31
ensureCSSErrorReportingEnabled@resource://devtools/shared/base-loader.js -> resource://devtools/server/actors/targets/browsing-context.js:1004:9
handler@resource://devtools/shared/base-loader.js -> resource://devtools/shared/protocol.js:1198:21
onPacket@resource://devtools/shared/base-loader.js -> resource://devtools/server/main.js:1313:15
send/<@resource://devtools/shared/base-loader.js -> resource://devtools/shared/transport/local-transport.js:64:11
exports.makeInfallible/<@resource://devtools/shared/base-loader.js -> resource://devtools/shared/ThreadSafeDevToolsUtils.js:109:14
exports.makeInfallible/<@resource://devtools/shared/base-loader.js -> resource://devtools/shared/ThreadSafeDevToolsUtils.js:109:14
protocol.js:1044:7
writeError
resource://devtools/shared/protocol.js:1044:7
handler/</<
resource://devtools/shared/protocol.js:1235:31
stylesheets actor: fetch failed for data:text/css,, using system principal instead. stylesheets.js:213:5
uncaught exception: Protocol error (unknownError): Argument 1 of InspectorUtils.getAllStyleSheets is not an object.
When it works, I seems to observe that the processing wheel of TB keep appearing while the Calendar view is very very very slowly populating Calendar items in the calendar Week view... and it remains in processing status till the calendar tab view is fully populated (all items listed in calendar Week view, list of items based on "Currently Selected Day" filter, fully populated). So it seems the performance and delay is due to populating items in the User Interface and its various widgets.

Also noticed this AM, when the Calendar Tab was not populating properly (No calendar listed and view not fully populated formatted ok) that some of my IMAP folder where also missing in the mailbox view. Not all subfolders were listed. But at the moment after reboot of computer and TB seems ok.

So may means there may be some UI bug/performance issues in the beta version of TB perhaps.
The log indicates your calendars cannot be loaded at all, most probably due to a timeout. Is your network connection stable?

Another thing you could check is whether applying the workaround mentioned in bug 1468755 and restarting TB resolves the issue (if not remove the workaround setting again)?

However, if you're missing imap folders as well, this indicates more a network problem (and how the UI deals with it, though I don't have any idea why this is happening - I have a TB here that is happyly loading even if it fails to get a response from a caldav server because it is down).

Can you reproduce the behaviour with a new profile?
(In reply to [:MakeMyDay] from comment #10)
> The log indicates your calendars cannot be loaded at all, most probably due
> to a timeout. Is your network connection stable?

I tried on separate connection client side to the server look stable. No packet loss to the server. The delay and slow down also happened on 50Mo/s symetric lease line... with 80/20 FTTC line at the other end...

> Another thing you could check is whether applying the workaround mentioned
> in bug 1468755 and restarting TB resolves the issue (if not remove the
> workaround setting again)?

Could not find the workaround in the bug report you refer to... which seems related to cardbook...

> However, if you're missing imap folders as well, this indicates more a
> network problem (and how the UI deals with it, though I don't have any idea
> why this is happening - I have a TB here that is happyly loading even if it
> fails to get a response from a caldav server because it is down).

This only happened once. Most of the time it works fine. I have no issue with accessing my mailbox on other client nor my calendar items...

> Can you reproduce the behaviour with a new profile?

I will try it out...
sorry, wrong bug reference. Must have been bug 1468912 - see the user story at the top.
Some update...

As soon as I got access back to my calendar, I have tried to delete and recreate recent items and it turns out one of them could not be deleted via TB (corrupted somehow?). 

I manage to delete it via my phone calendar app and recreated the item via TB fine. Since then it seems I can more reliably access my calendars... they keep listed on the left side and formatting looks fine. The "Not Responding" mention disappeared.

I do not see any time out with our server at the moment, and have not observed any IMAP issues like previously... but I still have performance issues to load the calendar items... especially at boot of TB on both stable ADSL and 4G connection...

So I decided to access my calendars via CyberDuck software (https://cyberduck.io/) and when I tried to download one .ics items ,from my home calendar, of about 1KB in size, it takes 35s to prepare... plus few seconds to effectively download the items. Total ~37s.

If I do the same on a similar size item in my tasks calendar (separate one - but on the same server/system) it takes only few seconds to prepare and download.
Total ~2s.

This may confirm that the issue may lie in my home calendar somehow and perhaps not Thunderbird or Lightning. I will work at fixing this issue in measure of possible... and revert back to confirm.

If still an issue after that I will try the workaround suggested. Thanks for help.
My home calendar has x10 times more items than my tasks calendar maybe that's what causes an issue/delay server side perhaps somehow... or additional corrupted items that may still be identified... 

Thanks to all for your feedback and help so far. Much appreciated.
I am still investigating this issue that is still occurring and may be cause by a very slow loading caldav calendar (root cause unknown), and one thing I have noticed in TB 64.0b3 (32-bit) is that:

1. if I choose 'Currently Selected Day' in the 'Find Event' view in Calendar tab and restart TB, all calendars appear listed fine, and while it is slow, event items are populated with few seconds interval in the view. TB is processing, extremely slow and almost unresponsive with inability to read email in my IMAP account for a while. Email content not loaded (via IMAP) when clicking on email till calendar is fully populated and loaded which takes a while...but works in the end.

2. if I choose 'Event in Current View' in the 'Find Event' view in Calendar tab and restart TB, calendars no longer appear listed in the left column, some event items appear as trans lucid events and the calendar is not populated any further... TB remain in processing state and unresponsive with inability to read email in my IMAP account or access any calendars. It does not work in the end...

I understand that there may be some issue with calendar perhaps or connectivity to the server possibly (that I still need to figure out as the root cause hopefully, though I haven't found any so far) but shouldn't TB be able to raise some sort of error or warning and handle the situation in a smooth and "civilised" manner without its processing/interface going haywire?

Why would it work more or less with the 'Currently Selected Day' and not with the 'Event in Current View'?
I can now confirm the issue encounter and reported in this bug report is due to Thunderbird/Lightning...

# Part 1 - Troubleshooting steps and Findings

Troubleshoot Findings:
- Clearing any error on calendar (timezone, freedesktop url, parsing issues) did NOT fixed the performance problem
Steps to clear parsing errors:
-- Downloaded my all .isc calendar via web, open in text file and searched for item(s) with the wrong timezone...
-- Opened envent in TB, show time zone (via menu) and changed timezone to the correct one.
-- Deleted some of events which were invites send from microsoft system (Outlook and/or Office 365) all of them have parsed warning in Lightning log... Replaced by TB event setup manually with similar setting to keep record of them (but no longer the original invite).
-- Lookup for all items with error "X-LIC-ERROR;X-LIC-ERRORTYPE=PROPERTY-PARSE-ERROR:Parse error in property name: ACKNOWLEDGED" and disable or reset reminder for all of those to correct the Parse error.
-- Lookup for all items with error "X-LIC-ERROR;X-LIC-ERRORTYPE=VALUE-PARSE-ERROR:No value for URL property. Removing entire property:". Open them, add attachment save, re-open, remove all attachment, save. Deleted event and recreated for those which were still having issues after that.
==> This has now cleared all errors and parsing warning in Lightning log... but still have performence issues...
- The probleme occurs only at TB startup when calendar is active and view set to 'All Events' (or other view showing a high number of events in the list)
- The problem increase as the number of items in calendar increases (e.g ~4000 items recurring, non recurring, invites over 6 years)
- Server send immediately response to client upon reception of request - No delay occur server side
- TB send multiple requests (multiget) to retrieve calendar items by chunck of about 100, but there is a delay between each request (during which TB is unresponsive/processing client side), that increase over time

==> therefore my suspicion is that the issue lie within TB/Lightning! So I setup a test environment to confirm it...

# Part 2 - Setup test enviroment for further troubleshooting and reproduction of issue at will

- Create a new TB profile, don't setup email (cancel email setup wizard)
- Add new network caldav calendar (https://.../calendar with about 4000 itesm) with no offline support, no email (I also disabled reminders but not an obligation)
- Login to calendar
- Delete the local 'Home' calendar
- Add CA certificate for my calendar server (to avoid having to setup an exception for SSL certificate validation)
- Make sure it shows 'All Events' in the 'Find Events' section in Calendar tab but don't touch any other TB settings, especially don't touch any of the sorting column headers (Title, etc...)
- Wait till no network connection to server calendar is establshed (443 connection to server disappear from network connection monitoring)
- Close all TB windows and make sure process thunderbird.exe terminates.
- Close all other application running on the PC.
- Open Task Manager (on Windows 10) > Performance tab > Wifi (if connected via Wifi to network, Ethernet otherwise) <== this allow to monitor the network traffic 
- Open Ressource Monitor in parallell > Network tab
- Open TB with -p option for opening profile manager
- In Ressource Monitor > Network, tick thunderbird.exe to limit monitoring to this process only, and then expand the "TCP Connections" section to observe status of connection for calendar sync (server IP, port 443) <== this help to confirm when connection cease and sync has completed (as TB does not show any progress status)
- Now open the TB profile use for testing (no email setup, just one network calendar)
- Enter credentials and login to calendar when prompted (disregard any email setup wizard if prompted)
- The calendar sync shall start, and items start to appear in calendar view
- Then observe network request/response in Task Manager > Performance tab > Wifi

Test 1 - OK:
- Following setup above, and open TB as indicated...
==> I can see about 12 CONSECUTIVE network request/response picks (send/receive traffic) over one minute time (corresponding to the multiget request from TB), with the first pick at about 6Mo and the following one between 1.5Mo to 2Mo each (about). The calendar takes about 1 minute to load completely with TB interface remaining available and usable at all time during this period (browse calendar at different dates, etc...)

Test 2 - KO:
- Following setup above but before closing all TB windows and reopening it as indicated to monitor, make sure "All Events" view is set, and click on the "Title" header column to order items by name in the "Find Events" section.
==> The network request/response picks (send/receive traffic) no longer appear consecutive, a delay appears between each request, and the delay increase exponentially between two picks as time passes... each pick still correspond to request/response (server answer immediately - multiget request from TB). It look like the calendar keeps loading indefinitely with TB interface freezing/unresponsive for long seconds at regular interval becoming unusable (almost), cannot browse calendar, cannot change the view, etc... very long anc clancky UI delays...

Test 3 - KO then OK:
- If reproducting test2 but immediately after login when calendar start to load I attempt (with difficulties and the UI is very very slow almost unresponsive) to change the view from "All Events" to "Currently Selected Day"... after a while, TB become responsive again and the loading of calendar resume and complete fast (as in Test 1), meaning it fixes the on going issue and TB become responsive again.

This testing and result now confirm that the issue lie client side (Thunderbird/Ligtning) and not server side (or network issue).

More info in my next comment...
# Part 3 - Source of issue

As I was testing restarting TB with "All Events" view set and after clicking on "Title" header column in "Find Events" section in "Calendar Tab", the following error message appears regularly in TB (see attached) while application seems to freeze/become unresponsive:

Unresponsive Script
...
Script: resource://calendar/modules/utils/calUnifinderUtils.jsm:110

This happens as delay between TB requests to load calendar items by chunks of about 100 items increases over time, as observed in Task Manager > Performance > Wifi graph.

I believe such delay are being caused by the unresponsive script which is trying to order items by "Title" in the "Find Events" section after EACH request/response... the total number of items increasing over time, the sorting takes more and more time to complete for some reason causing the delay between two request to increase.

My understanding it that the issue may lie in this peace of code within the script but I don't know for sure:

+    /**
+     * Sort the unifinder items by the given sort key, using the modifier to flip direction. The
+     * items are sorted in place.
+     *
+     * @param {calIItemBase[]} aItems           The items to sort
+     * @param {String} aSortKey                 The item sort key
+     * @param {?Number} aModifier               Either 1 or -1, to indicate sort direction
+     */
+    sortItems: function(aItems, aSortKey, aModifier=1) {
+        let comparer = calunifinder.sortEntryComparer(aSortKey);
+        aItems.sort((a, b) => {
+            let sortvalA = calunifinder.getItemSortKey(a, aSortKey);
+            let sortvalB = calunifinder.getItemSortKey(b, aSortKey);
+            return comparer(sortvalA, sortvalB, aModifier);
+        });
+    }
+};

And more precisely at possible line 110 corresponding to the following possible action:

let sortvalB = calunifinder.getItemSortKey(b, aSortKey);

Though I would not be able to know why or how to fix it!

Help from developers would be welcome.

I think there may be two things that could be improved:
- First, the sorting shall not occurs after each chunk of data response received, but only once at the end when all items have been loaded into TB. There may also need some sort of optimisation to make sure the sorting of 4000 items or more does not take too long in TB!!!
- Second, it would be great if the size of chunk could be customised via a configuration setting as slicing a 4000 items calendar by chunk of 100 is not really efficient to my opinion, it creates two many requests to the server which may be source of performance issue due to network overhead. As the total number of items increase in calendar, the chunk shall be bigger or let the client user decide the best optimised value of chunk size. A user on a 1GB LAN network shall be able to load calendar faster via bigger chunk (smaller number of request) - while a mobile user with poor connection may still benefit from smaller chunk size (higher number of request). Some CalDav client allow to end-user to set such setting, it would be great if Lightning would allow it as well.

In any case, Thunderbird UI shall remain available at all time during the loading/sorting of calendar data by chunk and not freeze or become unresponsive (processing state).

Hope those findings would help have issue sorted somehow.
The temporary workaround to this issue (until fixed in TB) is therefore to make sure the "Currently Selected Day" is set instead of "All Events" in the "Find Events" section of the Calendar tab prior closing/exiting Thunderbird, so when it is re-opened, this view would be set by default to "Currently Selected Day" (as TB remember the last setting set), minimising the number of items being sorted and listed during calendar loading from the server.
It would also be great and handy for Ligthning to indicate loading progress of calendar(s) either by percentage in status bar or via the Activity Manager windows (or both) so we could at least monitor and see what is happening when calendar(s) are loading (Lightning processing), watch progress and see status of completion... the same way we can see progress of indexing of mailbox for example...
For the record, tests and results in Comment 16 and Comment 17 were obtained via Thunderbird Version 64.0b3 (32-bit) [Lightning 6.6] on Windows 10 Pro 64 bits environment. But issue may have been existing in earlier version of TB at least seen in TB 60.3.2 (32 bits) on Windows 7 recently and possibly other previous versions...
See attached network performance graph corresponding to Test 1
OK - Normal result

Graph looks like this:

/\/\/\/\/\/\/\/\ where one pick is a request/response (send/receive)

with:

- setting "All Events" with no particular column header sorting set (which is the default after adding new calendar in TB in a new profile)
- sync caldav calendar complete within one minute
- TB remain responsive at all time
See attached network performance graph corresponding to Test 2
KO - Abnormal result

Graph looks like this:

/\_/\__/\___/\____/\_ where one pick is a request/response (send/receive) and flat line (no traffic - delay)

with:

- setting "All Events" with "Title" column header sorting set ascending order (set by clicking on the header column after adding the calendar)
- sync caldav calendar never really complete it keeps syncing
- TB become unresponsive (freeze/hang/processing mode)
Someone else encountered the issue recently with the same script and reported here: https://forums.mozfr.org/viewtopic.php?t=139935

Rough translation from quoted text in French:

"...
Symptoms : Thunderbird freeze.
It show some error alert: "Warning: Script unresponsive", with button choice to continue or stop the script.
I tried each: One click on one of the button does not change anything.
Script at issue (for example) = resource : //calendar/modules/utils/calUnifinderUtil.jsm:111
..."

Possible cause/solution indicated in further post on the same page?

"...
The problem was due to an upgrade of TB incompatible with other modules Thunderbird (SOGo).
A re-installation to version v52.9 allowed to resolve the problem (the module SOGo being a module which install the Lightning module).
So re-installed TB 52.9
Deleted all the modules
Re-installed the SOgo module which install Lightning
..."

I do have the SOgo connector but it is disabled see attached... though I don't see how it would interfere with TB/Ligthning or cause issue as it is disabled (because not compatible) among other add-ons no longer compatible.

I am reluctant to downgrade as this may cause additional issues.

It there a way to confirm that some functions from the SOgo modules are triggered and causing the issue while module is disabled?
As I just checked in my test environment (separate clean profiles with only one network caldav calendar setup) as per Test 1,2,3 (as described in Comment 16) the SOGo Add-On is not listed (see attached). None of the non-compatible Add-On are listed. So in theory no Add-On be the source of the issue... but Lightning. Back to square one!
As per Bug 1486692 Comment 12 someone else seemed to have encountered the same symptoms in Thunderbird 60 / Lightning 6.2 on Windows 10 Pro.
Hi MakeMyDay,

As you would see in my many comments above, the issue seems to lie in Thunderbird/Lightning at the moment...

I suspect this bug 1502923 may have been caused by (or be consequence of) changes implemented earlier this year as described in Bug 1439900 "Move unifinder related functions to calUnifinderUtils.jsm". It could also well be that the problem always existed but may have been emphasised somehow by those changes...

Would you be able to help further to troubleshoot and have this issue sorted somehow?

Cheers,
Flags: needinfo?(philipp)
Flags: needinfo?(makemyday)
Hi Richard,

Yes https://bugzilla.mozilla.org/show_bug.cgi?id=1322233#c42 refers to the same problem you describe.  I didn't take my investigation as far as you did.  The solution for me is to set "Events in the Next 14 Days" at least until Lightning is started.

It would be nice if Thunderbird/Lightning did this on startup, then restored the setting to "All Events" after the Thunderbird ecosystem is setup, operating, and stable.  In my mind it shouldn't be attempting all this sorting activity during "initialization"/instantiation, rather it should delay this activity until the Thunderbird/Lightning is fully initialized and transitioning to "normal" run-time operation.

But it is what it is, and as you note the workaround is to limit display of events to 14 days or less during startup, then set to your preferences after startup completes.  This is an odd thing to have to do for "normal" operation.  To me this is a bug in the Lightning code. i.e. disable lightning and the startup problem goes away, and on my 10 email account system, TB startup is nearly instantaneous without Lightning.  Reenable Lightning with "All Events" and startup is unresponsive for ~½ hour.  Since I rely heavily on Lightning, running TB without Lightning is also not acceptable.  This was 4 months ago and I'm pretty sure I was communicating with the Lightning developer (he demonstrated deep knowledge), so I'm surprised this problem still exists in production code.

BTW, I'm just a normal user, not a developer here.

My 2¢,
Craig
@ Richard and Craig, 

I found this thread searching for help with a recent series of upgrades to TB 60, 62.1 and 63.3
Initially I experienced a completely unresponsive TB so I downgraded to 52.9.1 and got my mail program and calendar working again. This joy was short lived. Deleting TB 60 caused me to lose my Firefox browser. 
After re-installing Firefox,  migrating prefs and syncing FF I was almost back to normal but save for the realization that I hadn't updated my white list for NoScript in quite some time. This was
unfortunate but my own darn fault. I left a copy of the file on my desktop to remind me to keep it updated.

About a day later a restart of my computer for a Windows upgrade caused TB to upgrade to 60 again. This happened despite turning off automatic upgrades so I resolved to use my phone and web based browser for a calendar until I could upgrade to a newer version of TB.  I held out hope that the developers would address the issue in a future upgrade. 

TB 62.1 caused me to lose Lightning altogether. TB mail worked great but I still was hoping for a functioning calendar. 
After checking the Addons and finding that there were no compatible versions of Lightning I searched on-line and discovered that the automatic update of Lightning was "stuck." To fix it was pretty easy - https://support.mozilla.org/en-US/kb/calendar-updates-issues-thunderbird#w_lightning-disappears-after-a-thunderbird-update-release-and-beta-versions

Score one for the good guys! 

However, now that I had updated Lightning I still wasn't any closer to optimum functionality of TB mail. TB calendar barely worked at all. It was sort of a read only calendar that seemed to completely bog down TB initialization. Another upgrade a few days later to TB 63.3 returned the same problems; slow loading and unresponsiveness that appeared to improve over time but still wasn't anywhere near useful. 

It was then that I came across this thread. I was searching on-line for the name of the script that seemed to be creating the problems with my TB calendar. 

//calendar/modules/utils/calUnifinderUtil.jsm:111      

After reading through Richard's exhaustive account of his trouble shooting efforts I decided to give his suggestion a try.  Craig's confirmation gave me confidence that it would work. 

I set the events to the next 14 days and restarted TB. Bam, it worked like a charm. It's not really a fix, more like a work around as Arno suggested but if you have been struggling with Mozilla's TB upgrade I suggest you give it a try.   

Thank you so much Richard and Craig! I created an account just to pass on my gratitude. 

Best wishes, 

Bob
Find attached DevTools Profile (Network/Performance) when calendar load properly (OK)

As per Test1: Find Events > All Events > No column order set (default)

As per ref: Comment 16 and Comment 21
Find attached DevTools Profile (Network/Performance) when calendar does NOT load properly (KO)

As per Test2: Find Events > All Events > Column order set by Title

As per ref: Comment 16 and Comment 22
The DevTools profile send in previous comments were taken with Thunderbird 64.0b3 (32-bit) on Windows 10 Pro 64bits 1803 (Version 10.0.17134.407).

But the same issue is also happening with Thunderbird 64.0b4 (64 bits) on Mac OS X 10.13.6.

While the workaround works, it currently rely on user interaction to remember changing the setting prior closing TB, not the most convenient.

In any case it cannot be considered as a long term solution, which should be to address the underlying and recurrent performance issue in Thunderbird and especially calendar loading shall work fine at all time without requiring workaround.

Feedback on this issue from Thunderbird developers would be much appreciated.
Summary: Thunderbird processing / "Not Responding" with 63.0beta + Lightning 6.5 and 64.0b1 + Ligthning 6.6 after upgrade → Thunderbird processing / "Not Responding" with 63.0beta + Lightning 6.5 and 64.0b1 + Lightning 6.6 after upgrade
Summary: Thunderbird processing / "Not Responding" with 63.0beta + Lightning 6.5 and 64.0b1 + Lightning 6.6 after upgrade → Thunderbird processing / "Not Responding" with 63.0beta + Lightning 6.5 and 64.0b1 + Lightning 6.6 after upgrade - Performance issue
Summary: Thunderbird processing / "Not Responding" with 63.0beta + Lightning 6.5 and 64.0b1 + Lightning 6.6 after upgrade - Performance issue → Thunderbird processing / "Not Responding" with 63.0beta + Lightning 6.5 and 64.0b1 + Lightning 6.6 after upgrade - Performance issues
This bug may also partially be linked to underlying issues in TB core (Dom loading, Gecko and/or Javascript engine) as I seems to also get huge delay and "Not Responding" issue with the devtools when trying to load a request response from Ligthing REPORT request in DevTools > Network.

For reference see also bugs:

Bug 1513551 Developer Tools - chrome://messender/content/messenger.xul - (Not Responding)

Bug 1515123 devtools network response from a Ligthting REPORT request loads very slowly (long delay) with UI freeze/not responding

Bug 1473163 devtools inspector not responding while loading external assets on slow connections

(In reply to Richard Leger from comment #26)

Hi MakeMyDay,

As you would see in my many comments above, the issue seems to lie in
Thunderbird/Lightning at the moment...

I suspect this bug 1502923 may have been caused by (or be consequence of)
changes implemented earlier this year as described in Bug 1439900 "Move
unifinder related functions to calUnifinderUtils.jsm". It could also well be
that the problem always existed but may have been emphasised somehow by
those changes...

Would you be able to help further to troubleshoot and have this issue sorted
somehow?

Sorry Richard, I have not too much time at the moment as you might have already noticed. You seem to have done an awesome job to debug the issue (though I have to admit I haven't read everything now).

If it would be related to bug 1439900, you would most probably see error messages in the error console. I assume the issue existed before but you didn't notice, since the vector of potentially relevant settings and conditions is quite hugh.

I will try to come back to this bug if I get some more time.

Hi,

(In reply to [:MakeMyDay] from comment #34)

Sorry Richard, I have not too much time at the moment as you might have already noticed.

I understand you may not have time yourself, but could you have someone else from the TB Dev team to look into it urgently?
This is a critical bug that impact performance of TB drastically. It has been 4 months that the issue has been reported and no-one seems to have look into it so far from the TB dev team.

Because of this bug, some TB users have to wait from 5mn to 20mn before being able to use TB when they open it. The current possible work around cannot be considered as a solution and is not workable for most users as it needs to be set prior they close TB which most users forgot to do, causing them headache next time they open TB.

When issue occurs, it is impossible to use TB or access even emails!!!

This should be set as critical bug and be sorted out urgently by the next stable release, it cannot wait any longer...

Could you or someone else have a look at it urgently?

Would really appreciate your help as this is as a huge impact for end-users at the moment and is damaging to TB image and reputation forcing people to consider using alternative software... as TB become not fit for purpose :-)

The impact on user base is growing as this bug has now made its way in the current stable release of TB!

Regards,
Richard

FYI, issue is still happening in Thunderbird 66.0b2 (32-bit).

TB remain unusable for between 5mn to 30mn when issue occurs (at startup).

Summary: Thunderbird processing / "Not Responding" with 63.0beta + Lightning 6.5 and 64.0b1 + Lightning 6.6 after upgrade - Performance issues → Thunderbird processing / "Not Responding" with 66.0b2 + Lightning 6.8 (and 64.0b1 + Lightning 6.6 and 63.0beta + Lightning 6.5) after upgrade - Performance issues
Version: unspecified → Lightning 6.8
Summary: Thunderbird processing / "Not Responding" with 66.0b2 + Lightning 6.8 (and 64.0b1 + Lightning 6.6 and 63.0beta + Lightning 6.5) after upgrade - Performance issues → Thunderbird processing / "Not Responding" with 66.0b2 + Lightning 6.8 (and 64.0b1 + Lightning 6.6 and 63.0beta + Lightning 6.5) after upgrade - Performance issues - Application unusable for 5mn to 30mn after startup
Severity: major → critical
Severity: critical → major
Summary: Thunderbird processing / "Not Responding" with 66.0b2 + Lightning 6.8 (and 64.0b1 + Lightning 6.6 and 63.0beta + Lightning 6.5) after upgrade - Performance issues - Application unusable for 5mn to 30mn after startup → Thunderbird processing / "Not Responding" 5 to 30 minutes at startup with 63.0beta + Lightning 6.5 and 64.0b1 + Lightning 6.6 after upgrade - Performance issues
Whiteboard: [regression:tb63?]
Version: Lightning 6.8 → unspecified

Please keep the bug specifications pointing to the earliest version in which the issue is found.

I had the same behavior (Thunderbird unusable and CPU at 100%) on Thunderbird 60.5.1 (latest debian stable) with Lightning 6.2.5. After setting the calendar to show only "Events in the Next 14 days" (as suggested in previous comments) the problem went away.

Maybe this guide should be updated with this workaround.

Wayne, I did left the earliest version mentioned when changing the title, my intention was to highlight in the title of the bug, the fact that it also affect 66.0b2 + Lightning 6.8.

I don't think the workaround can be considered as a solution as it does not always work... and the issue still cause major pain to end-users.

Why is this bug not looked at by someone?

All it would take to fix the issue is to prevent the view to re-order items EVERY time one is added in the list view.
Maybe just disable the ordering function for now... until long term fix is implemented (e.g run the sort function ONLY after the last item is added).

That shall not be that hard and long to quick fix for someone who is familiar with the code.

At the current speed, this bug would still be present in the next stable version of Thunderbird while it should be fixed by then.

It has been 4 months now this bug is present in TB for 6 version...

I understand your enthusiasm, unfortunately we haven't been able to find time for this bug. The sorting code has been there for more than 6 versions afaik, the bug you referenced just moved the code somewhere else.

This bug has accumulated a bunch of comments, and from what I understand the problem is sufficiently described. I think it would make sense to wait for someone to pick up this bug, they may have more specific questions.

I found what I believe is a simple fix for this bug.


In chrome://calendar/content/calendar-unifinder.js line 148 I would suggest to replace ...

unifinderTreeView.addItems(items.filter(filter.isItemInFilters, filter));

by

// Fix for bug Bug 1502923
unifinderTreeView.addItems(items.filter(filter.isItemInFilters, filter), true);


Notice the 'true' parameter added a the end of the addItems function...

Explanation...

Looking at the stack trace:
sortItems (calUnifinderUtils.jsm#108)
sortItems (calendar-unifinder.js#548)
addItems (calendar-unifinder.js#449)
addItemToTree (calendar-unifinder.js#148)
onAddItem (calendar-unifinder.js#87)
notify (calDataUtils.jsm#40)
notify (calDataUtils.jsm#68)
onAddItem (calCompositeCalendar.js#49)
notify (calDataUtils.jsm#40)
notify (calDataUtils.jsm#68)
onAddItem (calDavCalendar.js#3001)
notify (calDataUtils.jsm#40)
notify (calDataUtils.jsm#68)
adoptItem (calMemoryCalendar.js#164)
addTargetCalendarItem (calDavCalendar.js#1139)
endElement (calDavRequestHandlers.js#906)
onDataAvailable (calDavRequestHandlers.js#808)

You can see clearly that every time data is available and added to the tree, the sort function is called in an ever growing tree as data increased (added)... so the same sorting function occurs repetitively over an increasing amount of data so sorting become more and more lengthy time/processing consuming as data is added causing delay for the function/script to complete during which TB become "unresponsive".

The fix proposed above, would prevent the tree to be sorted every time data is added to the tree I believe.
Ideally the sorting shall occurs only once at the end when all data would have been added to the tree.

The addItems (calendar-unifinder.js#449) function accept an additional parameter called aDontSort which shall be used and set to 'true' to prevent sorting of the tree every time item(s) are appended.
(Is it normal that JavaScript does not raise an error indicating the parameter aDontSort is missing in the calling function?)

/**
* Add an item to the unifinder tree.
*
* @param aItemArray An array of items to add.
* @param aDontSort If true, the items will only be appended.
*/
addItems: function(aItemArray, aDontSort) {
this.eventArray = this.eventArray.concat(aItemArray);
let newCount = (this.eventArray.length - aItemArray.length - 1);
this.tree.rowCountChanged(newCount, aItemArray.length);

    if (aDontSort) {
        this.calculateIndexMap();
    } else {
        this.sortItems();
    }
},

As you can see I keep hope :-) to fix that real bugger for end-users and help remove a damaging image of TB at the moment performing poorly :-(

I downloaded the source file from here:
https://dxr.mozilla.org/comm-central/source/comm/calendar/base/content/calendar-unifinder.js#148

Modified the file as I suggested and created the patch attached for testing and review at your end...

I don't think I have any way to test it directly unless it is published in one of the next beta version... It does not seem possible to modify code on the fly via the Debugger to see how the proposed fix behave...

Re-submitted as a patch :-)
Sorry forgot to tick option previously...

Thanks for keeping persistent on this issue, Richard. I haven't checked in detail, but it looks this fix would prevent also sorting if you create a new event via UI that should be in range. If sorting is confirmed to be the issue here, one would probably need to listen to running sync operations, supress sorting on adding while sync is in opertaion and refresh the sorting once all pending a<nc operations are completed.

If you want to test your changes without building yourself, you could copy the xpi from YourTB-installation/distribution/extensions, unzip it, do your modification, zip it again, rename it to xpi and overinstall it using the TB addons manager.

Disclaimer: If you do this with your production profile, make sure you have backed up your TB profile before and return to the regular instalation straight after doing your checks. To get back to the ditributed xpi, uninstall the modified version via addons manager, and follow these instructions [1].

[1] https://support.mozilla.org/en-US/kb/calendar-updates-issues-thunderbird#w_lightning-disappears-after-a-thunderbird-update-release-and-beta-versions

Flags: needinfo?(makemyday)

That would greatly help, thank for tips...

As an additional note, coding wise, aDontSort parameter shall never be ommited in calling function because it does not have a default value set (as per addItems function definition), so in theory this means it is not designed to be ommited. It needs to be set explicitly to true or false.

(In reply to [:MakeMyDay] from comment #44)

Thanks for keeping persistent on this issue, Richard. I haven't checked in detail, but it looks this fix would prevent also sorting if you create a new event via UI that should be in range. If sorting is confirmed to be the issue here, one would probably need to listen to running sync operations, suppress sorting on adding while sync is in operation and refresh the sorting once all pending a<nc operations are completed.

Sorting may not be the only issue (I'll come back to that later once the sorting issue is sorted) but yes distinction need to be made if item is added from end-user via the UI (one item at a time) which could cause sorting [function call like: unifinderTreeView.addItems(items.filter(filter.isItemInFilters, filter), false) ], and item(s) loaded via network streams (item loaded one by one or in batch in thousands) which should not trigger sorting until after each item is added [function call like: unifinderTreeView.addItems(items.filter(filter.isItemInFilters, filter), true) ] but only at the end when all items have been added [last function call like: unifinderTreeView.addItems(items.filter(filter.isItemInFilters, filter), false) ]

Currently... when items are added at startup from network stream here is what happens...

  • 1st item added ==> trigger sort on one item
  • 2nd item added ==> trigger sort on two items
  • 3rd item added ==> trigger sort on three items

You can clearly see that when adding thousands of items how this processing would be time/processing consuming and inefficient by designed.

What should happen instead...

  • 1st item added
  • 2nd item added
  • 3rd item added
    (...)
  • last item added ==> trigger sort on all items

In this case if there is one or thousands of items does not make difference as adding processing is quite fast... and sorting would only occurs once on the total of items, which is also fast (equivalent in speed to end-user clicking on the header column to sort in the UI).

If you want to test your changes without building yourself, you could copy the xpi from YourTB-installation/distribution/extensions, unzip it, do your modification, zip it again, rename it to xpi and overinstall it using the TB addons manager.

Thanks for the tip, I will give it a try when I get a spare chance. It is good to know I have way to test and fix. Unfortunately my debugger in b3 does not seems to properly stop correctly at my previous breakpoints at the moment making it more difficult to debug but haven't got a change to look into that yet! I suspect introduction of workers to handle the network stream and add items to tree do no longer trigger some of my previously set breakpoints.

Disclaimer: If you do this with your production profile, make sure you have backed up your TB profile before and return to the regular instalation straight after doing your checks. To get back to the ditributed xpi, uninstall the modified version via addons manager, and follow these instructions [1].
[1] https://support.mozilla.org/en-US/kb/calendar-updates-issues-thunderbird#w_lightning-disappears-after-a-thunderbird-update-release-and-beta-versions

I have setup a dedicated test profile for testing it so far :-)
Thank for the tip, I'll revert installation right after...

Flags: needinfo?(richard.leger)

Also suggestion for later and long term to rename aDontSort into aSort and invert value/meaning. So value true would mean "sort after adding", false would mean "do not sort after adding"... this would be more human friendly and self understanting than current paradigm that use "Not" in parameter name with opposite value to achieve the same... which is not natural meaning and "force on the brain"

If you want to ommit parameters in calling fuction it could be defined with a default value in function definition to clarify what default behaviour processing is expected... I would suggest not to sort by default unless explicitly requested (when all items have been added or user request only) eg:

addItems: function(aItemArray, aSort=false) {
(...)
if (aSort) {
this.sortItems();
} else {
this.calculateIndexMap();
}
},

Instead of...

addItems: function(aItemArray, aDontSort) {
(...)

if (aDontSort) {
    this.calculateIndexMap();
} else {
    this.sortItems();
}

},j

Note the code change within the function to accomodate the change in parameter name and value.

This also mean parameter shall be updated accordingly where ever there is a function call to addItems function...

Just a suggestion...

Progress so far... managed to...

  • Fixed my debugger issues
    Turned out I had to update my breakpoints as source code had changed my previously set breakpoints were no longer valid (offset by few lines up or down therefore not being triggered in some cases) all ok after moving them back (clear old line - set on new line) to the right line of code.

  • Fixed my custom .xpi add-on that would not installed on TB (saying it was incompatible) at first
    That is because I was zipping back the folder and not only its content (this helped me to figure it out https://stackoverflow.com/questions/19240653/how-to-create-xpi-file-with-7zip)

  • Setup environment with custom extension, for further testing/troubleshooting to debug issue at will
    Created a separate TB profile with no email and just one network calendar with thousands items. This allow to setup custom add-on just for that profile (add-on are indeed installed in each profile even the default TB.lightning one), this way my production profile is unaffected while I can test, troubleshoot and debug possible fix(es) at will using Task Manager monitoring and TB Debugger.

Findings so far...

  • TB beta3 seems better at "multi-processing" so the "not responding" warning appears less at first but UI do slow down or remain mostly unavailable (or available but with major delay) which may now allow to apply the workaround quickly after startup when issue occurs (if you are patient to click and wait long seconds before action occurs in a UI) in case it was not applied before closing the application previously... but still very slow to load...

  • The previous patch provided (patch.v01) fixes only partially the issue observed but not entirely.

  • Some gaps remain between the network requests (pics)... but I may found other root cause (too many repetitive calls to calculateIndexMap() same issue as for sorting) and solution, preliminary tests are promising as I managed to remove gap (delay) which cut loading time by half from the first patch... need to test further...

  • While troubleshooting, I found an additional issue, I did observed before but haven't got a chance to look at, where when all network data is loaded in calendar view (fully loaded) all items suddenly disappear from calendar view (seems happening during or just before final sorting of the unifinder tree view) before they all re-appear again at once...

All this means the patch might be a bit more consequent than first foreseen... and may require extra troubleshooting/testing...

Flags: needinfo?(richard.leger)

FYI only, no need to review yet... a patch.v03 is coming up...

While waiting here are some statistics...

Network calendar (with ~4000 items) loading time comparison [TB 66.0b3 (32-bit) on Win10Pro64]:

  • with no patch (current upstream code)
    ==> ~2600+ seconds [43+ minutes] to load calendar in full - TB Not Responding most of the time during loading

  • with patch.v01 (no SortItems upon add)
    ==> ~170 seconds [2 minutes 50] (x15 time faster than upstream) to load calendar in full
    --> Preventing sorting [sortItems()] on each item added (do sort only after last item is added) fixes the "Not Responding" UI and the Warning Unresponsibe script: resource://calendar/modules/utils/calUnifinderUtils.jsm:110

  • with patch.v02 (no CalculateIndexMap upon add on top of no SortItems upon add - patch.v01)
    ==> ~68 seconds [1 minutes 8] (x38 time faster than upstream) to load calendar in full
    --> Preventing re-indexing [calculateIndexMap()] on each item added (do indexing once after sorting by call from sort function) fixes remaining gaps/delay between network requests (pics) and allow to get stream of data much faster...

What a difference!!!

Find attached patch.V03 replacing the two previous patches submitted.

Please review and include in next beta and stable as well as backport in current stable via a fix update in measure of possible.

Let me know what you think. I tried to keep it as simple as possible.

  • with patch.v03 (sortItems only upon onLoad or setItems - will trigger calculateIndexMap() in the process)
    ==> ~68 seconds [1 minutes 8] (x38 time faster than upstream) to load calendar in full
    --> Preventing sorting and re-indexing [calculateIndexMap()] on each addItem (do sorting only upon onLoad or setItems - indexing being triggered automatically once after sorting - function call in sortItems) fixes remaining gaps/delay between network requests (pics) and allow to get stream of data much faster...
    ==> No more gap/delay between network requests

Note1: Next step would be to look are reducing the number of network request by increasing batch size (or have it automatically adjusted depending on number of items and bandwidth speed) but that would be dealt with in a separate bug...

Note2: During troubleshooting of this bug, I observed that when Find Events > All Events is set, when clicking on a recurring event in the calendar view (e.g Month view), it does not appear selected in the Find Events list view. While it works find when other Find Events views (select options) are selected e.g Find Events > Events in Current View (or others). This issue only happens for recurring events. This issue is also happening in current stable version of TB so not linked to patch.V03 proposed for review.
==> shall be dealt with in a separate bug... later on...

FYI, patch.v03 proposed in previous message was generated from upstream code as available in extension Lighting 6.9 (10 April 2019) {e2fda1a4-762b-4020-b5ad-a41df1933103}.xpi copied from Thunderbird 67.0b1 (32-bit).

Comment on attachment 9057405 [details] [diff] [review]
bug1502923.patch.v03.sort.onload.txt

Geoff, can you give these patches a review?
Flags: needinfo?(philipp)
Attachment #9057405 - Flags: review?(geoff)
Assignee: nobody → richard.leger
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Comment on attachment 9057405 [details] [diff] [review]
bug1502923.patch.v03.sort.onload.txt

Review of attachment 9057405 [details] [diff] [review]:
-----------------------------------------------------------------

This seems fine at face value (minus the formatting nits) although I've only done some light testing. I'd appreciate more lines of context (use -U8 when calling diff) but as it's a first patch I'll let you off. ;-)

Richard, is this your final version? If it is, I'll fix it so that it applies cleanly to our source code and get it landed.

::: chrome/calendar/content/calendar/calendar-unifinder.js
@@ +82,5 @@
>              // again.
>              gUnifinderNeedsRefresh = true;
>              unifinderTreeView.clearItems();
> +        }else{
> +          unifinderTreeView.sortItems(); // sortItems() will trigger calculateIndexMap()

Spaces around the else, and two more spaces of indent for the next line.
Attachment #9057405 - Flags: review?(geoff) → review+

(In reply to Geoff Lankow (:darktrojan) from comment #53)

This seems fine at face value (minus the formatting nits) although I've only
done some light testing. I'd appreciate more lines of context (use -U8 when
calling diff) but as it's a first patch I'll let you off. ;-)

Considering all the work done and time spent to fix this issue I would not expect less, but I took note of your relevant comments ;-) A "thank you" would have been an appreciated plus!

Please fix formatting as you see fit to include patch... as you see fit...

OS: Windows 10 → All
Summary: Thunderbird processing / "Not Responding" 5 to 30 minutes at startup with 63.0beta + Lightning 6.5 and 64.0b1 + Lightning 6.6 after upgrade - Performance issues → Thunderbird processing / "Not Responding" 5 to 30 minutes at startup with 63.0beta + Lightning 6.5 and 64.0b1 + Lightning 6.6 after upgrade, and also "Unresponsive Script" - Performance issues

(In reply to Richard Leger from comment #50)

Note2: During troubleshooting of this bug, I observed that when Find Events > All Events is set, when clicking on a recurring event in the calendar view (e.g Month view), it does not appear selected in the Find Events list view. While it works find when other Find Events views (select options) are selected e.g Find Events > Events in Current View (or others). This issue only happens for recurring events. This issue is also happening in current stable version of TB so not linked to patch.V03 proposed for review.
==> shall be dealt with in a separate bug... later on...

FYI, Note2 issue has now been reported as a separate Bug 1543716 "When clicking on a recurring event in the calendar view (e.g Month view), event does not appear selected in the Find Events > All Events list view"

Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/37e43d761fe9
Delay calling unifinderTreeView.sortItems until load event; r=darktrojan

Status: ASSIGNED → RESOLVED
Closed: 6 months ago
Resolution: --- → FIXED

Congratulations on your first patch landing. And thank you.

Target Milestone: --- → 7.0
Attachment #9057786 - Flags: review+
Attachment #9057786 - Flags: approval-calendar-esr?(philipp)
Attachment #9057786 - Flags: approval-calendar-beta?(philipp)
Comment on attachment 9057786 [details] [diff] [review]
1502923-unifinder-sortitems.diff

Review of attachment 9057786 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks Richard, I appreciate the work you've put into this!

Geoff, I'm suggesting we give this a try on beta before we send it down to ESR since we don't have any tests for the unifinder. Or do you think it is straightforward enough to uplift now?
Attachment #9057786 - Flags: approval-calendar-beta?(philipp) → approval-calendar-beta+

(In reply to Richard Leger from comment #50)

Note1: Next step would be to look are reducing the number of network request by increasing batch size (or have it automatically adjusted depending on number of items and bandwidth speed) but that would be dealt with in a separate bug...

FYI, Note1 issue has now been reported as a separate Bug 1543953 "Lightning network REPORT request receiving time too high (0.5 to 3 seconds per request) - Loading one ~4000 events network calendar takes more than 1mn - 46 network requests required (isn't it too much?)"

Comment on attachment 9057786 [details] [diff] [review]
1502923-unifinder-sortitems.diff

Review of attachment 9057786 [details] [diff] [review]:
-----------------------------------------------------------------

Geoff, please re-request ESR when you've replied to my last comment.
Attachment #9057786 - Flags: approval-calendar-esr?(philipp)

Richard, congrats on the persistence and results. So this is a regression that started in version 60? Or, it existed for a long time but something in newer Thunderbird/calendar just made the effects worse? And do we know why?

Definitely good enough reason to do a beta.

Blocks: 441710
Flags: needinfo?(richard.leger)
See Also: → 1543953
Summary: Thunderbird processing / "Not Responding" 5 to 30 minutes at startup with 63.0beta + Lightning 6.5 and 64.0b1 + Lightning 6.6 after upgrade, and also "Unresponsive Script" - Performance issues → Thunderbird processing / "Not Responding" 5 to 30 minutes at startup with 63.0beta + Lightning 6.5 and 64.0b1 + Lightning 6.6 after upgrade, and also "Unresponsive Script" - Performance issues in unifinder.js

(In reply to Wayne Mery (:wsmwk) from comment #62)

So this is a regression that started in version 60?

I don't think this is a regression. I believe that is a bug that "always" been there for years and that was affecting mostly people like me that have thousands of objects in their calendars and that connect to multiple network calendars. We most end-up having to keep most calendar disabled but one or two max to avoid most of the delay startup issues. And then enabling calendar "on demand"... and disable it afterwards to avoid performance issues...

With Lightning extension enabled, TB has ALWAYS been slow to start... this slowness was reported way back by multiple users... I can detect some of the code removed by my patch, already being present in 2010 code repository... e.g https://hg.mozilla.org/comm-central/rev/265623017726 so I don't see when exactly it was introduced...

It just became worse later on with the arrival of 60 stable version I believe or some previous version... not exactly sure which one...

I remember discussions about having Lightning added by default with TB as "core" feature where people argue on one hand that would slow down TB and would rather have it left separate as an extension to add by end-users, or on the other hand to have it in TB by default so issues could be detected/fixed more quickly as more testing before delivery would occur (also keeping TB version in line with Lightning version to avoid incompatibilities). Really an old debate this one probably around TB version 24 from memory :-) The later happened Lightning was integrated by default, but the performance issues with it never been really addressed as it should have... from a user point of view.

Or, it existed for a long time but something in newer Thunderbird/calendar just made the effects worse?

I believe this to be the case... and it has gone worth about two years ago as a rough estimate... when discussion raised about Thunderbird future and possible handover for maintenance to another organisation but it never happened while this lengthy situation had a drastic impact on TB performance as maintenance reduced at the time to sort pressing issue like crashes :-)

Hard to tell which of Thunderbird or Calendar introduced code caused the worsening effect... could be one or the other or both combined... but as I say, the issue was a bug by design that was lying there for a very long time (repeating sorting on every added items in an ever increasing dataset!) both team TB Core and Lightning thinking issue was not in their code possibly :-)

I think Lightning performance (and TB) in general shall be addressed in priority... but no one seems to have time to look into it, at least in Lightning... this bug is a clear example of the situation... despite TB hiring new paid engineers recently, probably working on more pressing/fancy issues :-)

More likely also because it is a boring and time consuming job to track-down performance issues, as opposed to adding new features which may be more interesting from a development point of view :-) and also lack of time/resources/interest/priority on the Lightning performance issues... it is also quite a complex machinery which is not easy to fix...

And do we know why?

I would not really know nor be able to tell... Probably since efforts started to make Thunderbird more "multi-processing" by "time sharing" the processing power of Thunderbird between various part of the app features... causing the sortItems() function to take much much more time to complete than previously... as it pause and resume as processing power is pre-empted and become available to it again perhaps... or merge of new feature from Firefox/TB core code... those are pure speculations but may give a hint on what could have possibly be the aggravating factor in this bug...

Thank you to all the developers that works on TB and Lightning, I use them both everyday and I really appreciate all the work you have done and continue to do to help improve, sort issues and add new features!

Flags: needinfo?(richard.leger)

(In reply to Richard Leger from comment #63)

I believe that is a bug that "always" been there for years

Possibly present since 11 years (day 1?) according to this bug 441710 report...

Could this change have regressed Bug 1545042?

Not impossible but would need to be demonstrated... It would be good if anyone else out their is able to test and confirm...

But during my testing with TB 67.0b1, LT 6.9, patch.v03 I haven't observed such issues... though I don't do drag and drop events... which is managed by the calendar UI object, not the unifinder.js which was patch here... but add/edit/delete worked fine at my end via unifinder or calendar UI objects... the sortItems (and calculateIndexMap()) would in any case be triggered via the unifinder setItems function which is used/triggered I believe during save following editing...

The drag and drop would occur on the calendar UI object interface so maybe it is not doing the right thing in version 7.x? e.g not calling the right functions any longer perhaps?

Other changes in LT 7.x could be the route cause of the issue as well potentially...

(In reply to Richard Leger from comment #66)

But during my testing with TB 67.0b1, LT 6.9, patch.v03 I haven't observed such issues...

FYI I have just tested again following instructions in Bug 1545042 and cannot reproduce the issue with TB 67.0b1 and LT 6.9 manually patched with my submitted patch.v03...

Depends on: 1545042
No longer depends on: 1545042
Regressions: 1545042

You're sure you want this in the beta given the regression it caused. Skipping this for now.

Flags: needinfo?(philipp)
Flags: needinfo?(geoff)
Comment on attachment 9057786 [details] [diff] [review]
1502923-unifinder-sortitems.diff

I'm thinking we should back this out given the regression, unless a fix for it can be found soon.
Flags: needinfo?(philipp)
Attachment #9057786 - Flags: approval-calendar-beta+
Flags: needinfo?(geoff)

Well instead of reverting back (easy non-solution), if someone encountering the regression could try to look at why it is happening and identify what may be the root cause and identify a possible fix (painful solution) that would be a greater option.

Because removing patch means intentionally reverting back to awful TB performance that have been plaguing end-users for about 11 years! Now may be a right moment to look into it and have it fixed...

Perhaps could be an issue related to the offline cache feature... as it is activated by default but I don't use it... (disable it manually)... just a hint...

We have roughly two weeks to solve bug 1545042 before needing to back this out when beta 68 rolls out

If the patch has a major known regression and a fix cannot be found within a few days I'd prefer to back it out. We can re-land when the regression is resolved. Jörg, can you handle the backout?

Flags: needinfo?(jorgk)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Sure, it's on my radar, but I'll leave the NI for now. Certainly it should be backed out before we go to TB 68 beta, see comment #71. Best to leave it FIXED for now and make it REOPENED when the backout actually happens.

Personally I'm puzzled by this: It's a pretty small patch (with apparently a big positive effect) and a very clear regression. Shouldn't it be possible to fix the regression? Generally the Mozilla approach is not to back things out, but to properly fix regressions. I don't always agree with that, but here we should make some effort.

Status: REOPENED → RESOLVED
Closed: 6 months ago6 months ago
Resolution: --- → FIXED

I suggested a possible test in Bug 1545042 Comment 16 that may help trying to narrow down the issue, if someone running a daily build would be able to give it a try... this could help identify if regression is due to additional code introduced in TB or LT since last beta... conflicting with the patch.v03 applied.

There are actually two regressions:

  • Bug 1545042 (more likely a view refresh issue - fixed by switching between two unifinder views - selection box)
  • Performance issues as per statistics test results available in Bug 1545042 Comment 21

I am still convinced that none of those regressions are due to patch.v03 which in any case bring much improvements in term of performance compared without it.

The regressions observed while possibly triggered somehow by patch.v03 are not due to it, but more likely due to the numerous other changes that occurred in Lightning as highlighted in Bug 1545042 Comment 20 (diff available between LT 6.9 and LT 7.0a1 as available from 20190412 which is when issue started to be observed).

Backed out on request of Philipp, the Calendar module owner.
https://hg.mozilla.org/comm-central/rev/91691e44548ba1784aa7b5c37bc1704078785cc7
Backed out changeset 37e43d761fe9 (bug 1502923) for causing problems in unifinder (bug 1545042) a=backout DONTBUILD

Sadly no one stepped up to the challenge to fix bug 1545042. Before backing this out I have verified manually that the backout fixes bug 1545042.

Status: RESOLVED → REOPENED
Flags: needinfo?(jorgk)
Resolution: FIXED → ---
Target Milestone: 7.0 → ---

So do you guys think this issue is at all sounding like what's happening in bug 1509739 or bug 1522990? The part about disabling Lightning and everything works seems quite similar. The fix seems to be to start use a new profile.

Richard, is this something you'd be willing to try and confirm as per what is stated in https://bugzilla.mozilla.org/show_bug.cgi?id=1509739#c4 and https://bugzilla.mozilla.org/show_bug.cgi?id=1513202#c6 ?

Flags: needinfo?(richard.leger)

(In reply to Arthur K. from comment #77)

The fix seems to be to start use a new profile.

All the findings in this bug report shall clearly indicate that the fix has nothing to do with using a new profile... a patch was issued to fix it but a regression raised in 68.x branch due to too many other changes in the code... so patch was backed out... the performance issues lies within Lightning code... currently not optimised for performance especially when use with large calendars (with ~4000 items or more) and multiple of them!!!

Until someone from the Lightning development teams get time to look seriously into it, and make it a priority, it can be fixed, but until then... I am afraid there is nothing else that can be done about it... we can only wait for someone to look into it... and have it fixed... I tried... it failed miserably because without time, support and effort from the development team to tackle performance issue in Lightning there is nothing we can do about it to have it fixed :-)

Sad but true.

Flags: needinfo?(richard.leger)

I was hoping that there might have been some correlation between this bug and the others but oh well. Yes, I can see that with large amounts of cal items, it would probably bog down as it's going out and parsing or retrieving those multitude of events. And this is with future cal items, not those those that have passed right?

One question: Google's calendars are set up such that when an event passes, they'll still stay in your calendar(s) ad infinitum. Is there a means or mechanism in TB by which old cal events which are or would likely be no longer needed could be purged out in the same way that the old POP3 setting to "remove mail from server after "x" days" could be implemented? Meaning, if an event is "x" days old, just remove it from the calendar so it's not just sitting there rotting uselessly? I would think such a setting to purge events which have passed but are still being stored might help a bit in reducing the overall events needing to be parsed?

(In reply to Arthur K. from comment #77)

So do you guys think this issue is at all sounding like what's happening in bug 1509739 or bug 1522990? The part about disabling Lightning and everything works seems quite similar.

bug 1509739 is likely to be a duplicate of this bug
bug 1522990 is unlikely to be a duplicate of this bug though may be related somehow...

The fix seems to be to start use a new profile.

Not the fix. Better try to make sure that in the Calendar view > Find Event area you choose something like "Currently Selected Day" instead of "All Events" in the selection box prior restarting TB. This is not a fix but a workaround to minimise the issue. Long term fix still to be determined and implemented...

(In reply to Arthur K. from comment #79)

I was hoping that there might have been some correlation between this bug and the others but oh well.

There might to some extent.

Yes, I can see that with large amounts of cal items, it would probably bog down as it's going out and parsing or retrieving those multitude of events. And this is with future cal items, not those those that have passed right?

Currently Lightning load all items past, present, future... for all calendars... I would expect to be relevant for end-users to have access to all items of all calendars past, present, future as a default feature, that seems to be the correct things to do... now how and when such items are loaded in TB and the views is another matter :-)

A full calendar of ~4000 items takes about 1sec to load in one go on an ADSL connection which is nothing really via web browser... the problem is not the number of items in the calendar... the issue is linked to flaws in Lightning code and design that is not optimised to handle multiple CalDav calendars containing thousands of items each, coupled with this bug in the code that cause exponentials sorting of all items for each items added to TB that still need fixing?!

One question: Google's calendars are set up such that when an event passes, they'll still stay in your calendar(s) ad infinitum.

Well that is the expected behaviour of a calendar... to allow you to keep historic of events by default.
Some calendar system may allow to "program" auto-deletion of items (behind time range, within a specific type, etc...) but CalDav protocol and Thunderbird do not allow to do so to my knowledge.

While this could be considered as a new feature, it shall not be considered as a solution for this bug.

It would be much better to fix and optimise Lightning code to allow handling of multiple large CalDav calendars with thousand of items each... which is (and should be considered as) a common use case... that Lightning code shall be tested regularly against, in term of performance...

Is there a means or mechanism in TB by which old cal events which are or would likely be no longer needed could be purged out in the same way that the old POP3 setting to "remove mail from server after "x" days" could be implemented?

I am not aware of any such mechanism in TB... or in CalDav protocol...

Meaning, if an event is "x" days old, just remove it from the calendar so it's not just sitting there rotting uselessly?

You can already delete one or more of such items manually quite easily already...

Auto-deleting old items could be a feature handy for some users, but majority of end-users may want to keep historic of ALL their events and they should be able to do so. Again deleting items shall not be considered as a solution to the problem here... which is a performance issue, though it could be considered as an added feature but not directly linked to this bug or related.

Lightning should be perfectly capable of handling thousand of events (TB does with emails)... most today's computer resources would allow it to do so without a itch...

But code (re)design review and refactoring would be required to handle event items in batch instead of individually and initials setup of calendar loading could be processed in a different way... to optimise and improve performance and loading speed...

I would think such a setting to purge events which have passed but are still being stored might help a bit in reducing the overall events needing to be parsed?

To my point of view, the problem is not the quantity of event being parsed that is the root cause of current performance issue, though it contribute to it, TB for example is able to handle thousand of emails via IMAP or other protocol so Lightning shall be able to handle thousands of event items without issue...

Lightning current design is buggy, and is just not optimised for handling high volume of events in batch or for optimised performance... especially in the various views... also some bug in the code may increase performance issues at the moment... I suspect also Lightning code is not tested/optimised to run with multiple CalDav calendars with thousand of items each on low speed connection (e.g ADSL).

Until the Lightning dev team review and tackle this bug as well as long lasting performance issues, there is not much end-users (as I am) can do to fix them but to use workaround as available to them where applicable... or wait for someone some day to look at it and fix it. The sooner the better for all of us, but it has been going on for years already :-)

I tried via this bug, but it did not workout...

Also, this Bug 1502923 is about Lightning bug that cause exponential sorting of all items for each item added at startup (first loading)... but there are other reasons of poor startup performance such as Bug 543953 "Lightning network REPORT request receiving time too high (0.5 to 3 seconds per request) - Loading one ~4000 events network calendar takes more than 1 minute - 46 network requests required (isn't it too much?)" ... this is only another example among possible others...

Those are just the tip of the iceberg... hopefully all will be uncovered and tackled one day... starting with those two...

(In reply to Richard Leger from comment #82)

such as Bug 543953

Sorry I made a typo in my previous message...

I meant bug Bug 1543953 "Lightning network REPORT request receiving time too high (0.5 to 3 seconds per request) - Loading one ~4000 events network calendar takes more than 1 minute - 46 network requests required (isn't it too much?)"

This bug may be blocked by Bug 1567055 "Add items to the unifinder list more efficiently"

See Also: → 1567055

Updated Ligthning CalDAV performance test results are available in Bug 1567055 Comment 12.
It includes test results for TB Nightly 70.0a1 (2019-07-26 08-35-20) as well comparison to watch progression of performance improvements.

(In reply to Richard Leger from comment #85)

Updated Ligthning CalDAV performance test results are available in Bug 1567055 Comment 12.
It includes test results for TB Nightly 70.0a1 (2019-07-26 08-35-20) as well comparison to watch progression of performance improvements.

Copying test results here for the record. This may allow monitoring future progress within this bug as Bug 1567055 is now closed.
For more details on context and setup that lead to those results, consult Bug 1567055 Comment 12.

Loading performance of one Caldav calendar with ~4000 items over ADSL connection:

Without sorting - Find Event View (unifinder) -> keep selectedColumn=false (default - no particular sorting)

# TB Version LoadingTime*(Disappearance**) DevTools > Networking
01 60.8.0 (32-bit) ESR ~6m23s (50s) 48 requests - 7.98MB transferred - Finish: 7m65s
05 66.0b3 (32-bit) Beta with patch.v3*** Not available Not available
02 68.0b5 (32-bit) Beta ~3m34s (14s) 45 requests - 7.97MB transferred - 3m71s
07 70.0a1 (2019-07-26 08-35-20) (32-bit)**** ~4m59s (15s) 45 requests - 7.97MB transferred - 4m86s

With sorting - Find Event View (unifinder) -> keep selectedColumn=true (Title column selected by end-user for sorting)

# TB Version LoadingTime*(Disappearance**) DevTools > Networking
03 60.8.0 (32-bit) ESR more than 100m (unknown) more than 61 requests - more than 12.33MB transferred - Finish more than: 101m76s
06 66.0b3 (32-bit) Beta with patch.v3*** ~68s (unknown) (unknown)
04 68.0b5 (32-bit) Beta ~47m48s (15s) 56 requests - 10.33MB transferred - 47m02s
08 70.0a1 (2019-07-26 08-35-20) (32-bit)**** ~8m58s (1m03s) 45 requests - 7.97MB transferred - 7m79s

(In reply to Richard Leger from comment #86)

(In reply to Richard Leger from comment #85)

Updated Ligthning CalDAV performance test results are available in Bug 1567055 Comment 12.
It includes test results for TB Nightly 70.0a1 (2019-07-26 08-35-20) as well comparison to watch progression of performance improvements.

Copying test results here for the record. This may allow monitoring future progress within this bug as Bug 1567055 is now closed.
For more details on context and setup that lead to those results, consult Bug 1567055 Comment 12.

Loading performance of one Caldav calendar with ~4000 items over ADSL connection:

Without sorting - Find Event View (unifinder) -> keep selectedColumn=false (default - no particular sorting)

# TB Version LoadingTime*(Disappearance**) DevTools > Networking
01 60.8.0 (32-bit) ESR ~6m23s (50s) 48 requests - 7.98MB transferred - Finish: 7m65s
05 66.0b3 (32-bit) Beta with patch.v3*** Not available Not available
02 68.0b5 (32-bit) Beta ~3m34s (14s) 45 requests - 7.97MB transferred - 3m71s
07 70.0a1 (2019-07-26 08-35-20) (32-bit)**** ~4m59s (15s) 45 requests - 7.97MB transferred - 4m86s

With sorting - Find Event View (unifinder) -> keep selectedColumn=true (Title column selected by end-user for sorting)

# TB Version LoadingTime*(Disappearance**) DevTools > Networking
03 60.8.0 (32-bit) ESR more than 100m (unknown) more than 61 requests - more than 12.33MB transferred - Finish more than: 101m76s
06 66.0b3 (32-bit) Beta with patch.v3*** ~68s (unknown) (unknown)
04 68.0b5 (32-bit) Beta ~47m48s (15s) 56 requests - 10.33MB transferred - 47m02s
08 70.0a1 (2019-07-26 08-35-20) (32-bit)**** ~8m58s (1m03s) 45 requests - 7.97MB transferred - 7m79s

Not sure if it would make much difference but could you also test with 69.0b?

(In reply to Arthur K. from comment #87)

Not sure if it would make much difference but could you also test with 69.0b?

Here are additional tests and results...

ADSL - All Events - No sorting

# TB Version LoadingTime*(Disappearance**) DevTools > Networking
09 69.0b1 (32-bit) ~2m59s (10s) 60 requests - 8.46MB - 3m07s

ADSL - All Events - Sorting by Title

# TB Version LoadingTime*(Disappearance**) DevTools > Networking
10 69.0b1 (32-bit) ~54m16s (14s) 61 requests - 10.29MB - 54mn32s

Note about test 10: Toward the end about 10 last minutes there seems to be a continuous network traffic between ~3.5MB with picks at 7.7MB some time to time as observed in Task Manager > Performance > Wifi graph

... and for the sake of comparison ...

Lease Line ~30Mbps - All Events - Sorting by Title

# TB Version LoadingTime*(Disappearance**) DevTools > Networking
11 70.0a1 (2019-07-26 08-35-20) (32-bit) ~8m54s (49s) 45 request - 7.97MB - 8m05s

Note about test 11: This last test just shows that higher network speed does not improve performance...

To the risk of repeating myself, my view on this performance issue exposed here remain that:

  • It is linked to the calendar unfinder additem function
  • It is already fixable "today" with a little bit of work and effort (I tried...)

The additem function is called for each item added to the unfinder tree therefore this function/definition shall be kept to the minimum, it should only be used for one thing: adding one item to the tree when it is called (or a batch of items if that is the future plan which may be even more efficient but I would keep that for later :-)

The rest such as sorting and map indexing shall ideally happen outside this function once all items would have been added or happens only once the last item is added (full tree loaded). In the eventuality items are added by batch, sorting and indexing shall happen once all batch have been added (loading of all items has completed)... I think that is the purpose of the onload function to detect such event...

If indexing map is still required within additem (it may be required as other view may use the same map to deal with events during loading perhaps not sure how the map works or what it is used for), then it should happen on the fly, without requiring parsing of the overall tree for each item added as done currently by calling this.calculateIndexMap(); within additem function. The sort function would in any case trigger the re-mapping following sorting... so that may just be a matter of putting the sorting function in the right place and to trigger it once all items would have been loaded (if a column is not selected if would not sort anyway, but would still trigger the re-mapping so it get covered). Yes items would be unsorted for the time of loading but really who cares as far as performance are there and items are sorted at the end of loading... In any case during loading it is not possible to interact with the calendar items currently... interface not responding... that is why it is essential it loads as fast as possible...

Sorting and mapping as currently done within additem function is the root cause of performance issue as highlighted by patch.v3 in this bug... simply because both of those actions parse the entire tree, and are done repeatedly for each item added (this to me is the design flow causing perf issue) so e.g 4000 time sorting and remapping of the entire tree (ever growing as items are added) for a calendar containing 4000 items (subject that All Events is selected in unifinder) and that is just assuming only one calendar is setup in TB! Which is the case for test results in previous comments... so imagine when you are connected to 3,4 or 5 calendars what happens!!!

Looking at Bug 1567055 and its related patch modifying the additem function, it does not seems that is the approach taken by developers sadly... meaning the performance issue exposed here would remain... for now... and next ESR... let see and wait on future progress... in fixing what is a design flaw causing huge performance issue...

The attached patch bug1502923.patch.v04.txt is similar to the previous one submitted but applicable to TB/Lightning version 69.0b1.

It bring a huge performance gain by dropping the calDAV calendar initial loading time from ~54m16s to ~1m26s in TB 69.0b1. See test result below...

ADSL - All Events - Sorting by Title [No sort & No index map in calendar-unifinder.js > additems function - just adding items :-)]

# TB Version LoadingTime*(Disappearance**) DevTools > Networking
17 69.0b1 (32-bit) with bug1502923.patch.v04 ~1m26s (13s) 45 request - 8MB transferred - Finish: 1m30s

Notes:

  • The list of items in unifinder appears sorted at the end of the loading as expected. Manual sorting takes only 1sec to 2sec in the UI. So doing it at the end of loading is not a huge issue considering it take more than a 1mn to load the calendar still...

  • No more gaps appear between network requests, the network graphs (task manager > performance > wifi) looks like this /\/\/\/\/\... (instead of /\_/\__/\___/\... without patch, native code) this means the speed of loading is now only limited by the network requests processing/speed...

  • May be worth checking that this patch do not cause regression as observed in 68 branch previously but I have not noticed any in my basic multiple tests (modify existing event, add, modify,remove,move events in unifinder and calendar week view) so far.
    -- If anyone bother to test this patch and has a reproducible use case of regression, please provide info on your current environment e.g your calendar and unifinder settings (caldav calendar option settings, selected view, selected column, etc...) and UI steps to reproduce the issue.

Is that not functionally the same change we already landed and backed out? It may solve your case but addItems is also called through other code paths.

I have a patch which achieves the same thing without breaking addItems.

This depends on bug 1571336, in the case of calDAV.

Attachment #9082937 - Flags: review?(paul)

Hi Geoff,

(In reply to Geoff Lankow (:darktrojan) from comment #92)

Created attachment 9082937 [details] [diff] [review]
1502923-unifinder-refresh-1.diff

This depends on bug 1571336, in the case of calDAV.

Which version of TB your patch can be applied to? 69.0b1?
Is it to be applied in addition of my patch, or instead?
Does it also require to apply patch as found in bug 1571336 as you seems to indicate it depends on it?

Cheers,

Flags: needinfo?(geoff)

Find attached TB 70.0a1(2019-07-30) > DevTools > Performance > Call Tree while loading CalDAV calendar with ~4000 items (about 25 network requests through out of 45) following publication of patch from Bug 1567055.

This highlight huge performance impact due to unifinder addItems function calls to:

  • calculateIndexMap
  • findIndex & map (required for splicing)

Tests

Performance tests below follow publication of patch from Bug 1567055, which applies to TB 70.0a1 (2019-07-30) (32-bit), to evaluate its performance impact...

Various patches (in attachment) were applied to run the tests to successively disable sortItems, calculateIndexMap, sortItems&calculateIndexMap (both), sortItems&calculateIndexMap&splicing (all three, just add items)... to highlight how performance progresses...

ADSL - All Events - Sorting by Title - with various patching tests

# TB Version LoadingTime(Disappearance) DevTools > Networking
12 70.0a1(2019-07-30) Native code ~7m67s (24s) 45 Requests - 7.99MB Transferred - Finish: 7mn44s
13 70.0a1(2019-07-30) patch.v05.noSortItems in addItems ~8m12s (20s) 45 Requests - 7.99MB Transferred - Finish: 7mn73s
14 70.0a1(2019-07-30) patch.v06.noCalculateIndexMap in addItems ~6m07s (24s) 52 Requests - 8.54MB Transferred - Finish: 5mn93s
15 70.0a1(2019-07-30) patch.v07.noSortItems&noCalculateIndexMap in addItems ~5m50s (20s) 45 Requests - 7.99MB Transferred - Finish: 5mn41s
16 70.0a1(2019-07-30) patch.v08.noSortItems&noCalculateIndexMap&noSplicing.justAddItems ~2m23s (25s) 48 Requests - 8.54MB Transferred - Finish: 2mn17s

Hi Geoff,

(In reply to Geoff Lankow (:darktrojan) from comment #91)

Is that not functionally the same change we already landed and backed out? It may solve your case but addItems is also called through other code paths.

Yes, it is the same patch but applied to a different version of TB.

I have a patch which achieves the same thing without breaking addItems.

Have you noticed the regression? I can't from my side somehow... I may not be doing what you are doing to see it, nor having the same settings as you perhaps... If you could precise how you get the regression, if you do, I could look into it to help sorting it out.

I suspect it is related to the mapIndex... but I believe this could be populated on the fly for the added item (I will come with a proposal later, shall I do it in Bug 1567055?)... without parsing the all try over and over again... each time to remap all items for each item added!

Clearly my patch and testings continue to show that from a performance point of view, the addItem function shall not make any calls to sortItems, calculateMapIndex nor any map of findIndex functions... as those have a huge impact on performance batch or no batch processing... I know you intend to add more than one item at a time in the feature via batch processing (as opposed to one-by-one item currently) which is a good thing, but that would not sort the all the performance issue observed, just reduce them slightly...

If addItems is "broken" somehow by my patch because called from somewhere else (that I may have not tested somehow), it may be worth to look into it considering the performance gain of my patch (only limited by the network requests traffic)... to analyse and find the root cause of regression... and fix it... don't you think?

This does no prevent to add batch processing feature, as you do, which would also improve the performance certainly, but it could be worked out both ways together... to fix all performance issues observed not just part of it...

I am trying to help here... let me know how I can do it better to help you... fix all performance issues... that I can see... to the benefit of everyone!

Cheers,

Flags: needinfo?(geoff)

(In reply to Richard Leger from comment #96)

I am trying to help here... let me know how I can do it better to help you... fix all performance issues... that I can see... to the benefit of everyone!

Thanks much for investing your time on this bug and your patches. Although I'm a user, not a dev, getting to the heart of the problem will certainly help users who've experienced this issue. I know TB doesn't have the manpower to fix everything like FF, every little bit helps.

Statistics - Number of times code blocks and functions are called in unifinder addItems function

FYI, find below some statistics following publication of patch from Bug 1567055 (https://hg.mozilla.org/comm-central/rev/3a098643e080), which applies to TB 70.0a1 (2019-07-30) (32-bit).

They were established, by applying bug1502923.stats.v01 (in attachment) to TB 70.0a1 (2019-07-30) (32-bit), to know how many time code blocks and functions are called within the unifinder addItems function when loading a calDAV calendar containing about ~4000 items...

This was done to get an idea:

  • how new code works currently...
  • how code design impact performance...

ADSL - All Events - No Sorting

addItems: 3837
    append: 3837
        sortItems: 0
        rowCountChanged: 3837
    newPosition (this.eventArray.map): 0
        forLoops (values.findIndex): 0
            push: 0
            splice: 0
    calculateIndexMap: 3837

ADSL - All Events - Sorting by Title

addItems: 3937
    append: 1
        sortItems: 1
        rowCountChanged: 0
    newPosition (this.eventArray.map): 3936
        forLoops (values.findIndex): 3936
            push: 3
            splice: 3933
    calculateIndexMap: 3937

Notes

About addItems

  • Currently items are still added one-by-one for now. So we get about ~4000 calls to addItems function.

About sortItems

  • sortItems is called only once (calendar-unifinder.js:456) when addItems is first called and only if selectedColumn is enabled, but there are no items in the unifinder view at this point in time, so it appears useless to call it, unnecessary call?

  • Especially considering sortItems is already called every time outside addItems function at the end of unifinder loading via: refreshTreeInternalFunc (calendar-unifinder.js:911) > addItemsFromCompositeCalendarInternal (calendar-unifinder.js:882) > setItems (calendar-unifinder.js:547) > sortItems (calendar-unifinder.js:574)

It is not very clear why a refresh of the tree would cause setItems to be called (and therefore sortItems) after the unifinder is fully loaded... but at least this seems to confirm again my suspicions that:

  • sortItems is called multiple times while it should be called only once
  • sortItems has nothing to do in the addItems function :-)

About newPosition (this.eventArray.map)

  • Working out new position of each items requires call to this.eventArray.map ~4000 times! Is that necessary? It has a huge performance impact.

About forLoops (values.findIndex)

  • The for loops trigger call to this.eventArray.map ~4000 times! Is that necessary? It has a huge performance impact.

About calculateIndexMap

  • calculateIndexMap function keep being called every time an item is added (so it means ~4000 times) and this continue to cause huge performance loss.

  • calculateIndexMap function would either need to be called only once, after unifinder has full loaded (all items added) which is already the case via sortItems, or index mapping to occur on the fly for the item added without having to reparse the entire tree each time (which is what it does every time caclulateIndexmap function is called, looping trough all items available in the tree).

  • calculateIndexMap function is called via sortItems outside addItems in any case (but perhaps not when no column are selected), so this function may still not need to be present in addItems...

Hope that helps bring some new clarity via a different angle on the performance matter :-)

(In reply to Richard Leger from comment #98)

About forLoops (values.findIndex)

  • The for loops trigger call to this.eventArray.map ~4000 times! Is that necessary? It has a huge performance impact.

CORRECTION, sorry I meant: "The for loops trigger call to values.findIndex ~4000 times! Is that necessary? It has a huge performance impact." in the section above, was bad copy/paste from my part :-(

Comment on attachment 9082937 [details] [diff] [review]
1502923-unifinder-refresh-1.diff

Review of attachment 9082937 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me.  r+
Attachment #9082937 - Flags: review?(paul) → review+

Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/f52ba6e5e48b
Delay adding and sorting unifinder items until load is complete. r=pmorris

Status: REOPENED → RESOLVED
Closed: 6 months ago2 months ago
Resolution: --- → FIXED

If this improves the performance, we should consider uplifting it, right?

Target Milestone: --- → 70

Eventually yes, but let's give it some time to see if any problems crawl out of the woodwork.

(In reply to Geoff Lankow (:darktrojan) from comment #103)

Eventually yes, but let's give it some time to see if any problems crawl out of the woodwork.

I'm inclined to agree, especially since this has been determined to not be a regression there is no urgency, although definitely a good thing for users. The natural progression is give it a week in a few users' hands, then uplift to beta 69 (with hopefully other newly identified issues in 68.0) so it can be proven ready for ESR, which would then probably be 68.1.

Whiteboard: [regression:tb63?] → [not regression]

Thanks for the patch... I will give it a try and report back on performance improvement it may bring... till then I remain doubtful that this patch address the issues raised in this bug which seem related to the unfinder addItems function... which the patch does not seem to cover... (could the patch have been published in the wrong bug report?)... seems premature to turn this bug as resolved just yet... keep you posted...

Geoff,

Progress but not out of the wood yet...
Can this bug be re-opened as it is still not fixed/resolved yet from an end-user point of view? It would make sense...

With TB Daily 70.0a1 (2019-08-08) (32-bit) which I believe include the patch https://hg.mozilla.org/comm-central/rev/f52ba6e5e48b and other code changes, the ~4000 items CalDAv calendar loads in about 2mn53 which is a much promising and welcome progress... Alléluia & Thank you! Though this loading duration above one minute remains not acceptable from end-user point of view, and I think you would agree...

I have observed a possible regression as when calendar is syncing no items appear in unifinder till the end of loading (with All Events and selected column Title), unfinder remain totally blank so you cannot see any progression of the loading while in the calendar view tab... it looks like nothing is happening... few items appearing in calendar week view... then nothing... till the end where "bang" all items appear at once suddenly in unifinder view!

Also noted, there are still gaps appearing between network requests (data gathering) which indicate delay from Thunderbird processing between the requests, the aim of this bug shall be to remove progressively all such gaps that cause unnecessary loading delay, but this time it is not coming from the addItems function as it seems this function is no longer triggered (a breakpoint set within that function is not triggered at all), is that normal?.

During calendar loading, Thunderbird UI keeps "freezing/lagging" but now the**(Not Responding)** notification is no longer showing up certainly because the delay of responsiveness has fall under a threshold... while its frequency remain regular... Thunderbird is more reactive but still lagging... I am conscious that this issue will not all be fully resolved here, this is just for information.

Hope that help bring progress further... work in progress, well done :-)

Regards,

Status: RESOLVED → REOPENED
Flags: needinfo?(geoff)
Resolution: FIXED → ---

I expected there to be no items shown until all were loaded, because that's what the patch did.

We must be stuck on some other slow code now, I wonder what. A new performance profile would be useful now. (You can save a profile as JSON from the developer tools, and then I can load it and look at it more closely than can be done with a screen shot.)

Flags: needinfo?(geoff)

Doesn't seem to be useful to reopen this. We file followup bugs for related issues and regressions (see new/clone bug below). An improvement landed here, so -> FIXED.

Status: REOPENED → RESOLVED
Closed: 2 months ago2 months ago
Resolution: --- → FIXED

@Geoff,
Will send you json perf profile shortly...

@ Magnus,
(In reply to Magnus Melin [:mkmelin] from comment #108)

Doesn't seem to be useful to reopen this. We file followup bugs for related issues and regressions (see new/clone bug below). An improvement landed here, so -> FIXED.

I understand you eagerness in trying to fix and close this bug, believe me I am the first one to want that to happen among all end-users out there :-)

But I would still suggest to keep that bug open to fix the remaining processing delay in TB due to some remaining slow code that still need to be narrowed down and sort out... to fix the gaps between network requests... that still exists... it is relevant to continue working on it here because it is fundamentally the same issue, because one calDAV calendar still take about 3mn to load via TB (will it take one seconds or two via http), because people interested in performance issues are already following this bug and all the background analysis is contained in it already.

As for the regression, it is less of a concern considering it is "by design" as Geoff seems to confirm and a separate regression/follow up bug can be file for it later on...

The delay and performance issues should be the priority... the team is making headways and we can see light in the tunnel but not there yet... unless it is an oncoming train...

Cheers,

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Closing this has nothing to do with eagerness, it is part of the process. The well understood practice in this bug system, BMO, is one bug per identified code fault. And when a patch has landed for that specific code fault, normally the bug is closed by the patch author, in this case Geoff who is a peer in the calendar module. (peer indicates he understands the code reasonably well)

Comments can continue to be made in closed bugs to gather more information, and spin new bugs spun from the analysis.

But for now the only thing which should cause this bug to be reopened is if Geoff's patch is backed out. So please leave it closed and comment and spin off new bugs as needed.

Status: REOPENED → RESOLVED
Closed: 2 months ago2 months ago
Resolution: --- → FIXED

Managing uplifts/backports for a bug with multiple patches can become a nightmare. So let's fix more issues in a new bug. Besides, at comment #111 this bug here is becoming confusing.

(In reply to Richard Leger from comment #106)

I have observed a possible regression as when calendar is syncing no items appear in unifinder till the end of loading (...)

A follow up/regression Bug 1572826 has been created to deal with that issue separately...

Also noted, there are still gaps appearing between network requests (data gathering) which indicate delay from Thunderbird processing between the requests, the aim of this bug shall be to remove progressively all such gaps that cause unnecessary loading delay, but this time it is not coming from the addItems function as it seems this function is no longer triggered (a breakpoint set within that function is not triggered at all), is that normal?.

Another follow up Bug 1572823 has been created to deal with this other issue separately...

Sorry, I missed the New/Clone Bug button on my screen so I created them as new bugs already, I don't know if someone can link them to this bug somehow for reference...

Duplicate of this bug: 1423838
You need to log in before you can comment on or make changes to this bug.