[Linux] alarms too late / internal time wrong after resuming from sleep / suspend / hibernation mode

RESOLVED FIXED in 1.9.2

Status

defect
--
critical
RESOLVED FIXED
7 years ago
7 years ago

People

(Reporter: mozilla, Assigned: mmecca)

Tracking

(Depends on 1 bug)

Lightning 1.4
1.9.2
x86_64
Linux

Details

Attachments

(1 attachment, 1 obsolete attachment)

I am using Lightning 1.4 with Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20120429 Firefox/12.0 SeaMonkey/2.9.1.

Since one of the latest Lightning updates (I think it was 1.3) I noticed that alarm popups very often appear too late. Lightning 1.4 introduced the "now" indicator (the red bar), so now I see why the alarms appeared too late: the time that Lightning keeps internally often lags behind the real time.

A specialty of my setup may be that I am using it on a laptop that I frequently suspend. After waking up I see that the software clock of the system is correct, while it takes Lightning at least a few minutes (sometimes half an hour) to notice that its own time was wrong. Just now I woke the laptop up in the morning, the system time clock showed 09:22:32 (Wed 23 May 2012) but Lightning's red line was at ~22:45 on Wed 23 May.

While I am typing this, I set up a test event at 09:42 with an alarm 0 minutes before. It is now 09:46:00 and the alarm still hasn't shown up and the red indicator is still a bit above the alarm.

This is unacceptable.
The alarm that should have appeared at 09:42 finally showed up at about 09:54:30. It took until about 10:00:00 to visually see that red "now" indicator move up across the 09:42 event. (The laptop didn't suspend again in between.)
We don't keep an internal time, but the red bar only updates every 15 minutes to avoid too much useless system load. Check out the pref calendar.view.timeIndicatorInterval and set it to 1 if you really need updates every minute.

The alarms issue could happen due to a different set of timers. When the calendar refreshes, timers are created for events within the next hour or so. Although we pass flags to get a precise timer, maybe it's not working.

Did you suspend between the refresh and the alarm firing? Could you test without suspending in between to see if it makes a difference?
No, there was no suspend between the 09:42 alarm setup and its final firing 12.5 minutes later (as I wrote in comment 1, but maybe I misunderstand the question?).

Where do I have to look to see the active timers? I'm totally outdated regarding ways to debug calendar code, but I would give it a try...
Are the late alarms reproducible if the laptop is not suspended at all after restarting Thunderbird?
I had to restart SeaMonkey (not Thunderbird) but for the moment the problem seems to have stopped.
OK, after some more tests I can confirm that suspending the laptop for a longer time (like several hours over night) without restarting SM afterwards causes the problem with the delayed alarms. Once I do restart the program, the problem is gone. Suspending for a short time (e.g. 45 minutes) doesn't seem to cause any problems.
Hmm strange, we usually restart the alarm service on wakeup, which should set the timers again. I don't really know how to further debug this aside from adding some debug info, would you be comfortable modifying the files in the .xpi?
Sure, I can modify files.
* In calendar-js/calAlarmService.js
  - add cal.ERROR("AS STARTUP") inside the startup() function.
  - add cal.ERROR("AS SHUTDOWN") inside the shutdown() function.
  - add cal.ERROR("AS UPDATE") inside the contained notify function
  - add cal.ERROR("ADD TIMER " + aItem.title + " in " + aTimeout + " for " +
    cal.alarms.calculateAlarmDate(aItem, aAlarm)); inside the addTimer function
  - add cal.ERROR("FIRE TIMER " + cal.now() + " vs " +
    cal.alarms.calculateAlarmDate(aItem, aAlarm)) inside the notify function
    contained in addTimer.
  - add cal.ERROR("REMOVE TIMER " + aItem.title + " alarm " +
    aAlarm.toString(aItem)); inside the removeTimer function
  - add cal.ERROR("WAKEUP: " + aTopic) inside the first if block of the observe
    function.

That should give you enough debugging output to help us see where it's going wrong.
I added all that, restarted SeaMonkey and then suspended the laptop several times (but only briefly). The WAKEUP message never appeared. It only enters the observe function (of course with aTopic == "xpcom_shutdown") when I quit SeaMonkey.

I looked for documentation of "wake_notification" but didn't find anything. Bug 438040 comment 0 suggests that it doesn't work on Linux, although that comment is from 2008. That would also not explain why I never saw this problem until recently.
Hmm, indeed this search <http://mxr.mozilla.org/comm-central/search?string=wake_notification> suggests that there is no support aside from Mac and Windows. Very unfortunate.

If timers are not reliable over a sleep cycle, that's probably a core bug. We try to reinitialize them using the wake notification, but it seems we are out of luck on Linux...
As a hack, we could track the current time in the alarm service, and run a consistency check every minute to check if current time > expected time, and reload alarms if so.
Sounds like a plan. I'm not sure if the skew will be large enough to detect for a timer that runs every minute though.

Maybe just set a timer that should fire every 10-15 minutes, save the expected time of firing with it, then when it fires compare the two. We should tinker with the interval to find out.
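
To make the idea concrete, here is a minimal sketch of such a consistency check in plain JavaScript. It is not the code from any patch on this bug: `createSleepMonitor`, its parameters, and the injectable `now` clock are hypothetical names chosen for illustration. The monitor remembers when its periodic timer should fire next and reports a likely sleep cycle when the wall clock is further past that point than a tolerance allows.

```javascript
// Hypothetical helper, not taken from calAlarmService.js.
// intervalMs:  how often the periodic timer is expected to fire.
// toleranceMs: how late a firing may be before we assume the machine slept.
// now:         clock function, injectable so the logic can be exercised
//              without waiting (defaults to the wall clock).
function createSleepMonitor(intervalMs, toleranceMs, now = Date.now) {
  let expected = now() + intervalMs;
  return {
    // Call this each time the periodic timer fires.
    // Returns true when the firing is later than expected by more than
    // the tolerance, which suggests a suspend/resume happened in between.
    check() {
      const current = now();
      const skewed = current - expected > toleranceMs;
      // Re-arm: the next firing is expected one interval from now.
      expected = current + intervalMs;
      return skewed;
    },
  };
}
```

A caller would run `check()` from a repeating timer with the same interval and reload the alarms whenever it returns true. The right interval and tolerance are open questions here, as noted above; a one-minute interval with a few seconds of tolerance is only one plausible starting point.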

If someone knows a Linux API to detect when the system wakes, we could also try patching core.
I have a possible patch for core that makes Linux emit wake_notification. I don't know if it works yet and I don't know if it will be accepted, since it uses functions from kernel headers not usually copied to /usr/include/linux.
Duplicate of this bug: 680778
Posted patch WIP - Temporary Workaround (obsolete)
This is a temporary workaround until wake_notification is supported on Linux. I set the sleep monitor to check the expected time every minute, so that the alarms can be reloaded soon after a sleep cycle.

I haven't tested this on Windows or Mac. I imagine there may be times when the alarm service could reload twice if the monitor timer fires before the wake_notification event. Maybe we could make the sleep monitor code OS dependent, or just use the sleep monitor for now until wake_notification is supported on all platforms?
Attachment #668783 - Flags: feedback?(philipp)
Assignee: nobody → matthew.mecca
Status: NEW → ASSIGNED
Duplicate of this bug: 806245
Comment on attachment 668783 [details] [diff] [review]
WIP - Temporary Workaround

Sounds like a good idea. We could also make this platform dependent and only run the timer on Linux, or better yet turn around the condition and check if the OS is not OS X or Windows.
Attachment #668783 - Flags: feedback?(philipp) → feedback+
Duplicate of this bug: 841072
What's the status of this?  I got a Fedora bug on this (https://bugzilla.redhat.com/show_bug.cgi?id=910976) with a suggested patch.  Or I could try the WIP workaround here.  Suggestions?
(In reply to Orion Poplawski from comment #20)

Looks like the patch attached in the Fedora bug https://bugzilla.redhat.com/show_bug.cgi?id=910976 is just a copy of the WIP patch that is attached here.
(In reply to Orion Poplawski from comment #20)
> What's the status of this?  I got a Fedora bug on this
> (https://bugzilla.redhat.com/show_bug.cgi?id=910976) with a suggested patch.
> Or I could try the WIP workaround here.  Suggestions?

I'm planning to finish this up soon (when time permits). At a quick glance the suggested patch on the Fedora bug looks the same as the WIP patch, which will likely be the final patch with some OS conditional code.
Ah, sorry for not checking.  Also, 1.9.1 should be out soon, right?  I'd like to do that update and this patch at the same time.
I posted the patch to Fedora.  It is a copy of the last patch here, with lines adjusted to run against the Fedora code.

I posted it to Fedora because this bug harms 100% of Fedora users, while here in Mozilla-land you have a buffer of MacOS and Windows users you also have to keep happy.

I would strongly prefer the bug to get fixed upstream, but until you take the patch, I'd like Fedora to pick it up.  Because I hate missing my meetings :-)
If Orion picks up this patch (or its Fedora equivalent, see comment 21), that will make me happy.  But what is needed to get this patch into Lightning proper, so other Unix users can make their meetings?
Summary: Lightning's alarms too late / internal time is wrong → [Linux] alarms too late / internal time wrong after resuming from sleep / suspend / hibernation mode
Posted patch Fix v1
Use sleep monitor on platforms other than Windows and Mac.
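
A minimal sketch of that platform condition, for readers who do not want to open the patch. This is an illustration rather than the patch itself: `needsSleepMonitor` is a hypothetical name, and the OS strings "WINNT" and "Darwin" are assumed to be the identifiers reported for Windows and Mac.

```javascript
// Hypothetical helper, not the code from the attached patch.
// osName is assumed to be an identifier such as "WINNT", "Darwin" or "Linux".
function needsSleepMonitor(osName) {
  // Platforms assumed to deliver wake_notification on resume.
  const hasWakeNotification = ["WINNT", "Darwin"];
  // Everything else falls back to the polling sleep monitor.
  return !hasWakeNotification.includes(osName);
}
```

Inverting the condition this way means any platform without wake_notification gets the monitor, not only Linux.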
Attachment #668783 - Attachment is obsolete: true
Attachment #717538 - Flags: review?(philipp)
Comment on attachment 717538 [details] [diff] [review]
Fix v1

Looks good to me, r=philipp
Attachment #717538 - Flags: review?(philipp) → review+
Pushed to comm-central - https://hg.mozilla.org/comm-central/rev/51640de8ce37
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → 2.4
Thunderbird just pulled in the 1.9.1 update, and the bug isn't fixed. I'm on Fedora 17. Fortunately, my meeting this morning had been canceled. 

The only way it works is if I restart Thunderbird after suspending. Otherwise, I never get the reminder. 

Is this fix only in the Fedora thunderbird-lightning package, or in Lightning proper?
This bug is only fixed in Lightning 2.4 for Thunderbird 22 (and newer), see comments and status fields above.
Attachment #717538 - Flags: approval-calendar-release?(philipp)
Attachment #717538 - Flags: approval-calendar-beta?(philipp)
Attachment #717538 - Flags: approval-calendar-aurora?(philipp)
Whiteboard: [wanted-1.9.x]
Attachment #717538 - Flags: approval-calendar-release?(philipp)
Attachment #717538 - Flags: approval-calendar-release+
Attachment #717538 - Flags: approval-calendar-beta?(philipp)
Attachment #717538 - Flags: approval-calendar-beta+
Attachment #717538 - Flags: approval-calendar-aurora?(philipp)
Attachment #717538 - Flags: approval-calendar-aurora+
Carl Ollivier-Gooch added the following comment to Launchpad bug report 1052800:

It would be great to see this backported to versions that the various long-term support Linux distros use.  Otherwise this bug fix (long overdue) still won't reach a lot of users until their next OS upgrade, at least.


-- 
http://launchpad.net/bugs/1052800
I'm afraid this is another case of "yes we support Linux, but it's not a priority." After all, it's not like we have a lot of options. This is why I stopped using T-bird long ago, and only returned for Exchange mail support. I only use it at work, otherwise I use a browser. Unfortunately there just aren't any reliable options for Linux, outside of Android.