Last Comment Bug 762620 - Evidence that idle-daily can start misfiring
: Evidence that idle-daily can start misfiring
Status: RESOLVED FIXED
:
Product: Firefox for Android
Classification: Client Software
Component: General (show other bugs)
: Trunk
: ARM Android
: -- critical (vote)
: Firefox 16
Assigned To: Gian-Carlo Pascutto [:gcp]
:
: Sebastian Kaspari (:sebastian)
Mentors:
Depends on: 1085554
Blocks: 792801 762590 763991 834327
  Show dependency treegraph
 
Reported: 2012-06-07 12:08 PDT by Gian-Carlo Pascutto [:gcp]
Modified: 2016-07-29 14:25 PDT (History)
18 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
+
wontfix
+
fixed
.N+


Attachments
Patch 1. Add logging for idle-daily (2.06 KB, patch)
2012-06-11 13:26 PDT, Gian-Carlo Pascutto [:gcp]
mark.finkle: review+
Details | Diff | Splinter Review
Log of timer service (5.63 KB, text/plain)
2012-06-12 11:37 PDT, Gian-Carlo Pascutto [:gcp]
no flags Details
Patch 1. Add logging for idle-daily (13.62 KB, patch)
2012-06-12 11:40 PDT, Gian-Carlo Pascutto [:gcp]
mark.finkle: review+
Details | Diff | Splinter Review
Patch 2. Add the extra log in the actual DailyIdle event (979 bytes, patch)
2012-06-27 23:42 PDT, Gian-Carlo Pascutto [:gcp]
blassey.bugs: review+
Details | Diff | Splinter Review
Another suspicious log of IdleService (4.22 KB, text/plain)
2012-07-02 02:26 PDT, Gian-Carlo Pascutto [:gcp]
no flags Details
Patch 1. Force saving perferences after updating idle time (1.20 KB, patch)
2012-07-04 00:46 PDT, Gian-Carlo Pascutto [:gcp]
blassey.bugs: review+
akeybl: approval‑mozilla‑aurora+
akeybl: approval‑mozilla‑beta+
Details | Diff | Splinter Review
Patch 2. Restart daily timers that return early. (4.12 KB, patch)
2012-07-05 11:17 PDT, Gian-Carlo Pascutto [:gcp]
roc: review+
irving: feedback+
akeybl: approval‑mozilla‑aurora+
akeybl: approval‑mozilla‑beta+
Details | Diff | Splinter Review

Description Gian-Carlo Pascutto [:gcp] 2012-06-07 12:08:16 PDT
It seems that some stuff that fires in idle-daily can start to go bonkers in Native Fennec:

https://bugzilla.mozilla.org/show_bug.cgi?id=758193
https://bugzilla.mozilla.org/show_bug.cgi?id=762590
Comment 1 Gian-Carlo Pascutto [:gcp] 2012-06-07 12:17:06 PDT
Telemetry is getting basically DDOS-ed from Fennec devices, so this is important.
Comment 2 Brad Lassey [:blassey] (use needinfo?) 2012-06-08 11:42:44 PDT
qawanted for str
Comment 3 Gian-Carlo Pascutto [:gcp] 2012-06-08 11:48:42 PDT
Can the telemetry guys give us some indication of they can see if its specific to a certain Android device or Android version?
Comment 4 Xavier Stevens [:xstevens] 2012-06-08 13:11:38 PDT
Gian-Carlo, On the data service I was seeing Firefox 14.0 on Android mostly. I don't record the payloads themselves right now when I reject the requests so I can't give you much more detail than that. Here's an example user-agent string that I log:

Mozilla/5.0 (Android; Mobile; rv:14.0) Gecko/14.0 Firefox/14.0
Comment 5 Gian-Carlo Pascutto [:gcp] 2012-06-09 02:01:43 PDT
Do you have the possibility to record one or some of the payloads?
Comment 6 Aaron Train [:aaronmt] 2012-06-11 08:52:12 PDT
Need a way of logging or tracking to continue here; re comment #2.
Comment 7 Mark Finkle (:mfinkle) (use needinfo?) 2012-06-11 12:51:34 PDT
GCP - Can you add logging to verify that idle-daily is really firing too much?
Comment 8 Daniel Einspanjer [:dre] [:deinspanjer] 2012-06-11 13:08:00 PDT
Is this a problem with the general idle-daily mechanism or is it specific to Telemetry?  I'm wondering if things like AUS or Blocklist could also be getting "DOS'ed"
Comment 9 (dormant account) 2012-06-11 13:23:00 PDT
(In reply to Daniel Einspanjer :dre [:deinspanjer] from comment #8)
> Is this a problem with the general idle-daily mechanism or is it specific to
> Telemetry?  I'm wondering if things like AUS or Blocklist could also be
> getting "DOS'ed"

it's a general problem... So it's very likely those will be affected too, can you check logs?

For telemetry, I'm not sure how much metrics can do since duplicate telemetry pings are discarded asap on server. Xavier I think the way to go here is to check when Fennec 14 started firing high-frequency idle-daily pings to get  a regression range. Do we have http logs that go a month or two back?
Comment 10 Gian-Carlo Pascutto [:gcp] 2012-06-11 13:26:57 PDT
Created attachment 632006 [details] [diff] [review]
Patch 1. Add logging for idle-daily

I'd suggest to put this in Nightlies so we have more info/are able to detect if someone manages to reproduce.
Comment 11 Mark Finkle (:mfinkle) (use needinfo?) 2012-06-11 14:29:36 PDT
Comment on attachment 632006 [details] [diff] [review]
Patch 1. Add logging for idle-daily

I'm OK with adding this for a week or two, until we get a better feel for the situation.

Let's remember to remove it though.
Comment 12 Gian-Carlo Pascutto [:gcp] 2012-06-11 14:54:44 PDT
>I'm wondering if things like AUS or Blocklist could also be getting "DOS'ed"

FWIW AUS/blocklist are on the update/application-timer, not on idle-daily, so we're expecting them not to be affected. Of course if it's something like a device/firmware bug that is triggering this then all bets are off.
Comment 13 Alex Keybl [:akeybl] 2012-06-11 18:20:03 PDT
(In reply to Gian-Carlo Pascutto (:gcp) from comment #12)
> >I'm wondering if things like AUS or Blocklist could also be getting "DOS'ed"
> 
> FWIW AUS/blocklist are on the update/application-timer, not on idle-daily,
> so we're expecting them not to be affected. Of course if it's something like
> a device/firmware bug that is triggering this then all bets are off.

I've asked the AMO guys to let us know if they're seeing higher than normal traffic, just in case.
Comment 14 Gian-Carlo Pascutto [:gcp] 2012-06-12 00:46:21 PDT
I/PRLog   (10837): 855712[5f051080]: idleService: Register idle observer 5d50d5a0 for 300 seconds
I/PRLog   (10837): 855712[5f051080]: idleService: Register: adjusting next switch from -1 to 300 seconds
I/PRLog   (10837): 855712[5f051080]: idleService: next timeout 1339486359856191 usec (-551 msec from now)
I/PRLog   (10837): 855712[5f051080]: idleService: SetTimerExpiryIfBefore: next timeout 1339486359856191 usec
I/PRLog   (10837): 855712[5f051080]: idleService: reset timer expiry from 0 usec to 1339486360418479 usec
I/PRLog   (10837): 855712[5f051080]: idleService: Get idle time: polled 1546536376 msec, valid = 0
I/PRLog   (10837): 855712[5f051080]: idleService: Get idle time: time since reset 300561 msec
I/PRLog   (10837): 855712[5f051080]: idleService: Idle timer callback: current idle time 300561 msec
I/PRLog   (10837): 855712[5f051080]: idleService: Reset idle timeout (last interaction 300561 msec)
I/PRLog   (10837): 855712[5f051080]: idleService: next timeout 1339486359857350 usec (-561 msec from now)
I/PRLog   (10837): 855712[5f051080]: idleService: SetTimerExpiryIfBefore: next timeout 1339486359857350 usec
I/PRLog   (10837): 855712[5f051080]: idleService: reset timer expiry from 0 usec to 1339486360428651 usec
I/PRLog   (10837): 855712[5f051080]: idleService: ReconfigureTimer: no idle or waiting observers
I/PRLog   (10837): 855712[5f051080]: idleService: Idle timer callback: tell observer 5d50d5a0 user is idle
I/PRLog   (10837): 855712[5f051080]: idleService: Remove idle observer 5d50d5a0 (300 seconds)
I/PRLog   (10837): 855712[5f051080]: idleService: Get idle time: polled 1546536376 msec, valid = 0
I/PRLog   (10837): 855712[5f051080]: idleService: Get idle time: time since reset 300569 msec
I/PRLog   (10837): 855712[5f051080]: idleService: Idle timer callback: current idle time 300569 msec
I/PRLog   (10837): 855712[5f051080]: idleService: Reset idle timeout (last interaction 300569 msec)

Some observations:
1) I had to fix the logging to get the "-551 msec from now". The logging in the code is wrapping around. Is this value supposed to be negative?
2) If I understand what the code is doing, it's reprogramming the next timeout *before* the last one fired. There is some code that protects against that by adding 10 ms to the timeout. As a result if fires twice consecutively.

I'm going to dig a bit more into this, but the above looks a bit strange to me. If anyone more familiar with this service wants to comment...
Comment 15 :Irving Reid (No longer working on Firefox) 2012-06-12 08:09:55 PDT
> Some observations:
> 1) I had to fix the logging to get the "-551 msec from now". The logging in
> the code is wrapping around. Is this value supposed to be negative?

Yes, this is the case where somebody registers an idle observer saying "notify me when we've been idle for 300 seconds" and the application has already been idle for 300.551 seconds.

> 2) If I understand what the code is doing, it's reprogramming the next
> timeout *before* the last one fired. There is some code that protects
> against that by adding 10 ms to the timeout. As a result if fires twice
> consecutively.
> 
> I'm going to dig a bit more into this, but the above looks a bit strange to
> me. If anyone more familiar with this service wants to comment...

I might have an idea of what's happening. Does Telemetry want the behaviour "call my callback every 300 seconds while the application is idle"? The idle service doesn't actually support that right now. Instead what you get is:

Telemetry registers idle observer for 300 seconds
Idle timer fires when app is idle 300 seconds 
 Calls Telemetry observer
   Telemetry observer de-registers itself, does its action, registers a *new* idle observer for 300 seconds
Idle timer fires immediately (plus 10 ms fudge factor) because app is already idle more than 300 seconds
  Calls Telemetry observer again
    (and we either go into our loop, or fail out based on the bug below)


The difference that I saw, working on desktop Thunderbird, was that the Telemetry observer would throw an exception rather than continually re-register the observer; I filed https://bugzilla.mozilla.org/show_bug.cgi?id=743776 about that.
Comment 16 Marco Bonardo [::mak] 2012-06-12 09:18:58 PDT
just for information, idle-daily is nothing special, it is just a simple idle observer that once gets an idle notification verifies if it did fire in the last 24 hours, and, if not, it notifies.
The problem here seems to be more related to the idle service itself, than idle-daily, maybe the Android backed is just wrongly measuring idle?
Telemetry just needs to be notified about the first idle _after_ idle-daily.
Comment 17 Gian-Carlo Pascutto [:gcp] 2012-06-12 09:23:12 PDT
For what it's worth, nsIdleDaily on Android doesn't use polling, but just timers and resets based on Gecko Events. i.e. there is no Android-specific code in the service when compiled for Android.

Even if it were measuring idle time wrongly, I don't think that should start the rapid-fire behavior that the reports in the description seem to suggest.

I'd certainly not be surprised if the clocks on some Android devices could behave..."unexpectedly" to trigger this, though.
Comment 18 Gian-Carlo Pascutto [:gcp] 2012-06-12 11:37:38 PDT
Created attachment 632347 [details]
Log of timer service

Here's some more strange behavior caught when logging.
Comment 19 Gian-Carlo Pascutto [:gcp] 2012-06-12 11:40:54 PDT
Created attachment 632351 [details] [diff] [review]
Patch 1. Add logging for idle-daily

Add more logging so we can see what is going on. This will be fairly low output if things work normally, but more verbose if stuff goes bonkers.
Comment 20 Gian-Carlo Pascutto [:gcp] 2012-06-13 00:22:28 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/86cd5f375c7d

Please leave this bug open when merging.
Comment 21 Ed Morley [:emorley] 2012-06-13 02:21:55 PDT
You need to use the whiteboard annotation [1], or the (new, still in testing) tool will close as fixed regardless :-)

[1] https://wiki.mozilla.org/Tree_Rules/Inbound#Please_do_the_following_after_pushing_to_inbound
Comment 23 Marco Bonardo [::mak] 2012-06-13 07:41:11 PDT
(In reply to Gian-Carlo Pascutto (:gcp) from comment #17)
> For what it's worth, nsIdleDaily on Android doesn't use polling

I guess you meant nsIdleService here, as I said idleDaily is just a normal observer added to the service, I think we are making confusion here, the problem must be idle, not the daily notification.
Btw, maybe the "difference" is exactly this one, polling has been fixed/changed recently, while the events based one may be mis-firing.
Comment 24 Brad Lassey [:blassey] (use needinfo?) 2012-06-13 08:59:21 PDT
with telemetry disabled, I don't think this blocks anymore
Comment 25 Matt Brubeck (:mbrubeck) 2012-06-13 13:29:36 PDT
https://hg.mozilla.org/mozilla-central/rev/548af4154271
Comment 26 :Irving Reid (No longer working on Firefox) 2012-06-14 08:26:37 PDT
(In reply to Gian-Carlo Pascutto (:gcp) from comment #18)
> Created attachment 632347 [details]
> Log of timer service
> 
> Here's some more strange behavior caught when logging.

That log is interesting in that it shows the underlying one-shot timer we set up at http://mxr.mozilla.org/comm-central/source/mozilla/widget/xpwidgets/nsIdleService.cpp#680 is calling us back much sooner than we expect; this could point at a problem with the timer service on Android.

The log also shows that the idle service correctly keeps resetting the timer until the desired idle time has passed, and only calls the idle listener then.


The idle daily service is also using the timer service to wait for one day between activities (see http://mxr.mozilla.org/comm-central/source/mozilla/widget/xpwidgets/nsIdleService.cpp#113). If the underlying timer service is calling back too quickly this would cause the daily event to re-run. Adding a log call to nsIdleServiceDaily::DailyCallback() would help us spot this.
Comment 27 :Irving Reid (No longer working on Firefox) 2012-06-14 08:42:00 PDT
To clarify, the sequence of events I'm postulating is:

1) idle-daily registers listener waiting for 300 sec. of idle time

2) idle service calls back idle-daily handler

3) idle-daily handler de-registers listener, notifies daily observers, registers @mozilla.org/timer;1 for SECS_PER_DAY * MSEC_PER_SEC

4) timer service calls back idle-daily long before the requested day has passed

5) idle-daily timer handler registers idle listener waiting for 300 sec.

6) idle service calls back handler listener immediately, because we've already been idle more than 300 sec.

7) Go to 3) and repeat
Comment 28 Tony Chung [:tchung] 2012-06-18 09:26:48 PDT
(In reply to Irving Reid (:irving) from comment #27)
> To clarify, the sequence of events I'm postulating is:
> 
> 1) idle-daily registers listener waiting for 300 sec. of idle time
> 
> 2) idle service calls back idle-daily handler
> 
> 3) idle-daily handler de-registers listener, notifies daily observers,
> registers @mozilla.org/timer;1 for SECS_PER_DAY * MSEC_PER_SEC
> 
> 4) timer service calls back idle-daily long before the requested day has
> passed
> 
> 5) idle-daily timer handler registers idle listener waiting for 300 sec.
> 
> 6) idle service calls back handler listener immediately, because we've
> already been idle more than 300 sec.
> 
> 7) Go to 3) and repeat

Irving, are you able to provide steps to reproduce for this bug?
Comment 29 :Irving Reid (No longer working on Firefox) 2012-06-19 07:14:00 PDT
Tony, I haven't observed the bug myself; that's :gcp. I'm just trying to help interpret his log file based on the log messages I added to the timer service a little while ago.
Comment 30 Gian-Carlo Pascutto [:gcp] 2012-06-19 07:21:14 PDT
I never observed this bug either, the log I pasted is from running Fennec normally. We seem to be seeing that this basically never works and likely never worked correctly on Android.
Comment 31 Naoki Hirata :nhirata (please use needinfo instead of cc) 2012-06-19 13:22:57 PDT
So would this mean just keeping the device charged and then having fennec in the foreground, but just letting the device go to sleep?
Comment 32 Gian-Carlo Pascutto [:gcp] 2012-06-27 23:42:17 PDT
Created attachment 637386 [details] [diff] [review]
Patch 2. Add the extra log in the actual DailyIdle event

As requested above in the comments.
Comment 33 Gian-Carlo Pascutto [:gcp] 2012-06-27 23:48:53 PDT
>The idle daily service is also using the timer service to wait for one day between 
>activities (see http://mxr.mozilla.org/comm-central/source/mozilla/widget/xpwidgets
>/nsIdleService.cpp#113). If the underlying timer service is calling back too 
>quickly this would cause the daily event to re-run. Adding a log call to 
>nsIdleServiceDaily::DailyCallback() would help us spot this.

DailyCallback only calls AddIdleObserver, which would make a log entry. In the strange log that is only shown once. Of course absence of evidence is not evidence of absence here.
Comment 34 Gian-Carlo Pascutto [:gcp] 2012-06-28 06:37:31 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/9519fbc7fb92
Comment 35 Ed Morley [:emorley] 2012-06-29 00:48:28 PDT
https://hg.mozilla.org/mozilla-central/rev/9519fbc7fb92
Comment 36 Gian-Carlo Pascutto [:gcp] 2012-07-02 02:26:44 PDT
Created attachment 638283 [details]
Another suspicious log of IdleService

I see several things wrong here:

I/IdleService(15997): Storing last idle time as 1341212144 (Mon, 02 Jul 2012 06:55:44 GMT)
I/IdleService(16793): DailyCallback started
I/IdleService(16793): Storing last idle time as 1341213045 (Mon, 02 Jul 2012 07:10:45 GMT)
I/IdleService(16878): DailyCallback started
I/IdleService(16878): Storing last idle time as 1341213529 (Mon, 02 Jul 2012 07:18:49 GMT)

So within the space of 25 minutes, we fired the DailyCallback twice, which shouldn't be possible.

I/IdleService(15997): next timeout 1341210519361288 usec (-1625476 msec from now)
I/IdleService(15997): SetTimerExpiryIfBefore: next timeout 1341210519361288 usec
I/IdleService(15997): next timeout 5636177514361288 usec (4294965369508 msec from now)
I/IdleService(15997): SetTimerExpiryIfBefore: next timeout 5636177514361288 usec

Note that the android log uses %lld, so the second value is really that big and not negative. Something in the code is overflowing fatally.

Additionally, I found a logic error in the IdleService::Init function:
We check if more than a day has passed since the last check. If so, we call DailyCallback. If not, we set the timer a day from now. This means that it can take up to *two* days before that timer refires.

Unfortunately I don't have real STR. But the above will give some clues where to start looking.
Comment 37 Naoki Hirata :nhirata (please use needinfo instead of cc) 2012-07-02 09:37:34 PDT
qawanted removed; tried to find STR but could not.  It seems like there is something more than just idling device...
Comment 38 Gian-Carlo Pascutto [:gcp] 2012-07-03 02:50:05 PDT
I found STR for the wrong behavior. Not the behavior that can explain the rapid-fire of idles, but a very normal usage scenario that will cause idle-daily to be triggered multiple times per day.

1) Launch Fennec
2) Browse a webpage
3) Switch away from Fennec (putting it into the background)
4) Wait for 5 minutes so idle-daily fires
5) Force Fennec to quit via task manager
6) Repeat

Each iteration will fire idle-daily. Note that step 5 will happen naturally as other Android apps start using RAM.

I'm guessing the Preferences aren't hitting disk before Fennec is killed.
Comment 39 Gian-Carlo Pascutto [:gcp] 2012-07-03 07:37:08 PDT
What is happening is the following: the moment the user switches away from Fennec, we get ACTIVITY_PAUSING followed by ACTIVITY_STOPPING.

https://mxr.mozilla.org/mozilla-central/source/widget/android/nsAppShell.cpp#418

This forces a SavePrefFile call at that point. There are no further message when the activity gets destructed, though I do see a COMPOSITOR_PAUSE just before we get killed.

The problem is now that the idle-daily call will fire after 5 minutes, while the app is back-grounded. idle-daily will then store its updated last-run time in preferences, but at this point there will be no more SavePrefFile. As a result, the updated preference is lost if the activity dies with the user foregrounding it again.

The most simple solution here seems to modify nsIdleServiceAndroid to force a SavePrefFile call after the idle-daily has run, though a case could be made to just do it in the generic code, too (it would avoid repeated idle-daily calls if Firefox crashes, for example).
Comment 40 Mark Finkle (:mfinkle) (use needinfo?) 2012-07-03 09:12:26 PDT
Be careful calling savePrefFile as it will block main thread IO: bug 683808
Comment 41 Gian-Carlo Pascutto [:gcp] 2012-07-03 09:16:03 PDT
I think we can live with that given that it'll trigger once per day after the application has been idle for 5 minutes.
Comment 42 Gian-Carlo Pascutto [:gcp] 2012-07-04 00:46:19 PDT
Created attachment 638988 [details] [diff] [review]
Patch 1. Force saving perferences after updating idle time

This adds the save in the generic path, I think this will help in situations where desktop Firefox crashes as well.
Comment 43 Gian-Carlo Pascutto [:gcp] 2012-07-05 09:51:03 PDT
>4) timer service calls back idle-daily long before the requested day has passed
>
>5) idle-daily timer handler registers idle listener waiting for 300 sec.
>
>6) idle service calls back handler listener immediately, because we've already 
>been idle more than 300 sec.

I believe you're exactly right here. I was looking at logic errors in the code because the logs looked so strange, but I couldn't pinpoint any. (The logic is quite...complicated in places, but it looks correct)

I then modified the code to artificially let the "one day" timer fire too quickly, and lo and behold, I get exactly the bad behavior *and* the same log as above.

nsITimer doesn't work correctly on Android...sometimes.

To prevent this loop, perhaps we can do the same as the Idle Timer is doing, and verify that the passed time is actually what we set the timer for, then refire if it returned to soon.
Comment 44 :Irving Reid (No longer working on Firefox) 2012-07-05 10:36:37 PDT
(In reply to Gian-Carlo Pascutto (:gcp) from comment #43)
...
> To prevent this loop, perhaps we can do the same as the Idle Timer is doing,
> and verify that the passed time is actually what we set the timer for, then
> refire if it returned to soon.

I was thinking about taking that approach too. The other way would be to use a recurring nsITimer for the daily trigger, and then schedule a one-shot idle timer each time we get the daily event from the recurring timer. We might still want an extra check to make sure the recurring timer really is only calling us once a day.
Comment 45 Gian-Carlo Pascutto [:gcp] 2012-07-05 11:17:13 PDT
Created attachment 639402 [details] [diff] [review]
Patch 2. Restart daily timers that return early.
Comment 46 Alex Keybl [:akeybl] 2012-07-05 11:42:21 PDT
Quick heads up guys - looks like you guys have this on lockdown, but we should be targeting Aurora 15 this week, for uplift to Beta 14 prior to our final beta go-to-build Monday (7/9).
Comment 47 :Irving Reid (No longer working on Firefox) 2012-07-05 12:17:53 PDT
(In reply to Gian-Carlo Pascutto (:gcp) from comment #36)
...
> I/IdleService(15997): next timeout 1341210519361288 usec (-1625476 msec from
> now)
> I/IdleService(15997): SetTimerExpiryIfBefore: next timeout 1341210519361288
> usec
> I/IdleService(15997): next timeout 5636177514361288 usec (4294965369508 msec
> from now)
> I/IdleService(15997): SetTimerExpiryIfBefore: next timeout 5636177514361288
> usec
> 
> Note that the android log uses %lld, so the second value is really that big
> and not negative. Something in the code is overflowing fatally.

The overflow is because we've somehow failed to correctly keep track of (or handle) the next timeout event - it shouldn't be in the past, as the "-1625427 msec from now" line shows.

Do we get idle events while the device is asleep? If not, that could be a factor.

The patch looks good, I'm doing a bit of testing on it now, but I'm nowhere near a peer for this module so I'll give feedback but we should find a peer for the r?.
Comment 48 Gian-Carlo Pascutto [:gcp] 2012-07-05 12:47:24 PDT
>The overflow is because we've somehow failed to correctly keep track of (or 
>handle) the next timeout event - it shouldn't be in the past, as the "-1625427 
>msec from now" line shows.

This will happen if the idle timeout has already been exceeded and the daily-timer fires. It will check when the next timeout (300s idle) should be, see that it's in the past (because we've been idle longer), and get that negative value. The large one is not an overflow - I assumed so too, but it's really mean to be INT_MAX - it just means that there is no "next" time to fire, because all observes are in Idle already.

>Do we get idle events while the device is asleep? If not, that could be a factor.

At least on my test device idle-daily fires even if it's turned "off".
Comment 49 :Irving Reid (No longer working on Firefox) 2012-07-05 13:59:44 PDT
Just a thought: are there any signed vs. unsigned compiler warnings in the Android build of this code? It builds without warnings on Mac OS X.
Comment 50 Gian-Carlo Pascutto [:gcp] 2012-07-05 22:29:36 PDT
>Just a thought: are there any signed vs. unsigned compiler warnings in the Android 
>build of this code?

IIRC only in the logging code. As noted in comment 48 the negative/large values in the log look "correct" to me.
Comment 53 Brad Lassey [:blassey] (use needinfo?) 2012-07-06 21:56:09 PDT
Comment on attachment 638988 [details] [diff] [review]
Patch 1. Force saving perferences after updating idle time

gcp, please request aurora and beta approval for both of these patches
Comment 54 Gian-Carlo Pascutto [:gcp] 2012-07-07 01:43:55 PDT
Comment on attachment 638988 [details] [diff] [review]
Patch 1. Force saving perferences after updating idle time

[Approval Request Comment]
User impact if declined: Required to re-enable telemetry on Fennec.
Testing completed (on m-c, etc.): Landed on m-c a day ago.
Comment 55 Gian-Carlo Pascutto [:gcp] 2012-07-07 01:44:46 PDT
Comment on attachment 639402 [details] [diff] [review]
Patch 2. Restart daily timers that return early.

[Approval Request Comment]
User impact if declined: Required to re-enable Fennec telemetry.
Testing completed (on m-c, etc.): Landed on m-c day ago.
Risk to taking this patch (and alternatives if risky): Daily timers not firing correctly (all platforms) if there's a bug.
Comment 56 Alex Keybl [:akeybl] 2012-07-08 09:00:52 PDT
Comment on attachment 638988 [details] [diff] [review]
Patch 1. Force saving perferences after updating idle time

[Triage Comment]
.N+ blocker, so let's land on all branches in preparation for our final beta build Monday.

Please advise how QA can verify this fix.
Comment 57 Gian-Carlo Pascutto [:gcp] 2012-07-08 14:46:41 PDT
https://hg.mozilla.org/releases/mozilla-aurora/rev/e860b5fe213b
https://hg.mozilla.org/releases/mozilla-aurora/rev/e10bd54565c8

https://hg.mozilla.org/releases/mozilla-beta/rev/ad82f6e15f40
https://hg.mozilla.org/releases/mozilla-beta/rev/93e99923951b

QA: verify that in the Android log, under the IdleService tag, the following message pops up exactly once per day: "DailyCallback registering Idle observer", after Firefox is idle for 5 minutes (but still in memory of the phone). You might need to visit at least one webpage for this to trigger.

The steps described in comment 38 should not cause it to be printed more than once per day.
Comment 58 Aaron Train [:aaronmt] 2012-07-09 07:09:09 PDT
Just confirming here, that the only message with the IdleService tag I have seen after delay is: 'Setting timer a day from now'.
Comment 59 Gian-Carlo Pascutto [:gcp] 2012-07-09 10:34:27 PDT
This means that the service has already run once today. You should see the message "DailyCallback registering Idle observer" when you use that phone again after 24hrs. (Or on a phone you didn't use for at least a day)
Comment 60 Gian-Carlo Pascutto [:gcp] 2012-07-10 09:56:17 PDT
Backed out of beta after discussion wrt bug 763991 and bug 772075. (Too many
risks to pull all patches into beta now)

https://hg.mozilla.org/releases/mozilla-beta/rev/07ec0ad51ecb
https://hg.mozilla.org/releases/mozilla-beta/rev/d5e38890ddcb
Comment 61 :Irving Reid (No longer working on Firefox) 2012-07-23 10:52:36 PDT
Comment on attachment 639402 [details] [diff] [review]
Patch 2. Restart daily timers that return early.

Review of attachment 639402 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry, was on vacation. Glad that this is already in the tree.

Note You need to log in before you can comment on or make changes to this bug.