Closed Bug 1086598 Opened 10 years ago Closed 10 years ago

[Clock] Due to AMD, showing alarm screen takes ~7 sec

Categories

(Firefox OS Graveyard :: Gaia::Clock, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(tracking-b2g:+, b2g-v2.0 affected, b2g-v2.0M affected, b2g-v2.1 affected, b2g-v2.2 affected)

RESOLVED FIXED
tracking-b2g +
Tracking Status
b2g-v2.0 --- affected
b2g-v2.0M --- affected
b2g-v2.1 --- affected
b2g-v2.2 --- affected

People

(Reporter: swathiks115, Assigned: jrburke)

References

Details

(Keywords: regression, Whiteboard: [LibGLA,17546, TD-444,KAT, B][partner-blocker])

Attachments

(3 files)

STR:

1) Launch clock app
2) Set an alarm which will expire in next minute
3) kill clock app and wait for alarm to trigger.
4) Observe the current time in some other device/PC

Observation:
The alarm screen is shown after 10sec e.g If we set alarm for 10:10:00, alarm screen is visible approx at 10:10:10
We have done some analysis of this issue and it seems that loading of AMD module takes time when clock app is started in BG.

definition of AlarmPanel AMD and creation of ActiveAlarm is happening after ~7sec.

When we launch the clock app from Homescreen these are completed within 1-2 sec

Dear Vance,

Please help us in finding the suitable person to look into this.
Flags: needinfo?(vchen)
Whiteboard: [LibGLA,17546, TD-444,KAT, B]
Hi Marcus, Would you please kindly look into the bug which is requested by partner ?

Thank you very much!
Flags: needinfo?(m)
[Blocking Requested - why for this release]:

Hi Dylan, we have double confirm with partner bug1086598 is a high priority issue. 

If we don't take it now, then it will soon turn out to be a blocker to the next stage.


Please kindly find someone to look into the issue. Thank you very much!

Set it to 2.0? for now
blocking-b2g: --- → 2.0?
Flags: needinfo?(sku)
Flags: needinfo?(kkuo)
Flags: needinfo?(doliver)
Flags: needinfo?(kkuo) → needinfo?(bbajaj)
Whiteboard: [LibGLA,17546, TD-444,KAT, B] → [LibGLA,17546, TD-444,KAT, B][partner-blocker]
This is a partner IOT Blocker, the maximum acceptable delay is 7 sec.  Please kindly investigate the issue !

Thank you very much!!
Group: kddi-confidential
Group: kddi-confidential
Keven, we'd appreciate it if the Devices team would take the lead on this issue. We will help where we can but this does not meet the criteria for blocking by the OS team.
Flags: needinfo?(m)
Flags: needinfo?(kkuo)
Flags: needinfo?(doliver)
Flags: needinfo?(bbajaj)
Hi! Evelyn,

Could your team help to look at this issue first and figure out what is the root cause?
Thanks

--
Keven
Flags: needinfo?(kkuo) → needinfo?(ehung)
Tony, do we have this case in Smoke test? If not, I suggest we list it on per comment 4. This case seems a regression after Clock refactoring.
Flags: needinfo?(ehung) → needinfo?(tchung)
Keywords: regression
According to comment 1, we understand the mozSetMessageHandler was blocked by other modules loading, and some module could be redundant. My best guess is that we didn't do build optimization (squash a set of js into one file) for alarm screen, and we only optimized for app launch time. 

@Marcus, what do you think?
Flags: needinfo?(m)
Looking at the build_stage folder, all of the JS files are correctly optimized and concatenated into one file, so I don't think that's the cause. There have been other bugs filed about the delay between alarm firing and the screen showing up[1] as well.

If the Clock app itself is not running, the system has to first load the Clock app (index.html), and only then can it load the attention screen (onring.html); we do this because we need access to the database and other Clock code, which is loaded in the main frame. It's possible that there's just too much work being done there to perform well on lower-end devices. If that's the case, changing the loading mechanism may be nontrivial; currently we need most of the database code and other JS dependencies in order to properly retrieve alarm details and fire the attention screen. I'd have to look into this in more detail to know if there's any low-hanging fruit that could be lazy-loaded after we load the attention screen.

[1]: https://bugzilla.mozilla.org/show_bug.cgi?id=961989
Flags: needinfo?(m)
Chens, I simply printed the log on startup.js and active_alarm.js as below:

diff --git a/apps/clock/js/panels/alarm/active_alarm.js b/apps/clock/js/panels/alarm/active_alarm.js
index 32d0f89..c19908f 100644
--- a/apps/clock/js/panels/alarm/active_alarm.js
+++ b/apps/clock/js/panels/alarm/active_alarm.js
@@ -20,6 +20,7 @@ define(function(require) {
     this.alertWindow = new ChildWindowManager(
       window.location.origin + '/onring.html');

+    dump("==== mozSetMessageHandler " + new Date());
     // Handle the system's alarm event.
     navigator.mozSetMessageHandler('alarm', this.onMozAlarm.bind(this));
     window.addEventListener('test-alarm', this.onMozAlarm.bind(this));
diff --git a/apps/clock/js/startup.js b/apps/clock/js/startup.js
index dd1ed59..d7b741e 100644
--- a/apps/clock/js/startup.js
+++ b/apps/clock/js/startup.js
@@ -1,5 +1,6 @@
 'use strict';

+dump("==== startup.js called " + new Date());
 define('startup_init', function(require) {


and the results are:

in normal app launched from homescreen case: (no delay)

I/GeckoDump( 1707): ==== startup.js called Mon Sep 29 2014 04:43:48 GMT-0400 (EDT)
I/GeckoDump( 1707): ==== mozSetMessageHandler Mon Sep 29 2014 04:43:48 GMT-0400 (EDT)

in alarm screen case: (9 sec delay)

I/GeckoDump( 1861): ==== startup.js called Mon Sep 29 2014 04:47:00 GMT-0400 (EDT)
I/GeckoDump( 1861): ==== mozSetMessageHandler Mon Sep 29 2014 04:47:09 GMT-0400 (EDT)


According to comment 9, js files are concatenated into one, so it's not a js file optimization issue. As Marcus said, changing loading sequence might be nontrivial, but I'd like to know which part takes too much time on loading (or maybe there is no significant one..)
Could you help profiling the issue more? Thanks.
Flags: needinfo?(sku) → needinfo?(shchen)
Sure, I will take a look.
Flags: needinfo?(shchen)
Confirmed with partner, launch blocker. 


Hi Sherman, please kindly check this. Thank you very much!
blocking-b2g: 2.0? → 2.0+
Flags: needinfo?(shchen)
Flags: needinfo?(shchen)
Attached file clock app background
Set alarm and keep clock app in background, alarm rings immediately.
Attached file clock app killed
Set alarm and kill clock app, clock app takes 8+ seconds to load and ring alarm.
Hi! Chens,

How about keeping clock app in process? Just like we did in Tarako project.
Thanks

--
Keven
These two attachments break down loading time for each module, ring alarm with cold launch seems take a long time, tested on flame and b2g-desktop and both of them have same result. @Marcus, do you have any ideas?
Flags: needinfo?(m)
(In reply to Keven Kuo [:kkuo] from comment #15)
> Hi! Chens,
> 
> How about keeping clock app in process? Just like we did in Tarako project.
> Thanks
> 

I don't think in-process help, because it's not an identical issue with the tarako one.
The delay is not on cross process messaging, it's on loading JS modules. It doesn't matter which process the app's been launched, as long as it get killed when it's in background, these JS modules will be loaded back next time.
Made a experimental in-process patch and no luck, still need 8+ second for clock app to ring alarm.

One interesting finding worth to mention, normally if an alarm set and clock app killed, it takes 8+ seconds to ring. But if we manually launch clock app after the target alarm time, say alarm is set to 12:00:00 and we manually launch clock at 12:00:02, alarm will start ringing right after clock opened (<500ms). 

I will try to find the difference between launch by system message and launch manually, there could be something behind and causes the delay.
I could see why it would take maybe a second or two for the alarm to ring (i.e. the time for normal clock startup is 1 second, and maybe onring 500ms); but 7+ seconds seems unreasonably long for "only" the clock JS to load. :asuth mentioned that maybe the Clock process has too low of a thread priority, and it's getting starved trying to launch immediately into an attention screen? I have never investigated thread priorities for Clock.
Flags: needinfo?(m)
Launching in background should be the reason of slow start, if we don't put clock app in background when receiving an open-app system message there will be no issue, alarm will ring without significant delay.
(In reply to Marcus Cavanaugh [:mcav] (MoCo SF) from comment #19)
> :asuth mentioned that maybe the Clock process has too low of a thread
> priority, and it's getting starved trying to launch immediately into an
> attention screen? I have never investigated thread priorities for Clock.

The process priority is actually very high (NICE=0) due to pending system message.

According to Chens' investigation (comment 20), running Clock app in background causes the delay no matter how it's launched, i.e. if we force app being launched in background when pressing Clock app icon on homescreen, it also loads slower than it usually does.

So, maybe it's because some APIs don't work immediately in background (like setTimeout or requestAnimationFrame) so they cause the delay?
Flags: needinfo?(m)
Originally this thread noted that the long delay of load time was JS modules, which indicates to me that it's not necessarily a particular API, but something in the system causing the loading process to take an unusually long time. If Clock loads relatively quickly when launched normally, but 7x slower when launched in background mode, it seems like a system-level problem.
Flags: needinfo?(m)
This seems like a case where we want to see exactly what's happening on the device.  Ideally bug 831611 could be addressed so we could get the absolute truth, but failing that, perhaps profile.sh could be used to gather profiler traces across all apps?
Alarm also needs 8+ seconds to show up on 2.1 and master.

Device info (Flame-kk master):
Gaia-Rev        7918024c737c4570cacd784f267e28737ae05dea
Gecko-Rev       https://hg.mozilla.org/mozilla-central/rev/2114ef80f6ae
Build-ID        20141105160209
Version         36.0a1
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  27
FW-Date         Thu Sep  4 14:59:02 CST 2014
Bootloader      L1TC10011800

Device info (Flame-kk 2.1):
Gaia-Rev        9658b93b412bdcc0f953d668e8c8e68318c99fb8
Gecko-Rev       https://hg.mozilla.org/releases/mozilla-b2g34_v2_1/rev/76880403db44
Build-ID        20141105161202
Version         34.0
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  27
FW-Date         Thu Sep  4 14:59:02 CST 2014
Bootloader      L1TC10011800
(In reply to Andrew Sutherland [:asuth] from comment #23)
> This seems like a case where we want to see exactly what's happening on the
> device. 

Just want to clarify that this issue was not only happened on devices, but also happened on b2g desktop. It's platform independent.
(In reply to Evelyn Hung [:evelyn] from comment #25)
> (In reply to Andrew Sutherland [:asuth] from comment #23)
> > This seems like a case where we want to see exactly what's happening on the
> > device. 
> 
> Just want to clarify that this issue was not only happened on devices, but
> also happened on b2g desktop. It's platform independent.

That's great news!  Then it should be relatively easy to grab a linux perf trace!
(In reply to Evelyn Hung [:evelyn] from comment #7)
> Tony, do we have this case in Smoke test? If not, I suggest we list it on
> per comment 4. This case seems a regression after Clock refactoring.

No smoketest for a case like this. we can certainly add a functional regression test that gets covered during testrun.   flagging edchen to add a test for in-moztrap? into 2.2.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(tchung) → needinfo?(edchen)
Add testcase into regression testsuit.
https://moztrap.mozilla.org/manage/case/14934/
Flags: needinfo?(edchen)
Sherman, are you still investigating this issue?
Flags: needinfo?(shchen)
No, currently I don't have enough bandwidth on this.
Flags: needinfo?(shchen)
convinced partner not to block this bug for MP. push to 2.2 ?

Thanks!
blocking-b2g: 2.0+ → 2.2?
triage: not able to block on this one but will mark as priority and do some further investigation to better understand the root of the slowdown.
blocking-b2g: 2.2? → ---
tracking-b2g: --- → +
Attached file GitHub pull request
My hypothesis is that when in the background, the platform elongates the setTimeout callbacks to a slower schedule. alameda.js, the module loader, used those because native promises where not available. The attached pull request replaces alameda with the native promise version. This is the same alameda used in the current email app for 2.2.

This seemed to fix it for me. At least, I was able to get an 8 second delay with current master, then went much faster after using this change (reached active_alarm's onMozAlarm in 328 ms instead of 8124ms using a crude Date.now() timing log technique).

I would like to see confirmation from QA though that this does fix it. I have run it a few times and it seems to be better, but I do suspect it sometimes depends on what else is going on, how much of the device has been powered down, as I was able to get at least one fast load before this change.

I have tested with this change a few times, once with a 2 minute delay and screen off to hopefully get to a more quiescence state and it seems reliably fast.

Since native promises are used, and their internal queue execution could cause different async expectations that before, I would be cautious about uplifting to 2.1, make sure that the state on master is good with the change.

I waited for the try build to pass before opening this pull request, so hopefully that is a good overall sign.
Assignee: nobody → jrburke
Status: NEW → ASSIGNED
Attachment #8537525 - Flags: review?(m)
Hi James, I try to use AMD in BT app. I meet some performance issue while system message is launching BT app. Since BT app will show a pairing dialog while it's receiving bluetooth-pairing-request. It's not a good UX with some delay for showing a pairing dialog. 

In case of alarm in clock app, I think it's also an important performance issue here. My experiment is mentioned in https://bugzilla.mozilla.org/show_bug.cgi?id=1102796#c7

PS: My AMD version is same with settings app(alameda 0.2.0).

It's strange in case of system message(not activity request). But I might try your updating patch with 0.2.0-native-promise version. Then, share the result with you.:)
The AMD 0.2.0-native-promise version is improving performance in my experiment(https://bugzilla.mozilla.org/show_bug.cgi?id=1102796#c10). The script execution time[1] is reduced from 800 millisecond to 50 millisecond. 

[1] The time difference is from script loaded to required module's callback.

I think the improvement is useful in Clock app really.
Comment on attachment 8537525 [details] [review]
GitHub pull request

I'm amazed that the difference is so substantial. Thanks, James! Sorry it took me a while to get back to this.
Attachment #8537525 - Flags: review?(m) → review+
Merged in master:
https://github.com/mozilla-b2g/gaia/commit/bcd567333cfb396f4552144219ad07b254f485a1

from pull request:
https://github.com/mozilla-b2g/gaia/pull/26832

:ianlu, I followed the link to bug 1102796, and it seems the same change helped the Bluetooth app in a similar way. Good to hear!
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
See Also: → 1118272
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: