Closed
Bug 942840
Opened 11 years ago
Closed 10 years ago
Unable to measure app load time for Phone app on mozilla-central/master builds in CI
Categories
(Firefox OS Graveyard :: Gaia::Dialer, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
1.3 C3/1.4 S3(31jan)
People
(Reporter: davehunt, Unassigned)
References
Details
(Keywords: perf, Whiteboard: [c=automation p= s=2014.01.31 u=])
Attachments
(2 files, 1 obsolete file)
5.40 KB,
text/plain
|
Details | |
3.80 KB,
patch
|
Details | Diff | Splinter Review |
The b2g.hamachi.mozilla-central.master.perf job in Jenkins CI is failing to retrieve the cold_load_time metric when launching the Phone app. See http://qa-selenium.mv.mozilla.com:8080/view/B2G%20Perf/job/b2g.hamachi.mozilla-central.master.perf/620/consoleFull (requires VPN) $ b2gperf --delay=10 --sources=b2g-distro/sources.xml --testvars=/home/webqa/webqa-credentials/b2g/b2g-3.json --dz-project=b2g --dz-branch=master --dz-device=hamachi --dz-key=******** --dz-secret=******** Phone Contacts Messages Settings Gallery Video Music Camera Email Calendar Clock FM Radio Usage Template Browser 2013-11-25 04:34:36 | B2GPerfRunner | INFO | Running B2GPerfLaunchTest 2013-11-25 04:36:16 | B2GPerfRunner | INFO | Phone [1/30] Traceback (most recent call last): File "/var/jenkins/workspace/b2g.hamachi.mozilla-central.master.perf/.env/local/lib/python2.7/site-packages/b2gperf/b2gperf.py", line 378, in run raise MissingMetricError(self.app_name, metric, i) MissingMetricError: Phone missing cold_load_time metric in iteration 0 Traceback (most recent call last): File "/var/jenkins/workspace/b2g.hamachi.mozilla-central.master.perf/.env/local/lib/python2.7/site-packages/b2gperf/b2gperf.py", line 378, in run raise MissingMetricError(self.app_name, metric, i) MissingMetricError: Phone missing cold_load_time metric in iteration 0 ... Traceback (most recent call last): File "/var/jenkins/workspace/b2g.hamachi.mozilla-central.master.perf/.env/local/lib/python2.7/site-packages/b2gperf/b2gperf.py", line 244, in measure_app_perf test.run() File "/var/jenkins/workspace/b2g.hamachi.mozilla-central.master.perf/.env/local/lib/python2.7/site-packages/b2gperf/b2gperf.py", line 388, in run raise ExceededThresholdError() ExceededThresholdError: Exceeded failure threshold for gathering results
Reporter | ||
Comment 1•11 years ago
|
||
I've been unable to replicate this locally. I have set the log level on CI to DEBUG and switched the first two apps (Phone and Contacts) to see if this affects the first app or specifically the Phone app.
Flags: needinfo?(dave.hunt)
Reporter | ||
Comment 2•11 years ago
|
||
It appears that the issue is specific to the Phone app. From adding debug logging I can see that the very first launch of the app appears to provide a metric, but all subsequent launches fail. It could be that the app is not successfully killed. I will try to replicate this again locally, but may need somebody in MV to watch the device while I run tests remotely.
Flags: needinfo?(dave.hunt)
Reporter | ||
Updated•11 years ago
|
Summary: Unable to measure app load time for Phone app on mozilla-centra/master builds in CI → Unable to measure app load time for Phone app on mozilla-central/master builds in CI
Reporter | ||
Comment 3•11 years ago
|
||
I'm still unable to replicate this locally. Stephen: Could you (or someone else from MV) help me to investigate this? I can trigger a job but will need someone to be watching the phone to see what's happening.
Flags: needinfo?(stephen.donner)
Comment 4•11 years ago
|
||
(In reply to Dave Hunt (:davehunt) from comment #3) > I'm still unable to replicate this locally. Stephen: Could you (or someone > else from MV) help me to investigate this? I can trigger a job but will need > someone to be watching the phone to see what's happening. I'll try to help with this tomorrow, when I'm (hopefully) back in the office.
Comment 5•11 years ago
|
||
(In reply to Stephen Donner [:stephend] from comment #4) > (In reply to Dave Hunt (:davehunt) from comment #3) > > I'm still unable to replicate this locally. Stephen: Could you (or someone > > else from MV) help me to investigate this? I can trigger a job but will need > > someone to be watching the phone to see what's happening. > > I'll try to help with this tomorrow, when I'm (hopefully) back in the office. I'll try to help with this tomorrrow (be in earlier); if you'll be around too.
Reporter | ||
Comment 6•11 years ago
|
||
(In reply to Stephen Donner [:stephend] from comment #5) > (In reply to Stephen Donner [:stephend] from comment #4) > > (In reply to Dave Hunt (:davehunt) from comment #3) > > > I'm still unable to replicate this locally. Stephen: Could you (or someone > > > else from MV) help me to investigate this? I can trigger a job but will need > > > someone to be watching the phone to see what's happening. > > > > I'll try to help with this tomorrow, when I'm (hopefully) back in the office. > > I'll try to help with this tomorrrow (be in earlier); if you'll be around > too. This is also now affecting mozilla-aurora/v1.3. When Stephen observed this on Friday it appeared that the phone was doing nothing and sitting at the homescreen. I'm going to take a node offline and see if I can replicate this remotely in a reduced testcase.
Flags: needinfo?(stephen.donner)
Reporter | ||
Comment 7•11 years ago
|
||
Unable to replicate by logging in via SSH and triggering the Phone perf tests on the same device/gecko/gaia where the issue has been seen. I've created a new temporary job (b2g.hamachi.mozilla-central.master.perf.phone) that just runs this test on a dedicated node (b2g-8) with more iterations.
Comment 8•11 years ago
|
||
I ran http://qa-selenium.mv.mozilla.com:8080/job/b2g.hamachi.mozilla-central.master.perf.phone/11/console just now and watched it -- the Phone app definitely launches, then closes; rinse, lather, repeat, so I don't know why it's claiming that it's missing the perf metrics. Dave says he'll try adding debug to the job tomorrow, and we'll reconvene our investigations :-)
Reporter | ||
Comment 9•11 years ago
|
||
So this appears to be caused by the subsequent launches being warm. This implies that the Phone app is not completely killed when we start it. I've added some more debug and running it again now.
Reporter | ||
Comment 10•11 years ago
|
||
B2GPerfRunner DEBUG | Unlocking device B2GPerfRunner DEBUG | Killing all running apps B2GPerfRunner DEBUG | Returning to home screen B2GPerfRunner DEBUG | Waiting for 10 seconds B2GPerfRunner DEBUG | Running apps: Homescreen B2GPerfRunner DEBUG | Launching 'Phone' B2GPerfRunner DEBUG | Killing 'Phone' <--- we kill Phone here B2GPerfRunner DEBUG | Running apps: Homescreen <--- and now it's not running B2GPerfRunner DEBUG | Metric 'cold_load_time' returned: 817 B2GPerfRunner DEBUG | Waiting for 10 seconds B2GPerfRunner DEBUG | Running apps: Phone, Homescreen <--- but now it's running again!? B2GPerfRunner DEBUG | Launching 'Phone' As far as I can tell, it looks like the Phone app is being automatically started (perhaps in the background?) after it's killed. I'm not sure what might cause this, perhaps something to do with the SIM/carrier?
Reporter | ||
Comment 11•10 years ago
|
||
Moving the component to Dialer in case anyone is able to help. I will be removing this app from the tests running on Jenkins as it continues to fail.
Component: General → Gaia::Dialer
Reporter | ||
Comment 12•10 years ago
|
||
I've just replicated this manually. The Phone app is shown in cards view, I swipe up to kill it, then a short while later it's back! I'll attach the relevant section from logcat.
Reporter | ||
Comment 13•10 years ago
|
||
Reporter | ||
Comment 14•10 years ago
|
||
Etienne: Any idea why this is happening?
Flags: needinfo?(etienne)
Comment 15•10 years ago
|
||
The Dialer apps can be launched in background by a bunch of system messages [1]. It's probably the case here, but not sure which one. If you have everything set up to replicate the issue, can you try with the |mozSetMessageHandler|s commented? [1] https://github.com/mozilla-b2g/gaia/blob/master/apps/communications/dialer/js/dialer.js#L358-375
Flags: needinfo?(etienne)
Reporter | ||
Comment 16•10 years ago
|
||
My manually reproduced case may differ from what we're seeing in automation. It seems if I launch the Camera app, the Phone app is being launched. Then, even after closing the Camera app, the Phone app keeps coming back after I close it. (In reply to Etienne Segonzac (:etienne) from comment #15) > If you have everything set up to replicate the issue, can you try with the > |mozSetMessageHandler|s commented? Sorry, it's not clear to me what you're suggesting I change here. Could you provide more details?
Flags: needinfo?(etienne)
Comment 17•10 years ago
|
||
(In reply to Dave Hunt (:davehunt) from comment #16) > (In reply to Etienne Segonzac (:etienne) from comment #15) > > If you have everything set up to replicate the issue, can you try with the > > |mozSetMessageHandler|s commented? > > Sorry, it's not clear to me what you're suggesting I change here. Could you > provide more details? Here's a patch :) Just to be clear, we're purposefully breaking the incoming call support (and more) here in order to gain some information about what's causing the dialer to relaunch.
Flags: needinfo?(etienne)
Reporter | ||
Comment 18•10 years ago
|
||
I applied the patch and ran make reset-gaia but I can no longer replicate the issue. Would the patch have that effect? I'm not sure what to expect with all that commented out. :)
Flags: needinfo?(etienne)
Comment 19•10 years ago
|
||
(In reply to Dave Hunt (:davehunt) from comment #18) > I applied the patch and ran make reset-gaia but I can no longer replicate > the issue. Would the patch have that effect? I'm not sure what to expect > with all that commented out. :) Ok. So we just confirmed that the dialer is being launched by a system message [1]. So either the phone is receiving a call, or a notification is triggered etc... With this new patch you should be able to reproduce the issue, and the logcat (greped on "+++") will tell us what's causing the inadvertent launch. [1] https://groups.google.com/forum/?fromgroups=#!topic/mozilla.dev.webapi/o8bkwx0EtmM
Attachment #8362465 -
Attachment is obsolete: true
Flags: needinfo?(etienne)
Reporter | ||
Comment 20•10 years ago
|
||
It appears we've found the culprit... E/GeckoConsole(10156): Content JS LOG at app://communications.gaiamobile.org/dialer/js/dialer.js:177 in headsetCommandHandler: +++ waking up because of a headset message The devices that run the tests have a headset plugged in so they can run the radio tests, but why would this cause the Phone app the launch? I can confirm that removing the headset causes the behaviour to stop, but connecting it again does not cause it to start again. I can then cause it to occur again by manually clicking a button on the headset. It seems we have phantom headset button clicking somehow...
Flags: needinfo?(etienne)
Comment 21•10 years ago
|
||
(In reply to Dave Hunt (:davehunt) from comment #20) > I can then cause it to occur again by manually clicking a button on the > headset. It seems we have phantom headset button clicking somehow... Yep :/ Randy do you have any idea?
Flags: needinfo?(etienne) → needinfo?(rlin)
Comment 22•10 years ago
|
||
Do you use the headset in hamachi box? Or change with another one? It looks like that hamachi device isn't calibration the headset detection parameter very well.
Flags: needinfo?(rlin)
Updated•10 years ago
|
Whiteboard: [c=automation p= s= u=]
(In reply to Randy Lin [:rlin] from comment #22) > Do you use the headset in hamachi box? Or change with another one? > It looks like that hamachi device isn't calibration the headset detection > parameter very well. Apologies, all :-( Answering for Dave -- since the performance-measuring Hamachis are hosted by my team, in the Mountain View office -- most of them were actually using a different set of earbuds (we weren't careful in setting them up, clearly - some were ZTE/Inari, others, more generic earbuds). All Hamachis now are using the factory-provided Alcatel-Lucent earbuds - let's see how subsequent runs fare; again, sorry for this oversight - it's something we'll be sure to get right whenever we set up new phones in the future!
Reporter | ||
Comment 24•10 years ago
|
||
That appears to have fixed it!! This still seems like a valid bug to me though, why would a different set of headphones cause this behaviour?
Reporter | ||
Comment 25•10 years ago
|
||
I've raised bug 962996 for the issue and closing this as we can now run our automation again! :)
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Comment 26•10 years ago
|
||
I'm not seeing the Dialer app on Datazilla. Do we need to wait a bit more or is there another issue?
Comment 27•10 years ago
|
||
Never mind, I was looking for Dialer but it shows up as Phone.
Updated•10 years ago
|
Whiteboard: [c=automation p= s= u=] → [c=automation p= s=2014.01.31 u=]
Updated•10 years ago
|
Target Milestone: --- → 1.3 C3/1.4 S3(31jan)
You need to log in
before you can comment on or make changes to this bug.
Description
•