Bug 951185 (Closed): opened 11 years ago, closed 11 years ago

increased variability in datazilla launch times across multiple apps starting Dec 16

Categories

(Firefox OS Graveyard :: General, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: bkelly, Assigned: huseby)

References

Details

(Keywords: perf, regression, Whiteboard: [c=automation p=2 s=2014.02.14 u=] [b2gperf_regression])

On Dec 17 we experienced a 50ms regression in launch time across multiple apps: http://mzl.la/1gDH7x9. From looking at the graphs it seems like the change could have been backed out quickly and then relanded on Dec 17. Based on the commit log, though, this is unrelated to the Dec 17 change, so I am writing it up as a separate bug. There were no gecko changes. Here is the gaia regression range: https://github.com/mozilla-b2g/gaia/compare/ff5a87e8c13c1...358cd74fd2b2ef
It seems that ever since that date/revision there has been a much higher degree of variability in the datazilla results. All apps seem to fluctuate up and down at the same time, but the regression range suggests there are no changes that would cause it. For example, see bug 951161. Also, I backed out bug 948856, which was clearly causing a settings regression. However, the datazilla results after the backout only show partial success: the minimum launch time dropped back to what we would expect, but a large percentage of the results remained elevated.
Summary: 50ms datazilla launch regression across multiple apps (Dec 16) → increased variability in datazilla launch times across multiple apps starting Dec 16
blocking-b2g: --- → 1.4?
Whiteboard: [c=automation p= s= u=] → [c=automation p= s= u=][b2gperf_regression]
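For anyone retracing this analysis, here is a minimal sketch of how that variability pattern could be quantified from raw launch-time replicates. The data layout, revision names, and numbers are illustrative assumptions, not real datazilla output:

    # Hypothetical sketch: quantify launch-time variability per gaia revision.
    # The data shape and every number below are made up for illustration.
    import statistics

    results = {
        "ff5a87e": [347, 351, 349, 352, 348],  # before Dec 16 (made-up values)
        "358cd74": [349, 402, 351, 398, 405],  # after Dec 16 (made-up values)
    }

    for rev, times in results.items():
        print(f"{rev}: min={min(times)} median={statistics.median(times)} "
              f"stdev={statistics.stdev(times):.1f}")

The pattern described in comment 1 (minimum back to normal after a backout, but many replicates still elevated) would show up here as a low minimum paired with a high median and standard deviation.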
Jonathan, our initial investigation into this doesn't show gecko or gaia changes that should cause this. Can you have someone look into whether Datazilla, b2gperf, or the hardware setup is part of this problem? Thanks, Mike
Flags: needinfo?(jgriffin)
There haven't been any changes to b2gperf or b2gpopulate since Dec 13. On the hardware side, we can ask davehunt and stephend if anything changed around that time. There was a change made to hamachi builds around that time (specifically changes on Dec 15 and 16), see https://git.mozilla.org/?p=external/caf/quic/lf/b2g/build.git;a=summary. Could this be responsible?
Flags: needinfo?(stephen.donner)
Flags: needinfo?(jgriffin)
Flags: needinfo?(dave.hunt)
We have had a couple of devices failing to flash, but that simply takes them out of the pool until it's addressed. Otherwise, I'm not aware of any hardware changes. The machine name (MAC address) is shown in datazilla, as are the firmware version details. It's possible that the devices are reporting different results; however, it appears that most of the results over the last seven days are from a single device. The default Jenkins behaviour is to build each job on the same node as the last run, so long as it's available.
Flags: needinfo?(dave.hunt)
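To test the single-device theory, one could bucket results by the machine tag datazilla records for each run. A sketch, assuming results are exported as (machine, launch time) pairs; the MAC addresses and timings are made up:

    # Illustrative only: group launch times by reporting device and compare
    # their distributions; the rows are assumptions, not a datazilla export.
    from collections import defaultdict
    import statistics

    rows = [
        ("a0:b1:c2:d3:e4:f5", 352),  # (machine tag, launch time in ms)
        ("a0:b1:c2:d3:e4:f5", 398),
        ("11:22:33:44:55:66", 350),
    ]

    by_machine = defaultdict(list)
    for machine, ms in rows:
        by_machine[machine].append(ms)

    for machine, times in sorted(by_machine.items()):
        stdev = statistics.stdev(times) if len(times) > 1 else 0.0
        print(machine, f"n={len(times)}",
              f"median={statistics.median(times):.0f}", f"stdev={stdev:.1f}")

If one device's distribution were clearly wider or shifted, the node-affinity behaviour described above would explain week-long runs of noisy results.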
It looks like the perf runs always happen against one of two devices (which appear in datazilla with distinct "machine" tags), and there doesn't appear to be any correlation between these two devices and changes in the perf metrics. As far as the Jenkins jobs themselves, we don't have history going back to Dec 17 for the mozilla-central perf job, but we do for b2g26...on Dec 16 for b2g26 perf jobs, we changed the build that was used from b2g.hamachi.mozilla-b2g26_v1_2.nonril.download to b2g.hamachi.mozilla-b2g26_v1_2.download. I don't know if this is just a name change or if we actually started using a different build, and I don't know if we did the same for mozilla-central jobs. Stephen or Dave, do you remember what this change implied? We may want to consider saving Jenkins job definitions in hg or github, to assist with this kind of archaeology in the future.
(In reply to Jonathan Griffin (:jgriffin) from comment #6)
> As far as the Jenkins jobs themselves, we don't have history going back to
> Dec 17 for the mozilla-central perf job, but we do for b2g26...on Dec 16
> for b2g26 perf jobs, we changed the build that was used from
> b2g.hamachi.mozilla-b2g26_v1_2.nonril.download to
> b2g.hamachi.mozilla-b2g26_v1_2.download. I don't know if this is just a
> name change or if we actually started using a different build, and I don't
> know if we did the same for mozilla-central jobs. Stephen or Dave, do you
> remember what this change implied?

We do have the history for the m-c build as well; it's here: http://qa-selenium.mv.mozilla.com:8080/view/B2G%20Perf/job/b2g.hamachi.mozilla-central.master.perf/jobConfigHistory/?

This was just a name change. Prior to this we moved away from the nightly builds packaged with the commercial RIL (which we weren't using), and we temporarily named the new download jobs to indicate they were not using the commercial RIL. Once they had been in use for a while we renamed the jobs, as the distinction was no longer necessary.

> We may want to consider saving Jenkins job definitions in hg or github, to
> assist with this kind of archaeology in the future.

It may well be worth revisiting this idea. It has been proposed in the past, but this Jenkins instance is often configured via the UI, and changing that to rely on submitting changes to a repo would be quite disruptive. Fortunately, the Job Configuration History plugin does a reasonable job of identifying configuration changes. We could consider moving the B2G jobs to a separate Jenkins instance backed by a repository, or find another way of keeping the history backed up and versioned.
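On the versioning idea: Jenkins serves each job's definition at /job/<name>/config.xml, so the perf job configs could be snapshotted into hg or git on a schedule without changing how the instance is configured. A rough sketch; the job list is incomplete and authentication is omitted, both assumptions:

    # Rough sketch: snapshot Jenkins job configs so they can be committed to
    # a repo. Jenkins exposes each job's XML at /job/<name>/config.xml.
    import pathlib
    import urllib.request

    JENKINS = "http://qa-selenium.mv.mozilla.com:8080"
    JOBS = ["b2g.hamachi.mozilla-central.master.perf"]  # extend with the b2g26 jobs

    for job in JOBS:
        url = f"{JENKINS}/job/{job}/config.xml"
        with urllib.request.urlopen(url) as resp:  # add an auth handler if required
            pathlib.Path(f"{job}.config.xml").write_bytes(resp.read())

Committing these snapshots after each change would give the same archaeology the Job Configuration History plugin provides, but versioned off the Jenkins box.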
(In reply to Jonathan Griffin (:jgriffin) from comment #4)
> There haven't been any changes to b2gperf or b2gpopulate since Dec 13. On
> the hardware side, we can ask davehunt and stephend if anything changed
> around that time.
>
> There was a change made to hamachi builds around that time (specifically
> changes on Dec 15 and 16), see
> https://git.mozilla.org/?p=external/caf/quic/lf/b2g/build.git;a=summary.
> Could this be responsible?

I haven't done anything to the devices in a long while (nor has Raymond), so, sorry, I'm unsure what this might be. (Just today, however, I did swap out the earbuds for https://bugzilla.mozilla.org/show_bug.cgi?id=942840#c23, just FYI.)
Flags: needinfo?(stephen.donner)
Please continue investigating. We won't block on this for now, until the perf team finds this to be a real issue.
Dave Hunt recently landed an improvement to b2gperf which ensures there are no alarms or other stateful information on the device. See bug 926454. This was enabled in datazilla around Jan 28. (I think it was enabled just after that nice improvement in launch times.) I'm hopeful this will solve the increased variability.
Assignee: nobody → dhuseby
Status: NEW → ASSIGNED
Whiteboard: [c=automation p= s= u=][b2gperf_regression] → [c=automation p=2 s= u=][b2gperf_regression]
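For context on the kind of change that landed, here is a hypothetical sketch of a pre-run state reset in the spirit of bug 926454. It is not the actual b2gperf patch; the alarm database path in particular is invented:

    # Hypothetical pre-run cleanup: wipe stateful data (e.g. alarms) so that
    # leftovers from earlier runs can't skew launch times. Not the real patch;
    # the database path is invented for illustration.
    import subprocess

    def adb(*args):
        subprocess.run(["adb"] + list(args), check=True)

    def reset_device_state():
        adb("shell", "stop", "b2g")   # stop the b2g process before touching its data
        adb("shell", "rm", "-f", "/data/b2g/alarms.sqlite")  # invented path
        adb("shell", "start", "b2g")  # restart so the next run starts clean

    reset_device_state()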
Sorry I didn't mention this in our meeting, but based on the last 3 days of output I think we can probably declare this fixed and WFM. Results have been pretty stable. To be extra safe we could wait until Monday.
(In reply to Ben Kelly [:bkelly] from comment #11)
> Sorry I didn't mention this in our meeting, but based on the last 3 days of
> output I think we can probably declare this fixed and WFM. Results have
> been pretty stable. To be extra safe we could wait until Monday.

Thx, Ben. As a note (and I'll send email about this, too), I'm also updating the vendor firmware/base build to the v1.2-device.cfg version, which I hear has some touch-screen/scrolling and potentially other stability fixes (as well as the ability to have unique device names in ADB).
Ben, I was just looking at the last 30 days of datazilla data and it looks like we're pretty stable again. The measured times are no longer fluctuating up and down like you described in comment 1. Can we close this out as fixed? -dave
Flags: needinfo?(bkelly)
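A sketch of how that "stable again" call could be made mechanical before closing: compute the coefficient of variation over a recent window of daily medians. The numbers and the 5% threshold are made up:

    # Illustrative stability check: declare the series stable when the
    # coefficient of variation over the last window stays under a threshold.
    import statistics

    daily_medians = [351, 349, 352, 350, 348]  # last few days, in ms (made up)
    WINDOW, THRESHOLD = 3, 0.05

    window = daily_medians[-WINDOW:]
    cv = statistics.stdev(window) / statistics.mean(window)
    print("stable" if cv < THRESHOLD else "still noisy", f"(cv={cv:.3f})")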
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Flags: needinfo?(bkelly)
Resolution: --- → FIXED
Resolution: FIXED → WORKSFORME
Whiteboard: [c=automation p=2 s= u=][b2gperf_regression] → [c=automation p=2 s=2014.02.14 u=] [b2gperf_regression]
blocking-b2g: 1.4? → ---