Closed Bug 971001 Opened 6 years ago Closed 6 years ago
AWSY mobile: runtimes spiked from ~4min to ~9min on Feb 4
Recent runs on areweslimyet.com/mobile have been failing due to hitting the 10-minute timeout I enforce for each run. It appears that the runs used to take about 4 minutes but recently jumped to taking 8+ minutes and often exceed the 10 minute timeout. Data coming shortly.
I scraped the harness logs to determine the time from the "Start proc org.mozilla.fennec" message to the "AWSY-ARMV6-DONE" message in logcat, and ran the script on my mozilla-inbound and fx-team runs. The times clearly show a jump from ~4 minutes to ~9 minutes. The relevant pushlog range for inbound is https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=529211698369&tochange=c8abc650f217 and the pushlog range for fx-team is https://hg.mozilla.org/integration/fx-team/pushloghtml?fromchange=7260560686bb&tochange=0965914f979c (which includes a merge that pulled over the inbound changes). So it seems pretty likely this was caused by one of the changes in the inbound pushlog range.
The most likely culprit is bug 625383. I'm going to verify by pushing latest m-c to try with and without that bug backed out to see if I get consistent results.
ARMv6 builds with and without the backout: https://tbpl.mozilla.org/?tree=Try&rev=48472661d4e5 https://tbpl.mozilla.org/?tree=Try&rev=294d9c8509fd Once these are done I'll run them through the harness.
Yeah with the backout build the runs are back down to ~4 minutes. Definitely caused by bug 625383.
weird - there is a tradeoff involved in 625383 - but nothing that should have an impact on this scale (for good or for bad). It can certainly be backed out, but I'm concerned its exposing a latent bug that will remain and be tripped over in the wild. Can I understand what the test is doing? Is it managable (small) enough to be able to get NSPR HTTP logging for the with and without use cases? https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging?redirectlocale=en-US&redirectslug=HTTP_Logging
The test harness basically loads 15 pages in separate tabs. There is a small proxy running on the device that funnels the requests over the ADB connection to a nginx server running on a desktop, and the network prefs are set up to use the proxy. The code that drives this at a high level can be found at  (in particular the openTab function at line 129 which opens each of the URLs at the top of the file, one at a time). I should be able to run it with NSPR logging, I'll give that a shot.  https://github.com/staktrace/awsy-armv6/blob/master/fennec-addon-awsy/bootstrap.js
Sorry for the large attachment, I didn't think it would get this big. This is the NSPR log using the http logging environment listed on the MDN page, WITH bug 625383 applied. I will upload the corresponding log with bug 625383 backed out as soon as it is done in a few minutes.
:kats thanks! That is a lot of data; I'm going to have to queue it to look at it properly. I'll back out 625383 while it sits in my queue. (meeting now - will do it later today)
backed out remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/a9f3f9263459
Thanks! I'll close this bug once AWSY catches up to this inbound cset.
Merged to m-c. Leaving open per comment 11. https://hg.mozilla.org/mozilla-central/rev/a9f3f9263459
The AWSY harness has caught up and is back down to 4-minute runtimes.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.