Closed Bug 864637 Opened 11 years ago Closed 11 years ago

11% Android Tp4 NoChrome regression on 2013-04-18 (caused by infrastructure change?)

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

ARM
Android
task
Not set
normal

Tracking

(fennec-)

RESOLVED DUPLICATE of bug 877779
Tracking Status
fennec - ---

People

(Reporter: mbrubeck, Unassigned)

References

Details

(Keywords: perf, regression)

Regression: Mozilla-Inbound - Tp4 Mobile NoChrome - Android 2.2 (Native) - 11.1% increase

    Previous: avg 705.762 stddev 13.063 of 12 runs up to 3297733a2661
    New     : avg 784.438 stddev 16.677 of 12 runs since 7b252af6f343
    Change  : +78.675 (11.1% / z=6.023)
    Graph   : http://mzl.la/13PDP41

Changeset range: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=3297733a2661&tochange=7b252af6f343
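
The figures in the report above can be cross-checked with a few lines of Python. This is only a sketch under one assumption: the z value appears to be the change divided by the previous window's standard deviation. That relationship is inferred from the numbers quoted in this bug, not taken from the alert script itself, and the function name `summarize_change` is hypothetical.

```python
def summarize_change(prev_avg, prev_stddev, new_avg):
    """Return (delta, percent, z) for a shift between two run windows.

    Assumption: z = delta / prev_stddev (inferred from the report's
    numbers, not from the alert script's source).
    """
    delta = new_avg - prev_avg
    percent = 100.0 * delta / prev_avg
    z = delta / prev_stddev
    return delta, percent, z


# Values copied from the regression report in this comment.
delta, percent, z = summarize_change(705.762, 13.063, 784.438)
print(f"{delta:+.3f} ({percent:.1f}% / z={z:.3f})")
```

The recomputed change can differ from the reported one in the last digit, presumably because the report works from unrounded averages.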

We initially suspected bug 859391, but the regression started (and persisted) after that patch was backed out, so it cannot be the cause.

Some other bugs are in code that is not built or not used on Android, so we can rule them out (bug 824963, bug 847354).

dzbarsky's changes appear trivial and unlikely to regress this benchmark. (Please correct me if I am wrong.)

That leaves only shu's changes from this push:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=993d97de1132

shu, is it possible that any of these patches hurt page load speed on Android/ARM?  The next step is probably to try backing out some or all of the patches, either on Try or inbound, to see if it fixes the regression.

I also retriggered Tp4 runs on a changeset from before the regression, to rule out an infrastructure glitch like bug 855044:
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=3297733a2661&jobname=tp4m
While I wouldn't rule out that my patches are the cause, none of them are architecture-specific.

I'm not exactly familiar with this test. What is it testing? All of these commits are for Ion compilation. Does this test run enough JS to trigger Ion?

993d97de1132 optimizes a particular IR in Ion for self-hosted code. I have a hard time seeing how it affects page load speed.

a1d95089b0b7 is PJS-specific. Unless the page being loaded uses ParallelArrays, I also don't see how it might cause problems.

d5719da78339 inlines 2 native calls that are used from inside self-hosted code, but it is not architecture-specific. If I had to pick, I would try backing this out first.
(In reply to Shu-yu Guo [:shu] from comment #1)
> I'm not exactly familiar with this test, what is it testing? All of these
> commits are for Ion compilation. Does this test run enough JS to trigger Ion?

The "TP" tests repeatedly load a set of HTML pages from top-500 sites, measure the time to the "load" event on each one, and then report the average.  I don't really know how much JavaScript ends up getting executed on this test, though I believe SpiderMonkey changes have helped/hurt it in the past.
(In reply to Matt Brubeck (:mbrubeck) from comment #0)
> I also retriggered Tp4 runs on a changeset from before the regression, to
> rule out an infrastructure glitch like bug 855044:
> https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=3297733a2661&jobname=tp4m

The results from this retrigger are *also* regressed, so maybe this is an infrastructure issue.  Running some more retriggers on this and other changesets to test the theory:
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&fromchange=77eba8ff0a5f&tochange=3297733a2661&jobname=tp4m
The retrigger results are consistent: All the values from newly-triggered jobs are "bad" (higher), even on changesets that previously returned "good" (lower) results.  So this must be something that changed outside of mozilla-central code, similar to bug 855044.
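
The reasoning above can be sketched in code: if fresh jobs on changesets that previously scored "good" now score "bad", the shift follows when the job ran rather than which changeset it ran on, which points outside the tree. This is a minimal illustration, assuming the two averages from the original report as reference levels. The function names and the sample retrigger values are hypothetical and are not data from these runs.

```python
GOOD_AVG = 705.762  # "Previous" average from the regression report
BAD_AVG = 784.438   # "New" average from the regression report


def label(value):
    """Label a Tp4 result by whichever reference average it is closer to."""
    return "good" if abs(value - GOOD_AVG) <= abs(value - BAD_AVG) else "bad"


def points_to_infrastructure(retriggers_on_good_changesets):
    """True if every fresh run on a previously good changeset is now bad."""
    runs = list(retriggers_on_good_changesets)
    # An empty list proves nothing, so it does not count as evidence.
    return bool(runs) and all(label(v) == "bad" for v in runs)


# Hypothetical retrigger values, for illustration only.
sample = [781.0, 790.5, 779.2]
print(points_to_infrastructure(sample))
```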

Moving to releng for investigation.  Un-CC-ing shu and dzbarsky; thanks for looking at this and sorry for the false alarm.  Un-nominating for tracking-firefox23.
No longer blocks: 844887
Component: General → Release Engineering: Platform Support
Depends on: 855044
Product: Firefox for Android → mozilla.org
QA Contact: coop
Version: Firefox 23 → other
Summary: 11% Android Tp4 NoChrome regression on 2013-04-18 → 11% Android Tp4 NoChrome regression on 2013-04-18 (caused by infrastructure change?)
Possible reversal:

Message: 1
Date: Wed, 08 May 2013 19:22:34 -0000
From: nobody@cruncher.build.mozilla.org
To: dev-tree-management@lists.mozilla.org
Subject: (Improvement) Mozilla-Inbound - Tp4 Mobile NoChrome - Android 2.2 (Native) - 7.94%
Message-ID:
        <20130508192234.B724F104133@cruncher.srv.releng.scl3.mozilla.com>
Content-Type: text/plain; charset="us-ascii"

Improvement: Mozilla-Inbound - Tp4 Mobile NoChrome - Android 2.2 (Native) - 7.94% decrease
------------------------------------------------------------------------------------------
    Previous: avg 784.767 stddev 10.667 of 12 runs up to revision ef2134c93dae
    New     : avg 722.475 stddev 26.427 of 12 runs since revision 069c966819d6
    Change  : -62.292 (7.94% / z=5.840)
    Graph   : http://mzl.la/13inFxC

Changeset range: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=ef2134c93dae&tochange=069c966819d6

Changesets:
  * http://hg.mozilla.org/integration/mozilla-inbound/rev/92a5a4a9b76b
    : Hannes Verschore <hv1989@gmail.com> - Bug 864468: IonMonkey: Skip argument type checks when type is known to match, r=jandem
    : http://bugzilla.mozilla.org/show_bug.cgi?id=864468

  * http://hg.mozilla.org/integration/mozilla-inbound/rev/e698d3534b96
    : Gian-Carlo Pascutto <gpascutto@mozilla.com> - Backed out changeset 4b3a3f40730f (Bug 863290) for Android mochi-4 orange.
    : http://bugzilla.mozilla.org/show_bug.cgi?id=863290

  * http://hg.mozilla.org/integration/mozilla-inbound/rev/069c966819d6
    : Nicholas D. Matsakis <nmatsakis@mozilla.com> - Bug 865028 - Fuse ParallelDo and ForkJoin r=shu
    : http://bugzilla.mozilla.org/show_bug.cgi?id=865028

Bugs:
  * http://bugzilla.mozilla.org/show_bug.cgi?id=865028 - PJS: Fuse ParallelDo and ForkJoin
  * http://bugzilla.mozilla.org/show_bug.cgi?id=863290 - crash in webrtc::videocapturemodule::DeviceInfoAndroid::NumberOfDevices
  * http://bugzilla.mozilla.org/show_bug.cgi?id=864468 - IonMonkey: Skip argument type check when type is known to match
Now we see something very similar (to the originally reported regression) on mozilla-beta. From dev-tree-management Digest, Vol 54, Issue 6:

Message: 2
Date: Mon, 03 Jun 2013 15:42:19 -0000
From: nobody@cruncher.build.mozilla.org
To: dev-tree-management@lists.mozilla.org
Subject: <Regression> Mozilla-Beta - Tp4 Mobile NoChrome - Android 2.2 (Native) - 10.6%
Message-ID:
        <20130603154219.3EBFA191B@cruncher.srv.releng.scl3.mozilla.com>
Content-Type: text/plain; charset="us-ascii"

Regression: Mozilla-Beta - Tp4 Mobile NoChrome - Android 2.2 (Native) - 10.6% increase
--------------------------------------------------------------------------------------
    Previous: avg 710.633 stddev 13.730 of 12 runs up to revision 8ca75debb97a
    New     : avg 786.271 stddev 21.974 of 12 runs since revision 3fca8ec83316
    Change  : +75.638 (10.6% / z=5.509)
    Graph   : http://mzl.la/11RnLxc

Changeset range: http://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=8ca75debb97a&tochange=3fca8ec83316

Changesets:
  * http://hg.mozilla.org/releases/mozilla-beta/rev/51c78e9573f4
    : ffxbld - Automated checkin: version bump for firefox 22.0b3 release. DONTBUILD CLOSED TREE a=release

  * http://hg.mozilla.org/releases/mozilla-beta/rev/928e04dcafff
    : ffxbld - Added FIREFOX_22_0b3_RELEASE FIREFOX_22_0b3_BUILD1 tag(s) for changeset 51c78e9573f4. DONTBUILD CLOSED TREE a=release

  * http://hg.mozilla.org/releases/mozilla-beta/rev/91d9aba291a0
    : ffxbld - Automated checkin: version bump for fennec 22.0b3 release. DONTBUILD CLOSED TREE a=release

  * http://hg.mozilla.org/releases/mozilla-beta/rev/468705b10735
    : ffxbld - Added FENNEC_22_0b3_RELEASE FENNEC_22_0b3_BUILD1 tag(s) for changeset 91d9aba291a0. DONTBUILD CLOSED TREE a=release

  * http://hg.mozilla.org/releases/mozilla-beta/rev/3fca8ec83316
    : Chris Peterson <cpeterson@mozilla.com> - Bug 776223 - Catch NullPointerException from Samsung's buggy clipboard API. r=blassey a=akeybl
    : http://bugzilla.mozilla.org/show_bug.cgi?id=776223

Bugs:
  * http://bugzilla.mozilla.org/show_bug.cgi?id=776223 - java.lang.NullPointerException: at android.content.ClipboardManager.setPrimaryClip(ClipboardManager.java) on Samsung devices running ICS
tracking-fennec: --- → ?
I would be VERY surprised if bug 776223 caused page load times to increase 10%. That bug just wrapped a try/catch around a call to setClipboardText(). That said, the other changesets in that pushlog are only bumping version numbers and tags. <:\
Minusing based on the assumption that this is due to an infra change. Renominate if that is not the case.
tracking-fennec: ? → -
See Also: → 877779
Here's the list of releng/infra changes from that period:

https://wiki.mozilla.org/ReleaseEngineering:BuildbotMasterChanges:Archive#Q2_.26_Q1

Specifically, here's what changed on 2013-04-18:

* bug 855492 - install gstreamer devel packages in mock environments for try,
* bug 857697 - Use x86_64 build environment for linux and linux-debug builds on the "date" branch, and install the necessary i686 -devel packages.
* bug 860246 - Do linux32 on date and no b2g.
* bug 860246 - Fixups.
* bug 860766 - Remove repurposed bld-linux64 machines.
* bug 862275 - support schedulers in setup-masters.py (round 2)
* bug 862275 - support schedulers in setup-masters.py,
* bug 862545 - Disable signing for b2g.
* Backout bug 862275 - for test failures
* Make deduping less noisy, and only enabled in test mode
* Update release config for Fennec-21.0b3-build1
* Update release config for Firefox-21.0b3-build1 

Nothing is leaping out at me there.
This was probably caused by the same external network dependency as bug 877779.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE
Product: mozilla.org → Release Engineering
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard