Closed
Bug 888765
Opened 11 years ago
Closed 11 years ago
Many tests on Win8 on nightly builds fail to start
Categories
(Toolkit :: Telemetry, defect)
RESOLVED
FIXED
mozilla25
People
(Reporter: philor, Assigned: jimm)
Details
(Whiteboard: [qa-])
Attachments
(1 file)
956 bytes, patch | bbondy: review+ | bajaj: approval-mozilla-aurora+
https://tbpl.mozilla.org/?rev=c5ce065936fa&showall=1&jobname=6.2.*pgo and then https://tbpl.mozilla.org/?rev=cbb24a4a96af&showall=1&jobname=6.2.*pgo - the first set of tests is on the periodic PGO build, the second plus the retriggers are on the nightly build.
The only difference between the two ought to be that the former has MOZ_UPDATE_CHANNEL=default and the latter has MOZ_UPDATE_CHANNEL=nightly, but then you have people like http://mxr.mozilla.org/mozilla-central/source/configure.in#8824 deciding that "nightly" means "shoot ourselves in the foot, but only once a day."
Assuming it's always going to happen to at least one suite, the regression range is the rather unwieldy https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=8e3a124c9c1a&tochange=c5ce065936fa
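The build-time decision described above can be modeled roughly as follows. This is only a sketch: the real check is a shell test in configure.in, and the function name and the set of channels used here are hypothetical, chosen for illustration.

```python
# Hypothetical model of the configure.in decision discussed above.
# Assumption: the update channel alone decides whether
# MOZ_TELEMETRY_ON_BY_DEFAULT gets defined; the channel set is illustrative.
TELEMETRY_DEFAULT_CHANNELS = frozenset({"nightly"})


def telemetry_on_by_default(update_channel: str,
                            channels: frozenset = TELEMETRY_DEFAULT_CHANNELS) -> bool:
    """Return True if a build for this channel would enable telemetry by default."""
    return update_channel in channels


# The periodic PGO build (channel "default") and the nightly build
# (channel "nightly") end up on different sides of the check.
print(telemetry_on_by_default("default"))
print(telemetry_on_by_default("nightly"))
```

Passing a different channel set (for example one that contains "default") models the kind of try-only change that flips the check for builds that are not on the nightly channel.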
Reporter
Comment 1•11 years ago
https://tbpl.mozilla.org/?tree=Try&rev=845a7ce7ddb0 is just PGO plus MOZ_TELEMETRY_ON_BY_DEFAULT, and I expect it will show the failure (https://tbpl.mozilla.org/?tree=Try&rev=80034f32f9c5, where I blew the talos part of the trychooser syntax, seems to have shown it); https://tbpl.mozilla.org/?tree=Try&rev=35e02f18786b is the same push with https://hg.mozilla.org/mozilla-central/rev/d8d194d3dcc1 backed out.
Component: General → Telemetry
Product: Core → Toolkit
Reporter
Comment 2•11 years ago
When this does prove to have been nightly-only bustage caused by telemetry's fondness for abusing MOZ_UPDATE_CHANNEL, this won't be the first time that fondness has caused nightly-only bustage, will it?
Reporter
Comment 3•11 years ago
Since it wasn't that changeset (the only thing in the range that mentioned telemetry in its commit message), I pushed https://tbpl.mozilla.org/?tree=Try&rev=f5efb2ec7814 with just PGO to make sure we can have a try build that *doesn't* show the bustage.
Comment 4•11 years ago
This needs an owner ASAP. Nightly-only test bustage is not acceptable.
Severity: normal → blocker
Flags: needinfo?(taras.mozilla)
Flags: needinfo?(dtownsend+bugmail)
Comment 5•11 years ago
mozilla-central is closed due to this
Reporter
Comment 6•11 years ago
And that PGO but not MOZ_TELEMETRY_ON_BY_DEFAULT try was green.
Unless reftests and talos are actually inheriting prefs from automation.py.in (which now gets its prefs from testing/profiles/prefs_general.js), and I don't think they are, the way that mochitests don't hit it probably means that having one or both of toolkit.telemetry.prompted and toolkit.telemetry.notifiedOptOut set to 999 avoids it.
Comment 7•11 years ago
Taras, please find an owner for this
Assignee: nobody → taras.mozilla
Flags: needinfo?(dtownsend+bugmail)
Reporter
Comment 8•11 years ago
rm http://mxr.mozilla.org/mozilla-central/source/configure.in?mark=8822-8827#8823 would reopen the tree.
Comment 9•11 years ago
(In reply to Dave Townsend (:Mossop) from comment #7)
> Taras, please find an owner for this
Seems to me Taras IS the owner for this, according to the wiki page for telemetry.
Comment 10•11 years ago
(In reply to Phil Ringnalda (:philor) from comment #6)
> mochitests don't hit it probably means that having one or both of
> toolkit.telemetry.prompted and toolkit.telemetry.notifiedOptOut set to 999
> avoids it.
That really shouldn't be the case, since both of those prefs are unused since bug 829881.
Comment 11•11 years ago
If we don't get an owner for this soon we'll just disable telemetry by default on nightlies
Flags: needinfo?(vdjeric)
Flags: needinfo?(nfroyd)
Flags: needinfo?(irving)
Flags: needinfo?(dteller)
Comment 12•11 years ago
(In reply to Dave Townsend (:Mossop) from comment #11)
> If we don't get an owner for this soon we'll just disable telemetry by
> default on nightlies
Good call! But on the wiki page Taras still identifies himself as the Directly Responsible Individual. Sounds to me like he should be the bug owner.
Comment 13•11 years ago
Disabling telemetry by default seems worse than nightly-only test failures.
Comment 14•11 years ago
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #13)
> Disabling telemetry by default seems worse than nightly-only test failures.
I don't see how. getting telemetry on builds we are not doing seems completely useless.
Comment 15•11 years ago
(In reply to Dave Townsend (:Mossop) from comment #11)
> If we don't get an owner for this soon we'll just disable telemetry by
> default on nightlies
Seems like bug 888927 will fix this and Gavin has already posted a patch in that bug.
(In reply to Bill Gianopoulos [:WG9s] from comment #14)
> (In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment
> #13)
> > Disabling telemetry by default seems worse than nightly-only test failures.
>
> I don't see how. getting telemetry on builds we are not doing seems
> completely useless.
How can we be getting telemetry on builds that are not being done?
Flags: needinfo?(nfroyd)
Comment 16•11 years ago
(In reply to Nathan Froyd (:froydnj) from comment #15)
> (In reply to Dave Townsend (:Mossop) from comment #11)
> > If we don't get an owner for this soon we'll just disable telemetry by
> > default on nightlies
>
> Seems like bug 888927 will fix this and Gavin has already posted a patch in
> that bug.
My reading is that tests fail when telemetry is enabled, so won't that just make tests fail in all builds, not just nightlies? Happy to hear otherwise.
Comment 17•11 years ago
(In reply to Nathan Froyd (:froydnj) from comment #15)
> (In reply to Dave Townsend (:Mossop) from comment #11)
> > If we don't get an owner for this soon we'll just disable telemetry by
> > default on nightlies
>
> Seems like bug 888927 will fix this and Gavin has already posted a patch in
> that bug.
>
> (In reply to Bill Gianopoulos [:WG9s] from comment #14)
> > (In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment
> > #13)
> > > Disabling telemetry by default seems worse than nightly-only test failures.
> >
> > I don't see how. getting telemetry on builds we are not doing seems
> > completely useless.
>
> How can we be getting telemetry on builds that are not being done?
That was kind of my whole point. I guess you figured it out.
Updated•11 years ago
Flags: needinfo?(dteller)
Comment 18•11 years ago
Jim, do any of the Windows 8 commits in the comment 0 regression range stand out to you as a possible culprit?
Updated•11 years ago
Flags: needinfo?(jmathies)
Assignee
Comment 19•11 years ago
On a cursory look through the blame, the only bug that pops out at me is bug 873073. But from a discussion on IRC with khuey, it sounds like if this were the cause, it would also happen on inbound builds.
FWIW, none of the Metro-specific landings here would get in the way of test startup. They are all front-end JS.
Flags: needinfo?(jmathies)
Comment 20•11 years ago
I've been running several try builds in an attempt to bisect:
https://tbpl.mozilla.org/?tree=Try&rev=e393837ecc16
https://tbpl.mozilla.org/?tree=Try&rev=82f9f73cf954
https://tbpl.mozilla.org/?tree=Try&rev=e3323fc7a7d1
https://tbpl.mozilla.org/?tree=Try&rev=5717788a2de4
Assuming that I didn't screw anything up, those builds have telemetry turned on and PGO turned on and are fully green. So anything that went wrong happened after:
https://hg.mozilla.org/mozilla-central/rev/cec705c00777
I would like to push more builds, but Try appears to be busted.
Comment 21•11 years ago
(In reply to Nathan Froyd (:froydnj) from comment #20)
> I've been running several try builds in an attempt to bisect:
Try builds by default don't get --enable-update-channel=nightly, as I understand it, so this might not be reproducible there without some build config changes.
Updated•11 years ago
Flags: needinfo?(taras.mozilla)
Updated•11 years ago
Flags: needinfo?(vdjeric)
Comment 22•11 years ago
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #21)
> (In reply to Nathan Froyd (:froydnj) from comment #20)
> > I've been running several try builds in an attempt to bisect:
>
> Try builds by default don't get --enable-update-channel=nightly, as I
> understand it, so this might not be reproducible there without some build
> config changes.
Try builds get run with --enable-update-channel=, so changes like:
https://hg.mozilla.org/try/rev/bcaf37d7c0e4
should be sufficient to turn on telemetry. I did screw up my previous builds: they didn't turn on MOZILLA_OFFICIAL, so telemetry was likely not running correctly.
I've run several win PGO try builds with the above:
https://tbpl.mozilla.org/?tree=Try&rev=572cb21c69e0
https://tbpl.mozilla.org/?tree=Try&rev=3175952c3a0b
https://tbpl.mozilla.org/?tree=Try&rev=2665c5b13d74
https://tbpl.mozilla.org/?tree=Try&rev=7bac89250f31
https://tbpl.mozilla.org/?tree=Try&rev=d94c47fba470
https://tbpl.mozilla.org/?tree=Try&rev=513e708b1169
https://tbpl.mozilla.org/?tree=Try&rev=533d5c8b74ae (results not in yet, try is very backed up)
and all of them are green. All those pushes are based off changesets in the regression range philor mentioned in comment 0. I suppose it's possible that something that wants MOZ_UPDATE_CHANNEL=nightly isn't getting properly triggered on those builds, which is what's needed to cause the (intermittent?) problems described in comment 0.
Reporter
Comment 23•11 years ago
You absolutely should be able to repro on try with that (or with less: what you did at first, with just the s/nightly/default/, was enough) plus PGO; I did so multiple times.
I think rather than "backed up" that last one is "thrown on the floor" and you'll have to repush it, though - there's a period during buildbot reconfigs when pushes just get ignored.
Comment 24•11 years ago
Finally got it down to a single push, behold:
good: https://tbpl.mozilla.org/?tree=Try&rev=f939bb960eb6
bad: https://tbpl.mozilla.org/?tree=Try&rev=613bd3e6acd6
The crashtest/reftest oranges seem to be either intermittent or caused/exacerbated by a later push. But the talos reds start showing up with:
https://hg.mozilla.org/mozilla-central/rev/5cd49ff35fb9
and even still appear to be intermittent (e.g. many retriggers of talos tests came up green in the bad push above). We also have pushes with patches after the bad landing:
https://tbpl.mozilla.org/?tree=Try&rev=fc7b16936ac8
https://tbpl.mozilla.org/?tree=Try&rev=6dc7fb98dd99
coming up with red talos. So I'm going to pin the tail on bug 873073. (Which I apparently can't block this bug on because s-g. Lovely.)
Jim, do you see how your patch for bug 873073 could be causing those talos hangs?
Flags: needinfo?(irving) → needinfo?(jmathies)
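The bisection in comment 24 had to cope with failures that only show up on some runs, so a single green run does not prove a push is good. A minimal sketch of that idea, assuming a hypothetical `run_once(rev)` callback that returns True for a green run (nothing here is an actual TBPL or Try API):

```python
from typing import Callable, Sequence


def looks_bad(rev: str, run_once: Callable[[str], bool], retriggers: int = 5) -> bool:
    """Treat rev as bad if any of `retriggers` runs comes back red."""
    return any(not run_once(rev) for _ in range(retriggers))


def first_bad(revs: Sequence[str], run_once: Callable[[str], bool],
              retriggers: int = 5) -> str:
    """Binary-search for the first bad revision.

    Assumes revs is ordered oldest to newest, has at least two entries,
    revs[0] is known good, and revs[-1] is known bad.
    """
    lo, hi = 0, len(revs) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if looks_bad(revs[mid], run_once, retriggers):
            hi = mid  # failure reproduced: the culprit is at or before mid
        else:
            lo = mid  # every run was green: the culprit is probably after mid
    return revs[hi]
```

With an intermittent failure, a bad revision can still pass every retrigger and be misclassified as good, so raising `retriggers` trades Try load for confidence in the result.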
Assignee
Comment 25•11 years ago
I really don't. As I understand it, the only difference between the builds that show the issue and those that don't is official branding plus telemetry. Not sure why creating a temporary window on startup would break things based on those changes.
Let's go ahead and back out 873073 all the same and confirm the problem goes away. If it does I can take a look at that bug again and push to try with different fixes to come up with a new fix that doesn't trigger this.
Flags: needinfo?(jmathies)
Assignee
Comment 26•11 years ago
Actually I might be able to fix this without the backout based on an idea I have. Need to do some try pushes.
Assignee: taras.mozilla → jmathies
Assignee
Comment 27•11 years ago
Moving this query call from the toolkit to the first toplevel window fixes the problem.
https://tbpl.mozilla.org/?tree=Try&showall=0&rev=5a2f9cedb166
Attachment #771677 - Flags: review?(netzen)
Assignee
Comment 28•11 years ago
Thanks for the bisection work Nathan!
Updated•11 years ago
Attachment #771677 - Flags: review?(netzen) → review+
Assignee
Comment 29•11 years ago
Comment 30•11 years ago
https://hg.mozilla.org/mozilla-central/rev/01b6ce24c59b
*fingers crossed*
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla25
Comment 31•11 years ago
Today's nightly is looking green! Thanks Jim :)
Assignee
Comment 32•11 years ago
Attachment #771677 - Flags: approval-mozilla-aurora?
Updated•11 years ago
Attachment #771677 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 33•11 years ago
status-firefox24: --- → fixed
status-firefox25: --- → fixed