Closed Bug 1727943 Opened 3 years ago Closed 3 years ago

Migrate remaining Windows 10 x64 CCov suites from AWS to Azure

Categories

(Testing :: General, task)

Tracking

(firefox-esr91 fixed, firefox94 fixed)

RESOLVED FIXED
94 Branch
Tracking Status
firefox-esr91 --- fixed
firefox94 --- fixed

People

(Reporter: masterwayz, Assigned: masterwayz)

Attachments

(10 files)

No description provided.
Assignee: nobody → michelle
Status: NEW → ASSIGNED
Pushed by michelle@masterwayz.nl:
https://hg.mozilla.org/integration/autoland/rev/8339eb731281
Part 1: Migrate Windows 10 ccov from AWS to Azure r=jmaher
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 93 Branch
Status: RESOLVED → REOPENED
Keywords: leave-open
Resolution: FIXED → ---
Target Milestone: 93 Branch → ---
Status: REOPENED → ASSIGNED
Blocks: 1727793
No longer blocks: 1718290
Attachment #9240827 - Attachment description: Bug 1727943 - Migrate more Windows 10 ccov suites from AWS to Azure r=jmaher → Bug 1727943 - Part 2: Migrate more Windows 10 ccov suites from AWS to Azure r=jmaher
Attachment #9240827 - Attachment description: Bug 1727943 - Part 2: Migrate more Windows 10 ccov suites from AWS to Azure r=jmaher → Bug 1727943 - Part 2: Migrate mochitest-plain and jittest Windows 10 ccov suites from AWS to Azure r=jmaher
Attachment #9240827 - Attachment description: Bug 1727943 - Part 2: Migrate mochitest-plain and jittest Windows 10 ccov suites from AWS to Azure r=jmaher → Bug 1727943 - Part 2: Migrate mochitest, reftest and crashtest Windows 10 ccov suites from AWS to Azure r=jmaher
Pushed by michelle@masterwayz.nl:
https://hg.mozilla.org/integration/autoland/rev/9f89e5ae8b4b
Part 2: Migrate mochitest, reftest and crashtest Windows 10 ccov suites from AWS to Azure r=jmaher

As requested in Matrix, NI'ing :marco.
You can see the "blocking" meta bug for more info and also Bug 1718290.

Flags: needinfo?(mcastelluccio)
Flags: needinfo?(mcastelluccio)
Regressions: 1733505
No longer regressions: 1733505
See Also: → 1733505
Pushed by michelle@masterwayz.nl:
https://hg.mozilla.org/integration/autoland/rev/cc2c8ad78660
Part 3: Migrate wpt Windows 10 CCov suites from AWS to Azure r=jmaher,webdriver-reviewers,whimboo
https://hg.mozilla.org/integration/autoland/rev/77db0112e100
Part 4: Migrate jittest and xpcshell Windows 10 CCov suites from AWS to Azure r=jmaher
Keywords: leave-open
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 94 Branch

Joel, Michelle, I see jittest and xpcshell tests are no longer running under coverage on Windows.
Looking at https://hg.mozilla.org/mozilla-central/rev/77db0112e100, it seems they were disabled.
Why were they disabled? Can we revert that patch and keep them running on AWS for now until we figure out a solution?

Flags: needinfo?(michelle)
Flags: needinfo?(jmaher)

As part of a migration, we cannot be on the hook to investigate everything, and we need to migrate things in a timely fashion, which is why we filed this bug. There are often edge cases that can take weeks (we have already spent almost 2 weeks on ccov, longer than planned). With that said, I know Mark has a try push with the jit/ccov running on an upgraded disk instance:
https://treeherder.mozilla.org/jobs?repo=try&revision=3c1d45b320cfff3c7580d696587b5dd2980b0e81

This looks production ready and we should have the coverage soon.

Flags: needinfo?(michelle)
Flags: needinfo?(jmaher)

I understand the need for a timely resolution, but I was only contacted about this on Thursday night, and the suite disablings landed on Friday night, less than 24 hours later.

Next time, if you hit a code coverage-related problem, could you contact me or Calixte earlier so we can help you from the beginning? Our diagnosis was pretty quick, so we could have spared you the 2 weeks of investigation.

I didn't know who to talk to; I had to ask around for a few days after things started looking problematic. We need a clear ownership map for build variants. The failures were not showing up in code coverage code; they were showing up as test suites timing out, so we didn't think it was really code coverage related. I am refining this process so that, in the future, all stakeholders with jobs they care about will be made aware of a migration affecting them (i.e. you and Calixte would have known we were migrating Windows, with a rough timeline for when we would start, and a bug would have been filed so you both would know when the work was starting).

Thanks for the quick diagnosis, I am excited to see this all up and running this week.

(In reply to Joel Maher ( :jmaher ) (UTC -0800) from comment #16)

I didn't know who to talk to; I had to ask around for a few days after things started looking problematic. We need a clear ownership map for build variants. The failures were not showing up in code coverage code; they were showing up as test suites timing out, so we didn't think it was really code coverage related. I am refining this process so that, in the future, all stakeholders with jobs they care about will be made aware of a migration affecting them (i.e. you and Calixte would have known we were migrating Windows, with a rough timeline for when we would start, and a bug would have been filed so you both would know when the work was starting).

This sounds like a great idea in general. If jobs weren't all defined together in the same files we could piggyback on the module system Zeid is working on.

Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d5597b7c7a5b
run win10 ccov on -ssd instances. r=MasterWayZ
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e93f80ea6d58
turn on mochitest* tests that were skipped on win10-ccov and now pass. r=MasterWayZ

https://hg.mozilla.org/mozilla-central/rev/8339eb731281 seems to have disabled a few suites on CCOV (we discovered it in bug 1605650). Was it intentional?

Flags: needinfo?(jmaher)

It wasn't well documented whether it was intentional.

I see these missing:
test-coverage-wpt
desktop-screenshot-capture
firefox-ui-functional-local
firefox-ui-functional-remote
mochitest-a11y
mochitest-chrome
mochitest-chrome-gpu
mochitest-media
mochitest-plain-gpu
mochitest-webgpu
mochitest-webgl1-core
mochitest-webgl1-ext
mochitest-webgl2-core
mochitest-webgl2-ext
reftest
telemetry-tests-client
test-verify-wpt

:masterwayz, want to give those test suites a try on win10-ccov?

Flags: needinfo?(jmaher) → needinfo?(michelle)

Whoops, that was not intentional!
A try push has been made, though it may run more jobs than it should. I will contact you through Matrix once it is up again.

Flags: needinfo?(michelle)
Regressions: 1740155
Pushed by michelle@masterwayz.nl:
https://hg.mozilla.org/integration/autoland/rev/5324daf54f51
Re-enable Windows 10 x64 2004 CCov tests that were forgotten r=jmaher
Pushed by nfay@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8030fda0796e
Fix mistake in manifest edit r=jmaher a=fix CLOSED TREE
Attachment #9250091 - Attachment description: WIP: Bug 1727943 - Fix refest failure on Windows → Bug 1727943 - Fix refest failure on Windows
Pushed by michelle@masterwayz.nl:
https://hg.mozilla.org/integration/autoland/rev/d00a942b7b05
Fix another refest failure on Windows
Regressions: 1740413
No longer regressions: 1740413