Closed Bug 1413630 Opened 7 years ago Closed 6 years ago

Update Windows automation to MozillaBuild 3.1

Categories

(Taskcluster :: General, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: marco, Assigned: pmoore)

References

Details

(Whiteboard: [stockwell infra])

Attachments

(2 files, 1 obsolete file)

32-bit Python can easily run out of memory (e.g. bug 1317041 and bug 1413015). We should switch to using 64-bit Python.
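For anyone wanting to confirm which flavour of Python a worker or local shell is actually running, here is a minimal sketch that uses only the standard library. The helper names `interpreter_bits` and `is_64bit` are made up for illustration and are not part of any Mozilla tooling.

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer width of the running interpreter, in bits."""
    # struct.calcsize("P") is the size of a C pointer in bytes.
    return struct.calcsize("P") * 8


def is_64bit():
    """True when the interpreter can address more than 4 GiB."""
    # sys.maxsize is 2**31 - 1 on 32-bit builds and 2**63 - 1 on 64-bit builds.
    return sys.maxsize > 2**32


if __name__ == "__main__":
    print("{}-bit Python {}".format(interpreter_bits(), sys.version.split()[0]))
```

Running this with the Python on a worker's PATH should show whether the interpreter is the 32-bit build, which is limited to a 32-bit address space and is the case this bug is about.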
Ted, what would be the process to use 64-bit Python instead of the 32-bit one?
Flags: needinfo?(ted)
We would need to update the Python that gets installed on the Taskcluster Windows AMIs to be 64-bit. pmoore can speak more intelligently about this than I can.

I think it should be possible to deploy a test AMI with this change, create test worker types that use it, and test the changes out on try.

MozillaBuild 3 shipped with a 64-bit Python (IIRC) and developers are using that to build Firefox, so I wouldn't expect any major surprises from the build system. It's possible that something in automation could break due to only having 32-bit wheels available or something like that. Our actual Python code in the tree shouldn't care about 32 vs. 64-bit.
Component: Build Config → General
Flags: needinfo?(ted)
Product: Core → Taskcluster
pmoore, can you help here?
Flags: needinfo?(pmoore)
I've opened https://github.com/mozilla-releng/OpenCloudConfig/pull/103 to update MozillaBuild to 3.0. I'll need help testing it.
Flags: needinfo?(pmoore) → needinfo?(rthijssen)
Depends on: 1344643
to test, modify the pr to only change the beta worker types:

- gecko-1-b-win2012-beta
- gecko-t-win7-32-beta
- gecko-t-win7-32-gpu-b
- gecko-t-win10-64-beta
- gecko-t-win10-64-gpu-b

once that gets merged and deployed, you can run builds and tests that use the beta worker types by pushing a change like this to try:

https://hg.mozilla.org/try/rev/5bbf8932fa07da469860fe75ccd643f625faaafa
Flags: needinfo?(rthijssen)
Looks like the builds are failing during installer packaging because they can't find UPX.
https://treeherder.mozilla.org/logviewer.html#?job_id=141975526&repo=try&lineNumber=36865

Version 3.0 did update UPX to version 3.94 and the paths in etc/profile.d/profile-extravars.sh accordingly. Do we have a hard-coded path to UPX somewhere else that needs updating?
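As a rough illustration of how a packaging step could avoid depending on one hard-coded UPX location, the sketch below checks an explicit override first and then searches PATH. The `UPX` environment variable name and the `find_upx` helper are assumptions made for this example; the thread does not say which variable or path the automation actually uses.

```python
import os
import shutil


def find_upx(env=None):
    """Return a path to upx, or None if it cannot be found.

    Assumption: an optional UPX variable may point directly at the binary.
    Otherwise the directories listed in PATH are searched.
    """
    env = os.environ if env is None else env
    explicit = env.get("UPX")  # hypothetical override variable
    if explicit and os.path.isfile(explicit):
        return explicit
    # shutil.which also applies PATHEXT on Windows, so "upx" matches upx.exe.
    return shutil.which("upx", path=env.get("PATH"))
```

With a lookup like this, a version bump that moves UPX to a new directory only needs the PATH (or the override) to be right, rather than every consumer's copy of the path.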
(In reply to Ryan VanderMeulen [:RyanVM] from comment #7)
> Looks like the builds are failing during installer packaging because they
> can't find UPX.
> https://treeherder.mozilla.org/logviewer.html#?job_id=141975526&repo=try&lineNumber=36865
> 
> Version 3.0 did update UPX to version 3.94 and the paths in
> etc/profile.d/profile-extravars.sh accordingly. Do we have a hard-coded path
> to UPX somewhere else that needs updating?

Yes, it looks like it's in OpenCloudConfig. I've submitted another PR to update the test worker types: https://github.com/mozilla-releng/OpenCloudConfig/pull/106.
New try build: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1f882bdf8c0bce30fb318b00bba7f283b34d3bb5.
It looks green enough, I think the oranges are known intermittents.
Updating summary to reflect where this bug is taking us.

Also, blocking python 3 tracking bug because MozillaBuild 3.0 will automagically get us Python 3 in automation \o/
Blocks: buildpython3
Summary: Use 64-bit Python in Windows builds → Update Windows automation to MozillaBuild 3.0
To get python3 (and nodejs if so desired), there will need to be an OpenCloudConfig update akin to https://github.com/mozilla-releng/OpenCloudConfig/commit/497ec413e962b3151259cd57e35abc988045f4f1 first.
Version 3.1 is out now. Let's use that instead.
Summary: Update Windows automation to MozillaBuild 3.0 → Update Windows automation to MozillaBuild 3.1
I would prefer switching to MozillaBuild 3.0 first and then MozillaBuild 3.1 to keep the number of changes low. The reason for this bug is just switching to 64-bit Python, so MozillaBuild 3.0 would be enough. Is there a compelling reason why we should switch to 3.1 directly?
Blocks: 1415163
Because the odds of anybody else shepherding it are slim to none.
What's the big concern? Can't we find out easily enough on Try if there's an issue with 3.1 or not?
I've just seen your announcement on dev-platform and there are very few changes, so I agree we can update to 3.1 directly.
Assignee: nobody → mcastelluccio
Status: NEW → ASSIGNED
Note that some of us are running into an issue on MozillaBuild 3.1 - ref bug 1415374.
See Also: → 1415374
No longer blocks: 1415163
Merged https://github.com/mozilla-releng/OpenCloudConfig/pull/105

Deploying to gecko-t-win10-64-cu gecko-t-win10-64-gpu gecko-t-win10-64 in https://tools.taskcluster.net/groups/L7EGT69hRImX655PGzNj3g

Thanks Marco!
MozillaBuild 3.1.1 has been released with fixes for some bad regressions in 3.1. Shouldn't you try that instead?

(unable to needinfo :grenade as he is away on PTO)
Flags: needinfo?(pmoore)
I think we might be OK, actually. The DLL issue only occurred if there was also a system Python in the path (which isn't the case for our CI machines AFAIK), and I believe the version of hg we use is independent of the one that comes with MozillaBuild as well.
And I suppose fsmonitor being enabled/disabled isn't a factor then? Just wanting to make sure this doesn't affect even more people.
Flags: needinfo?(ryanvm)
Yes, we should have Pete or Rob confirm it, but I'm pretty sure it's not an issue. The mercurial.ini we ship with MozillaBuild would very much be tied to the version of hg we ship on top of Python.

Or we can just use 3.1.1 and not have any lingering doubts :). On the bright side, I haven't gotten any reports of new issues with that release so far!
Flags: needinfo?(ryanvm)
The update on the build worker types was deployed 9 days ago and we haven't encountered any issues so far.
I would say let's close this now, and open a new bug if we decide to migrate from MozillaBuild 3.1 to 3.1.1.

Thanks Marco, Ryan, Rob, Greg, Gary, Ted, Henrik for your shared involvement on this - great to see it live in production! \o/
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Flags: needinfo?(pmoore)
Resolution: --- → FIXED
I've noticed that we haven't rolled this out to level 2 and 3 builders - it is only on level 1 builders at the moment. Should we roll this out to level 2 and 3 builders now?
Status: RESOLVED → REOPENED
Flags: needinfo?(mcastelluccio)
Resolution: FIXED → ---
Assignee: mcastelluccio → pmoore
Attachment #8948418 - Flags: review?(rthijssen)
It isn't needed for the code coverage builds, which only run on level 1 builders for now, but I expect we want this change on other builders too for Python 3.
Flags: needinfo?(mcastelluccio)
Attachment #8948418 - Attachment is obsolete: true
Attachment #8948418 - Flags: review?(rthijssen)
Attachment #8948420 - Flags: review?(rthijssen)
Thanks Marco!
Attachment #8948420 - Flags: review?(rthijssen) → review+
So guess who forgot to update the chain of trust key on gecko-3-b-win2012 and caused a tree closure.

SORRY!!!

I've rolled back gecko-3-b-win2012 for now.
I've triggered a redeployment...
We now have mozilla-build on all our 64 bit Windows worker types (so all Windows 10 and Windows Server 2012 variants).

Windows 7 worker types are all 32 bit, and remain on MozillaBuild 2.2.0.
Status: REOPENED → RESOLVED
Closed: 7 years ago → 6 years ago
Resolution: --- → FIXED
(In reply to Pete Moore [:pmoore][:pete] from comment #36)
> We now have mozilla-build on all our 64 bit Windows worker types (so all
> Windows 10 and Windows Server 2012 variants).

Correction:

We now have *MozillaBuild 3.1* on all our 64 bit Windows worker types (so all Windows 10 and Windows Server 2012 variants).
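The end state described here can be written down as a small lookup. This is only a sketch: it assumes the worker type names keep the `win7` / `win10` / `win2012` substrings seen in this bug, and `expected_mozillabuild` is a hypothetical helper, not something that exists in OpenCloudConfig.

```python
def expected_mozillabuild(worker_type):
    """Return the MozillaBuild version this bug reports for a worker type.

    Assumption: the OS is encoded in the worker type name, as in
    gecko-t-win7-32, gecko-t-win10-64 and gecko-3-b-win2012.
    """
    name = worker_type.lower()
    if "win7" in name:
        # Windows 7 workers are 32-bit and stay on the old release.
        return "2.2.0"
    if "win10" in name or "win2012" in name:
        # All 64-bit Windows workers were moved to 3.1.
        return "3.1"
    raise ValueError("unrecognised Windows worker type: " + worker_type)
```

For example, `expected_mozillabuild("gecko-3-b-win2012")` gives "3.1", while `expected_mozillabuild("gecko-t-win7-32-gpu-b")` gives "2.2.0".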
Whiteboard: [stockwell infra]