Closed Bug 1795158 Opened 2 years ago Closed 2 years ago

windows rust builds are failing to run on azure/py39

Categories

(Firefox Build System :: General, defect, P2)

Tracking

(firefox108 fixed)

RESOLVED FIXED
108 Branch
Tracking Status
firefox108 --- fixed

People

(Reporter: jmaher, Assigned: ahochheiden)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 1 obsolete file)

In order to migrate away from AWS, Windows builds need to be migrated. While doing this we end up with an updated mozilla-build package, which includes Python 3.9.10 (vs 3.6.5 in the old package).

There were some issues getting ccov/plain to work, but we have resolved those by using os.path.normcase and by fixing some CRLF vs \n line-ending mismatches.
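
For illustration, the kind of normalization described above can be sketched as follows. This is a minimal sketch and not the actual ccov/plain fix: `normalize_path` and `normalize_newlines` are hypothetical helper names, and `ntpath` (the Windows flavour of `os.path`) is used so that the Windows rules apply on any host.

```python
import ntpath  # Windows flavour of os.path, importable on every platform


def normalize_path(path):
    # Hypothetical helper: lowercase the path and turn "/" into "\",
    # which is what normcase does under Windows rules.
    return ntpath.normcase(path)


def normalize_newlines(text):
    # Hypothetical helper: collapse CRLF line endings to "\n".
    return text.replace("\r\n", "\n")


# Two spellings of the same (made-up) path compare equal once normalized.
same = normalize_path("Z:/Build/src/a.py") == normalize_path("z:\\build\\src\\a.py")

# Mixed line endings split into clean lines after normalization.
lines = normalize_newlines("line1\r\nline2\n").split("\n")
```

Note that normcase only changes case and separators. It does not resolve symlinks or mount points.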

Here is a try push with a failing br job (it includes a small hack to use -j16 to match the AWS jobs, which didn't help):

Z:\task_166568980078172\workspace\obj-build\ipc\ipdl\PBrowser.ipdl:0: error: Trying to load `PBrowser' from a file when we'd already seen it in file `z:\build\workspace\obj-build\ipc\ipdl\PBrowser.ipdl'
mozmake[4]: *** [Makefile:30: ipdl.track] Error 1

Looking in more detail, I noticed that on py36/AWS the build steps consistently run in one order, while on py39/Azure they consistently run in a different order. One simple example:

  • py36: dist_public, then dist_private
  • py39: dist_private, then dist_public

Another difference I see is the path to Python. On py36, we call:
z:/build/workspace/obj-build/_virtualenvs/build/Scripts/python.exe Z:/task/build/src/ipc/ipdl/ipdl.py

but on py39 we call:
Z:/task_XYZ/workspace/obj-build/_virtualenvs/build/Scripts/python.exe Z:/task/build/src/ipc/ipdl/ipdl.py

You can see two differences:

  • z: vs Z:
  • root dir is build vs task_XYZ
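
To make the two differences concrete, here is a small sketch that compares the two interpreter paths quoted above. It assumes nothing about the build system: it only shows that case normalization removes the `z:` vs `Z:` difference and leaves the `build` vs `task_XYZ` difference in place (`task_XYZ` is the placeholder used in this bug, not a real directory name).

```python
import ntpath  # apply Windows path rules regardless of the host OS

py36 = "z:/build/workspace/obj-build/_virtualenvs/build/Scripts/python.exe"
py39 = "Z:/task_XYZ/workspace/obj-build/_virtualenvs/build/Scripts/python.exe"

# Normalize case and separators, then split off the drive letter.
drive_a, rest_a = ntpath.splitdrive(ntpath.normcase(py36))
drive_b, rest_b = ntpath.splitdrive(ntpath.normcase(py39))

same_drive = drive_a == drive_b  # the drive letters now agree
same_rest = rest_a == rest_b     # the root directory still differs
```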

I don't know if these observations mean anything. I am not an expert at reading the tea leaves, so any advice or fixes would help!

:glandium - do you have any ideas on how to move forward?

Flags: needinfo?(mh+mozilla)

Add filename to the logged error in https://searchfox.org/mozilla-central/rev/76ccfc801e6b736c844cde3fddeab7a748fc8515/ipc/ipdl/ipdl/parser.py#67
Chances are one is in z:/build and the other in z:/task_XYZ. If you're lucky, it could "just" be / vs \
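
As a rough illustration of the suggestion, a duplicate-load check that reports both filenames might look like the sketch below. This is not the code in `parser.py`: `register`, `DuplicateLoadError`, and the `_seen` table are hypothetical names, and the message wording is only modeled on the error quoted in this bug.

```python
class DuplicateLoadError(Exception):
    """Hypothetical error type for this sketch."""


_seen = {}  # maps a protocol name to the file it was first loaded from


def register(name, filename):
    # Remember the first file a name was loaded from. A plain string
    # comparison is used here, so two spellings of one file look different.
    previous = _seen.setdefault(name, filename)
    if previous != filename:
        raise DuplicateLoadError(
            "filename `%s' :: Trying to load `%s' from a file when we'd "
            "already seen it in file `%s'" % (filename, name, previous)
        )


first = r"z:\build\workspace\obj-build\ipc\ipdl\PBrowser.ipdl"
second = r"Z:\task_XYZ\workspace\obj-build\ipc\ipdl\PBrowser.ipdl"

register("PBrowser", first)
try:
    register("PBrowser", second)
    message = None
except DuplicateLoadError as exc:
    message = str(exc)  # names both the new and the previously seen file
```

With both paths in the message, it is immediately visible whether they differ by directory, by case, or by separator.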

Flags: needinfo?(mh+mozilla)
Blocks: 1795165

https://treeherder.mozilla.org/jobs?repo=try&selectedTaskRun=RrsD4PoWTkKXheyXRC8Wcw.0&tier=1%2C2%2C3&revision=fd4566a589e6386168a80c8b5a786dbfafa9e690:
Z:\task_166569780467540\workspace\obj-build\ipc\ipdl\PBrowser.ipdl:0: error: filename `Z:\task_166569780467540\workspace\obj-build\ipc\ipdl\PBrowser.ipdl' :: Trying to load `PBrowser' from a file when we'd already seen it in file `z:\build\workspace\obj-build\ipc\ipdl\PBrowser.ipdl'

What is really interesting is that the error says the file under z:\task... was already seen under z:\build... I assume there is a symlink (on Windows?) where z:\build\workspace -> z:\task_xyz\workspace.

it's supposed to be a volume mount point IIRC.

Pete probably has some insight.

Flags: needinfo?(pmoore)
Severity: -- → S3
Priority: -- → P2

(In reply to Mike Hommey [:glandium] from comment #5)

> Pete probably has some insight.

I'm not sure how this was set up in firefox-ci. Any idea Mark?

Flags: needinfo?(pmoore) → needinfo?(mcornmesser)

> I'm not sure how this was set up in firefox-ci. Any idea Mark?

I have been looking into this as well. On the image configuration side we don't do much with regard to the Z drive. There is a script that mounts the drive, a script that does some cleanup if needed, and a registry setting pointing to Z as an error dump drive.

In the generic-worker config we are setting:

            "tasksDir": "Z:\\",
            "cachesDir": "Z:\\caches",
            "downloadsDir": "Z:\\downloads",

I don't see anything pre-task that could explain the above difference.

Flags: needinfo?(mcornmesser)
Assignee: nobody → ahochheiden
Status: NEW → ASSIGNED

I gave up on trying to find and fix the root cause of the problem (I spent way too much time on it), so that patch just fixes the symptom. It's kind of a hack, and I do feel a bit uneasy about 'fixing' something I don't fully understand, but at least this lets us move forward with Azure.

rusttests try run: https://treeherder.mozilla.org/jobs?repo=try&revision=3aca3dadacb116ca57035b0bd5e0550fbd9ce100&selectedTaskRun=d54i1l8rQmS0y7OTN8HwCQ.0
auto try run: https://treeherder.mozilla.org/jobs?repo=try&revision=3e3f1ef6153bc9df6d6b3e0351d88f531066d88b

Pushed by ahochheiden@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/87d747a7a020
Strip the two leading path segments to resolve an issue on Windows Rust builds caused by comparing paths from two different mount points r=jmaher
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 108 Branch
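
The approach named in that commit message can be sketched roughly as below. This is not the landed patch: `strip_leading_segments` is a hypothetical helper, and the assumption that the "two leading segments" are the drive anchor and the first directory is mine, not something stated in this bug.

```python
from pathlib import PureWindowsPath  # Windows path rules on any host


def strip_leading_segments(path, count=2):
    # Hypothetical helper: drop the first `count` components. For an
    # absolute Windows path, parts[0] is the anchor (e.g. "Z:\") and
    # parts[1] is the first directory under it.
    parts = PureWindowsPath(path).parts
    return PureWindowsPath(*parts[count:])


a = strip_leading_segments(r"Z:\task_XYZ\workspace\obj-build\ipc\ipdl\PBrowser.ipdl")
b = strip_leading_segments(r"z:\build\workspace\obj-build\ipc\ipdl\PBrowser.ipdl")

paths_match = a == b  # both reduce to the same relative path
```

This only hides the difference between the two mount points. It would also treat two genuinely different files as equal if they happened to share the same trailing path.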
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: 108 Branch → ---
Attachment #9302129 - Attachment is obsolete: true
Pushed by ahochheiden@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/476e32a794e8
Apply `os.path.realpath` to `topsrcdir` and `topobjdir` in `MozbuildObject` and in `site.py` r=firefox-build-system-reviewers,glandium
Status: REOPENED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 108 Branch
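
For readers unfamiliar with `os.path.realpath`, the sketch below shows why canonicalizing both paths makes them comparable. It is an illustration only, not the code in `MozbuildObject` or `site.py`: `canonical` is a hypothetical helper, and a symlink in a temporary directory stands in for the volume mount point discussed above. Creating symlinks on Windows may require extra privileges, and Python 3.8 and later resolve symbolic links and junctions in `os.path.realpath` on Windows. Whether that change explains the py36 vs py39 difference here is not established in this bug.

```python
import os
import tempfile


def canonical(path):
    # Hypothetical helper: resolve links first, then normalize case and
    # separators so the result is safe to compare as a string.
    return os.path.normcase(os.path.realpath(path))


with tempfile.TemporaryDirectory() as root:
    real_dir = os.path.join(root, "task_XYZ")  # placeholder name from this bug
    alias_dir = os.path.join(root, "build")
    os.mkdir(real_dir)
    # "build" is a second name for the same directory.
    os.symlink(real_dir, alias_dir, target_is_directory=True)

    via_real = os.path.join(real_dir, "PBrowser.ipdl")
    via_alias = os.path.join(alias_dir, "PBrowser.ipdl")
    open(via_real, "w").close()

    differ_as_strings = via_real != via_alias
    match_when_canonical = canonical(via_real) == canonical(via_alias)
```

Applying the canonicalization once, where the source and object directories are first recorded, means every later comparison sees a single spelling of each path.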

As a note, we will need to migrate this to all branches, including ESR.
