Open Bug 1321922 Opened 5 years ago Updated 8 months ago

Support symlinks on Windows

Categories

(Firefox Build System :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

People

(Reporter: gps, Unassigned)

References

(Depends on 1 open bug, Blocks 3 open bugs)

Details

Symlinks on NTFS exist. But they have historically required administrative privileges to create and therefore are effectively unusable by the build system.

This appears to be changing!

https://blogs.windows.com/buildingapps/2016/12/02/symlinks-windows-10/ documents the new hotness.

Hopefully a few months from now we can detect whether userland symlinks are supported on Windows and use them automatically. This should make a number of development related tasks on Windows significantly faster, as we'll no longer need to copy and keep synchronized thousands of files into the objdir.
They definitely work on the Taskcluster builders, we're doing all of our Windows TC builds from inside a symlinked directory there:
https://dxr.mozilla.org/mozilla-central/rev/6dbc6e9f62a705d5f523cc750811bd01c8275ec6/taskcluster/taskgraph/transforms/job/mozharness.py#224

I believe the Task user was explicitly granted symlink permission, however.
Blocks: 1326328
The last time we tried this we got hung up on the msys tools we were using for some parts of the build not supporting symlinks properly. I think we were using `tar` for copying files around or something like that and it blew up and corrupted the build.

A lot of that code has changed lately, so it would be worthwhile to try again.
> A lot of that code has changed lately, so it would be worthwhile to try
> again.

Please do :)  Thanks.
Product: Core → Firefox Build System
If someone wanted to take a crack at fixing this, the most useful place to start would probably be the mozpack code, since that's what we use to install files from install manifests, which covers things like `EXPORTS` and `FINAL_TARGET_FILES`. Currently the only way we turn on symlink usage is checking `hasattr(os, 'symlink')` here:
https://dxr.mozilla.org/mozilla-central/rev/0528a414c2a86dad0623779abde5301d37337934/python/mozbuild/mozpack/copier.py#302

We'd need to make this a slightly more complicated check. If the script is running on Windows, and developer mode is enabled (there's a registry key you can check: https://stackoverflow.com/a/41232108/69326 ). Then we'd need to actually implement `AbsoluteSymlinkFile.copy` for Windows:
https://dxr.mozilla.org/mozilla-central/rev/0528a414c2a86dad0623779abde5301d37337934/python/mozbuild/mozpack/files.py#321

Presumably you'd want to do that by calling `CreateSymbolicLinkW` directly, something like:
http://blog.bfitz.us/?p=2035

(note that the blog post in comment 0 mentions that you need to pass SYMBOLIC_LINK_FLAG_ALLOW_UNPRIVILEGED_CREATE (0x2) in the flags!)
I was thinking about this again today and tried out some sample code. This Python code works to create symlinks on Windows 10:
```
SYMBOLIC_LINK_FLAG_ALLOW_UNPRIVILEGED_CREATE = 0x2

def symlink(path, target):
    flags = SYMBOLIC_LINK_FLAG_ALLOW_UNPRIVILEGED_CREATE
    path_buf = ctypes.create_unicode_buffer(path)
    target_buf = ctypes.create_unicode_buffer(target)
    return ctypes.windll.kernel32.CreateSymbolicLinkW(path_buf, target_buf, flags) != 0
```

Here's a Python snippet to check if developer mode is enabled:
```
def developer_mode_enabled():
    import _winreg
    try:
        key = _winreg.OpenKey( _winreg.HKEY_LOCAL_MACHINE, r'SOFTWARE\Microsoft\Windows\CurrentVersion\AppModelUnlock')
        val, ty = _winreg.QueryValueEx(key, 'AllowDevelopmentWithoutDevLicense')
        if ty == _winreg.REG_DWORD:
            return val != 0
    except WindowsError:
        pass
    return False
```
You are my hero, Ted!
I have a ton of patches in bug 1382697 that basically implement this. Many of them can probably land with minimal refactoring. IIRC that bug suffered from classical "biting off more than you can chew." The work should definitely be split up to land the basic refactoring and support for symlinks before any of the larger, more controversial and dangerous changes.

nalexander was asking about this in Whistler. See also the work in bug 1382697, which got stalled out.

Python > 3.3 supports symbolic links out the box on Windows. We'll almost have symlink support for free once the build system switches... which it might actually be able to do already: configure requires python >= 3.5 being installed, and a lot of mozbuild code was made python3 ready recently. It might just be possible to run e.g. install-manifest through python3 instead of python2.

Well... except that it would require a python3 virtualenv we don't have.

I'd like to up-vote on this one. Specially, because when you make a mistake in build IDEs (like VS) by opening an obj dir copy of a header (by clicking an compilation error reference) and make the changes by mistake in that (dist) copy, you will likely loose your changes. This happens to me quite often even though I quite aware of that danger.

Can I somehow help pushing this forward?

Well... except that it would require a python3 virtualenv we don't have.

Is that still true, as I've seen quite a lot recent changes to support python3?

Flags: needinfo?(mh+mozilla)

Yes, Python3 has first-class support in the build system, is required to build, and we use a virtualenv for it. As of last week we use Python 3 for configure in 100% of cases as well.

See the comment in files.py from glandium's latest revision; as far as I know the "weird consequences on automation" are the only remaining blockers to using symlinks on Windows in the build system. I don't know that any of us have done any additional investigation into that though.

My gut feeling is that it's related to Windows version.

Flags: needinfo?(mh+mozilla)

Symlinks without admin previlege are only supported on Windows build 14972+ so it can be a problem if CI has a lower version of the OS.

CI is running Windows 2012 R2(?), whatever version of normal Windows that would correspond to. I wouldn't be surprised if it's older than that.

That would also explain why it worked just fine locally on my machine.

(In reply to Mike Hommey [:glandium] from comment #17)

CI is running Windows 2012 R2(?), whatever version of normal Windows that would correspond to. I wouldn't be surprised if it's older than that.

Server 2012 R2 corresponds to Windows 8.1. :-/

That just means we need to add a condition on the Windows version.

The page also says users should enable Developer Mode to get it. The relevant Python doc says it will throw OSError otherwise.

In practice, it doesn't throw OSError.

The doc also says it needs Python 3.8 while currently MozillaBuild provides 3.7.

In practice, it doesn't throw OSError.

You mean not on Windows, right?

On Windows, Py3.7:

$ python
Python 3.7.4 (tags/v3.7.4:e09359112e, Jul  8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.symlink("mozconfig", "mozconfig_symlink")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: symbolic link privilege not held

I mean on Windows 2012 R2, on automation, it doesn't throw.

I mean on Windows 2012 R2, on automation, it doesn't throw.

Possibly because the machine got an explicit permission for symlinks, per Comment 2.

I tried commenting out the Windows check on files.py and AFAICT the only failures I get are from xpcshell test in Windows 2012 asan CI job.

INFO -  check> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
INFO -  check> ..\..\build\src\testing\xpcshell\selftest.py:558: in assertTestResult
INFO -  check>     """ % ("passed" if expected else "failed", self.log.getvalue()))
INFO -  check> E   AssertionError: Tests should have passed, log:
INFO -  check> E   ========
INFO -  check> E   runxpcshelltests.py | using symbolizer at z:\build\workspace\obj-build\dist\bin\llvm-symbolizer.exe
INFO -  check> E   Found node at z:/build/fetches/node/node.exe
INFO -  check> E   Found moz-http2 at z:\build\build\src\testing\xpcshell\moz-http2\moz-http2.js
INFO -  check> E   Http3 server not found at z:/build/workspace/obj-build\dist\bin\http3server.exe. Tests requiring http/3 will fail.
INFO -  check> E   Running tests sequentially.
INFO -  check> E   SUITE-START | Running 1 tests
INFO -  check> E   profile dir is c:\users\task_1588180085\appdata\local\temp\xpcshell\xpcshellprofile
INFO -  check> E   TEST-START | test_add_test_simple.js
INFO -  check> E   TEST-UNEXPECTED-FAIL | test_add_test_simple.js | xpcshell return code: -2147483645
INFO -  check> E   TEST-INFO took 2104ms
INFO -  check> E   >>>>>>>
INFO -  check> E   PID 2608 | Couldn't convert chrome URL: chrome://branding/locale/brand.properties
INFO -  check> E   PID 2608 | [2608, Main Thread] WARNING: Failed to get directory to cache.: file z:/build/build/src/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp, line 88
INFO -  check> E   PID 2608 | [2608, Main Thread] WARNING: Failed to get directory to cache.: file z:/build/build/src/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp, line 88
INFO -  check> E   PID 2608 | [2608, Main Thread] WARNING: Failed to get directory to cache.: file z:/build/build/src/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp, line 88
INFO -  check> E   PID 2608 | [2608, Main Thread] WARNING: Failed to get directory to cache.: file z:/build/build/src/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp, line 88
INFO -  check> E   PID 2608 | [2608, Main Thread] WARNING: Failed to get directory to cache.: file z:/build/build/src/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp, line 88
INFO -  check> E   PID 2608 | [2608, Main Thread] WARNING: Could not get the program name for a cubeb stream.: 'NS_SUCCEEDED(rv)', file z:/build/build/src/dom/media/CubebUtils.cpp, line 363
RROR -  check> E   PID 2608 | JavaScript error: resource://testing-common/PromiseTestUtils.jsm, line 70: TypeError: can't access property "Debugging", JSMPromise is undefined
RROR -  check> E   PID 2608 | JavaScript error: z:\\build\\build\\src\\testing\\xpcshell\\head.js, line 311: TypeError: can't access property "QueryInterface", _fakeIdleService is undefined
INFO -  check> E   PID 2608 | [2608, Main Thread] WARNING: Extra shutdown CC: 'i < NORMAL_SHUTDOWN_COLLECTIONS', file z:/build/build/src/xpcom/base/nsCycleCollector.cpp, line 3359
INFO -  check> E   PID 2608 | Assertion failure: data, at z:/build/build/src/xpcom/base/nsCycleCollector.cpp:3757
INFO -  check> E   PID 2608 | #01: nsTArray_Impl<nsCOMPtr<nsISupports>,nsTArrayInfallibleAllocator>::ClearAndRetainStorage (z:\\build\\build\\src\\xpcom\\ds\\nsTArray.h:1544)
INFO -  check> E   PID 2608 | #02: nsTArray_Impl<nsCOMPtr<nsISupports>,nsTArrayInfallibleAllocator>::~nsTArray_Impl (z:\\build\\build\\src\\xpcom\\ds\\nsTArray.h:1109)
INFO -  check> E   PID 2608 | #03: mozilla::CycleCollectedJSContext::~CycleCollectedJSContext (z:\\build\\build\\src\\xpcom\\base\\CycleCollectedJSContext.cpp:116)
INFO -  check> E   PID 2608 | #04: XPCJSContext::~XPCJSContext (z:\\build\\build\\src\\js\\xpconnect\\src\\XPCJSContext.cpp:1031)
INFO -  check> E   PID 2608 | #05: XPCJSContext::~XPCJSContext (z:\\build\\build\\src\\js\\xpconnect\\src\\XPCJSContext.cpp:995)
INFO -  check> E   PID 2608 | #06: nsXPConnect::~nsXPConnect (z:\\build\\build\\src\\js\\xpconnect\\src\\nsXPConnect.cpp:131)
INFO -  check> E   PID 2608 | #07: nsXPConnect::~nsXPConnect (z:\\build\\build\\src\\js\\xpconnect\\src\\nsXPConnect.cpp:100)
INFO -  check> E   PID 2608 | #08: nsXPConnect::Release (z:\\build\\build\\src\\js\\xpconnect\\src\\nsXPConnect.cpp:53)
INFO -  check> E   PID 2608 | #09: nsXPConnect::ReleaseXPConnectSingleton (z:\\build\\build\\src\\js\\xpconnect\\src\\nsXPConnect.cpp:167)
INFO -  check> E   PID 2608 | #10: nsComponentManagerImpl::Shutdown (z:\\build\\build\\src\\xpcom\\components\\nsComponentManager.cpp:935)
INFO -  check> E   PID 2608 | #11: mozilla::ShutdownXPCOM (z:\\build\\build\\src\\xpcom\\build\\XPCOMInit.cpp:730)
INFO -  check> E   PID 2608 | #12: XRE_XPCShellMain (z:\\build\\build\\src\\js\\xpconnect\\src\\XPCShellImpl.cpp:1383)
INFO -  check> E   PID 2608 | #13: NS_internal_main (z:\\build\\build\\src\\js\\xpconnect\\shell\\xpcshell.cpp:66)
INFO -  check> E   PID 2608 | #14: wmain (z:\\build\\build\\src\\toolkit\\xre\\nsWindowsWMain.cpp:131)
INFO -  check> E   PID 2608 | #15: __scrt_common_main_seh (f:\\dd\\vctools\\crt\\vcstartup\\src\\startup\\exe_common.inl:288)
INFO -  check> E   PID 2608 | #16: BaseThreadInitThunk (C:\\Windows\\system32\\KERNEL32.DLL + 0x13d2)
INFO -  check> E   PID 2608 | #17: RtlUserThreadStart (C:\\Windows\\SYSTEM32\\ntdll.dll + 0x154e4)
INFO -  check> E   <<<<<<<
INFO -  check> E   INFO | Result summary:
INFO -  check> E   INFO | Passed: 0
INFO -  check> E   INFO | Failed: 1
INFO -  check> E   INFO | Todo: 0
INFO -  check> E   INFO | Retried: 0
INFO -  check> E   SUITE-END | took 2s
INFO -  check> E   Node moz-http2 server shutting down ...

Since PromiseTestUtils.jsm is copied into (OBJDIR)/_tests/modules as a symlink, I think the OS does create symlink but somehow ChromeUtils.import can't deal with symlinks.

This can also be locally reproduced on Windows 10 with Python 3.8 by ./mach build && ./mach python testing/xpcshell/selftest.py.

More digging says it hits Promise.jsm but mozIJSSubScriptLoader can't read the symlinked Promise-backend.js. The tests pass if obj-x86_64-pc-mingw32\dist\bin\modules\Promise-backend.js is replaced by an actual file.

Depends on: 1634446
No longer depends on: 1634446
Depends on: 1634454
Depends on: 1635428
Depends on: 1643072
Depends on: 1692300
You need to log in before you can comment on or make changes to this bug.