Open Bug 1380416 Opened 4 years ago Updated 2 years ago

Investigate using hardlinks instead of symlinks in the dev build on mac/linux

Categories

(Core :: Security: Process Sandboxing, enhancement, P3)

enhancement

Tracking

()

People

(Reporter: Alex_Gaynor, Unassigned)

References

(Depends on 1 open bug)

Details

(Whiteboard: sb+)

Attachments

(2 files)

Currently development builds on macOS and Linux symlink files from the source directory into the build artifact, this makes builds much faster than copying all the files. It also presents a challenge for sandboxing - the content process needs permissions to read both inside the .app, and to wherever the symlink points.

We should investigate if hardlinks can solve this problem: The sandbox should only need whitelist'd access to the .app, not the source directory, since the files will point directly to the inode with the data, and builds should still be fast since they don't rely on copying tons of data.

If this works we can remove the MOZ_REPO_DIR dance I believe.
See Also: → 1380141
See Also: → 1380560
Whiteboard: sb+
:gps, does the possibility of using hardlinks here make sense to you? I know they're a relatively obscure feature, but I think they actually work very nicely here and let us remove a fair bit of complexity from the sandboxing and build processes.
Flags: needinfo?(gps)
(In reply to Alex Gaynor [:Alex_Gaynor] from comment #1)
> :gps, does the possibility of using hardlinks here make sense to you? I know
> they're a relatively obscure feature, but I think they actually work very
> nicely here and let us remove a fair bit of complexity from the sandboxing
> and build processes.

I think we should strongly consider copying instead of symlinking (or hardlinking), so that more of the build process looks the same across operating systems.  I have never investigated the performance difference, which might be significant.

Independently, bgrins and I have been investigating on generalizing some reload front-end workflows (used mostly in devtools land) that depend on having symlinks from the object directory to the source directory to larger parts of the code base.  I've thrown together a watchman file watcher and partial |mach build faster| implementation to support that work, which would also allow to copy rather than symlink for devtools.
Copying would also work from a sandboxing perspective, but I was previously told that the need to copy on windows was a big contributor to why windows builds were slower than macOS/Linux, and I don't want to propose anything which would be a regression for people's workflows! (Including my own ;-))
Ok, threw up a proof-of-concept patch.

Good news: It builds and runs!
Bad news: There's a bug where binaries are still symlinks (notably libsoftokn3.dylib), not hardlinks
Good news: It interacts exactly as I expected!
News: It's still a proof-of-concept and definitely requires cleanup :-)
(In reply to Alex Gaynor [:Alex_Gaynor] from comment #6)
> Ok, threw up a proof-of-concept patch.
> 
> Good news: It builds and runs!
> Bad news: There's a bug where binaries are still symlinks (notably
> libsoftokn3.dylib), not hardlinks
> Good news: It interacts exactly as I expected!
> News: It's still a proof-of-concept and definitely requires cleanup :-)

Awesome.

Regarding discussion of hard links, copying, and symlinks, using hard links is the one option that won't work if the object directory is configured to be on another filesystem.
Comment on attachment 8886679 [details]
Bug 1380416 - Part 1 - In development builds on macOS and Linux use hardlinks rather than symlinks for placing resources in the app bundle

https://reviewboard.mozilla.org/r/157464/#review162984

Not sure what you want from review here.  With the big TODO, it's not suitable for landing.

This has potential to change many things.  Flag gps and glandium to review, please.

::: config/nsinstall.c:413
(Diff revision 2)
> -						readlink(toname, buf, sizeof buf) != len ||
> +          (fromsb.st_dev != tosb.st_dev || fromsb.st_ino != tosb.st_ino)) {
> -						strncmp(buf, name, (unsigned int)len) != 0 || 
> -			((stat(name, &fromsb) == 0) &&
> -			 (fromsb.st_mtime > tosb.st_mtime) 
> -			 ))) {
> -                (void) (S_ISDIR(tosb.st_mode) ? rmdir : unlink)(toname);
> +        (void) (S_ISDIR(tosb.st_mode) ? rmdir : unlink)(toname);

nit: indentation looks strange here.

::: python/mozbuild/mozpack/files.py:395
(Diff revision 2)
>  
>          os.rename(temp_dest, dest)
>          return True
>  
>  
> +class HardlinkFile(File):

is this `AbsoluteHardlinkFile`?  If not, why not?

::: python/mozbuild/mozpack/files.py:431
(Diff revision 2)
> +            if st.st_dev == path_st.st_dev and st.st_ino == path_st.st_ino:
> +                return False
> +            # The dest exists and it points to the wrong place
> +            os.remove(dest)
> +
> +        # Now, either the dest used to exist and we just deleted it, or it never

nit: full sentence comment.
Attachment #8886679 - Flags: review?(nalexander) → review-
:nalexander mostly wanted confirmation that this wasn't totally unacceptable. Thanks!
I think using hardlinks/copies is reasonable. However, it will require builds in some scenarios where builds weren't required before (e.g. changing frontend files that are currently symlinked).

In addition, hardlinks don't work across filesystems/volumes. There is a cost to testing for that on every operation. I think we should test for symlinks and hardlinks in moz.configure and then propagate whatever method down to tools that establish files. This will keep the code for "installing files" simpler (we can code to a common "install" interface and choose an implementation from a configure-derived value) and will allow us to easily change and test "install" methods via configure flags. That will make it much easier to unlock future awesomeness, like the use of symlinks/junctions on NTFS.

Patches to add support for hardlinks to the install manifest processor and to detect filesystem linking support are encouraged. There are some performance concerns with the implementation. I'd encourage you to send patches my way.

Patches to actively move away from symlinks need to be weighed against build workflow concerns. I think we need to get some better tooling in place first. I may need to brush off the dust from an old patch series to integrate watchman to kick off background builds when certain files change...
Flags: needinfo?(gps)
(In reply to Gregory Szorc [:gps] from comment #13)
> In addition, hardlinks don't work across filesystems/volumes. There is a
> cost to testing for that on every operation.

If it helps, you can link() unconditionally and then fall back to copying if it fails with EXDEV.  It'll add an extra system call to the copying case, but that's already the slow path so it may not be significant.
Depends on: 1382697
Depends on: 1382739
Depends on: 1384224
Priority: -- → P3
See Also: → 1562684
You need to log in before you can comment on or make changes to this bug.