Intermittent FileNotFoundError: [Errno 2] No such file or directory: 'S.gpg-agent.browser'

RESOLVED FIXED in Firefox 67

Status

defect
P5
normal
RESOLVED FIXED
9 months ago
2 months ago

People

(Reporter: intermittent-bug-filer, Assigned: glandium)

Tracking

({intermittent-failure})

unspecified
mozilla67

Firefox Tracking Flags

(firefox67 fixed)

Details

(Whiteboard: [stockwell unknown])

Attachments

(1 attachment)

Reporter

Description

9 months ago
treeherder
Filed by: ncsoregi [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=198755023&repo=autoland

https://queue.taskcluster.net/v1/task/FEgbpyYWQwWoBMCvH030Kw/runs/0/artifacts/public/logs/live_backing.log

[vcs 2018-09-11T22:59:22.134Z] TinderboxPrint:<a href=https://hg.mozilla.org/integration/autoland/rev/23860890d959f318130e619976cbeae4472937a3 title='Built from autoland revision 23860890d959f318130e619976cbeae4472937a3'>23860890d959f318130e619976cbeae4472937a3</a>
[task 2018-09-11T22:59:22.134Z] executing ['bash', '-c', 'cd /builds/worker && workspace/build/src/taskcluster/scripts/misc/build-mingw32-nsis.sh']
[task 2018-09-11T22:59:22.137Z] 
[task 2018-09-11T22:59:22.137Z] # We set the INSTALL_DIR to match the directory that it will run in exactly,
[task 2018-09-11T22:59:22.137Z] # otherwise we get an NSIS error of the form:
[task 2018-09-11T22:59:22.137Z] #   checking for NSIS version...
[task 2018-09-11T22:59:22.137Z] #   DEBUG: Executing: `/home/worker/workspace/build/src/mingw32/
[task 2018-09-11T22:59:22.137Z] #   DEBUG: The command returned non-zero exit status 1.
[task 2018-09-11T22:59:22.137Z] #   DEBUG: Its error output was:
[task 2018-09-11T22:59:22.137Z] #   DEBUG: | Error: opening stub "/home/worker/workspace/mingw32/
[task 2018-09-11T22:59:22.137Z] #   DEBUG: | Error initalizing CEXEBuild: error setting
[task 2018-09-11T22:59:22.137Z] #   ERROR: Failed to get nsis version.
[task 2018-09-11T22:59:22.137Z] 
[task 2018-09-11T22:59:22.137Z] WORKSPACE=$HOME/workspace
[task 2018-09-11T22:59:22.137Z] + WORKSPACE=/builds/worker/workspace
[task 2018-09-11T22:59:22.137Z] HOME_DIR=$WORKSPACE/build
[task 2018-09-11T22:59:22.137Z] + HOME_DIR=/builds/worker/workspace/build
[task 2018-09-11T22:59:22.137Z] INSTALL_DIR=$WORKSPACE/build/src/mingw32
[task 2018-09-11T22:59:22.137Z] + INSTALL_DIR=/builds/worker/workspace/build/src/mingw32
[task 2018-09-11T22:59:22.137Z] TOOLTOOL_DIR=$WORKSPACE/build/src
[task 2018-09-11T22:59:22.137Z] + TOOLTOOL_DIR=/builds/worker/workspace/build/src
[task 2018-09-11T22:59:22.137Z] UPLOAD_DIR=$HOME/artifacts
[task 2018-09-11T22:59:22.137Z] + UPLOAD_DIR=/builds/worker/artifacts
[....]
[task 2018-09-11T23:03:44.924Z] scons: done building targets.
[task 2018-09-11T23:03:45.074Z] 
[task 2018-09-11T23:03:45.074Z] # --------------
[task 2018-09-11T23:03:45.074Z] 
[task 2018-09-11T23:03:45.074Z] cd $WORKSPACE/build/src
[task 2018-09-11T23:03:45.074Z] + cd /builds/worker/workspace/build/src
[task 2018-09-11T23:03:45.074Z] tar caf nsis.tar.xz mingw32
[task 2018-09-11T23:03:45.074Z] + tar caf nsis.tar.xz mingw32
[task 2018-09-11T23:03:46.769Z] 
[task 2018-09-11T23:03:46.769Z] mkdir -p $UPLOAD_DIR
[task 2018-09-11T23:03:46.769Z] + mkdir -p /builds/worker/artifacts
[task 2018-09-11T23:03:46.770Z] cp nsis.tar.* $UPLOAD_DIR
[task 2018-09-11T23:03:46.770Z] + cp nsis.tar.xz /builds/worker/artifacts
[fetches 2018-09-11T23:03:46.772Z] removing /builds/worker/workspace/build
Traceback (most recent call last):
  File "/builds/worker/bin/run-task", line 754, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/builds/worker/bin/run-task", line 749, in main
    shutil.rmtree(fetches_dir)
  File "/usr/lib/python3.5/shutil.py", line 480, in rmtree
    _rmtree_safe_fd(fd, path, onerror)
  File "/usr/lib/python3.5/shutil.py", line 418, in _rmtree_safe_fd
    _rmtree_safe_fd(dirfd, fullname, onerror)
  File "/usr/lib/python3.5/shutil.py", line 438, in _rmtree_safe_fd
    onerror(os.unlink, fullname, sys.exc_info())
  File "/usr/lib/python3.5/shutil.py", line 436, in _rmtree_safe_fd
    os.unlink(name, dir_fd=topfd)
FileNotFoundError: [Errno 2] No such file or directory: 'S.gpg-agent.browser'
[taskcluster 2018-09-11 23:03:47.115Z] === Task Finished ===
[taskcluster 2018-09-11 23:03:48.775Z] Unsuccessful task run with exit code: 1 completed in 412.955 seconds
https://wiki.mozilla.org/Bug_Triage#Intermittent_Test_Failure_Cleanup
Status: NEW → RESOLVED
Last Resolved: 8 months ago
Resolution: --- → INCOMPLETE

Updated

7 months ago
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
My guess is that this is caused by a race condition in run-task, where the socket is removed while `run-task` is recursively removing the parent directory (i.e. the file is removed between the os.listdir and the os.remove, causing the latter to fail).
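To illustrate the race guessed at above, here is a minimal sketch that forces the ordering deterministically. It is not the actual run-task or shutil code: `remove_listed_entries` and its `after_listing` hook are hypothetical names, and a regular file stands in for the gpg-agent socket.

```python
import os
import tempfile


def remove_listed_entries(directory, after_listing=None):
    """List a directory, then unlink every name that was listed.

    `after_listing` is a hypothetical hook that runs between the listing
    and the unlinks; it stands in for another process (here, gpg-agent)
    removing its own socket at the wrong moment.
    """
    names = os.listdir(directory)
    if after_listing is not None:
        after_listing()
    for name in names:
        os.unlink(os.path.join(directory, name))


raced = False
missing = None

with tempfile.TemporaryDirectory() as tmp:
    # A regular file standing in for the gpg-agent socket.
    sock = os.path.join(tmp, "S.gpg-agent.browser")
    open(sock, "w").close()
    try:
        # The "agent" removes its socket after the listing was taken.
        remove_listed_entries(tmp, after_listing=lambda: os.unlink(sock))
    except FileNotFoundError as exc:
        raced = True
        missing = os.path.basename(exc.filename)

print(raced, missing)
```

The name was present when the directory was listed but gone by the time of the unlink, which matches the shape of the traceback in the description.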
Duplicate of this bug: 1506356
Assignee

Comment 6

6 months ago
Part of the problem here is that run_task cleans up MOZ_FETCHES_DIR, which in many cases is a directory that contains much more than just the result of fetches. For instance, on the wine toolchain jobs, it's ~/workspace/build, and the source tree is under ~/workspace/build/src... That also causes problems for interactive tasks of toolchain jobs, because once run_task is finished, the interactive task has everything interesting removed, which kills its usefulness (I thought I had filed a separate bug for that, but can't seem to find it). Why does run_task need to remove MOZ_FETCHES_DIR in the first place anyways?
(In reply to Mike Hommey [:glandium] from comment #6)
> Why does run_task need to remove MOZ_FETCHES_DIR in
> the first place anyways?

It was added in Bug 1466660 because the linux hardware workers don't (or maybe didn't) clean up between runs.
In the last 30 days this occurred 10 times, all on autoland and all on the 15th of November.
15 failures in the last 7 days, all on autoland. 

Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=215756918&repo=autoland&lineNumber=37544

[fetches 2018-12-06T16:08:05.462Z] removing /builds/worker/workspace/build
Traceback (most recent call last):
  File "/builds/worker/bin/run-task", line 761, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/builds/worker/bin/run-task", line 756, in main
    shutil.rmtree(fetches_dir)
  File "/usr/lib/python3.5/shutil.py", line 480, in rmtree
    _rmtree_safe_fd(fd, path, onerror)
  File "/usr/lib/python3.5/shutil.py", line 418, in _rmtree_safe_fd
    _rmtree_safe_fd(dirfd, fullname, onerror)
  File "/usr/lib/python3.5/shutil.py", line 438, in _rmtree_safe_fd
    onerror(os.unlink, fullname, sys.exc_info())
  File "/usr/lib/python3.5/shutil.py", line 436, in _rmtree_safe_fd
    os.unlink(name, dir_fd=topfd)
FileNotFoundError: [Errno 2] No such file or directory: 'S.gpg-agent.ssh'
[taskcluster 2018-12-06 16:08:06.156Z] === Task Finished ===
[taskcluster 2018-12-06 16:08:10.541Z] Unsuccessful task run with exit code: 1 completed in 566.894 seconds

Are there any updates here? Tom? Mike?
Whiteboard: [stockwell needswork:owner]
Should we mark these tasks as auto-retrying until this is fixed?

Other possible fixes:
 - run rmtree in a loop until success
 - stop running gpg-agent in this directory (would just running gpg be enough?)
 - killall gpg-agent before rmtree
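
The first option (running rmtree in a loop until it succeeds) could look roughly like the sketch below. This is an assumption about how such a workaround might be written, not code from run-task: `rmtree_with_retries`, its parameters, and `make_flaky_rmtree` are hypothetical, and the injected failure only simulates an entry vanishing mid-walk.

```python
import os
import shutil
import tempfile
import time


def rmtree_with_retries(path, attempts=5, delay=0.1, rmtree=shutil.rmtree):
    """Hypothetical helper: retry rmtree when an entry vanishes mid-walk.

    Returns the number of attempts used. The `rmtree` parameter exists only
    so the demo below can inject a failure.
    """
    for attempt in range(1, attempts + 1):
        try:
            rmtree(path)
            return attempt
        except FileNotFoundError:
            if not os.path.lexists(path):
                # The whole tree is already gone, which is what we wanted.
                return attempt
            if attempt == attempts:
                raise
            time.sleep(delay)


def make_flaky_rmtree():
    """Build an rmtree stand-in that fails once, then really removes."""
    calls = {"count": 0}

    def flaky(path):
        calls["count"] += 1
        if calls["count"] == 1:
            # Simulate the socket disappearing during the first walk.
            raise FileNotFoundError(
                2, "No such file or directory", "S.gpg-agent.ssh")
        shutil.rmtree(path)

    return flaky


workdir = tempfile.mkdtemp()
open(os.path.join(workdir, "nsis.tar.xz"), "w").close()
attempts_used = rmtree_with_retries(
    workdir, delay=0.01, rmtree=make_flaky_rmtree())
print(attempts_used, os.path.exists(workdir))
```

A retry loop only hides the symptom; it does nothing about gpg-agent still running in the directory, which the other two options address.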
Assignee

Comment 18

3 months ago

These toolchain tasks are the last ones using the historical
download-tools script from build/unix/build-gcc, which invokes gpg to
validate the downloaded tarballs. The consequence is that gpg-agent is
spawned and stays running, preventing a cleanup script from doing its
job, making the tasks fail.

Fetches are the new way to download sources, and can also do gpg
validation without those caveats.

The download-tools.sh script can then be removed as it's not used
anymore.

Comment 20

3 months ago
Pushed by mh@glandium.org:
https://hg.mozilla.org/integration/autoland/rev/8ecd607b3d51
Use fetches for nsis and wine toolchain tasks. r=froydnj

Comment 21

2 months ago
bugherder
Status: REOPENED → RESOLVED
Last Resolved: 8 months ago → 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67
Assignee: nobody → mh+mozilla