Bug 1070074
Opened 10 years ago
Closed 7 years ago
Local mozharness B2G builds fail if B2G.git doesn't already exist
Categories
(Release Engineering :: Applications: MozharnessCore, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: mshal, Unassigned)
Details
(Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/1930] )
Attachments
(1 file)
2.42 KB,
patch
When trying to run b2g_build.py locally, if I point the --work-dir to a directory that doesn't yet have a clone of B2G.git, mozharness/tools do the following operations:

b2g_build.py --work-dir=/home/worker/volume_cache/B2G ...
18:37:06 INFO - mkdir: /home/worker/volume_cache/B2G
18:37:06 INFO - Changing directory to /home/worker/volume_cache/B2G.
18:37:06 INFO - Running command: ['gittool.py', 'https://git.mozilla.org/b2g/B2G.git', '/home/worker/volume_cache/B2G']
18:37:06 INFO - 2014-09-19 18:37:06,219 /home/worker/volume_cache/B2G doesn't appear to be a valid git directory; clobbering

Then log_cmd() calls getcwd(), which fails with:

18:37:06 INFO - OSError: [Errno 2] No such file or directory
18:37:06 ERROR - Return code: 1
sh: 0: getcwd() failed: No such file or directory

In other words, mozharness creates/chdirs to the B2G directory, then the git.py in the tools repo checks B2G, realizes that it's not a valid git repository and deletes the B2G tree. getcwd() now fails since the mozharness process is still chdir'd into the defunct directory.

jlund and I took a look yesterday, but we haven't yet figured out the best place to fix it, nor why the same steps happen in buildbot builds but without the failure.
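The failure mode described above can be reproduced outside mozharness. The following is a minimal sketch, assuming a Linux-like system where a process may delete its own current directory; the function name and the temporary "B2G" directory are hypothetical stand-ins, not mozharness or tools code.

```python
import os
import shutil
import tempfile


def getcwd_survives_clobber(chdir_into_dest):
    """Return True if os.getcwd() still works after `dest` is deleted.

    `dest` stands in for the B2G work dir. This helper is hypothetical
    and only mimics the mkdir / chdir / clobber sequence from the log.
    """
    original_cwd = os.getcwd()
    parent = tempfile.mkdtemp()
    dest = os.path.join(parent, "B2G")
    os.mkdir(dest)
    try:
        # Either sit inside the directory that will be clobbered,
        # or one level above it.
        os.chdir(dest if chdir_into_dest else parent)
        shutil.rmtree(dest)  # stands in for the "clobbering" step
        try:
            os.getcwd()
            return True
        except OSError:
            # The current directory no longer exists on disk.
            return False
    finally:
        # Always restore a valid cwd and clean up the temp tree.
        os.chdir(original_cwd)
        shutil.rmtree(parent, ignore_errors=True)
```

Calling the helper with True mirrors the local run (cwd is the clobbered directory); calling it with False mirrors a process that stays in the parent directory while the checkout is removed.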
Comment 1 (Reporter) • 10 years ago
This seems like a pretty simple fix for the issue. Ultimately I think the problem stems from the fact that vcs_checkout_repos() assumes a parent_dir of 'work_dir' if parent_dir is not set, and it thinks that parent_dir is one level up from where the repo will be checked out, which means it should be fine to chdir into even if the repo gets clobbered. However, buildb2gbase.py is passing in a full path for the repo destination, which also happens to be set to 'work_dir'. Since parent_dir is unset, it takes the default value, and then parent_dir==(repo destination), which breaks the assumptions. This workaround explicitly sets parent_dir in buildb2gbase.py to be the parent of 'work_dir'.
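The idea behind the workaround can be sketched as follows. This is not the attached patch (its contents are not shown in this thread); the helper names are hypothetical, and posixpath is used only so the example behaves the same on any platform.

```python
import posixpath


def split_checkout_dest(dest):
    """Split a full repo destination into (parent_dir, repo_name).

    Hypothetical helper: the parent is what a caller could pass as
    parent_dir so that the cwd sits one level above the checkout.
    """
    dest = posixpath.normpath(dest)  # also drops a trailing slash
    return posixpath.dirname(dest), posixpath.basename(dest)


def cwd_survives_clobber(cwd, dest):
    """True if deleting `dest` leaves `cwd` in place."""
    cwd = posixpath.normpath(cwd)
    dest = posixpath.normpath(dest)
    # cwd must be neither the destination itself nor anything below it.
    return cwd != dest and not cwd.startswith(dest + "/")
```

With dest set to /home/worker/volume_cache/B2G, the default described above (parent_dir equal to the destination) fails the second check, while the parent returned by the first helper passes it.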
Attachment #8492400 -
Flags: review?(jlund)
Comment 2 • 10 years ago
(In reply to Michael Shal [:mshal] from comment #1)
> Created attachment 8492400 [details] [diff] [review]
> bug1070074
>
> This seems like a pretty simple fix for the issue. Ultimately I think the
> problem stems from the fact that vcs_checkout_repos() assumes a parent_dir
> of 'work_dir' if parent_dir is not set, and it thinks that parent_dir is one
> level up from where the repo will be checked out, which means it should be
> fine to chdir into even if the repo gets clobbered.
>
> However, buildb2gbase.py is passing in a full path for the repo destination,
> which also happens to be set to 'work_dir'. Since parent_dir is unset, it
> takes the default value, and then parent_dir==(repo destination), which
> breaks the assumptions. This workaround explicitly sets parent_dir in
> buildb2gbase.py to be the parent of 'work_dir'.

catlee, why do we use the work_dir itself as the B2G repo dest?
http://mxr.mozilla.org/build/source/mozharness/mozharness/mozilla/building/buildb2gbase.py#309

It seems dangerous to be creating/clobbering/re-creating the script's work_dir (comment 0).

/me tries to grep why this dance is happening in this snippet:

12:58:00 INFO - mkdir: /builds/slave/b2g_ced_flm_dep-00000000000000/build
12:58:00 INFO - Changing directory to /builds/slave/b2g_ced_flm_dep-00000000000000/build.
12:58:00 INFO - retry: Calling <bound method B2GBuild._get_revision of <__main__.B2GBuild object at 0x165da10>> with args: (<mozharness.base.vcs.gittool.GittoolVCS object at 0x1615fd0>, '/builds/slave/b2g_ced_flm_dep-00000000000000/build'), kwargs: {}, attempt #1
12:58:00 INFO - Running command: ['gittool.py', 'https://git.mozilla.org/b2g/B2G.git', '/builds/slave/b2g_ced_flm_dep-00000000000000/build']
12:58:00 INFO - Copy/paste: gittool.py https://git.mozilla.org/b2g/B2G.git /builds/slave/b2g_ced_flm_dep-00000000000000/build
12:58:00 INFO - Using env: {'GIT_SHARE_BASE_DIR': '/builds/git-shared/git',
12:58:00 INFO - 'PATH': '/usr/local/bin:/usr/lib64/ccache:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/cltbld/bin'}
12:58:00 INFO - 2014-09-12 12:58:00,357 creating bare repo /builds/git-shared/git/git.mozilla.org/b2g%2FB2G.git
12:58:00 INFO - 2014-09-12 12:58:00,358 removing /builds/git-shared/git/git.mozilla.org/b2g%2FB2G.git
12:58:00 INFO - 2014-09-12 12:58:00,358 git init --bare -q /builds/git-shared/git/git.mozilla.org/b2g%2FB2G.git
12:58:00 INFO - 2014-09-12 12:58:00,372 Checking dest /builds/slave/b2g_ced_flm_dep-00000000000000/build
12:58:00 INFO - fatal: Not a git repository (or any parent up to mount parent /builds)
12:58:00 INFO - Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
12:58:00 INFO - 2014-09-12 12:58:00,374 /builds/slave/b2g_ced_flm_dep-00000000000000/build doesn't appear to be a valid git directory; clobbering
12:58:00 INFO - 2014-09-12 12:58:00,376 removing /builds/slave/b2g_ced_flm_dep-00000000000000/build
12:58:00 INFO - 2014-09-12 12:58:00,376 git init -q /builds/slave/b2g_ced_flm_dep-00000000000000/build
12:58:00 INFO - 2014-09-12 12:58:00,395 command: START
12:58:00 INFO - 2014-09-12 12:58:00,395 command: git fetch -q https://git.mozilla.org/b2g/B2G.git +refs/heads/*:refs/remotes/origin/*
12:58:00 INFO - 2014-09-12 12:58:00,395 command: cwd: /builds/git-shared/git/git.mozilla.org/b2g%2FB2G.git
12:58:00 INFO - 2014-09-12 12:58:00,395 command: output:
12:58:02 INFO - 2014-09-12 12:58:02,836 command: END (2.44s elapsed)
12:58:02 INFO - 2014-09-12 12:58:02,837 command: START
12:58:02 INFO - 2014-09-12 12:58:02,837 command: git fetch -q /builds/git-shared/git/git.mozilla.org/b2g%2FB2G.git +refs/remotes/origin/*:refs/remotes/origin/*
12:58:02 INFO - 2014-09-12 12:58:02,837 command: cwd: /builds/slave/b2g_ced_flm_dep-00000000000000/build
12:58:02 INFO - 2014-09-12 12:58:02,837 command: output:
12:58:02 INFO - 2014-09-12 12:58:02,928 command: END (0.09s elapsed)
12:58:02 INFO - 2014-09-12 12:58:02,931 /builds/slave/b2g_ced_flm_dep-00000000000000/build: adding remote origin https://git.mozilla.org/b2g/B2G.git
12:58:02 INFO - 2014-09-12 12:58:02,932 command: START
12:58:02 INFO - 2014-09-12 12:58:02,932 command: git remote add origin https://git.mozilla.org/b2g/B2G.git
12:58:02 INFO - 2014-09-12 12:58:02,932 command: cwd: /builds/slave/b2g_ced_flm_dep-00000000000000/build
12:58:02 INFO - 2014-09-12 12:58:02,932 command: output:
12:58:02 INFO - 2014-09-12 12:58:02,934 command: END (0.00s elapsed)
12:58:02 INFO - 2014-09-12 12:58:02,934 Updating local copy refname: None; revision: None
12:58:02 INFO - 2014-09-12 12:58:02,935 command: START
12:58:02 INFO - 2014-09-12 12:58:02,935 command: git checkout -q -f origin/master^0
12:58:02 INFO - 2014-09-12 12:58:02,935 command: cwd: /builds/slave/b2g_ced_flm_dep-00000000000000/build
12:58:02 INFO - 2014-09-12 12:58:02,935 command: output:
12:58:02 INFO - 2014-09-12 12:58:02,961 command: END (0.03s elapsed)
12:58:02 INFO - Got revision 4be35b239e7b090f8b5b4b39485812975f67000f
12:58:02 INFO - Return code: 0

I feel like the fix is to put the B2G checkout within work_dir, not in work_dir itself. Not sure I'm grepping that right, or what the implications are.
Flags: needinfo?(catlee)
Comment 3 (Reporter) • 10 years ago
FYI I also stumbled across this comment in buildb2gbase.py:

# That may have blown away our build-tools checkout. It would
# be better if B2G were checked out into a subdirectory, but
# for now, just redo it.
self.checkout_tools()

So it sounds like we should consider moving B2G to a subdir. Thoughts?
Comment 4 • 10 years ago
I'm not really sure what's going on TBH. On the build slaves, the directory layout is like this:

/builds/slave/b2g_b2g-in_emu-d_dep-000000000

'base_work_dir': '/builds/slave/b2g_b2g-in_emu-d_dep-000000000'
'work_dir': 'build'

so that means that 'abs_work_dir' is /builds/slave/b2g_b2g-in_emu-d_dep-000000000/build
logs go into /builds/slave/b2g_b2g-in_emu-d_dep-000000000/logs

The B2G repo is checked out into abs_work_dir, clobbering it if necessary.

I think perhaps the problem is in the difference between base_work_dir and work_dir. The code probably assumes that work_dir is a child of base_work_dir, and when you override workdir on the command-line that assumption is broken?
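The suspected broken assumption can be illustrated with a plain path join. This sketch assumes abs_work_dir is derived by joining base_work_dir and work_dir, which the thread does not confirm; the function names are hypothetical, and the local override path is taken from comment 0.

```python
import posixpath


def resolve_abs_work_dir(base_work_dir, work_dir):
    """Assumed derivation of abs_work_dir (not confirmed by this bug).

    A join discards base_work_dir entirely when work_dir is absolute.
    """
    return posixpath.join(base_work_dir, work_dir)


def is_child_of(path, parent):
    """True if `path` lies strictly below `parent`."""
    parent = posixpath.normpath(parent)
    return posixpath.normpath(path).startswith(parent + "/")


base = "/builds/slave/b2g_b2g-in_emu-d_dep-000000000"

# Build-slave layout: a relative work_dir stays under base_work_dir.
on_slave = resolve_abs_work_dir(base, "build")

# Local override with an absolute --work-dir: the result escapes
# base_work_dir, so "work_dir is a child of base_work_dir" no longer holds.
overridden = resolve_abs_work_dir(base, "/home/worker/volume_cache/B2G")
```

Under that assumption, is_child_of(on_slave, base) holds while is_child_of(overridden, base) does not, which matches the question raised at the end of the comment.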
Flags: needinfo?(catlee)
Comment 5 • 10 years ago
(In reply to Chris AtLee [:catlee] from comment #4)
> I'm not really sure what's going on TBH. On the build slaves, the directory
> layout is like this:
>
> /builds/slave/b2g_b2g-in_emu-d_dep-000000000
>
> 'base_work_dir': '/builds/slave/b2g_b2g-in_emu-d_dep-000000000'
> 'work_dir': 'build'
>
> so that means that 'abs_work_dir' is
> /builds/slave/b2g_b2g-in_emu-d_dep-000000000/build
> logs go into /builds/slave/b2g_b2g-in_emu-d_dep-000000000/logs

I don't think it's mozharness that is complaining about logs: log_cmd (from tools/util/commands.py) is complaining about os.getcwd().

>
> The B2G repo is checked out into abs_work_dir, clobbering it if necessary.

Granted, I have not dug into this too much, but we have things in abs_work_dir that are not just the B2G checkout, e.g. the tools repo (causing a vcs checkout within another checkout) and the 'upload' dir (which also holds a copy of the script log and props). By making the B2G checkout abs_work_dir itself (/builds/slave/b2g_b2g-in_emu-d_dep-000000000/build), we will clobber all of that whenever we want to clobber the B2G checkout. In our automation, we must be able to do this in a way that nothing complains, even though the code comment says it isn't optimal (comment 3).

> I think perhaps the problem is in the difference between base_work_dir and
> work_dir. The code probably assumes that work_dir is a child of
> base_work_dir, and when you override workdir on the command-line that
> assumption is broken?

Hmm, maybe. mshal, are you overriding the work_dir? I thought you tried and failed without playing with the work_dir at all.
Comment 6 • 10 years ago
Comment on attachment 8492400 [details] [diff] [review]
bug1070074

Review of attachment 8492400 [details] [diff] [review]:
-----------------------------------------------------------------

I'm going to reset this review request. It's been the black sheep in my queue. Feel free to send the request again once the recent comments reach a conclusion.

IMO, I still see the fix as putting src in a subdir instead of the work_dir itself, for the reasons stated in my last comment.
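The difference between the two layouts can be sketched with a temporary directory. This is only an illustration of the subdir proposal, not mozharness code; the helper names and the "B2G" subdirectory name are hypothetical, and 'upload' stands in for the sibling content mentioned in comment 5.

```python
import os
import shutil
import tempfile


def clobber_and_recreate(dest):
    """Mimic the clobber step: remove `dest`, then create it empty."""
    shutil.rmtree(dest, ignore_errors=True)
    os.makedirs(dest)


def upload_dir_survives(checkout_in_subdir):
    """Return True if a sibling 'upload' dir outlives a clobber.

    Hypothetical demo: abs_work_dir is a throwaway temp directory.
    """
    abs_work_dir = tempfile.mkdtemp()
    try:
        upload_dir = os.path.join(abs_work_dir, "upload")
        os.mkdir(upload_dir)
        if checkout_in_subdir:
            # Proposed layout: the checkout lives below the work dir.
            dest = os.path.join(abs_work_dir, "B2G")
        else:
            # Current layout: the checkout is the work dir itself.
            dest = abs_work_dir
        clobber_and_recreate(dest)
        return os.path.isdir(upload_dir)
    finally:
        shutil.rmtree(abs_work_dir, ignore_errors=True)
```

With the checkout in a subdirectory, the clobber only touches that subdirectory; with the checkout as the work dir itself, everything else stored there is removed along with it.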
Attachment #8492400 -
Flags: review?(jlund)
Updated • 10 years ago
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/1930]
Comment 7 (Reporter) • 8 years ago
I no longer have any context here, so I'm not planning to fix it.
Assignee: mshal → nobody
Updated • 7 years ago
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX