Closed Bug 1126945 Opened 10 years ago Closed 10 years ago

Frequent B2G device image Automation Error: mozprocess timed out after 3300 seconds running ['./config.sh', '-q', 'flame-kk', '/builds/slave/b2g_b2g-in_flm-kk_eng_dep-0000/build/tmp_manifest/flame-kk.xml']

Categories

(Release Engineering :: General, defect)

ARM
Gonk (Firefox OS)
defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: RyanVM, Assigned: selenamarie)

References

Details

(Keywords: intermittent-failure)

06:41:08 INFO - Running command: ['./config.sh', '-q', 'flame-kk', '/builds/slave/b2g_b2g-in_flm-kk_eng_dep-0000/build/tmp_manifest/flame-kk.xml'] in /builds/slave/b2g_b2g-in_flm-kk_eng_dep-0000/build
06:41:08 INFO - Copy/paste: ./config.sh -q flame-kk /builds/slave/b2g_b2g-in_flm-kk_eng_dep-0000/build/tmp_manifest/flame-kk.xml
06:41:08 INFO - Calling ['./config.sh', '-q', 'flame-kk', '/builds/slave/b2g_b2g-in_flm-kk_eng_dep-0000/build/tmp_manifest/flame-kk.xml'] with output_timeout 3300
06:41:08 INFO - Initialized empty Git repository in /builds/slave/b2g_b2g-in_flm-kk_eng_dep-0000/build/tmp_manifest_repo/.git/
06:41:08 INFO - [master (root-commit) ccd1878] manifest
06:41:08 INFO - 1 file changed, 153 insertions(+)
06:41:08 INFO - create mode 100644 flame-kk.xml
06:41:08 INFO - Get tmp_manifest_repo
06:41:08 INFO - From tmp_manifest_repo
06:41:08 INFO - * [new branch] master -> origin/master
06:41:08 INFO - repo has been initialized in /builds/slave/b2g_b2g-in_flm-kk_eng_dep-0000/build
07:36:09 INFO - Automation Error: mozprocess timed out after 3300 seconds running ['./config.sh', '-q', 'flame-kk', '/builds/slave/b2g_b2g-in_flm-kk_eng_dep-0000/build/tmp_manifest/flame-kk.xml']
07:36:09 ERROR - timed out after 3300 seconds of no output
07:36:09 ERROR - Return code: -9
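As a quick sanity check on the log above, the gap between the last line of output (06:41:08) and the timeout report (07:36:09) can be compared with the configured output_timeout of 3300 seconds. This is a minimal Python sketch; the helper name `seconds_between` is made up for illustration and assumes both timestamps fall on the same day.

```python
from datetime import datetime


def seconds_between(start: str, end: str) -> int:
    """Return whole seconds from start to end, both given as HH:MM:SS on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


# Timestamps copied from the log: last output line and the timeout report.
last_output = "06:41:08"
timeout_reported = "07:36:09"

gap = seconds_between(last_output, timeout_reported)
print(gap)  # 3301, one second past the 3300-second output_timeout
```

The gap lines up with the output timeout, which is consistent with the process producing no output at all after the repo was initialized, rather than running slowly.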
Blocks: gitperf
Depends on: 1177322
Depends on: 1178927
Repo launches many git actions in parallel, so the final 'git' command in the log was different each time a job hung and ultimately failed. I'd heard the previous week that a corrupt git repo existed, but the logs gave no evidence of which one, so on a hunch :garndt and I started looking into exactly which sync was failing.

To diagnose, we had to find and log in to several systems that were hung (a trivial but slightly time-consuming process), look at what processes were running, and note that the same repo was stuck on each of them. Then we needed help from :bkero to look into the problematic repo on git.mo. He found the corruption and, after some discussion, decided to purge and refresh the repo. This is done with a tool called 'vcssync'.

After that, :garndt killed our in-progress device image and emulator builders. Then we watched some builds. Things seemed to be ok, although unrelated bustage prevented us from confirming that for about an hour.

We suspect this all comes back to an issue with git.mo (bug 1176192) that caused a ton of problems a couple of weeks ago. :bkero kept a copy of the busted repo for later analysis.
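The manual step of comparing process lists across hung machines could be sketched roughly as below. This is not the tooling we used. It assumes the `ps` output from each host has already been collected as text, and the function names, host names, and URLs are all made up for illustration.

```python
from collections import Counter


def repos_in_flight(ps_output: str) -> set[str]:
    """Return the remote URLs named on git command lines in one host's ps listing."""
    repos = set()
    for line in ps_output.splitlines():
        tokens = line.split()
        # Only consider lines that actually invoke git.
        if not any(t == "git" or t.endswith("/git") for t in tokens):
            continue
        repos.update(t for t in tokens if "://" in t)
    return repos


def common_stuck_repos(ps_by_host: dict[str, str]) -> list[str]:
    """Return repos that have a running git process on every host given."""
    counts = Counter()
    for output in ps_by_host.values():
        counts.update(repos_in_flight(output))
    return sorted(repo for repo, n in counts.items() if n == len(ps_by_host))


# Hypothetical sample data, not taken from the real hosts.
sample = {
    "host-a": (
        "1234 git fetch https://git.example.invalid/repo-x.git\n"
        "1240 git fetch https://git.example.invalid/repo-y.git"
    ),
    "host-b": "2201 /usr/bin/git fetch https://git.example.invalid/repo-x.git",
}
print(common_stuck_repos(sample))  # ['https://git.example.invalid/repo-x.git']
```

A repo that shows up on every hung host is the likely culprit, since the other parallel fetches differ from one host to the next.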
Assignee: nobody → sdeckelmann
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Component: General Automation → General