Closed
Bug 1398741
Opened 7 years ago
Closed 7 years ago
[tier-3] Permafail - ERROR - The following files failed: 'macosx64-minidump_stackwalk'
Categories
(Testing :: Firefox UI Tests, defect, P5)
Tracking
(Not tracked)
RESOLVED
INCOMPLETE
People
(Reporter: intermittent-bug-filer, Unassigned)
Details
(Whiteboard: [stockwell unknown])
Filed by: fmezei [at] mozilla.com
https://treeherder.mozilla.org/logviewer.html#?job_id=130047122&repo=mozilla-central
https://firefox-ui-tests.s3.amazonaws.com/ffde9cf6-0ae4-4bb9-a4b7-bf3f512e61fc/log_info.log
After re-enabling the ondemand update tests in bug 1386628, all OSX update jobs fail on Nightly with this error.
Comment 1•7 years ago
The error is not seen on Beta 56. David, is this something that you could help with?
Comment 2•7 years ago
It seems that we cannot find the appropriate version of the minidump-stackwalk program on tooltool: https://treeherder.mozilla.org/logviewer.html#?job_id=130047122&repo=mozilla-central&lineNumber=798 Ted, do you know what's wrong with this? Are we missing some specific version of the tools?
Flags: needinfo?(ted)
Comment 3•7 years ago
https://firefox-ui-tests.s3.amazonaws.com/d05bc674-0288-435b-b282-0736ed1693f0/log_info.log

04:31:15 INFO - Calling ['/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python', '/Users/mozauto/jenkins/workspace/mozilla-central_update/build/tooltool.py', '--url', 'https://tooltool.mozilla-releng.net/', 'fetch', '-m', '/Users/mozauto/jenkins/workspace/mozilla-central_update/build/tests/config/tooltool-manifests/macosx64/releng.manifest', '-o'] with output_timeout 600
04:31:15 INFO - INFO - Attempting to fetch from 'https://tooltool.mozilla-releng.net/'...
04:31:16 INFO - INFO - ...failed to fetch 'macosx64-minidump_stackwalk' from https://tooltool.mozilla-releng.net/

vs (success on osx 10.10 firefox-ui):

https://public-artifacts.taskcluster.net/d9HPpzpGTQK5QC3uRo2_Dw/0/public/logs/live_backing.log

16:02:22 INFO - Calling ['/tools/tooltool.py', '--url', 'https://tooltool.mozilla-releng.net/', '--authentication-file', '/builds/relengapi.tok', 'fetch', '-m', '/Users/cltbld/tasks/task_1505083747/build/tests/config/tooltool-manifests/macosx64/releng.manifest', '-o', '-c', '/builds/tooltool_cache'] with output_timeout 600
16:02:22 INFO - INFO - File macosx64-minidump_stackwalk retrieved from local cache /builds/tooltool_cache

I notice these differences:
- no --authentication-file used for failure case
- no tooltool cache used for failure case
- different location for tooltool.py (different version possible?)

I recall setting up osx tooltool cache in bug 1385629
https://dxr.mozilla.org/mozilla-central/rev/fd87bb184e299fec695f69bd2977276c25719b98/testing/mozharness/configs/firefox_ui_tests/taskcluster_mac.py#3
but that's probably not the most important issue.
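The option differences listed in comment 3 can be checked mechanically. The following is a minimal sketch, not part of any Mozilla tooling: it takes the two argument lists exactly as they appear in the logs above and reports which options appear only in the passing invocation. The names `FAILING`, `PASSING`, and `option_names` are made up for this illustration.

```python
# Argument lists copied from the two "Calling [...]" log lines in comment 3.
FAILING = [
    '/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python',
    '/Users/mozauto/jenkins/workspace/mozilla-central_update/build/tooltool.py',
    '--url', 'https://tooltool.mozilla-releng.net/',
    'fetch',
    '-m', '/Users/mozauto/jenkins/workspace/mozilla-central_update/build/tests/config/tooltool-manifests/macosx64/releng.manifest',
    '-o',
]
PASSING = [
    '/tools/tooltool.py',
    '--url', 'https://tooltool.mozilla-releng.net/',
    '--authentication-file', '/builds/relengapi.tok',
    'fetch',
    '-m', '/Users/cltbld/tasks/task_1505083747/build/tests/config/tooltool-manifests/macosx64/releng.manifest',
    '-o',
    '-c', '/builds/tooltool_cache',
]


def option_names(argv):
    """Return the set of arguments that look like options (start with '-')."""
    return {arg for arg in argv if arg.startswith('-')}


# Options present in the passing run but absent from the failing one.
only_in_passing = sorted(option_names(PASSING) - option_names(FAILING))
print(only_in_passing)
```

This only compares option names, so it confirms the first two differences (authentication file and cache) but says nothing about which version of tooltool.py each job ran.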
Comment 4•7 years ago
(In reply to Geoff Brown [:gbrown] from comment #3)
> I notice these differences:
> - no --authentication-file used for failure case

I don't think this should matter, the file has `visibility: public`.

> - no tooltool cache used for failure case

It's possible that not having a tooltool cache tickles a tooltool bug.

> - different location for tooltool.py (different version possible?)

This is pretty plausible. The first log from comment 3 shows:

04:31:15 INFO - Downloading https://raw.githubusercontent.com/mozilla/build-tooltool/master/tooltool.py to /Users/mozauto/jenkins/workspace/mozilla-central_update/build/tooltool.py

the second log shows:

16:01:35 INFO - 'exes': {'tooltool.py': '/tools/tooltool.py',

...so it's using some baked-in copy of tooltool.py?

It might help to stick a `-v` in the tooltool commandline to get more info out of tooltool. Functionally it's not very complicated, it just builds a URL from the digest in the manifest and tries to download it.
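To make the "builds a URL from the digest in the manifest" step concrete, here is a rough sketch of how such a URL could be derived, so it can be tried by hand with a browser or curl. It is an illustration under stated assumptions, not tooltool's actual code: the `<algorithm>/<digest>` path layout is assumed, and the manifest entry below is hypothetical (the digest and size are placeholders, not the real values from releng.manifest).

```python
import json
from urllib.parse import urljoin

BASE_URL = 'https://tooltool.mozilla-releng.net/'

# Hypothetical manifest content. The digest and size are placeholders and
# must be replaced with the values from the real releng.manifest.
MANIFEST_TEXT = json.dumps([
    {
        "filename": "macosx64-minidump_stackwalk",
        "algorithm": "sha512",
        "digest": "0123abcd",
        "size": 1,
        "visibility": "public",
    }
])


def candidate_urls(manifest_text, base_url=BASE_URL):
    """Map each manifest filename to the URL a fetch would presumably request.

    Assumption: the server path is '<algorithm>/<digest>'.
    """
    urls = {}
    for record in json.loads(manifest_text):
        path = '{}/{}'.format(record['algorithm'], record['digest'])
        urls[record['filename']] = urljoin(base_url, path)
    return urls


for name, url in candidate_urls(MANIFEST_TEXT).items():
    print(name, '->', url)
```

Requesting the resulting URL directly from the failing machine would show whether the failure comes from the server response or from the downloaded copy of tooltool.py.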
Flags: needinfo?(ted)
Comment 5•7 years ago
The tests that are failing here are run via Mozmill CI, not Taskcluster. It is correct that they do not use a tooltool cache, but they do not use a local copy of tooltool.py either. For each job we run, a fresh copy of mozharness is downloaded:
https://github.com/mozilla/mozmill-ci/blob/master/jenkins-master/jobs/scripts/workspace/runtests.py#L76

We then use the common.tests.zip archive and run the firefox-ui-update script, which is based on the script used by the fx-ui tests as executed in TC. The only difference is the mozharness config file, which is qa_jenkins.py:
https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/configs/firefox_ui_tests

It sets `download_tooltool: True`, which always forces a fresh copy of the files. I assume Ted is right that we are unmasking a problem here that currently happens only on OS X. Linux and Windows are both fine.
Comment 6•7 years ago
If the cause is unclear, Florin could trigger some Nightly update tests in CI based on older Nightlies. The regression is definitely in the range of August 14th to Sep 8th, so about 5 test runs should be enough to nail this down.
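The estimate of about 5 runs matches a plain binary search over daily Nightlies. A small sketch of the arithmetic follows; it assumes one Nightly per day, that the August 14th build is known good and the September 8th build is known bad, and that the year is 2017 (the year is not stated in the comment and does not change the count). The function name `bisection_runs` is made up for this example.

```python
from datetime import date


def bisection_runs(last_good, first_bad):
    """Worst-case number of test runs to find the first bad daily build.

    Candidates are the builds after last_good up to and including
    first_bad; each run halves the candidate range.
    """
    candidates = (first_bad - last_good).days
    # ceil(log2(candidates)) computed with integers only.
    return max(candidates - 1, 0).bit_length()


# Assumed year; only the month and day come from comment 6.
runs = bisection_runs(date(2017, 8, 14), date(2017, 9, 8))
print(runs)
```

With 25 candidate builds the search needs at most 5 runs, which agrees with the figure given above.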
Comment hidden (Intermittent Failures Robot)
Comment 8•7 years ago
This continues to block update tests on OS X: all tests failed last week. Once we merge to Beta (57 Beta 1 should happen tomorrow), this will probably block all update tests on OS X there as well. Does anyone have an update on a potential fix here?
Comment 9•7 years ago
I'm not familiar with Mozmill CI so I'm not actively trying to debug this. To follow up on some of the speculation in earlier comments, it would be nice if someone (Florin?, Henrik?) added -v to the tooltool command line, temporarily, to get better diagnostics. Also, retriggering tests as suggested in comment 6 might be useful.
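As a temporary diagnostic, the change could be as small as inserting `-v` into the argument list before the fetch command runs. The helper below is a hypothetical sketch, not mozharness code: `with_verbose` is an invented name, and placing `-v` directly after the script path assumes tooltool accepts it as a global option ahead of the `fetch` subcommand.

```python
def with_verbose(argv, script_name='tooltool.py'):
    """Return a copy of argv with '-v' inserted right after the tooltool script.

    Assumption: '-v' is accepted as a global option before the subcommand.
    """
    if '-v' in argv:
        return list(argv)
    for index, arg in enumerate(argv):
        if arg.endswith(script_name):
            return argv[:index + 1] + ['-v'] + argv[index + 1:]
    raise ValueError('no %s entry found in argv' % script_name)


# Argument list from the failing log in comment 3.
failing_argv = [
    '/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python',
    '/Users/mozauto/jenkins/workspace/mozilla-central_update/build/tooltool.py',
    '--url', 'https://tooltool.mozilla-releng.net/',
    'fetch',
    '-m', '/Users/mozauto/jenkins/workspace/mozilla-central_update/build/tests/config/tooltool-manifests/macosx64/releng.manifest',
    '-o',
]
verbose_argv = with_verbose(failing_argv)
print(verbose_argv[:4])
```

The original list is left unmodified, so the change is easy to back out once the diagnostics have been collected.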
Updated•7 years ago
Whiteboard: [stockwell needswork]
Comment 10•7 years ago
Who is working on this? This will cross our 'need to disable' threshold really soon, and we will then disable the offending tests; without a specific set of offending tests, that means the entire suite. :whimboo, I see you listed as the QA contact. Are you responsible for these tests?
Flags: needinfo?(hskupin)
Comment 11•7 years ago
This is tier-3, and I no longer maintain those tests. Geoff and I made proposals, but so far nothing has happened on this bug.
Flags: needinfo?(hskupin)
Summary: Permafail - ERROR - The following files failed: 'macosx64-minidump_stackwalk' → [tier-3] Permafail - ERROR - The following files failed: 'macosx64-minidump_stackwalk'
Comment 12•7 years ago
Florin, who is responsible for developing these tests and ensuring they are green? I see in the title that this is 'tier-3', but we are getting stars. I would really like to avoid getting data in orangefactor for tests that are tier-3.
Flags: needinfo?(florin.mezei)
Comment 13•7 years ago
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #12)
> Florin, who is responsible for developing these tests and ensuring they are
> green. I see in the title this is 'tier-3', but we are getting stars, I
> would really like to avoid getting data in orangefactor for tests that are
> tier-3.

Joel, I don't know of anyone responsible for maintaining these anymore. The tests themselves had been disabled, but were then re-enabled temporarily until we do a complete analysis of existing coverage and gaps in automated update testing. They were re-enabled because in their absence we need to test updates manually, which adds a lot of extra effort for the manual QA teams.

Is there anything else that we can do to avoid getting this in orangefactor? If we don't star the failing tests, would that help?
Flags: needinfo?(florin.mezei) → needinfo?(jmaher)
Comment 14•7 years ago
Shouldn't we start by adding '-v' to the tooltool invocation to figure out what's going on? If this can be reproduced on try I'll be glad to help out with that.
Comment 15•7 years ago
If the failures are not starred, they will not get into orangefactor, so that would help! I assume the team responsible for writing code for installers and application update would be able to help fix issues with the tests.
Flags: needinfo?(jmaher)
Comment 16•7 years ago
Gabriele is correct in #c14 -- we need that to figure out what the issue is.
Comment 17•7 years ago
I won't star these anymore. Also I've tried re-running some tests with older Nightlies here: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=42151fcd6cfc216d147730d0f2c6a2acd52d22fd&filter-searchStr=fxup&filter-tier=3&selectedJob=131969510. The jobs still failed the same way, with this error. For example this job [1] was run with the same source build (--installer-url parameter) and same test package (--test-packages-url parameter) as when it originally passed on August 13. I'm thinking that either I'm doing something wrong, or the problem is not in the Firefox code. [1] https://firefox-ui-tests.s3.amazonaws.com/71281e3c-9476-42de-9225-55175388bde6/log_info.log
Comment 18•7 years ago
We are now also seeing this error in Firefox 57 Beta 3 - https://firefox-ui-tests.s3.amazonaws.com/27c140aa-37e5-488f-948e-787d96ba2937/log_info.log. Also, while I haven't noticed this before, this also affects Windows with a similar error: ERROR - The following files failed: 'win32-minidump_stackwalk.exe' - https://firefox-ui-tests.s3.amazonaws.com/ddd76aa7-4a6b-4d6f-bf33-7bf842e357a7/log_info.log.
Updated•7 years ago
Whiteboard: [stockwell needswork] → [stockwell unknown]
Comment 19•7 years ago
https://wiki.mozilla.org/Bugmasters#Intermittent_Test_Failure_Cleanup
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE