1398741 - [tier-3] Permafail - ERROR - The following files failed: 'macosx64-minidump_stackwalk'

Reporter

Description

•

8 years ago

treeherder

Filed by: fmezei [at] mozilla.com
https://treeherder.mozilla.org/logviewer.html#?job_id=130047122&repo=mozilla-central
https://firefox-ui-tests.s3.amazonaws.com/ffde9cf6-0ae4-4bb9-a4b7-bf3f512e61fc/log_info.log

After re-enabling the ondemand update tests in bug 1386628, all OSX update jobs fail on Nightly with this error.

[SV Manager] Florin Mezei, QA (:FlorinMezei)

Comment 1

•

8 years ago

The error is not seen on Beta 56. David, is this something that you could help with?

Gabriele Svelto [:gsvelto]

Comment 2

•

8 years ago

It seems that we cannot find the appropriate version of the minidump-stackwalk program on tooltool: https://treeherder.mozilla.org/logviewer.html#?job_id=130047122&repo=mozilla-central&lineNumber=798 Ted, do you know what's wrong with this? Are we missing some specific version of the tools?

Flags: needinfo?(ted)

Geoff Brown [:gbrown]

Comment 3

•

8 years ago

https://firefox-ui-tests.s3.amazonaws.com/d05bc674-0288-435b-b282-0736ed1693f0/log_info.log

04:31:15 INFO - Calling ['/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python', '/Users/mozauto/jenkins/workspace/mozilla-central_update/build/tooltool.py', '--url', 'https://tooltool.mozilla-releng.net/', 'fetch', '-m', '/Users/mozauto/jenkins/workspace/mozilla-central_update/build/tests/config/tooltool-manifests/macosx64/releng.manifest', '-o'] with output_timeout 600
04:31:15 INFO - INFO - Attempting to fetch from 'https://tooltool.mozilla-releng.net/'...
04:31:16 INFO - INFO - ...failed to fetch 'macosx64-minidump_stackwalk' from https://tooltool.mozilla-releng.net/

vs (success on osx 10.10 firefox-ui):

https://public-artifacts.taskcluster.net/d9HPpzpGTQK5QC3uRo2_Dw/0/public/logs/live_backing.log

16:02:22 INFO - Calling ['/tools/tooltool.py', '--url', 'https://tooltool.mozilla-releng.net/', '--authentication-file', '/builds/relengapi.tok', 'fetch', '-m', '/Users/cltbld/tasks/task_1505083747/build/tests/config/tooltool-manifests/macosx64/releng.manifest', '-o', '-c', '/builds/tooltool_cache'] with output_timeout 600
16:02:22 INFO - INFO - File macosx64-minidump_stackwalk retrieved from local cache /builds/tooltool_cache

I notice these differences:
- no --authentication-file used for failure case
- no tooltool cache used for failure case
- different location for tooltool.py (different version possible?)

I recall setting up osx tooltool cache in bug 1385629
https://dxr.mozilla.org/mozilla-central/rev/fd87bb184e299fec695f69bd2977276c25719b98/testing/mozharness/configs/firefox_ui_tests/taskcluster_mac.py#3
but that's probably not the most important issue.
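The option differences listed above can be checked mechanically. The following is only a rough sketch: the two argument lists are copied from the log lines in this comment, while the helper name `flags` is made up for illustration, and reading `-c` as the cache option is an inference from the `/builds/tooltool_cache` value that follows it in the log.

```python
# Argument lists copied from the two "Calling [...]" log lines in comment 3.
failing = [
    '/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python',
    '/Users/mozauto/jenkins/workspace/mozilla-central_update/build/tooltool.py',
    '--url', 'https://tooltool.mozilla-releng.net/',
    'fetch', '-m',
    '/Users/mozauto/jenkins/workspace/mozilla-central_update/build/tests/config/tooltool-manifests/macosx64/releng.manifest',
    '-o',
]
passing = [
    '/tools/tooltool.py',
    '--url', 'https://tooltool.mozilla-releng.net/',
    '--authentication-file', '/builds/relengapi.tok',
    'fetch', '-m',
    '/Users/cltbld/tasks/task_1505083747/build/tests/config/tooltool-manifests/macosx64/releng.manifest',
    '-o', '-c', '/builds/tooltool_cache',
]


def flags(argv):
    """Hypothetical helper: collect the tokens that look like options."""
    return {arg for arg in argv if arg.startswith('-')}


# Options present in the passing invocation but absent from the failing one.
missing_in_failing = sorted(flags(passing) - flags(failing))
print(missing_in_failing)
```

This only compares option names, so the third difference (the location of tooltool.py) still has to be read off the lists by eye.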

(not currently active) Ted Mielczarek

Comment 4

•

8 years ago

(In reply to Geoff Brown [:gbrown] from comment #3)
> I notice these differences:
> - no --authentication-file used for failure case

I don't think this should matter, the file has `visibility: public`.

> - no tooltool cache used for failure case

It's possible that not having a tooltool cache tickles a tooltool bug.

> - different location for tooltool.py (different version possible?)

This is pretty plausible. The first log from comment 3 shows:

04:31:15 INFO - Downloading https://raw.githubusercontent.com/mozilla/build-tooltool/master/tooltool.py to /Users/mozauto/jenkins/workspace/mozilla-central_update/build/tooltool.py

the second log shows:

16:01:35 INFO - 'exes': {'tooltool.py': '/tools/tooltool.py',

...so it's using some baked-in copy of tooltool.py?

It might help to stick a `-v` in the tooltool commandline to get more info out of tooltool. Functionally it's not very complicated, it just builds a URL from the digest in the manifest and tries to download it.
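To make the "builds a URL from the digest" step concrete, here is a minimal sketch rather than tooltool's actual code. It assumes the manifest is a JSON list of records with `filename`, `algorithm` and `digest` keys, and that the download URL has the shape `<base>/<algorithm>/<digest>`; both are assumptions not confirmed in this bug. The function name `candidate_urls` and the all-zero digest are placeholders.

```python
import json
from urllib.parse import urljoin


def candidate_urls(manifest_text, base_url):
    """Hypothetical helper: map each manifest entry to the URL a fetch would try.

    Assumes the URL layout <base>/<algorithm>/<digest>.
    """
    records = json.loads(manifest_text)
    return {
        rec["filename"]: urljoin(base_url, "{}/{}".format(rec["algorithm"], rec["digest"]))
        for rec in records
    }


# Placeholder manifest: the digest below is NOT the real one for this file.
example_manifest = json.dumps([
    {
        "filename": "macosx64-minidump_stackwalk",
        "algorithm": "sha512",
        "digest": "0" * 128,
        "visibility": "public",
    }
])

urls = candidate_urls(example_manifest, "https://tooltool.mozilla-releng.net/")
print(urls["macosx64-minidump_stackwalk"])
```

Running something like this against the real releng.manifest and requesting the resulting URL by hand would show whether the server answers for that digest at all, independent of which copy of tooltool.py is in use.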

Flags: needinfo?(ted)

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 5

•

8 years ago

The tests which are failing here are not run via Taskcluster but via Mozmill CI. They do not use a tooltool cache, that's correct. But they are also not using a local copy of tooltool.py. For each job we run, a fresh copy of mozharness gets downloaded:

https://github.com/mozilla/mozmill-ci/blob/master/jenkins-master/jobs/scripts/workspace/runtests.py#L76

Then we use the common.tests.zip archive and run the firefox-ui-update script, which is based on the script used by the fx-ui tests as executed in TC. The only difference is the mozharness config file used, which is qa_jenkins.py:

https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/configs/firefox_ui_tests

It says `download_tooltool: True`, which always forces a fresh copy of the files. I assume Ted is right that we unmask a problem here which only happens on OS X these days. Linux and Windows are both fine.

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 6

•

8 years ago

If the cause is unclear, Florin could trigger some Nightly update tests in CI based on older Nightlies. The regression is definitely in the range of August 14th to Sep 8th, so about 5 test runs should be necessary to nail this down.
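The estimate of about 5 runs matches a plain bisection over the range. As a rough check, assuming one Nightly per day, counting both end dates as candidates, and using 2017 as the year (the thread only gives day and month):

```python
import math
from datetime import date

YEAR = 2017  # assumption: the thread does not state the year


def bisection_runs(first_day, last_day):
    """Worst-case number of test runs to find the first bad build by bisection.

    Assumes exactly one candidate build per calendar day, both ends included.
    """
    candidates = (last_day - first_day).days + 1
    return math.ceil(math.log2(candidates))


runs = bisection_runs(date(YEAR, 8, 14), date(YEAR, 9, 8))
print(runs)
```

With 26 candidate days the bound is ceil(log2(26)), which rounds up to 5.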

Comment hidden (Intermittent Failures Robot)

[SV Manager] Florin Mezei, QA (:FlorinMezei)

Comment 8

•

8 years ago

This continues to block update tests on OS X - all tests failed last week. Once we merge to Beta (57 Beta 1 should happen tomorrow), this will probably block all update tests on OS X. Does anyone have any update on a potential fix here?

Geoff Brown [:gbrown]

Comment 9

•

8 years ago

I'm not familiar with Mozmill CI so I'm not actively trying to debug this. To follow up on some of the speculation in earlier comments, it would be nice if someone (Florin?, Henrik?) added -v to the tooltool command line, temporarily, to get better diagnostics. Also, retriggering tests as suggested in comment 6 might be useful.
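As a sketch of the temporary change, the snippet below takes the failing argument list from comment 3 and inserts `-v` directly after the tooltool.py entry. The helper name `with_verbose` is hypothetical, and putting the option at that position (rather than elsewhere on the command line) is an assumption; the thread only says to add `-v`.

```python
def with_verbose(argv, script_name="tooltool.py"):
    """Hypothetical helper: return a copy of argv with '-v' after the script path."""
    for index, arg in enumerate(argv):
        if arg.endswith(script_name):
            return argv[: index + 1] + ["-v"] + argv[index + 1:]
    raise ValueError("no {} entry found in argv".format(script_name))


# Failing invocation as logged in comment 3.
failing = [
    '/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python',
    '/Users/mozauto/jenkins/workspace/mozilla-central_update/build/tooltool.py',
    '--url', 'https://tooltool.mozilla-releng.net/',
    'fetch', '-m',
    '/Users/mozauto/jenkins/workspace/mozilla-central_update/build/tests/config/tooltool-manifests/macosx64/releng.manifest',
    '-o',
]

verbose = with_verbose(failing)
print(verbose[1:4])
```

The original list is left untouched, so the change is easy to drop again once the diagnostics have been collected.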

Geoff Brown [:gbrown]

Updated

•

8 years ago

Whiteboard: [stockwell needswork]

Joel Maher ( :jmaher ) (UTC -8)

Comment 10

•

8 years ago

Who is working on this? I see this will cross our 'need to disable' threshold really soon, and we will then disable the offending tests or, without a specific set of offending tests, the entire suite. :whimboo, I see you as the QA contact. Are these tests ones that you are responsible for?

Flags: needinfo?(hskupin)

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 11

•

8 years ago

This is tier-3, and I no longer maintain those tests. Geoff and I made proposals, but so far nothing has happened on this bug.

Flags: needinfo?(hskupin)

Summary: Permafail - ERROR - The following files failed: 'macosx64-minidump_stackwalk' → [tier-3] Permafail - ERROR - The following files failed: 'macosx64-minidump_stackwalk'

Joel Maher ( :jmaher ) (UTC -8)

Comment 12

•

8 years ago

Florin, who is responsible for developing these tests and ensuring they are green? I see in the title this is 'tier-3', but we are getting stars. I would really like to avoid getting data in orangefactor for tests that are tier-3.

Flags: needinfo?(florin.mezei)

[SV Manager] Florin Mezei, QA (:FlorinMezei)

Comment 13

•

8 years ago

(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #12)
> Florin, who is responsible for developing these tests and ensuring they are
> green. I see in the title this is 'tier-3', but we are getting stars, I
> would really like to avoid getting data in orangefactor for tests that are
> tier-3.

Joel, I don't know of anyone responsible for maintaining these anymore. The tests themselves had been disabled, but were then re-enabled temporarily until we do a complete analysis of existing coverage and gaps in automated update testing. They were re-enabled because in their absence we need to do manual testing of updates, which adds lots of extra effort for the manual QA teams.

Is there anything else that we can do to avoid getting this in orangefactor? If we don't star the failing tests, would that help?

Flags: needinfo?(florin.mezei) → needinfo?(jmaher)

Gabriele Svelto [:gsvelto]

Comment 14

•

8 years ago

Shouldn't we start by adding '-v' to the tooltool invocation to figure out what's going on? If this can be reproduced on try I'll be glad to help out with that.

Joel Maher ( :jmaher ) (UTC -8)

Comment 15

•

8 years ago

If the failures are not starred then they will not get into orangefactor, so that would help! I assume the team responsible for writing code for installers and application update would be able to help fix issues with the tests.

Flags: needinfo?(jmaher)

David Durst [:ddurst]

Comment 16

•

8 years ago

Gabriele is correct in #c14 -- we need that to figure out what the issue is.

[SV Manager] Florin Mezei, QA (:FlorinMezei)

Comment 17

•

8 years ago

I won't star these anymore.

Also I've tried re-running some tests with older Nightlies here:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=42151fcd6cfc216d147730d0f2c6a2acd52d22fd&filter-searchStr=fxup&filter-tier=3&selectedJob=131969510

The jobs still failed the same way, with this error. For example this job [1] was run with the same source build (--installer-url parameter) and same test package (--test-packages-url parameter) as when it originally passed on August 13. I'm thinking that either I'm doing something wrong, or the problem is not in the Firefox code.

[1] https://firefox-ui-tests.s3.amazonaws.com/71281e3c-9476-42de-9225-55175388bde6/log_info.log

[SV Manager] Florin Mezei, QA (:FlorinMezei)

Comment 18

•

8 years ago

We are now also seeing this error in Firefox 57 Beta 3 - https://firefox-ui-tests.s3.amazonaws.com/27c140aa-37e5-488f-948e-787d96ba2937/log_info.log. Also, while I haven't noticed this before, this also affects Windows with a similar error: ERROR - The following files failed: 'win32-minidump_stackwalk.exe' - https://firefox-ui-tests.s3.amazonaws.com/ddd76aa7-4a6b-4d6f-bf33-7bf842e357a7/log_info.log.

Joel Maher ( :jmaher ) (UTC -8)

Updated

•

8 years ago

Whiteboard: [stockwell needswork] → [stockwell unknown]

Firefox Bug Husbandry Bot

Comment 19

•

8 years ago

https://wiki.mozilla.org/Bugmasters#Intermittent_Test_Failure_Cleanup

Status: NEW → RESOLVED

Closed: 8 years ago

Resolution: --- → INCOMPLETE

Categories

(Testing :: Firefox UI Tests, defect, P5)

Tracking

(Not tracked)

People

(Reporter: intermittent-bug-filer, Unassigned)

References

Details

(Whiteboard: [stockwell unknown])

Crash Data

Security

(public)
