Closed Bug 1546827 Opened 6 years ago Closed 6 years ago

[Ronin Windows] Puppet configured nodes failing on The following files failed: 'win32-minidump_stackwalk.exe'

Categories

(Infrastructure & Operations :: RelOps: Puppet, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: markco, Assigned: markco)

Details

Attachments

(1 file)

No description provided.

From
Traceback (most recent call last):
URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:726)>
ERROR - The following files failed: 'win32-minidump_stackwalk.exe'

1412667Intermittent ERROR - The following files failed: 'win32-minidump_stackwalk.exe'

Return code: 1
Traceback (most recent call last):
URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:726)>
ERROR - The following files failed: 'win32-minidump_stackwalk.exe'

1412667Intermittent ERROR - The following files failed: 'win32-minidump_stackwalk.exe'

Return code: 1
Traceback (most recent call last):
URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:726)>
ERROR - The following files failed: 'win32-minidump_stackwalk.exe'

1412667Intermittent ERROR - The following files failed: 'win32-minidump_stackwalk.exe'

Return code: 1
Traceback (most recent call last):
URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:726)>
ERROR - The following files failed: 'win32-minidump_stackwalk.exe'

1412667Intermittent ERROR - The following files failed: 'win32-minidump_stackwalk.exe'

Return code: 1
Traceback (most recent call last):
URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:726)>
ERROR - The following files failed: 'win32-minidump_stackwalk.exe'

1412667Intermittent ERROR - The following files failed: 'win32-minidump_stackwalk.exe'

Return code: 1
Tooltool C:\Users\task_1556143335\build\tests\config/tooltool-manifests/win32/releng.manifest fetch failed!
Running post_fatal callback...
Exiting -1
[taskcluster:error] exit status 4294967295

1545973Intermittent [taskcluster:error] exit status 4294967295 After No connection could be made because the target machine actively refused it


All the failures look rather similar to this one.

Assignee: relops → mcornmesser

Log output just before the traceback:

22:38:41 INFO - retry: Calling run_command with args: (['C:\mozilla-build\python\python2.7.exe', '-u', 'C:\Users\task_1556145316\mozharness\external_tools\tooltool.py', '--url', 'https://tooltool.mozilla-releng.net/', 'fetch', '-m', 'C:\Users\task_1556145316\build\tests\config/tooltool-manifests/win32/releng.manifest', '-o', '-c', 'c:\build\tooltool_cache'],), kwargs: {'output_timeout': 600, 'error_list': [{'substr': 'command not found', 'level': 'error'}, {'regex': <_sre.SRE_Pattern object at 0x0000000002810D00>, 'level': 'warning'}, {'substr': 'Traceback (most recent call last)', 'level': 'error'}, {'substr': 'SyntaxError: ', 'level': 'error'}, {'substr': 'TypeError: ', 'level': 'error'}, {'substr': 'NameError: ', 'level': 'error'}, {'substr': 'ZeroDivisionError: ', 'level': 'error'}, {'regex': <_sre.SRE_Pattern object at 0x00000000030D4030>, 'level': 'critical'}, {'regex': <_sre.SRE_Pattern object at 0x00000000030B6C60>, 'level': 'critical'}, {'substr': 'ERROR - ', 'level': 'error'}], 'cwd': 'C:\Users\task_1556145316\build', 'privileged': False}, attempt #1
22:38:41 INFO - Running command: ['C:\mozilla-build\python\python2.7.exe', '-u', 'C:\Users\task_1556145316\mozharness\external_tools\tooltool.py', '--url', 'https://tooltool.mozilla-releng.net/', 'fetch', '-m', 'C:\Users\task_1556145316\build\tests\config/tooltool-manifests/win32/releng.manifest', '-o', '-c', 'c:\build\tooltool_cache'] in C:\Users\task_1556145316\build
22:38:41 INFO - Copy/paste: C:\mozilla-build\python\python2.7.exe -u C:\Users\task_1556145316\mozharness\external_tools\tooltool.py --url https://tooltool.mozilla-releng.net/ fetch -m C:\Users\task_1556145316\build\tests\config/tooltool-manifests/win32/releng.manifest -o -c c:\build\tooltool_cache
22:38:41 INFO - Calling ['C:\mozilla-build\python\python2.7.exe', '-u', 'C:\Users\task_1556145316\mozharness\external_tools\tooltool.py', '--url', 'https://tooltool.mozilla-releng.net/', 'fetch', '-m', 'C:\Users\task_1556145316\build\tests\config/tooltool-manifests/win32/releng.manifest', '-o', '-c', 'c:\build\tooltool_cache'] with output_timeout 600
22:38:41 INFO - INFO - File win32-minidump_stackwalk.exe not present in local cache folder c:\build\tooltool_cache
22:38:41 INFO - INFO - Attempting to fetch from 'https://tooltool.mozilla-releng.net/'...
22:38:41 INFO - INFO - ...failed to fetch 'win32-minidump_stackwalk.exe' from https://tooltool.mozilla-releng.net/

Another link: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=242446049&repo=try&lineNumber=582

The tooltool fetch is failing, but I am not seeing a reason why.

jmaher, grenade: any suggestions on what to look for here?

Flags: needinfo?(rthijssen)
Flags: needinfo?(jmaher)

It looks like an SSL error:

22:38:41 INFO - URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:726)>
22:38:41 ERROR - ERROR - The following files failed: 'win32-minidump_stackwalk.exe'
22:38:41 ERROR - Return code: 1
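For anyone scripting around these failures, here is a minimal sketch of telling a certificate verification failure apart from other URLError causes (refused connection, DNS, and so on). The helper name `is_cert_verify_failure` is hypothetical and is not part of tooltool.py or mozharness.

```python
import ssl

try:
    from urllib.error import URLError  # Python 3
except ImportError:
    from urllib2 import URLError  # Python 2.7, as used on these workers


def is_cert_verify_failure(error):
    """Return True if a URLError wraps an SSL certificate verification failure."""
    # URLError stores the underlying exception (or message) in .reason.
    reason = getattr(error, "reason", None)
    return (
        isinstance(reason, ssl.SSLError)
        and "CERTIFICATE_VERIFY_FAILED" in str(reason)
    )
```

A True result points at the trust store on the worker rather than at the token, the cache path, or the tooltool service itself.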

I suspect there is a piece in OCC I missed when I ported over the configuration.

Flags: needinfo?(jmaher)

the cache path looks suspect to me.

i think it should be C:\builds\tooltool_cache, rather than C:\build\tooltool_cache, unless these workers are doing some different setup that involves granting write access to task users on C:\build the way we normally do for C:\builds & creating the tooltool_cache folder.

but in any case, i'm not sure that would cause the errors we're seeing.

The CERTIFICATE_VERIFY_FAILED could be down to C:\mozilla-build\msys\etc\cacert.pem being missing/outdated/invalid.
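One way to check whether that file matters at all is to ask the interpreter where it looks for CA certificates by default. The sketch below uses only the standard `ssl` module (`ssl.get_default_verify_paths()` is available from Python 2.7.9). The helper name `describe_ca_sources` is hypothetical; the path in the usage lines is the one mentioned above.

```python
import os
import ssl


def describe_ca_sources(candidate_pem):
    """Report the interpreter's default CA locations and whether a PEM file exists."""
    paths = ssl.get_default_verify_paths()
    return {
        # Locations OpenSSL will actually use (None if they do not exist).
        "default_cafile": paths.cafile,
        "default_capath": paths.capath,
        # Location compiled into this OpenSSL build.
        "openssl_cafile": paths.openssl_cafile,
        # Override taken from the environment, if one is set.
        "env_override": os.environ.get(paths.openssl_cafile_env),
        # Whether the file we suspect is present on disk at all.
        "candidate_exists": os.path.isfile(candidate_pem),
    }


if __name__ == "__main__":
    info = describe_ca_sources(r"C:\mozilla-build\msys\etc\cacert.pem")
    for key in sorted(info):
        print("%s: %s" % (key, info[key]))
```

If none of the reported locations point at the msys cacert.pem, the Python that runs tooltool is probably not reading it. On Windows, the default context can also load certificates from the system certificate store, so that store is worth checking too.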

You may get better error output by downloading the following to one of these instances:

  • tooltool.py
  • releng.manifest

and then running:
C:\mozilla-build\python\python2.7.exe -u tooltool.py --url https://tooltool.mozilla-releng.net/ fetch -m releng.manifest -o -c c:\build\tooltool_cache

also check that C:\builds\relengapi.tok exists and contains a valid token. you can try to manually download a file from tooltool at the powershell prompt with something like this:

$url = 'https://tooltool.mozilla-releng.net/sha512/2bc729f9cedfba59b5c7a088f00d00fc078af3bd08e88ee41bbb1ea092038466f46589cef036e0d928249f6037fb22828f62e6d82a32d018f66ca92a834393c8'
$localPath = 'C:\Windows\Temp\win32-minidump_stackwalk.exe'
$tokenPath = ('{0}\builds\occ-installers.tok' -f $env:SystemDrive)
$bearerToken = (Get-Content -Path $tokenPath -Raw)
$webClient = New-Object -TypeName 'System.Net.WebClient'
$webClient.Headers.Add('Authorization', ('Bearer {0}' -f $bearerToken))
$webClient.DownloadFile($url, $localPath)
Flags: needinfo?(rthijssen)
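A rough Python counterpart to the PowerShell snippet above, useful because it exercises the same interpreter that tooltool runs under. This is only a sketch: `build_authorized_request` is a hypothetical helper, and it only builds the request, it does not send it.

```python
try:
    from urllib.request import Request  # Python 3
except ImportError:
    from urllib2 import Request  # Python 2.7, as used on these workers


def build_authorized_request(url, token):
    """Return a Request carrying the token as a Bearer Authorization header."""
    request = Request(url)
    # Token files often end with a newline; strip it before building the header.
    request.add_header("Authorization", "Bearer %s" % token.strip())
    return request
```

Passing the result to `urlopen` performs the TLS handshake before any header is sent, so a certificate problem shows up regardless of whether the token is valid.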

The c:\build directory is being created by the task. I added C:\builds to the configuration, and the task then created C:\builds\tooltool_cache. The pem file was in the wrong location, and I am now moving it to the correct one. Otherwise the node looks the same as the OCC-configured node; neither has C:\builds\relengapi.tok. However, the tests still fail at the same point, even after creating C:\builds\relengapi.tok.

The odd thing is that if I run the manual PowerShell download of win32-minidump_stackwalk.exe, the download works and the test will then complete. https://tools.taskcluster.net/groups/XWY53NPOQAudsbFzzctNNg/tasks/XWY53NPOQAudsbFzzctNNg/details

It looks like tooltool is hitting an SSL error before the token is checked. I have not been able to get any additional info from running the command locally:
PS C:\Users\task_1556594145> C:\mozilla-build\python\python2.7.exe -u C:\mozilla-build\tooltool.py --url https://tooltool.mozilla-releng.net/ -v --authentication-file c:\builds\relengapi.tok fetch -m C:\ProgramData\build\tests\talos\tp5n-pageset.manifest -o -c c:\build\tooltool_cache
DEBUG - processing 'fetch' command with args ''
DEBUG - using options: {'cache_folder': 'c:\build\tooltool_cache', 'algorithm': 'sha512', 'loglevel': 10, 'region': None, 'base_url': ['https://tooltool.mozilla-releng.net/'], 'visibility': None, 'manifest': 'C:\ProgramData\build\tests\talos\tp5n-pageset.manifest', 'version': None, 'auth_file': 'c:\builds\relengapi.tok', 'message': None, 'unpack': False, 'overwrite': True, 'size': 0.0}
DEBUG - materialized main.FileRecord(filename='tp5n.zip', size=81753814, digest='***', algorithm='sha512', visibility=None)
DEBUG - loaded manifest from file 'C:\ProgramData\build\tests\talos\tp5n-pageset.manifest'
INFO - File tp5n.zip not present in local cache folder c:\build\tooltool_cache
DEBUG - fetching tp5n.zip
INFO - Attempting to fetch from 'https://tooltool.mozilla-releng.net/'...
DEBUG - Using Bearer token in c:\builds\relengapi.tok
INFO - ...failed to fetch 'tp5n.zip' from https://tooltool.mozilla-releng.net/
Traceback (most recent call last):
  File "C:\mozilla-build\tooltool.py", line 689, in fetch_file
    f = urllib2.urlopen(req)
  File "C:\mozilla-build\python\lib\urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "C:\mozilla-build\python\lib\urllib2.py", line 429, in open
    response = self._open(req, data)
  File "C:\mozilla-build\python\lib\urllib2.py", line 447, in _open
    '_open', req)
  File "C:\mozilla-build\python\lib\urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "C:\mozilla-build\python\lib\urllib2.py", line 1241, in https_open
    context=self._context)
  File "C:\mozilla-build\python\lib\urllib2.py", line 1198, in do_open
    raise URLError(err)
URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:726)>
ERROR - The following files failed: 'tp5n.zip'
PS C:\Users\task_1556594145>

It turns out PowerShell was downloading the cert for https://tooltool.mozilla-releng.net and adding it to the certificate store, which would explain why the test completed after the manual download.

The pem file in c:\mozilla-build\msys\etc\ doesn't seem to be in use when tooltool is being called.
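To test whether a given pem file would help if Python were pointed at it, a TLS context can be given the CA file explicitly. This is a sketch using the standard library only (`ssl.create_default_context` and the `context=` argument to `urlopen` both need Python 2.7.9 or later). It is not how tooltool.py opens connections; the traceback only shows it calling `urllib2.urlopen(req)` with defaults. The names `make_verifying_context` and `fetch` are hypothetical.

```python
import ssl

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2.7.9+ accepts context=


def make_verifying_context(cafile=None):
    """Build a context that verifies the server; cafile=None uses system defaults."""
    # create_default_context enables certificate and hostname verification.
    return ssl.create_default_context(cafile=cafile)


def fetch(url, cafile=None, timeout=60):
    """Download url, verifying the server against cafile (or the defaults)."""
    response = urlopen(url, timeout=timeout, context=make_verifying_context(cafile))
    try:
        return response.read()
    finally:
        response.close()
```

If `fetch(url, cafile=...)` succeeds with the pem file while `fetch(url)` fails, the file is valid but simply is not being consulted by default.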

To work around this, I have added a script to get the certificate during the puppet run.
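The actual script is in the attached pull request and is not reproduced here. Purely as an illustration of the general idea, and not the method used in the fix, the sketch below fetches a server's certificate and appends it to a PEM bundle only if it is not already present, so that repeated Puppet runs do not duplicate it. Both function names and the bundle-file approach are assumptions.

```python
import os
import ssl


def fetch_server_pem(host, port=443):
    """Return the server's leaf certificate as PEM text (fetched without verification)."""
    return ssl.get_server_certificate((host, port))


def append_pem_if_missing(pem_text, bundle_path):
    """Append pem_text to bundle_path unless already there; return True if written."""
    pem_text = pem_text.strip()
    existing = ""
    if os.path.isfile(bundle_path):
        with open(bundle_path, "r") as handle:
            existing = handle.read()
    if pem_text in existing:
        return False  # Already installed; keep the operation idempotent.
    with open(bundle_path, "a") as handle:
        if existing and not existing.endswith("\n"):
            handle.write("\n")
        handle.write(pem_text + "\n")
    return True
```

Two caveats: a certificate fetched this way is trusted on first use, so it should be compared against a known fingerprint before installing, and whether a leaf certificate alone satisfies verification depends on the OpenSSL build in use.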

Attached file GitHub Pull Request
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED