Closed Bug 1561261 Opened 5 years ago Closed 5 years ago

Perma tier 2 SSLError: SSL validation failed for https://s3.us-west-2.amazonaws.com/gecko-docs.mozilla.org/main/latest/_images/AsyncPanZoomArchitecture.png [Errno 2] No such file or directory

Categories

(Core :: Graphics, defect, P1)

Tracking

RESOLVED FIXED
mozilla69
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- unaffected
firefox68 --- unaffected
firefox69 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: kats)

References

(Regression)

Details

(Keywords: intermittent-failure, regression)

Attachments

(1 file)

Filed by: aciure [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=253277210&repo=mozilla-central
Full log: https://queue.taskcluster.net/v1/task/Y_Vk-jauRXGImLT1fWFmDw/runs/0/artifacts/public/logs/live_backing.log


[task 2019-06-25T09:50:07.007Z] /builds/worker/checkouts/gecko/python/mozbuild/mozbuild/mozinfo.py:docstring of mozbuild.mozinfo.write_mozinfo:1: WARNING: Undefined substitution referenced: "file".
[task 2019-06-25T09:50:07.007Z] WARNING: autodoc: failed to import module u'invoke_mach_command' from module u'mach.test'; the following exception was raised:
[task 2019-06-25T09:50:07.007Z] Traceback (most recent call last):
[task 2019-06-25T09:50:07.007Z] File "/builds/worker/checkouts/gecko/obj-x86_64-pc-linux-gnu/_virtualenvs/docs-y-qLfVwz/lib/python2.7/site-packages/sphinx/ext/autodoc/importer.py", line 154, in import_module
[task 2019-06-25T09:50:07.007Z] __import__(modname)
[task 2019-06-25T09:50:07.007Z] File "/builds/worker/checkouts/gecko/build/mach_bootstrap.py", line 400, in __call__
[task 2019-06-25T09:50:07.007Z] module = self._original_import(name, globals, locals, fromlist, level)
[task 2019-06-25T09:50:07.007Z] File "/builds/worker/checkouts/gecko/python/mach/mach/test/invoke_mach_command.py", line 4, in <module>
[task 2019-06-25T09:50:07.007Z] subprocess.check_call([sys.executable] + sys.argv[1:])
[task 2019-06-25T09:50:07.007Z] File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
[task 2019-06-25T09:50:07.007Z] raise CalledProcessError(retcode, cmd)
[task 2019-06-25T09:50:07.007Z] CalledProcessError: Command '['/usr/bin/python2.7', 'doc', '--upload', '--no-open', '--no-serve']' returned non-zero exit status 2
[task 2019-06-25T09:50:07.007Z]
[task 2019-06-25T09:50:07.007Z] WARNING: autodoc: failed to import module u'registrar_dispatch' from module u'mach.test'; the following exception was raised:
[task 2019-06-25T09:50:07.007Z] Traceback (most recent call last):
[task 2019-06-25T09:50:07.007Z] File "/builds/worker/checkouts/gecko/obj-x86_64-pc-linux-gnu/_virtualenvs/docs-y-qLfVwz/lib/python2.7/site-packages/sphinx/ext/autodoc/importer.py", line 154, in import_module
[task 2019-06-25T09:50:07.007Z] __import__(modname)
[task 2019-06-25T09:50:07.007Z] File "/builds/worker/checkouts/gecko/build/mach_bootstrap.py", line 400, in __call__
[task 2019-06-25T09:50:07.007Z] module = self._original_import(name, globals, locals, fromlist, level)
[task 2019-06-25T09:50:07.007Z] File "/builds/worker/checkouts/gecko/python/mach/mach/test/registrar_dispatch.py", line 3, in <module>
[task 2019-06-25T09:50:07.007Z] self._mach_context.commands.dispatch('uuid', self._mach_context) # noqa: F821
[task 2019-06-25T09:50:07.007Z] NameError: name 'self' is not defined
[task 2019-06-25T09:50:07.007Z]
[task 2019-06-25T09:50:07.007Z] WARNING: autodoc: failed to import module u'zero_microseconds' from module u'mach.test'; the following exception was raised:
[task 2019-06-25T09:50:07.008Z] Traceback (most recent call last):
[task 2019-06-25T09:50:07.008Z] File "/builds/worker/checkouts/gecko/obj-x86_64-pc-linux-gnu/_virtualenvs/docs-y-qLfVwz/lib/python2.7/site-packages/sphinx/ext/autodoc/importer.py", line 154, in import_module
[task 2019-06-25T09:50:07.008Z] __import__(modname)
[task 2019-06-25T09:50:07.008Z] File "/builds/worker/checkouts/gecko/build/mach_bootstrap.py", line 400, in __call__
[task 2019-06-25T09:50:07.008Z] module = self._original_import(name, globals, locals, fromlist, level)
[task 2019-06-25T09:50:07.008Z] File "/builds/worker/checkouts/gecko/python/mach/mach/test/zero_microseconds.py", line 3, in <module>
[task 2019-06-25T09:50:07.008Z] old = self._mach_context.post_dispatch_handler # noqa: F821
[task 2019-06-25T09:50:07.008Z] NameError: name 'self' is not defined
[task 2019-06-25T09:50:07.008Z]
[task 2019-06-25T09:50:07.008Z] /builds/worker/checkouts/gecko/python/mozbuild/mozbuild/frontend/context.py:docstring of mozbuild.frontend.context.ContextDerivedTypedRecord:10: WARNING: Definition list ends without a blank line; unexpected unindent.
[task 2019-06-25T09:50:07.008Z] /builds/worker/checkouts/gecko/python/mozbuild/mozbuild/frontend/context.py:docstring of mozbuild.frontend.context.Path:5: WARNING: Unexpected indentation.
[task 2019-06-25T09:50:07.008Z] /builds/worker/checkouts/gecko/python/mozbuild/mozbuild/frontend/context.py:docstring of mozbuild.frontend.context.Path.join:1: WARNING: Inline emphasis start-string without end-string.
[task 2019-06-25T09:50:07.008Z] /builds/worker/checkouts/gecko/python/mozbuild/mozbuild/frontend/reader.py:docstring of mozbuild.frontend.reader.BuildReader.read_mozbuild:6: WARNING: Inline emphasis start-string without end-string.
[task 2019-06-25T09:50:07.008Z] evaluate.js:evaluate(32):127: WARNING: Unknown target name: "not".
[task 2019-06-25T09:50:07.008Z] WARNING: autodoc: failed to import module u'test_manifest' from module u'mozbuild.test'; the following exception was raised:
[task 2019-06-25T09:50:07.008Z] No module named nose.tools

Summary: Intermittent SSLError: SSL validation failed for https://s3.us-west-2.amazonaws.com/gecko-docs.mozilla.org/main/latest/_images/AsyncPanZoomArchitecture.png [Errno 2] No such file or directory → Perma tier 2 SSLError: SSL validation failed for https://s3.us-west-2.amazonaws.com/gecko-docs.mozilla.org/main/latest/_images/AsyncPanZoomArchitecture.png [Errno 2] No such file or directory

Mike, could this be from bug 1554987?

Or from Kartikaya's bug 1519598?

Flags: needinfo?(mh+mozilla)
Flags: needinfo?(kats)

Doesn't seem related to my patch. Looks like a TLS misconfiguration on the S3 bucket or something.

Flags: needinfo?(kats)

It looked like it could be an infra problem, but it's not happening on retriggers of the parent commits, and is still happening on new retriggers of the already failing tasks.
I don't see how either of the two bugs mentioned in comment 1 could be responsible. But I also don't see how anything in the merge where it first happened could be responsible either.
Redirecting to Andrew, who touched the code most recently and might have an idea.

Flags: needinfo?(mh+mozilla) → needinfo?(ahal)

I'm not very familiar with the S3 aspect of the docs. Dustin, do you have any ideas? Is there an AWS management console you (or someone you know) might have access to?

Flags: needinfo?(ahal) → needinfo?(dustin)

I don't see anything about SSL in comment 0. What's the issue?

Flags: needinfo?(dustin)

Sorry, should have provided more context. Comment 0 is the wrong part of the log (there are a lot of expected tracebacks as part of doc generation). The real issue is towards the end of the log:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=253277210&repo=mozilla-central&lineNumber=2412

SSLError: SSL validation failed for https://s3.us-west-2.amazonaws.com/gecko-docs.mozilla.org/main/latest/_images/AsyncPanZoomArchitecture.png [Errno 2] No such file or directory

Though to be honest, it doesn't look like a certificate error, so maybe it's caused by an in-tree change after all.

Also, in case it wasn't clear, this bug means that doc changes are not being reflected on firefox-source-docs.mozilla.org, so it should be high priority to fix.

Priority: P5 → P1

Got it. Is it just me, or is that traceback upside-down?

[task 2019-06-25T09:53:52.320Z] SSLError: SSL validation failed for https://s3.us-west-2.amazonaws.com/gecko-docs.mozilla.org/main/latest/_images/AsyncPanZoomArchitecture.png [Errno 2] No such file or directory
[task 2019-06-25T09:53:52.320Z] 
[task 2019-06-25T09:53:52.320Z]   File "/builds/worker/checkouts/gecko/tools/docs/mach_commands.py", line 91, in build_docs
[task 2019-06-25T09:53:52.320Z]     self._s3_upload(savedir, self.project, self.version)
[task 2019-06-25T09:53:52.320Z]   File "/builds/worker/checkouts/gecko/tools/docs/mach_commands.py", line 190, in _s3_upload
[task 2019-06-25T09:53:52.320Z]     s3_upload(files, key_prefix='%s/latest' % project)
[task 2019-06-25T09:53:52.320Z]   File "/builds/worker/checkouts/gecko/tools/docs/moztreedocs/upload.py", line 85, in s3_upload
[task 2019-06-25T09:53:52.320Z]     f.result()
[task 2019-06-25T09:53:52.320Z]   File "/builds/worker/checkouts/gecko/third_party/python/futures/concurrent/futures/_base.py", line 398, in result
[task 2019-06-25T09:53:52.320Z]     return self.__get_result()
[task 2019-06-25T09:53:52.321Z]   File "/builds/worker/checkouts/gecko/third_party/python/futures/concurrent/futures/thread.py", line 55, in run
[task 2019-06-25T09:53:52.321Z]     result = self.fn(*self.args, **self.kwargs)
[task 2019-06-25T09:53:52.321Z]   File "/builds/worker/checkouts/gecko/tools/docs/moztreedocs/upload.py", line 61, in upload
[task 2019-06-25T09:53:52.321Z]     s3.upload_fileobj(f, bucket, key, ExtraArgs=extra_args)
[task 2019-06-25T09:53:52.321Z]   File "/builds/worker/checkouts/gecko/obj-x86_64-pc-linux-gnu/_virtualenvs/docs-y-qLfVwz/lib/python2.7/site-packages/boto3/s3/inject.py", line 539, in upload_fileobj
[task 2019-06-25T09:53:52.321Z]     return future.result()
[task 2019-06-25T09:53:52.321Z]   File "/builds/worker/checkouts/gecko/obj-x86_64-pc-linux-gnu/_virtualenvs/docs-y-qLfVwz/lib/python2.7/site-packages/s3transfer/futures.py", line 106, in result
[task 2019-06-25T09:53:52.321Z]     return self._coordinator.result()
[task 2019-06-25T09:53:52.321Z]   File "/builds/worker/checkouts/gecko/obj-x86_64-pc-linux-gnu/_virtualenvs/docs-y-qLfVwz/lib/python2.7/site-packages/s3transfer/futures.py", line 265, in result
[task 2019-06-25T09:53:52.321Z]     raise self._exception

Based on the error, I'd suspect something wrong with the Python SSL configuration -- unable to find its certificate store, perhaps? All it's doing is making a PUT request to an Amazon S3 endpoint, nothing too complicated.
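
One way to check the certificate-store theory on a worker is to ask Python where it expects CA certificates and whether those locations exist. The following is a minimal diagnostic sketch, not part of the upload code: describe_ca_paths is a hypothetical helper name, and the environment variables listed are ones commonly consulted by OpenSSL, requests and botocore, which may or may not be what this task actually uses.

```python
import os
import ssl

# Environment variables that can redirect CA lookup (assumed relevant here).
CA_ENV_VARS = ("SSL_CERT_FILE", "SSL_CERT_DIR", "REQUESTS_CA_BUNDLE", "AWS_CA_BUNDLE")


def describe_ca_paths():
    """Report each candidate CA location and whether it exists on disk."""
    paths = ssl.get_default_verify_paths()
    candidates = {
        "cafile": paths.cafile,
        "capath": paths.capath,
        "openssl_cafile": paths.openssl_cafile,
        "openssl_capath": paths.openssl_capath,
    }
    for name in CA_ENV_VARS:
        candidates["env:" + name] = os.environ.get(name)

    report = {}
    for label, path in candidates.items():
        # An unset or empty path counts as "does not exist".
        report[label] = {"path": path, "exists": bool(path) and os.path.exists(path)}
    return report


if __name__ == "__main__":
    for label, info in sorted(describe_ca_paths().items()):
        status = "ok" if info["exists"] else "MISSING"
        print("%-24s %-8s %s" % (label, status, info["path"]))
```

A configured path that is reported as missing would be consistent with an "[Errno 2] No such file or directory" raised during SSL validation, though it would not by itself prove that this is the cause.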

Assuming this file is the only one afflicted, we could probably just catch/print the exception instead of bailing. That way the rest of the files would presumably still sync.
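
The catch-and-print idea could look roughly like the sketch below. It is an illustration under assumptions, not the code in upload.py: upload_all and the upload_one callable are hypothetical names, and the real uploader calls s3.upload_fileobj with extra arguments that are omitted here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def upload_all(files, upload_one, max_workers=4):
    """Upload every file, collecting failures instead of stopping at the first.

    upload_one is a hypothetical callable taking a single path.
    Returns a dict mapping each failed path to the exception it raised.
    """
    failures = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(upload_one, path): path for path in files}
        for future in as_completed(futures):
            path = futures[future]
            try:
                future.result()
            except Exception as exc:
                # Log and keep going so the remaining files still sync.
                print("upload failed for %s: %s" % (path, exc))
                failures[path] = exc
    return failures
```

The caller would still want to fail the task when the returned dict is non-empty, so that a partially synced upload does not go unnoticed.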

Though, I'm attempting to run a bisection on autoland to see if we can trace this to something in-tree.

Note the file that fails is the first one. Presumably, every other one would fail too.

So after bisecting on autoland, turns out this is somehow regressed by bug 1519598 after all:
https://treeherder.mozilla.org/#/jobs?repo=autoland&searchStr=docup&tochange=72d858dcdb26047b09afc9cda33c18dc14b44b52&fromchange=21826fb830de97ce71635ad784acf5a0b0bf237c

This makes absolutely no sense to me, so I decided to back it out on try and sure enough it went green again:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=567558a11911c2778907d22da5a958dde95d04fc

So uh.. if anyone has any ideas feel free to chime in. In case anyone wants to poke around, the upload code lives here:
https://searchfox.org/mozilla-central/source/tools/docs/moztreedocs/upload.py

Kats, I don't think we need this fixed right this instant, but if we don't have any leads by next week we should probably back out.

Flags: needinfo?(kats)
Regressed by: 1519598

Bizarre. I'll take a look.

Assignee: nobody → kats
Flags: needinfo?(kats)

Somewhat unsurprisingly, it looks like https://hg.mozilla.org/integration/autoland/rev/72d858dcdb26047b09afc9cda33c18dc14b44b52 is the specific commit that causes the regression. It's almost certainly a bug in our doc generation/upload code. I'll take a look too.

I did try pushes to bisect. It's the top-level import of requests that's the problem. Testing a fix now.

For some reason, importing requests at the top level causes some sort of
failure in the doc upload process. No other mach_commands.py files do this,
so this patch moves the import into the functions that use it, which
bypasses the problem.
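
The pattern of moving an import into the functions that use it can be sketched as follows. This is only an illustration of the technique, not the patch itself: the stdlib module colorsys stands in for requests so the example runs without third-party packages, and to_hls is a hypothetical function name.

```python
import sys

# Note: no "import colorsys" at module level. Loading this file therefore
# has no side effects from that module.


def to_hls(red, green, blue):
    """Convert an RGB triple to HLS, importing the helper module on demand."""
    # Deferred import: the module is only loaded the first time this
    # function runs, not when the enclosing file is imported.
    import colorsys

    return colorsys.rgb_to_hls(red, green, blue)


def is_loaded(name):
    """Return True if the named module has already been imported."""
    return name in sys.modules
```

After the first call to to_hls, is_loaded("colorsys") returns True. With this layout, code that merely imports the file (as mach does when discovering commands) does not load the deferred module at all.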

Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/887674e502de
Don't import requests at the top level. r=ahal
See Also: → 1562226
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla69
Has Regression Range: --- → yes