Closed Bug 1233716 Opened 7 years ago Closed 7 years ago

Fix dumping of crashes in docker containers

Categories

(Testing :: General, defect)

defect
Not set
normal

Tracking

(firefox46 fixed)

RESOLVED FIXED
mozilla46
Tracking Status
firefox46 --- fixed

People

(Reporter: armenzg, Assigned: armenzg)

References

Details

Attachments

(1 file)

In bug 1233554 we noticed that we're not dumping crashes.

whimboo noticed that we don't set download_minidump_stackwalk:

However, setting minidump_stackwalk_path should have been sufficient (not needing to download):
> 18:56:14     INFO -  'minidump_stackwalk_path': '/usr/local/bin/linux64-minidump_stackwalk',

I'm look into this.

[1]
> 21:40:12     INFO -  'download_minidump_stackwalk': False,
This explains why it is happening. The reason for that is the following config file which you make use of:
https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/configs/remove_executables.py
And, indeed, that file is baked into the image:

root@taskcluster-worker:~# file /usr/local/bin/linux64-minidump_stackwalk
/usr/local/bin/linux64-minidump_stackwalk: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, for GNU/Linux 2.6.24, BuildID[sha1]=0xc972418a691a3bbf8a96044b63d8729bbd19eccc, stripped

However,
https://dxr.mozilla.org/mozilla-central/source/testing/docker/desktop-test/Dockerfile#17
# TODO: remove
ADD            https://s3-us-west-2.amazonaws.com/test-caching/packages/linux64-stackwalk /usr/local/bin/linux64-minidump_stackwalk
RUN chmod u+x /usr/local/bin/linux64-minidump_stackwalk

Notably, that's not coming from tooltool.  So (a) why is it not using the given file and (b) should we do this a different way?
dustin, you're right. It should be using, however, that changes seem that it never got tested if Mozharness would actually use it. The following patch fixes it.

From https://public-artifacts.taskcluster.net/Ac1gvqrGSUKRe5Jm7gV7hA/0/public/logs/live_backing.log
16:24:53     INFO - ENV: MINIDUMP_STACKWALK is now /usr/local/bin/linux64-minidump_stackwalk
It seems that another issue is discovered after fixing the first one.
I will add more info in the output and determine what the permission is for the .dmp file.


16:38:44  WARNING -  TEST-UNEXPECTED-FAIL | file:///home/worker/workspace/build/tests/reftest/tests/dom/canvas/crashtests/780392-1.html | application terminated with exit code 6
16:38:44     INFO -  mozcrash INFO | Downloading symbols from: https://queue.taskcluster.net/v1/task/e3NeMV5ZT3iUrNfEgbrbug/artifacts/public/build/target.crashreporter-symbols.zip
16:38:51     INFO -  Traceback (most recent call last):
16:38:51     INFO -    File "/home/worker/workspace/build/tests/reftest/runreftest.py", line 738, in <module>
16:38:51     INFO -      main()
16:38:51     INFO -    File "/home/worker/workspace/build/tests/reftest/runreftest.py", line 734, in main
16:38:51     INFO -      sys.exit(reftest.runTests(options.tests, options))
16:38:51     INFO -    File "/home/worker/workspace/build/tests/reftest/runreftest.py", line 404, in runTests
16:38:51     INFO -      return self.runSerialTests(manifests, options, cmdlineArgs)
16:38:51     INFO -    File "/home/worker/workspace/build/tests/reftest/runreftest.py", line 684, in runSerialTests
16:38:51     INFO -      debuggerInfo=debuggerInfo)
16:38:51     INFO -    File "/home/worker/workspace/build/tests/reftest/runreftest.py", line 652, in runApp
16:38:51     INFO -      symbolsPath, test_name=self.lastTestSeen)
16:38:51     INFO -    File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/mozcrash/mozcrash.py", line 93, in check_for_crashes
16:38:51     INFO -      for info in crash_info:
16:38:51     INFO -    File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/mozcrash/mozcrash.py", line 202, in __iter__
16:38:51     INFO -      rv = self._process_dump_file(path, extra)
16:38:51     INFO -    File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/mozcrash/mozcrash.py", line 237, in _process_dump_file
16:38:51     INFO -      stderr=subprocess.PIPE)
16:38:51     INFO -    File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
16:38:52     INFO -      errread, errwrite)
16:38:52     INFO -    File "/usr/lib/python2.7/subprocess.py", line 1249, in _execute_child
16:38:52     INFO -      raise child_exception
16:38:52     INFO -  OSError: [Errno 13] Permission denied
Summary: Fix MINIDUMP_STACKWALK not being set → Fix dumping of crashes in docker containers
Comment on attachment 8700051 [details]
MozReview Request: Bug 1233716 - Set self.minidump_stackwalk_path if specified in the configs. r=jlund

https://reviewboard.mozilla.org/r/28539/#review25385

Keep in mind that I'm not an official peer for mozharness, so you might want to ask someone else for a final review. Otherwise you can find my thoughts below.

::: testing/mozharness/mozharness/mozilla/testing/testbase.py:659
(Diff revision 1)
> +            return self.minidump_stackwalk_path

I don't think that we should set the internal property here. It should simply behave like all the other cases and only return the path.

::: testing/mozharness/mozharness/mozilla/testing/testbase.py:682
(Diff revision 1)
> +            return self.minidump_stackwalk_path

We have duplicated code here. Why cannot we simply call `query_minidump_filename()` and return immediately in case a path is set?
Attachment #8700051 - Flags: review?(hskupin)
Armen and I talked about this on IRC and agreed that using the in-image binary is fine for now. It doesn't change that often. Hopefully in the future we'll be able to rebuild images directly from the in-tree Dockerfiles and updating it will be easy.
Comment on attachment 8700051 [details]
MozReview Request: Bug 1233716 - Set self.minidump_stackwalk_path if specified in the configs. r=jlund

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/28539/diff/1-2/
Attachment #8700051 - Flags: review?(hskupin)
Comment on attachment 8700051 [details]
MozReview Request: Bug 1233716 - Set self.minidump_stackwalk_path if specified in the configs. r=jlund

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/28539/diff/2-3/
Attachment #8700051 - Flags: review?(hskupin) → review?(jlund)
I will fix mozcrash not notifying us properly to bug 1234353.
I will fix the binary not being executable for the worker user to bug 1234352.
Depends on: 1234352, 1234353
Comment on attachment 8700051 [details]
MozReview Request: Bug 1233716 - Set self.minidump_stackwalk_path if specified in the configs. r=jlund

https://reviewboard.mozilla.org/r/28539/#review25683

I have some questions before stamping it. Looks good otherwise :)

::: testing/mozharness/mozharness/mozilla/testing/testbase.py:641
(Diff revision 2)
> -        if self._is_windows():
> +        if not platform_name:

shouldn't this be `if platform_name:` ?

::: testing/mozharness/mozharness/mozilla/testing/testbase.py:642
(Diff revision 2)
> -            # we use the same minidump binary for 32 and 64 bit windows
> +            tooltool_path = "config/tooltool-manifests/%s/releng.manifest" % platform_name

what about when this is 64 bit windows?

::: testing/mozharness/mozharness/mozilla/testing/testbase.py:653
(Diff revision 2)
> -        if self._is_windows():
> +        if not platform_name:

again I think we want `if platform_name` here

::: testing/mozharness/mozharness/mozilla/testing/testbase.py:655
(Diff revision 2)
> -            return minidump_filename % ('win32',) + '.exe'
> +            if platform_name == 'win32':

again, what about the win64 case?

::: testing/mozharness/mozharness/mozilla/testing/testbase.py:664
(Diff revision 2)
> +        elif self.config.get('minidump_stackwalk_path'):

this is already going to happen once we run query_minidump_filename() a few lines below correct? also, this means we will never download_minidump_stackwalk if we define it via self.config

which I'm not sure we want to do.. ?
Attachment #8700051 - Flags: review?(jlund)
Comment on attachment 8700051 [details]
MozReview Request: Bug 1233716 - Set self.minidump_stackwalk_path if specified in the configs. r=jlund

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/28539/diff/2-3/
Attachment #8700051 - Flags: review?(hskupin)
https://reviewboard.mozilla.org/r/28539/#review25683

> what about when this is 64 bit windows?

The same binary is used for both archs.
Attachment #8700051 - Flags: review?(hskupin) → review?(jlund)
Comment on attachment 8700051 [details]
MozReview Request: Bug 1233716 - Set self.minidump_stackwalk_path if specified in the configs. r=jlund

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/28539/diff/3-4/
Attachment #8700051 - Attachment description: MozReview Request: Bug 1233716 - Set self.minidump_stackwalk_path if specified in the configs. r=whimboo → MozReview Request: Bug 1233716 - Set self.minidump_stackwalk_path if specified in the configs. r=jlund
Attachment #8700051 - Flags: review?(jlund)
Comment on attachment 8700051 [details]
MozReview Request: Bug 1233716 - Set self.minidump_stackwalk_path if specified in the configs. r=jlund

https://reviewboard.mozilla.org/r/28539/#review25691

::: testing/mozharness/mozharness/mozilla/testing/testbase.py:641
(Diff revisions 2 - 4)
> -        if not platform_name:
> +        if platform_name:

I still think you need to special case for when self.platform_name() returns: https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/base/script.py#61

sorry if I'm wrong

::: testing/mozharness/mozharness/mozilla/testing/testbase.py:664
(Diff revisions 2 - 4)
> -        elif self.config.get('minidump_stackwalk_path'):
> +        elif self.config.get('minidump_stackwalk_path') and not c.get('download_minidump_stackwalk'):

I guess my point was do we need this elif block at all? the way I see it is self.query_minidump_filename() (~8 lines below) will do this exact thing for us.
Attachment #8700051 - Flags: review?(jlund)
Assignee: nobody → armenzg
Comment on attachment 8700051 [details]
MozReview Request: Bug 1233716 - Set self.minidump_stackwalk_path if specified in the configs. r=jlund

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/28539/diff/4-5/
Comment on attachment 8700051 [details]
MozReview Request: Bug 1233716 - Set self.minidump_stackwalk_path if specified in the configs. r=jlund

https://reviewboard.mozilla.org/r/28539/#review26497

lgtm :)
Attachment #8700051 - Flags: review?(jlund) → review+
Comment on attachment 8700051 [details]
MozReview Request: Bug 1233716 - Set self.minidump_stackwalk_path if specified in the configs. r=jlund

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/28539/diff/5-6/
In my latest change I've made self.platform_name() a valid function of ScriptMixin which calls the function platform_name() of the same module.

It failed on Android test jobs.
I asked on IRC but I've lost my scrollback.

I only added a small change to what you reviewed.
Let me know if you want to review it again or if fine to land if the try push comes back properly.
https://hg.mozilla.org/integration/mozilla-inbound/rev/1687f6972fe2e546f0b45afa50bebf386a4915be
Bug 1233716 - Set self.minidump_stackwalk_path if specified in the configs. r=jlund
Apparently broke regular Mn tests, too.
Comment on attachment 8700051 [details]
MozReview Request: Bug 1233716 - Set self.minidump_stackwalk_path if specified in the configs. r=jlund

https://reviewboard.mozilla.org/r/28539/#review26653

::: testing/mozharness/mozharness/mozilla/testing/testbase.py:664
(Diff revision 6)
> -        if self._is_windows():
> +        platform_name = self.platform_name()

I think we forgot a TOOLTOOL_PLATFORM_DIR equivalent for minidump stackwalk because platform_name() returns things like 'linux' and 'win64'. both are values we never want..

::: testing/mozharness/mozharness/mozilla/testing/testbase.py:666
(Diff revision 6)
> -            return minidump_filename % ('win32',) + '.exe'
> +            minidump_filename = '%s-minidump_stackwalk' % platform_name

hm, I think I commented about this before but missed it while looking at your last interdiff. Shouldn't this be 'win32' even if the platform is 'win64'?
Attachment #8700051 - Flags: review+
https://reviewboard.mozilla.org/r/28539/#review26653

> I think we forgot a TOOLTOOL_PLATFORM_DIR equivalent for minidump stackwalk because platform_name() returns things like 'linux' and 'win64'. both are values we never want..

Ideally calling self.tooltool_fetch() would actually returning the path to the file downloaded. I will nevertheless fix it as is.
Attachment #8700051 - Flags: review?(jlund)
Comment on attachment 8700051 [details]
MozReview Request: Bug 1233716 - Set self.minidump_stackwalk_path if specified in the configs. r=jlund

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/28539/diff/6-7/
For my own reference:
14:32:19     INFO - Running command: ['/tools/tooltool.py', '--url', 'https://api.pub.build.mozilla.org/tooltool/', '--authentication-file', '/builds/relengapi.tok', 'fetch', '-m', '/builds/slave/test/build/tests/config/tooltool-manifests/linux32/releng.manifest', '-o', '-c', '/builds/tooltool_cache'] in /builds/slave/test/build
14:32:19     INFO - Copy/paste: /tools/tooltool.py --url https://api.pub.build.mozilla.org/tooltool/ --authentication-file /builds/relengapi.tok fetch -m /builds/slave/test/build/tests/config/tooltool-manifests/linux32/releng.manifest -o -c /builds/tooltool_cache
14:32:19     INFO -  INFO - File linux32-minidump_stackwalk retrieved from local cache /builds/tooltool_cache
14:32:19     INFO - Return code: 0
14:32:19  WARNING - minidump stackwalk path was given but couldn't be found. Tried looking in '/builds/slave/test/build/linux-minidump_stackwalk'
Comment on attachment 8700051 [details]
MozReview Request: Bug 1233716 - Set self.minidump_stackwalk_path if specified in the configs. r=jlund

https://reviewboard.mozilla.org/r/28539/#review27325

lgtm

since this has been through a few cycles and I'm paranoid, we should do a try run of affected jobs if not already done.

::: testing/mozharness/mozharness/mozilla/testing/testbase.py:668
(Diff revisions 6 - 7)
> -            minidump_filename = '%s-minidump_stackwalk' % platform_name
> +            minidump_filename = '%s-minidump_stackwalk' % TOOLTOOL_PLATFORM_DIR[platform_name]

maybe this name should be more generic or at least un-tooltool specific
Attachment #8700051 - Flags: review?(jlund) → review+
The try push was triggered via reviewboard.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6b35bd5f86a5&selectedJob=15300388

Last night when I looked it was fine but I will verify once again.
jlund: the results do not show any issues.

I believe TOOLTOOL_PLATFORM_DIR is an adequate name since it is a mapping specific for tooltool configs in tree:
https://dxr.mozilla.org/mozilla-central/source/testing/config/tooltool-manifests

Please let me know if you waive the issue so I can land it.
Flags: needinfo?(jlund)
https://hg.mozilla.org/mozilla-central/rev/e2dc6422362c
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
Flags: needinfo?(jlund)
You need to log in before you can comment on or make changes to this bug.