Closed Bug 1113460 Opened 5 years ago Closed 5 years ago

test jobs that use minidump stackwalk should get it from tooltool and not tools repo

Categories

(Release Engineering :: Applications: MozharnessCore, defect)

x86
macOS
defect
Not set

Tracking

(firefox34 fixed, firefox35 fixed, firefox36 fixed, firefox37 fixed, firefox38 fixed, firefox-esr31 fixed, b2g-v1.4 fixed, b2g-v2.0 fixed, b2g-v2.1 fixed)

RESOLVED FIXED
Tracking Status
firefox34 --- fixed
firefox35 --- fixed
firefox36 --- fixed
firefox37 --- fixed
firefox38 --- fixed
firefox-esr31 --- fixed
b2g-v1.4 --- fixed
b2g-v2.0 --- fixed
b2g-v2.1 --- fixed

People

(Reporter: jlund, Assigned: jlund)

References

Details

Attachments

(2 files, 1 obsolete file)

in an effort to reduce how many times we rm and clone our tools repo, it seems there are many jobs that barely even need the tools repo at all.

a common use for rm/cloning tools on every job is for the minidump stackwalk bits. turns out this stretches across ff, fennec, and b2g tests and there are a few different ways the tests go about doing that, depending on the platform and test suite.

this bug will:

1) get such test jobs to avail of tooltool to grab the required minidumps instead of using tools repo
2) standardize how that is done
it looks like all of the linux64/osx64 gecko tests and the android 2.3 (linux64 based) tests use the wrong minidump stackwalk binary. platform.architecture() actually returns a tuple: (bits, linkage) so lines like this[1] will always fail

will have to address that in this bug too

[1] http://mxr.mozilla.org/build/source/mozharness/mozharness/mozilla/testing/testbase.py?rev=4a83adea2532#561
IIUC, marionette windows jobs don't use MINIDUMP_STACKWALK at all.

log example[1]

I think that's because download_minidump_stackwalk[2] is not defined for the windows case[3].

jgriffin, before I add it, just a sanity check that it wasn't purposely being skipped for windows


[1] https://tbpl.mozilla.org/php/getParsedLog.php?id=54830091&full=1&branch=mozilla-central
[2] http://mxr.mozilla.org/build/source/mozharness/configs/marionette/prod_config.py?rev=3a32e88478ff#40
[3] http://mxr.mozilla.org/build/source/mozharness/configs/marionette/windows_config.py
Flags: needinfo?(jgriffin)
It wasn't intentional, just an oversight.
Flags: needinfo?(jgriffin)
adds tooltool manifests. the incoming mozharness patch until this lands across all trees
Attachment #8540451 - Flags: review?(jgriffin)
see previous attachment for associated m-c manifest patch.

there are _many_ scripts and platforms this touches. I think this try push covers them all: https://tbpl.mozilla.org/?tree=Try&rev=e2dddcfda2b9

jgriffin: there are not many in releng who would know these parts of the infra so I'm asking you for the review. please bounce it to someone else if you don't think you're appropriate for it :)
Attachment #8540455 - Flags: review?(jgriffin)
Comment on attachment 8540451 [details] [diff] [review]
141222_bug_1113460_use_tooltool_for_minidump-m-c.patch

Review of attachment 8540451 [details] [diff] [review]:
-----------------------------------------------------------------

rubber stamp!
Attachment #8540451 - Flags: review?(jgriffin) → review+
Comment on attachment 8540455 [details] [diff] [review]
141222_bug_1113460_use_tooltool_for_minidump-mozharness.patch

Review of attachment 8540455 [details] [diff] [review]:
-----------------------------------------------------------------

Nice, this looks much more sane.

::: mozharness/mozilla/testing/testbase.py
@@ +588,5 @@
> +                    output_dir=dirs['abs_work_dir'],
> +                    cache=c.get('tooltool_cache')
> +                )
> +            except KeyError:
> +                self.error('missing required: c["tooltool_servers"]')

Is this the only thing that can throw KeyError?  If not, better to make the error message more dynamic to avoid confusion later.  Also, should we return if we hit this?

@@ +595,5 @@
> +                                             minidump_stackwalk_path)
> +            if os.path.exists(abs_minidump_path):
> +                self.minidump_stackwalk_path = abs_minidump_path
> +            else:
> +                self.warning("minidump stackwalk path was given but couldn't be determined. "

maybe s/determined/found ?
Attachment #8540455 - Flags: review?(jgriffin) → review+
(In reply to Jonathan Griffin (:jgriffin) from comment #7)
> Comment on attachment 8540455 [details] [diff] [review]
> 141222_bug_1113460_use_tooltool_for_minidump-mozharness.patch
> 
> Review of attachment 8540455 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Nice, this looks much more sane.
> 
> ::: mozharness/mozilla/testing/testbase.py
> @@ +588,5 @@
> > +                    output_dir=dirs['abs_work_dir'],
> > +                    cache=c.get('tooltool_cache')
> > +                )
> > +            except KeyError:
> > +                self.error('missing required: c["tooltool_servers"]')
> 
> Is this the only thing that can throw KeyError?  If not, better to make the
> error message more dynamic to avoid confusion later.  Also, should we return
> if we hit this?

that except key list used to be much bigger but I *think* it is just tooltool servers that is a hard requirement. I'll beef up the error message suggesting 'a' key is missing and that to check if tooltool_servers exists in self.config. fixed

> 
> @@ +595,5 @@
> > +                                             minidump_stackwalk_path)
> > +            if os.path.exists(abs_minidump_path):
> > +                self.minidump_stackwalk_path = abs_minidump_path
> > +            else:
> > +                self.warning("minidump stackwalk path was given but couldn't be determined. "
> 
> maybe s/determined/found ?

thanks, makes sense. fixed
Comment on attachment 8540451 [details] [diff] [review]
141222_bug_1113460_use_tooltool_for_minidump-m-c.patch

thanks!

inbound: https://hg.mozilla.org/integration/mozilla-inbound/rev/927ca48b5f29
Attachment #8540451 - Flags: checked-in+
Keywords: leave-open
Comment on attachment 8540455 [details] [diff] [review]
141222_bug_1113460_use_tooltool_for_minidump-mozharness.patch

on default: https://hg.mozilla.org/build/mozharness/rev/88c4800a33cb

interdiff from review: http://people.mozilla.org/~jlund/141230_1113460_use_tooltool_for_minidump-mozharness-interdiff.patch

manifests should be uplifted everywhere now. There will likely be more than one twig branch that is behind m-c and will burn jobs until it merges.

here is one final try push that is against canonical mozharness but the default branch (since we haven't merged to prod yet). I'm doing a full run through as I'm paranoid about talos. it shouldn't be affected but since talos inherits TestMixin, it be good to test that too: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=c55b58575854

finally, I've cc'd buildduty and sheriffs since this is quite an invasive patch
Attachment #8540455 - Flags: checked-in+
mozharness has been merged to production. patches are live :)
Depends on: 1117313
backed out for reasons stated here https://bugzilla.mozilla.org/show_bug.cgi?id=1117313#c1

will follow up when I am back from PTO
Assignee: nobody → jlund
we are having hg.m.o issues again. while the cause is unknown, tackling this item should help alleviate load.

This patch is virtually the same but it does add tooltool bits for the newly created lucid tests and it also 'chmod +x' the stackwalk binary so we don't hit bug 1117313 again.

here is an interdiff of I think the only changes from last time (aside from some bitrot merge conflicts): http://people.mozilla.org/~jlund/150318_bug_1113460_use_tooltool_for_minidump-mh-interdiff.patch
Attachment #8540455 - Attachment is obsolete: true
Attachment #8579505 - Flags: review?(jgriffin)
Comment on attachment 8579505 [details] [diff] [review]
150318_bug_1113460_use_tooltool_for_minidump-mh.patch

Review of attachment 8579505 [details] [diff] [review]:
-----------------------------------------------------------------

r+ based on past review and the interdiff
Attachment #8579505 - Flags: review?(jgriffin) → review+
Comment on attachment 8579505 [details] [diff] [review]
150318_bug_1113460_use_tooltool_for_minidump-mh.patch

on default: 
remote:   https://hg.mozilla.org/build/mozharness/rev/5615101b3a40

merged to prod:
remote:   https://hg.mozilla.org/build/mozharness/rev/75c435ef19ca

mozharness.json rev bump on m-c:
remote:   https://hg.mozilla.org/mozilla-central/rev/2e2222a40262
Attachment #8579505 - Flags: checked-in+
sweet. thanks! I think we are all done here for now
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
I'm guessing this is why we haven't been getting usable stacks for most Windows crashes in automation lately, i.e.
https://treeherder.mozilla.org/logviewer.html#?job_id=705366&repo=mozilla-aurora
Flags: needinfo?(jlund)
commented in 1146855
Flags: needinfo?(jlund)
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in before you can comment on or make changes to this bug.