Closed Bug 693352 Opened 13 years ago Closed 13 years ago

add minidump_stackwalk and symbols to the android automation

Categories

(Release Engineering :: General, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jmaher, Assigned: bear)

References

Details

(Keywords: intermittent-failure, Whiteboard: [mobile_unittests][android_tier_1])

Attachments

(6 files, 4 obsolete files)

now that we have the ability to pull minidumps from the device, we need to add the rest of the infrastructure when we run the harness.

1) set minidump_stackwalk environment variable
2) download the symbols and point add --symbols=<path> to the commandline.
Summary: add stackwalk_minidump and symbols to the android automation → add minidump_stackwalk and symbols to the android automation
We have all the stackwalk_minidump binaries included in talos:
http://hg.mozilla.org/build/talos/file/79963fc07e5d/breakpad,

then we need to set the environment variable MINIDUMP_STACKWALK before running.

Finally, we need to download the symbols from the same directory as the tests and apk come from. Here is an example:
http://stage.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-central-android/1318375139/fennec-10.0a1.en-US.android-arm.crashreporter-symbols.zip

marking this as tier_1 because we are seeing a lot of crashes during the automation.
Whiteboard: [orange][mobile_unittests][android_tier_1]
Assignee: nobody → bear
Blocks: 438871
bear: any chance we can see some motion on this bug this week? Devs are getting restless.
first pass, i'm sure I am missing something, but after looking thru the code twice in an hour it's probably faster to get feedback
Attachment #573554 - Flags: feedback?(aki)
Comment on attachment 573554 [details] [diff] [review]
first pass at adding stackwalk items to Remote* test items

Review of attachment 573554 [details] [diff] [review]:
-----------------------------------------------------------------

just some general feedback on my end-

::: process/factory.py
@@ +286,5 @@
>          'win32': WithProperties('%(toolsdir:-)s/breakpad/win32/minidump_stackwalk.exe'),
>          'win64': WithProperties('%(toolsdir:-)s/breakpad/win64/minidump_stackwalk.exe'),
>          'macosx': WithProperties('%(toolsdir:-)s/breakpad/osx/minidump_stackwalk'),
>          'macosx64': WithProperties('%(toolsdir:-)s/breakpad/osx64/minidump_stackwalk'),
> +        'android':  WithProperties('%(toolsdir:-)s/breakpad/linux/minidump_stackwalk'),

do we run all android tests on a linux foopy?  I thought they were osx for some reason.

@@ +6692,5 @@
> +        if stackwalk_cgi and kwargs.get('downloadSymbols'):
> +            self.env['MINIDUMP_STACKWALK_CGI'] = stackwalk_cgi
> +        else:
> +            self.env['MINIDUMP_STACKWALK'] = getPlatformMinidumpPath(platform)
> +        self.env['MINIDUMP_SAVE_PATH'] = WithProperties('%(basedir:-)s/minidumps')

we need to ensure this works properly with all harnesses.  Probably does because it looks unfamiliar to me and I don't know what I am talking about.
(In reply to Joel Maher (:jmaher) from comment #4)
> Comment on attachment 573554 [details] [diff] [review] [diff] [details] [review]
> first pass at adding stackwalk items to Remote* test items
> 
> Review of attachment 573554 [details] [diff] [review] [diff] [details] [review]:
> -----------------------------------------------------------------
> 
> just some general feedback on my end-
> 
> ::: process/factory.py
> @@ +286,5 @@
> >          'win32': WithProperties('%(toolsdir:-)s/breakpad/win32/minidump_stackwalk.exe'),
> >          'win64': WithProperties('%(toolsdir:-)s/breakpad/win64/minidump_stackwalk.exe'),
> >          'macosx': WithProperties('%(toolsdir:-)s/breakpad/osx/minidump_stackwalk'),
> >          'macosx64': WithProperties('%(toolsdir:-)s/breakpad/osx64/minidump_stackwalk'),
> > +        'android':  WithProperties('%(toolsdir:-)s/breakpad/linux/minidump_stackwalk'),
> 
> do we run all android tests on a linux foopy?  I thought they were osx for
> some reason.

the builds are all linux and the foopies are all osx, but the test runs on the tegras - but to be honest I don't know where stackwalk happens, on the foopy where runtestsremote is or on the tegra where the tests are run?


> 
> @@ +6692,5 @@
> > +        if stackwalk_cgi and kwargs.get('downloadSymbols'):
> > +            self.env['MINIDUMP_STACKWALK_CGI'] = stackwalk_cgi
> > +        else:
> > +            self.env['MINIDUMP_STACKWALK'] = getPlatformMinidumpPath(platform)
> > +        self.env['MINIDUMP_SAVE_PATH'] = WithProperties('%(basedir:-)s/minidumps')
> 
> we need to ensure this works properly with all harnesses.  Probably does
> because it looks unfamiliar to me and I don't know what I am talking about.

right now I only know of what the tegras use, but yea we will be needing to test this broadly
Comment on attachment 573554 [details] [diff] [review]
first pass at adding stackwalk items to Remote* test items

>-        'android': None,
>+        'android':  WithProperties('%(toolsdir:-)s/breakpad/linux/minidump_stackwalk'),

I'm going to guess we do the minidump_stackwalk on the foopy.
Otherwise we probably need an android-specific binary.
Either way I'm pretty sure linux is wrong, but I don't know a lot about this.

> class RemoteUnittestFactory(MozillaTestFactory):
>     def __init__(self, platform, suites, hostUtils, productName='fennec',
>                  downloadSymbols=False, downloadTests=True,
>                  posixBinarySuffix='', remoteExtras=None,
>-                 branchName=None, **kwargs):
>+                 branchName=None, stackwalk_cgi=None, **kwargs):
>         self.suites = suites
>         self.hostUtils = hostUtils
>+        self.stackwalk_cgi = stackwalk_cgi
>+
>+        if stackwalk_cgi and kwargs.get('downloadSymbols'):
>+            self.env['MINIDUMP_STACKWALK_CGI'] = stackwalk_cgi
>+        else:
>+            self.env['MINIDUMP_STACKWALK'] = getPlatformMinidumpPath(platform)
>+        self.env['MINIDUMP_SAVE_PATH'] = WithProperties('%(basedir:-)s/minidumps')
>+        self.env.update(env)

I'm not entirely sure about your logic here.
I'd be surprised if the env with WithProperties works, but maybe it's buildbot magic.
Is there a way for us to download the symbols successfully here?
And I'm not entirely sure how the cgi works, but if we're passing a symbols url instead of symbols, do we need the kwargs.get('downloadSymbols') to be set to set the cgi env var?

>     def addRunTestSteps(self):
>+        if self.stackwalk_cgi and self.downloadSymbols:
>+            symbols_path = '%(symbols_url)s'
>+        else:
>+            symbols_path = 'symbols'

Here as well.
I don't know if we automatically get the symbols and put them in 'symbols'?
Attachment #573554 - Flags: feedback?(aki)
(In reply to Aki Sasaki [:aki] from comment #6)
> I'm going to guess we do the minidump_stackwalk on the foopy.
> Otherwise we probably need an android-specific binary.
> Either way I'm pretty sure linux is wrong, but I don't know a lot about this.

We do run minidump_stackwalk on the desktop machine, not the mobile device. If foopy is where remote*.py get executed, then that's where this would happen.

> > Is there a way for us to download the symbols successfully here?
> And I'm not entirely sure how the cgi works, but if we're passing a symbols
> url instead of symbols, do we need the kwargs.get('downloadSymbols') to be
> set to set the cgi env var?
> 

> Here as well.
> I don't know if we automatically get the symbols and put them in 'symbols'?

Currently, on desktop systems, we download the symbols after downloading the build, and unpack them to "symbols". Once bug 679759 is finished, we'll be able to stop doing that, and pass the URL to the zip file directly to the script, which will download it as-needed.
(In reply to Joel Maher (:jmaher) from comment #0)
> now that we have the ability to pull minidumps from the device, we need to
> add the rest of the infrastructure when we run the harness.
> 
> 1) set minidump_stackwalk environment variable
> 2) download the symbols and point add --symbols=<path> to the commandline.

On Desktop, this is --symbols-path=<path>.

Is it --symbols for Remote testing?
Is that for remote unittests, or for talos, or both?
This appears to be --symbolsPath for remote talos.
I haven't quite figured out the correct option for remote unit tests, though mochitests seems to inherit the desktop options class, which would suggest the symbols option would stay the same.
Bear: this passes checkconfigs and I *think* it'll work.
I would love to see this run in staging for a while, so we can verify that

a) we download symbols on tegra talos and unit test runs
b) we set MINIDUMP_STACKWALK appropriately
c) we set --symbolsPath on remote talos and --symbols-path on remote unit tests.

If we see a crash and see a stack, uber bonus points, but the above 3 points should be enough to resolve this.

I'm going to be slammed with android native work for the next few days, so if you or someone else can set this up, that would be awesome.  Otherwise I'll circle back and try after the android native work settles down.
Attachment #574489 - Flags: feedback?(bear)
Attachment #574489 - Attachment description: [needs testing] Bearadd stackwalk/symbols to remote unittests + talos → [needs testing] add stackwalk/symbols to remote unittests + talos
Attachment #574486 - Flags: review?(bear) → review+
Attachment #574489 - Flags: feedback?(bear) → feedback+
I'll work on getting these into staging tonight so we can see it run overnight/tomorrow.  If any tweaking needs doing I can make the changes after pestering you with questions.
staging tegra master reconfigured with this patch
My bug 701864 patches will bitrot this a bit, but the testing will still be useful.
download_symbols is breaking in staging, but that's probably because it's pointing at a nightly/latest-* directory, where we don't upload symbols.

Both patches need to be updated to support 'android' and 'android-xul' in addition to 'linux-android'.
Attachment #574486 - Attachment is obsolete: true
Attachment #575598 - Flags: review?(bear)
Attachment #575598 - Flags: review?(bear) → review+
This downloads symbols successfully.
I saw a --symbolsPath in perfconfigurator in a talos run; I think I'll see a --symbols-path in a unit test run (but will keep testing).
Attachment #574489 - Attachment is obsolete: true
Attachment #575600 - Flags: review?(bear)
Attachment #575600 - Flags: review?(bear) → review+
Comment on attachment 575608 [details] [diff] [review]
(custom) fix unittest symbol download

Ok, this downloads symbols for unittests properly, and extracts them, and unittests are run with --symbols-path.

HOWEVER, it extracts symbols in build/symbols and the unittests are run in build/test, with --symbols-path=symbols.

We can hack the symbols path to be an abspath, or extract them into build/test/symbols, or run the tests out of build/, or add --symbols-path=../symbols .

I'm kind of leaning towards the last one, which would involve editing all the buildbotcustom/steps/unittest steps to call "--symbols-path=../%s" % symbols_path .
Same with talos: symbols are in test/../talos-data/symbols, we run talos in test/../talos-data/talos/ with --symbolsPath=symbols .
Attachment #575600 - Attachment is obsolete: true
To get bm07 to reconfig, I had to remove mozilla-1.9.1 from the release branch list in master_config.json.

I *think* this is only generated when creating a new master, and isn't auto-updated.
If it is autoupdated, it shouldn't add mozilla-1.9.1 anymore unless there's a duplicate file somewhere that isn't build-tools' production masters json...
(In reply to Aki Sasaki [:aki] from comment #22)
> To get bm07 to reconfig, I had to remove mozilla-1.9.1 from the release
> branch list in master_config.json.
> 
> I *think* this is only generated when creating a new master, and isn't
> auto-updated.
> If it is autoupdated, it shouldn't add mozilla-1.9.1 anymore unless there's
> a duplicate file somewhere that isn't build-tools' production masters json...

Sorry, wrong bug.
Attachment #577475 - Flags: review?(aki)
Attached patch (custom) adjust symbols path (obsolete) — Splinter Review
Attachment #577476 - Flags: review?(aki)
Comment on attachment 577475 [details] [diff] [review]
(configs) adjust symbols path

This looks to be the same as attachment 575598 [details] [diff] [review] :)
Attachment #577475 - Flags: review?(aki) → review+
Comment on attachment 577476 [details] [diff] [review]
(custom) adjust symbols path

>@@ -7368,17 +7393,17 @@ class TalosFactory(RequestSortingBuildFa
>         ))
>         self.addStep(ShellCommand(
>          name="mkdir_symbols",
>          command=['mkdir', 'symbols'],
>          workdir=self.workdirBase,
>         ))
>         self.addStep(UnpackFile(
>          filename=WithProperties("../%(symbolsFile)s"),
>-         workdir="%s/symbols" % self.workdirBase,
>+         workdir="%s/../symbols" % self.workdirBase,

Hm.  I don't think this will work.
You're making workdirBase/symbols in the previous step and then trying to unpack the symbols in to workdirBase/../symbols in this step.
Does this work?

I think you want to edit the remotePerfConfigurator step to use ../symbols if remote == True.

I think the rest of this looks good; do the tests look good?
(Verify the directory that the symbols are extracted into is the same path that the unit tests, or perfconfigurator for talos, reference).

Minusing due to the above workdir comment, plus pending test results, but this is close.
Attachment #577476 - Flags: review?(aki) → review-
(In reply to Aki Sasaki [:aki] from comment #27)
> Comment on attachment 577476 [details] [diff] [review] [diff] [details] [review]
> (custom) adjust symbols path
> 
> >@@ -7368,17 +7393,17 @@ class TalosFactory(RequestSortingBuildFa
> >         ))
> >         self.addStep(ShellCommand(
> >          name="mkdir_symbols",
> >          command=['mkdir', 'symbols'],
> >          workdir=self.workdirBase,
> >         ))
> >         self.addStep(UnpackFile(
> >          filename=WithProperties("../%(symbolsFile)s"),
> >-         workdir="%s/symbols" % self.workdirBase,
> >+         workdir="%s/../symbols" % self.workdirBase,
> 
> Hm.  I don't think this will work.
> You're making workdirBase/symbols in the previous step and then trying to
> unpack the symbols in to workdirBase/../symbols in this step.
> Does this work?

good catch - i've removed it

> 
> I think you want to edit the remotePerfConfigurator step to use ../symbols
> if remote == True.

that step already has ../symbols when useSymbols == True

> 
> I think the rest of this looks good; do the tests look good?
> (Verify the directory that the symbols are extracted into is the same path
> that the unit tests, or perfconfigurator for talos, reference).

the directory has symbols dir looks good and I've verified that the unitests --symbols path points to it

right now talos tests are failing for the build i've selected so I want to change to one that works plus I'm not seeing the talos steps referencing symbols at all so I think I need to tweak the config to set useSymbols

> 
> Minusing due to the above workdir comment, plus pending test results, but
> this is close.
this has run in staging and is waiting for a non-release day window to land (after I get a final review from aki for sanity)
final tweaks and what is currently being tested on my staging server
Attachment #577476 - Attachment is obsolete: true
Attachment #578011 - Flags: review?(aki)
Attachment #578011 - Flags: review?(aki) → review+
Comment on attachment 577475 [details] [diff] [review]
(configs) adjust symbols path

committed changeset 5129:f9fa20d7547d
Attachment #577475 - Flags: checked-in+
Comment on attachment 578011 [details] [diff] [review]
(custom) adjust symbols path

committed changeset 1924:a719acf41723
Attachment #578011 - Flags: checked-in+
landed in production
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Depends on: 732285
Whiteboard: [orange][mobile_unittests][android_tier_1] → [mobile_unittests][android_tier_1]
Product: mozilla.org → Release Engineering
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: