Closed Bug 558947 Opened 10 years ago Closed 9 years ago

generate 64bit mac build symbols and use them for talos/tests

Categories

(Release Engineering :: General, defect, P2)

x86
macOS
defect

Tracking

(blocking2.0 final+)

RESOLVED FIXED
Tracking Status
blocking2.0 --- final+

People

(Reporter: anodelman, Assigned: nthomas)

References

Details

(Whiteboard: fixed-mozilla-2.0b4, other branches still to be tidied up)

Attachments

(5 files, 3 obsolete files)

No description provided.
Depends on: 550335
Will allow 64bit talos testing to occur, even if we don't have symbols to download.
Attachment #438618 - Flags: review?(lsblakk)
Attachment #438618 - Flags: review?(lsblakk) → review+
Comment on attachment 438618 [details] [diff] [review]
[checked in]temporarily disable downloading mac64 build symbols

changeset:   699:8ec0e4c337ab
Attachment #438618 - Attachment description: temporarily disable downloading mac64 build symbols → [checked in]temporarily disable downloading mac64 build symbols
Attachment #438618 - Flags: checked-in+
Priority: -- → P3
Should be checking for 'snowleopard' not 'macosx64'.  Drat.
Attachment #438891 - Flags: review?(lsblakk)
Attachment #438891 - Flags: review?(lsblakk) → review+
Comment on attachment 438891 [details] [diff] [review]
[checked in]temporarily disable downloading mac64 build symbols (take 2)

changeset:   703:eebe4b96186b
Attachment #438891 - Attachment description: temporarily disable downloading mac64 build symbols (take 2) → [checked in]temporarily disable downloading mac64 build symbols (take 2)
Attachment #438891 - Flags: checked-in+
Depends on: 517832
Waiting on google to review jimb's patches.  Then we'll need to update our stackwalker + symbol generation goo.  Then things will start working here.
I only have the vaguest idea of how all this works but we are missing:

- symbol packages generated for mac 64 bit builds
- stackwalker http://hg.mozilla.org/build/tools/file/605b16dc7e05/breakpad/osx - currently we have a stackwalker for osx and we'll need one for osx64
Assignee: nobody → jim
I'm reading comment 6 to mean that you have separate minidump_stackwalk executables for each architecture we support.

The same minidump_stackwalk executable should work for any minidump, regardless of what kind of machine the minidump was written on and the symbol files were generated from.  I just now used a minidump_stackwalk executable built on x86 Linux to successfully get stack traces from minidumps generated on ARM, x86, and x86_64.
The symbol dumpers, however, are platform-specific.
http://hg.mozilla.org/mozilla-central/rev/d39bed41e215 fixes the perma-orange but doesn't fix this bug.
I'm confused by this bug.  What kind of symbol files are (were) being downloaded by process/factory.py?  Were they XCode-generated .dSYM files (Mach-O binary files holding compiler-generated debugging information), or were they Breakpad symbol files (text files whose first lines start with the word "MODULE")?

If it's the former, then this bug has nothing to do with Breakpad or the Mac 64-bit support I've been doing.

If it's the latter, then this bug has nothing to do with bug 550335, listed as a blocker.  That bug is concerned with tools/rb/fix_macosx_stack.py, which processes the stack traces produced by nsTraceRefcnt, which are completely unrelated to Breakpad's stack traces and minidump_stackwalk.
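For illustration, the two kinds of file can be told apart from their first few bytes. This is only a minimal sketch: the helper name `classify_symbol_data` is hypothetical, and it assumes you are looking at the Mach-O file inside a .dSYM bundle rather than the bundle directory itself.

```python
import struct

# Mach-O magic numbers, in both byte orders.
# Note: 0xCAFEBABE is also used by Java class files, so treat a match as a hint only.
MACHO_MAGICS = {
    0xFEEDFACE, 0xCEFAEDFE,  # thin 32-bit
    0xFEEDFACF, 0xCFFAEDFE,  # thin 64-bit
    0xCAFEBABE, 0xBEBAFECA,  # fat (universal)
}


def classify_symbol_data(data: bytes) -> str:
    """Return 'breakpad', 'mach-o', or 'unknown' for the leading bytes of a file."""
    # Breakpad symbol files are text and begin with a MODULE record.
    if data.startswith(b"MODULE "):
        return "breakpad"
    # Mach-O files begin with a 4-byte magic number.
    if len(data) >= 4 and struct.unpack(">I", data[:4])[0] in MACHO_MAGICS:
        return "mach-o"
    return "unknown"
```

Reading the first 16 bytes or so of each file and passing them to the helper is enough for this check.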
(In reply to comment #10)
> I'm confused by this bug.  What kind of symbol files are (were) being
> downloaded by process/factory.py?  Were they XCode-generated .dSYM files (Mach-O
> binary files holding compiler-generated debugging information), or were they
> Breakpad symbol files (text files whose first lines start with the word
> "MODULE")?
> 
> If it's the former, then this bug has nothing to do with Breakpad or the Mac
> 64-bit support I've been doing.
> 
> If it's the latter, then this bug has nothing to do with bug 550335, listed as
> a blocker.  That bug is concerned with tools/rb/fix_macosx_stack.py, which
> processes the stack traces produced by nsTraceRefcnt, which are completely
> unrelated to Breakpad's stack traces and minidump_stackwalk.

The answer is, both.  I'm going to file a bug on this since it's causing us pain elsewhere.  All we really need for our purposes is the breakpad symbols.

Currently 'make buildsymbols' on OSX 10.5 creates a crashreporter-symbols.zip file that contains both the Mach-O binaries (two copies!  and both copies are fat x86/ppc!), and also contains the breakpad symbols.  e.g.

http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx/1271967419/firefox-3.7a5pre.en-US.mac.crashreporter-symbols.zip

Some of these symbols are uploaded via 'make uploadsymbols'.  The entire zip is uploaded to FTP, and then pulled down for talos / unittest runs so that we can get stack traces on a crash.
(In reply to comment #11)
> Some of these symbols are uploaded via 'make uploadsymbols'.  The entire zip is
> uploaded to FTP, and then pulled down for talos / unittest runs so that we can
> get stack traces on a crash.

Could you clarify what you mean here ? AFAICT 'make uploadsymbols' is calling 
 http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/tools/upload_symbols.sh
so it uploads the same info to the symbol and ftp servers.
(In reply to comment #12)
> (In reply to comment #11)
> > Some of these symbols are uploaded via 'make uploadsymbols'.  The entire zip is
> > uploaded to FTP, and then pulled down for talos / unittest runs so that we can
> > get stack traces on a crash.
> 
> Could you clarify what you mean here ? AFAICT 'make uploadsymbols' is calling 
> 
> http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/tools/upload_symbols.sh
> so it uploads the same info to the symbol and ftp servers.

I just meant that I wasn't sure if the symbol server actually consumed both the breakpad and native symbols.
The symbol server stores the native symbols but doesn't use them (they're for developers to download for local debugging). The Talos/unittest machines also only consume the Breakpad format .sym files, not the native files. (Ideally we would just not upload those and save the disk space and transfer time for the testing slaves.)

Anyhow, this bug has one more dependency that might not be filed, which is to make Breakpad actually able to generate minidumps on 64-bit OS X. Once jimb's symbol dumping changes land, and minidump generation gets implemented and lands, then we can turn this on.
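Since the test machines only consume the Breakpad .sym files, the part of a symbols zip they need can be picked out by extension. A minimal sketch, assuming the Breakpad files end in `.sym`; the helper name `breakpad_members` and the archive entries below are made up for the example.

```python
import io
import zipfile


def breakpad_members(zip_file):
    """Return the names of archive members that look like Breakpad .sym files."""
    with zipfile.ZipFile(zip_file) as zf:
        return [name for name in zf.namelist() if name.endswith(".sym")]


# Build a tiny in-memory archive with one Breakpad file and one native file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("XUL/0123ABCD/XUL.sym", "MODULE mac x86_64 0123ABCD XUL\n")
    zf.writestr("XUL.dSYM/Contents/Resources/DWARF/XUL", b"\xcf\xfa\xed\xfe")

print(breakpad_members(buf))
```

Anything the filter leaves out is the native debug information that only developers download for local debugging.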
I *think* this is a Core:BuildConfig bug?

(In reply to comment #14)
> Anyhow, this bug has one more dependency that might not be filed, which is to
> make Breakpad actually able to generate minidumps on 64-bit OS X. 

Ted/Jimb: Can you file a bug for this missing depbug?
Component: Release Engineering → Build Config
Product: mozilla.org → Core
QA Contact: release → build-config
Version: other → unspecified
That was bug 548025 which is already fixed upstream.
Component: Build Config → Breakpad Integration
Depends on: 548025
No longer depends on: 562081
Product: Core → Toolkit
QA Contact: build-config → breakpad.integration
nom'd because we need this for shipping 10.6 64bit builds.
blocking2.0: --- → ?
blocking2.0: ? → beta1+
jim: any update on this blocker for beta1?
Assignee: jim → nobody
(In reply to comment #18)
> jim: any update on this blocker for beta1?

This shouldn't have been assigned to jim, per discussion following the dev meeting 8 days ago.
Taking, I'm going to clean up Jim's patches and get them landed.
Assignee: nobody → ted.mielczarek
No longer depends on: 550335
Ted, can you get that done by 6/22?
No, I'm off today. I have all the patches updated, I was just fighting the Google issue tracker on Friday so I didn't get final review on one last patch. I can get that sorted out and landed this week.
--> beta2+, we decided this doesn't need to block the first beta release.
blocking2.0: beta1+ → beta2+
This all landed upstream:
http://code.google.com/p/google-breakpad/source/list
r610 - r619

Syncing to upstream is tracked in bug 567424.
No longer blocks: 567424
Back over to RelEng. Breakpad is enabled on x86-64 OS X builds now, so you should be able to start building and uploading symbols.
Assignee: ted.mielczarek → nobody
Component: Breakpad Integration → Release Engineering
Product: Toolkit → mozilla.org
QA Contact: breakpad.integration → release
Version: unspecified → other
Moving to beta3 since this looks like it's FIXED.
blocking2.0: beta2+ → beta3+
blocking2.0: beta3+ → beta4+
Hi everyone,

This is a beta4 blocker, but no progress appears to have been made on it since it was moved from a beta3 blocker to a beta4 blocker.  Can we please get an update?
Turning on symbols in the build (bug 576053) is in the landing queue. I'll check what we need to do here.
We'll automatically build and upload symbols now that bug 576053 is landed again (sure hope that sticks this time). 

We set 
 MINIDUMP_STACKWALK=/path/to/tools/breakpad/osx64/minidump_stackwalk
in unit test runs. I've added a symlink to the tools repo to make that work, http://hg.mozilla.org/build/tools/rev/980e354c1644.

If we start downloading symbols by adjusting,
  http://hg.mozilla.org/build/buildbotcustom/file/c568cf50e7f3/process/factory.py#l6450
Talos may just work. We have
 http://mxr.mozilla.org/mozilla/source/testing/performance/talos/ttest.py#146
and the osx minidump_stackwalk will run fine on 64bit hosts (jimb's comment earlier).
Also need to remove
  http://hg.mozilla.org/build/buildbot-configs/file/f3c1cc7275db/mozilla-tests/config.py#l195
to get unit test to download symbols.

I'll pick this up next week if no-one beats me to it.
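For reference, a rough sketch of how a harness could use that variable to turn a minidump into a stack. This is not the actual talos or unit test code. The helper names are hypothetical, and it assumes the usual `minidump_stackwalk <minidump> <symbols dir>` invocation.

```python
import os
import subprocess


def stackwalk_command(minidump_path, symbols_dir, env=os.environ):
    """Build the argv for minidump_stackwalk, or None if MINIDUMP_STACKWALK is unset."""
    stackwalk = env.get("MINIDUMP_STACKWALK")
    if not stackwalk:
        return None
    return [stackwalk, minidump_path, symbols_dir]


def walk_stack(minidump_path, symbols_dir):
    """Run minidump_stackwalk and return its stdout, or None if it is unavailable."""
    cmd = stackwalk_command(minidump_path, symbols_dir)
    if cmd is None or not os.path.exists(cmd[0]):
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout
```

Because the same executable handles minidumps from any architecture, the osx64 path can simply point at the existing osx binary, which is what the symlink does.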
Summary: generate 64bit mac build symbols → generate 64bit mac build symbols and use them for talos/tests
Attached patch [WIP] buildbot-configs (obsolete) — Splinter Review
Should enable symbol upload for nightlies, and download for unit tests. Untested.
Download symbols for talos. Updated revert of attachment 438891 [details] [diff] [review].
Attached patch buildbot-configs v2 (obsolete) — Splinter Review
There are two parts to this patch. 

In mozilla/config.py I'm enabling uploading symbols to the symbol server for 64bit mac nightlies globally. That's OK for branches which didn't merge m-c since Friday because 'make uploadsymbols' exits gracefully if there is nothing to do. It's also setting MOZ_SYMBOLS_EXTRA_BUILDID so that the manifests have unique names for branch and platform (which makes cleanup possible later). Also fixes linux64 for e10s, which is out of step.

In mozilla-tests/config.py I'm removing the default of not downloading symbols for mac64 unit tests, and moving it to branches that haven't merged from m-c yet. We will remove those lines later and I've done this to prevent red in the meantime. I figure shadow and 2.0 will get new enough m-c when they go live. This should land after try has all the slaves updated. I've verified symbol download and unpacking works OK in staging. 

Downloading symbols for talos runs is the buildbotcustom patch.
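To illustrate why MOZ_SYMBOLS_EXTRA_BUILDID keeps manifests apart, here is a sketch of the naming pattern. It is inferred only from the manifest name reported later in this bug (firefox-4.0b4pre-Darwin-20100816172326-macosx64-symbols.txt), so treat the field order as an assumption. The helper `manifest_name` is hypothetical and is not the build system's code.

```python
def manifest_name(product, version, os_name, buildid, extra_buildid=""):
    """Join the fields with '-' and append the '-symbols.txt' suffix."""
    parts = [product, version, os_name, buildid]
    # The extra build id is only added when set, so two platforms that share
    # a build id still end up with distinct manifest names.
    if extra_buildid:
        parts.append(extra_buildid)
    return "-".join(parts) + "-symbols.txt"


print(manifest_name("firefox", "4.0b4pre", "Darwin", "20100816172326", "macosx64"))
```

Without the extra field, the 32-bit and 64-bit mac builds could produce the same manifest name, which is what makes later cleanup hard.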
Attachment #465629 - Attachment is obsolete: true
Attachment #466261 - Flags: review?(catlee)
As before, enables downloading symbols for talos runs on snow leopard. I've taken the opportunity to use a config var instead of a hardwired OS list in buildbotcustom.  This patch has the surprising effect of setting useSymbols to False on MozillaUpdateConfig, for w7x64 on all branches and macosx64 on branches without symbols yet. This seems like the correct behavior to me.
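Roughly, the idea of a per-platform config var looks like the sketch below. The dictionary layout, the `download_symbols` key and the `use_symbols` helper are hypothetical names for illustration, not the actual buildbot-configs or buildbotcustom code.

```python
# Hypothetical per-platform settings; real configs carry many more keys.
PLATFORMS = {
    "leopard": {"download_symbols": True},
    "snowleopard": {"download_symbols": True},
    "w7x64": {"download_symbols": False},
}


def use_symbols(platform, platforms=PLATFORMS):
    """Look up the flag in config instead of testing against a hardwired OS list."""
    # Unknown platforms default to False, so no symbol download is attempted.
    return platforms.get(platform, {}).get("download_symbols", False)
```

A branch that has no symbols yet would then override the flag in its own config rather than needing a code change.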
Attachment #465631 - Attachment is obsolete: true
Attachment #466262 - Flags: review?(catlee)
So having downloaded symbols in a talos/unit run, how would I go about verifying that stacks are produced on a crash? I tried sending SIGSEGV and SIGABRT to the firefox-bin process but that didn't work, although talos did report it going away. Similarly, coercing talos to install crashme-new.xpi and using it while tp4 was running was not successful.

I'd take this bug, but I'm going to be away for a few days and it'll need some help to shepherd the changes in.
I don't have a way to trigger a crash from outside the process on OS X. Using "Crash Me Now!" would be the only way to test this, currently, short of adding an intentional crash to the code that you're building.
I do for the next few hours, then I'm looking for a RelEng volunteer.

Updated buildbot-configs patch coming for uploading release symbols.
Assignee: nobody → nrthomas
Priority: P3 → P2
In addition to v2 also sets symbol upload for 64bit release builds. I believe we are doing tests on the minis so the changes in mozilla-test apply to releases too.
Attachment #466261 - Attachment is obsolete: true
Attachment #466454 - Flags: review?(catlee)
Attachment #466261 - Flags: review?(catlee)
Attachment #466262 - Flags: review?(catlee) → review+
Attachment #466454 - Flags: review?(catlee) → review+
Comment on attachment 466454 [details] [diff] [review]
buildbot-configs v3

Landed the build side as 
 http://hg.mozilla.org/build/buildbot-configs/rev/68d4977f1859

pm01/03 reconfig'd.
Comment on attachment 466262 [details] [diff] [review]
buildbotcustom - re-enable talos symbol download

http://hg.mozilla.org/build/buildbotcustom/rev/15d76843671d
Attachment #466262 - Flags: checked-in+
Comment on attachment 466454 [details] [diff] [review]
buildbot-configs v3

Landed the tests part as 
  http://hg.mozilla.org/build/buildbot-configs/rev/2dbf53dca1ec
Attachment #466454 - Flags: checked-in+
Bustage fix for try tests
  http://hg.mozilla.org/build/buildbot-configs/rev/2ba2f161118e

Haven't enabled symbols there yet.
Do we think this is a "few more hours" or "few more days" left thing? Really hard to tell from the bug comments; looks like there's progress!
Sorry, I ran out of time to update this before starting a couple of days off. The current status is that symbols are enabled on mozilla-central and several other branches (bug 576053; not tracemonkey, jaegermonkey, places, or cedar until they merge from m-c). We are downloading and unpacking symbols for talos and unit tests on the same branches. On try we are creating symbols but not using them for talos/tests. But ....

I did a m-c nightly after all those changes, that build is at http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2010-08-16-17-mozilla-central/firefox-4.0b4pre.en-US.mac64.dmg
It uploaded symbols OK, the manifest in the socorro symbol store is firefox-4.0b4pre-Darwin-20100816172326-macosx64-symbols.txt. When I used crashme-new.xpi to induce a crash I got
 http://crash-stats.mozilla.com/report/index/d6dbae97-07ef-4fbf-9f5d-a64e62100816
which *does not* have the symbols resolved. For comparison this is linux
 http://crash-stats.mozilla.com/report/index/17f75018-9ebf-4716-bcca-f544b2100816

I don't know what the problem is, whether the symbols are bogus or if the issue is in Socorro. Ted would probably be able to figure that out the fastest.
Then again, neither of the crashes above resolve the libxul reference at Frame 1, and if I generate an ObjC exception with crashme I get
 http://crash-stats.mozilla.com/report/index/1ef020e9-da29-4399-80b4-3fa102100816
And that has lots of frames resolved in XUL. Success!

So this is fixed for Fx4.0b4, but other branches are not finished, so I don't want to resolve this FIXED yet.
Whiteboard: fixed-mozilla-2.0b4, other branches still to be tidied up
No longer blocking beta 4, then.
blocking2.0: beta4+ → final+
tracemonkey, jaegermonkey, places, cedar, and try now all produce symbol zips for dep builds (and for nightlies, where those are used). So we can enable download of those symbols for talos and unit test jobs.
Attachment #469207 - Flags: review?(catlee)
Attachment #469207 - Flags: review?(catlee) → review+
Blocks: 590208
Comment on attachment 469207 [details] [diff] [review]
[buildbot-configs] enable symbol download for talos/tests on remaining branches

changeset:   2910:2461bc778d70
Attachment #469207 - Flags: checked-in+
Based on the last patch being landed here I think this is RESO FIXED now.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
I now see beautiful assertion stacks on Mac64 debug crashtest.  gj everyone!
Duplicate of this bug: 559623
Product: mozilla.org → Release Engineering