46 bytes, text/x-phabricator-request
|Details | Review|
Bug 1309172 - Updated the forked Breakpad client sources by manually applying working revisions up to version 7b3afa9258e58a57ffbeb395d445811f92616ae9
46 bytes, text/x-phabricator-request
|Details | Review|
46 bytes, text/x-phabricator-request
|Details | Review|
+++ This bug was initially created as a clone of Bug #1264367 +++ We've run into a bunch of issues when sync'ing breakpad with upstream in bug 1264367 mostly because we had fallen back quite a bit. This is a reminder to update breakpad soon-ish, possibly after having fixed the issues with merging module mappings we encountered, least we fall back again.
I've landed a fix upstream that makes minidump UUIDs truly random on Linux: https://chromium.googlesource.com/breakpad/breakpad/+/a9fca58305afcb16a7ae539d87d672c702aba3f1 This might be a good time to pull from upstream so that we get that change.
Since I've picked up bug 1440282 it's about time we do this.
Assignee: nobody → gsvelto
Status: NEW → ASSIGNED
Created attachment 8983791 [details] [diff] [review] [PATCH 1/4] Apply brekpad revisions which were previously dropped These are revisions 17ad0c18b179c135fc5a3d2bba199c3fa4276035 and 8915f7be39448d9257b6da3ad0233944d1d9a92a which had not been included during the last pull because they caused issues when generating minidumps on Fennec.
Attachment #8983791 - Attachment description: [PATCH] Apply brekpad revisions which were previously dropped → [PATCH 1/4] Apply brekpad revisions which were previously dropped
Created attachment 8983792 [details] [diff] [review] [PATCH 2/4] Updated breakpad to version 7b3afa9258e58a57ffbeb395d445811f92616ae9 This includes only the vanilla files we haven't forked
Created attachment 8983793 [details] [diff] [review] [PATCH 3/4] Updated the forked Breakpad client sources This was done by manually applying the relevant patches all the way up to version 7b3afa9258e58a57ffbeb395d445811f92616ae9. I had to manually fix only a handful of conflicts.
This updates breakpad to match the current upstream. I've used the update-breakpad.sh script to generate attachment 8983793 [details] [diff] [review] and I created attachment 8983793 [details] [diff] [review] by manually applying the changes that were done to the client sources. Ted, do you think this looks sane? It builds correctly and I could capture some minidumps on Linux that show no major differences with the old ones; all the relevant fields seem to be populated correctly. Still, I've got plenty of testing to do though, especially Fennec which regressed the last time we did this.
Here's a try run for all architectures: https://treeherder.mozilla.org/#/jobs?repo=try&revision=889d6879fd3d1b688bea59ca6e0c9a5916d6ca72 I hope it crashes so that I can have a look at the minidumps. I've included all the commits we had to previously back out in the hope that the issues with merging mappings have been sorted out in the meantime.
Created attachment 8984947 [details] [diff] [review] [PATCH 4/4 v2] Build system and tools adjustments The minidump-analyzer needed a Windows-specific fix to handle unloaded modules, a new feature that was introduced in the meantime.
Attachment #8983794 - Attachment is obsolete: true
I've tested the patch-set on automation and there seems to be an issue symbolicating stacks on Linux, the other architectures look OK. This is a try run with a fake crash injected in the xpcshell-test with the patches applied: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b380d5e610559b3412f67c997bdfbb12bdef81a8 And this is without the patches: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b0c7e89906e0b0a925c7befc96d9ab56b17f8cf9
I've just finished analyzing the minidumps for both runs and the issue with Linux appears to be the same we encountered the last time I tried updating breakpad: the libxul.so mappings aren't been merged correctly and two entries for libxul.so are added to the modules list instead of one. This throws of symbolication since one of the two entries has no code or debug identifiers. I'll probable have to revert the problematic commits in breakpad like I did the old time. I'll also file a bug to fix this for good.
Created attachment 8985135 [details] Bug 1309172 - Updated breakpad to version 7b3afa9258e58a57ffbeb395d445811f92616ae9 This includes only the vanilla files we haven't forked
Created attachment 8985136 [details] Bug 1309172 - Updated the forked Breakpad client sources by manually applying working revisions up to version 7b3afa9258e58a57ffbeb395d445811f92616ae9 The following changes were not included as they break merging the segments corresponding to libxul.so in the module list: 8915f7be39448d9257b6da3ad0233944d1d9a92a 17ad0c18b179c135fc5a3d2bba199c3fa4276035 94b6309aecaddfcf11672f6cfad9575d68ad3b40 With these changes applied two entries for libxul.so are generated, the second one is bogus and prevents symbolication from working correctly.
Attachment #8983791 - Attachment is obsolete: true
Attachment #8983792 - Attachment is obsolete: true
Attachment #8983793 - Attachment is obsolete: true
Attachment #8984947 - Attachment is obsolete: true
I've uploaded the patches to Phabricator, since it's the first time I upload a stack of them I hope I did it the right way. The patch-set is split into three parts: the first includes all the changes to the vanilla breakpad sources, the second the changes to our forked code minus the problematic changes on the breakpad side and the third the fixups required to accommodate for the changes. Here's an updated try run with the bogus crash to ensure that symbolication is working correctly: https://treeherder.mozilla.org/#/jobs?repo=try&revision=a79e27eab6e5902feda15dc3169293ea7b1d2000 Ted, do we need to anything on Socorro's side to accommodate for this? I remember you had to update something the previous time we did this but I don't remember what exactly.
Did you include the changes from bug 1464537?
(In reply to Mike Hommey [:glandium] from comment #16) > Did you include the changes from bug 1464537? No, our update script is OK with cherry-picked commits but will remove stuff we haven't pushed upstream. I'll make sure this is included (and possibly other changes we might have missed).
I've updated the patch including the changes from bug 1464537. I'll also try to upstream this so that it doesn't get lost the next time around.
After reading a bit more on bug 1464537 I was wondering if I did the right thing. You mentioned on that bug that  is sufficient for fixing this issue but I'm not sure if it is. Looking at the code it seems that if we fail to demangle the name the resulting name is still cleared so part of your change would still be needed. Or did I get it wrong?  https://chromium.googlesource.com/breakpad/breakpad/+/7398ce15b79daf038cd07fbae4e37e183e99788d
What I said in that bug is that there is one part that was already fixed upstream, and that I filed the other.
FYI I r+ed the other patch in Gerrit, but I can't seem to submit it, perhaps because the patch is in a weird state? glandium said he can't log into the Google account he used to submit the change while he's away from home, so we might need to wait for him to do that.
Comment on attachment 8985135 [details] Bug 1309172 - Updated breakpad to version 7b3afa9258e58a57ffbeb395d445811f92616ae9 Ted Mielczarek [:ted.mielczarek] has approved the revision. https://phabricator.services.mozilla.com/D1640
Attachment #8985135 - Flags: review+
Alright, I'll wait for Mike's change to be upstreamed and then I'll update the patch here so that the revision in GIT-INFO includes it.
Comment on attachment 8985136 [details] Bug 1309172 - Updated the forked Breakpad client sources by manually applying working revisions up to version 7b3afa9258e58a57ffbeb395d445811f92616ae9 Ted Mielczarek [:ted.mielczarek] has approved the revision. https://phabricator.services.mozilla.com/D1641
Attachment #8985136 - Flags: review+
(In reply to Gabriele Svelto [:gsvelto] from comment #23) > Alright, I'll wait for Mike's change to be upstreamed and then I'll update > the patch here so that the revision in GIT-INFO includes it. glandium fixed the patch state and I submitted his patch: https://chromium.googlesource.com/breakpad/breakpad/+/69c2c51dd89965d234eec16e3a9353634831916b
Cool, I'm updating the patches right now.
BTW, I'll have to squash them before landing because they cannot be built separately and if something goes wrong I want to make the backout as painless as possible.
Patch set updated with revision 69c2c51dd89965d234eec16e3a9353634831916b, we're almost good to land.
Comment on attachment 8985137 [details] Bug 1309172 - Build system and tools adjustments Ted Mielczarek [:ted.mielczarek] has approved the revision. https://phabricator.services.mozilla.com/D1642
Attachment #8985137 - Flags: review+
(In reply to Gabriele Svelto [:gsvelto] from comment #15) > Ted, do we need to anything on Socorro's side to accommodate for this? I > remember you had to update something the previous time we did this but I > don't remember what exactly. We only need to coordinate a Socorro update if we update the client code in Firefox such that it changes the generated minidumps in a backwards-incompatible way, or if we update the symbol dumping code such that generated .sym files will change in a backwards-incompatible way. The rev of Breakpad that Socorro's stackwalker is built against is pinned here, FYI: https://github.com/mozilla-services/socorro/blob/3a5b7131ca961d1d37c0a540ed61c9309134f0f7/scripts/build-breakpad.sh#L18 It was last updated in bug 1421988 for exactly those reasons (the symbol format got changed) so we should be in good shape as long as nothing in the client has changed the dump format in an odd way, and it doesn't sound like it has.
I tested submitting dumps to Socorro and they seem to be processed just fine; crashes on try are also properly symbolicated. I'll squash the patches and land them; if I missed something in my testing backing it out should be easy enough.
Pushed by firstname.lastname@example.org: https://hg.mozilla.org/integration/mozilla-inbound/rev/07a62b4e7923 Updated breakpad to version 69c2c51dd89965d234eec16e3a9353634831916b; r=ted.mielczarek
Backout by email@example.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/e96c47860e6e Backed out 1 changesets for breakpad related failures. CLOSED TREE
Backed out 1 changesets (bug 1309172) for breakpad related failures. push that caused the backout: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=07a62b4e792399774c6c056140f1a6bf9102b164&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-classifiedState=unclassified failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=07a62b4e792399774c6c056140f1a6bf9102b164&filter-failure_classification_id=2 backout: https://hg.mozilla.org/integration/mozilla-inbound/rev/e96c47860e6ec24ebfe0c285332deea1ffcb5d87
It seems that fix_stack_using_bpsyms.py is encountering an unexpected character in the symbol file for a PUBLIC entry. Specifically I found these lines: webrtc-gtest.pdb/150FC4DC2A3C4F4D9E084E1B4A5FBF631/webrtc-gtest.sym:PUBLIC m 8f7d93 0 _intrinsic_setjmp xul.pdb/6402337BFF9149C1819558441909A84E1/xul.sym:PUBLIC m 5c49783 0 _intrinsic_setjmp xul.pdb/DEF358122E104788AFDF3C3D4AC442391/xul.sym:PUBLIC m 63f93c3 0 _intrinsic_setjmp mozglue.pdb/87479AA1ED144313BA1FB3194F6CD9931/mozglue.sym:PUBLIC m 47082 0 _intrinsic_setjmp I wonder where that 'm' is coming from and what does it mean. I'll check it out and fix the issue in our scripts.
This is caused by a change in breakpad that was introduced some time ago, here's the relevant issue and patch: https://bugs.chromium.org/p/google-breakpad/issues/detail?id=751 https://chromium.googlesource.com/breakpad/breakpad/+/b1226959a25b6a5311801d6f204b088c706e7c25 I'm trying to figure out if we need changes on our side besides being able to parse the new format.
Oh shoot, right, that was the reason for bug 1421988. Ugh, I just realized that we'll have to fix symbols.mozilla.org as well because it provides the symbolication API that snappy used to.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #37) > Oh shoot, right, that was the reason for bug 1421988. Ugh, I just realized > that we'll have to fix symbols.mozilla.org as well because it provides the > symbolication API that snappy used to. Looking at the generated .sym files the only difference seem to be the _intrinsic_setjmp entries which have this 'multiple' thing added to their line, the rest of the file is identical. Do you know what should be done about it? If I make fix_stack_using_bpsyms.py ignore that character everything still seems to work correctly but I wonder if it would actually need special handling.
I think simply parsing and ignoring the m will be sufficient. It shouldn't change our existing behavior. I don't know exactly what the code does now in the face of multiple functions, but presumably the output is arbitrary based on the ordering in the file.
I've updated the last patch with the fixes to fix_stack_using_bpsyms.py. Win64 mochitests are now passing correctly.
Pushed by firstname.lastname@example.org: https://hg.mozilla.org/integration/mozilla-inbound/rev/5284d9a25ffb Updated breakpad to version 69c2c51dd89965d234eec16e3a9353634831916b; r=ted.mielczarek
Status: ASSIGNED → RESOLVED
Last Resolved: 8 months ago
status-firefox63: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
You need to log in before you can comment on or make changes to this bug.