Status

()

RESOLVED FIXED
2 years ago
8 days ago

People

(Reporter: gsvelto, Assigned: gsvelto)

Tracking

(Depends on: 1 bug)

unspecified
mozilla63
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox63 fixed)

Details

Attachments

(3 attachments, 5 obsolete attachments)

(Assignee)

Description

2 years ago
+++ This bug was initially created as a clone of Bug #1264367 +++

We've run into a bunch of issues when sync'ing breakpad with upstream in bug 1264367 mostly because we had fallen back quite a bit. This is a reminder to update breakpad soon-ish, possibly after having fixed the issues with merging module mappings we encountered, least we fall back again.
(Assignee)

Updated

2 years ago
Blocks: 1316159
(Assignee)

Comment 1

2 years ago
I've landed a fix upstream that makes minidump UUIDs truly random on Linux:

https://chromium.googlesource.com/breakpad/breakpad/+/a9fca58305afcb16a7ae539d87d672c702aba3f1

This might be a good time to pull from upstream so that we get that change.
(Assignee)

Updated

a year ago
Depends on: 1421988
(Assignee)

Comment 2

9 months ago
Since I've picked up bug 1440282 it's about time we do this.
(Assignee)

Updated

9 months ago
Assignee: nobody → gsvelto
Status: NEW → ASSIGNED
(Assignee)

Comment 3

9 months ago
Created attachment 8983791 [details] [diff] [review]
[PATCH 1/4] Apply brekpad revisions which were previously dropped

These are revisions 17ad0c18b179c135fc5a3d2bba199c3fa4276035 and
8915f7be39448d9257b6da3ad0233944d1d9a92a which had not been included during
the last pull because they caused issues when generating minidumps on Fennec.
(Assignee)

Updated

9 months ago
Attachment #8983791 - Attachment description: [PATCH] Apply brekpad revisions which were previously dropped → [PATCH 1/4] Apply brekpad revisions which were previously dropped
(Assignee)

Comment 4

9 months ago
Created attachment 8983792 [details] [diff] [review]
[PATCH 2/4] Updated breakpad to version 7b3afa9258e58a57ffbeb395d445811f92616ae9

This includes only the vanilla files we haven't forked
(Assignee)

Comment 5

9 months ago
Created attachment 8983793 [details] [diff] [review]
[PATCH 3/4] Updated the forked Breakpad client sources

This was done by manually applying the relevant patches all the way up to version 7b3afa9258e58a57ffbeb395d445811f92616ae9. I had to manually fix only a handful of conflicts.
(Assignee)

Comment 6

9 months ago
Created attachment 8983794 [details] [diff] [review]
[PATCH 4/4] Build system adjustments
(Assignee)

Comment 7

9 months ago
This updates breakpad to match the current upstream. I've used the update-breakpad.sh script to generate attachment 8983793 [details] [diff] [review] and I created attachment 8983793 [details] [diff] [review] by manually applying the changes that were done to the client sources. Ted, do you think this looks sane?

It builds correctly and I could capture some minidumps on Linux that show no major differences with the old ones; all the relevant fields seem to be populated correctly. Still, I've got plenty of testing to do though, especially Fennec which regressed the last time we did this.
Flags: needinfo?(ted)
(Assignee)

Comment 8

8 months ago
Here's a try run for all architectures: https://treeherder.mozilla.org/#/jobs?repo=try&revision=889d6879fd3d1b688bea59ca6e0c9a5916d6ca72

I hope it crashes so that I can have a look at the minidumps. I've included all the commits we had to previously back out in the hope that the issues with merging mappings have been sorted out in the meantime.
(Assignee)

Comment 9

8 months ago
Created attachment 8984947 [details] [diff] [review]
[PATCH 4/4 v2] Build system and tools adjustments

The minidump-analyzer needed a Windows-specific fix to handle unloaded modules, a new feature that was introduced in the meantime.
Attachment #8983794 - Attachment is obsolete: true
(Assignee)

Comment 10

8 months ago
I've tested the patch-set on automation and there seems to be an issue symbolicating stacks on Linux, the other architectures look OK. This is a try run with a fake crash injected in the xpcshell-test with the patches applied:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b380d5e610559b3412f67c997bdfbb12bdef81a8

And this is without the patches:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b0c7e89906e0b0a925c7befc96d9ab56b17f8cf9
(Assignee)

Comment 11

8 months ago
I've just finished analyzing the minidumps for both runs and the issue with Linux appears to be the same we encountered the last time I tried updating breakpad: the libxul.so mappings aren't been merged correctly and two entries for libxul.so are added to the modules list instead of one. This throws of symbolication since one of the two entries has no code or debug identifiers. I'll probable have to revert the problematic commits in breakpad like I did the old time. I'll also file a bug to fix this for good.
Created attachment 8985135 [details]
Bug 1309172 - Updated breakpad to version 7b3afa9258e58a57ffbeb395d445811f92616ae9

This includes only the vanilla files we haven't forked
Created attachment 8985136 [details]
Bug 1309172 - Updated the forked Breakpad client sources by manually applying working revisions up to version 7b3afa9258e58a57ffbeb395d445811f92616ae9

The following changes were not included as they break merging the segments
corresponding to libxul.so in the module list:

8915f7be39448d9257b6da3ad0233944d1d9a92a
17ad0c18b179c135fc5a3d2bba199c3fa4276035
94b6309aecaddfcf11672f6cfad9575d68ad3b40

With these changes applied two entries for libxul.so are generated, the second
one is bogus and prevents symbolication from working correctly.
Created attachment 8985137 [details]
Bug 1309172 - Build system and tools adjustments
(Assignee)

Updated

8 months ago
Attachment #8983791 - Attachment is obsolete: true
(Assignee)

Updated

8 months ago
Attachment #8983792 - Attachment is obsolete: true
(Assignee)

Updated

8 months ago
Attachment #8983793 - Attachment is obsolete: true
(Assignee)

Updated

8 months ago
Attachment #8984947 - Attachment is obsolete: true
(Assignee)

Comment 15

8 months ago
I've uploaded the patches to Phabricator, since it's the first time I upload a stack of them I hope I did it the right way. The patch-set is split into three parts: the first includes all the changes to the vanilla breakpad sources, the second the changes to our forked code minus the problematic changes on the breakpad side and the third the fixups required to accommodate for the changes.

Here's an updated try run with the bogus crash to ensure that symbolication is working correctly:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a79e27eab6e5902feda15dc3169293ea7b1d2000

Ted, do we need to anything on Socorro's side to accommodate for this? I remember you had to update something the previous time we did this but I don't remember what exactly.
Did you include the changes from bug 1464537?
(Assignee)

Comment 17

8 months ago
(In reply to Mike Hommey [:glandium] from comment #16)
> Did you include the changes from bug 1464537?

No, our update script is OK with cherry-picked commits but will remove stuff we haven't pushed upstream. I'll make sure this is included (and possibly other changes we might have missed).
(Assignee)

Comment 18

8 months ago
I've updated the patch including the changes from bug 1464537. I'll also try to upstream this so that it doesn't get lost the next time around.
(Assignee)

Comment 19

8 months ago
After reading a bit more on bug 1464537 I was wondering if I did the right thing. You mentioned on that bug that [1] is sufficient for fixing this issue but I'm not sure if it is. Looking at the code it seems that if we fail to demangle the name the resulting name is still cleared so part of your change would still be needed. Or did I get it wrong?

[1] https://chromium.googlesource.com/breakpad/breakpad/+/7398ce15b79daf038cd07fbae4e37e183e99788d
Flags: needinfo?(mh+mozilla)
What I said in that bug is that there is one part that was already fixed upstream, and that I filed the other.
Flags: needinfo?(mh+mozilla)
FYI I r+ed the other patch in Gerrit, but I can't seem to submit it, perhaps because the patch is in a weird state? glandium said he can't log into the Google account he used to submit the change while he's away from home, so we might need to wait for him to do that.
Flags: needinfo?(ted)
Comment on attachment 8985135 [details]
Bug 1309172 - Updated breakpad to version 7b3afa9258e58a57ffbeb395d445811f92616ae9

Ted Mielczarek [:ted.mielczarek] has approved the revision.

https://phabricator.services.mozilla.com/D1640
Attachment #8985135 - Flags: review+
(Assignee)

Comment 23

8 months ago
Alright, I'll wait for Mike's change to be upstreamed and then I'll update the patch here so that the revision in GIT-INFO includes it.
Comment on attachment 8985136 [details]
Bug 1309172 - Updated the forked Breakpad client sources by manually applying working revisions up to version 7b3afa9258e58a57ffbeb395d445811f92616ae9

Ted Mielczarek [:ted.mielczarek] has approved the revision.

https://phabricator.services.mozilla.com/D1641
Attachment #8985136 - Flags: review+
(In reply to Gabriele Svelto [:gsvelto] from comment #23)
> Alright, I'll wait for Mike's change to be upstreamed and then I'll update
> the patch here so that the revision in GIT-INFO includes it.

glandium fixed the patch state and I submitted his patch:
https://chromium.googlesource.com/breakpad/breakpad/+/69c2c51dd89965d234eec16e3a9353634831916b
(Assignee)

Comment 26

8 months ago
Cool, I'm updating the patches right now.
(Assignee)

Comment 27

8 months ago
BTW, I'll have to squash them before landing because they cannot be built separately and if something goes wrong I want to make the backout as painless as possible.
(Assignee)

Comment 28

8 months ago
Patch set updated with revision 69c2c51dd89965d234eec16e3a9353634831916b, we're almost good to land.
Comment on attachment 8985137 [details]
Bug 1309172 - Build system and tools adjustments

Ted Mielczarek [:ted.mielczarek] has approved the revision.

https://phabricator.services.mozilla.com/D1642
Attachment #8985137 - Flags: review+
(In reply to Gabriele Svelto [:gsvelto] from comment #15)
> Ted, do we need to anything on Socorro's side to accommodate for this? I
> remember you had to update something the previous time we did this but I
> don't remember what exactly.

We only need to coordinate a Socorro update if we update the client code in Firefox such that it changes the generated minidumps in a backwards-incompatible way, or if we update the symbol dumping code such that generated .sym files will change in a backwards-incompatible way.

The rev of Breakpad that Socorro's stackwalker is built against is pinned here, FYI:
https://github.com/mozilla-services/socorro/blob/3a5b7131ca961d1d37c0a540ed61c9309134f0f7/scripts/build-breakpad.sh#L18

It was last updated in bug 1421988 for exactly those reasons (the symbol format got changed) so we should be in good shape as long as nothing in the client has changed the dump format in an odd way, and it doesn't sound like it has.
(Assignee)

Comment 31

8 months ago
I tested submitting dumps to Socorro and they seem to be processed just fine; crashes on try are also properly symbolicated. I'll squash the patches and land them; if I missed something in my testing backing it out should be easy enough.

Comment 32

8 months ago
Pushed by gsvelto@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/07a62b4e7923
Updated breakpad to version 69c2c51dd89965d234eec16e3a9353634831916b; r=ted.mielczarek

Comment 33

8 months ago
Backout by aciure@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e96c47860e6e
Backed out 1 changesets for breakpad related failures. CLOSED TREE
(Assignee)

Comment 35

8 months ago
It seems that fix_stack_using_bpsyms.py is encountering an unexpected character in the symbol file for a PUBLIC entry. Specifically I found these lines:

webrtc-gtest.pdb/150FC4DC2A3C4F4D9E084E1B4A5FBF631/webrtc-gtest.sym:PUBLIC m 8f7d93 0 _intrinsic_setjmp
xul.pdb/6402337BFF9149C1819558441909A84E1/xul.sym:PUBLIC m 5c49783 0 _intrinsic_setjmp
xul.pdb/DEF358122E104788AFDF3C3D4AC442391/xul.sym:PUBLIC m 63f93c3 0 _intrinsic_setjmp
mozglue.pdb/87479AA1ED144313BA1FB3194F6CD9931/mozglue.sym:PUBLIC m 47082 0 _intrinsic_setjmp

I wonder where that 'm' is coming from and what does it mean. I'll check it out and fix the issue in our scripts.
Flags: needinfo?(gsvelto)
(Assignee)

Comment 36

8 months ago
This is caused by a change in breakpad that was introduced some time ago, here's the relevant issue and patch:

https://bugs.chromium.org/p/google-breakpad/issues/detail?id=751

https://chromium.googlesource.com/breakpad/breakpad/+/b1226959a25b6a5311801d6f204b088c706e7c25

I'm trying to figure out if we need changes on our side besides being able to parse the new format.
Oh shoot, right, that was the reason for bug 1421988. Ugh, I just realized that we'll have to fix symbols.mozilla.org as well because it provides the symbolication API that snappy used to.
(Assignee)

Comment 38

8 months ago
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #37)
> Oh shoot, right, that was the reason for bug 1421988. Ugh, I just realized
> that we'll have to fix symbols.mozilla.org as well because it provides the
> symbolication API that snappy used to.

Looking at the generated .sym files the only difference seem to be the _intrinsic_setjmp entries which have this 'multiple' thing added to their line, the rest of the file is identical. Do you know what should be done about it? If I make fix_stack_using_bpsyms.py ignore that character everything still seems to work correctly but I wonder if it would actually need special handling.
Flags: needinfo?(ted)
I think simply parsing and ignoring the m will be sufficient. It shouldn't change our existing behavior. I don't know exactly what the code does now in the face of multiple functions, but presumably the output is arbitrary based on the ordering in the file.
Flags: needinfo?(ted)
(Assignee)

Comment 40

8 months ago
I've updated the last patch with the fixes to fix_stack_using_bpsyms.py. Win64 mochitests are now passing correctly.

Comment 41

8 months ago
Pushed by gsvelto@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5284d9a25ffb
Updated breakpad to version 69c2c51dd89965d234eec16e3a9353634831916b; r=ted.mielczarek

Comment 42

8 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/5284d9a25ffb
Status: ASSIGNED → RESOLVED
Last Resolved: 8 months ago
status-firefox63: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Depends on: 1474871
Depends on: 1526897
You need to log in before you can comment on or make changes to this bug.