Closed
Bug 1485759
Opened 6 years ago
Closed 6 years ago
No symbolication in Fennec Android Nightly since Aug 14
Categories
(Toolkit :: Crash Reporting, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla63
Version | Tracking | Status |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox-esr60 | --- | unaffected |
firefox61 | --- | unaffected |
firefox62 | --- | unaffected |
firefox63 | blocking | fixed |
People
(Reporter: mccr8, Assigned: glandium)
References
Details
(Keywords: regression)
Attachments
(1 file)
2.47 KB, patch | froydnj: review+
Crashes from the 8-22 Android Nightly have signatures like:

OOM | large | libxul.so@0xffeda4 | libxul.so@0xffcfdf | libxul.so@0xffa1b9 | libxul.so@0x1f1247f | libxul.so@0x1f13a7f | libxul.so@0x1f13a37 | libxul.so@0x1f13a37 | libxul.so@0x102f6fb | libxul.so@0x1f13a37 | libxul.so@0x102f50f | libxul.so@0x1f13e1d

Example: bp-283d1876-52e1-483d-9893-7632f0180823

This seems to affect every C++ frame in the signatures.
Reporter
Comment 1 • 6 years ago
Linux looks okay for that Nightly.
Comment 2 • 6 years ago
Are these possibly some of the Geckoview crashes? cpeterson might know.
Flags: needinfo?(cpeterson)
Comment 3 • 6 years ago
Socorro has a processor rule that fixes the product for crashes incorrectly marked as Fennec that should be Focus. The example crash in the description isn't a content process crash. Unless I'm misunderstanding things, I'm pretty sure it's not Focus and probably not a GeckoView crash.
Comment 4 • 6 years ago
Ok, based on Comment 3 clearing the ni for Chris.
Flags: needinfo?(cpeterson)
Reporter
Comment 5 • 6 years ago
This is continuing in the 8-23 build, it looks like: https://crash-stats.mozilla.com/search/?build_id=20180823100113&release_channel=nightly&product=FennecAndroid&platform=Android&date=%3E%3D2018-08-22T17%3A00%3A00.000Z&date=%3C2018-08-24T11%3A19%3A46.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature
Comment 6 • 6 years ago
These unsymbolicated crash reports (or at least bp-283d1876-52e1-483d-9893-7632f0180823 from comment 0) are not from GeckoView because:

1. Focus is testing GeckoView 62.0b Beta and the above crash report is Gecko version 63.0a1 Nightly.
2. The above crash report has the "gws-and-facebook-spoof%40mozilla.org:1.0.0,webcompat%40mozilla.org:2.0.1" extension installed, which is a Fennec 63.0a1 Nightly experiment.

@ James: do you have any theories why we are getting unsymbolicated crash reports from Fennec 63.0a1 Nightly starting around August 22 this week?
status-firefox63:
--- → affected
Flags: needinfo?(snorp)
OS: Unspecified → Android
Summary: No symbolication for the 8-22 Android Nightly → No symbolication for the 8-22 Fennec Android Nightly
Reporter
Comment 7 • 6 years ago
I went back through the older builds, and the first build that isn't symbolicated is 20180814100103. For example: bp-26813110-78e1-4c0f-9333-f7c210180822

The prior build, 20180813100105, looks okay to me. For example: bp-54f60dc1-2547-4d77-b294-2166f0180817

The range for commits added to the 20180814100103 build is:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=bf79440c1376b1e1114ba653917e1577d7b1007b&tochange=914b3b370ad059a04ad751642b74e013f8e3ad08

I see "Bug 1480006 - Enable LTO on Android CI builds." in that range. glandium, can you take a look please?
Reporter
Updated • 6 years ago
tracking-firefox63:
--- → ?
Summary: No symbolication for the 8-22 Fennec Android Nightly → No symbolication in Fennec Android Nightly since LTO was enabled
Reporter
Updated • 6 years ago
Summary: No symbolication in Fennec Android Nightly since LTO was enabled → No symbolication in Fennec Android Nightly since Aug 14
Assignee
Comment 9 • 6 years ago
If I take the crashreporter symbols from the nightly build corresponding to the crash in comment 0, that is:
https://queue.taskcluster.net/v1/task/ODjSM4BUSmOVwnkRxjSE2g/runs/0/artifacts/public/build/target.crashreporter-symbols.zip

That archive contains the file libxul.so/4296B0626A41F5C300000000000000000/libxul.so.sym, which looks fine.

OTOH, the Module tab of the crash report says "missing symbols" for libxul.so debug identifier 4296B0626A41F5C301000000900000000, which matches.

Ted, any idea what's up there?
Flags: needinfo?(mh+mozilla) → needinfo?(ted)
Assignee
Comment 10 • 6 years ago
Also, if I manually symbolicate with that libxul.so.sym, the top frames are:

NS_ABORT_OOM(unsigned int) xpcom/base/nsDebugImpl.cpp:624
nsTSubstring<char16_t>::SetCapacity(unsigned int) xpcom/string/nsTSubstring.h:818
nsTSubstring<char16_t>::SetLength(unsigned int) xpcom/string/nsTSubstring.cpp:809
mozilla::dom::XMLHttpRequestMainThread::AppendToResponseText(char const*, unsigned int, bool) dom/xhr/XMLHttpRequestString.cpp:244
Some combo of glandium/ted probably has this under control :)
Flags: needinfo?(snorp)
Comment 12 • 6 years ago
For the crash in comment 0, looking at the raw dump tab shows in the modules list:

{
  "base_addr": "0xc7889000",
  "code_id": "62b09642416ac3f50100000090000000",
  "debug_file": "libxul.so",
  "debug_id": "4296B0626A41F5C301000000900000000",
  "end_addr": "0xcb2cf000",
  "filename": "libxul.so",
  "missing_symbols": true,
  "version": ""
},

I downloaded the matching build:
https://queue.taskcluster.net/v1/task/ODjSM4BUSmOVwnkRxjSE2g/runs/0/artifacts/public/build/en-US/target.apk

And (after decompressing it with xz) the libxul.so in there shows a very short build id:

Displaying notes found at file offset 0x009100a0 with length 0x00000018:
  Owner    Data size     Description
  GNU      0x00000008    NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 62b09642416ac3f5

dump_syms on that file produces:

$ dump_syms -i ./libxul.so
MODULE Linux arm 4296B0626A41F5C300000000000000000 libxul.so
INFO CODE_ID 62B09642416AC3F5

For comparison, I downloaded a nightly from 2018-08-13 (right before enabling LTO):
https://index.taskcluster.net/v1/task/gecko.v2.mozilla-central.nightly.2018.08.13.latest.mobile.android-api-16-opt/artifacts/public/build/en-US/target.apk

and it has a Build ID that's the length I would expect:

Displaying notes found at file offset 0x000001ec with length 0x00000024:
  Owner    Data size     Description
  GNU      0x00000014    NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 2c23f3cb5ea39f0db6cade029c8217dc9696d647

dump_syms on that file produces:

$ dump_syms -i ./libxul.so
MODULE Linux arm CBF3232CA35E0D9FB6CADE029C8217DC0 libxul.so
INFO CODE_ID 2C23F3CB5EA39F0DB6CADE029C8217DC9696D647

I think the Breakpad minidump writing code might have a bug with Build IDs that are this short. It looks like it's reading off the end of the array or something.
Flags: needinfo?(ted)
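For anyone following the ID comparison in comment 12, the relationship between the two build IDs and the MODULE lines can be sketched in Python. This is a minimal sketch inferred from the values quoted in that comment, not Breakpad's actual code. It assumes the debug identifier is built from the first 16 bytes of the build ID (zero-padded when the ID is shorter), with the first three GUID fields byte-swapped and a trailing "0" appended. The function name `breakpad_debug_id` is hypothetical.

```python
def breakpad_debug_id(build_id_hex: str) -> str:
    """Derive a Breakpad-style debug identifier from an ELF build ID (hex string).

    Assumption: layout inferred from the dump_syms output quoted in comment 12.
    """
    raw = bytes.fromhex(build_id_hex)
    # Keep the first 16 bytes; pad with zeros if the build ID is shorter.
    guid = raw[:16].ljust(16, b"\x00")
    # The first three GUID fields (4, 2 and 2 bytes) are byte-swapped.
    data1 = guid[0:4][::-1]
    data2 = guid[4:6][::-1]
    data3 = guid[6:8][::-1]
    # The remaining 8 bytes are kept in their original order.
    data4 = guid[8:16]
    # A trailing "0" is appended after the 32 hex digits.
    return (data1 + data2 + data3 + data4).hex().upper() + "0"


if __name__ == "__main__":
    # 8-byte build ID from the lld-linked nightly.
    print(breakpad_debug_id("62b09642416ac3f5"))
    # 20-byte build ID from the 2018-08-13 nightly.
    print(breakpad_debug_id("2c23f3cb5ea39f0db6cade029c8217dc9696d647"))
```

Under this assumption, the 8-byte ID yields a debug ID whose last 17 digits are all zero, which is what dump_syms printed. The crash report instead carries `0100000090000000` in the zero-padded half, which would be consistent with the minidump writer reading past the end of the short ID.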
Assignee
Comment 13 • 6 years ago
aha! with bfd ld, --build-id is equivalent to --build-id=sha1. with lld, it's equivalent to --build-id=fast. We "just" need to be more explicit.
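A quick way to tell which style produced a given build ID is to look at its length. The following is a rough sketch, not part of the patch: the 8-byte and 20-byte sizes come from the note dumps in comment 12, the 16-byte size for md5 and uuid is the usual 128-bit size of those styles, and the names `BUILD_ID_STYLES` and `build_id_style` are made up for illustration.

```python
# Assumed mapping from build ID length in bytes to the linker style that
# typically produces it. Only the 8 and 20 byte cases appear in this bug.
BUILD_ID_STYLES = {
    8: "fast",
    16: "md5 or uuid",
    20: "sha1",
}


def build_id_style(build_id_hex: str) -> tuple[int, str]:
    """Return (length in bytes, likely --build-id style) for a hex build ID."""
    length = len(bytes.fromhex(build_id_hex))
    return length, BUILD_ID_STYLES.get(length, "unknown")


if __name__ == "__main__":
    for build_id in (
        "62b09642416ac3f5",                          # lld-linked nightly
        "2c23f3cb5ea39f0db6cade029c8217dc9696d647",  # 2018-08-13 nightly
    ):
        print(build_id, build_id_style(build_id))
```

Passing --build-id=sha1 explicitly, as proposed here, removes the dependency on each linker's default.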
Assignee
Comment 14 • 6 years ago
Assignee: nobody → mh+mozilla
Attachment #9004538 -
Flags: review?(core-build-config-reviews)
Updated • 6 years ago
Attachment #9004538 -
Flags: review?(core-build-config-reviews) → review+
Assignee
Comment 15 • 6 years ago
(In reply to Ted Mielczarek [:ted] [:ted.mielczarek] from comment #12)
> For the crash in comment 0, looking at the raw dump tab shows in the modules
> list:
> { "base_addr": "0xc7889000", "code_id": "62b09642416ac3f50100000090000000",
> "debug_file": "libxul.so", "debug_id": "4296B0626A41F5C301000000900000000",
> "end_addr": "0xcb2cf000", "filename": "libxul.so", "missing_symbols": true,
> "version": "" },

Heh, I actually already pasted that number in comment 9, but failed to realize that there was a 9 in between the zeros.

> I think the Breakpad minidump writing code might have a bug with Build IDs
> that are this short. It looks like it's reading off the end of the array or
> something.

I think it's still desirable to have sha1s as build-ids (or at least something larger), but it seems like it would be a good thing to fix that bug indeed.
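Since the two identifiers are easy to misread by eye, a character-by-character comparison makes the mismatch obvious. This is only an illustrative sketch; the helper name `differing_positions` is hypothetical, and the two IDs are the ones quoted in comments 9 and 12.

```python
def differing_positions(a: str, b: str) -> list[tuple[int, str, str]]:
    """List (index, char_in_a, char_in_b) for every position where a and b differ."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    symbols_id = "4296B0626A41F5C300000000000000000"  # directory name in the symbols zip
    crash_id = "4296B0626A41F5C301000000900000000"    # debug_id in the crash report
    for index, expected, actual in differing_positions(symbols_id, crash_id):
        print(f"index {index}: symbols has {expected!r}, crash report has {actual!r}")
```

The two strings have the same length and differ in exactly two positions, a stray 1 and a stray 9 among the zeros.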
Comment 16 • 6 years ago
Pushed by mh@glandium.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2b045052d4aa
Pass --build-id=sha1 to the linker instead of --build-id. r=froydnj
Comment 17 • 6 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/2b045052d4aa
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Updated • 6 years ago
Severity: normal → blocker
status-firefox61:
--- → unaffected
status-firefox62:
--- → unaffected
status-firefox-esr52:
--- → unaffected
status-firefox-esr60:
--- → unaffected
Priority: -- → P1
Comment 18 • 6 years ago
(In reply to Mike Hommey [:glandium] from comment #13)
> aha! with bfd ld, --build-id is equivalent to --build-id=sha1. with lld,
> it's equivalent to --build-id=fast. We "just" need to be more explicit.

It would be nice if lld had a page on llvm.org that listed its commandline options. :-/

https://github.com/llvm-mirror/lld/blob/b771e1958601a28fafce682708530b493d0c89a6/ELF/Options.td#L30

Spelunking through blame shows:
https://github.com/llvm-mirror/lld/commit/3408d8720ddc65f867e6046f0ebd898feaae1075

"We made a deliberate choice to not use a secure hash function for the sake of performance. Computing a secure hash is slow -- for example, MD5 throughput is usually 400 MB/s or so. SHA1 is slower than that."

(In reply to Mike Hommey [:glandium] from comment #15)
> I think it's still desirable to have sha1s as build-ids (or at least
> something larger), but it seems like it would be a good thing to fix that
> bug indeed.

I filed bug 1487197 on that.
Reporter
Comment 19 • 6 years ago
Looks like Android symbolication is working again: bp-57c5ddd9-6275-4e4e-8854-cf2340180829 Thanks for the fix!