Closed Bug 698585 Opened 8 years ago Closed 2 years ago

Fennec Nightly shows an empty signature when it should not.

Categories

(Socorro :: General, task, P1, major)

ARM
Android

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nhirata, Unassigned)

Details

1. See:
 https://crash-stats.mozilla.com/report/list?range_value=7&range_unit=days&date=2011-10-31%2010%3A00%3A00&missing_sig=EMPTY_STRING&version=Fennec%3A10.0a1 

Expected: should have some crash signature within the stacks of the crashes
Actual: no crash stacks at all

Note:
1. For a Java crash example within that set, see the app notes section: https://crash-stats.mozilla.com/report/index/b0fbf30f-93f7-46f1-8c86-deff32111031
2. I would have expected to see something like Java_org_mozilla_gecko_GeckoAppShell_reportJavaCrash in the stack
(see https://crash-stats.mozilla.com/report/index/4af9643a-8adb-4f56-ad7f-64f0c2111029).
This is not a (Socorro) bug. If we run out of memory, we often just don't have a chance to gather the info to send to the crash server. There might be other reasons as well that we encounter such a situation, Ted knows more on that. In any case, that would be a problem in the Breakpad client, I'd guess, not in Socorro.

Ted, do we have bugs on the Breakpad side to improve our situation there?

bp-b0fbf30f-93f7-46f1-8c86-deff32111031 might be a special case that we need to care about, though, as it's from the birch branch and might be a crash in the native UI.
That report has a Java stack, though, which means we should have been intentionally aborting, which is not an OOM.
(In reply to Ted Mielczarek [:ted, :luser] from comment #2)
> That report has a Java stack, though, which means we should have been
> intentionally aborting, which is not an OOM.

Yes, that's this one. I guess that might actually be fixed once bug 686973 is pushed live later this week.

The others are probably just our usual OOM problem, right? In that case, I don't see a Socorro problem; this is probably a dupe of either the Breakpad bug(s) on better OOM handling or bug 686973.
That's not clear to me, no. The OOM handling is generally a problem on Windows, where we don't control the minidump writing code. On Linux, the Breakpad code goes to great lengths to be safe about memory handling while writing dumps.
KaiRo: In bug 686973, Java signature generation is triggered solely by the presence of "Java_org_mozilla_gecko_GeckoAppShell_reportJavaCrash" as a signatureSentinel within the standard stack. In the case of crash b0fbf30f-93f7-46f1-8c86-deff32111031, minidump_stackwalk failed due to a corrupt dump file. That means that the signatureSentinel would not have been found in the regular stack and Java signature generation would not have been triggered.

If you'd like Java signature generation to have a secondary trigger, we'll have to rework the changes made in bug 686973. File an additional bug if you'd like me to take action.
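For readers following along, here is a minimal sketch of the sentinel-only trigger described above and why a corrupt dump ends in an empty signature. This is not Socorro's actual code: the function name `choose_signature`, the use of the first Java stack line as the signature, and the " | " frame separator are all assumptions made for illustration. Only the sentinel string comes from this bug.

```python
# Illustrative sketch only; names and rules below are hypothetical, not Socorro's API.
SIGNATURE_SENTINEL = "Java_org_mozilla_gecko_GeckoAppShell_reportJavaCrash"


def choose_signature(native_frames, java_stack_trace):
    """Pick a signature from the native frames, switching to the Java stack
    only when the sentinel frame appears in the native stack."""
    if SIGNATURE_SENTINEL in native_frames:
        # Assumption: the first line of the Java stack trace serves as the signature.
        lines = java_stack_trace.splitlines() if java_stack_trace else []
        return lines[0].strip() if lines else ""
    # No sentinel found: fall back to the native frames (assumed " | " separator).
    return " | ".join(native_frames)


# A corrupt dump yields no native frames, so the sentinel is never seen and
# the Java stack is ignored even though it was submitted with the report.
corrupt_dump_frames = []
java_stack = "java.lang.NullPointerException\n\tat org.mozilla.gecko.Example.run"
empty_signature = choose_signature(corrupt_dump_frames, java_stack)
```

A secondary trigger would mean also consulting the Java stack when the native frame list is empty, which is the rework mentioned above.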
(In reply to Ted Mielczarek [:ted, :luser] from comment #4)
> That's not clear to me, no. The OOM handling is generally a problem on
> Windows, where we don't control the minidump writing code. On Linux, the
> Breakpad code goes to great lengths to be safe about memory handling while
> writing dumps.

OK, then it seems we have a different problem there, but given the messages in the processor notes, it surely looks like what we are getting from Breakpad is not useful, so I still suspect that this is a Breakpad bug in the end, do you agree? What could be happening there? Do we have any clue?

(In reply to K Lars Lohn [:lars] [:klohn] from comment #5)
> KaiRo: In bug 686973, java signature generation is triggered solely by the
> presence of "Java_org_mozilla_gecko_GeckoAppShell_reportJavaCrash" as a
> signatureSentinel within the standard stack.  In the case of crash
> b0fbf30f-93f7-46f1-8c86-deff32111031, minidump_stackwalk failed due to a
> corrupt dump file.  That means that the signatureSentinel would not have
> been found in the regular stack and Java signature generation would not have
> been triggered.

Hrm, OK, so much for what I said above, then. A corrupt dump file is surely a bad thing in any case, though, and I guess that's the same for all the crashes mentioned in comment #0, so we should leave this bug to those, do you agree?

> If you'd like Java signature generation to have a secondary trigger,
> we'll have to rework the changes made in bug 686973. File an additional bug
> if you'd like me to take action.

Let's first see how this bug works out. If this is a one-off problem or something we can solve on the Breakpad side, I'd guess we come back with a good dump and the right signature, so let's first see if that's it.
Component: Socorro → General
Product: Webtools → Socorro
(In reply to Naoki Hirata :nhirata from comment #7)
> This has still occurred for Native Android (FennecAndroid):
> https://crash-stats.mozilla.com/report/list?range_value=7&range_unit=days&date=2012-01-08&missing_sig=EMPTY_STRING&version=FennecAndroid%3A12.0a1

Strangely, this URL doesn't work, but on https://crash-stats.mozilla.com/topcrasher/byversion/FennecAndroid/12.0a1/7 this is the top link in the list. I'm also not seeing those in my custom reports, so this feels fishy. We need the Socorro team to look behind the scenes there.
Severity: normal → major
Priority: -- → P1
These signatures are empty because the insert of the record into Postgres fails. Bug 712737 addresses the issue and will be deployed tomorrow (January 13, 2012) with Socorro 2.4.

The actual issue is that the Java stack is in an unexpected form. This induces the processor to generate a signature that is not only useless, but also too long for the column in Postgres. Bug 712737 addresses the issue by detecting the degenerate signatures and substituting a message for the faulty signature.
I think this has been fixed with 2.4 now and can be marked as such. We still get an "EMPTY: java stack not in expected format" message but that's a different problem.
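As a rough illustration of that kind of guard, here is a sketch that rejects a degenerate signature before the insert and substitutes a fixed message. It is not the code from bug 712737: the function name `safe_java_signature`, the 255-character limit, and the multi-line check are assumptions. The substitute message is the one reported in this bug.

```python
# Illustrative sketch only; the limit and the checks are assumed, not Socorro's.
MAX_SIGNATURE_LENGTH = 255  # hypothetical width of the Postgres signature column
FALLBACK_MESSAGE = "EMPTY: java stack not in expected format"


def safe_java_signature(candidate, max_length=MAX_SIGNATURE_LENGTH):
    """Return the candidate signature, or a fixed message when the candidate
    is empty, spans several lines, or would not fit in the column."""
    if not candidate:
        return FALLBACK_MESSAGE
    if "\n" in candidate or len(candidate) > max_length:
        return FALLBACK_MESSAGE
    return candidate


# An oversized signature is replaced so the database insert can succeed.
oversized = "x" * 1000
stored = safe_java_signature(oversized)
```

Substituting a message rather than truncating keeps the stored value recognizable as a processing failure instead of a misleading partial signature.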
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED