Closed Bug 1070874 Opened 10 years ago Closed 9 years ago

crash in nsUrlClassifierPrefixSet::StoreToFd(mozilla::Scoped<mozilla::ScopedClosePRFDTraits>&) at 0x5a5a5a5a5a5a5a5a

Categories

(Toolkit :: Safe Browsing, defect)

All
macOS
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox35 --- affected

People

(Reporter: metasieben, Unassigned)

Details

(Keywords: crash)

Crash Data

This bug was filed from the Socorro interface and is report bp-6bbd18c8-7771-47f3-917b-9c5a12140922.
=============================================================
Please try to reproduce again and then provide some steps to reproduce the problem.
Sadly, reproducing the crash is almost impossible, since I don't know what happened before the crash. As I wrote in the crash-reporter comment: I was not doing anything, the screen was even locked, and Nightly was idling (afaik).

afaics there were at least 40 crashes (mostly OS X 10.9) with the same signature in the last 14 days.
I was on Planet Mozilla, scrolling downwards using the "Page Down" key, when I hit this crash.

bp-66be40c6-6b12-4d34-9a67-91a152141008
Status: UNCONFIRMED → NEW
Ever confirmed: true
Component: Untriaged → Phishing Protection
Product: Core → Toolkit
Moving to Phishing Protection since Safe Browsing is on the stack.

Setting needinfo? from SafeBrowsing owners - Sid/Gian-Carlo, any idea on how to move this nightly crasher forward? (plus cc'ing more SafeBrowsing peers, from https://wiki.mozilla.org/Modules/All#Core )
Flags: needinfo?(sstamm)
Flags: needinfo?(gpascutto)
Version: 35 Branch → Trunk
(cc'ing :mmc, not sure if this is related to our new TP code)
The Mac backtraces are all unusable, they're supposedly crashing here:
http://hg.mozilla.org/releases/mozilla-aurora/annotate/baaa0c3ab8fd/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#l420
and the crash line in nsTArray is while trying to call the Length method on heap or stack objects.

There's a single Windows one, which is also unusable:
https://crash-stats.mozilla.com/report/index/ae9b82e4-f7ef-4e28-917d-909632141004
because it supposedly crashes here, trying to write out a stack object:
http://hg.mozilla.org/releases/mozilla-release/annotate/32dddf30405a/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#l366

This reeks of some sort of memory corruption to me, with this code being the unfortunate victim because it needs to run while the browser is idling.

I see nothing actionable here. Maybe if there are some more Windows stacks, which unlike the Mac ones tend to be actually usable...
Flags: needinfo?(gpascutto)
(In reply to Gian-Carlo Pascutto [:gcp] from comment #6)
> The Mac backtraces are all unusable, they're supposedly crashing here:

Adding Steven (Mac guru) to needinfo - hopefully he can tell us why the backtraces are unusable?

> This reeks of some sort of memory corruption to me, with this code being the
> unfortunate victim because it needs to run while the browser is idling.

Well, if it's memory corruption, it doesn't seem good for the future. (plus Nightly users are crashing, so if nothing is done, maybe more users will hit it again as it rides the train?)

Boris, any ideas?
Flags: needinfo?(sstamm)
Flags: needinfo?(smichaud)
Flags: needinfo?(bzbarsky)
It's interesting that this signature is so heavily Mac OS X focused. Compare with other crashes in code that sits near this: bug 1050108 and bug 1080472. Those may be related, i.e. the memory usage measurement is also trying to get to the Length() of those buffers, as in comment 6.
No idea, sorry...
Flags: needinfo?(bzbarsky)
All I can tell from a quick look is that these crashes are all on a secondary thread, and are memory-poisoning crashes (at address 0x5a5a5a5a5a5a5a5a).  Will look more closely later.

The crash stacks *aren't* "unusable".  It's just that none of their OS-related symbols is symbolized -- a longstanding problem with Socorro.
Summary: crash in nsUrlClassifierPrefixSet::StoreToFd(mozilla::Scoped<mozilla::ScopedClosePRFDTraits>&) → crash in nsUrlClassifierPrefixSet::StoreToFd(mozilla::Scoped<mozilla::ScopedClosePRFDTraits>&) at 0x5a5a5a5a5a5a5a5a
The actual crashes take place here:

http://hg.mozilla.org/mozilla-central/annotate/98ea98c8191a/xpcom/glue/nsTArray.h#l328

Socorro can't resolve its link to nsTArray.h, because it's looking for it in the wrong location.  This, too, is a long standing Socorro bug.
I think I'll stop here.

On the face of it, it seems that mHdr is invalid (that mHdr == 0x5a5a5a5a5a5a5a5a).  I've *no* idea how that could happen.
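
To make the poisoning observation concrete: if the allocator fills freed memory with a repeated poison byte, then any pointer-sized field read back from a freed object (such as an array's mHdr) comes out as that byte repeated, and the next dereference faults at that address. The following is a minimal Python sketch of a triage check, not Mozilla code. The function name, the 0x5a default and the 16-byte "freed object" are illustrative assumptions based only on the crash address reported in this bug.

```python
def looks_like_poison(address: int, poison_byte: int = 0x5A, width_bytes: int = 8) -> bool:
    """Return True if every byte of `address` equals `poison_byte`.

    Assumption: freed memory is filled with a single repeated byte, so a
    pointer loaded from freed memory is that byte repeated `width_bytes` times.
    """
    if not 0 <= address < 1 << (8 * width_bytes):
        return False
    return all(b == poison_byte for b in address.to_bytes(width_bytes, "little"))


# Simulate a freed object whose storage was poisoned (hypothetical 16-byte object).
freed_object = bytes([0x5A]) * 16

# Reading a pointer-sized field (standing in for mHdr) from the poisoned storage.
stale_header_pointer = int.from_bytes(freed_object[:8], "little")

print(hex(stale_header_pointer), looks_like_poison(stale_header_pointer))
```

A check like this only says that the pointer was read from poisoned memory; it does not say who freed the object or when.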
Flags: needinfo?(smichaud)
(In reply to Steven Michaud from comment #11)
> The actual crashes take place here:
>
> http://hg.mozilla.org/mozilla-central/annotate/98ea98c8191a/xpcom/glue/nsTArray.h#l328

So the crashes trace back probably to:

http://hg.mozilla.org/mozilla-central/rev/909655c3ec14

or

http://hg.mozilla.org/mozilla-central/rev/f915a22def59

Does anybody have any idea what's next?
Supersearch says this was first seen on Mac in nightly 20140805030300 -- but the crash rate is so low that it may have been introduced 1-3 days before.
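
As a rough illustration of the date arithmetic: a nightly build ID is a YYYYMMDDHHMMSS timestamp, so the "1-3 days before" estimate can be turned into a candidate range of build dates. This is a minimal Python sketch; the function names and the three-day default are illustrative assumptions, not part of any Mozilla tool.

```python
from datetime import date, datetime, timedelta
from typing import Tuple


def parse_build_id(build_id: str) -> datetime:
    """Parse a 14-digit build ID (YYYYMMDDHHMMSS) into a datetime."""
    return datetime.strptime(build_id, "%Y%m%d%H%M%S")


def regression_window(first_seen_build_id: str, max_days_before: int = 3) -> Tuple[date, date]:
    """Return (earliest, latest) build dates in which the regression may have landed.

    Assumption: with a low crash rate, the bug may have been present for up to
    `max_days_before` days before the first build that reported it.
    """
    first_seen = parse_build_id(first_seen_build_id).date()
    return first_seen - timedelta(days=max_days_before), first_seen


earliest, latest = regression_window("20140805030300")
print(earliest.isoformat(), latest.isoformat())
```

Any change that landed between those two dates is a candidate, which is the kind of reasoning used to pick out a suspect bug from the pushlog.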
(In reply to Gary Kwong [:gkw] [:nth10sd] GMT+8 after Oct 9 from comment #13)
> (In reply to Steven Michaud from comment #11)
> > The actual crashes take place here:
> >
> > http://hg.mozilla.org/mozilla-central/annotate/98ea98c8191a/xpcom/glue/nsTArray.h#l328
> 
> So the crashes trace back probably to:
> 
> http://hg.mozilla.org/mozilla-central/rev/909655c3ec14

Pure coding style update, doesn't look like a functional change.

> http://hg.mozilla.org/mozilla-central/rev/f915a22def59

4 year old change with code that's used all over the tree.
(In reply to David Major [:dmajor] (UTC+13) from comment #14)
> Supersearch says this was first seen on Mac in nightly 20140805030300 -- but
> the crash rate is so low that it may have been introduced 1-3 days before.

Then likely bug 1046038 either created or exposed the problem.
(In reply to Steven Michaud from comment #10)

> The crash stacks *aren't* "unusable".  It's just that none of their
> OS-related symbols is symbolized -- a longstanding problem with Socorro.

My problem with the Mac OS X traces is that they don't have any stack *inside* StoreToFD, although that is not an OS-related symbol. That means we can't see the line that's actually causing the crash.
I'm going to assume bug 1074196 already fixed this, based on the recent crash reports I see mostly being in Aurora and not past 2014-10-07.
This means that single Windows crash must be caused by something else, and that one is likely a dupe of bug 721196.
> I'm going to assume bug 1074196 already fixed this

I doubt it.

The patch for bug 1074196 landed on trunk on 2014-10-01:
https://hg.mozilla.org/mozilla-central/rev/3c8818fdb16c

But here are several crashes in builds that contain that patch:

bp-e7bdaf9a-f9f2-4a95-a315-dab7b2141008
bp-66be40c6-6b12-4d34-9a67-91a152141008
bp-6dca05b9-1685-4372-be92-d128d2141007
bp-9d0a0fa2-36bd-42e7-9718-b0ba22141005

> My problem with the Mac OS X traces is that they don't have any stack *inside* StoreToFD

I agree that's annoying.  But you can still learn a lot from these stacks.
Crash Signature: [@ nsUrlClassifierPrefixSet::StoreToFd(mozilla::Scoped<mozilla::ScopedClosePRFDTraits>&)] → [@ nsUrlClassifierPrefixSet::StoreToFd(mozilla::Scoped<mozilla::ScopedClosePRFDTraits>&)] [@ nsUrlClassifierPrefixSet::StoreToFd]
None of the recent reports with this signature are showing a crash at 0x5a5a5a5a5a5a5a5a. We already have the more-generic bug 721196 for crashes with this signature, so closing this specific instance as WFM.
Status: NEW → RESOLVED
Closed: 9 years ago
Keywords: testcase-wanted
Resolution: --- → WORKSFORME
> None of the recent reports with this signature are showing a crash at 0x5a5a5a5a5a5a5a5a.

The only two recent crashes on OS X are:

bp-c4c0902b-a835-4fc7-bfd6-6feb62151020
bp-5e69dd3a-0841-4dff-8078-889a12151020

Look at the raw dumps for both of those crashes.  And remember that, on all Intel platforms, some crash addresses can be misreported as '0' or '-1' (aka 0xffffffffffffffff).  See bug 1018360 comment #0.

Those two crashes aren't significant:  There are only two of them, and they're in old versions of Firefox.  So you don't need to change the status of this bug.  I just want to make sure my work figuring out bug 1018360 and friends isn't forgotten.  You really do need to watch out for misreported crash addresses.
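
The warning about misreported addresses can be folded into a small triage check. The sketch below is Python and not taken from Socorro; all names are hypothetical. It flags reported addresses of 0 or -1 as "possibly misreported" rather than trusting them. The canonical-address helper rests on a separate assumption: on x86-64 with 48-bit virtual addresses, bits 47 through 63 of a valid address must all be equal, and 0x5a5a5a5a5a5a5a5a does not satisfy that. Whether that is the mechanism behind the misreporting is not stated in this bug; see bug 1018360 for the actual analysis.

```python
MASK_64 = (1 << 64) - 1


def possibly_misreported(reported_address: int) -> bool:
    """Return True if the reported crash address is 0 or -1 (0xffffffffffffffff).

    Per the comment above, such values may hide the real faulting address,
    so the raw dump should be inspected before drawing conclusions.
    """
    return (reported_address & MASK_64) in (0, MASK_64)


def is_canonical(address: int, va_bits: int = 48) -> bool:
    """Return True if a 64-bit address is canonical for `va_bits` of virtual address.

    Assumption: x86-64 with 48-bit virtual addresses, where bits 47..63
    must be all zeros or all ones.
    """
    top = (address & MASK_64) >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


for addr in (0x0, MASK_64, 0x5A5A5A5A5A5A5A5A, 0x00007FFFFFFFFFFF):
    print(hex(addr), possibly_misreported(addr), is_canonical(addr))
```

In practice this would only be a first filter: a report flagged as possibly misreported still needs its raw dump read by hand, as suggested above.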