Unknown-cause topcrash in imglib [@ nsExpirationTracker<imgCacheEntry, 3>::RemoveObject(imgCacheEntry*) ]

VERIFIED FIXED in mozilla1.9.2a1

Status


Core
ImageLib
P1
critical
9 years ago
7 years ago

People

(Reporter: Joe Drew (not getting mail), Assigned: Joe Drew (not getting mail))

Tracking

({crash, fixed1.9.1, topcrash})

Trunk
mozilla1.9.2a1
x86
Windows XP
Bug Flags:
blocking1.9.1 +

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [fixed by bug 481753], crash signature, URL)

Attachments

(2 attachments)

(Assignee)

Description

9 years ago
There is currently a topcrash in imglib that I haven't found a way to reproduce. It seems to happen only on Windows (at least, all of the crash reports so far have come from Windows users), and I haven't found steps to reproduce (STR) anywhere.

For anybody who's trying to help me debug: please set the environment variables

NSPR_LOG_MODULES="imgRequest:5"
NSPR_LOG_FILE="imgRequest.log"

before running your trunk version of Firefox, and then send me imgRequest.log when Firefox crashes. Those logs are essential to getting this fixed.
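
A minimal sketch of one way to apply the two variables above when launching a build, for anyone who would rather not set them by hand in a shell. It assumes Python is available; the names `logging_env` and `run_with_logging` and the path you pass in are hypothetical and only for illustration, not part of any Mozilla tooling.

```python
import os
import subprocess


def logging_env(base_env, log_file="imgRequest.log"):
    """Return a copy of base_env with the NSPR logging variables set.

    The values are the ones given in the description above; base_env
    itself is left unmodified.
    """
    env = dict(base_env)
    env["NSPR_LOG_MODULES"] = "imgRequest:5"
    env["NSPR_LOG_FILE"] = log_file
    return env


def run_with_logging(firefox_path):
    """Launch the given Firefox binary with logging enabled.

    firefox_path is an assumption: point it at your own trunk build.
    Blocks until the browser exits and returns its exit code.
    """
    return subprocess.call([firefox_path], env=logging_env(os.environ))
```

Calling `run_with_logging` with the path to your trunk build starts the browser with both variables set; after a crash, the file named in NSPR_LOG_FILE is the one to attach or send.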

Neil Rashbrook has seen this crash in the past.
Flags: blocking1.9.1?
(Assignee)

Updated

9 years ago
Severity: normal → critical
Priority: -- → P1
Happened for me too while using the first tryserver build from Shawn on bug 455555. Do you need more steps?
Keywords: crash, topcrash
OS: Windows XP → All
Hardware: x86 → All
As Joe explained it to me, the fix for bug 466586 has resulted in this topcrash occurring. Since bug 466586 is a P1 blocker, that implies that this is a P1 blocker too, although the fact that it's an "unknown cause" makes me worry.

Vlad: what do you suggest we do here? I see a couple of options:

A) Allow bug 466586 to land in B3 and use the additional crash stacks to determine the cause of this crash.

B) Defer bug 466586 to the next beta, and resolve this on trunk.

C) Hold Beta 3 for this bug's fix.
Flags: blocking1.9.1? → blocking1.9.1+
Keywords: crash, topcrash
OS: All → Windows XP
Hardware: All → x86
(Assignee)

Updated

9 years ago
Keywords: crash, topcrash
There's also a topcrash on Linux with the signature:
nsExpirationTracker<imgCacheEntry, 3u>::RemoveObject
Some of the stacks look like the same bug to me:
bp-646481c8-5149-4cb1-8c33-2e98b2090226
bp-127414a0-6d87-4000-a858-981632090223
bp-714a311a-d3c3-450a-a256-f3da22090220

But others look like a different crash:
bp-8e0dc0da-739f-4e3a-bc15-1e8152090226
bp-328f3e71-4f26-4daf-89aa-39b9f2090223
(Assignee)

Comment 4

9 years ago
They more than likely all have the same root cause. Thanks, Mats.

Comment 5

9 years ago
I had one of those crashes, and all I can say is that I visited an image-heavy website and closed the window while images were still loading (possibly the last browser window; I don't remember).

Comment 6

9 years ago
I wonder if Valgrind would help here.

Comment 7

9 years ago
Sorry, I meant to say "I wonder if bug 479502 fixed this".  The keys are like right next to each other.

Comment 8

9 years ago
My crash totally went away yesterday too, so I also recommend RESO DUPE.
(Assignee)

Comment 9

9 years ago
There are still crashes in crash-stats using 2009-02-26's build, so I don't think so.
Chris, could you give us some of the URLs from the crashes listed here?

http://tinyurl.com/d2wy7b

That would be really helpful for finding a reliable crash test. Thanks.

Comment 11

9 years ago
Created attachment 364838 [details]
valgrind log of invalid write during imgcache expiry

maybe this is related?

Comment 12

9 years ago
(In reply to comment #11)
> Created an attachment (id=364838) [details]
> valgrind log of invalid write during imgcache expiry
> 
> maybe this is related?

Or maybe that's what bug 479502 fixed, sorry.
(Assignee)

Comment 13

9 years ago
Yeah, almost certainly. :(
Moved to P2 as per earlier discussion.
Priority: P1 → P2
(Assignee)

Updated

9 years ago
Depends on: 481553
(Assignee)

Updated

9 years ago
Depends on: 481753

Comment 15

9 years ago
Created attachment 365818 [details]
log of crash using NSPR_LOG_MODULES=all:5

The log remained empty when using NSPR_LOG_MODULES="imgRequest:5", but using all:5 generated data. Better than nothing? http://www.animatedengines.com/index.shtml generates crashes for me consistently.
On Mac, I get a crash with this top of the stack:
0   XUL                           	0x0140556a void std::__adjust_heap<__gnu_cxx::__normal_iterator<nsRefPtr<imgCacheEntry>*, std::vector<nsRefPtr<imgCacheEntry>, std::allocator<nsRefPtr<imgCacheEntry> > > >, int, nsRefPtr<imgCacheEntry>, bool (*)(nsRefPtr<imgCacheEntry> const&, nsRefPtr<imgCacheEntry> const&)>(__gnu_cxx::__normal_iterator<nsRefPtr<imgCacheEntry>*, std::vector<nsRefPtr<imgCacheEntry>, std::allocator<nsRefPtr<imgCacheEntry> > > >, int, int, nsRefPtr<imgCacheEntry>, bool (*)(nsRefPtr<imgCacheEntry> const&, nsRefPtr<imgCacheEntry> const&)) + 3507818

Same bug? Different one?
(Assignee)

Comment 17

9 years ago
Different one. If you can reproduce it, please file a bug.
(Assignee)

Updated

9 years ago
Priority: P2 → P1
The crash from comment #16 is now bug 482690.
I crashed with this stack twice in the past two days:
http://crash-stats.mozilla.com/report/index/9c0be68b-4df5-4ca5-8202-6c9a82090319?p=1
http://crash-stats.mozilla.com/report/index/2fd40617-7e6a-4f6f-912f-3b4922090318

The xul.dll frames without symbols weird me out. AFAICT, those really are out in the weeds, but the addresses are consistent, so that makes no sense to me. I'll try NSPR logging, as well as trying to catch it in a debugger.
(Assignee)

Comment 20

9 years ago
If you can continue to reproduce with a build from 20090318 or later, I am *really* interested in that. So far I haven't seen any crashes on crash-stats since I checked in bug 481753.
Those crashes were with 20090317, so here's hoping!
(Assignee)

Comment 22

9 years ago
This was a _very_ common crash that hasn't shown up at all since the 20090317 build - none from 0318 or 0319 have shown up on crash-stats. I'm going to call this fixed, and breathe a sigh of relief.
Status: NEW → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED
Since a changeset that actually corrected this does not seem to be identified here, isn't the correct resolution for this bug WORKSFORME?
Resolution: FIXED → WORKSFORME
I think Joe is saying that the fix in bug 481753 fixed this.
Comments in that bug would indicate you are correct.  WORKSFORME -> FIXED
Resolution: WORKSFORME → FIXED
Whiteboard: [fixed by bug 481753]
Target Milestone: --- → mozilla1.9.2a1
Status: RESOLVED → VERIFIED
Duplicate of this bug: 484706
(Assignee)

Updated

9 years ago
Keywords: fixed1.9.1
Did this land on 1.9.2 yet? Flags indicate no. It's the #31 topcrash in preliminary 3.6b4 data.
Oh, err, didn't realize how old this bug was. Seems like there's some other bug with the same signature. Will file a separate bug on that.

Nothing to see here people, move along.
There are still crash reports coming in with this signature.
I filed bug 548102.
Crash Signature: [@ nsExpirationTracker<imgCacheEntry, 3>::RemoveObject(imgCacheEntry*) ]