Closed Bug 673543 Opened 13 years ago Closed 12 years ago

crash nsDiskCacheMap::DeleteRecord

Categories

(Core :: Networking: Cache, defect)

Type: defect
Priority: Not set
Severity: critical

Tracking


Status: RESOLVED FIXED
Target Milestone: mozilla14

Tracking Status:
firefox10: tracking+, status ---
firefox11: tracking+, status ---
firefox12: tracking+, wontfix
firefox13: tracking+, wontfix
firefox14: tracking+, fixed
firefox-esr10: tracking 14+, fixed

People

(Reporter: WeirdAl, Assigned: michal)

Details

(Keywords: crash, topcrash, Whiteboard: [tbird crash][qa?])

Crash Data

Attachments

(5 files, 2 obsolete files)

This bug was filed from the Socorro interface and is report bp-67651a52-dd9a-4499-b25a-0940b2110722.
============================================================= 

Filing here because Socorro didn't offer me the Core product option. I don't know quite how this happened, but the stack trace appears to be different from the one in bug 631008.
Component: General → Networking: Cache
Product: Firefox → Core
QA Contact: general → networking.cache
Version: Trunk → 6 Branch
Assigning to Michal for investigation because it is probably related to bug 631008.
Assignee: nobody → michal.novotny
The stack trace looks like:
Frame  Module        Signature                                   Source
0      xul.dll       nsDiskCacheMap::DeleteRecord                netwerk/cache/nsDiskCacheMap.cpp:488
1      xul.dll       nsDiskCacheDevice::DeactivateEntry_Private
2      xul.dll       nsDiskCacheDeviceDeactivateEntryEvent::Run  netwerk/cache/nsDiskCacheDevice.cpp:473
3      xul.dll       nsThread::ProcessNextEvent                  xpcom/threads/nsThread.cpp:618
4      xul.dll       nsRunnable::Release                         obj-firefox/xpcom/build/nsThreadUtils.cpp:55
5      nspr4.dll     _PR_NativeRunThread                         nsprpub/pr/src/threads/combined/pruthr.c:426
6      nspr4.dll     pr_root                                     nsprpub/pr/src/md/windows/w95thred.c:122
7      mozcrt19.dll  _callthreadstartex                          obj-firefox/memory/jemalloc/crtsrc/threadex.c:348
8      mozcrt19.dll  _threadstartex                              obj-firefox/memory/jemalloc/crtsrc/threadex.c:326
9      kernel32.dll  BaseThreadStart
Crash Signature: [@ nsDiskCacheMap::DeleteRecord(nsDiskCacheRecord*)] → [@ nsDiskCacheMap::DeleteRecord(nsDiskCacheRecord*)] [@ nsDiskCacheMap::DeleteRecord]
OS: Windows NT → All
Hardware: x86 → All
Keywords: topcrash
Sitting at the #18 spot for crashes on 8.0 and at #38 on 8.0.1 over the past week.
Still occurs according to crash-stats. Grabbing this on request from Josh and by agreement with Michal.
Assignee: michal.novotny → bjarne
This is caused by the same problem as bug #691108, which I can reproduce with an xpcshell test in a debugger.

Bug #691108 only occurs in debug builds, since the crashing statement (NS_NOTREACHED) is active only in debug builds. In an opt build, nsDiskCacheMap::WriteDiskCacheEntry() returns failure (instead of crashing in the NOTREACHED), making DeactivateEntry_Private() call DeleteStorage() and DeleteRecord(). The latter crashes (this bug) because the record has already been deleted from nsDiskCacheMap.

It seems like an entry can be deleted from nsDiskCacheMap while there is a pending deactivation. This is basically because nsDiskCacheStreamIO keeps a binding to the record even after CloseOutputStream/Internal is called. I am not 100% sure under which circumstances this happens, but I can provoke it by dispatching an "xpcom-shutdown" notification from the test and using the debugger to make nsDiskCacheMap::DeleteStorage() fail in nsDiskCacheStreamIO::Flush(), which triggers the call to nsDiskCacheMap::DeleteRecord(). When the deactivate-event fires, the record has already been deleted and we get the behaviour described above.
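To make the sequencing concrete, here is a toy model of the double delete (all names are hypothetical stand-ins, not the real cache classes; the assert plays the role of the crash):

    #include <cassert>
    #include <set>

    // Stand-in for nsDiskCacheMap: like the real bucket scan,
    // DeleteRecord() assumes the record is still present.
    struct ToyCacheMap {
        std::set<int> records;
        void DeleteRecord(int hash) {
            assert(records.count(hash) && "record already gone -> crash");
            records.erase(hash);
        }
    };

    int main() {
        ToyCacheMap map;
        map.records.insert(42);
        map.DeleteRecord(42);   // Flush() fails and deletes the record...
        map.DeleteRecord(42);   // ...then the pending deactivate event
                                // deletes it again: this bug's crash
        return 0;
    }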

Will have to think about how to fix this... The obvious way is to clear the binding to the record in nsDiskCacheStreamIO but there may be other ways. I also do not really understand the logic in nsDiskCacheStreamIO::Flush() if nsDiskCacheMap::DeleteStorage() fails - why would we just remove the record..? Wouldn't dooming the entry be more appropriate?
Since bug #691108 is about one particular test failing, I've been trying to find the cache access-pattern used in that test in order to come up with some test-code which fails consistently. No clear conclusion yet, but the observation is that these crashes occur immediately when the test enables the disk cache, loads one entry, clears the cache and tries to load a second entry. The crash seems to be in the deactivate-call for the entry loaded immediately before clearing the cache.
(In reply to Bjarne (:bjarne) from comment #5)
> In an opt build,
> nsDiskCacheMap::WriteDiskCacheEntry() returns failure (instead of crashing
> in the NOTREACHED) making DeactivateEntry_Private() call DeleteStorage() and
> DeleteRecord(). The latter crashes (this bug) because the record has already
> been deleted from nsDiskCacheMap.

This does not seem to be right... I have a test which can consistently trigger bug #691108 but I do not seem to be able to trigger this bug with an opt build. Will investigate further...
This signature is now #5 on 10.* in yesterday's data, #8 in total 10.0b4 data.
Any idea if this is because it happens more frequently or because other crashes are fixed?
It was #18 top browser crasher in 8.0.
It's #25 top browser crasher in 9.0.1.
It's stable across 10.0 Betas, between #8 and #11.
So there's probably a regression between 9 and 10.
..  or maybe some other crashes were fixed between 9 and 10, causing this one to climb on the list?  ;)

Anyway, here's a theory: Using the test mentioned in comment #7, it can be shown that a failure in |nsDiskCacheMap::DeleteStorage| can cause an entry to be deleted from the cache-map twice (the essence is in comment #5). The crash occurs when |nsDiskCacheMap::DeleteRecord| is called a second time for the same entry:

https://hg.mozilla.org/releases/mozilla-release/file/4e309e63c279/netwerk/cache/nsDiskCacheMap.cpp#l488

Observe that on the second call |mHeader.mBucketUsage[bucketIndex]| can be 0 (for example if it happens immediately after clearing the cache). This means that |last| will be assigned the value -1, but since |last| is a |PRUint32|, a conversion will happen. In the for-loop below, |last| is converted into an |int| (note: plain signed int, with no specified bit width), and my guess is that on certain platforms and with certain compilers and flags, |i| ends up as a large positive number, which makes |records[i].HashNumber()| crash. This is possibly also a 32/64-bit issue - I'm not too deep into the gory details in this area.
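For concreteness, here is a minimal standalone sketch of the suspected conversion hazard (the loop shape is paraphrased from the linked code, not copied from it):

    #include <cstdio>

    typedef unsigned int PRUint32;     // 32-bit unsigned, as in NSPR

    int main() {
        PRUint32 bucketUsage = 0;          // bucket already emptied
        PRUint32 last = bucketUsage - 1;   // wraps to 0xffffffff
                                           // (well-defined for unsigned)
        printf("last = 0x%x\n", last);
        // The loop converts |last| back to a signed int; the result of
        // (int)0xffffffff is implementation-defined - commonly -1, which
        // skips the body, but nothing guarantees that everywhere.
        for (int i = last; i >= 0; i--) {
            printf("would touch records[%d]\n", i);
            break;   // one iteration is enough to show the hazard
        }
        return 0;
    }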

Michal: On my Linux-box this is handled somehow and I don't have the setup for Windows-builds. Could you try the patch I'll attach in a while on your setup and verify that the contained test crashes in an opt-build? Then, change |last| to be a signed int and see if it fixes the crash?
Attached patch Test to provoke crash (obsolete) — Splinter Review
This is the code mentioned in comment #7.

In a debug-build it reproduces the issue from bug #691108. In an opt-build, the theory is that on some platforms (non-Linux) it will reproduce the issue seen in this bug.

Note that this is not suitable for checking in since it cripples |nsICacheService| to provide a mechanism for the test to make |nsDiskCacheMap::DeleteStorage| fail.
(In reply to Bjarne (:bjarne) from comment #9)
> Any idea if this is because it happens more frequently or because other
> crashes are fixed?

Frequency of this crash has definitely risen in Firefox 10+ compared to 9. It was at 110-130 crashes per million ADU for the ~2 weeks before, and at 356 in data from 2012-01-17 (but that's a single-day spike). It's also going between 100 and 300 for Firefox 11. And it's at 5-8 crashes per million ADU in data for Firefox 9.

Also, as you mentioned 32bit vs. 64bit, https://crash-stats.mozilla.com/report/list?signature=nsDiskCacheMap%3A%3ADeleteRecord%28nsDiskCacheRecord*%29 seems to have mostly 32bit, but the odd 64bit crash is visible there as well. All the crashes are on Windows, though.
(In reply to Bjarne (:bjarne) from comment #11)
> Michal: On my Linux-box this is handled somehow and I don't have the setup
> for Windows-builds. Could you try the patch I'll attach in a while on your
> setup and verify that the contained test crashes in an opt-build? Then,
> change |last| to be a signed int and see if it fixes the crash?

I went ahead and tried this on tryserver, but the opt-build does not crash as expected...  :(
Still believe the conversion int -> unsigned 32-bit int -> int is unsafe. Additionally, even if the conversion happens correctly, an optimizing compiler might peel the first iteration of the loop (line 490 in the link from comment #11), because |last|, being an unsigned int, should always be >= 0. So in this case the loop body would be executed once even if |i| is -1, and a crash is likely.

Of course, the proper fix for this is to avoid deleting the record twice. However, that fix is not as obvious as simply avoiding unsigned ints here in order to make this code more robust.
IMO it's worth trying this safe, simple change and seeing if it fixes this crash. If it doesn't, then no harm should have been done.
Attachment #590359 - Flags: review?(michal.novotny)
This can be applied alone or together with attachment #590359 [details] [diff] [review]

The basic idea is to make sure error-values are propagated properly from internal methods up to a level where they can be handled. I've chosen nsCacheEntry as the point to deal with errors, but I'm open to discussion. (Note that the test in attachment #589849 [details] [diff] [review] needs a try-catch to deal with this - I'll attach a new version shortly.)
Attachment #590702 - Flags: review?(michal.novotny)
Patch sent to tryserver

https://tbpl.mozilla.org/?tree=Try&rev=1f6764f8dfca

As a side-note: Bug #691108 seems to have disappeared since I started looking at this. I'm not aware of any fixes which may have fixed that. Anyone who knows..?
Whiteboard: [tbird crash]
Comment on attachment 590702 [details] [diff] [review]
More complete fix which avoids deleting record twice

(In reply to Bjarne (:bjarne) from comment #17)
> The basic idea is to make sure error-values are propagated properly from
> internal methods and up to a level where they can be handled. I've chosen
> nsCacheEntry as the point to deal with errors, but I'm open to discussions.

If we simply call nsCacheService::DoomEntry() in nsDiskCacheStreamIO::Flush() instead of propagating the error, then the only difference would be that we wouldn't stop early in nsCacheEntry::RemoveDescriptor(), right? Would it be a problem?
Attachment #590702 - Flags: review?(michal.novotny)
(In reply to Bjarne (:bjarne) from comment #11)
> Michal: On my Linux-box this is handled somehow and I don't have the setup
> for Windows-builds. Could you try the patch I'll attach in a while on your
> setup and verify that the contained test crashes in an opt-build? Then,
> change |last| to be a signed int and see if it fixes the crash?

I've tried the patch on a Windows build but it didn't crash. Variable "last" is 0xffffffff and the for loop is not entered.
Comment on attachment 590359 [details] [diff] [review]
Change to signed int

I'm not sure that we want this change. This would just suppress the crash without fixing the issue. Anyway, if we want to handle the bucket usage better, I'd prefer to make variable 'i' unsigned since mBucketUsage is also unsigned, i.e. something like this:

    PRUint32            last = mHeader.mBucketUsage[bucketIndex];
    for (PRUint32 i = last; i > 0; i--) {
        if (records[i-1].HashNumber() == hashNumber) {
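(For reference, a standalone check of this counting-down-over-an-unsigned pattern - a sketch, not the real |DeleteRecord|:)

    #include <cstdio>

    typedef unsigned int PRUint32;

    static void scan(PRUint32 usage) {
        // |i| runs over (usage, 0], indexing with i-1, so nothing is
        // ever computed from usage - 1 when usage == 0.
        for (PRUint32 i = usage; i > 0; i--)
            printf("visit records[%u]\n", i - 1);
    }

    int main() {
        scan(3);   // visits 2, 1, 0
        scan(0);   // visits nothing; no wraparound
        return 0;
    }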
Attachment #590359 - Flags: review?(michal.novotny) → review-
(In reply to Michal Novotny (:michal) from comment #20)
> Comment on attachment 590702 [details] [diff] [review]
> If we simply call nsCacheService::DoomEntry() in
> nsDiskCacheStreamIO::Flush() instead of propagating the error, then the only
> difference would be that we wouldn't stop early in
> nsCacheEntry::RemoveDescriptor(), right? Would it be a problem?

That would resolve this particular problem, yes, but I'd consider it more of a hack. I think it is a mistake in any case not to propagate errors. IMO, it is not the responsibility of the disk-stream to decide whether to doom a cache-entry. It should propagate errors and leave such decisions to a higher layer.
Attachment #590702 - Flags: review?(michal.novotny)
(In reply to Michal Novotny (:michal) from comment #22)
> I'm not sure that we want this change. This would just suppress the crash
> without fixing the issue. 

Agreed, but it would make the code robust. The way it is now is just plain wrong.

> Anyway, if we want to handle the bucket usage
> better, I'd prefer to make variable 'i' unsigned since mBucketUsage is also
> unsigned, i.e. something like this:
> 
>     PRUint32            last = mHeader.mBucketUsage[bucketIndex];
>     for (PRUint32 i = last; i > 0; i--) {
>         if (records[i-1].HashNumber() == hashNumber) {

The point of this change is to avoid unsafe conversions. If you prefer to do it this way, that's OK with me, although I see neither the rationale nor the benefit compared to the suggested way.

Also observe that this loop is found in many methods in this class, but without using |last|, instead subtracting 1 in the initial assignment to |i|. The reason for using |last| in |DeleteRecord| is that |last| is needed in three other places in the loop. You would need to subtract 1 in those three places as well, which I guess is the reason it was done like this.

I've provided two theories for why this particular line crashes (comment #11 and comment #15) - do you have any other theory for what can cause a crash here?
(In reply to Bjarne (:bjarne) from comment #19)
> Patch sent to tryserver
> 
> https://tbpl.mozilla.org/?tree=Try&rev=1f6764f8dfca

Seems to be as green as a tryserver-run can be these days...

> As a side-note: Bug #691108 seems to have disappeared since I started
> looking at this.

... and of course bug #691108, comment #77 came in a few hours after this comment. But the frequency of these has at least dropped.
Comment on attachment 590702 [details] [diff] [review]
More complete fix which avoids deleting record twice

> +    // Doom entry if something bad happens while closing. See bug #673543
> +    if (NS_FAILED(rv) && rv != NS_BASE_STREAM_WOULD_BLOCK) {
> +        nsCacheService::DoomEntry(this);
> +        return false; // it' doomed - deactivate it
> +    }

Dooming the entry doesn't remove any active descriptor (though there shouldn't be any other descriptor since we had opened output) or request. So you IMO can't return false here.


> +        // Doom entry if something bad happens while closing. See bug #673543
> +        nsresult rv = descriptor->CloseOutput();
> +        if (NS_FAILED(rv) && rv != NS_BASE_STREAM_WOULD_BLOCK)
> +            nsCacheService::DoomEntry(this);

Cache streams never return NS_BASE_STREAM_WOULD_BLOCK, see nsCacheEntryDescriptor::nsOutputStreamWrapper::IsNonBlocking() and nsCacheEntryDescriptor::nsInputStreamWrapper::IsNonBlocking(). But if we ever change it, we would need to doom the entry too, because the data wasn't flushed properly. Or am I wrong?


>      if (!mClosed) {
>          mClosed = true;
>          // tell parent streamIO we are closing
> -        mStreamIO->CloseOutputStream(this);
> +        return mStreamIO->CloseOutputStream(this);
>      }
>  
>      mozilla::Telemetry::ID id;
>      if (NS_IsMainThread())

This disables the telemetry. The same problem is in nsDiskCacheOutputStream::CloseInternal().
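
One way to keep the telemetry while still propagating the close result might be (just a sketch against the quoted hunk, not the actual follow-up patch):

    nsresult rv = NS_OK;
    if (!mClosed) {
        mClosed = true;
        // tell parent streamIO we are closing
        rv = mStreamIO->CloseOutputStream(this);
    }

    // ... record telemetry exactly as before ...

    return rv;   // propagate the close result only after telemetry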


>      if (mBinding && mOutStream)
> -        Flush();
> +        // This method is only called from nsDiskCacheBinding-dtor
> +        return Flush();

Although this is true, I'd prefer to null out mBinding (at least in case of successful flush).
Attachment #590702 - Flags: review?(michal.novotny) → review-
(In reply to Michal Novotny (:michal) from comment #26)
> Dooming the entry doesn't remove any active descriptor (though there
> shouldn't be any other descriptor since we had opened output) or request. So
> you IMO can't return false here.

Agreed.

> Cache streams never return NS_BASE_STREAM_WOULD_BLOCK, see
> nsCacheEntryDescriptor::nsOutputStreamWrapper::IsNonBlocking() and
> nsCacheEntryDescriptor::nsInputStreamWrapper::IsNonBlocking(). But if we
> ever change it, we would need to doom the entry too, because the data wasn't
> flushed properly. Or am I wrong?

You are probably right... the right action actually seems to be to just return true in this case, something like this:

    nsresult rv = descriptor->CloseOutput();
    if (rv == NS_BASE_STREAM_WOULD_BLOCK)
        return true;
    descriptor->ClearCacheEntry();
    PR_REMOVE_AND_INIT_LINK(descriptor);
    // Doom entry if something bad happens while closing. See bug #673543
    if (NS_FAILED(rv))
        nsCacheService::DoomEntry(this);

> This disables the telemetry. The same problem is in
> nsDiskCacheOutputStream::CloseInternal().

Ahh - good catch. Will fix.

> Although this is true, I'd prefer to null out mBinding (at least in case of
> successful flush).

I could agree with this but I think it is a different issue and should not be done here. See my recent comments to bug #648232. (If I'm missing something and nulling is necessary for this patch, please point out why.)
(In reply to Bjarne (:bjarne) from comment #27)
> I could agree with this but I think it is a different issue and should not
> be done here. See my recent comments to bug #648232. (If I'm missing
> something and nulling is necessary for this patch, please point out why.)

Hmm...  I missed your point here. Please forget this comment.
Addressed comments. Pushed to tryserver as https://tbpl.mozilla.org/?tree=Try&rev=6211c80cfb86
Attachment #590702 - Attachment is obsolete: true
Attachment #591087 - Flags: review?(michal.novotny)
First, I'd like to note that I have no theory. Your theory about the "signed int -> unsigned 32-bit int -> signed int" conversion would make sense if int were 64 bits long. But it seems that all Windows versions are IL32P64 (a.k.a. LLP64), i.e. int is always 32 bits.

> The point of this change is to avoid unsafe conversions. If you prefer to do
> it this way, that's ok with me although I don't see the rationale nor the
> benefit compared to the suggested way.

The benefit is that there is no unsigned -> signed conversion in my change. See section 6.3.1.3 paragraph 3 of the C99 standard:

6.3.1.3 Signed and unsigned integers

1) When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
2) Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
3) Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
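
For concreteness, a tiny standalone illustration of paragraphs 2 and 3 above (the value of |s| is implementation-defined; it is commonly -1 on two's-complement targets):

    #include <cstdio>
    #include <climits>

    int main() {
        unsigned int u = -1;      // paragraph 2: well-defined,
                                  // -1 + (UINT_MAX + 1) == UINT_MAX
        printf("u == UINT_MAX: %d\n", u == UINT_MAX);

        int s = 4294967295u;      // paragraph 3: UINT_MAX does not fit
                                  // in a 32-bit int
        printf("s = %d\n", s);
        return 0;
    }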
(In reply to Bjarne (:bjarne) from comment #27)
> You are probably right...  the right action seems to actually be just return
> true in this case, something like this:
> 
>     nsresult rv = descriptor->CloseOutput();
>     if (rv == NS_BASE_STREAM_WOULD_BLOCK)
>         return true;

Returning true will lead to an assertion in nsCacheEntryDescriptor::Close(). But I'm probably OK with it, because the cache streams are blocking and if we change it, we need to change the closing of entries too...
Attachment #591087 - Flags: review?(michal.novotny) → review+
Keywords: checkin-needed
(In reply to Michal Novotny (:michal) from comment #21)
> I've tried the patch on windows build but it didn't crash. Variable "last"
> is 0xffffffff and the for loop is not entered.

Yeah... I guess it adds 2^32 to -1, which sets all the bits.

(In reply to Michal Novotny (:michal) from comment #30)
> First, I'd like to note that I have no theory. Your theory about "signed int
> -> unsigned 32b int -> signed int" conversion would make sense if int were
> 64 bit long. But it seems that all windows versions are I32LP64, i.e. int is
> always 32 bit.

I am really no expert on how modern compilers work, so I'll accept this if you say so. (I also see that MSDN states that MS compilers preserve the bit-patterns in signed -> unsigned and unsigned -> signed conversions, which confirms your comment #21 and should make this safe on Win32.)

What about the unroll-idea? Could an optimizing modern compiler assume that a value assigned from an unsigned int will always be >= 0 and hence always execute the first iteration of the loop |for (int i = last; i >= 0; i--)|? (We did such things manually in the good old days - maybe compilers have learnt how to do it by now..?)

> The benefit is that there is no unsigned -> signed conversion in my change.

Well, there is a point in avoiding the wrapping, which is permitted when subtracting unsigned ints. Another way to handle this would be to follow the secure-coding guidelines from CERT:

https://www.securecoding.cert.org/confluence/display/cplusplus/INT30-CPP.+Ensure+that+unsigned+integer+operations+do+not+wrap#INT30-CPP.Ensurethatunsignedintegeroperationsdonotwrap-Subtraction

The code we have is essentially the "noncompliant code example", and we could make it compliant as they suggest there.
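Applied here, the compliant version would presumably add a precondition check before the subtraction (a sketch, assuming |last| is currently computed as |mHeader.mBucketUsage[bucketIndex] - 1|):

    PRUint32 usage = mHeader.mBucketUsage[bucketIndex];
    if (usage == 0) {
        // The record cannot be in an empty bucket; bail out instead of
        // letting the subtraction wrap to 0xffffffff.
        NS_WARNING("DeleteRecord called for an empty bucket");
        return NS_ERROR_UNEXPECTED;
    }
    PRUint32 last = usage - 1;   // guaranteed not to wrap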
https://hg.mozilla.org/mozilla-central/rev/f10313959cb0
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla12
-> default owner
Assignee: bjarne → nobody
Not only are there still crashes, the patch appears to have had zero impact.
Top crasher #29 in FF11, and still apparent in FF12. Josh - can we get somebody to look at this?
Assignee: nobody → joshmoz
Michal, can you look into this?
Assignee: joshmoz → michal.novotny
It's tracked for FF12 and we haven't had any comments in 10 days. It's a top crash that we shipped in 11, and it would be great to get it fixed in 12. It's still very high volume, so the fix from 1/26 didn't work.
I'm actively working on this. So far I have a test that reproduces an assertion from bug #691108. I believe that the cause of both bugs is the same. But the problem is that I need to run the test overnight to get the assertion, and the logs are really huge, so analyzing them is not a quick job.
Unfortunately this bug isn't a duplicate of bug #691108. Did anybody succeed in reproducing this crash? An NSPR log (nsHttp:5,cache:5) from a crashing session would be very helpful.
Attached patch fix — Splinter Review
In the end I think the problem is similar to bug #691108. It could happen that a new cache entry is activated while we are waiting for pending events on the cache IO thread in nsCacheService::Shutdown() at http://hg.mozilla.org/mozilla-central/annotate/6fe5b0271cd1/netwerk/cache/nsCacheService.cpp#l1245 and in nsCacheService::OnProfileShutdown() at http://hg.mozilla.org/mozilla-central/annotate/6fe5b0271cd1/netwerk/cache/nsCacheService.cpp#l2183. In both cases we clear mActiveEntries before calling SyncWithCacheIOThread(), and we expect that there is no active entry when SyncWithCacheIOThread() finishes. I'm not able to reproduce the crash and I don't see exactly what scenario causes it, but using a debugger I'm able to force a cache entry for a wyciwyg channel to be created while the main thread is blocked in SyncWithCacheIOThread(). Unfortunately the cache entry is also closed and removed from mActiveEntries before SyncWithCacheIOThread() finishes. But if I were able to keep a reference to the wyciwyg channel a bit longer, the cache entry wouldn't be closed immediately, and IMO that would lead to the crash that we see in this bug. The attached patch should fix this problem.
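The guard presumably has roughly this shape (all names are hypothetical; the real change is in the attached patch):

    nsresult
    nsCacheService::ActivateEntry(nsCacheRequest * request,
                                  nsCacheEntry ** result)
    {
        // While Shutdown()/OnProfileShutdown() are clearing
        // mActiveEntries and blocking in SyncWithCacheIOThread(),
        // refuse to activate new entries: an entry activated now would
        // outlive the wait and later be deactivated against an
        // already-torn-down cache map (this bug's crash).
        if (mClearingEntries)
            return NS_ERROR_NOT_AVAILABLE;

        // ... normal activation path continues here ...
    }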
Attachment #613553 - Flags: review?(jduell.mcbugs)
Comment on attachment 613553 [details] [diff] [review]
fix

Review of attachment 613553 [details] [diff] [review]:
-----------------------------------------------------------------

I'm no expert in this code, but poking around I see that ActivateEntry can already fail commonly enough (e.g. on a hash collision) that I'm assuming ProcessRequest will behave well once we've added the additional failure mode (when we're clearing entries).
Attachment #613553 - Flags: review?(jduell.mcbugs) → review+
Tryserver is green except for a few intermittent failures: https://tbpl.mozilla.org/?tree=Try&rev=9f4bbf96ea71
https://hg.mozilla.org/mozilla-central/rev/38fcd9047d39
Status: REOPENED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: mozilla12 → mozilla14
Version: 6 Branch → Trunk
It's too late in FF12's cycle to fix this top crasher (which first regressed in FF10) - please nominate for Aurora approval if you feel the fix is low-risk.
This is now in Aurora, adjusting flags accordingly.
It's #4 top crasher in 10.0.4 ESR.
[Approval Request Comment]
User impact if declined: possible crash during FF shutdown
Fix Landed on Version: mozilla14
Risk to taking this patch (and alternatives if risky): probably none
String or UUID changes made by this patch: none
Attachment #636168 - Flags: approval-mozilla-esr10?
Comment on attachment 636168 [details] [diff] [review]
unbitrotted for esr10

looks good, thanks for the updated patch.
Attachment #636168 - Flags: approval-mozilla-esr10? → approval-mozilla-esr10+
Is there something I can do to verify that this is fixed in Firefox 10.0.6esrpre? I don't see that this was ever reliably reproducible, nor can I find a testcase for it.
Whiteboard: [tbird crash] → [tbird crash][qa?]