Closed Bug 1445479 Opened 2 years ago Closed 2 years ago

Crash in imgCacheValidator::RemoveProxy

Categories

(Core :: ImageLib, defect, P3, critical)

60 Branch
Unspecified
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla61
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- fixed
firefox59 --- unaffected
firefox60 --- wontfix
firefox61 --- fixed

People

(Reporter: philipp, Assigned: aosmond)

References

Details

(Keywords: crash, regression, Whiteboard: [gfx-noted])

Crash Data

Attachments

(1 file, 3 obsolete files)

This bug was filed from the Socorro interface and is
report bp-e2092def-6078-43c5-9b80-8682a0180313.
=============================================================

Top 10 frames of crashing thread:

0 XUL imgCacheValidator::RemoveProxy xpcom/ds/nsTArray.h:507
1 XUL imgRequestProxy::imgCancelRunnable::Run image/imgRequestProxy.cpp:386
2 XUL mozilla::SchedulerGroup::Runnable::Run xpcom/threads/SchedulerGroup.cpp:413
3 XUL nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1040
4 XUL NS_ProcessPendingEvents xpcom/threads/nsThreadUtils.cpp:459
5 XUL nsBaseAppShell::NativeEventCallback widget/nsBaseAppShell.cpp:98
6 XUL nsAppShell::ProcessGeckoEvents widget/cocoa/nsAppShell.mm:436
7 CoreFoundation CoreFoundation@0xaa7e0 
8 CoreFoundation CoreFoundation@0x89f0b 
9 CoreFoundation CoreFoundation@0x8942e 

=============================================================

this signature is starting to show up from macos installations in a low volume in firefox 60. the crash appears to be in a codepath last touched by bug 1383682.
Hey Andrew, anything here jump out to you?
Flags: needinfo?(aosmond)
Whiteboard: [gfx-noted]
Hm, there is one path that does not reset the validating flag in the requests. I wonder if we somehow ended up here:

https://searchfox.org/mozilla-central/rev/8fa0b32c84f924c6809c690117dbd59591f79607/image/imgLoader.cpp#2928
Assignee: nobody → aosmond
Flags: needinfo?(aosmond)
I think we should teardown the request/proxies in cases like changed in the patch, but I'm not sure if you are aware of any cases where we should do differently?
Attachment #8958913 - Flags: review?(tnikkel)
Comment on attachment 8958913 [details] [diff] [review]
0001-Bug-1445479-Ensure-we-teardown-requests-on-image-cac.patch, v1

Review of attachment 8958913 [details] [diff] [review]:
-----------------------------------------------------------------

::: image/imgLoader.cpp
@@ +2935,3 @@
>    }
> +
> +  mRequest->ContinueCancel(NS_BINDING_ABORTED);

mRequest is the existing (old) request. Why would we want to cancel it? The consumers of it (whoever might have loaded it the first time) could still be using it. Or am I missing something?
Attachment #8958913 - Flags: review?(tnikkel)
Crash Signature: [@ imgCacheValidator::RemoveProxy] → [@ imgCacheValidator::RemoveProxy] [@ imgRequestProxy::RemoveFromOwner ]
Crash Signature: [@ imgCacheValidator::RemoveProxy] [@ imgRequestProxy::RemoveFromOwner ] → [@ imgCacheValidator::RemoveProxy] [@ imgRequestProxy::RemoveFromOwner ] [@ nsTArray_Impl<T>::RemoveElement<T> | imgRequestProxy::RemoveFromOwner ]
Do we have a modified patch?
Flags: needinfo?(aosmond)
(In reply to Timothy Nikkel (:tnikkel) from comment #4)
> Comment on attachment 8958913 [details] [diff] [review]
> 0001-Bug-1445479-Ensure-we-teardown-requests-on-image-cac.patch, v1
> 
> Review of attachment 8958913 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: image/imgLoader.cpp
> @@ +2935,3 @@
> >    }
> > +
> > +  mRequest->ContinueCancel(NS_BINDING_ABORTED);
> 
> mRequest is the existing (old) request. Why would we want to cancel it? The
> consumers of it (whoever might have loaded it the first time) could still be
> using it. Or am I missing something?

At the time, my line of thinking was to put us in a consistent state in the rare cases where validation failed to complete, for some reason. But I'm inclined to agree with you now; just let them retry and hope that the problem was transient rather than persistent. Will update the patch.
Flags: needinfo?(aosmond)
Incorporate review feedback.
Attachment #8958913 - Attachment is obsolete: true
Attachment #8963250 - Flags: review?(tnikkel)
Comment on attachment 8963250 [details] [diff] [review]
0001-Bug-1445479-Ensure-we-teardown-requests-on-image-cac.patch, v2

Review of attachment 8963250 [details] [diff] [review]:
-----------------------------------------------------------------

"Now we cancel the request and notify the proxies/listeners of the error."

Where do we cancel a request in this patch?

As far as I can tell the UpdateProxies call in AbortRequest will not notify of an error, because the proxy's are pointing to the old request, which presumably has nothing wrong with it.

::: image/imgLoader.cpp
@@ +2926,5 @@
> +  AbortRequest();
> +}
> +
> +void
> +imgCacheValidator::AbortRequest()

What request gets aborted? Does the new request get aborted (if it exists) because we are the only one holding a ref to it?

@@ +2935,5 @@
> +
> +  mRequest->SetValidator(nullptr);
> +  mRequest = nullptr;
> +  mNewRequest = nullptr;
> +  UpdateProxies();

UpdateProxies has a comment saying "this is safe because we are called from OnStartRequest, an asynchronously called function". We are adding a caller from ~imgCacheValidator, which could be called from unsafe places potentially?
Patch rewrite. It should not use async notifications if called from the imgCacheValidator destructor -- since we won't be changing the owner there, I can't see any other event ordering concerns. Fixed misleading/outdated method names and comments.
Attachment #8963250 - Attachment is obsolete: true
Attachment #8963250 - Flags: review?(tnikkel)
Attachment #8965770 - Flags: review?(tnikkel)
Review ping
Flags: needinfo?(tnikkel)
Comment on attachment 8965770 [details] [diff] [review]
0001-Bug-1445479-Ensure-we-teardown-requests-on-image-cac.patch, v3

Okay the code looks good.

But I'm not convinced we want to point to the old request if we can't validate. If the network was down would we want to return the old request or return an error (like we would if we hadn't loaded the image before)?
Flags: needinfo?(tnikkel) → needinfo?(aosmond)
(In reply to Timothy Nikkel (:tnikkel) from comment #11)
> Comment on attachment 8965770 [details] [diff] [review]
> 0001-Bug-1445479-Ensure-we-teardown-requests-on-image-cac.patch, v3
> 
> Okay the code looks good.
> 
> But I'm not convinced we want to point to the old request if we can't
> validate. If the network was down would we want to return the old request or
> return an error (like we would if we hadn't loaded the image before)?

If I am understanding you correctly, then are you proposing we meet halfway between the original patch and this patch? It would change the owner for the validating imgRequestProxy objects to a new, cancelled request, and the existing, already finished loading imgRequestProxy objects are unaffected? I can get behind that.
Flags: needinfo?(aosmond)
(In reply to Andrew Osmond [:aosmond] from comment #12)
> (In reply to Timothy Nikkel (:tnikkel) from comment #11)
> > Comment on attachment 8965770 [details] [diff] [review]
> > 0001-Bug-1445479-Ensure-we-teardown-requests-on-image-cac.patch, v3
> > 
> > Okay the code looks good.
> > 
> > But I'm not convinced we want to point to the old request if we can't
> > validate. If the network was down would we want to return the old request or
> > return an error (like we would if we hadn't loaded the image before)?
> 
> If I am understanding you correctly, then are you proposing we meet halfway
> between the original patch and this patch? It would change the owner for the
> validating imgRequestProxy objects to a new, cancelled request, and the
> existing, already finished loading imgRequestProxy objects are unaffected? I
> can get behind that.

Yes, that was what I was suggesting, but I was more asking your opinion on the matter as there are arguments to be made either way.
Attachment #8967836 - Flags: review?(tnikkel) → review+
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9a1f93317997
Ensure we teardown requests on image cache validation failure paths. r=tnikkel
https://hg.mozilla.org/mozilla-central/rev/9a1f93317997
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
could you request an uplift, in case you deem the patch fit for beta? (the issue is newly showing up in 60)
Flags: needinfo?(aosmond)
Comment on attachment 8967836 [details] [diff] [review]
0001-Bug-1445479-Ensure-we-teardown-requests-on-image-cac.patch, v4

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1383682
[User impact if declined]: May experience crash in content process while browsing normally.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: Yes, no crashes observed since landing.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: Changes are limited in scope and well understood. Only behavioral change is for the case we were crashing.
[String changes made/needed]: None.
Flags: needinfo?(aosmond)
Attachment #8967836 - Flags: approval-mozilla-beta?
I think it's too late for 60 here, given the small volume.
Attachment #8967836 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Andrew would you mind requesting uplift to esr60 on this one?
Flags: needinfo?(aosmond)
Comment on attachment 8967836 [details] [diff] [review]
0001-Bug-1445479-Ensure-we-teardown-requests-on-image-cac.patch, v4

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: May experience rare crash in content process while browsing normally.
Fix Landed on Version: 61
Risk to taking this patch (and alternatives if risky): Low risk. Changes are limited in scope and well understood. Only behavioral change is for the case we were crashing.
String or UUID changes made by this patch: None.

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Flags: needinfo?(aosmond)
Attachment #8967836 - Flags: approval-mozilla-esr60?
Comment on attachment 8967836 [details] [diff] [review]
0001-Bug-1445479-Ensure-we-teardown-requests-on-image-cac.patch, v4

crash fix for 60.1esr
Attachment #8967836 - Flags: approval-mozilla-esr60? → approval-mozilla-esr60+
You need to log in before you can comment on or make changes to this bug.