Closed
Bug 1342255
Opened 8 years ago
Closed 8 years ago
Crash in NS_OutputStreamIsBuffered
Categories
(Core :: DOM: Service Workers, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla55
Tracking | Status | |
---|---|---|
firefox52 | --- | unaffected |
firefox-esr52 | --- | unaffected |
firefox53 | --- | fixed |
firefox54 | --- | fixed |
firefox55 | --- | fixed |
People
(Reporter: mccr8, Assigned: catalinb)
References
()
Details
(Keywords: crash)
Crash Data
Attachments
(1 file, 1 obsolete file)
This bug was filed from the Socorro interface and is
report bp-7add4b55-5b0e-470b-87ac-e91c22170223.
=============================================================
Null deref crash in NS_OutputStreamIsBuffered. This is not a common crash. I only see a couple of crashes, all from the same install time, all on the site https://una.im/
I was able to reproduce this crash by going to the site, then scrolling around, then closing the tab and later going to the page again: bp-c96a72da-f97b-4469-b90a-4f0a72170223
It is called like this, so I guess responseBody is null:
if (!NS_OutputStreamIsBuffered(responseBody)) {
That is set like this earlier in the method:
nsCOMPtr<nsIOutputStream> responseBody;
rv = mInterceptedChannel->GetResponseBody(getter_AddRefs(responseBody));
if (NS_WARN_IF(NS_FAILED(rv))) {
return;
}
However, the definition of GetResponseBody seems to be:
InterceptedChannelBase::GetResponseBody(nsIOutputStream** aStream)
{
NS_IF_ADDREF(*aStream = mResponseBody);
return NS_OK;
}
which means that we can return null on success.
Reporter | ||
Updated•8 years ago
|
URL: https://una.im/
Comment 1•8 years ago
|
||
Catalin, can you take a look?
Blocks: ServiceWorkers-stability
Flags: needinfo?(catalin.badea392)
Assignee | ||
Comment 2•8 years ago
|
||
I can see the crash in nightly as well.
Flags: needinfo?(catalin.badea392)
Updated•8 years ago
|
Assignee: nobody → catalin.badea392
Priority: -- → P2
Assignee | ||
Comment 3•8 years ago
|
||
Oh, I thought I assigned myself to the bug.
Assignee | ||
Comment 4•8 years ago
|
||
If the SW throws after calling respondWith(), the reset interception runnable
will race with the promise passed to respondWith(). Note, we shouldn't reset
interception on JS errors. ResetInterception will set
mInterceptedChannel->mResponseBody to null, which will cause a crash when
the promise is resolved.
This patches fixes both issues, but the WPT test is a bit of a stretch because
I'm using a setTimeout to force the crash conditions in firefox.
Attachment #8847279 -
Flags: review?(bugmail)
Comment 5•8 years ago
|
||
I think this bug ends up being a continuation of bug 1271692 in response to https://github.com/w3c/ServiceWorker/issues/896.
Blocks: 1271692
Comment 6•8 years ago
|
||
Comment on attachment 8847279 [details] [diff] [review]
Fix crash in respondWith resolved callback. Don't reset interception if the sw throws after respondWith().
Review of attachment 8847279 [details] [diff] [review]:
-----------------------------------------------------------------
Huh, so this race is extra bad because InterceptedChannel{Base,Chrome,Content} aren't quite thread-safe and so the method calls on InterceptedChannel* can interleave.
* ResumeRequest invokes ResetInterception() on the main thread (asserted by ResumeRequest, InterceptedChannel* doesn't care).
* RespondWithHandler::ResolvedCallback invokes InterceptedChannelBase::GetResponseBody() on the worker thread (asserted).
* InterceptedChannel{Base,Content,Chrome} don't use locks/monitors anywhere.
Invariant-wise, the other SW logic that might impact mResponseBody is FinishResponse::Run which (on the main thread, asserted) invokes FinishSynthesizedResponse. So for the SW logic to be safe, it's required that we ensure calls to ResetInterception() or FinishSynthesizedResponse() can't possibly overlap. The change to FetchEventRunnable::DispatchFetchEvent takes care of one of the ResumeRequest instantiation sites. The other is its Cancel() method which is mutually exclusive with WorkerRun() having run. So that's good. And FinishSynthesizedResponse happens stricly after RespondWithHandler::ResolvedCallback completes, so that's fine too.
Restating stuff:
- The new test covers :bkelly's second example from https://github.com/w3c/ServiceWorker/issues/896#issue-154045947 which bug 1271692 did not add a test for.
I'm giving you an r+, but I'm not a peer for this code, so you still need to needinfo :bkelly or some other DOM or Workers peer to defer or do their own review.
::: testing/web-platform/tests/service-workers/service-worker/resources/respond-then-throw-worker.js
@@ +1,3 @@
> +self.addEventListener('fetch', function(event) {
> + event.respondWith(new Promise(function(res) {
> + setTimeout(function() { res(new Response('intercepted')); }, 300);
As I understand the scheduling issue, without your fix, the throw("error") below would resume in a ResumeRequest being dispatched to the main thread at the conclusion of this event handler. We want to ensure the resolve(), which will be entirely serviced on the worker thread, will happen after that.
The setTimeout(300) can be replaced by a postMessage ping/pong and a setTimeout delaying the ping or the pong so that the pong happens strictly after the ResumeRequest runnable is dispatched. I'd probably put the setTimeout with the ping and the comment in here.
It's still an implementation dependent check, but I think a case can be made for it being a likely implementation error. And most importantly, it covers our crasher! The comment might go something like this:
Ensure that throwing an error from the fetch handler does not cancel the still-pending respondWith(). We delay resolving the promise by both a setTimeout and a postMessage ping-pong so that any cancellation logic has the opportunity to be propagated to the page and back to the worker.
Attachment #8847279 -
Flags: review?(bugmail) → review+
Comment 7•8 years ago
|
||
I defer to :asuth to review this bug. Thanks!
Assignee | ||
Comment 8•8 years ago
|
||
Using postMessage ping pong to delay the response.
Pushed by catalin.badea392@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9ce518ccb466
Fix crash in respondWith resolved callback. Don't reset interception if the sw throws after .respondWith(). r=asuth
Comment 10•8 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Comment 11•8 years ago
|
||
Please request Aurora/Beta approval on this when you get a chance.
status-firefox52:
--- → unaffected
status-firefox53:
--- → affected
status-firefox54:
--- → affected
status-firefox-esr52:
--- → unaffected
Flags: needinfo?(catalin.badea392)
Assignee | ||
Updated•8 years ago
|
Attachment #8847279 -
Attachment is obsolete: true
Assignee | ||
Comment 12•8 years ago
|
||
Comment on attachment 8849103 [details] [diff] [review]
Fix crash in respondWith resolved callback. Don't reset interception if the sw throws after .respondWith().
Approval Request Comment
[Feature/Bug causing the regression]: Service Worker null deref
[User impact if declined]: Crash in websites using SW
[Is this code covered by automated tests?]: Yes. WPT test
[Has the fix been verified in Nightly?]: Yes.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: No
[Why is the change risky/not risky?]: SW specific
[String changes made/needed]: none
Note when merging, the changes to the manifest file will probably not apply, just discard them and run ./mach wpt-manifest-update
Flags: needinfo?(catalin.badea392)
Attachment #8849103 -
Flags: approval-mozilla-beta?
Attachment #8849103 -
Flags: approval-mozilla-aurora?
Comment 13•8 years ago
|
||
Comment on attachment 8849103 [details] [diff] [review]
Fix crash in respondWith resolved callback. Don't reset interception if the sw throws after .respondWith().
Fix a crash related to SW. Aurora54+ & Beta53+.
Attachment #8849103 -
Flags: approval-mozilla-beta?
Attachment #8849103 -
Flags: approval-mozilla-beta+
Attachment #8849103 -
Flags: approval-mozilla-aurora?
Attachment #8849103 -
Flags: approval-mozilla-aurora+
Comment 14•8 years ago
|
||
(In reply to Cătălin Badea (:catalinb) from comment #12)
> Comment on attachment 8849103 [details] [diff] [review]
> Fix crash in respondWith resolved callback. Don't reset interception if the
> sw throws after .respondWith().
>
>
> Approval Request Comment
> [Feature/Bug causing the regression]: Service Worker null deref
> [User impact if declined]: Crash in websites using SW
> [Is this code covered by automated tests?]: Yes. WPT test
> [Has the fix been verified in Nightly?]: Yes.
> [Needs manual test from QE? If yes, steps to reproduce]: No.
> [List of other uplifts needed for the feature/fix]: none
> [Is the change risky?]: No
> [Why is the change risky/not risky?]: SW specific
> [String changes made/needed]: none
>
> Note when merging, the changes to the manifest file will probably not apply,
> just discard them and run ./mach wpt-manifest-update
yes this happened when trying to uplift this patch. Can you provide a rebased/aurora-beta patch ? Thanks!
Flags: needinfo?(catalin.badea392)
Updated•8 years ago
|
Flags: needinfo?(catalin.badea392)
Comment 15•8 years ago
|
||
uplift |
Comment 16•8 years ago
|
||
bugherder uplift |
Comment 17•8 years ago
|
||
Setting qe-verify- based on Catalin's assessment on manual testing needs (see Comment 12) and the fact that this fix has automated coverage.
Flags: qe-verify-
You need to log in
before you can comment on or make changes to this bug.
Description
•