Closed Bug 1380255 Opened 7 years ago Closed 7 years ago

Make FetchDriver thread safe

Categories

(Core :: DOM: Core & HTML, defect, P2)

55 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla58
Tracking Status
firefox-esr52 --- unaffected
firefox54 --- unaffected
firefox55 --- unaffected
firefox56 --- unaffected
firefox57 --- unaffected
firefox58 --- fixed

People

(Reporter: dragana, Assigned: bkelly)

References

Details

Attachments

(3 files, 2 obsolete files)

FetchDrive implements nsIThreadRetargetableStreamListener and declares NS_DECL_ISUPPORTS.
It should be NS_DECL_THREADSAFE_ISUPPORTS if it wants to be retargetable.
Group: core-security → dom-core-security
I found that nsWebRequestListener and nsStreamListenerWrapper have the same problem. I have open bugs for them too. I do not know if the problem is going to be triggered at all for them (it will not be triggered if one of the listeners in a chain vetoes retargeting), but anyway it should be fixed.
Why is this only a problem in FF56?  Couldn't this have triggered in non-e10s in older releases?
(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #2)
> Why is this only a problem in FF56?  Couldn't this have triggered in
> non-e10s in older releases?

I think it is connected with nsUknownDecoder somehow, which is retargetable only since 56. It was not triggered before 56 (or at least we do not know about it). Maybe I am wrong. Please change flags.
Ok, that makes sense.
Ben, can you take this or do you know if there is someone who can take it?
Flags: needinfo?(bkelly)
Andrew, can you find someone to work this?  I am just starting to dig out of PTO email.
Flags: needinfo?(bkelly) → needinfo?(overholt)
Summary: Make FetchDrive thread safe → Make FetchDriver thread safe
Sorry for the delay here. I think Catalin is probably best positioned to take this when he's back from PTO. He's got some other stuff he's trying to land but can probably parallelize and focus on this 100% when that's finished.
Flags: needinfo?(overholt) → needinfo?(catalin.badea392)
In bug 1379631, I have disabled OMT delivery for FetchDriver. Please turn it on again in this bug. That is also a work around for this bug on 56.
I am marking this unaffected for 56 as a work around landed in bug 1379631.

We can still uplift to 56 fix for this bug and turn on OMT if we want.
Assignee: nobody → catalin.badea392
Priority: -- → P2
Given comment 8, can we remove the privacy bit here?  It doesn't seem like an active security problem.
Don't think there's anything else to be done, FetchDriver is already designed
to accept OnDataAvailable off the main thread.
Attachment #8912183 - Flags: review?(amarchesini)
Attachment #8912183 - Flags: review?(amarchesini) → review+
Flags: needinfo?(catalin.badea392)
https://hg.mozilla.org/integration/mozilla-inbound/rev/22a322b57f5000dcd4a28b52d836a6dbec97eaa6
Is the bot sleeping?

Landed this without sec-approval because there was a fix landed in firefox 56 (comment #9) and esr is marked as unaffected.
(In reply to Cătălin Badea (:catalinb) from comment #12)
> Is the bot sleeping?

Pulsebot doesn't have access to sec bugs.
This shouldn't be a sec bug per comment 8.
Group: dom-core-security
(In reply to Cătălin Badea (:catalinb) from comment #14)
> Backed out due to test failures:

Are there plans to re-land this?
Flags: needinfo?(catalin.badea392)
Since Catalin is busy with child array list I'm going to steal this.
Assignee: catalin.badea392 → bkelly
Status: NEW → ASSIGNED
Flags: needinfo?(catalin.badea392)
It seems that test_ext_webrequest_responseBody.html expects to attach the to the channel and receive OnDataAvailable() calls on the main thread.  The test just happens to use fetch() here, as far as I can tell.

Kris, is there some mechanism that webrequest uses to disable channel retargeting for other code?  Maybe we are missing something in fetch()?  Or is this possibly broken for all webrequest uses where channels are retargeted to other threads?
Flags: needinfo?(kmaglione+bmo)
(In reply to Ben Kelly [:bkelly] from comment #18)
> It seems that test_ext_webrequest_responseBody.html expects to attach the to
> the channel and receive OnDataAvailable() calls on the main thread.  The
> test just happens to use fetch() here, as far as I can tell.
> 
> Kris, is there some mechanism that webrequest uses to disable channel
> retargeting for other code?  Maybe we are missing something in fetch()?  Or
> is this possibly broken for all webrequest uses where channels are
> retargeted to other threads?

It doesn't expect to receive those calls on the main thread. It generally prefers to receive data on the STS thread, but should work on any thread. The data always gets filtered through the main thread of the extension process, but the result always gets re-dispatched on whichever thread the last onDataAvailable call happens on.
Flags: needinfo?(kmaglione+bmo)
Hmm, it seems webrequest tries to support retargeting:

https://searchfox.org/mozilla-central/source/toolkit/components/extensions/webrequest/StreamFilterParent.cpp#120
https://searchfox.org/mozilla-central/source/toolkit/components/extensions/webrequest/ChannelWrapper.cpp#946

And in e10s mode it seems the StreamFilter IPC should execute on the main thread there.

Maybe this is more of a race between the stream filter and the fetch() response.text() resolving.

I'll debug tomorrow.  Sorry for the flag spam.
Actually, it looks like fetch() lost some data in there:

fetch() produced:
```
<!DOCTYPE html>\n    <html lang=\"en\">\n    <head>\n      <meta charset=\"UTF-8\">\n      <title></title>\n    </head>\n    <body>\nconsectetur adipiscing elit, <br>\nsed do eiusmod tempor incididunt ut labore et dolore magna aliqua. <br>\nUt enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. <br>\nDuis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. <br>\nExcepteur sint occaecat cupidatat non proident, <br>\nsunt in culpa qui officia deserunt mollit anim id est laborum.<br>\n\n    </body>\n    </html>\n
```

webrequest saw:
```
<!DOCTYPE html>\n    <html lang=\"en\">\n    <head>\n      <meta charset=\"UTF-8\">\n      <title></title>\n    </head>\n    <body>\nLorem ipsum dolor sit amet, <br>\nconsectetur adipiscing elit, <br>\nsed do eiusmod tempor incididunt ut labore et dolore magna aliqua. <br>\nUt enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. <br>\nDuis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. <br>\nExcepteur sint occaecat cupidatat non proident, <br>\nsunt in culpa qui officia deserunt mollit anim id est laborum.<br>\n\n    </body>\n    </html>\n
```
(In reply to Ben Kelly [:bkelly] from comment #21)
> Actually, it looks like fetch() lost some data in there:

Yeah, there's a bug for that that's on my todo list for this week. When data delivery is targetted to a thread pool other than the one that owns the StreamFilter worker, the data chunks sometimes wind up being processed out of order. I need to figure out a safe locking arrangement that avoids that problem.
Hmm, looking at a debug build, though, shows that StreamFilterParent hit an assertion:

https://treeherder.mozilla.org/logviewer.html#?job_id=133326989&repo=mozilla-inbound&lineNumber=39676

Kris, sorry to bother you again, but does that make any sense to you?  Shouldn't fetch() be in the client side here?

If this is a known issue in StreamFilter, should we perhaps mark it not safe for ThreadRetargeting?
Or is there another solution so I can land this fetch() change if this is a known issue in webrequest?
(In reply to Ben Kelly [:bkelly] from comment #23)
> Shouldn't fetch() be in the client side here?

No, that arrangement here is kind of unusual. The parent side is always in the
process that the request originated in (i.e., the process that owns the
HttpChannelChild), and the child side is always in the extension process.

The parent side used to always be in the parent process, but for complicated
reasons related to content decoders, it's really difficult to correctly filter
response data from the parent process.

> If this is a known issue in StreamFilter, should we perhaps mark it not safe
> for ThreadRetargeting?

We could, but I'd rather just fix it correctly. Like I said, it's on my todo
list for this week.
(In reply to Kris Maglione [:kmag] (long backlog; ping on IRC if you're blocked) from comment #25)
> We could, but I'd rather just fix it correctly. Like I said, it's on my todo
> list for this week.

Ok, can you make that bug block this one?
Andrea, this patch makes us only trigger the FetchDriverObserver::OnDataAvailable() callback if its actually needed to update a FetchObserver.  In addition, we only call it on the first ODA.

The goal is to remove as many main thread runnables as possible while receiving data.  Unfortunately the OnDataAvailable() code was sending an additional MT runnable for every ODA which kind of defeated the point of retargeting OMT.
Attachment #8917842 - Flags: review?(amarchesini)
Comment on attachment 8917842 [details] [diff] [review]
P2 Don't fire FetchDriverObserver::OnDataAvailable() for every FetchDriver ODA callback. r=baku

Review of attachment 8917842 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/fetch/Fetch.cpp
@@ +508,5 @@
> +bool
> +MainThreadFetchResolver::NeedOnDataAvailable()
> +{
> +  NS_ASSERT_OWNINGTHREAD(MainThreadFetchResolver);
> +  return mFetchObserver;

return !!mFetchObserver;
just to make it clear that we are returning a boolean.

@@ +715,5 @@
> +WorkerFetchResolver::NeedOnDataAvailable()
> +{
> +  AssertIsOnMainThread();
> +  MutexAutoLock lock(mPromiseProxy->Lock());
> +  return mFetchObserver;

here as well.
Attachment #8917842 - Flags: review?(amarchesini) → review+
Hmm, the try run is suggesting that maybe the webrequest test is not failing any more with fetch() OMT retargetting.
Oh, re-triggers show its clear still a problem.
Comment on attachment 8917940 [details] [diff] [review]
P3 Disable OMT retargeting of fetch() requests while being traced until webrequest can be fixed. r=kmag

Kris, I understand you have this problem on your todo list, but I'd like to avoid letting this bug sit for any longer.  I'd like to land OMT delivery now while you continue to work on the webrequest issue.

To that end, what do you think about this?  It adds a way to detect if channel tracing is being used and avoids thread retargeting in that case.

This of course disables retargeting when devtools network monitor is active as well, but that should be ok for the near term.  In the long run we will remove this restriction.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=74154c8fce2bcdd537e3054af6612c2aabc48acb
Attachment #8917940 - Flags: review?(kmaglione+bmo)
Depends on: 1405286
Try is green with P3.  I did 10 retriggers and could not reproduce the previous failure.
Comment on attachment 8917940 [details] [diff] [review]
P3 Disable OMT retargeting of fetch() requests while being traced until webrequest can be fixed. r=kmag

Review of attachment 8917940 [details] [diff] [review]:
-----------------------------------------------------------------

If we're going to do this, let's just return failure in StreamFilterParent::CheckListenerChain()
Attachment #8917940 - Flags: review?(kmaglione+bmo) → review-
Ok.  That does seem best given other consumers try to retarget channels as well.  Sorry to push for this right now, but it should be easily reversed once the problem is solved.  Thanks.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ce8a3391a351bd64f5abe196aa822622314396f5
Attachment #8918057 - Flags: review?(kmaglione+bmo)
Attachment #8917940 - Attachment is obsolete: true
Attachment #8918057 - Flags: review?(kmaglione+bmo) → review+
Pushed by bkelly@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0db1445c1db9
FetchDriver should have threadsafe refcounting. r=baku
https://hg.mozilla.org/integration/mozilla-inbound/rev/839e4b1245ca
P2 Don't fire FetchDriverObserver::OnDataAvailable() for every FetchDriver ODA callback. r=baku
https://hg.mozilla.org/integration/mozilla-inbound/rev/213cba8335fd
P3 Disable channel thread retargeting when webrequest is tracing the channel. r=kmag
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.