1320744 - HttpChannelChild should implement nsIThreadRetargetableRequest

Comment on attachment 8823211 [details] Bug 1320744 - Part 1, make refcounter of HttpChannelChild thread-safe. https://reviewboard.mozilla.org/r/101794/#review105712 the patch name should be like "make refcounter of httpchannelchild thread safe" not sure if this or the followup causes a perma leak of HttpBaseChannel. ::: netwerk/protocol/http/HttpChannelChild.h:112 (Diff revision 1) > > mozilla::ipc::IPCResult RecvNotifyTrackingProtectionDisabled() override; > mozilla::ipc::IPCResult RecvNotifyTrackingResource() override; > void FlushedForDiversion(); > > + bool SendDeletingChannel(); override? ::: netwerk/protocol/http/HttpChannelChild.cpp:205 (Diff revision 1) > -// Override nsHashPropertyBag's AddRef: we don't need thread-safe refcnt > +NS_IMPL_ADDREF_INHERITED(HttpChannelChild, HttpBaseChannel) > -NS_IMPL_ADDREF(HttpChannelChild) > > NS_IMETHODIMP_(MozExternalRefCountType) HttpChannelChild::Release() > { > - NS_PRECONDITION(0 != mRefCnt, "dup release"); > + MOZ_ASSERT(int32_t(mRefCnt) > 0, "dup release"); do this assertion on the local |count| var (>=0). ::: netwerk/protocol/http/HttpChannelChild.cpp:2848 (Diff revision 1) > } > > +bool > +HttpChannelChild::SendDeletingChannel() > +{ > + if(NS_IsMainThread()) { if ( ::: netwerk/protocol/http/HttpChannelChild.cpp:2855 (Diff revision 1) > + } > + > + nsresult rv; > + rv = NS_DispatchToMainThread( > + NewRunnableMethod(this, &HttpChannelChild::SendDeletingChannel)); > + return NS_SUCCEEDED(rv); you should assert a success here. because otherwise you leak?

Attachment #8823211 - Flags: review?(honzab.moz) → review+

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 7

•

9 years ago

mozreview-review-reply

Comment on attachment 8823211 [details] Bug 1320744 - Part 1, make refcounter of HttpChannelChild thread-safe. https://reviewboard.mozilla.org/r/101794/#review105712 I've updated the commit log and I'll fix the perma leak before check-in. > override? No, PHttpChannelChild::SendDeletingChannel is not a virtual function. > do this assertion on the local |count| var (>=0). done. > if ( done. > you should assert a success here. because otherwise you leak? done.

Comment hidden (mozreview-request)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 12

•

9 years ago

The memory leak on try is false alarm. BloatView use `NS_LOG_ADDREF` / `NS_LOG_RELEASE` to track the refcount of each object. In my patch I've changed the `HttpChannelChild::AddRef` implementation from `NS_IMPL_ADDREF` to `NS_IMPL_ADDREF_INHERITED`. Therefore, we need to introduce the same semantic as in `NS_IMPL_RELEASE_INHERITED` into our own `HttpChannelChild::Release`.

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 13

•

9 years ago

Comment on attachment 8823211 [details] Bug 1320744 - Part 1, make refcounter of HttpChannelChild thread-safe. The memory leak is fixed in this revision. See comment #12 for more detail.

Attachment #8823211 - Flags: review+ → review?(honzab.moz)

Honza Bambas (:mayhemer)

Comment 14

•

9 years ago

mozreview-review

Comment on attachment 8823212 [details] Bug 1320744 - Part 3, implement nsIThreadRetargetableRequest in HttpChannelChild. https://reviewboard.mozilla.org/r/101796/#review105714 (sorry, this was left in reviewboard on the previous version of the patch, had to publish.. not sure the comments are still current, but some probably are) before I can do a review here, please add good explanatory comments to the code and please add a description of the patch to bugzilla. it's hard to try to understand it w/o a word of clarification. thanks. ::: netwerk/protocol/http/HttpChannelChild.h:196 (Diff revision 1) > // Sets the event target for future IPC messages. Messages will either be > // directed to the TabGroup or DocGroup, depending on the LoadInfo associated > // with the channel. Should be called when a new channel is being set up, > // before the constructor message is sent to the parent. > void SetEventTarget(); > + already_AddRefed<nsIEventTarget> GetEventTargetForNetwork(); comment ::: netwerk/protocol/http/HttpChannelChild.h:287 (Diff revision 1) > // is synthesized. > bool mSuspendParentAfterSynthesizeResponse; > > + // Target thread for delivering ODA. > + nsCOMPtr<nsIEventTarget> mTargetThread; > + nsCOMPtr<nsIEventTarget> mNetworkEventTarget; and this one is? ::: netwerk/protocol/http/HttpChannelChild.cpp:726 (Diff revision 1) > > DoOnStatus(this, transportStatus); > DoOnProgress(this, progress, progressMax); > > + if (mTargetThread && NS_GetCurrentThread() != mTargetThread) { > + // Suspend channel event processing while doing OMT ODA very nice comment :)) please say why... ::: netwerk/protocol/http/HttpChannelChild.cpp:732 (Diff revision 1) > + mEventQ->Suspend(); > + > + RefPtr<HttpChannelChild> self = this; > + nsCOMPtr<nsIRunnable> resumeRunnable = NS_NewRunnableFunction([self]() { > + MOZ_ASSERT(NS_IsMainThread()); > + // Resume channel event processing on original thread. and a nice one again :D why on the original thread..?

Honza Bambas (:mayhemer)

Comment 15

•

9 years ago

mozreview-review

Comment on attachment 8823211 [details] Bug 1320744 - Part 1, make refcounter of HttpChannelChild thread-safe. https://reviewboard.mozilla.org/r/101794/#review105942 ::: netwerk/protocol/http/HttpChannelChild.cpp:208 (Diff revision 3) > NS_IMETHODIMP_(MozExternalRefCountType) HttpChannelChild::Release() > { > - NS_PRECONDITION(0 != mRefCnt, "dup release"); > - NS_ASSERT_OWNINGTHREAD(HttpChannelChild); > - --mRefCnt; > - NS_LOG_RELEASE(this, mRefCnt, "HttpChannelChild"); > + // Copied from NS_IMPL_RELEASE_INHERITED to align the refcount logging > + // with NS_IMPL_ADDREF_INHERITED. > + nsrefcnt count = HttpBaseChannel::Release(); > + NS_LOG_RELEASE(this, count, "HttpChannelChild"); 1. |this| can be no longer valid. 2. I believe the NS_LOG_RELEASE here should be removed, as HttpBaseChannel::Release() takes care of it. 3. sad that when HttpChannelChild will leak now, we won't find out believing it's HttpBaseChannel... ::: netwerk/protocol/http/HttpChannelChild.cpp:214 (Diff revision 3) > > // Normally we Send_delete in OnStopRequest, but when we need to retain the > // remote channel for security info IPDL itself holds 1 reference, so we > // Send_delete when refCnt==1. But if !mIPCOpen, then there's nobody to send > // to, so we fall through. > - if (mKeptAlive && mRefCnt == 1 && mIPCOpen) { > + if (mKeptAlive && count == 1 && mIPCOpen) { hmm... and what if: Thread 1: nsrefcnt count = HttpBaseChannel::Release(); count == 1 --CONTEXT SWITCH-- Thread 2: nsrefcnt count = HttpBaseChannel::Release(); count == 0 delete this; --CONTEXT SWITCH-- Thread 1: this->SendDeletingChannel(); BOOOOM!!!!!! the last ref is held only by IPC, but still, it may happen we get the last call on a different thread. I think the release method cannot land as is.

Attachment #8823211 - Flags: review?(honzab.moz) → review-

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 16

•

9 years ago

mozreview-review

Comment on attachment 8823211 [details] Bug 1320744 - Part 1, make refcounter of HttpChannelChild thread-safe. https://reviewboard.mozilla.org/r/101794/#review106148 ::: netwerk/protocol/http/HttpChannelChild.cpp:208 (Diff revision 3) > NS_IMETHODIMP_(MozExternalRefCountType) HttpChannelChild::Release() > { > - NS_PRECONDITION(0 != mRefCnt, "dup release"); > - NS_ASSERT_OWNINGTHREAD(HttpChannelChild); > - --mRefCnt; > - NS_LOG_RELEASE(this, mRefCnt, "HttpChannelChild"); > + // Copied from NS_IMPL_RELEASE_INHERITED to align the refcount logging > + // with NS_IMPL_ADDREF_INHERITED. > + nsrefcnt count = HttpBaseChannel::Release(); > + NS_LOG_RELEASE(this, count, "HttpChannelChild"); 1. Yes, `this` is not valid for dereference. However `NS_LOG_RELEASE` only use `this` as an id of that object. Code here is aligned with https://searchfox.org/mozilla-central/rev/c477aa8bd99278962998adba1c5e4b15a02c42c7/xpcom/glue/nsISupportsImpl.h#963 2. This `NS_LOG_RELEASE` cannot be removed if we use `NS_IMPL_ADDREF_INHERITED` to implement AddRef() because `NS_LOG_ADDREF` for 'HttpChannelChild' will be called. Therefore, `NS_LOG_RELEASE` here is necessary to keep BloatView happy. 3. In this revision, HttpChannelChild will be tracked by BloatView so we'll see both `HttpChannelChild` and `HttpBaseChannel` appeared in BloatView if leak happened. If we keep `NS_IMPL_ADDREF` instead of using the inherited implementation. We can keep the same approach to implement HttpChannelChild::Release. ::: netwerk/protocol/http/HttpChannelChild.cpp:214 (Diff revision 3) > > // Normally we Send_delete in OnStopRequest, but when we need to retain the > // remote channel for security info IPDL itself holds 1 reference, so we > // Send_delete when refCnt==1. But if !mIPCOpen, then there's nobody to send > // to, so we fall through. > - if (mKeptAlive && mRefCnt == 1 && mIPCOpen) { > + if (mKeptAlive && count == 1 && mIPCOpen) { This should never happen because in the case we need to call `SendDeletingChannel()`, the last refcount is held by the `AddIPDLReference()`. Only `ReleaseIPDLReference()` can make refcount to 0 and it means we already complete the `SendDeletingChannel -> SendDeleteSelf -> send__delete__` procedure.

Comment hidden (mozreview-request)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 19

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=e4c69f0b5f97

Comment hidden (mozreview-request)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 22

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=1f9e322545ea

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 23

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=919a069dba90

Honza Bambas (:mayhemer)

Comment 24

•

9 years ago

(didn't look at the new patch yet) (In reply to Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) from comment #16) > Comment on attachment 8823211 [details] > Bug 1320744 - Part 1, make refcounter of HttpChannelChild thread-safe. > > https://reviewboard.mozilla.org/r/101794/#review106148 > > ::: netwerk/protocol/http/HttpChannelChild.cpp:208 > (Diff revision 3) > > NS_IMETHODIMP_(MozExternalRefCountType) HttpChannelChild::Release() > > { > > - NS_PRECONDITION(0 != mRefCnt, "dup release"); > > - NS_ASSERT_OWNINGTHREAD(HttpChannelChild); > > - --mRefCnt; > > - NS_LOG_RELEASE(this, mRefCnt, "HttpChannelChild"); > > + // Copied from NS_IMPL_RELEASE_INHERITED to align the refcount logging > > + // with NS_IMPL_ADDREF_INHERITED. > > + nsrefcnt count = HttpBaseChannel::Release(); > > + NS_LOG_RELEASE(this, count, "HttpChannelChild"); > > 1. Yes, `this` is not valid for dereference. However `NS_LOG_RELEASE` only > use `this` as an id of that object. And that is an argument? No. Never pass an invalid ptr to any API! The reason is that when a new object is created in the meantime ending up with the same address as |this| on a different thread, you will screw the counter, obviously. > Code here is aligned with > https://searchfox.org/mozilla-central/rev/ > c477aa8bd99278962998adba1c5e4b15a02c42c7/xpcom/glue/nsISupportsImpl.h#963 Then that code is wrong too. > > 2. This `NS_LOG_RELEASE` cannot be removed if we use > `NS_IMPL_ADDREF_INHERITED` to implement AddRef() because `NS_LOG_ADDREF` for > 'HttpChannelChild' will be called. Therefore, `NS_LOG_RELEASE` here is > necessary to keep BloatView happy. understand. I think just doing NS_LOG_RELEASE before you call super::Release() would do. > > 3. In this revision, HttpChannelChild will be tracked by BloatView so we'll > see both `HttpChannelChild` and `HttpBaseChannel` appeared in BloatView if > leak happened. good. > > If we keep `NS_IMPL_ADDREF` instead of using the inherited implementation. > We can keep the same approach to implement HttpChannelChild::Release. I would personally prefer that, if possible. Could be confusing when the leak happens. > > ::: netwerk/protocol/http/HttpChannelChild.cpp:214 > (Diff revision 3) > > > > // Normally we Send_delete in OnStopRequest, but when we need to retain the > > // remote channel for security info IPDL itself holds 1 reference, so we > > // Send_delete when refCnt==1. But if !mIPCOpen, then there's nobody to send > > // to, so we fall through. > > - if (mKeptAlive && mRefCnt == 1 && mIPCOpen) { > > + if (mKeptAlive && count == 1 && mIPCOpen) { > > This should never happen because in the case we need to call > `SendDeletingChannel()`, the last refcount is held by the > `AddIPDLReference()`. Only `ReleaseIPDLReference()` can make refcount to 0 > and it means we already complete the `SendDeletingChannel -> SendDeleteSelf > -> send__delete__` procedure. OK, let's see if this is going to crash in release or not ;) honestly I don't much trust the code as is, but that unfortunately is also the legacy of our broken IPC referencing logic...

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Updated

•

9 years ago

Depends on: 1334068

Honza Bambas (:mayhemer)

Comment 25

•

9 years ago

mozreview-review

Comment on attachment 8823211 [details] Bug 1320744 - Part 1, make refcounter of HttpChannelChild thread-safe. https://reviewboard.mozilla.org/r/101794/#review109312 still smelling something wrong here... ::: netwerk/protocol/http/HttpChannelChild.cpp:2845 (Diff revision 5) > + return PHttpChannelChild::SendDeletingChannel(); > + } > + > + nsresult rv; > + rv = NS_DispatchToMainThread( > + NewRunnableMethod(this, &HttpChannelChild::SendDeletingChannel)); so... this runnable will addref |this|, making the refcnt == 2 and after it's dispatched, ran and released will drop the counter back to 1 leading to call of SendDeletingChannel from Release() again? I can't locally test this since this patch deps on number of other patches that are yet incomplete. ::: netwerk/protocol/http/HttpChannelChild.cpp:2848 (Diff revision 5) > + nsresult rv; > + rv = NS_DispatchToMainThread( > + NewRunnableMethod(this, &HttpChannelChild::SendDeletingChannel)); > + MOZ_ASSERT(NS_SUCCEEDED(rv)); > + > + return NS_SUCCEEDED(rv); why returning something when it's not checked anywhere?

Attachment #8823211 - Flags: review?(honzab.moz) → review-

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 26

•

9 years ago

mozreview-review

Comment on attachment 8823211 [details] Bug 1320744 - Part 1, make refcounter of HttpChannelChild thread-safe. https://reviewboard.mozilla.org/r/101794/#review111000

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 28

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=9d77be3f0485

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 29

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=5ecdc388d0d1

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 30

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=3a279e3fda98

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Updated

•

9 years ago

Blocks: PBg-HTTP

Shian-Yow Wu [:swu]

Updated

•

8 years ago

Whiteboard: [necko-next] → [necko-active]

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 31

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=e54d9d139e84

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 32

•

8 years ago

mozreview-review-reply

Comment on attachment 8823211 [details] Bug 1320744 - Part 1, make refcounter of HttpChannelChild thread-safe. https://reviewboard.mozilla.org/r/101794/#review109312 > so... this runnable will addref |this|, making the refcnt == 2 and after it's dispatched, ran and released will drop the counter back to 1 leading to call of SendDeletingChannel from Release() again? > > I can't locally test this since this patch deps on number of other patches that are yet incomplete. In `Release()`, `mKeptAlive` is used to guarantee that `SendDeletingChannel()` is only called at most once. However I'm thinking about using `NewNonOwningRunnableMethod` to avoid holding additional refcount. > why returning something when it's not checked anywhere? Fixed. I should have remove the return value when I realize this function cannot be override.

Comment hidden (mozreview-request)

Honza Bambas (:mayhemer)

Comment 35

•

8 years ago

mozreview-review

Comment on attachment 8823211 [details] Bug 1320744 - Part 1, make refcounter of HttpChannelChild thread-safe. https://reviewboard.mozilla.org/r/101794/#review115454 ::: netwerk/protocol/http/HttpChannelChild.cpp:2860 (Diff revision 6) > + DebugOnly<bool> success = PHttpChannelChild::SendDeletingChannel(); > + MOZ_ASSERT(success); > + return; > + } > + > + nsresult rv; move to the top as the first declaration in the method

Attachment #8823211 - Flags: review?(honzab.moz) → review+

Honza Bambas (:mayhemer)

Comment 36

•

8 years ago

Attached patch [for splinter] Bug 1320744 - Part 2, implement nsIThreadRetargetableRequest in HttpChannelChild. (obsolete) — Details — Splinter Review

Honza Bambas (:mayhemer)

Comment 37

•

8 years ago

Comment on attachment 8839572 [details] [diff] [review] [for splinter] Bug 1320744 - Part 2, implement nsIThreadRetargetableRequest in HttpChannelChild. Review of attachment 8839572 [details] [diff] [review]: ----------------------------------------------------------------- few questions: ::: netwerk/protocol/http/HttpChannelChild.cpp @@ +730,5 @@ > const int64_t progress = offset + count; > DoOnProgress(this, progress, progressMax); > > + if (mTargetThread && NS_GetCurrentThread() != mTargetThread) { > + // Suspend channel event processing while doing OMT ODA usual question: why? :) @@ +736,5 @@ > + > + RefPtr<HttpChannelChild> self = this; > + nsCOMPtr<nsIRunnable> resumeRunnable = NS_NewRunnableFunction([self]() { > + MOZ_ASSERT(NS_IsMainThread()); > + // Resume channel event processing on original thread. ....why? @@ +755,5 @@ > + self->DoOnDataAvailable(self, self->mListenerContext, stringStream, offset, count); > + stringStream->Close(); > + > + nsCOMPtr<nsIEventTarget> originalTarget = self->GetChannelEventTarget(); > + originalTarget->Dispatch(resumeRunnable, NS_DISPATCH_NORMAL); I'm lost a bit (lack of knowledge here). can you tell me what's the difference between mEventQ->GetTargetThread() and mTargetThread? I may have a hunch, but I want to make sure I understand. @@ +1884,5 @@ > LOG(("HttpChannelChild::Cancel [this=%p]\n", this)); > + > + nsCOMPtr<nsIEventTarget> originalTarget = GetChannelEventTarget(); > + bool isOnThread; > + originalTarget->IsOnCurrentThread(&isOnThread); Could non-main-thread mEventQ->GetTargetThread() ever become null during the time the channel is pending? Since then we would access mCanceled etc members potentially concurrently. @@ +1888,5 @@ > + originalTarget->IsOnCurrentThread(&isOnThread); > + if (!isOnThread) { > + return originalTarget->Dispatch( > + NewRunnableMethod<nsresult>(this, &HttpChannelChild::Cancel, status), > + NS_DISPATCH_NORMAL); nit: indent or leave on the same line @@ +1889,5 @@ > + if (!isOnThread) { > + return originalTarget->Dispatch( > + NewRunnableMethod<nsresult>(this, &HttpChannelChild::Cancel, status), > + NS_DISPATCH_NORMAL); > + } add a good comment why you are doing this change. if we delay cancellation by posting it, the steam listener (onstart/oda) could still be called on the target after the channel has been canceled (or, the channel's consumer thought it was). I'm not sure how big this could be a problem, but something tells me it may bring issues, hard to reproduce, hard to identify, with hard consequences. @@ +2137,5 @@ > > +already_AddRefed<nsIEventTarget> > +HttpChannelChild::GetChannelEventTarget() > +{ > + nsCOMPtr<nsIEventTarget> target = mEventQ->GetTargetThread(); where from is mEventQ->GetTargetThread() set? what is it exactly and how does it change? @@ +2909,5 @@ > + nsCOMPtr<nsIThreadRetargetableStreamListener> retargetableListener = > + do_QueryInterface(mListener, &rv); > + if (!retargetableListener || NS_FAILED(rv)) { > + NS_WARNING("Listener is not retargetable"); > + return NS_OK; this is non fatal? @@ +2915,5 @@ > + > + rv = retargetableListener->CheckListenerChain(); > + if (NS_FAILED(rv)) { > + NS_WARNING("Subsequent listeners are not retargetable"); > + return NS_OK; as well this? maybe it's correct, just checking. maybe worth a comment if there is nothing in IDL for this method (didn't look)?

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 38

•

8 years ago

Comment on attachment 8839572 [details] [diff] [review] [for splinter] Bug 1320744 - Part 2, implement nsIThreadRetargetableRequest in HttpChannelChild. Review of attachment 8839572 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/protocol/http/HttpChannelChild.cpp @@ +730,5 @@ > const int64_t progress = offset + count; > DoOnProgress(this, progress, progressMax); > > + if (mTargetThread && NS_GetCurrentThread() != mTargetThread) { > + // Suspend channel event processing while doing OMT ODA This is to avoid OnStopRequest to be processed while ODA is still sit in the event queue of the target thread. @@ +736,5 @@ > + > + RefPtr<HttpChannelChild> self = this; > + nsCOMPtr<nsIRunnable> resumeRunnable = NS_NewRunnableFunction([self]() { > + MOZ_ASSERT(NS_IsMainThread()); > + // Resume channel event processing on original thread. ChannelEventQueue will processing the pending queue right on the thread invoking Resume(). However all the channel events are designed to be processed on main thread/network EventTarget, since they will trigger OnStartRequest/OnStopRequest/OnProgress/OnStatus. Therefore, we can only process mEventQ on main thread. @@ +755,5 @@ > + self->DoOnDataAvailable(self, self->mListenerContext, stringStream, offset, count); > + stringStream->Close(); > + > + nsCOMPtr<nsIEventTarget> originalTarget = self->GetChannelEventTarget(); > + originalTarget->Dispatch(resumeRunnable, NS_DISPATCH_NORMAL); mEventQ->GetTargetThread() is the same thread/EventTarget HttpChannelChild sits on, which is main thread/network EventTargat. mTargetThread is the thread for ODA delivery, e.g. image decoder thread. @@ +1889,5 @@ > + if (!isOnThread) { > + return originalTarget->Dispatch( > + NewRunnableMethod<nsresult>(this, &HttpChannelChild::Cancel, status), > + NS_DISPATCH_NORMAL); > + } This change is made because we call `Cancel()` when mListener->OnDataAvailable return failure. In nsHttpChannel we doesn't have this behavior and I think we shouldn't close the channel because of a callback function. In that case, we can keep `Cancel` main thread only. How do you think? @@ +2137,5 @@ > > +already_AddRefed<nsIEventTarget> > +HttpChannelChild::GetChannelEventTarget() > +{ > + nsCOMPtr<nsIEventTarget> target = mEventQ->GetTargetThread(); TargetThread is assigned in https://searchfox.org/mozilla-central/rev/39e4b25a076c59d2e7820297d62319f167871449/netwerk/protocol/http/HttpChannelChild.cpp#2072. This is introduced by Quantum DOM modification in bug 1318506. @@ +2909,5 @@ > + nsCOMPtr<nsIThreadRetargetableStreamListener> retargetableListener = > + do_QueryInterface(mListener, &rv); > + if (!retargetableListener || NS_FAILED(rv)) { > + NS_WARNING("Listener is not retargetable"); > + return NS_OK; This is copied from nsHttpChannel::RetargetDeliveryTo(). There are two types of error handling in caller: 1) NS_ENSURE_SUCCESS, 2) NS_WARNING if failed. Therefore, returning NS_OK is the only choice to make current code work but hiding the warning message as a side effect. @@ +2915,5 @@ > + > + rv = retargetableListener->CheckListenerChain(); > + if (NS_FAILED(rv)) { > + NS_WARNING("Subsequent listeners are not retargetable"); > + return NS_OK; The documentation in IDL is vague. https://searchfox.org/mozilla-central/rev/39e4b25a076c59d2e7820297d62319f167871449/netwerk/base/nsIThreadRetargetableRequest.idl#29 It only states that there is no guarantee retargeting will success, but not describing how to detect it earlier. I thinking adding "@throws NS_ERROR_NOT_AVAILABLE if failed to retarget" in IDL will provide more clear error handling to both caller and implementer. We can do that in a separate bug to change both IDL and code if you also think this is the way to do it.

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 39

•

8 years ago

Comment on attachment 8839572 [details] [diff] [review] [for splinter] Bug 1320744 - Part 2, implement nsIThreadRetargetableRequest in HttpChannelChild. Review of attachment 8839572 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/protocol/http/HttpChannelChild.cpp @@ +2909,5 @@ > + nsCOMPtr<nsIThreadRetargetableStreamListener> retargetableListener = > + do_QueryInterface(mListener, &rv); > + if (!retargetableListener || NS_FAILED(rv)) { > + NS_WARNING("Listener is not retargetable"); > + return NS_OK; I somehow compare to the wrong code while writing previous comment. Here I should just return NS_ERROR_NO_INTERFACE.

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 40

•

8 years ago

mozreview-review-reply

Comment on attachment 8823211 [details] Bug 1320744 - Part 1, make refcounter of HttpChannelChild thread-safe. https://reviewboard.mozilla.org/r/101794/#review115454 > move to the top as the first declaration in the method This one is actually `DebugOnly<nsresult> rv`, so I would prefer to leave the declaration as close as the assignment. In fact I plan to rewrite these lines to: ``` DebugOnly<nsresult> rv = NS_DispatchToMainThread( NewNonOwningRunnableMethod(this, &HttpChannelChild::SendDeletingChannel)); MOZ_ASSERT(NS_SUCCEEDED(rv)); ```

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 41

•

8 years ago

Attached patch main-thread-only-cancel.patch (obsolete) — Details — Splinter Review

The major issue in previous comment is about allowing Cancel() to be called on other threads. With this patch we can avoid calling Cancel() on other threads for OMT ODA.

Attachment #8839859 - Flags: feedback?(honzab.moz)

Honza Bambas (:mayhemer)

Comment 42

•

8 years ago

(In reply to Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) from comment #38) > Comment on attachment 8839572 [details] [diff] [review] > [for splinter] Bug 1320744 - Part 2, implement nsIThreadRetargetableRequest > in HttpChannelChild. > > Review of attachment 8839572 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: netwerk/protocol/http/HttpChannelChild.cpp > @@ +730,5 @@ > > const int64_t progress = offset + count; > > DoOnProgress(this, progress, progressMax); > > > > + if (mTargetThread && NS_GetCurrentThread() != mTargetThread) { > > + // Suspend channel event processing while doing OMT ODA > > This is to avoid OnStopRequest to be processed while ODA is still sit in the > event queue of the target thread. OK, I'd like to talk about this face to face (after/during the necko meeting) to make sure I fully understand. As well for other two replies. > @@ +1889,5 @@ > > + if (!isOnThread) { > > + return originalTarget->Dispatch( > > + NewRunnableMethod<nsresult>(this, &HttpChannelChild::Cancel, status), > > + NS_DISPATCH_NORMAL); > > + } > > This change is made because we call `Cancel()` when > mListener->OnDataAvailable return failure. Which is correct, I think. > In nsHttpChannel we doesn't have > this behavior Because the result of mListener->ODA is returned by nsHttpChannel::ODA and thus goes down to the input stream pump which then cancels itself and fires only OnStopRequest with status = that error. > and I think we shouldn't close the channel because of a > callback function. We must keep to the nsIStreamListener contract. If we cancel asynchronously, I'm scared that we can get ODA calls before we finally get OnStop(error) > In that case, we can keep `Cancel` main thread only. How > do you think? The nsIStreamListener contract is: call OnStartRequest once if it throws, go to [cancel] call 0-N times OnDataAvailable if any of them throws, prevent any other call to ODA, go to [cancel] [cancel]: set status = error from the callback call OnStopRequest(status) once release references but I'm afraid that the following can happen: OnStartRequest on main thread retarget to T2 T2: ODA T2: ODA -> failure -> post cancel to main thread followed possibly with something like T2: ODA consumer: what is this??? Cancel(error) T2: ODA consumer: what is this again??? OnStopRequest(error) T2: ODA consumer: what is this again??? Not sure how bad this can be, but all those ODAs after the second one that threw must be prevented. That's my point. > > @@ +2137,5 @@ > > > > +already_AddRefed<nsIEventTarget> > > +HttpChannelChild::GetChannelEventTarget() > > +{ > > + nsCOMPtr<nsIEventTarget> target = mEventQ->GetTargetThread(); > > TargetThread is assigned in > https://searchfox.org/mozilla-central/rev/ > 39e4b25a076c59d2e7820297d62319f167871449/netwerk/protocol/http/ > HttpChannelChild.cpp#2072. This is introduced by Quantum DOM modification in > bug 1318506. Aha!! So mEventQueue->GetEventTarget() is likely to be the main thread (with prioritization), right? And nsHttpChannel->mTargetThread is the worker thread we want to retarget ODA to, right? I was confused by those two. > > @@ +2909,5 @@ > > + nsCOMPtr<nsIThreadRetargetableStreamListener> retargetableListener = > > + do_QueryInterface(mListener, &rv); > > + if (!retargetableListener || NS_FAILED(rv)) { > > + NS_WARNING("Listener is not retargetable"); > > + return NS_OK; > > This is copied from nsHttpChannel::RetargetDeliveryTo(). There are two types > of error handling in caller: 1) NS_ENSURE_SUCCESS, 2) NS_WARNING if failed. > Therefore, returning NS_OK is the only choice to make current code work but > hiding the warning message as a side effect. OK > > @@ +2915,5 @@ > > + > > + rv = retargetableListener->CheckListenerChain(); > > + if (NS_FAILED(rv)) { > > + NS_WARNING("Subsequent listeners are not retargetable"); > > + return NS_OK; > > The documentation in IDL is vague. > https://searchfox.org/mozilla-central/rev/ > 39e4b25a076c59d2e7820297d62319f167871449/netwerk/base/ > nsIThreadRetargetableRequest.idl#29 > It only states that there is no guarantee retargeting will success, but not > describing how to detect it earlier. I thinking adding "@throws > NS_ERROR_NOT_AVAILABLE if failed to retarget" in IDL will provide more clear > error handling to both caller and implementer. We can do that in a separate > bug to change both IDL and code if you also think this is the way to do it. Leave it as it is. If there will ever be a consumer needing to know if retarget is successful we may change it. (In reply to Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) from comment #40) > Comment on attachment 8823211 [details] > Bug 1320744 - Part 1, make refcounter of HttpChannelChild thread-safe. > > https://reviewboard.mozilla.org/r/101794/#review115454 > > > move to the top as the first declaration in the method > > This one is actually `DebugOnly<nsresult> rv`, Agree!

Honza Bambas (:mayhemer)

Comment 43

•

8 years ago

Comment on attachment 8839859 [details] [diff] [review] main-thread-only-cancel.patch Review of attachment 8839859 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/protocol/http/HttpChannelChild.cpp @@ +739,5 @@ > // rest. > nsCOMPtr<nsIInputStream> stringStream; > nsresult rv = NS_NewByteInputStream(getter_AddRefs(stringStream), data.get(), > + count, (needOMT ? NS_ASSIGNMENT_COPY > + : NS_ASSIGNMENT_DEPEND)); I missed this! yes! @@ +746,5 @@ > return; > } > > + if (needOMT) { > + // Suspend channel event processing while doing OMT ODA so, just add here the explanatory comment you added to bugzilla (with whatever tweaks needed). same for other places. @@ +844,5 @@ > uint64_t offset, uint32_t count) > { > LOG(("HttpChannelChild::DoOnDataAvailable [this=%p]\n", this)); > if (mCanceled) > return; here it is! is mCanceled an atomic? I think rel-acq would do. @@ +847,5 @@ > if (mCanceled) > return; > > nsresult rv = mListener->OnDataAvailable(aRequest, aContext, aStream, offset, count); > + Unused << NS_WARN_IF(NS_FAILED(rv)); totally bad! :X this has to be stored to mStatus (if there is not already). the above if (mCanceled) return; condition has to be extended to if (mCanceled || NS_FAILED(mStatus)) return; and the failure mStatus value must be preserved (not rewritten by Cancel()) and must be passed to OnStop as the status arg. and we definitely must send cancel to the parent channel with the status, yes, from here and here only, and post to the main thread when needed. or something like that.. @@ +1899,5 @@ > NS_IMETHODIMP > HttpChannelChild::Cancel(nsresult status) > { > LOG(("HttpChannelChild::Cancel [this=%p]\n", this)); > + MOZ_ASSERT(NS_IsMainThread()); We do the same in nsHttpChannel, so it's probably alright. ODA should never cancel the channel then. Not happy, but I think it's good enough for keeping this simple.

Attachment #8839859 - Flags: feedback?(honzab.moz) → feedback-

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 44

•

8 years ago

Thanks @mayheymer for the detailed explanation of the nsIStreamListener contract. I think we can prepend a CancelEvent in mEventQ when running DoOnDataAvailable on the retargeted thread. It'll be guaranteed to be processed first before other ODA/OnStopRequest since mEventQ is suspended at that time.

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 45

•

8 years ago

(In reply to Honza Bambas (:mayhemer) from comment #43) > > @@ +844,5 @@ > > uint64_t offset, uint32_t count) > > { > > LOG(("HttpChannelChild::DoOnDataAvailable [this=%p]\n", this)); > > if (mCanceled) > > return; > > here it is! is mCanceled an atomic? I think rel-acq would do. > Solely making mCanceled atomic might not be enough. We should make entire Cancel() atomic since there are more actions happened after `mCanceled = true;` in both nsHttpChannel [1] and HttpChannelChild [2]. [1] https://searchfox.org/mozilla-central/rev/b1044cf7c2000c3e75e8181e893236a940c8b6d2/netwerk/protocol/http/nsHttpChannel.cpp#5765 [2] https://searchfox.org/mozilla-central/rev/b1044cf7c2000c3e75e8181e893236a940c8b6d2/netwerk/protocol/http/HttpChannelChild.cpp#1859

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 46

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4d619d5a40bc

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 47

•

8 years ago

Attached patch http-channel-child-retargetable.patch (obsolete) — Details — Splinter Review

update according to review comment. [summary of changes] 1. HttpChannelChild implements nsIThreadRetargetableRequest 2. Dispatch ODA to retarget thread 3. Allow channel cancel on other thread preempts following ODA/OnStopRequest 4. make mCanceled atomic with rel_acq ordering.

Attachment #8839859 - Attachment is obsolete: true

Attachment #8840360 - Flags: feedback?(honzab.moz)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 48

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=9a6dec524990

Comment hidden (mozreview-request)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 51

•

8 years ago

Comment on attachment 8840360 [details] [diff] [review] http-channel-child-retargetable.patch try looks good, so move to reviewboard for continuing the review process.

Attachment #8840360 - Attachment is obsolete: true

Attachment #8840360 - Flags: feedback?(honzab.moz)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Updated

•

8 years ago

Attachment #8839572 - Attachment is obsolete: true

Honza Bambas (:mayhemer)

Comment 52

•

8 years ago

:sc, please add the latest patch (part 2) as a normal patch to bugzilla. I'm totally lost in what I see in rb. Thanks.

Flags: needinfo?(schien)

Honza Bambas (:mayhemer)

Comment 53

•

8 years ago

Ah! the name of the diff when you download it from rb remains the same among revisions - god, it's so stupid. OK, no problem. I found myself again :)

Flags: needinfo?(schien)

Honza Bambas (:mayhemer)

Comment 54

•

8 years ago

Attached patch (for spliter) Part 2, implement nsIThreadRetargetableRequest in HttpChannelChild. (obsolete) — Details — Splinter Review

Honza Bambas (:mayhemer)

Comment 55

•

8 years ago

Comment on attachment 8841550 [details] [diff] [review] (for spliter) Part 2, implement nsIThreadRetargetableRequest in HttpChannelChild. Review of attachment 8841550 [details] [diff] [review]: ----------------------------------------------------------------- I think I finally understand how the stuff works here. ::: netwerk/protocol/http/HttpChannelChild.cpp @@ +768,5 @@ > + // target to continue processing the following ODA/OnStopRequest. > + nsCOMPtr<nsIEventTarget> originalTarget = self->GetChannelEventTarget(); > + originalTarget->Dispatch(NS_NewRunnableFunction([self]() { > + MOZ_ASSERT(NS_IsMainThread()); > + self->mEventQ->Resume(); this is not a good design from the performance point of view. first, OnTransportAndData is called on the main thread. you find out the channel's ODA has been retargetted. you post this event (the outer one) to that retarget thread, while the event queue (mEventQ) is suspended, so that no other calls to HttpChannelChild::OnTransportAndData can happen (are just put to the queue). then you process ODA on the retarget thread, and only after that you post back to the main thread to resume the queue and start this dance again. So, this will not get rid of the queue starvation delays, it will double them. I think we may need to modify retargetting capability of ChannelEventQueue and use it instead. Now retargeting implementation is pretty lame in that class. My idea is to enhance the RunOrEnqueue method by one argument - an event target. Each ChannelEvent helper class object will carry the event target with it. ChannelEventQueue::FlushQueue will process the events, but before Run is called, the target thread will be checked. If not equal to the current thread, FlushQueue will re-post itself to that thread. This will ensure sequencing of all the events. Also, when there is something in the queue but the target thread passed to RunOrEnqueue is the current thread, the even must be enqueued to preserve order, obviously. Makes sense? mEventQ->RetargetDeliveryTo is used only on two places and both I believe can be easily changed to use the suggested API change. @@ +3015,5 @@ > + // any ODA/OnStopRequest callbacks. > + nsTArray<UniquePtr<ChannelEvent>> cancelEvent; > + cancelEvent.AppendElement(MakeUnique<CancelEvent>(this, aRv)); > + > + mEventQ->PrependEvents(cancelEvent); yes, this is a good solution to put in front. but with the suggested changes above this method would look just like: { nsTArray<UniquePtr<ChannelEvent>> cancelEvent; cancelEvent.AppendElement(MakeUnique<CancelEvent>(NS_GetMainThread(), this, aRv)); mEventQ->PrependEvents(cancelEvent); } nit: maybe add a method "PrependEvent" for a single event addition?

Honza Bambas (:mayhemer)

Comment 56

•

8 years ago

mozreview-review

Comment on attachment 8823212 [details] Bug 1320744 - Part 3, implement nsIThreadRetargetableRequest in HttpChannelChild. https://reviewboard.mozilla.org/r/101796/#review117136 see comment 55 (would give f+ but rb is limited even just in that simple thing...)

Attachment #8823212 - Flags: review?(honzab.moz) → review-

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 57

•

8 years ago

(In reply to Honza Bambas (:mayhemer) from comment #55) sgtm, this will allow ChannelEventQueue to continue event process on multiple threads.

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 58

•

8 years ago

The proposed change of ChannelEventQueue in comment #55 have one restriction: the producer of ChannelEvent is always on main thread. In HTTP ODA retargeting scenario, retarget thread is provided during `OnStartRequest` callback, which happens on main thread. If ODA event is produced on other thread, you'll not be guaranteed to have the retarget thread to call `RunOrEnqueue` unless the thread is blocked until `OnStartRequest` finished. This is going to cause trouble when I move content process Http to STS thread in bug 1338493. nsHttpChannel/nsInputStreamPump doesn't have this issue because nsInputStreamPump requires explicit `AsyncWait` to obtain ODA after OnStartRequest, so the retarget thread can be specified at that time. However, we don't do pulling for receiving IPC message. The solution I can think of is to store the IPC message in a separate queue and process it after OnStartRequest. In this case we can delay the `RunOrEnqueue` until the retarget thread is decided.

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Updated

•

8 years ago

Whiteboard: [necko-active] → [necko-active][PBg-HTTP-M1]

Honza Bambas (:mayhemer)

Comment 59

•

8 years ago

(In reply to Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) from comment #58) > The proposed change of ChannelEventQueue in comment #55 have one > restriction: the producer of ChannelEvent is always on main thread. > > In HTTP ODA retargeting scenario, retarget thread is provided during > `OnStartRequest` callback, which happens on main thread. If ODA event is > produced on other thread, you'll not be guaranteed to have the retarget > thread to call `RunOrEnqueue` unless the thread is blocked until > `OnStartRequest` finished. Yes! Thanks for catching this so soon. I can imagine the architecture will be as following: HttpChannelChild will have a member nsIEventTarget (say, mTargetThread) that will be set (changed?) from RetargetDeliveryTo. In RecvODA you will call mEventQ->RunOnEnqueue(mTargetThread, event). Right? So... what if the nsIEventTarget reference to RunOrEnqueue ODAs on would be your own implementation of that interface that will be assign a real thread later, in this case somewhere from OnStartRequest. like (very schematically): class MyLazyTarget : nsIEventTarget // held as HttpChannelChild { RefPtr<> mTargetThread } { mTarget; nsresult IsOnCurrentThread(bool *res) { mTarget ? mTarget->IsOnCurrentThread(res) : *res = false; return OK } FORWARD_ALL_OTHER_METHODS(mTarget->) public: SetTarget(aTarget) { mTarget = aTarget } // called from: RetargetDeliveryTo(t) { mTargetThread->SetTarget(t) } }; The ChannelEventQueue class is using (calling on) the target thread in a later stage (after you are out of OnStartRequest) anyway, so I think it could solve the problem. Also, (if not already implemented) you need to put stuff to the queue if any thread is inside of a ChannelEvent(derivate)::Run(). A re-entrant locked counter in ChannelEventQueue will be needed. When it's non-null, RunOrEnqueue will always enqueue. Makes sense? > > This is going to cause trouble when I move content process Http to STS > thread in bug 1338493. > > nsHttpChannel/nsInputStreamPump doesn't have this issue because > nsInputStreamPump requires explicit `AsyncWait` to obtain ODA after > OnStartRequest, so the retarget thread can be specified at that time. Actually, must be. > However, we don't do pulling for receiving IPC message. The solution I can > think of is to store the IPC message in a separate queue and process it > after OnStartRequest. In this case we can delay the `RunOrEnqueue` until the > retarget thread is decided. If the nsIEventTarget wrapper idea has some serious flaw, then this could do, but separate queues sounds like something rather over complicated.

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 60

•

8 years ago

(In reply to Honza Bambas (:mayhemer) from comment #59) > (In reply to Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) from comment > #58) > > The proposed change of ChannelEventQueue in comment #55 have one > > restriction: the producer of ChannelEvent is always on main thread. > > > > In HTTP ODA retargeting scenario, retarget thread is provided during > > `OnStartRequest` callback, which happens on main thread. If ODA event is > > produced on other thread, you'll not be guaranteed to have the retarget > > thread to call `RunOrEnqueue` unless the thread is blocked until > > `OnStartRequest` finished. > > Yes! Thanks for catching this so soon. > > I can imagine the architecture will be as following: > > HttpChannelChild will have a member nsIEventTarget (say, mTargetThread) that > will be set (changed?) from RetargetDeliveryTo. In RecvODA you will call > mEventQ->RunOnEnqueue(mTargetThread, event). Right? > > So... what if the nsIEventTarget reference to RunOrEnqueue ODAs on would be > your own implementation of that interface that will be assign a real thread > later, in this case somewhere from OnStartRequest. > > like (very schematically): > > class MyLazyTarget : nsIEventTarget // held as HttpChannelChild { RefPtr<> > mTargetThread } > { > mTarget; > nsresult IsOnCurrentThread(bool *res) { mTarget ? > mTarget->IsOnCurrentThread(res) : *res = false; return OK } > FORWARD_ALL_OTHER_METHODS(mTarget->) > public: > SetTarget(aTarget) { mTarget = aTarget } // called from: > RetargetDeliveryTo(t) { mTargetThread->SetTarget(t) } > }; > > The ChannelEventQueue class is using (calling on) the target thread in a > later stage (after you are out of OnStartRequest) anyway, so I think it > could solve the problem. The major flaw is that all the nsIEventTarget operation on MyLazyTarget is not guaranteed to happen on the same event target object, since `SetTarget` can be called on other thread simultaneously. This will make the event queue processing more unpredictable. We can do it the other way around: make each ChannelEvent to provide the nsIEventTarget they want. In this case, we can get the event target at the beginning of the event processing in ChannelEventQueue.h/.cpp and guarantee all the event target operations are on the same object in `RunOrEnqueue` and `FlushQueue`. > > > Also, (if not already implemented) you need to put stuff to the queue if any > thread is inside of a ChannelEvent(derivate)::Run(). A re-entrant locked > counter in ChannelEventQueue will be needed. When it's non-null, > RunOrEnqueue will always enqueue. Sure, will be included in my implementation.

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 61

•

8 years ago

Attached patch p2-channel-event-queue.patch (obsolete) — Details — Splinter Review

summary of the changes: 1. make ChannelEventQueue to be able to run, enqueue, and flush on multiple threads. 2. leverage queue suspension to restrict single event execution over multiple threads. 3. allow AutoEventEnqueuer on multiple threads.

Attachment #8841550 - Attachment is obsolete: true

Attachment #8843862 - Flags: review?(honzab.moz)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 62

•

8 years ago

Attached patch p3-retargetable-http-child.patch (obsolete) — Details — Splinter Review

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Updated

•

8 years ago

Attachment #8843863 - Flags: review?(honzab.moz)

Comment hidden (mozreview-request)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 66

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a37417fb7a00

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 67

•

8 years ago

Comment on attachment 8843862 [details] [diff] [review] p2-channel-event-queue.patch Review of attachment 8843862 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/ipc/ChannelEventQueue.cpp @@ +89,5 @@ > > if (!--mSuspendCount) { > RefPtr<Runnable> event = > + NewCancelableRunnableMethod(this, &ChannelEventQueue::CompleteResume); > + Unused << NS_WARN_IF(NS_FAILED(NS_DispatchToCurrentThread(event.forget()))); hit bug 1345319 here because Fetch API will retarget to nsStreamTransportService, which is a thread pool.

Honza Bambas (:mayhemer)

Comment 68

•

8 years ago

:sc, can you please give a proper description on the patches which part is which? like "part 1 of 4" or something. it's hard to figure out which patch to review first, second etc. Thanks.

Flags: needinfo?(schien)

Honza Bambas (:mayhemer)

Comment 69

•

8 years ago

Ah! I see it now, you made copies in spliter :D got it. Thanks.

Honza Bambas (:mayhemer)

Comment 70

•

8 years ago

I'll be doing splinter copies (if necessary) myself, you don't need to do it.

Flags: needinfo?(schien)

Honza Bambas (:mayhemer)

Comment 71

•

8 years ago

Comment on attachment 8843862 [details] [diff] [review] p2-channel-event-queue.patch Review of attachment 8843862 [details] [diff] [review]: ----------------------------------------------------------------- I'm a bit puzzled here. you were talking about problems when ondata can arrive sooner than we finish onstart so that you can't pass target as an arg to RunOrEnque. but this patch shows something else. but the patch works according my local testing. can you summarize what problems you are facing? can you file any bugs and make them blocking this one of there are any issues left to fix first? ::: netwerk/ipc/ChannelEventQueue.cpp @@ +89,5 @@ > > if (!--mSuspendCount) { > RefPtr<Runnable> event = > + NewCancelableRunnableMethod(this, &ChannelEventQueue::CompleteResume); > + Unused << NS_WARN_IF(NS_FAILED(NS_DispatchToCurrentThread(event.forget()))); Note that the dispatch instead of a direct call of CompleteResume has been introduced in bug 870564. ::: netwerk/ipc/ChannelEventQueue.h @@ +63,4 @@ > // @param aAssertionWhenNotQueued - this optional param will be used in an > // assertion when the event is executed directly. > inline void RunOrEnqueue(ChannelEvent* aCallback, > + nsIEventTarget* aTarget = nullptr, I though you wanted to let the target hang off of the event and not pass it as an arg? @@ +105,4 @@ > > uint32_t mSuspendCount; > + bool mSuspended; > + uint32_t mForcedCount; write a comment what this is @@ +168,5 @@ > ChannelEventQueue::StartForcedQueueing() > { > MutexAutoLock lock(mMutex); > mForced = true; > + ++mForcedCount; doesn't just one flag do? why do you need a bool and a counter? for mSuspend/mSuspendCount it makes sense, because those are used somewhat independently. But looking at the code, you would do with just if (!!mForcedCount) instead of if (mForced) on all places. ::: netwerk/protocol/http/HttpChannelChild.cpp @@ +305,5 @@ > { > LOG(("HttpChannelChild::RecvAssociateApplicationCache [this=%p]\n", this)); > mEventQ->RunOrEnqueue(new AssociateApplicationCacheEvent(this, groupID, > + clientID), > + mNeckoTarget); I think this should always end up on the main thread. @@ +844,5 @@ > MOZ_RELEASE_ASSERT(!mFlushedForDiversion, > "Should not be receiving any more callbacks from parent!"); > > mEventQ->RunOrEnqueue(new StopRequestEvent(this, channelStatus, timing), > + mNeckoTarget, shouldn't this always end up on the main thread too? @@ +1249,5 @@ > RefPtr<HttpChannelChild> self(this); > Send__delete__(this); > > + // XXX > + mNeckoTarget = nullptr; needs a comment. and how do you ensure this is thread-safe thing to do? @@ +1343,5 @@ > mEventQ->RunOrEnqueue(new Redirect1Event(this, registrarId, newUri, > redirectFlags, responseHead, > securityInfoSerialization, > + channelId), > + mNeckoTarget); this must happen on the main thread @@ +1498,5 @@ > mozilla::ipc::IPCResult > HttpChannelChild::RecvRedirect3Complete() > { > LOG(("HttpChannelChild::RecvRedirect3Complete [this=%p]\n", this)); > + mEventQ->RunOrEnqueue(new Redirect3Event(this), mNeckoTarget); this as well. @@ +2120,5 @@ > if (dispatcher) { > nsCOMPtr<nsIEventTarget> target = > dispatcher->EventTargetFor(TaskCategory::Network); > gNeckoChild->SetEventTargetForActor(this, target); > + mNeckoTarget = target; how are you ensuring atomicity or sequential access to this member?

Attachment #8843862 - Flags: review?(honzab.moz) → feedback+

Honza Bambas (:mayhemer)

Comment 72

•

8 years ago

Comment on attachment 8843863 [details] [diff] [review] p3-retargetable-http-child.patch Review of attachment 8843863 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/protocol/http/HttpChannelChild.cpp @@ +829,5 @@ > return; > > nsresult rv = mListener->OnDataAvailable(aRequest, aContext, aStream, offset, count); > if (NS_FAILED(rv)) { > + TryCancel(rv); could the method be named "CancelOnMainThread" ? @@ +2993,5 @@ > + nsCOMPtr<nsIThread> mainThread; > + NS_GetMainThread(getter_AddRefs(mainThread)); > + target = mainThread.forget(); > + } > + mEventQ->PrependEvents(cancelEvent, target); should use the PrependEvent method?

Attachment #8843863 - Flags: review?(honzab.moz) → review+

Honza Bambas (:mayhemer)

Comment 73

•

8 years ago

mozreview-review

Comment on attachment 8823212 [details] Bug 1320744 - Part 3, implement nsIThreadRetargetableRequest in HttpChannelChild. https://reviewboard.mozilla.org/r/101796/#review120552

Attachment #8823212 - Flags: review?(honzab.moz) → review+

Honza Bambas (:mayhemer)

Comment 74

•

8 years ago

mozreview-review

Comment on attachment 8843871 [details] Bug 1320744 - Part 2, Allow ChannelEventQueue to perform flush on multiple threads. https://reviewboard.mozilla.org/r/117488/#review120554

Attachment #8843871 - Flags: review?(honzab.moz)

Honza Bambas (:mayhemer)

Updated

•

8 years ago

Attachment #8843862 - Attachment is obsolete: true

Honza Bambas (:mayhemer)

Updated

•

8 years ago

Attachment #8843863 - Attachment is obsolete: true

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 75

•

8 years ago

Comment on attachment 8843863 [details] [diff] [review] p3-retargetable-http-child.patch Review of attachment 8843863 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/protocol/http/HttpChannelChild.cpp @@ +829,5 @@ > return; > > nsresult rv = mListener->OnDataAvailable(aRequest, aContext, aStream, offset, count); > if (NS_FAILED(rv)) { > + TryCancel(rv); Sure. That makes the code more self-explanatory. @@ +2993,5 @@ > + nsCOMPtr<nsIThread> mainThread; > + NS_GetMainThread(getter_AddRefs(mainThread)); > + target = mainThread.forget(); > + } > + mEventQ->PrependEvents(cancelEvent, target); Good catch. Totally forgot to modify this part...

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 76

•

8 years ago

Comment on attachment 8843862 [details] [diff] [review] p2-channel-event-queue.patch Review of attachment 8843862 [details] [diff] [review]: ----------------------------------------------------------------- Here is the summary of issues I encountered: 1. the main-thread-producer-only limitation I mentioned in comment 58 is not going to take effect until I start working on bug 1338493 (move HttpChannel IPC from main thread to STS thread in content process). That's why my current patch works. My first step is to make sure multi-thread queue consuming works. Next step is to decide whether to 2. the DispatchToCurrentThread-in-thread-pool issue will be triggered by current patch. Bug 1345319 is filed but I was not able to solve it by myself. In despite I'm trying to workaround it by not calling DispatchToCurrentThread in Resume. 3. While I'm trying to write the patch for the workaround, I hit the ServiceWorker+Diversion issue because the I change the timing of queue flushing for Resume. It reveals a issue that synthesized ODA/OnStopRequest is not aware by ChannelEventQueue, so the guard event `HttpFlushedForDiversionEvent` cannot guarantee all the ODA/OnStopRequest are diverted to parent. I'll explain it in more detail in next comment. ::: netwerk/ipc/ChannelEventQueue.cpp @@ +89,5 @@ > > if (!--mSuspendCount) { > RefPtr<Runnable> event = > + NewCancelableRunnableMethod(this, &ChannelEventQueue::CompleteResume); > + Unused << NS_WARN_IF(NS_FAILED(NS_DispatchToCurrentThread(event.forget()))); Thanks, good to know the background of this line of code. In my local patch I tried to workaround bug 1345319 by making a direct call to CompleteResume. I'll make sure we don't hit the original multiple suspend issue. ::: netwerk/protocol/http/HttpChannelChild.cpp @@ +305,5 @@ > { > LOG(("HttpChannelChild::RecvAssociateApplicationCache [this=%p]\n", this)); > mEventQ->RunOrEnqueue(new AssociateApplicationCacheEvent(this, groupID, > + clientID), > + mNeckoTarget); I should have added a clear comment about mNeckoTarget. This is the event target for Network events on main thread (Quantum DOM stuff, see bug 1318506). @@ +1249,5 @@ > RefPtr<HttpChannelChild> self(this); > Send__delete__(this); > > + // XXX > + mNeckoTarget = nullptr; My stupid mistake...I should restore the original comment because it is for the same purpose. mNeckoTarget is used only on main thread so no thread-safty issue (This is not the retargeted thread for ODA to be clear).

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 77

•

8 years ago

Attached image ServiceWorker_Diversion.png — Details

This diagram illustrates the issue I see in ServiceWorker diversion scenario. In current implementation, mSynthesizedResponsePump->Resume() will dispatch nsInputStreamPump::OnInputStreamReady to main thread and ChannelEventQueue::Resume will also dispatch a task back to current thread to complete resume. So the synthesized ODA/OnStopRequest (which generated by OnInputStreamReady) happens to be executed before FlushedForDiversion (which called after even queue resumption). HttpChannelParent will recv ODA/OnStopRequest/DivertComplete in correct order. If I make ChannelEventQueue not dispatching a task but call CompleteResume directly. The FlushedForDiversion will be executed first which cause DivertComplete be received before ODA/OnStopRequest. One solution is that DivertComplete should be sent after both synthesized OnStopRequest is diverted and HttpFlushedForDiversionEvent is executed in ServiceWorker scenario. IMO this is legit because the ODA/OnStopRequest is generated from content process in this scenario, DivertComplete should only be sent after all events are populated. Need @jdm and @mayhemer to take a look at my proposal.

Flags: needinfo?(josh)

Flags: needinfo?(honzab.moz)

Honza Bambas (:mayhemer)

Comment 78

•

8 years ago

Thanks for the graph! It really helped. OK, so you are calling CompleteResume() directly from Resume() to w/a bug 1345319. But it is causing reorder of events. Should we then rather invest time to fix bug 1345319? It's a bug that should be fixed anyway. I think it's not a good idea to remove the dispatch of CompleteResume() from Resume(), because it could cause other regressions we may not yet know about. This is a large change already, so we should keep most of the behavior as it was before the patch, if possible. I can help fixing bug 1345319 if you want me to.

Flags: needinfo?(honzab.moz) → needinfo?(schien)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 79

•

8 years ago

Agreed that we should preserve the behavior as much as we can. @mayhemer, it will help me a lot if you can take a look at bug 1345319, since I have no clue on what that patch might look like.

Flags: needinfo?(schien)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 80

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ef5a0c53de46

Comment hidden (mozreview-request)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 84

•

8 years ago

(In reply to Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) from comment #79) > Agreed that we should preserve the behavior as much as we can. > > @mayhemer, it will help me a lot if you can take a look at bug 1345319, > since I have no clue on what that patch might look like. Ok, I found another way to workaround bug 1345319. With the fix I mentioned in comment #60, we can now always get the correct event target and call `nsIEventTarget::Dispatch`. In this way we can avoid NS_DispatchToCurrentThread in ChannelEventQueue.

Comment hidden (mozreview-request)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 88

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=299d8f31947c82ab0f2d766b68af98130e307f1b

Comment hidden (mozreview-request)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 92

•

8 years ago

@mayhemer, you may start review my patches again. To be noticed that there are still one docshell leak/HttpChannelChild leak in browser mochitest (https://treeherder.mozilla.org/#/jobs?repo=try&revision=5da83f9fcc7f1c1a22335d8faee5540ce950deed). It is related to enabling thread retargeting in nsHTML5StreamParser, try shows green if I manually turn off HTTP retargeting in nsHTML5StreamParser (https://treeherder.mozilla.org/#/jobs?repo=try&revision=914ef087543fd82878ca11c37923602ac050d933). I'm still trying to figure out the root cause.

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Updated

•

8 years ago

Flags: needinfo?(josh)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 93

•

8 years ago

The root cause of docshell leak/HttpChannelChild leak in browser chrome tests is that DoOnStatus/DoOnProgress cannot be run after DoOnStopRequest. OnStopRequest might trigger test case finish and will trigger test framework shutdown. The async DoOnStatus/DoOnProgress might cause the delay of CC on nsDocShell. This will result fails alarm in ShutdownLeak. A workaround is to postpone OnStopRequest processing by resubmitting it to the end of the main thread event target, if thread retargeting is enabled. In this case we can guarantee that all the DoOnStatus/DoOnProgress is done before DoOnStopRequest. This workaround can be removed after bug 1338493 is complete because we'll always submit OnStopRequest from STS thread to main thread.

Comment hidden (mozreview-request)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 97

•

8 years ago

Try looks good. https://treeherder.mozilla.org/#/jobs?repo=try&revision=d0ae32e8bee1b4da8cfe17b4c984152470eed4dc

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 98

•

8 years ago

Comment on attachment 8823212 [details] Bug 1320744 - Part 3, implement nsIThreadRetargetableRequest in HttpChannelChild. part 3 should be reviewed again.

Attachment #8823212 - Flags: review+ → review?(honzab.moz)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 99

•

8 years ago

@mayhemer, any chance to provide review result today?

Flags: needinfo?(honzab.moz)

Honza Bambas (:mayhemer)

Comment 100

•

8 years ago

I'll see what I can do. Pretty busy currently. Sorry for delays.

Flags: needinfo?(honzab.moz)

Honza Bambas (:mayhemer)

Comment 101

•

8 years ago

Attached patch (for splinter review) Bug 1320744 - Part 2, Allow ChannelEventQueue to perform flush on multiple threads. — Details — Splinter Review

Honza Bambas (:mayhemer)

Comment 102

•

8 years ago

Attached patch (for splinter review) Bug 1320744 - Part 3, implement nsIThreadRetargetableRequest in HttpChannelChild. — Details — Splinter Review

Honza Bambas (:mayhemer)

Comment 103

•

8 years ago

Comment on attachment 8849952 [details] [diff] [review] (for splinter review) Bug 1320744 - Part 2, Allow ChannelEventQueue to perform flush on multiple threads. Review of attachment 8849952 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/ipc/ChannelEventQueue.h @@ +122,5 @@ > bool aAssertionWhenNotQueued) > { > MOZ_ASSERT(aCallback); > > + // Events execution could be destruction of channel (and our own 'could be a destruction of the channel' @@ +148,5 @@ > + > + nsCOMPtr<nsIEventTarget> target = event->GetEventTarget(); > + MOZ_ASSERT(target); > + > + bool isCurrentThread; init to false @@ +155,5 @@ > + > + if (!isCurrentThread) { > + SuspendInternal(); > + mEventQueue.AppendElement(Move(event)); > + ResumeInternal(); add a comment explaining why you need to suspend/resume @@ +192,5 @@ > +inline nsresult > +ChannelEventQueue::PrependEvent(UniquePtr<ChannelEvent>& aEvent) > +{ > + MutexAutoLock lock(mMutex); > + MOZ_ASSERT(mSuspended || !!mForcedCount); explain well this assertion @@ +208,5 @@ > inline nsresult > ChannelEventQueue::PrependEvents(nsTArray<UniquePtr<ChannelEvent>>& aEvents) > { > MutexAutoLock lock(mMutex); > + MOZ_ASSERT(mSuspended || !!mForcedCount); as well here. ::: netwerk/ipc/ChannelEventQueue.cpp @@ +51,5 @@ > + MOZ_ASSERT(!mFlushing); > + mFlushing = true; > + } > + > + while (true) { few objections, definitely all for followups: - what if an event (on this thread) suspends the queue? I don't see it being checked. - Suspend from a different thread is probably OK as it won't see any events until Resume (because of repost). still a bit racy, but probably OK to leave as is (the code was racy before this patch already). @@ +60,5 @@ > + > + nsCOMPtr<nsIEventTarget> target = event->GetEventTarget(); > + MOZ_ASSERT(target); > + > + bool isCurrentThread; init to false @@ +62,5 @@ > + MOZ_ASSERT(target); > + > + bool isCurrentThread; > + DebugOnly<nsresult> rv = target->IsOnCurrentThread(&isCurrentThread); > + MOZ_ASSERT(NS_SUCCEEDED(rv)); hmm.. we will just loop when this fails. could we rather DIAGNOSTIC_ASSERT(false) and either throw the event away or exec on the current thread? @@ +87,3 @@ > } > > + // We can only resume after completely exits flushing state. don't type obvious comments, say why you do this. @@ +104,5 @@ > +{ > + mMutex.AssertCurrentThreadOwns(); > + > + mSuspended = true; > + mSuspendCount++; nit: a small assert that mSuspended == !!mSuspendCount before you do the changes? @@ +138,4 @@ > > + nsCOMPtr<nsIEventTarget> target; > + target = mEventQueue[0]->GetEventTarget(); > + MOZ_ASSERT(target); can you add a comment why you are picking the target of the next event? ::: netwerk/protocol/ftp/FTPChannelChild.cpp @@ +259,5 @@ > > + already_AddRefed<nsIEventTarget> GetEventTarget() > + { > + return do_GetMainThread(); > + } aren't you missing 'override' here? and all other places you are implementing GetEventTarget()? maybe you could have a base class implementing this method specific to each module. there seems to be a lot of duplicated code among the other files too. maybe helpful could be to have class ChannelMainThreadEvent : public ChannelEvent { already_AddRefed<nsIEventTarget> GetEventTarget() override { return do_GetMainThread(); } }; in ChannelEventQueue.h and derive here (and where applicable) from it instead of ChannelEvent

Honza Bambas (:mayhemer)

Comment 104

•

8 years ago

Comment on attachment 8850522 [details] [diff] [review] (for splinter review) Bug 1320744 - Part 3, implement nsIThreadRetargetableRequest in HttpChannelChild. Review of attachment 8850522 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/protocol/http/HttpChannelChild.cpp @@ +761,3 @@ > const int64_t progressMax = mResponseHead->ContentLength(); > const int64_t progress = offset + count; > + if (NS_IsMainThread()) { blank line before this line please this whole block needs a good comment WHY it's been done. @@ +958,5 @@ > SendDivertOnStopRequest(channelStatus); > return; > } > > + // In thread retargeting is enabled, there might be DoOnStatus/DoOnProgress // If ? are references to DoOnStatus and DoOnProgress correct? I don't see events for them, just direct calls. Are you rather referring to events posted by mProgressSink? Order of OnStatus/OnProgress/OnStopRequest calls on HttpChannelChild are ensured by dispatching them via mEventQ, so they are ensured to be sequenced even among threads. Therefor, I'm not sure how we can even get call to HttpChannelChild::OnStopRequest when there are still pending some status and progress events. This should be made more clear. Unless I'm missing something (can be :)) there could be a bug in the event queue logic (part 2 patch). @@ +961,5 @@ > > + // In thread retargeting is enabled, there might be DoOnStatus/DoOnProgress > + // sit in the main thread event queue. We need to ensure OnStopRequest is > + // fired after that by resubmitting the StopRequestEvent at the end of main > + // thread event queueu. queueu @@ +3134,5 @@ > + MOZ_ASSERT(!NS_IsMainThread()); > + MOZ_ASSERT(aChild); > + } > + > + void Run() { mChild->Cancel(mRv); } assert thread, if applicable. @@ +3144,5 @@ > + return target.forget(); > + } > + > +private: > + HttpChannelChild* mChild; why isn't this strong? ::: parser/html/nsHtml5StreamParser.cpp @@ -983,5 @@ > > if (NS_FAILED(rv)) { > - // for now skip warning if we're on child process, since we don't support > - // off-main thread delivery there yet. This will change with bug 1015466 > - if (!XRE_IsContentProcess()) { :))

Honza Bambas (:mayhemer)

Comment 105

•

8 years ago

mozreview-review

Comment on attachment 8843871 [details] Bug 1320744 - Part 2, Allow ChannelEventQueue to perform flush on multiple threads. https://reviewboard.mozilla.org/r/117488/#review125564 https://bugzilla.mozilla.org/show_bug.cgi?id=1320744#c103

Attachment #8843871 - Flags: review?(honzab.moz) → review+

Honza Bambas (:mayhemer)

Comment 106

•

8 years ago

mozreview-review

Comment on attachment 8823212 [details] Bug 1320744 - Part 3, implement nsIThreadRetargetableRequest in HttpChannelChild. https://reviewboard.mozilla.org/r/101796/#review125568 https://bugzilla.mozilla.org/show_bug.cgi?id=1320744#c104 I really sincerely hate review board!

Attachment #8823212 - Flags: review?(honzab.moz) → review+

Honza Bambas (:mayhemer)

Updated

•

8 years ago

Attachment #8850522 - Attachment description: (for splinter review) Part 3, implement nsIThreadRetargetableRequest in HttpChannelChild. → (for splinter review) Bug 1320744 - Part 3, implement nsIThreadRetargetableRequest in HttpChannelChild.

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 107

•

8 years ago

Comment on attachment 8850522 [details] [diff] [review] (for splinter review) Bug 1320744 - Part 3, implement nsIThreadRetargetableRequest in HttpChannelChild. Review of attachment 8850522 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/protocol/http/HttpChannelChild.cpp @@ +761,3 @@ > const int64_t progressMax = mResponseHead->ContentLength(); > const int64_t progress = offset + count; > + if (NS_IsMainThread()) { Done. Following comment was added. // OnTransportAndData will be run on retargeted thread if applicable, however // OnStatus/OnProgress event can only be fired on main thread. We need to // dispatch the status/progress event handling back to main thread with the // appropriate event target for networking. @@ +958,5 @@ > SendDivertOnStopRequest(channelStatus); > return; > } > > + // In thread retargeting is enabled, there might be DoOnStatus/DoOnProgress Here I tried reference to the NS_NewRunnableFunction dispatched in OnTransportAndData (line 771), not the individual StatusEvent/ProgressEvent. Maybe I should use "event target" instead of "event queue" to avoid the confusion. Here is the scenario I meant: [on main thread] > RecvOnTransportAndData > - RunOrEnqueue(TransportAndDataEvent) [switch to retargeted thread] > - CompleteResume / FlushQueue > - OnTransportAndData > - dispatch runnable to main thread for DoOnStatus/DoOnProgress [switch back to main thread] > RecvOnStopRequest > - RunOrEnqueue(OnStopRequestEvent) > - OnStopRequest > - call mListener->OnStopRequest [process next runnable on main thread] > RunnableFunction::Run > - DoOnStatus / DoOnProgress > - call mListener->OnStatus / mListener->OnProgress My code here will pending the ChannelEventQueue processing back to the end of main thread event target. Thus all the RunnableFunction we generated in OnTransportAndData will be executed before mListener->OnStopRequest @@ +961,5 @@ > > + // In thread retargeting is enabled, there might be DoOnStatus/DoOnProgress > + // sit in the main thread event queue. We need to ensure OnStopRequest is > + // fired after that by resubmitting the StopRequestEvent at the end of main > + // thread event queueu. Done. @@ +3134,5 @@ > + MOZ_ASSERT(!NS_IsMainThread()); > + MOZ_ASSERT(aChild); > + } > + > + void Run() { mChild->Cancel(mRv); } HttpChannelChild::Cancel already have thread assertion and looks like a convention in HttpChannelChild to not add extra thread assertion (assertion in callee). Though I can always add it here. @@ +3144,5 @@ > + return target.forget(); > + } > + > +private: > + HttpChannelChild* mChild; This is a convention of the ChannelEvent to avoid reference cycle (HttpChannelChild -> ChannelEventQueue -> ChannelEvent). Otherwise HttpChannelChild dtor will never be called if there is a ChannelEvent in queue, additional ChannelEventQueue::Clear will be needed and called all over the place when we abort HttpChannelChild.

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 108

•

8 years ago

Comment on attachment 8849952 [details] [diff] [review] (for splinter review) Bug 1320744 - Part 2, Allow ChannelEventQueue to perform flush on multiple threads. Review of attachment 8849952 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/ipc/ChannelEventQueue.h @@ +122,5 @@ > bool aAssertionWhenNotQueued) > { > MOZ_ASSERT(aCallback); > > + // Events execution could be destruction of channel (and our own Done. @@ +148,5 @@ > + > + nsCOMPtr<nsIEventTarget> target = event->GetEventTarget(); > + MOZ_ASSERT(target); > + > + bool isCurrentThread; Done. @@ +155,5 @@ > + > + if (!isCurrentThread) { > + SuspendInternal(); > + mEventQueue.AppendElement(Move(event)); > + ResumeInternal(); Here I simply reuse the code in Resume to decide next thread to run event. Here is the comment I add: // This event cannot be run on current thread so put it in queue and // resume it on corresponding thread. @@ +192,5 @@ > +inline nsresult > +ChannelEventQueue::PrependEvent(UniquePtr<ChannelEvent>& aEvent) > +{ > + MutexAutoLock lock(mMutex); > + MOZ_ASSERT(mSuspended || !!mForcedCount); I add following comment: // Prepending event while no queue flush foreseen might cause the following // channel events not run. This assertion here guarantee there must be a // queue flush, either triggered by Resume or EndForcedQueueing, to execute // the added events. @@ +208,5 @@ > inline nsresult > ChannelEventQueue::PrependEvents(nsTArray<UniquePtr<ChannelEvent>>& aEvents) > { > MutexAutoLock lock(mMutex); > + MOZ_ASSERT(mSuspended || !!mForcedCount); same as above. ::: netwerk/ipc/ChannelEventQueue.cpp @@ +51,5 @@ > + MOZ_ASSERT(!mFlushing); > + mFlushing = true; > + } > + > + while (true) { Suspend during flushing is done by TakeEvent [1], nullptr will be return in this case. So, suspend from any thread will not stop the running channel event but flush will be aborted after that. [1] https://searchfox.org/mozilla-central/rev/c48398abd9f0f074c69f2223260939e30e8f99a8/netwerk/ipc/ChannelEventQueue.cpp#23 @@ +60,5 @@ > + > + nsCOMPtr<nsIEventTarget> target = event->GetEventTarget(); > + MOZ_ASSERT(target); > + > + bool isCurrentThread; Done. @@ +62,5 @@ > + MOZ_ASSERT(target); > + > + bool isCurrentThread; > + DebugOnly<nsresult> rv = target->IsOnCurrentThread(&isCurrentThread); > + MOZ_ASSERT(NS_SUCCEEDED(rv)); throwing away that event might be problematic. I'll add a DIAGNOSTIC_ASSERT and run it on current thread. @@ +87,3 @@ > } > > + // We can only resume after completely exits flushing state. Invoking Resume() directly in the while loop might make CompleteResume being executed while mFlushing is still true. In this case we'll not do FlushQueue because we think queue flushing is running. So, this obvious comment is the actual reason, you should not try ask for queue resumption during flush. I'm going to extend the comment as: // The flush procedure is aborted because next event cannot be run on current // thread. We need to resume the event processing right after flush procedure // is finished. // Note: we cannot call Resume() while "mFlushing == true" because // CompleteResume will not trigger FlushQueue while there is an ongoing flush. @@ +104,5 @@ > +{ > + mMutex.AssertCurrentThreadOwns(); > + > + mSuspended = true; > + mSuspendCount++; This assertion cannot hold in following case: >Suspend > - mSuspend = true, mSuspendCount = 1 >Resume > - ResumeInternal > - mSuspendCount = 0; > - dispatch CompleteResume to another thread >Suspend > - true != !!0 @@ +138,4 @@ > > + nsCOMPtr<nsIEventTarget> target; > + target = mEventQueue[0]->GetEventTarget(); > + MOZ_ASSERT(target); Add following comment: // Complete the resumption on the event target for the next event so that we // can flush the queue directly without additional event dispatching. ::: netwerk/protocol/ftp/FTPChannelChild.cpp @@ +259,5 @@ > > + already_AddRefed<nsIEventTarget> GetEventTarget() > + { > + return do_GetMainThread(); > + } We are doing necko event labeling right now (e.g. bug 1343750 for FTPChannel), so we'll see less and less ChannelEvent dispatching to main thread directly. Thus the value of adding ChannelMainThreadEvent will become lower by time. However I can still do that to minimize my patch.

Honza Bambas (:mayhemer)

Comment 109

•

8 years ago

Comment on attachment 8849952 [details] [diff] [review] (for splinter review) Bug 1320744 - Part 2, Allow ChannelEventQueue to perform flush on multiple threads. Review of attachment 8849952 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/ipc/ChannelEventQueue.h @@ +155,5 @@ > + > + if (!isCurrentThread) { > + SuspendInternal(); > + mEventQueue.AppendElement(Move(event)); > + ResumeInternal(); What I wanted was to explain why you have to bother with suspending and resuming. This just repeats what the code does (= a useless comment). @@ +192,5 @@ > +inline nsresult > +ChannelEventQueue::PrependEvent(UniquePtr<ChannelEvent>& aEvent) > +{ > + MutexAutoLock lock(mMutex); > + MOZ_ASSERT(mSuspended || !!mForcedCount); this is a good comment! thanks. ::: netwerk/ipc/ChannelEventQueue.cpp @@ +51,5 @@ > + MOZ_ASSERT(!mFlushing); > + mFlushing = true; > + } > + > + while (true) { aha! very well hidden ;) @@ +104,5 @@ > +{ > + mMutex.AssertCurrentThreadOwns(); > + > + mSuspended = true; > + mSuspendCount++; yeah, right, had to realize that. @@ +138,4 @@ > > + nsCOMPtr<nsIEventTarget> target; > + target = mEventQueue[0]->GetEventTarget(); > + MOZ_ASSERT(target); Isn't the reason just to save some loops/reposts? this looks to me simply like a shorthand. maybe just don't bother with a comment here, it's not anything critical to understand at the first look. ::: netwerk/protocol/ftp/FTPChannelChild.cpp @@ +259,5 @@ > > + already_AddRefed<nsIEventTarget> GetEventTarget() > + { > + return do_GetMainThread(); > + } OK, a good arg to not add ChannelMainThreadEvent. but still might be nice to have a base class for each module at least. up to you. work is already done.

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 110

•

8 years ago

Comment on attachment 8849952 [details] [diff] [review] (for splinter review) Bug 1320744 - Part 2, Allow ChannelEventQueue to perform flush on multiple threads. Review of attachment 8849952 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/ipc/ChannelEventQueue.h @@ +155,5 @@ > + > + if (!isCurrentThread) { > + SuspendInternal(); > + mEventQueue.AppendElement(Move(event)); > + ResumeInternal(); The reason I use suspend/resume here is to avoid create another path of triggering MayFlushQueue. Do you want me to put this down in comment? ::: netwerk/ipc/ChannelEventQueue.cpp @@ +138,4 @@ > > + nsCOMPtr<nsIEventTarget> target; > + target = mEventQueue[0]->GetEventTarget(); > + MOZ_ASSERT(target); The original code will also dispatch a runnable on the target thread, so basicly we are doing the same thing. ::: netwerk/protocol/ftp/FTPChannelChild.cpp @@ +259,5 @@ > > + already_AddRefed<nsIEventTarget> GetEventTarget() > + { > + return do_GetMainThread(); > + } Already add MainThreadChannelEvent in my latest local patch.

Comment hidden (mozreview-request)

Pulsebot

Comment 114

•

8 years ago

Pushed by schien@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/afe830d17d3e Part 1, make refcounter of HttpChannelChild thread-safe. r=mayhemer https://hg.mozilla.org/integration/autoland/rev/fe2d707ec202 Part 2, Allow ChannelEventQueue to perform flush on multiple threads. r=mayhemer https://hg.mozilla.org/integration/autoland/rev/fbe57838663a Part 3, implement nsIThreadRetargetableRequest in HttpChannelChild. r=mayhemer

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 115

•

8 years ago

Backed out for frequently failing test_bug440572.html on Windows 8 x64 opt and pgo: https://hg.mozilla.org/integration/autoland/rev/1181e6fcf33f7525f6bc414e83ba0a1f3caf5231 https://hg.mozilla.org/integration/autoland/rev/2994cb1c71365634983039c6dd04bea5e2b83919 https://hg.mozilla.org/integration/autoland/rev/99ef7d4e960de742234cc8e3f9a0ba62bc3f0f9b Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=fbe57838663a9eaf66067c30eb72ff6d51584a81&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=86442021&repo=autoland 12:52:39 INFO - TEST-START | dom/tests/mochitest/bugs/test_bug440572.html 12:52:39 INFO - TEST-INFO | started process screenshot 12:52:39 INFO - TEST-INFO | screenshot: exit 0 12:52:39 INFO - Buffered messages logged at 12:52:39 12:52:39 INFO - TEST-PASS | dom/tests/mochitest/bugs/test_bug440572.html | wrong sender! 12:52:39 INFO - TEST-PASS | dom/tests/mochitest/bugs/test_bug440572.html | wrong sender! 12:52:39 INFO - TEST-PASS | dom/tests/mochitest/bugs/test_bug440572.html | wrong sender! 12:52:39 INFO - TEST-PASS | dom/tests/mochitest/bugs/test_bug440572.html | received the right number of messages. 12:52:39 INFO - Buffered messages finished 12:52:39 INFO - TEST-UNEXPECTED-FAIL | dom/tests/mochitest/bugs/test_bug440572.html | test in frame failed. - got "SecurityError: Permission denied to access property \"content\" on cross-origin object", expected "success" 12:52:39 INFO - SimpleTest.is@SimpleTest/SimpleTest.js:310:5 12:52:39 INFO - runtests@dom/tests/mochitest/bugs/test_bug440572.html:31:3 12:52:39 INFO - onload@dom/tests/mochitest/bugs/test_bug440572.html:1:1 12:52:39 INFO - TEST-PASS | dom/tests/mochitest/bugs/test_bug440572.html | parent["content"] should be the WebIDL property of Window. 12:52:39 INFO - Not taking screenshot here: see the one that was previously logged 12:52:39 INFO - TEST-UNEXPECTED-FAIL | dom/tests/mochitest/bugs/test_bug440572.html | parent["dump"] should be the WebIDL property of Window. - didn't expect "success", but got it 12:52:39 INFO - SimpleTest.isnot@SimpleTest/SimpleTest.js:322:5 12:52:39 INFO - runtests@dom/tests/mochitest/bugs/test_bug440572.html:33:3 12:52:39 INFO - onload@dom/tests/mochitest/bugs/test_bug440572.html:1:1

Flags: needinfo?(schien)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 116

•

8 years ago

The javascript execution of three iframes in test_bug440572.html cannot be guaranteed. I'll try fix this test case.

Flags: needinfo?(schien)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 117

•

8 years ago

Let's see how it works: https://treeherder.mozilla.org/#/jobs?repo=try&revision=94a36ba3e40512f546bee1dd24a589e7ea060626

Comment hidden (mozreview-request)

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Comment 119

•

8 years ago

Try looks good (100% pass for 20 run on m-e10s chunk 3 on win8 x64 opt), waiting for @mrbkap's review for the test case modification. https://treeherder.mozilla.org/#/jobs?repo=try&revision=c0633d8ae942f590f3ccfaf219701c7b4b2936c0

Blake Kaplan (:mrbkap) (inactive)

Comment 120

•

8 years ago

mozreview-review

Comment on attachment 8851367 [details] Bug 1320744 - Part 4, Allow iframes in test_bug440572.html to be loaded out of sequence. . https://reviewboard.mozilla.org/r/123678/#review126548

Attachment #8851367 - Flags: review?(mrbkap) → review+

Pulsebot

Comment 121

•

8 years ago

Pushed by schien@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/de9bccfe404e Part 1, make refcounter of HttpChannelChild thread-safe. r=mayhemer https://hg.mozilla.org/integration/autoland/rev/5edaddffcc08 Part 2, Allow ChannelEventQueue to perform flush on multiple threads. r=mayhemer https://hg.mozilla.org/integration/autoland/rev/3774a5d55ac2 Part 3, implement nsIThreadRetargetableRequest in HttpChannelChild. r=mayhemer https://hg.mozilla.org/integration/autoland/rev/e2a697abd5d3 Part 4, Allow iframes in test_bug440572.html to be loaded out of sequence. r=mrbkap.

Carsten Book [:Tomcat]

Comment 122

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/de9bccfe404e https://hg.mozilla.org/mozilla-central/rev/5edaddffcc08 https://hg.mozilla.org/mozilla-central/rev/3774a5d55ac2 https://hg.mozilla.org/mozilla-central/rev/e2a697abd5d3

Status: NEW → RESOLVED

Closed: 8 years ago

status-firefox55: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla55

Calixte Denizet (:calixte)

Updated

•

8 years ago

Depends on: 1351558

Joel Maher ( :jmaher ) (UTC -8)

Updated

•

8 years ago

Depends on: 1353829

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 123

•

8 years ago

omg you finally did it \o/

Ryan VanderMeulen [:RyanVM]

Updated

•

8 years ago

Depends on: 1371203

Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz)

Assignee

Updated

•

8 years ago

Blocks: 1345404

Daniel Veditz [:dveditz]

Updated

•

8 years ago

Depends on: 1401459

Bug 1320744 - Part 1, make refcounter of HttpChannelChild thread-safe. 9 years ago Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) 59 bytes, text/x-review-board-request	mayhemer : review+	Details
Bug 1320744 - Part 3, implement nsIThreadRetargetableRequest in HttpChannelChild. 9 years ago Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) 59 bytes, text/x-review-board-request	mayhemer : review+	Details
[for splinter] Bug 1320744 - Part 2, implement nsIThreadRetargetableRequest in HttpChannelChild. 8 years ago Honza Bambas (:mayhemer) 13.48 KB, patch		Details \| Diff \| Splinter Review
main-thread-only-cancel.patch 8 years ago Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) 5.93 KB, patch	mayhemer : feedback-	Details \| Diff \| Splinter Review
http-channel-child-retargetable.patch 8 years ago Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) 19.85 KB, patch		Details \| Diff \| Splinter Review
(for spliter) Part 2, implement nsIThreadRetargetableRequest in HttpChannelChild. 8 years ago Honza Bambas (:mayhemer) 19.50 KB, patch		Details \| Diff \| Splinter Review
p2-channel-event-queue.patch 8 years ago Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) 29.61 KB, patch	mayhemer : feedback+	Details \| Diff \| Splinter Review
p3-retargetable-http-child.patch 8 years ago Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) 16.72 KB, patch	mayhemer : review+	Details \| Diff \| Splinter Review
Bug 1320744 - Part 2, Allow ChannelEventQueue to perform flush on multiple threads. 8 years ago Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) 59 bytes, text/x-review-board-request	mayhemer : review+	Details
ServiceWorker_Diversion.png 8 years ago Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) 105.77 KB, image/png		Details
(for splinter review) Bug 1320744 - Part 2, Allow ChannelEventQueue to perform flush on multiple threads. 8 years ago Honza Bambas (:mayhemer) 42.38 KB, patch		Details \| Diff \| Splinter Review
(for splinter review) Bug 1320744 - Part 3, implement nsIThreadRetargetableRequest in HttpChannelChild. 8 years ago Honza Bambas (:mayhemer) 22.97 KB, patch		Details \| Diff \| Splinter Review
Bug 1320744 - Part 4, Allow iframes in test_bug440572.html to be loaded out of sequence. . 8 years ago Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) 59 bytes, text/x-review-board-request	mrbkap : review+	Details