Closed Bug 915905 Opened 6 years ago Closed 6 years ago

OnDataAvailable off main thread: make nsInputStreamPump::OnInputStreamReady more atomic

Categories

(Core :: Networking, defect)

25 Branch
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla27
Tracking Status
firefox25 --- affected
firefox26 --- fixed
firefox27 --- fixed

People

(Reporter: sworkman, Assigned: sworkman)

References

Details

(Whiteboard: [qa-])

Attachments

(1 file)

While writing patches for Bug 874268 "Support OnDataAvailable and OnStopRequest off main thread for Vector/SVG image loading", I noticed that nsInputStreamPump::OnInputStreamReady can be called on multiple threads concurrently if nsInputStreamPump::mTargetThread points to a threadpool. This is a result of the changes introduced in Bug 497003 "Support sending OnDataAvailable() to other threads".

The problem is when Cancel is called in the same calling context as OnInputStreamReady, e.g. during a call to the listener's OnDataAvailable. In cases where mTargetThread is a threadpool (e.g. the imagelib decode pool), a call to mAsyncStream->AsyncWait (via EnsureWaiting) can effectively dispatch OnInputStreamReady to another thread for concurrent execution. This is not desirable as state variables in nsInputStreamPump can be overwritten, confusing callback order and execution. Instead, we want OnInputStreamReady to finish it's current callback (OnStartRequest, OnDataAvailable or OnStopRequest), and then call AsyncWait (again, via EnsureWaiting) to effectively dispatch itself to the same or a different thread.

This should be fixed before landing changes to support OnDataAvailable for imagelib code.
There should only be one iteration of nsInputStreamPump::OnInputStreamPump processing callbacks at a time, hence:
 -- a monitor (mMonitor) used to protect member variable accesses.
 -- a boolean (mProcessingCallbacks) used to prevent restarting a callback while it is still being processed (includes the callback and state updates).

Note: Both are required since the monitor must be exited during nsIStreamListener callbacks to avoid deadlock with calls to RetargetDeliveryTo for other nsInputStreamPumps; e.g. nsHttpChannel may require both mTransactionPump and mCachePump to be retargeted if the request is partially cached. In this case, when the monitor is exited, the boolen still provides protection for the callbacks and state updates.
Attachment #804587 - Flags: review?(jduell.mcbugs)
Try job from last night: Looks good:
https://tbpl.mozilla.org/?tree=Try&rev=fcd51c17afeb

Second try job, includes talos to check for performance regressions:
https://tbpl.mozilla.org/?tree=Try&rev=146133aafbac
Comment on attachment 804587 [details] [diff] [review]
v1.0 Synchronize callback processing in nsInputStreamPump::OnInputStreamReady

Review of attachment 804587 [details] [diff] [review]:
-----------------------------------------------------------------

Seems good--let's give it a spin :)

::: netwerk/base/src/nsInputStreamPump.cpp
@@ +106,5 @@
>  nsresult
>  nsInputStreamPump::PeekStream(PeekSegmentFun callback, void* closure)
>  {
> +  ReentrantMonitorAutoEnter mon(mMonitor);
> +    

every declaration of 'mon' in this patch has extra whitespace in the following blank line.

@@ +416,5 @@
> +        // RetargetDeliveryTo for other nsInputStreamPumps; e.g. nsHttpChannel
> +        // may require both mTransactionPump and mCachePump to be retargeted if
> +        // the request is partially cached. In this case, when the monitor is
> +        // exited, the boolen still provides protection for the callbacks and
> +        // state updates.

shorter comment here. How about this?

  // There should only be one iteration of this loop happening at a time. 
  // Cancel() on other thread can happen but combination of monitor and 
  // mProcessingCallbacks should prevent it from calling AsyncWait() and getting here.

@@ +469,5 @@
>              mRetargeting = true;
>          }
>  
> +        // OK to unset |mProcessingCallbacks| here because we're still in the
> +        // monitor until this iteration of the loop ends. Should be unset

// Unset mProcessingCallbacks here (while we have lock) so our own call to 
 // EnsureWaiting isn't blocked by it.
Attachment #804587 - Flags: review?(jduell.mcbugs) → review+
https://hg.mozilla.org/mozilla-central/rev/67a56c884bfe
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
Comment on attachment 804587 [details] [diff] [review]
v1.0 Synchronize callback processing in nsInputStreamPump::OnInputStreamReady

[Approval Request Comment]
Bug caused by (feature/regressing bug #):
-- Speculative fix for bug 920975 (crash in nsHtml5StreamParser::WriteStreamBytes) - see bug 920975 comment 11.

User impact if declined:
-- Continued crashes, seems to be most likely on flipora.com according to crash reports.

Testing completed (on m-c, etc.):
-- Baked on m-c for several weeks.

Risk to taking this patch (and alternatives if risky):
-- Pushing directly to beta skips baking time on aurora - increased audience may turn up another issue. No alternatives at present.

String or IDL/UUID changes made by this patch:
-- None.
Attachment #804587 - Flags: approval-mozilla-beta?
Attachment #804587 - Flags: approval-mozilla-aurora?
Attachment #804587 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Approving for Aurora but we might be too late for speculative fixed on Beta, will leave that nom up for consideration on Monday.
Has this patch made an impact on the crash on Aurora KaiRo? If we don't know, or the answer is yes, we'll take the fix for today's beta. bug 920725 wants resolution!
Flags: needinfo?(kairo)
The crash still appears on on the 20131013004004 build.

The crash rate was already low enough that we're in the noise: https://crash-analysis.mozilla.com/bsmedberg/nsHtml5StreamParser-aurora.svg

The spike at the end is probably just not having full ADU results for yesterday yet; in any case I'd say it's unlikely that this patch helped.
Flags: needinfo?(kairo)
Comment on attachment 804587 [details] [diff] [review]
v1.0 Synchronize callback processing in nsInputStreamPump::OnInputStreamReady

No need to risk an uplift in that case, sadly.
Attachment #804587 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Whiteboard: [qa-]
You need to log in before you can comment on or make changes to this bug.