Closed Bug 1288204 Opened 3 years ago Closed 3 years ago

Make HTTP cache I/O back-end sync operations be cancalable during shutdown (Windows only).

Categories

(Core :: Networking: Cache, defect)

All
Windows Vista
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla51
Tracking Status
firefox50 --- affected
firefox51 --- fixed

People

(Reporter: mayhemer, Assigned: mayhemer)

References

(Depends on 1 open bug)

Details

(Whiteboard: [necko-active])

Attachments

(1 file, 2 obsolete files)

There seems to be a way to cancel synchronous operations that block a thread:

https://msdn.microsoft.com/en-us/library/windows/hardware/Dn613954(v=vs.85).aspx

In detail:
- CancelSynchronousIo taking a single argument - a thread handle, should work for CreateFile and hopefully CloseHandle
- MoveFileWithProgress (we must update nsLocalFile::CopySingleFile) that triggers a callback that we may be used to cancel the operation ; but there is no warranty the callback will be called at all during the long hangs

We may try to play with this to get rid of our IO shutdown hangs or even in-session hangs.

I've made some testing, but it's inconclusive...

I'd start slowly with this bug, just do CancelSynchronousIo and hope to get rid of some of CreateFile/CloseHandle hangs.

CancelSynchronousIo is only available since WinVista.
Whiteboard: [necko-active]
Status: NEW → ASSIGNED
Attached patch v1 (obsolete) — Splinter Review
IO cancellation seems to work only for file opening and maybe reading and closing.  Writing cannot be canceled.  Another bad thing is that for writes it's been observed that CancelSynchronousIo() actually blocks until the write operation on the target thread finished.  Which means we have two blocked threads instead of one...  Nice...

The patch still may be worth landing.  Description and r? will come later after some more testing, but the form as is now is mostly final.
Attached patch v1.1 (obsolete) — Splinter Review
I think this is becoming a production quality patch.

I'd also add some telemetry (another patch or even a bug) to see how this works.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=0b90445993722195d592d24a163aedfd58f72f0b
Attachment #8778213 - Attachment is obsolete: true
Attachment #8778855 - Flags: review?(michal.novotny)
Comment on attachment 8778855 [details] [diff] [review]
v1.1

Review of attachment 8778855 [details] [diff] [review]:
-----------------------------------------------------------------

The windows specific code looks good to me, but maybe somebody more familiar with windows should have a look at it.

::: netwerk/cache2/CacheFileIOManager.cpp
@@ +580,5 @@
>      while (!mNotified) {
> +      mon.Wait(waitTime);
> +      if (!mNotified) {
> +        // If there is any IO blocking on the IO thread, this will
> +        // try to cancel it.  Returns no more than after two seconds.

no more -> no later ?

@@ +2760,5 @@
>    if (mShuttingDown) {
>      return NS_ERROR_NOT_INITIALIZED;
>    }
>  
> +  while (!mShuttingDown) {

This has no effect. mShuttingDown is checked just above this while loop. This code runs on Cache IO thread and mShuttingDown is set in CacheFileIOManager::ShutdownInternal() also on Cache IO thread and thus the value cannot change until we exit this method.

@@ +3322,5 @@
>    // here and set it again once we dispatch a continuation event. By doing so,
>    // we don't have to drop the flag on any possible early return.
>    mRemovingTrashDirs = false;
>  
> +  while (!mShuttingDown) {

The same as in CacheFileIOManager::OverLimitEvictionInternal() above.

::: netwerk/cache2/CacheIOThread.cpp
@@ +225,5 @@
>  , mCurrentlyExecutingLevel(0)
>  , mHasXPCOMEvents(false)
>  , mRerunCurrentEvent(false)
>  , mShutdown(false)
> +, mIOCancelableEvent(false)

it's declared as integer type

@@ +372,5 @@
>  
> +void CacheIOThread::CancelBlockingIO()
> +{
> +  // This is an attempt to cancel any blocking I/O operation taking
> +  // to long time.

too long

::: netwerk/cache2/CacheIOThread.h
@@ +119,5 @@
>    uint32_t mCurrentlyExecutingLevel;
>  
>    EventQueue mEventQueue[LAST_LEVEL];
> +  // Raised when nsIEventTarget.Dispatch() is called on this thread
> +  bool mHasXPCOMEvents;

This will IMO reintroduce bug 1157322. Or why do you think it's no longer needed?

@@ +128,3 @@
>    bool mShutdown;
> +  // If > 0 there is currently an I/O operation on the thread
> +  // that can be attempted to interrup during shutdown,

interrup -> interrupt

wouldn't be better "can be attempted to be interrupted during shutdown"?
Attachment #8778855 - Flags: review?(michal.novotny) → feedback+
(In reply to Michal Novotny (:michal) from comment #5)
> Comment on attachment 8778855 [details] [diff] [review]
> v1.1
> 
> Review of attachment 8778855 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> The windows specific code looks good to me, but maybe somebody more familiar
> with windows should have a look at it.

Not sure who would that be :D
(patches conflict)
Depends on: 1292623
Attached patch v1.2Splinter Review
some of the changes were mistakes (like removing the atomic), walked the patch more carefully this time.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=75413068ca49
Attachment #8778855 - Attachment is obsolete: true
Attachment #8779800 - Flags: review?(michal.novotny)
Attachment #8779800 - Flags: review?(michal.novotny) → review+
(land after bug 1292623)
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f56cc98fc84e
Use CancelSynchronousIo Win32 function to cancel blocking I/O on the HTTP cache thread. r=michal
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/f56cc98fc84e
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
See Also: → 1306278
See Also: → 1306282
Depends on: 1306284
You need to log in before you can comment on or make changes to this bug.