Closed Bug 409815 Opened 12 years ago Closed 12 years ago

Download manager virus scan API call should time out after a while

Categories

(Toolkit :: Downloads API, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla1.9beta4

People

(Reporter: sdwilsh, Assigned: robarnold)

References

Details

Attachments

(2 files, 5 obsolete files)

We want the windows api call to just be dropped after a short while.  The idea is that after a minute we just assume everything is OK and set the download to a complete state.  After two minutes, we just drop the thread handling the request (but if something comes in during that time, we handle it accordingly).
Flags: blocking-firefox3?
Running this program before starting up the browser will hang the scanning thread and thus the download. Killing the program (Ctrl+C) will let the download continue with a warning in the console.
Attached patch Proposed solution v1 (obsolete) — Splinter Review
Spawns a watchdog thread to watch scans. Watchdog timeout period is 60 seconds. Scans that complete after the timeout are ignored.
Attachment #295865 - Flags: review?
Attachment #295865 - Flags: review? → review?(randjmathis)
Attachment #295865 - Flags: review?(randjmathis)
Attachment #295865 - Flags: review?(jmathies)
Comment on attachment 295539 [details]
Virus scanner that hangs on initialization

After rebooting my laptop, it seems that my av scanner no longer hangs on initialization, but just fails instantly.
Attachment #295539 - Attachment is obsolete: true
Just out of curiosity, how bad is the hanging problem? Is it specific to a particular vendor maybe?
It seems like older computers, and there was one vendor in particular.  See Bug 412204 for more details.
Hey Rob, just and FYI this patch was broken by the checkin from bug 393305. I'll try to peice it together locally to test.
Been running this for a day or so now, looks to be working well. We need an updated patch though to address the bustage before I can r+.
Attached patch unbitrotted patch (obsolete) — Splinter Review
The behavior of regular scans should not have changed since the last patch.
Attachment #295865 - Attachment is obsolete: true
Attachment #298807 - Flags: review?(jmathies)
Attachment #295865 - Flags: review?(jmathies)
Since the download manager looks for scanners at startup and not at scan time, you must run this program before you run firefox. While it is running, firefox (or even the test scanner in bug 103487) will hang when it tries to initialize the scanner. Killing the program (nicely as it asks) will allow any current and future calls to return w/an error. You will see firefox print out a warning on the console when that happens.
I'm going to resist dinging this on line lengths, as I think the 80 char limit is a bit outdated. But someone else might ding you on it, so you might want to shorten some of those up. :)

Ran this today, no problems. The code looks good. r+
Attachment #298807 - Flags: review?(jmathies) → review+
Do we need an SR to move this forward?  Or just approval?  Some folks who are really seeing issues (myself included, at times), especially with AVG Free, would like to see this in the nightlies, so we can further test.
Comment on attachment 298807 [details] [diff] [review]
unbitrotted patch

Edward or I need to review this before it can land.
Attachment #298807 - Flags: review?(sdwilsh)
Attachment #298807 - Flags: review?(edilee)
Someone should probably look at threading stuff as it relates to mozilla apps - I look at the win code but there's some functionality in here I'm not qualified to approve (yet :).
Comment on attachment 298807 [details] [diff] [review]
unbitrotted patch

>+// This destructor appeases the compiler; it would otherwise complain about an incomplete
>+// type for nsDownloadWatchdog in the instantiation of nsAutoPtr::~nsAutoPtr
>+//
>+// Plus, it's a handy location to call nsDownloadScannerWatchdog::Shutdown from
nit: end of sentence grammar please

>+  // Don't release upon timeout since the scanning thread is still live and references member variables
nit: comment line wrapping at 80 chars please.

>+PRBool
>+nsDownloadScanner::Scan::SetState(AVScanState newState, AVScanState expectedState) {
>+  PRBool gotExpectedState = PR_FALSE;
>+  EnterCriticalSection(&mStateSync);
>+  if(gotExpectedState = (mStatus == expectedState))
>+    mStatus = newState;
>+  LeaveCriticalSection(&mStateSync);
>+  return gotExpectedState;
>+}
JavaDoc style comment on this?  I'm not sure I understand why it's doing what it's doing...

>+nsDownloadScannerWatchdog::nsDownloadScannerWatchdog() 
>+  : mNewItemEvent(NULL), mQuitEvent(NULL) {
>+  InitializeCriticalSection(&mQueueSync);
>+}
Can InitializeCriticalSection fail?


>+    PRBool SetState(AVScanState newState, AVScanState expectedState);
Although previously mentioned javadoc comment should in the .h file here.

r=sdwilsh

Thanks a ton!
Attachment #298807 - Flags: review?(sdwilsh)
Attachment #298807 - Flags: review?(edilee)
Attachment #298807 - Flags: review+
URL: \
Flags: blocking-firefox3? → blocking-firefox3+
Priority: -- → P3
Whiteboard: [needs new patch][has reviews]
rob - it's my understanding that you want to land this after the IAttachmentExecute stuff, and this timeout will only be for IOfficeAttachment, or both? 
It will work for both (the entire thread's execution actually).
Attached patch Unbitrotted patch (obsolete) — Splinter Review
Changes since the last patch:
Scan::Run is invoked only once on behalf of the watchdog or the Scan itself
If the Scan cannot be initialized, it is not handed off to the watchdog

The biggest change of this patch is that Scan::mStatus is now written to by multiple threads; getting/setting this requires entering the mStateSync critical section or calling SetState which performs a compare & swap on it. This allows the scanning thread and watchdog threads to avoid trashing each others values, and to avoid extra work by discovering that the other has been executing.
Attachment #298807 - Attachment is obsolete: true
Attachment #302727 - Flags: review?(jmathies)
Attached patch Forgot to rediff (obsolete) — Splinter Review
I forgot to rediff after a minor change
Attachment #302727 - Attachment is obsolete: true
Attachment #303445 - Flags: review?(jmathies)
Attachment #302727 - Flags: review?(jmathies)
Attachment #303445 - Flags: review?(sdwilsh)
starting to look at this today...
+    mWatchdog = new nsDownloadScannerWatchdog();
+    rv = mWatchdog->Init();
...
+    // We timed out, so just release
+    ReleaseDispatcher* releaser = new ReleaseDispatcher(this);
+    NS_ADDREF(releaser);
+    NS_DispatchToMainThread(releaser);
...
+    ReleaseDispatcher* releaser = new ReleaseDispatcher(scan);
+    NS_ADDREF(releaser);

We should check the results here for failure.

+  while (0 != queueItemsLeft ||
+         (WAIT_OBJECT_0 + 1) != (waitStatus = WaitForMultipleObjects(2, waitHandles, FALSE, INFINITE)) &&
+         waitStatus != WAIT_FAILED) {

nit - line length

+    scan = reinterpret_cast<Scan*>(watchdog->mScanQueue.Pop());

Is there any chance this could return a null?

In general the code looks good. Some detailed testing with hanging scanners didn't bring to light any issues.
Comment on attachment 303445 [details] [diff] [review]
Forgot to rediff

>+ * Scan object will dispatch its run method to the main thread; this will
>+ * release the watchdog thread's addref on the Scan. Note that
Note that what?

I take it windows uses "events" instead of "signals"?  (I'm familiar with pthreads, not windows threads).

>+    if (FAILED(rv)) {
>+      mWatchdog = nsnull;
>+    }
nit: no bracing

s/SetState/SetAndCheckState/g ?

r=sdwilsh
Attachment #303445 - Flags: review?(sdwilsh) → review+
(In reply to comment #20)
> +    mWatchdog = new nsDownloadScannerWatchdog();
> +    rv = mWatchdog->Init();
> ...
> +    // We timed out, so just release
> +    ReleaseDispatcher* releaser = new ReleaseDispatcher(this);
> +    NS_ADDREF(releaser);
> +    NS_DispatchToMainThread(releaser);
> ...
> +    ReleaseDispatcher* releaser = new ReleaseDispatcher(scan);
> +    NS_ADDREF(releaser);
> 
> We should check the results here for failure.

Fixed. In the case of the releasers, we will leak memory but I wasn't sure if printing a warning would be appropriate.

> 
> +  while (0 != queueItemsLeft ||
> +         (WAIT_OBJECT_0 + 1) != (waitStatus = WaitForMultipleObjects(2,
> waitHandles, FALSE, INFINITE)) &&
> +         waitStatus != WAIT_FAILED) {
> 
> nit - line length

Fixed, but the line breaks look odd now.

> 
> +    scan = reinterpret_cast<Scan*>(watchdog->mScanQueue.Pop());
> 
> Is there any chance this could return a null?

No. The watchdog thread is the only consumer of the queue. So from the loop precondition we know that one of the following holds:
a) There were more items left on the queue last time we checked
b) waitStatus == WAIT_OBJECT_0 because it is not WAIT_OBJECT_1, WAIT_FAILED or WAIT_TIMEOUT (infinite wait time) so the mNewEvent Event was signaled which means that the a new Scan was pushed onto a previously empty queue (empty when the main thread checked meaning that the watchdog thread is somewhere between LeaveCriticalSection and the WaitForMultipleObjects in the loop and the watchdog knows that the queue is empty too). Thus the watchdog thread will Wait on mNewEvent which is signaled so it returns immediately and the watchdog thread proceeds to pop an element off a non-empty queue.

> 
> In general the code looks good. Some detailed testing with hanging scanners
> didn't bring to light any issues.
> 
Excellent. Some QA on this wouldn't hurt either.

(In reply to comment #21)
> (From update of attachment 303445 [details] [diff] [review])
> >+ * Scan object will dispatch its run method to the main thread; this will
> >+ * release the watchdog thread's addref on the Scan. Note that
> Note that what?

Hmm, either I didn't finish writing that comment or else it got lost in my Mercurial adventure. I wrote something that seemed appropriate.

> 
> I take it windows uses "events" instead of "signals"?  (I'm familiar with
> pthreads, not windows threads).

Yes, Events are signaled or non signaled. Waits on events block until the event is signaled. mNewEvent and mQuitEvent are both auto-reset events so after the Wait finishes, they are reset to the non-signaled state.

> 
> >+    if (FAILED(rv)) {
> >+      mWatchdog = nsnull;
> >+    }
> nit: no bracing

Fixed.

> 
> s/SetState/SetAndCheckState/g ?
> 

%s/SetState/CheckAndSetState/g
Attachment #303445 - Attachment is obsolete: true
Attachment #304878 - Flags: review?(sdwilsh)
Attachment #304878 - Flags: review?(jmathies)
Attachment #303445 - Flags: review?(jmathies)
Comment on attachment 304878 [details] [diff] [review]
Updated to address comments

this already has review
Attachment #304878 - Flags: review?(sdwilsh)
Attachment #304878 - Flags: review?(jmathies)
Keywords: checkin-needed
Whiteboard: [needs new patch][has reviews] → [has patch][has reviews][can land]
Checking in toolkit/components/downloads/src/nsDownloadScanner.cpp;.com@cvs.mozilla.org:/cvsroot ci -toolkit
/cvsroot/mozilla/toolkit/components/downloads/src/nsDownloadScanner.cpp,v  <--  nsDownloadScanner.cpplkit
new revision: 1.13; previous revision: 1.12
done
Checking in toolkit/components/downloads/src/nsDownloadScanner.h;
/cvsroot/mozilla/toolkit/components/downloads/src/nsDownloadScanner.h,v  <--  nsDownloadScanner.h
new revision: 1.8; previous revision: 1.7
done
Status: NEW → RESOLVED
Closed: 12 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Whiteboard: [has patch][has reviews][can land]
Target Milestone: Firefox 3 beta3 → Firefox 3 beta4
Product: Firefox → Toolkit
You need to log in before you can comment on or make changes to this bug.