Last Comment Bug 667535 - xpcom proxy use in netwerk/wifi can lead to scripted QI off main thread
: xpcom proxy use in netwerk/wifi can lead to scripted QI off main thread
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: XPCOM (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla8
Assigned To: Luke Wagner [:luke]
:
Mentors:
Depends on:
Blocks: 650411 674597
  Show dependency treegraph
 
Reported: 2011-06-27 11:23 PDT by Luke Wagner [:luke]
Modified: 2011-07-28 08:13 PDT (History)
11 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
rm xpcom/proxy use in netwerk/wifi (22.73 KB, patch)
2011-07-01 14:01 PDT, Luke Wagner [:luke]
dougt: review+
Details | Diff | Review
rm (some) xpcom/proxy use in security/manager/ssl (6.50 KB, patch)
2011-07-01 17:56 PDT, Luke Wagner [:luke]
no flags Details | Diff | Review
just fix xpcom/proxies (WIP) (8.16 KB, patch)
2011-07-08 18:27 PDT, Luke Wagner [:luke]
no flags Details | Diff | Review
just fix xpcom/proxies (9.60 KB, patch)
2011-07-11 14:22 PDT, Luke Wagner [:luke]
no flags Details | Diff | Review

Description Luke Wagner [:luke] 2011-06-27 11:23:04 PDT
nsProxyObjectManager::GetProxyForObject does a QI on the to-be-proxied nsISupports object on the caller's thread.  If the nsISupports is a scripted class (nsXPCWrappedJS), this will run JS on the caller's thread.  This QI is unsynchronized so in theory this can cause multiple threads to run in the same compartment at the same time which is a bad thing.

In theory, we could fix this by sending a synchronous runnable to execute the QI on the event target's thread.  But this would be fixing a mechanism which the Bens say we should instead be killing.  So an alternative would be to remove all places that create proxies for scripted objects.  So far assert+try have found nsNSSBadCertHandler and nsWifiMonitor::DoScan, but there may be others.
Comment 1 Luke Wagner [:luke] 2011-07-01 14:01:22 PDT
Created attachment 543506 [details] [diff] [review]
rm xpcom/proxy use in netwerk/wifi

This is a relatively straightforward patch that replaces proxy use by explicit runnables.  It actually removes 100 lines (171+, 274-) by factoring out some code.
Comment 2 Doug Turner (:dougt) 2011-07-01 14:41:46 PDT
Comment on attachment 543506 [details] [diff] [review]
rm xpcom/proxy use in netwerk/wifi

Review of attachment 543506 [details] [diff] [review]:
-----------------------------------------------------------------

This looks really nice.  Asking for some cleanup -- mostly nits -- because you are here.

::: netwerk/wifi/nsWifiMonitor.cpp
@@ +168,5 @@
> +
> +NS_IMETHODIMP nsPassErrorToWifiListeners::Run()
> +{
> +  LOG(("About to send error to the wifi listeners\n"));
> +  for (PRInt32 i = 0; i < mListeners->Count(); i++) {

nit:  mListeners is probably small, but the Count() is going to be called multiple times.  Can you use a temp, or can you rewrite this loop?  But, I suppose the existing code did the same....

@@ +170,5 @@
> +{
> +  LOG(("About to send error to the wifi listeners\n"));
> +  for (PRInt32 i = 0; i < mListeners->Count(); i++) {
> +    nsresult rv = (*mListeners)[i]->OnError(mResult);
> +    LOG( ("... sent %d\n", rv));

you can remove the nsresult rv and the LOG().  There is another bug about this generating a warning from this file.

@@ +185,5 @@
>    if (mKeepGoing && NS_FAILED(rv)) {
> +    nsAutoPtr<nsCOMArray<nsIWifiListener> > currentListeners(
> +                           new nsCOMArray<nsIWifiListener>(mListeners.Length()));
> +    if (!currentListeners)
> +      return NS_ERROR_OUT_OF_MEMORY;

how could this fail?  Not checking for the new nsCOMArray, but checking for currentListeners.  Lets just not check at all.

@@ +193,4 @@
>  
> +    nsCOMPtr<nsIThread> thread = do_GetMainThread();
> +    if (!thread)
> +      return NS_ERROR_UNEXPECTED;

does this ever return null?  Maybe we can just drop this check.

@@ +227,5 @@
> +
> +NS_IMETHODIMP nsCallWifiListeners::Run()
> +{
> +  LOG(("About to send data to the wifi listeners\n"));
> +  for (PRInt32 i = 0; i < mListeners->Count(); i++) {

same thing, lets not call Count() each time through this loop.

@@ +230,5 @@
> +  LOG(("About to send data to the wifi listeners\n"));
> +  for (PRInt32 i = 0; i < mListeners->Count(); i++) {
> +    nsresult rv = (*mListeners)[i]->OnChange(mAccessPoints->Elements(),
> +                                             mAccessPoints->Length());
> +    LOG( ("... sent %d\n", rv));

same thing, drop the rv and the LOG()

@@ +242,5 @@
> +{
> +    nsAutoPtr<nsCOMArray<nsIWifiListener> > currentListeners(
> +                           new nsCOMArray<nsIWifiListener>(mListeners.Length()));
> +    if (!currentListeners)
> +      return NS_ERROR_OUT_OF_MEMORY;

don't need the null check.

@@ +247,5 @@
> +
> +    {
> +      ReentrantMonitorAutoEnter mon(mReentrantMonitor);
> +
> +      for (PRUint32 i = 0; i < mListeners.Length(); i++) {

same thing.

@@ +261,5 @@
> +      PRUint32 resultCount = aAccessPoints.Count();
> +      nsAutoPtr<nsTArray<nsIWifiAccessPoint*> > accessPoints(
> +                               new nsTArray<nsIWifiAccessPoint *>(resultCount));
> +      if (!accessPoints)
> +        return NS_ERROR_OUT_OF_MEMORY;

kill the null check

@@ +268,5 @@
> +        accessPoints->AppendElement(aAccessPoints[i]);
> +
> +      nsCOMPtr<nsIThread> thread = do_GetMainThread();
> +      if (!thread)
> +        return NS_ERROR_UNEXPECTED;

is this check needed?

::: netwerk/wifi/nsWifiScannerUnix.cpp
@@ +122,5 @@
>  
>  nsresult
>  nsWifiMonitor::DoScan()
>  {
> +  static void* iwlib_handle = NULL;

lets move this outside of the function block (for style), and rename it to s_iwlib_handle (or something similar).

@@ +141,5 @@
>    else {
>      LOG(("Loaded libiw\n"));
>    }
>  
> +  static iw_open_t iw_open = NULL;

iw_open - > iw_open_func

@@ +145,5 @@
> +  static iw_open_t iw_open = NULL;
> +  if (!iw_open)
> +    iw_open = (iw_open_t) dlsym(iwlib_handle, "iw_sockets_open");
> +
> +  static iw_enum_t iw_enum = NULL;

iw_enum_func

@@ +149,5 @@
> +  static iw_enum_t iw_enum = NULL;
> +  if (!iw_enum)
> +    iw_enum = (iw_enum_t) dlsym(iwlib_handle, "iw_enum_devices");
> +
> +  static iw_stats_t iw_stats = NULL;

iw_stats_func

@@ +170,5 @@
> +    int skfd;
> +    SocketsGuard(int skfd) : skfd(skfd) {}
> +    ~SocketsGuard() { iw_sockets_close(skfd); }
> +  } guard(skfd);
> +

what is this about?
Comment 3 Luke Wagner [:luke] 2011-07-01 16:18:42 PDT
(In reply to comment #2)
> +NS_IMETHODIMP nsPassErrorToWifiListeners::Run()
> nit:  mListeners is probably small, but the Count() is going to be called
> multiple times.  Can you use a temp, or can you rewrite this loop?  But, I
> suppose the existing code did the same....

I usually do that, but as this was error-path code it seemed super-cold so I just went for less code.

> > +    nsAutoPtr<nsCOMArray<nsIWifiListener> > currentListeners(
> > +                           new nsCOMArray<nsIWifiListener>(mListeners.Length()));
> > +    if (!currentListeners)
> > +      return NS_ERROR_OUT_OF_MEMORY;
> 
> how could this fail?  Not checking for the new nsCOMArray, but checking for
> currentListeners.  Lets just not check at all.

Since currentListeners converts to the pointer it contains, this is checking for OOM by new nsCOMArray.  Would you like to remove the OOM-checking altogether (is mozilla already using infallible-malloc?)

> > +    nsCOMPtr<nsIThread> thread = do_GetMainThread();
> > +    if (!thread)
> > +      return NS_ERROR_UNEXPECTED;
> 
> does this ever return null?  Maybe we can just drop this check.

mxr showed most other sites checking likewise

> > +  LOG(("About to send data to the wifi listeners\n"));
> > +  for (PRInt32 i = 0; i < mListeners->Count(); i++) {
> 
> same thing, lets not call Count() each time through this loop.

Even though its not the error path, this still didn't seem like it could possibly be hot code; it was calling *proxies in a loop* !

> > +    if (!currentListeners)
> > +      return NS_ERROR_OUT_OF_MEMORY;
> 
> don't need the null check.

Again, to be clear, is it because we don't care about OOM?

> > +    SocketsGuard(int skfd) : skfd(skfd) {}
> > +    ~SocketsGuard() { iw_sockets_close(skfd); }
> > +  } guard(skfd);
> what is this about?

It makes sure that iw_sockets_close gets called reliably, even on early return.  There were a few places (before the patch) where it wasn't.  We've been using RAII guards like this all over in SpiderMonkey, but this is a different module so you be the judge.
Comment 4 Luke Wagner [:luke] 2011-07-01 17:56:23 PDT
Created attachment 543552 [details] [diff] [review]
rm (some) xpcom/proxy use in security/manager/ssl

Kai, Brian said you would be the one to review changes in this code.  I'm not familiar with this area, so please be suspicious.  Also, I only removed the uses of GetProxyForObject that where causing the situation described in comment 0 when running automated tests.  Do you know if I should remove any other uses of GetProxyForObject (i.e., are there any other cases where the proxied nsISupports can be implemented in JS?).
Comment 5 Luke Wagner [:luke] 2011-07-04 13:42:45 PDT
Green on try
Comment 6 Luke Wagner [:luke] 2011-07-07 11:18:38 PDT
Kai, any estimate on when you would be able to get to this?  I don't know about the severity, but this is currently a real race on trunk.
Comment 7 Luke Wagner [:luke] 2011-07-07 11:24:08 PDT
Well I guess I need to cc kai to ask him a question :)
Comment 8 Luke Wagner [:luke] 2011-07-08 11:33:15 PDT
So instead of just assert-and-see-what-breaks, I scanned all the GetProxyForObject uses and checked whether I could find any scripted QIs containing the IID.  To my dismay, 2/3 of the GetProxyForObject sites are in this category (lots of nsIInterfaceRequestor and nsIPrompt).

Since the trend seems to be do-more-in-JS, it seems like the only way to prevent a stream of sadness is to kill xpcom/proxies.  bsmedberg, what say you to that?
Comment 9 Luke Wagner [:luke] 2011-07-08 11:40:53 PDT
I guess another option is to keep xpcom proxies and just have them do the QI on the main thread...
Comment 10 Luke Wagner [:luke] 2011-07-08 18:27:33 PDT
Created attachment 544937 [details] [diff] [review]
just fix xpcom/proxies (WIP)

... yeah its like 10x less work to just to do the QI for scripted things on the main thread, so I'll just do that.
Comment 11 Benjamin Smedberg [:bsmedberg] 2011-07-08 20:54:27 PDT
Most proxy users know the class they are proxying, so I'm not sure that your survey gives us much meaningful data. But in any case, yes, removing XPCOM proxies entirely is on my long-term list.

Doing QIs on the main thread at least requires that the main thread not be blocked, which can be tricky in some cases where you have nested proxy calls.
Comment 12 Luke Wagner [:luke] 2011-07-08 21:24:41 PDT
(In reply to comment #11)
> Doing QIs on the main thread at least requires that the main thread not be
> blocked, which can be tricky in some cases where you have nested proxy calls.

Could you explain?  The WIP patch in comment 10 is pretty short, perhaps you can  point me at anything flagrantly wrong?
Comment 13 Luke Wagner [:luke] 2011-07-11 14:22:17 PDT
Created attachment 545257 [details] [diff] [review]
just fix xpcom/proxies

Green on try, so I'll just ask you formally.
Comment 14 Luke Wagner [:luke] 2011-07-25 14:19:42 PDT
bsmedberg: review ping.  Alternatively, feel free to suggest a less-loaded reviewer.
Comment 15 Luke Wagner [:luke] 2011-07-26 11:13:03 PDT
Comment on attachment 545257 [details] [diff] [review]
just fix xpcom/proxies

From IRC, bsmedberg would rather go the kill-proxies route and assures me that there can't be too many cases of proxied-JS.  I'll go back to my original two patches (one of which still needs review) and put in a hard abort-if-scripted check in proxy code.
Comment 16 Luke Wagner [:luke] 2011-07-27 11:01:33 PDT
Landed the netwerk/wifi patch dougt r+'d:

http://hg.mozilla.org/integration/mozilla-inbound/rev/169712c5c68b

Spun off another bug for security/manager/ssl (bug 674571) and for putting in release-build abort-if-false checks that there is no other scripted proxy use hiding out somewhere (bug 674597).
Comment 17 :Ehsan Akhgari (out sick) 2011-07-28 08:13:48 PDT
http://hg.mozilla.org/mozilla-central/rev/169712c5c68b

Note You need to log in before you can comment on or make changes to this bug.