Closed Bug 326273 (nsIThreadManager) Opened 18 years ago Closed 18 years ago

Implement nsIThreadManager

Categories

(Core :: XPCOM, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
mozilla1.9alpha1

People

(Reporter: darin.moz, Assigned: darin.moz)

References

(Depends on 3 open bugs, Blocks 1 open bug, )

Details

(Keywords: arch, Whiteboard: [trunk only])

Attachments

(13 files, 11 obsolete files)

191.77 KB, patch
benjamin
: review+
jaas
: review+
Details | Diff | Splinter Review
204.22 KB, patch
benjamin
: review+
samuel
: review+
Details | Diff | Splinter Review
46.61 KB, text/plain
Details
27.86 KB, text/plain
Details
10.74 KB, text/plain
Details
16.42 KB, text/plain
Details
244.00 KB, patch
Details | Diff | Splinter Review
26.74 KB, patch
benjamin
: review+
Details | Diff | Splinter Review
226.81 KB, text/plain
Details
493.18 KB, patch
Details | Diff | Splinter Review
289.81 KB, patch
Bienvenu
: review+
Details | Diff | Splinter Review
13.60 KB, patch
Details | Diff | Splinter Review
392.41 KB, application/x-gzip
Details
Implement nsIThreadManager, spec'd here:
http://wiki.mozilla.org/XPCOM:nsIThreadManager
Status: NEW → ASSIGNED
Blocks: 289227
Blocks: 315442
This work is being done on the THREADS_20060213_BRANCH.
Priority: -- → P2
Blocks: 328252
Blocks: 218908
Blocks: 208184
Blocks: 190313
Blocks: 328804
Blocks: 329681
Blocks: 223191
Blocks: 330062
Blocks: 330771
> This work is being done on the THREADS_20060213_BRANCH.

I merged to the trunk on 3/7, and cut a new branch:  THREADS_20060307_BRANCH
I've been running the timer-based DHTML tests here:
http://wd-testnet.world-direct.at/mozilla/dhtml/funo/

It looks like the THREADS_20060307_BRANCH performs very well on these tests.  It's possible to get a very smooth and consistent 10ms timeout now.  The biggest problem with the old system was that we would only process PLEvent objects in one big batch (ProcessPendingEvents) from a WM_USER (or WM_TIMER) event on Windows.  This resulted in a lot of event starvation.
Blocks: 257454
Test build posted here:
http://friedfish.homeip.net/test/threads/
Blocks: 280922
I've tried the firefox-threads-20060307 build.
The testcase from bug 280922, https://bugzilla.mozilla.org/attachment.cgi?id=173259 is working nicely.
I can open multiple of these testcases and all testcases are running.

However, when I right-click to get a context menu, I don't get one!
When I close the tab containing the testcase, and then open a new random url in a new tab, suddenly the context menu, I was asking earlier before, appears.

Also, I managed to crash when opening 10 of those testcases or so and then closing one tab (after trying to open a context menu or something like that).
I'm on a Duron 800MHz, using WinXP.
That's very helpful feedback, thanks Martijn!
Martijn, I found the problem, and I believe I've fixed it.  I'll post a new build shortly.  I'm having trouble with my friedfish, so I'm trying to find a way to post builds on stage.mozilla.org.
Martijn,

I've uploaded a new windows build here:
http://stage.mozilla.org/pub/mozilla.org/firefox/nightly/experimental/threadmanager/

I plan on posting builds for Linux and OSX shortly.
Ok, with the new build that problem seems fixed.
When I have 10 tabs (or so) of https://bugzilla.mozilla.org/attachment.cgi?id=173259 open, I can now easily open the context menu. 
However, when hovering over the menu-items, it doesn't change background-color that easily anymore, it tends to 'skip' menu-items.

Also see this:
http://wargers.org/home.hccnet.nl/bad/326273_nsithreadmanager.htm
(it's almost the same as http://wargers.org/home.hccnet.nl/bad/snake.htm )
Press the spacebar to start.
It was already moving erratic in Mozilla, but with this new build it seems a bit worse and the moving in the beginning seems more rushed (too fast).
OK, thanks Martijn.  I'll investigate the new problems.
(In reply to comment #8)

> I've uploaded a new windows build here:
> http://stage.mozilla.org/pub/mozilla.org/firefox/nightly/experimental/threadmanager/

For some reason this build won't start properly on my XP machine. The process loads but then no window is ever displayed. It doesn't churn the cpu or anything... it just sits there. I have to force-kill the process.

Running with the -p cmdline works just fine (I get the profiles window), but then clicking OK or Exit both result in the dead-yet-alive process.

I'm on my linux box today but I'll try to figure out what's going on later this weekend... unless you have any ideas?

Martijn, those regressions seem to be new to the "-2" build that I posted to fix the other problem.  I think I know what needs to be done.  I'll work on a revised fix that hopefully addresses both issues.  Thanks again for your help!

Ben, that's interesting.  Are you sure you gave it enough time?  The 3/7 trunk was very slow to startup (initially) because of places.
(In reply to comment #12)
> Martijn, those regressions seem to be new to the "-2" build that I posted to
> fix the other problem.  I think I know what needs to be done.  I'll work on a
> revised fix that hopefully addresses both issues.  Thanks again for your help!
> 
> Ben, that's interesting.  Are you sure you gave it enough time?  The 3/7 trunk
> was very slow to startup (initially) because of places.

Ditto all of the above comments. This includes what appears to be hung startups on XP. After a few minutes I cancelled the process on XP, started with -p and also appeared hung, cancelled, started again a few minutes later and it eventually came up.  

But ... started just fine on my w2k system.  Caveat - I have run a previous places build one or both of these PCs a few weeks back - sorry don't recall which or if both.


IIRC places first startup was substantially improved by a patch in the past few weeks, but I don't recall the bug# - Is that patch in this branch?
I too get sometimes the problem Ben describes in comment 11.
However, when I kill the process, and start Darin's build again, it starts fine.

A way to get this problem constantly:
- Start a 1.8 build of Firefox and close it again
- Start Darin's build, you get to see the Deer Park Update window
- Press "Don't Check"
After that the programm won't appears and stays as a zombie process in the task manager.
I've compared this with a regular 2006-03-07 build, and with that one, I don't see this issue. So I guess this could indeed be some kind of issue with Darin's build.

Darin, thank you for tackling this bug!
I forgot to say when it appeared to hang on XP it stopped cold at :01 cpu sec.
I never got the Deer Park upgrade prompt.

The speedup I was thinking of is bug 327330 checked in 3/1 so if Darin pulled branch on 3/7, then this build has the patch?
> http://www.newportharbor.us/computerworks.htm performs extremely poorly

That's just testing animated GIF performance.  The performance problems with that GIF have to do with the way we lazily decode the individual frames.  I think Pav was saying that he planned on doing some work to improve testcases like this.  For now, I'm just focusing on DHTML-based animations (built using window.setTimeout or window.setInterval).
New windows build with fixes (not for startup issue) posted here:
http://stage.mozilla.org/pub/mozilla.org/firefox/nightly/experimental/threadmanager/
Actually, I now think the problem I mentioned in the second part of comment 9 is more a general performance problem, that not really has anything to do with this bug. (The move to cairo has made it far worse, than it was, see bug 328367 and dependancies)
Darin's builds don't seem to change that, for the better or the worse.
Blocks: 291386
Attached file v1 patch (gzipped) (obsolete) —
Here it is (bugzilla wouldn't let me post it in plaintext).  Let me know if it would help to have this patch broken down into more bitesize chunks.  Please also let me know when you think you might be able to review this, and please let me know if there are any questions I can answer to help expedite the review.

I'm hopeful that you guys will not suggest any drastic API changes since I've had the wiki pages up there for a couple months now! ;-)
Attachment #217584 - Flags: superreview?(roc)
Attachment #217584 - Flags: review?(benjamin)
Yeah, reviewing that straight is just crazy.  I'm going to post portions of the patch for review instead.
Attachment #217584 - Attachment is patch: false
Attachment #217584 - Attachment mime type: text/plain → application/x-gzip
Attachment #217584 - Flags: superreview?(roc)
Attachment #217584 - Flags: review?(benjamin)
Attached patch patch: xpcom v1 (obsolete) — Splinter Review
Attachment #217586 - Flags: superreview?(roc)
Attachment #217586 - Flags: review?(benjamin)
Attached patch patch: widget v1Splinter Review
Attachment #217587 - Flags: superreview?(roc)
Attachment #217587 - Flags: review?(benjamin)
Attached patch patch: necko v1 (obsolete) — Splinter Review
Attachment #217588 - Flags: superreview?(roc)
Attachment #217588 - Flags: review?(benjamin)
Attached patch patch: layout v1 (obsolete) — Splinter Review
Attachment #217589 - Flags: superreview?(roc)
Attachment #217589 - Flags: review?(benjamin)
Attached patch patch: mail v1 (obsolete) — Splinter Review
Attachment #217590 - Flags: superreview?(roc)
Attachment #217590 - Flags: review?(benjamin)
Attached patch patch: other v1Splinter Review
Attachment #217591 - Flags: superreview?(roc)
Attachment #217591 - Flags: review?(benjamin)
Where necessary, I plan on requesting module owner/peer approval, but first I want to make sure that benjamin and roc are happy with these changes before involving additional reviewers.
Darin, I was skimming the layout diff and I have two questions:

1)  In ReflowEvent, we should probably hold a strong ref to the presshell like we do now -- ProcessReflowCommands may well cause it to die, generally speaking.  At the very least, if we decide to change the ownership setup here we should do it in a separate patch.

2)  The CSS frame constructor and presshell had the weird event-queue-getting thing so that reflow and frame construction could still happen while a modal dialog is up.  I assume that with the new system this is a non-issue?  If so, hallelujah!
Blocks: 240874
> 1)  In ReflowEvent, we should probably hold a strong ref to the presshell like
> we do now -- ProcessReflowCommands may well cause it to die, generally
> speaking.  At the very least, if we decide to change the ownership setup here
> we should do it in a separate patch.

Very interesting.  I hadn't considered the ownership implications of that change carefully enough.  I agree that we should try not to affect the ownership.


> 2)  The CSS frame constructor and presshell had the weird event-queue-getting
> thing so that reflow and frame construction could still happen while a modal
> dialog is up.  I assume that with the new system this is a non-issue?  If so,
> hallelujah!

Yes, we no longer have the concept of multiple event targets per thread.  You just post events to the current thread -- that's it! :-)
Blocks: 333167
Attached patch patch: xpcom v1.1 (obsolete) — Splinter Review
revised xpcom patch: more docs for nsTWeakRef.h, removed empty nsTWeakRef.cpp, cleaned up some other comments.
Attachment #217584 - Attachment is obsolete: true
Attachment #217632 - Flags: superreview?(roc)
Attachment #217632 - Flags: review?(benjamin)
Attachment #217586 - Attachment is obsolete: true
Attachment #217586 - Flags: superreview?(roc)
Attachment #217586 - Flags: review?(benjamin)
Attached patch patch: layout v1.1 (obsolete) — Splinter Review
fix problem w/ PresShell lifetime reported by bz
Attachment #217589 - Attachment is obsolete: true
Attachment #217634 - Flags: superreview?(roc)
Attachment #217634 - Flags: review?(benjamin)
Attachment #217589 - Flags: superreview?(roc)
Attachment #217589 - Flags: review?(benjamin)
Blocks: 333198
Comment on attachment 217632 [details] [diff] [review]
patch: xpcom v1.1

i've sent comments on this to darin. the resulting thing was 1601 lines which exceeded what i was willing to dump here.
*sigh* i just knew someone would ask me to do this. ok, here's what i sent to darin so that we can maybe save on duplicate comments. unfortunately most people don't read attachment content (i certainly don't), so i doubt that'll work.
Comment on attachment 217947 [details]
response to 'Initial review notes for the xpcom v.1.1 patch'

>>>Index: xpcom/proxy/public/nsIProxyObjectManager.idl
>>
>>>+/**
>>>+ * Pass this value as the target to NS_GetProxyForObject to specify the main
>>>+ * thread as the target for the proxy object.
>>>+ */
>>>+#define NS_PROXY_TO_MAIN_THREAD  ((nsIEventTarget *) 1)
>>
>>Hrm... this will be honored by NS_GetProxyForObject but not
>>nsIProxyObjectManager?
>
>Actually, it works for both.  I'll fix the comment.  Originally, I only
>supported it for NS_GetProxyForObject, but that seemed silly.  I just forgot to
>fix the comment.

I think that we should not document this for nsIProxyObjectManager; somebody could end up passing pointer-cast integers across xptcall boundaries by accident, which would be bad.

>>>Index: js/jsd/jsd_xpc.cpp

>>Doesn't this need to push an event filter, to emulate the previous
>>behavior somehow?
>
>I don't know.  Does it?  Who might know?  I'll ask timeless.

Basically it runs a modal event loop so that it can show the venkman window without reentering the javascript that it is debugging.

>Ick.  GetHasPendingEvents is unsatisfying to me.

ok, leave it as-is.

>>Why [notxpcom] boolean instead of a regular void method (using error
>>nsresult to indicate whether to accept the event). It doesn't seem
>>very useful to have a scriptable interface with no scriptable methods.
>
>Because it would be very bad for XPCOM proxies or language bindings that did
>things with event queues to implement this method.  The interface on the other

It would? Language bindings (xpconnect) generally don't have much direct interaction with events or queues, AFAICT... other than terrible performance I don't see why it wouldn't make sense to implement this method in JS.

>interesting.  I'm concerned though that someone would then not have a way to
>say only create a thread if it does not already exist.  Then that would have
>the same sort of race condition, right?  Does this argue for a creation flags
>parameter?

Yes, I think it does. It would be an unusual circumstance where two independent threads would be racing to create a third named thread, but I suppose it could happen if you were intializing some advanced GC worker-thread or something bizarre like that.

>I don't understand.  I need to be able to be able to manipulate the fields
>contained in this class.  What exactly can be const?

Oh, I didn't realize it had member vars. Hrm, we're in uncharted territory a little bit here, since this is no longer a constructor/destructor-less class. It should probably work, but I worry a little bit.

>>>+nsThreadManager::GetMainThread(nsIThread **result)

>We can cache it, but maybe that would be premature optimization.  The reason it

I just imagine somebody calling this in a loop or a perf-sensitive area (e.g. posting a bunch of events to the main thread without caching it).

>>>+void
>>>+nsThreadPool::ShutdownThread(nsIThread *thread)

>The function is only called at the end of nsThreadPool::Run.  It is called when
>it comes time to "join" the thread that is exiting.  That must happen from some
>other thread, so I chose to always join from the main thread.  Doesn't that seem
>reasonable?

Yes, I think I was just confused about what code was triggering this call.

>>>+#undef  IMETHOD_VISIBILITY
>>>+#define IMETHOD_VISIBILITY NS_VISIBILITY_DEFAULT
>>
>>I don't think that this is necessary. NS_COM specifies default
>>visibility in the macro, and I have been sucessfully removing this hack
>>from the tree.
>
>This was required to build on Linux.  Otherwise, the nsRunnable defined AddRef,
>Release, and QueryInterface were undefined when used outside this DSO.  That
>makes perfect sense of course since the symbols are otherwise hidden.  That
>problem exists for any NS_IMETHOD that is exported from a DSO, so it would
>continue to exist even after moving this class into xpcom/glue unless you
>wanted to make other code get a copy of the symbols, which seems undesirable to
>me.

Oh, I see how this works. Ugh, perhaps this should be "#define IMETHOD_VISIBILITY" without NS_VISIBILITY_DEFAULT (since in libxul builds we don't want to export these functions).
(In reply to comment #38)
> I think that we should not document this for nsIProxyObjectManager; somebody
> could end up passing pointer-cast integers across xptcall boundaries by
> accident, which would be bad.

Good point.  Perhaps I should just drop support for it on nsIProxyObjectManager altogether.  I changed most uses of nsIProxyObjectManager in the tree with calls to NS_GetProxyForObject anyways, so if there are any stragglers, I can easily change them over as well.


> >>>Index: js/jsd/jsd_xpc.cpp
> 
> >>Doesn't this need to push an event filter, to emulate the previous
> >>behavior somehow?
> >
> >I don't know.  Does it?  Who might know?  I'll ask timeless.
> 
> Basically it runs a modal event loop so that it can show the venkman window
> without reentering the javascript that it is debugging.

I think we need a better solution for that problem.  We should have a way to suspend the JS context of the page being debugged.  We'll need something like that for synchronous XMLHttpRequest too.  I'd like to avoid pushing event queues to solve these sorts of problems, though in the debugging case, I can almost tolerate it.  So, perhaps I will throw in a pair of Push/PopEventQueue calls in this case.


> >>Why [notxpcom] boolean instead of a regular void method (using error
> >>nsresult to indicate whether to accept the event). It doesn't seem
> >>very useful to have a scriptable interface with no scriptable methods.
> >
> >Because it would be very bad for XPCOM proxies or language bindings that did
> >things with event queues to implement this method.  The interface on the other
> 
> It would? Language bindings (xpconnect) generally don't have much direct
> interaction with events or queues, AFAICT... other than terrible performance I
> don't see why it wouldn't make sense to implement this method in JS.

Well, a language binding might need to load a file containing code (e.g., a class file).  Imagine some crazy case where a language binding needs to fetch something using Necko.  More to the point however:  consider distributed XPCOM.  I just don't want to deal with this method being used outside a strict C++ setting.  Otherwise, I have to put a lot more work into the implementation to have it release the thread's lock during the callback to the filter.


> >interesting.  I'm concerned though that someone would then not have a way to
> >say only create a thread if it does not already exist.  Then that would have
> >the same sort of race condition, right?  Does this argue for a creation flags
> >parameter?
> 
> Yes, I think it does. It would be an unusual circumstance where two independent
> threads would be racing to create a third named thread, but I suppose it could
> happen if you were intializing some advanced GC worker-thread or something
> bizarre like that.

Hmm.. you could solve that with a service representing the GC worker-thread.  You ask that service to do your dirty work, and it takes care of ensuring creation of only one thread.  We don't need nsIThreadManager to provide that functionality in order to build that sort of thing.  I'm loathe to complicate this system further if it can be helped.


> >I don't understand.  I need to be able to be able to manipulate the fields
> >contained in this class.  What exactly can be const?
> 
> Oh, I didn't realize it had member vars. Hrm, we're in uncharted territory a
> little bit here, since this is no longer a constructor/destructor-less class.
> It should probably work, but I worry a little bit.

Yeah, it may prove to be problematic, but I'm willing to tempt fate for a little while.


> >>>+nsThreadManager::GetMainThread(nsIThread **result)
> 
> >We can cache it, but maybe that would be premature optimization.  The reason it
> 
> I just imagine somebody calling this in a loop or a perf-sensitive area (e.g.
> posting a bunch of events to the main thread without caching it).

Yeah, okay.  I've changed it.  I still need to keep a pointer to mMainPRThread in order to support calls to GetIsMainThread post-Shutdown.  That may point to another problem... not sure when to shutdown the main nsIThread implementation, but right now, I'm doing so from nsThreadManager::Shutdown, which occurs before we notify xpcom-shutdown-loaders.  That may not be the right time to destroy the main nsIThread implementation.


> Oh, I see how this works. Ugh, perhaps this should be "#define
> IMETHOD_VISIBILITY" without NS_VISIBILITY_DEFAULT (since in libxul builds we
> don't want to export these functions).

Perhaps.  I haven't actually testing xulrunner yet.
> Good point.  Perhaps I should just drop support for it on nsIProxyObjectManager

Nevermind.  There's way too many users of nsIProxyObjectManager and NS_PROXY_TO_MAIN_THREAD.  I wish to defer this issue to a follow-up bug.
That's fine, as long as we don't document it I don't mind implementation eccentricities.
so about venkman, as i just revisited it.

currently venkman crashes when it plays with threads.
i have code that would let venkman work w/ threads as long as it doesn't have to deal w/ thread unsafe objects. unfortunately, the world that i actually play with isn't like that.

the code i'm using takes a debugged thread and sync proxies the jsd methods to the debugger thread (for venkman, this will generally be the ui thread, but that's not something hard wired into jsd_xpc). i can't do this.

what i need to do is async proxy the call to the target thread and push an event loop on the proxying thread. then from the destination thread, *when* the debugger encounters an object which isn't thread safe, it needs to proxy all calls about that object back to the original thread - which will only work with an event loop running on that thread. when the debugger is done, it will return as it currently does allowing the async proxy to be resolved and the nested event loop on the original thread to clean up.

oh and to make things more fun, currently the idl uses integer like return values (i.e. last param to c++ method is an int*), which doesn't work for proxies. since i'm already wrapping, i think i'll have an internal interface which uses some threadsafe object to store that result instead.

anyway, that's what i need for venkman and that means that if i do try to finish my venkman work before you land, you will trigger conflicts. if what you're providing lets me do what i need, that's great, although some pointers would be appreciated.
So, I can make a call to nsIThreadInternal::PushEventQueue(null) in enterNestedEventLoop, but that doesn't have the property of stopping incoming network events or timer events or other asynchronous stuff.  It only has the effect of blocking existing events from firing... which I'm not sure will be that helpful.  If venkman needs a way to block out all other asynchronous stuff except for UI events (and associated reflows), then it'll need to solve that some other way.  That solution might have a lot in common with the solution that we invent for synchronous XMLHttpRequest.
Attached patch patch: xpcom v1.2 (obsolete) — Splinter Review
The other patches were revised slightly as well to account for the XPCOM changes, but I'd like to get this one reviewed fully before posting those revisions.
Attachment #217632 - Attachment is obsolete: true
Attachment #218120 - Flags: superreview?(roc)
Attachment #218120 - Flags: review?(benjamin)
Attachment #217632 - Flags: superreview?(roc)
Attachment #217632 - Flags: review?(benjamin)
Comment on attachment 218120 [details] [diff] [review]
patch: xpcom v1.2

> PRBool
>-nsMemoryImpl::sIsFlushing = PR_FALSE;
>+nsMemoryImpl::sIsFlushing = 0;

you forgot to change the type declaration here.
> > PRBool
> >-nsMemoryImpl::sIsFlushing = PR_FALSE;
> >+nsMemoryImpl::sIsFlushing = 0;
> 
> you forgot to change the type declaration here.

Thanks!
Comment on attachment 217634 [details] [diff] [review]
patch: layout v1.1

This changes <xul:image> error events to bubble; I doubt that's desirable....
> This changes <xul:image> error events to bubble; I doubt that's desirable....

Darn, that was not intentional.  I must have merged incorrectly with the recent DOM event changes in that file :-(
Actually, it's correct in my tree.  The code circa Feb 2006 did not set the NS_EVENT_FLAG_CANT_BUBBLE flag for error events.  When merging, I must have preserved that old code inadvertantly.  More recently, I had another merge conflict in that file, and I actually ended up correcting it to specify NS_EVENT_FLAG_CANT_BUBBLE.  I'll post a revised patch.
Attached patch patch: layout v1.2 (obsolete) — Splinter Review
Attachment #217634 - Attachment is obsolete: true
Attachment #218186 - Flags: superreview?(roc)
Attachment #218186 - Flags: review?(benjamin)
Attachment #217634 - Flags: superreview?(roc)
Attachment #217634 - Flags: review?(benjamin)
Comment on attachment 217588 [details] [diff] [review]
patch: necko v1

>Index: netwerk/base/src/nsPACMan.cpp

>+/*
> void *PR_CALLBACK
> nsPACMan::LoadEvent_Handle(PLEvent *ev)
> {
>   NS_REINTERPRET_CAST(nsPACMan *, PL_GetEventOwner(ev))->StartLoading();
>@@ -240,8 +241,17 @@ nsPACMan::LoadEvent_Destroy(PLEvent *ev)
>   self->mLoadEvent = nsnull;
>   self->Release();
>   delete ev;
> }
>+*/

use #if 0 instead of comments, or remove completely

>Index: netwerk/base/src/nsSocketTransportService2.cpp


> nsSocketTransportService::~nsSocketTransportService()

>+    PR_DestroyLock(mLock);

This could be null in OOM conditions, do we care? I didn't look carefully, but did we ever check mLock after the constructor to make sure we didn't OOM early?

> NS_IMETHODIMP
> nsSocketTransportService::IsOnCurrentThread(PRBool *result)
> {
>-    *result = (PR_GetCurrentThread() == gSocketThread);
>-    return NS_OK;
>+    NS_ENSURE_TRUE(mThread, NS_ERROR_NOT_INITIALIZED);
>+    return mThread->IsOnCurrentThread(result);
> }

Under what conditions would this reasonably be called while mThread was not initialized? Seems like an overly-cautious null-check to me, unless I'm missing something (this pattern happens in several places).

> NS_IMETHODIMP
> nsSocketTransportService::Init()
> {
>-    NS_ASSERTION(nsIThread::IsMainThread(), "wrong thread");
>+    NS_ENSURE_STATE(NS_IsMainThread());

I think that this is still assertion-worthy, even if you wish to add the runtime check.

>Index: netwerk/base/src/nsSocketTransportService2.h

>-    // pref to control autodial code
>     PRBool      mAutodialEnabled;
>+                            // pref to control autodial code

nit, weird indentation

> 
>     //-------------------------------------------------------------------------
>-    // misc (socket thread only)
>+    // initialization and shutdown (any thread)
>     //-------------------------------------------------------------------------
>-    // indicates whether we are currently in the process of shutting down
>-    PRBool      mShuttingDown;
> 
>-    //-------------------------------------------------------------------------
>-    // event queue (any thread)
>-    //-------------------------------------------------------------------------
>-
>-    PRCList  mEventQ; // list of PLEvent objects
>-    PRLock  *mEventQLock;
>+    PRLock     *mLock;
>+    PRBool      mInitialized;
>+    PRBool      mShuttingDown;

PRPackedBool?

>+                            // indicates whether we are currently in the
>+                            // process of shutting down

more odd indentation

>Index: netwerk/cache/src/nsCacheEntry.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/netwerk/cache/src/nsCacheEntry.cpp,v
>retrieving revision 1.57
>diff -p -u -4 -r1.57 nsCacheEntry.cpp
>--- netwerk/cache/src/nsCacheEntry.cpp	1 Sep 2005 00:20:14 -0000	1.57
>+++ netwerk/cache/src/nsCacheEntry.cpp	7 Apr 2006 19:13:25 -0000
>@@ -43,8 +43,10 @@
> #include "nsCacheEntry.h"
> #include "nsCacheEntryDescriptor.h"
> #include "nsCacheMetaData.h"
> #include "nsCacheRequest.h"
>+#include "nsThreadUtils.h"
>+#include "nsProxyRelease.h"
> #include "nsError.h"
> #include "nsICacheService.h"
> #include "nsCache.h"
> #include "nsCacheService.h"
>@@ -84,13 +86,13 @@ nsCacheEntry::~nsCacheEntry()
> 
>     // proxy release of of memory cache nsISupports objects
>     if (!mData)  return;
>     
>-    nsISupports * data = mData;
>-    NS_ADDREF(data);    // this reference will be owned by the proxy
>-    mData = nsnull;     // release our reference before switching threads
>+    nsISupports *data = nsnull;
>+    mData.swap(data);  // this reference will be owned by the proxy
>+                       // release our reference before switching threads
> 
>-    nsCacheService::ProxyObjectRelease(data, mThread);
>+    NS_ProxyRelease(mThread, data);
> }

How is this more efficient or readable than ((nsISupports*) mData)->AddRef(); NS_ProxyRelease(mThread, mData); ?

>Index: netwerk/protocol/http/src/nsHttpTransaction.h

>+    /*XXX
>     static void *PR_CALLBACK TransportStatus_Handler(PLEvent *);
>     static void  PR_CALLBACK TransportStatus_Cleanup(PLEvent *);
>-    static void *PR_CALLBACK DeleteThis_Handler(PLEvent *);
>-    static void  PR_CALLBACK DeleteThis_Cleanup(PLEvent *);
>+    */

#if 0

>Index: netwerk/test/TestDNS.cpp

> static PRBool IsAscii(const char *s)
> {
>-  while (*s)
>+  for (; *s; ++s) {
>     if (*s & 0x80)
>       return PR_FALSE;
>+  }

heh, apparently nobody ran that test.
Attachment #217588 - Flags: review?(benjamin) → review+
> use #if 0 instead of comments, or remove completely

Thanks, I meant to remove that.


> This could be null in OOM conditions, do we care? I didn't look carefully, but
> did we ever check mLock after the constructor to make sure we didn't OOM early?

I added a check to nsSocketTransportService::Init

> > NS_IMETHODIMP
> > nsSocketTransportService::IsOnCurrentThread(PRBool *result)
> > {
> >+    NS_ENSURE_TRUE(mThread, NS_ERROR_NOT_INITIALIZED);
> >+    return mThread->IsOnCurrentThread(result);
> > }
> 
> Under what conditions would this reasonably be called while mThread was not
> initialized? Seems like an overly-cautious null-check to me, unless I'm 
> missing something (this pattern happens in several places).

This is important.  The socket transport service is shutdown while we are offline.  There is no socket thread when it is shutdown.  However, the singleton object is still alive and well, and it can easily be accessed and this method can be called.  It could be argued that this method should return PR_FALSE when the STS is shutdown, but I'm happy to also say that it should also not be used.  There isn't much of a performance concern here.  It's a very cheap test, especially since we have to dereference |this| again on the next line.


> > NS_IMETHODIMP
> > nsSocketTransportService::Init()
> > {
> >-    NS_ASSERTION(nsIThread::IsMainThread(), "wrong thread");
> >+    NS_ENSURE_STATE(NS_IsMainThread());
> 
> I think that this is still assertion-worthy, even if you wish to add the
> runtime check.

Sigh... maybe we should have a macro for that.  I'll add a NS_ERROR.


> >Index: netwerk/base/src/nsSocketTransportService2.h
> 
> >-    // pref to control autodial code
> >     PRBool      mAutodialEnabled;
> >+                            // pref to control autodial code
> 
> nit, weird indentation

It's actually consistent with the rest of the file.


> >+    PRBool      mInitialized;
> >+    PRBool      mShuttingDown;
> 
> PRPackedBool?

OK.  I verified that we don't make any assumptions about reading (or writing) those values atomically.


> >Index: netwerk/cache/src/nsCacheEntry.cpp

> >-    mData = nsnull;     // release our reference before switching threads
> >+    nsISupports *data = nsnull;
> >+    mData.swap(data);  // this reference will be owned by the proxy
> >+                       // release our reference before switching threads
> > 
> >-    nsCacheService::ProxyObjectRelease(data, mThread);
> >+    NS_ProxyRelease(mThread, data);
> > }
> 
> How is this more efficient or readable than ((nsISupports*) mData)->AddRef();
> NS_ProxyRelease(mThread, mData); ?

Well, that isn't correct.  That introduces a race between this thread and that thread.  The point is to clear mData (to release it) on mThread instead of on this thread.  nsCOMPtr's swap method ends up being a convenient way to achieve that.
Attachment #218208 - Flags: superreview?(roc)
Attachment #218186 - Flags: review?(benjamin) → review?(bzbarsky)
Attachment #217588 - Attachment is obsolete: true
Attachment #217588 - Flags: superreview?(roc)
Comment on attachment 218120 [details] [diff] [review]
patch: xpcom v1.2

>Index: xpcom/build/nsXPComInit.cpp

> NS_ShutdownXPCOM(nsIServiceManager* servMgr)

>+        nsCOMPtr<nsIThread> thread = do_GetCurrentThread();
>+        NS_ENSURE_STATE(thread);

Are you sure? We are very fault-tolerant in XPCOM shutdown, in
general. Are there conditions where NS_InitXPCOM could succeed, but
this can fail (OOM)?

>Index: xpcom/threads/nsEventQueue.h

>+  enum { EVENTS_PER_PAGE = 15 };

Is there anything special about the number 15? (16-word struct?) At a glance it seems to me that a much larger block wouldn't be
bad, like 512. I guess it depends on the performance characteristics of the malloc impl. If this is trying to fit in a 16-word alloc structure, it should actually be 14 words long (EVENTS_PER_PAGE = 13), because typical malloc heaps have two-word malloc overhead. I'm not sure whether allocating these Pages is a critical codepath or not, but if it is we might want to consider a more specialized recycling allocator.

>+
>+  // Page objects are linked together to form a simple deque.
>+
>+  struct Page {
>+    struct Page *mNext;
>+    nsIRunnable *mEvents[EVENTS_PER_PAGE];
> 
>-  void NotifyObservers(const char *aTopic);
>-  void CheckForDeactivation();
>+    Page() : mNext(nsnull) {}
>+  };
>+
>+  static Page *NewPage() {
>+    return NS_STATIC_CAST(Page *, calloc(1, sizeof(Page)));
>+  }

1) why isn't this an overloaded operator new() on the struct? It seems like
   this static method will skip the Page() constructor always.
2) what is 1? is this just an uninitialized-memory guard? if so, why
   not use 0xAB?

>Index: xpcom/threads/nsEventQueue.cpp

>+nsEventQueue::nsEventQueue()
>+  : mMonitor(nsAutoMonitor::NewMonitor("xpcom.eventqueue"))
>+  , mHead(nsnull)
>+  , mTail(nsnull)
>+  , mOffsetHead(0)
>+  , mOffsetTail(0)

>+PRBool
>+nsEventQueue::GetEvent(PRBool mayWait, nsIRunnable **result)
> {
>+  nsAutoMonitor mon(mMonitor);
> 
>+  while (IsEmpty()) {
>+    if (!mayWait) {
>+      if (result)
>+        *result = nsnull;
>+      return PR_FALSE;
>+    }
>+    LOG(("EVENTQ(%p): wait begin\n", this)); 
>+    mon.Wait();
>+    LOG(("EVENTQ(%p): wait end\n", this)); 
>   }
> 
>+  if (result)
>+    *result = mHead->mEvents[mOffsetHead++];

In debug builds at least do you want to reset mEvents[mOffsetHead] to
garbage?

>+  // Check if mHead points to empty Page
>+  if (mOffsetHead == EVENTS_PER_PAGE) {
>+    Page *dead = mHead;
>+    mHead = mHead->mNext;
>+    FreePage(dead);
>+    mOffsetHead = 0;
>   }
> }

If the caller didn't pass 'result', there's no need to check mHead, is
there? Seems you could/should put the entire last section in the
if (result) block.
Attachment #218120 - Flags: review?(benjamin) → review+
> Are you sure? We are very fault-tolerant in XPCOM shutdown, in
> general. Are there conditions where NS_InitXPCOM could succeed, but
> this can fail (OOM)?

NS_InitXPCOM3 fails if it cannot initialize the main thread, and there are no memory allocations required to get the main thread once it has been constructed.


> >+  enum { EVENTS_PER_PAGE = 15 };
> 
> Is there anything special about the number 15? (16-word struct?)

I chose 15 rather randomly.  I could definitely bump it up a bit.


> >+  static Page *NewPage() {
> >+    return NS_STATIC_CAST(Page *, calloc(1, sizeof(Page)));
> >+  }
> 
> 1) why isn't this an overloaded operator new() on the struct? It seems like
>    this static method will skip the Page() constructor always.

Actually, I should just remove the default constructor.  calloc initializes the structure to all zeros, which is what the default constructor is trying to do.


> 2) what is 1? is this just an uninitialized-memory guard? if so, why
>    not use 0xAB?

The first parameter to calloc specifies the number of members of size given by the second parameter.  The contents of the memory block are initialized to zero no matter what.


> >+  if (result)
> >+    *result = mHead->mEvents[mOffsetHead++];
> 
> In debug builds at least do you want to reset mEvents[mOffsetHead] to
> garbage?

Good idea.


> If the caller didn't pass 'result', there's no need to check mHead, is
> there? Seems you could/should put the entire last section in the
> if (result) block.

Yup, thanks!
Blocks: 300057
Blocks: 301791
Comment on attachment 218186 [details] [diff] [review]
patch: layout v1.2

>Index: content/base/src/nsDocument.cpp

> nsDocument::PostUnblockOnloadEvent()
>+  nsCOMPtr<nsIRunnable> evt = new nsUnblockOnloadEvent(this);
>+  nsresult rv = NS_DispatchToCurrentThread(evt);

Not null-checking is ok because NS_DispatchToCurrentThread is null-safe and throws if the arg is null?  If so, that should be documented in nsThreadUtils.h.

>Index: content/base/src/nsDocument.h

> private:
>+  friend class nsUnblockOnloadEvent;

As I recall, this will give errors on some of our compilers because that class is undeclared (and they don't treat friend decls as being declarations of the thing that's being a friend).  You need a forward-decl of nsUnblockOnloadEvent earlier in this file (and hence probably want to make it a private inner class of nsDocument).

>Index: content/base/src/nsImageLoadingContent.cpp
>   ImageEvent(nsPresContext* aPresContext, nsIContent* aContent,
>              const nsAString& aMessage, nsIDocument* aDocument)
>@@ -731,10 +729,13 @@ public:
>     MOZ_COUNT_CTOR(ImageEvent);

No need for that; we'll be refcounted via the nsRunnable.

That said, perhaps some of these subclasses should override addref/release (only in trace-refcount builds?) to get better data out?

>     MOZ_COUNT_DTOR(ImageEvent);

Remove that too.

>@@ -799,22 +786,14 @@ nsImageLoadingContent::FireEvent(const n

>+  return NS_DispatchToCurrentThread(evt);

Is it me, or is NS_DispatchToMainThread a shade faster?  Since we're asserting that we're on the main thread above, could we use that here?  This code is, sadly, hit a lot during pageload.  :(  Or is the perf difference here completely negligible?

>Index: content/base/src/nsObjectLoadingContent.cpp

>   // (the content is stored implicitly as the owner)
>+  nsObjectLoadingContent *mContent;

The comment is no longer true.

Also, why not just use nsRefPtr<nsObjectLoadingContent> and not have to worry about manual refcounting?  Is the Release() method on that ambiguous or something?
>+nsAsyncInstantiateEvent::Run()
>+  // Check if we've been "revoked"
>+  if (mContent->mPendingInstantiateEvent != this)

Is this correct if multiple instantiate events are in flight at once (which is allowed)?  Maybe check with biesi?

> nsObjectLoadingContent::HasNewFrame(nsIObjectFrame* aFrame)

>-  if (!mInstantiating && aFrame && mType == eType_Plugin) {
>+  if (!mInstantiating && aFrame && mType == eType_Plugin &&
>+      !mPendingInstantiateEvent) {

This changes the logic, no?  Why?  The change looks wrong to me, since instantiate events capture the aFrame, so the already-posted event will NOT fire on this changed aFrame.

For what it's worth more than 4 lines of context would really help with reviewing the changes to this function.  :(

>+      mPendingInstantiateEvent = nsnull;  // break cycle

Why do we have a cycle?  Why is mPendingInstantiateEvent even a strong ref?  It should be a weak ref (nulled out as needed in ~nsAsyncInstantiateEvent), imo.  Not quite sure what this method should do with mPendingInstantiateEvent in failure conditions....

>Index: content/base/src/nsObjectLoadingContent.h
>+    nsCOMPtr<nsIRunnable>       mPendingInstantiateEvent;

Like I said, this is better as a weak pointer, imo.

>Index: content/events/public/nsPLDOMEvent.h
>+ * TODO: This should be renamed nsAsyncDOMEvent or something that does
>+ *       not include the substring "PL" that refers to the old PLEvent
>+ *       structure used with the old eventing system.

File a followup bug; cite the bug# in the comment.

>Index: docshell/base/nsDocShell.cpp

>+class RestorePresentationEvent : public nsRunnable
>-    nsRefPtr<nsDocShell> mDocShell;
>+    nsDocShellWeakRef mDocShell;

This changes the ownership model such that the docshell can now get destroyed while an event is pending.  We should probably call mWeakSelf.forget() in the destructor of nsDocShell.

>Index: docshell/base/nsWebShell.cpp

>-struct OnLinkClickEvent : public PLEvent {
>+struct OnLinkClickEvent : public nsRunnable {

s/struct/class/ right?

>-  if (eventService)
>-    eventService->GetThreadEventQueue(mThread, aQueue);

Can we assert before calling NS_DispatchToCurrentThread that the current thread is the same thread that Create() was called on or something?  Just in case someone calls into the nsILinkHandler api from off the main thread?  I see nothing that would prevent that from happening....

Better yet, would be to keep the "original thread" thing we have now and proxy over the clicks to the original thread as we do now -- that would allow OnLinkClick calls from all threads.

>Index: dom/src/base/nsGlobalWindow.cpp
>@@ -4350,17 +4320,10 @@ nsGlobalWindow::Close()
>+    nsRefPtr<nsCloseEvent> ev = new nsCloseEvent(this);

Why not just nsCOMPtr<nsIRunnable> here?

>Index: dom/src/base/nsJSEnvironment.cpp
>@@ -2297,20 +2294,11 @@ nsJSEnvironment::Init()
> #ifdef DEBUG
>   // Let's make sure that our main thread is the same as the xpcom main thread.
>+  NS_ASSERTION(NS_IsMainThread(), "bad");
> #endif

No need for the #ifdef here.  ;)

>Index: layout/base/nsCSSFrameConstructor.h

>+  struct RestyleEvent : public nsRunnable {

s/struct/class/, right?

>Index: layout/base/nsPresShell.cpp

>+struct ReflowEvent : public nsRunnable {

s/struct/class/

>+      // NOTE: the ReflowEvent struct is a friend of the PresShell class

And here.

>         // Now, explicitly release the pres shell before the view manager
>-        presShell = nsnull;
>         viewManager = nsnull;

Please leave the |ps = nsnull| thing there so the code matches that comment.

>Index: layout/forms/nsComboboxControlFrame.cpp
>@@ -1591,33 +1567,28 @@ nsComboboxControlFrame::RedisplayText(PR
>+    nsCOMPtr<nsIRunnable> event = new RedisplayTextEvent(mWeakSelf);
>+    NS_ENSURE_TRUE(event, NS_ERROR_OUT_OF_MEMORY);
> 
>+    // Revoke outstanding events to avoid out-of-order events which could mean
>+    // displaying the wrong text.
>+    if (mRedisplayTextEventPosted) {
>+      mWeakSelf.forget();
>+      mRedisplayTextEventPosted = PR_FALSE;
>+    }

That |if (mRedisplayTextEventPosted)| block (together with the ensuing mWeakSelf initialization) needs to come before we create the RedisplayTextEvent.  Otherwise you're revoking the event you just created (even before you've posted it).

>Index: layout/generic/nsGfxScrollFrame.h

>+  PRPackedBool mScrollEventPending:1;

Don't we need to init that in our constructor?

>Index: layout/generic/nsSelection.cpp
> class nsTypedSelection : public nsISelection,
>@@ -283,11 +285,11 @@ private:

>+  PRPackedBool mScrollEventPosted;

We need to init that in the constructor too, right?

>+struct nsScrollSelectionIntoViewEvent : public nsRunnable {

s/struct/class/

> nsTypedSelection::PostScrollSelectionIntoViewEvent(SelectionRegion aRegion)
>+  nsCOMPtr<nsIRunnable> ev =
>+      new nsScrollSelectionIntoViewEvent(mWeakSelf, aRegion);
>+  if (NS_FAILED(NS_DispatchToCurrentThread(ev)))
>+    return NS_ERROR_UNEXPECTED;

Is it worth propagating a better error here?

>Index: layout/printing/nsPrintEngine.cpp

>+  NS_IMETHOD Run() {
>+    if (mDocViewerPrint)

How could it be null?  We asserted that it's not in our constructor...

>Index: layout/style/nsCSSLoader.cpp

>+#if 0

Just remove the code.  It's not so pretty that we'd want to keep it around.  ;)

>@@ -1956,15 +1959,9 @@ CSSLoaderImpl::PostLoadEvent(nsIURI* aUR

>   if (!mPostedEvents.AppendElement(evt)) {
>-    PL_DestroyEvent(evt);
>+    NS_RELEASE(evt);

This needs to be |DestroyLoadEvent(evt)| instead of NS_RELEASE.  Otherwise we won't unblock the document onload.

Alternately you could move the BlockOnload() call to after this block and use NS_RELEASE here as you do; now that we're not tied to destroying via PL_DestroyEvent and have a choice of whether DestroyLoadEvent() is called when we destroy the event, that might be a better control flow.

>Index: parser/htmlparser/src/nsParser.h
>+    nsCOMPtr<nsIRunnable>        mContinueEvent;

I'm not really happy with the cycle here, even thought it does look like it's safe...  Could we make this a weak ref (and create new events into an on-stack nsCOMPtr)?  Or use the other revokation setup if that would make more sense?

>Index: view/src/nsViewManager.cpp
>+    // mViewManager is a weak reference that the view manager will break in
>+    // it's destructor.

s/it's/its/

> nsViewManager::PostInvalidateEvent()
>+      NS_WARNING("failed to dispatch nsSynthMouseMoveEvent");

s/nsSynthMouseMoveEvent/nsInvalidateEvent/

r=bzbarsky with those fixed.
Attachment #218186 - Flags: review?(bzbarsky) → review+
Blocks: 334573
> Not null-checking is ok because NS_DispatchToCurrentThread is null-safe and
> throws if the arg is null?  If so, that should be documented in
> nsThreadUtils.h.

null arg intentional.  API documented.


> > private:
> >+  friend class nsUnblockOnloadEvent;
> 
> As I recall, this will give errors on some of our compilers because that class
> is undeclared (and they don't treat friend decls as being declarations of the
> thing that's being a friend).  You need a forward-decl of nsUnblockOnloadEvent
> earlier in this file (and hence probably want to make it a private inner class
> of nsDocument).

I decided to simply forward declare the class to avoid having to #include nsThreadUtils.h in nsDocument.h as that would increase compile time.


> That said, perhaps some of these subclasses should override addref/release
> (only in trace-refcount builds?) to get better data out?

Interesting idea.  Maybe.  Since there are so many of them, I'd prefer to leave that until we see nsRunnable objects leaking, and I really want the implementation of new events to be as simple as possible.


> >@@ -799,22 +786,14 @@ nsImageLoadingContent::FireEvent(const n
> 
> >+  return NS_DispatchToCurrentThread(evt);
> 
> Is it me, or is NS_DispatchToMainThread a shade faster?  Since we're asserting
> that we're on the main thread above, could we use that here?  This code is,
> sadly, hit a lot during pageload.  :(  Or is the perf difference here
> completely negligible?

I'd be surprised if there was any measurable difference.  I prefer to minimize places where we assert that the current thread is the main thread.  In fact, that assertion seems superflous to me.  I don't see why we have it.


> >Index: content/base/src/nsObjectLoadingContent.cpp
> 
> Also, why not just use nsRefPtr<nsObjectLoadingContent> and not have to worry
> about manual refcounting?  Is the Release() method on that ambiguous or
> something?

Because AddRef and Release are ambiguous :-(


> >+nsAsyncInstantiateEvent::Run()
> >+  // Check if we've been "revoked"
> >+  if (mContent->mPendingInstantiateEvent != this)
> 
> Is this correct if multiple instantiate events are in flight at once (which is
> allowed)?  Maybe check with biesi?

OK, I sent mail to biesi asking about this.


> > nsObjectLoadingContent::HasNewFrame(nsIObjectFrame* aFrame)
> 
> >-  if (!mInstantiating && aFrame && mType == eType_Plugin) {
> >+  if (!mInstantiating && aFrame && mType == eType_Plugin &&
> >+      !mPendingInstantiateEvent) {
> 
> This changes the logic, no?  Why?  The change looks wrong to me, since
> instantiate events capture the aFrame, so the already-posted event will NOT
> fire on this changed aFrame.

I think this is tied to my assumption that there should only be one pending instantiate event per nsObjectLoadingContent.  I'll revise this code accordingly.


> For what it's worth more than 4 lines of context would really help with
> reviewing the changes to this function.  :(

Sorry, I normally diff with more context.  I think I reduced my context setting in .cvsrc for some reason and then never set it back :(


> >+      mPendingInstantiateEvent = nsnull;  // break cycle
> 
> Why do we have a cycle?  Why is mPendingInstantiateEvent even a strong ref?  It
> should be a weak ref (nulled out as needed in ~nsAsyncInstantiateEvent), imo. 
> Not quite sure what this method should do with mPendingInstantiateEvent in
> failure conditions....

Again, going to revisit this when I figure out how best to revise for multiple instantiate events.


> >Index: content/base/src/nsObjectLoadingContent.h
> >+    nsCOMPtr<nsIRunnable>       mPendingInstantiateEvent;
> 
> Like I said, this is better as a weak pointer, imo.



> >Index: content/events/public/nsPLDOMEvent.h
> >+ * TODO: This should be renamed nsAsyncDOMEvent or something that does
> >+ *       not include the substring "PL" that refers to the old PLEvent
> >+ *       structure used with the old eventing system.
> 
> File a followup bug; cite the bug# in the comment.

OK, see bug 334573.  Comment added to the code.


> >Index: docshell/base/nsDocShell.cpp
> 
> >+class RestorePresentationEvent : public nsRunnable
> >-    nsRefPtr<nsDocShell> mDocShell;
> >+    nsDocShellWeakRef mDocShell;
> 
> This changes the ownership model such that the docshell can now get destroyed
> while an event is pending.  We should probably call mWeakSelf.forget() in the
> destructor of nsDocShell.

OK


> >Index: docshell/base/nsWebShell.cpp
> 
> >-struct OnLinkClickEvent : public PLEvent {
> >+struct OnLinkClickEvent : public nsRunnable {
> 
> s/struct/class/ right?

OK


> >-  if (eventService)
> >-    eventService->GetThreadEventQueue(mThread, aQueue);
> 
> Can we assert before calling NS_DispatchToCurrentThread that the current thread
> is the same thread that Create() was called on or something?  Just in case
> someone calls into the nsILinkHandler api from off the main thread?  I see
> nothing that would prevent that from happening....
> 
> Better yet, would be to keep the "original thread" thing we have now and proxy
> over the clicks to the original thread as we do now -- that would allow
> OnLinkClick calls from all threads.

I'm not sure how a background thread would get a reference to a docshell, let alone a nsIContent!  It is an interesting idea allowing link click events from other threads, but I'd rather keep things limited for now.  I'll assert that the API is only used on the main thread.


> >Index: dom/src/base/nsGlobalWindow.cpp
> >@@ -4350,17 +4320,10 @@ nsGlobalWindow::Close()
> >+    nsRefPtr<nsCloseEvent> ev = new nsCloseEvent(this);
> 
> Why not just nsCOMPtr<nsIRunnable> here?

No good reason.  Changed.


> >Index: dom/src/base/nsJSEnvironment.cpp
> >@@ -2297,20 +2294,11 @@ nsJSEnvironment::Init()
> > #ifdef DEBUG
> >   // Let's make sure that our main thread is the same as the xpcom main thread.
> >+  NS_ASSERTION(NS_IsMainThread(), "bad");
> > #endif
> 
> No need for the #ifdef here.  ;)

Yup, changed.


> >Index: layout/base/nsCSSFrameConstructor.h
> 
> >+  struct RestyleEvent : public nsRunnable {
> 
> s/struct/class/, right?

OK


> >Index: layout/base/nsPresShell.cpp
> 
> >+struct ReflowEvent : public nsRunnable {
> 
> s/struct/class/

OK


> >         // Now, explicitly release the pres shell before the view manager
> >-        presShell = nsnull;
> >         viewManager = nsnull;
> 
> Please leave the |ps = nsnull| thing there so the code matches that comment.

OK


> >Index: layout/forms/nsComboboxControlFrame.cpp
> >@@ -1591,33 +1567,28 @@ nsComboboxControlFrame::RedisplayText(PR
> >+    nsCOMPtr<nsIRunnable> event = new RedisplayTextEvent(mWeakSelf);
> >+    NS_ENSURE_TRUE(event, NS_ERROR_OUT_OF_MEMORY);
> > 
> >+    // Revoke outstanding events to avoid out-of-order events which could mean
> >+    // displaying the wrong text.
> >+    if (mRedisplayTextEventPosted) {
> >+      mWeakSelf.forget();
> >+      mRedisplayTextEventPosted = PR_FALSE;
> >+    }
> 
> That |if (mRedisplayTextEventPosted)| block (together with the ensuing
> mWeakSelf initialization) needs to come before we create the
> RedisplayTextEvent.  Otherwise you're revoking the event you just created (even
> before you've posted it).

Good catch!  Fixed.


> >Index: layout/generic/nsGfxScrollFrame.h
> 
> >+  PRPackedBool mScrollEventPending:1;
> 
> Don't we need to init that in our constructor?

Yes, thanks!


> >Index: layout/generic/nsSelection.cpp
> > class nsTypedSelection : public nsISelection,
> >@@ -283,11 +285,11 @@ private:
> 
> >+  PRPackedBool mScrollEventPosted;
> 
> We need to init that in the constructor too, right?

Actually, that one is initialized in the constructor.


> >+struct nsScrollSelectionIntoViewEvent : public nsRunnable {
> 
> s/struct/class/

OK


> > nsTypedSelection::PostScrollSelectionIntoViewEvent(SelectionRegion aRegion)
> >+  nsCOMPtr<nsIRunnable> ev =
> >+      new nsScrollSelectionIntoViewEvent(mWeakSelf, aRegion);
> >+  if (NS_FAILED(NS_DispatchToCurrentThread(ev)))
> >+    return NS_ERROR_UNEXPECTED;
> 
> Is it worth propagating a better error here?

Seems like a good thing to do.


> >Index: layout/printing/nsPrintEngine.cpp
> 
> >+  NS_IMETHOD Run() {
> >+    if (mDocViewerPrint)
> 
> How could it be null?  We asserted that it's not in our constructor...

I think I was just preserving the code path.  There's an assertion in the constructor for the nsPrintCompletionEvent class that claims that it should never be null.  The old code has that assertion, and it null checked before calling OnDonePrinting.  Hmm...  if you think it can never be null, then I'll eliminate the null check.


> >Index: layout/style/nsCSSLoader.cpp
> 
> >+#if 0
> 
> Just remove the code.  It's not so pretty that we'd want to keep it around.  ;)

whoops.  thx!


> >@@ -1956,15 +1959,9 @@ CSSLoaderImpl::PostLoadEvent(nsIURI* aUR
> 
> >   if (!mPostedEvents.AppendElement(evt)) {
> >-    PL_DestroyEvent(evt);
> >+    NS_RELEASE(evt);
> 
> This needs to be |DestroyLoadEvent(evt)| instead of NS_RELEASE.  Otherwise we
> won't unblock the document onload.
>
> Alternately you could move the BlockOnload() call to after this block and use
> NS_RELEASE here as you do; now that we're not tied to destroying via
> PL_DestroyEvent and have a choice of whether DestroyLoadEvent() is called when
> we destroy the event, that might be a better control flow.

Yes, I like that better.  In fact, I can defer the AddRef call until after AppendElement succeeds to avoid the NS_RELEASE entirely.


> >Index: parser/htmlparser/src/nsParser.h
> >+    nsCOMPtr<nsIRunnable>        mContinueEvent;
> 
> I'm not really happy with the cycle here, even thought it does look like it's
> safe...  Could we make this a weak ref (and create new events into an on-stack
> nsCOMPtr)?  Or use the other revokation setup if that would make more sense?

We could, but weak refs introduce an additional malloc.  I tried to avoid weak refs whenever possible for that reason.  I think I prefer to leave the code as-is.  There are plenty of other cases where reference cycles are created for asynchronous operations (e.g., between nsIChannel and nsIStreamListener).  So long as you have well-defined and reliable points where the reference cycle will be broken (e.g., OnStopRequest), the reference cycle should not be a concern.


> >Index: view/src/nsViewManager.cpp
> >+    // mViewManager is a weak reference that the view manager will break in
> >+    // it's destructor.
> 
> s/it's/its/

Fixed


> > nsViewManager::PostInvalidateEvent()
> >+      NS_WARNING("failed to dispatch nsSynthMouseMoveEvent");
> 
> s/nsSynthMouseMoveEvent/nsInvalidateEvent/

Fixed


> r=bzbarsky with those fixed.

Thanks!
No longer blocks: 334573
Blocks: 334573
> In fact, that assertion seems superflous to me.  I don't see why we have it.

I'd be fine with removing the assert and just dispatching to the current thread event queue.  I agree that the less we explicitly work with the "main thread" the better.

> I'll assert that the API is only used on the main thread.

Sounds good.

> if you think it can never be null, then I'll eliminate the null check.

I think that's safe, yeah.

> In fact, I can defer the AddRef call until after AppendElement succeeds to
> avoid the NS_RELEASE entirely.

And just use raw delete or something?  I'd really prefer to use addref/release to manage lifetimes of XPCOM objects even if we think we know all the refs to them...
> And just use raw delete or something?  I'd really prefer to use addref/release
> to manage lifetimes of XPCOM objects even if we think we know all the refs to
> them...

No, I meant that if we are unable to allocate space in the nsVoidArray for the additional element, then we will just not bother AddRef'ing the given object.  We return and let the caller deal with the error.
Won't that leak the event we just created with |operator new|?
> Won't that leak the event we just created with |operator new|?

Yes, doh.  That was lame of me :-/
>>Index: content/base/src/nsObjectLoadingContent.h
>>+    nsCOMPtr<nsIRunnable>       mPendingInstantiateEvent;
>
>Like I said, this is better as a weak pointer, imo.

So, I decided to keep this as a strong ref to avoid malloc overhead associated with weak refs.
Comment on attachment 217587 [details] [diff] [review]
patch: widget v1

>Index: widget/public/nsIAppShell.idl

>+   * The starvationDelay arg is only used when favorPerfOverStarvation is
>+   * PR_FALSE. It is the amount of time in milliseconds to wait before the
>+   * PR_FALSE actually takes effect.
>+   */
>+  void favorPerformanceHint(in boolean favorPerfOverStarvation,
>+                            in unsigned long starvationDelay);

Huh? Could you explain why this millisecond arg is there? It sounds like 
"when favorPerfOverStarvation is PR_FALSE, continue starvation mode for another X millisecs", which I don't understand.

>Index: widget/src/cocoa/Makefile.in

>-		nsAppShellCocoa.mm \
>+		nsAppShell.mm \

Gratuitous renaming alert?

>Index: widget/src/cocoa/nsAppShell.mm

I'm certainly not all that competent to review this file or the set of changes from nsAppShellCocoa.mm; I assume that josh/mento will be reviewing (or wrote?) these bits?

>+// Arrange for Gecko events to be processed.  They will either be processed
>+// after the main run loop returns (if we own the run loop) or on
>+// NativeEventCallback (if an embedder owns the loop).

At some point I'd like to understand how this works in embedded situations, but perhaps not now.

>Index: widget/src/gtk/nsAppShell.cpp

>  * Contributor(s):
>- *   Markus G. Kuhn <mkuhn@acm.org>
>- *   Richard Verhoeven <river@win.tue.nl>
>- *   Frank Tang <ftang@netscape.com> adopt into mozilla

hrm?

>Index: widget/src/mac/nsAppShell.h

Ditto reviews for widget/src/mac, I am incompetent to review this.

>Index: widget/src/windows/nsWindow.cpp

>+  static int recursionBlocker = 0;
>+  if (recursionBlocker++ == 0) {
>+    NS_ProcessPendingEvents(nsnull, PR_MillisecondsToInterval(100));
>+    --recursionBlocker;

Just to verify, this code is main-thread only, correct?

>Index: widget/src/xpwidgets/nsBaseAppShell.cpp

>+#define THREAD_EVENT_STARVATION_LIMIT PR_MillisecondsToInterval(20)

I don't suppose it's worth making this configurable?

>+nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal *thr, PRBool mayWait,

>+      mDummyEvent = new nsRunnable();

Due to XPCOM reviews, nsRunnable is an abstract base now, isn't it? I presume this got fixed up with a dummy subclass.

>Index: xpfe/appshell/src/nsXULWindow.cpp

Did you have to reindent this? I'd like a -w version of this file since it's really hard to see what changed here.

>Index: xpfe/components/startup/src/nsAppStartup.cpp

>+class nsAppStartupExitEvent : public nsRunnable
>+{
>+public:
>+  nsAppStartupExitEvent(nsAppStartup *service)
>+    : mService(service) {}
>+
>+  NS_IMETHOD Run() {
>+    // Tell the appshell to exit
>+    mService->mAppShell->Exit();

Hrm, doesn't the "new" nsIAppShell.quit unwind nicely? Why couldn't you just call the method directly, instead of using an event?

>+    // We're done "shutting down".
>+    mService->mShuttingDown = PR_FALSE;

This will fail to compile on VC6, you need to cast mService to nsAppShell*. 

I also don't understand why we would set this to PR_FALSE once we've started shutdown, but the original code had it... I must be missing something of the semantics of mShuttingDown. You'd think after I designed this I would remember.

>Index: toolkit/xre/nsNativeAppSupportWin.cpp

>-    if (getenv("MOZ_NO_REMOTE"))
>+    if (1) //getenv("MOZ_NO_REMOTE"))

Debugging cruft?

r=me except for nsXULWindow and the mac-specific changes
Attachment #217587 - Flags: review?(benjamin) → review+
> So, I decided to keep this as a strong ref to avoid malloc overhead associated
> with weak refs.

I was thinking something like nsIRunnable* as the weak ref...  Since we just do pointer compares on that pointer, that should be fine, no?
> >+   * The starvationDelay arg is only used when favorPerfOverStarvation is
> >+   * PR_FALSE. It is the amount of time in milliseconds to wait before the
> >+   * PR_FALSE actually takes effect.
> >+   */
> >+  void favorPerformanceHint(in boolean favorPerfOverStarvation,
> >+                            in unsigned long starvationDelay);
> 
> Huh? Could you explain why this millisecond arg is there? It sounds like 
> "when favorPerfOverStarvation is PR_FALSE, continue starvation mode for
> another X millisecs", which I don't understand.

This is PL_FavorPerformanceHint migrated from plevent.c to here.  I didn't change a thing about how it works, and yes... you did understand it correctly.  It is pretty lame -- I agree -- but, I fear making too many more changes at once in this patch!


> >Index: widget/src/cocoa/Makefile.in
> 
> >-		nsAppShellCocoa.mm \
> >+		nsAppShell.mm \
> 
> Gratuitous renaming alert?

That was Mark's choice.  In fact, he wrote that file.  I'm fine with the name change since there's very little carried over from the old code.


> I'm certainly not all that competent to review this file or the set of 
> changes from nsAppShellCocoa.mm; I assume that josh/mento will be reviewing 
> (or wrote?) these bits?

I reviewed Mark's code and it checks out AFAIK.  I'll ask Josh to review it as well.


> >+// Arrange for Gecko events to be processed.  They will either be processed
> >+// after the main run loop returns (if we own the run loop) or on
> >+// NativeEventCallback (if an embedder owns the loop).
> 
> At some point I'd like to understand how this works in embedded situations, 
> but perhaps not now.

It does work :-)  I tested TestGtkEmbed and winEmbed.


> >Index: widget/src/gtk/nsAppShell.cpp
> 
> >  * Contributor(s):
> >- *   Markus G. Kuhn <mkuhn@acm.org>
> >- *   Richard Verhoeven <river@win.tue.nl>
> >- *   Frank Tang <ftang@netscape.com> adopt into mozilla
> 
> hrm?

The file is completely new.  It just happens to share the same name.


> >Index: widget/src/mac/nsAppShell.h
> 
> Ditto reviews for widget/src/mac, I am incompetent to review this.

Mark wrote this file too.  I'll ask Josh to review it as well.

 
> >Index: widget/src/windows/nsWindow.cpp
> 
> >+  static int recursionBlocker = 0;
> >+  if (recursionBlocker++ == 0) {
> >+    NS_ProcessPendingEvents(nsnull, PR_MillisecondsToInterval(100));
> >+    --recursionBlocker;
> 
> Just to verify, this code is main-thread only, correct?

Yes.

 
> >Index: widget/src/xpwidgets/nsBaseAppShell.cpp
> 
> >+#define THREAD_EVENT_STARVATION_LIMIT PR_MillisecondsToInterval(20)
> 
> I don't suppose it's worth making this configurable?

It may be nice for testing, but I think we probably don't need to pref this.

 
> >+nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal *thr, PRBool mayWait,
> 
> >+      mDummyEvent = new nsRunnable();
> 
> Due to XPCOM reviews, nsRunnable is an abstract base now, isn't it? I presume
> this got fixed up with a dummy subclass.

nsRunnable is a concrete class.


> >Index: xpfe/appshell/src/nsXULWindow.cpp
> 
> Did you have to reindent this? I'd like a -w version of this file since it's
> really hard to see what changed here.

Sorry, I meant to post a -w version of this file.  I'll post one shortly.


> >Index: xpfe/components/startup/src/nsAppStartup.cpp
> 
> >+class nsAppStartupExitEvent : public nsRunnable
> >+{
> >+public:
> >+  nsAppStartupExitEvent(nsAppStartup *service)
> >+    : mService(service) {}
> >+
> >+  NS_IMETHOD Run() {
> >+    // Tell the appshell to exit
> >+    mService->mAppShell->Exit();
> 
> Hrm, doesn't the "new" nsIAppShell.quit unwind nicely? Why couldn't you just
> call the method directly, instead of using an event?

The old code posted an event, so I was just preserving that.  Again, I want to minimize the number of changes if possible.

 
> >+    // We're done "shutting down".
> >+    mService->mShuttingDown = PR_FALSE;
> 
> This will fail to compile on VC6, you need to cast mService to nsAppShell*. 

The trunk does not compile on VC6.  So, is this really an issue?

 
> I also don't understand why we would set this to PR_FALSE once we've started
> shutdown, but the original code had it... I must be missing something of the
> semantics of mShuttingDown. You'd think after I designed this I would 
> remember.

Yeah, I'm not sure.  I'd like to preserve the existing code as much as possible.


> >Index: toolkit/xre/nsNativeAppSupportWin.cpp
> 
> >-    if (getenv("MOZ_NO_REMOTE"))
> >+    if (1) //getenv("MOZ_NO_REMOTE"))
> 
> Debugging cruft?

Yeah, thanks.  I'll be sure to fix that.
> I was thinking something like nsIRunnable* as the weak ref...  Since we just do
> pointer compares on that pointer, that should be fine, no?

Oh, good point.  I'll investigate doing that.
> The trunk does not compile on VC6.  So, is this really an issue?

creature and pacifica are still building with VC6, so yes.
> creature and pacifica are still building with VC6, so yes.

Oh, hmm... so is pacifica not building with thebes enabled?  Anyways, I have VC6 on my system, so I'll make sure the entire patch builds properly with it before committing any changes.
I take it that this will not be landing on 1.8.1, correct?

~B
> I take it that this will not be landing on 1.8.1, correct?

That's right.  It will not be landing on 1.8.1.
Whiteboard: [trunk only]
(In reply to comment #66)
> > I was thinking something like nsIRunnable* as the weak ref...  Since we just do
> > pointer compares on that pointer, that should be fine, no?
> 
> Oh, good point.  I'll investigate doing that.

I made a similar change to nsParser::mContinueEvent.

Blocks: 332579
(In reply to comment #8)
> Martijn,
> 
> I've uploaded a new windows build here:
>http://stage.mozilla.org/pub/mozilla.org/firefox/nightly/experimental/threadmanager/

will there be newer builds posted here for testing?
(AddRef and Release are ambiguous in nsObjectLoadingContent due to the multiple inheritance from nsISupports and lack of implementation of any nsISupports methods in it)
Comment on attachment 218186 [details] [diff] [review]
patch: layout v1.2

I agree with bz that this looks wrong:
+  if (!mInstantiating && aFrame && mType == eType_Plugin &&
+      !mPendingInstantiateEvent) {

just leaving the if as-is should be ok, I think.

Only the latest event should have an effect. Maybe HasNewFrame should have revoked currently pending events but that's probably moot now :-)

note that this doesn't handle OOM correctly:
+  nsCOMPtr<nsIRunnable> ev = new nsPluginNotFoundEvent(thisContent);
+  nsresult rv = NS_DispatchToCurrentThread(ev);
No longer blocks: 332579
> I agree with bz that this looks wrong:
> +  if (!mInstantiating && aFrame && mType == eType_Plugin &&
> +      !mPendingInstantiateEvent) {
> 
> just leaving the if as-is should be ok, I think.

I removed the !mPendingInstantiateEvent bit locally.  I'll be posting a new patch shortly.


> Only the latest event should have an effect. Maybe HasNewFrame should have
> revoked currently pending events but that's probably moot now :-)
> 
> note that this doesn't handle OOM correctly:
> +  nsCOMPtr<nsIRunnable> ev = new nsPluginNotFoundEvent(thisContent);
> +  nsresult rv = NS_DispatchToCurrentThread(ev);

NS_DispatchToCurrentThread returns NS_ERROR_INVALID_ARG if given null.  That seems fine to me.  It doesn't have to return NS_ERROR_OUT_OF_MEMORY.
(In reply to comment #75)
> NS_DispatchToCurrentThread returns NS_ERROR_INVALID_ARG if given null.  That
> seems fine to me.  It doesn't have to return NS_ERROR_OUT_OF_MEMORY.

Ah, sorry, I forgot that it accepts null.

Comment on attachment 218120 [details] [diff] [review]
patch: xpcom v1.2

@@ -236,54 +221,26 @@ nsMemoryImpl::FlushMemory(const PRUnicha

It's bad that FlushMemory with aImmediate PR_TRUE called during a flush, or off the main thread, returns immediately, possibly before any flushing has happened. However I guess this is not your problem.

+    sIsFlushing = 0;

Doesn't this need to be an atomic operation, to ensure the right memory barriers are in place?

 PRBool
-nsMemoryImpl::sIsFlushing = PR_FALSE;
+nsMemoryImpl::sIsFlushing = 0;

Make it PRInt32.

+/**
+ * Create a new thread, and optionally provide an initial event for the thread.
+ *
+ * @param result
+ *   The resulting nsIThread object.
+ * @param event
+ *   The initial event to run on this thread.  This parameter may be null.
+ * @param name
+ *   The name of the thread, which must be unique, or the empty string to
+ *   create an anonymous thread.
+ */

Say what happens if it's not unique.

+/**
+ * Get a reference to the thread with the given name.
+ *
+ * @param result
+ *   The resulting nsIThread object or null if no such thread exists.
+ */

You might want to mention that because of potential races, by the time you get the result the thread might have been created by another thread. So all we can say is that if you get a thread object back, then the thread exists. If you get null back you don't know anything, unless you have some extra knowledge that no-one else can create a thread with that name at this time.

+/**
+ * Shortcut for nsIThread::HasPendingEvents.
+ */
+inline PRBool
+NS_HasPendingEvents(nsIThread *thread)

This also requires a documentation warning. This is hard to use correctly without additional contextual knowledge. If it returns PR_FALSE, then we don't know anything because someone else might just have posted an event to the thread. If it returns PR_TRUE and 'thread' is not the current thread, then it might have just processed the events so there are no events pending.

+     * @param target
+     *   If target is null, then the current thread is used as the target.  If
+     *   target is NS_PROXY_TO_MAIN_THREAD (C++ only), then the main thread is
+     *   used as the target.  Otherwise, target identifies the nsIEventTarget
+     *   from which proxy method calls should be executed.

I thought you weren't going to document this?

+    if (mCallersTarget) {
+        nsCOMPtr<nsIRunnable> event =
+                new nsProxyCallCompletedEvent(this);
+        if (event) {
+            mCallersTarget->Dispatch(event, NS_DISPATCH_NORMAL);
+            return;
+        }
     }

If OOM the caller won't get a reply event, which is likely to deadlock everything. Perhaps it would be better to check mCallersTarget and preallocate the reply event before we even try to make the call, so we can return immediately if there's a problem.

+        nsCOMPtr<nsIRunnable> event = new nsProxyDestructorEvent(this);
+        if (event == nsnull)
+        {
+            NS_NOTREACHED("Leaking nsProxyObject!");
+            return;  // if this happens we are going to leak.
+        }

Similarly we could preallocate the destructor event for the proxy object.

Not really your problem, but nsProxyObjectCallInfo::GetCompleted should be an atomic operation to ensure the right memory barriers are used.

I'm up to xpcom/threads/TimerThread.cpp...
> It's bad that FlushMemory with aImmediate PR_TRUE called during a flush, or 
> off the main thread, returns immediately, possibly before any flushing has
> happened. However I guess this is not your problem.

Yeah, I didn't change that behavior.

 
> +    sIsFlushing = 0;
> 
> Doesn't this need to be an atomic operation, to ensure the right memory
> barriers are in place?

No, I believe we can assume that 32-bit values are written atomically on all platforms.  That assumption is made in other places too.


>  PRBool
> +nsMemoryImpl::sIsFlushing = 0;
> 
> Make it PRInt32.

Fixed.


> Say what happens if it's not unique.

OK


> +/**
> + * Get a reference to the thread with the given name.
> + *
> + * @param result
> + *   The resulting nsIThread object or null if no such thread exists.
> + */
> 
> You might want to mention that because of potential races, by the time you 
> get the result the thread might have been created by another thread. So all 
> we can say is that if you get a thread object back, then the thread exists. 
> If you get null back you don't know anything, unless you have some extra 
> knowledge that no-one else can create a thread with that name at this time.

Good point.


> +/**
> + * Shortcut for nsIThread::HasPendingEvents.
> + */
> +inline PRBool
> +NS_HasPendingEvents(nsIThread *thread)
> 
> This also requires a documentation warning. This is hard to use correctly
> without additional contextual knowledge. If it returns PR_FALSE, then we 
> don't know anything because someone else might just have posted an event to 
> the thread. If it returns PR_TRUE and 'thread' is not the current thread, 
> then it might have just processed the events so there are no events pending.

Well... this function is designed to be used on the thread itself.  It is not designed to be used from other threads.  In fact, it will return NS_ERROR_UNEXPECTED if called from another thread.  I should document that properly in nsThreadUtils.h.  It's a little unfortunate to keep two copies of the documentation: once in nsIThread/nsIThreadManager and again in nsThreadUtils.h, but I guess that's better than people being confused about the APIs.


> 
> +     * @param target
> +     *   If target is null, then the current thread is used as the target.  If
> +     *   target is NS_PROXY_TO_MAIN_THREAD (C++ only), then the main thread is
> +     *   used as the target.  Otherwise, target identifies the nsIEventTarget
> +     *   from which proxy method calls should be executed.
> 
> I thought you weren't going to document this?

I'm not.  I need to post revised XPCOM patches.  The current patch does not take into account all of bsmedberg's comments ;-)


> +    if (mCallersTarget) {
> +        nsCOMPtr<nsIRunnable> event =
> +                new nsProxyCallCompletedEvent(this);
> +        if (event) {
> +            mCallersTarget->Dispatch(event, NS_DISPATCH_NORMAL);
> +            return;
> +        }
>      }
> 
> If OOM the caller won't get a reply event, which is likely to deadlock
> everything. Perhaps it would be better to check mCallersTarget and 
> preallocate the reply event before we even try to make the call, so we can 
> return immediately if there's a problem.

Agreed.  That's a bug in the old implementation too.  I'd prefer to file a bug on cleaning up xpcom/proxy instead since that's not the only problem with that code.


> +        nsCOMPtr<nsIRunnable> event = new nsProxyDestructorEvent(this);
> +        if (event == nsnull)
> +        {
> +            NS_NOTREACHED("Leaking nsProxyObject!");
> +            return;  // if this happens we are going to leak.
> +        }
> 
> Similarly we could preallocate the destructor event for the proxy object.

Again, I'd like to punt on this.  Too many changes in this patch already! ;-)


> Not really your problem, but nsProxyObjectCallInfo::GetCompleted should be an
> atomic operation to ensure the right memory barriers are used.

Are you sure that's an issue?


I will be posting revised patches tonight.
Here's a version of nsXULWindow.cpp with whitespace removed.
Attachment #219264 - Flags: superreview?
Attachment #219264 - Flags: review?(benjamin)
Comment on attachment 217587 [details] [diff] [review]
patch: widget v1

Josh: please review the widget/mac and widget/cocoa changes.  Thanks!  They were done by Mark.
Attachment #217587 - Flags: review?(joshmoz)
Attached file patch: layout v1.3
Roc: here's the latest version of the "layout" portion of this patch for review.
Attachment #218186 - Attachment is obsolete: true
Attachment #219265 - Flags: superreview?
Attachment #218186 - Flags: superreview?(roc)
Attachment #219265 - Flags: superreview? → superreview?(roc)
Attachment #219264 - Flags: superreview? → superreview?(roc)
Latest xpcom patch bits.
Attachment #218120 - Attachment is obsolete: true
Attachment #219267 - Flags: superreview?(roc)
Attachment #218120 - Flags: superreview?(roc)
Attached patch patch: mail v1.1Splinter Review
revised mailnews patch (fixed imap bug).  over to bienvenu for review.
Attachment #217590 - Attachment is obsolete: true
Attachment #219268 - Flags: superreview?(roc)
Attachment #219268 - Flags: review?(bienvenu)
Attachment #217590 - Flags: superreview?(roc)
Attachment #217590 - Flags: review?(benjamin)
(In reply to comment #78)
> > +    sIsFlushing = 0;
> > 
> > Doesn't this need to be an atomic operation, to ensure the right memory
> > barriers are in place?
> No, I believe we can assume that 32-bit values are written atomically on all
> platforms.  That assumption is made in other places too.

That's a bad assumption. The issue is not whether the write is atomic, but whether other stores can be moved around this one (as seen by other threads). We should discuss this further elsewhere. In the meantime, since you're adding this issue in a new place, please leave an XXX comment so we can find it if we revisit.

> > +/**
> > + * Shortcut for nsIThread::HasPendingEvents.
> > + */
> > +inline PRBool
> > +NS_HasPendingEvents(nsIThread *thread)
> > 
> > This also requires a documentation warning. This is hard to use correctly
> > without additional contextual knowledge. If it returns PR_FALSE, then we 
> > don't know anything because someone else might just have posted an event to 
> > the thread. If it returns PR_TRUE and 'thread' is not the current thread, 
> > then it might have just processed the events so there are no events pending.
> Well... this function is designed to be used on the thread itself.  It is not
> designed to be used from other threads.  In fact, it will return
> NS_ERROR_UNEXPECTED if called from another thread.  I should document that
> properly in nsThreadUtils.h.

Why take a thread parameter then?

> > If OOM the caller won't get a reply event, which is likely to deadlock
> > everything. Perhaps it would be better to check mCallersTarget and 
> > preallocate the reply event before we even try to make the call, so we can 
> > return immediately if there's a problem.
> Agreed.  That's a bug in the old implementation too.  I'd prefer to file a bug
> on cleaning up xpcom/proxy instead since that's not the only problem with that
> code.

Fair enough.
(In reply to comment #78)
> Well... this function is designed to be used on the thread itself.  It is not
> designed to be used from other threads.  In fact, it will return
> NS_ERROR_UNEXPECTED if called from another thread.  I should document that
> properly in nsThreadUtils.h.  It's a little unfortunate to keep two copies of
> the documentation: once in nsIThread/nsIThreadManager and again in
> nsThreadUtils.h, but I guess that's better than people being confused about the
> APIs.

Even if you only use it on the current thread, you should still document that when it return PR_FALSE, the caller should be aware that someone may in fact have just dispatched an event to the thread (unless extra knowledge is available).
Might be worth documenting somewhere how nsEventQueue adds a reference to the runnable when you do PutEvent, then GetEvent moves the runnable from the queue to the result, so doesn't need to change the refcount. That's a bit clever.

Is this page structure for the queue really important? What does it buy you over a singly-linked list, possibly with a recycling freelist for the cells?

In nsIEventTarget:
+  /**
+   * Check to see if this event target is associated with the current thread.
+   *
+   * @returns
+   *   A boolean value that if "true" indicates that events dispatched to this
+   *   event target will run on the current thread (i.e., the thread calling
+   *   this method).
+   */

What does this return if you call it on a thread pool from a thread that's a member of the pool?

In nsIThread:

When this function returns,
+   * the thread will be shutdown, and it will no longer be possible to dispatch
+   * events to the thread.

What happens if two threads call shutdown() on the same third thread? Better document this.

+  /**
+   * This method may be called to determine if there are any events ready to be
+   * processed.  It may only be called when this thread is the current thread.
+   *
+   * @returns
+   *   A boolean value that if "true" indicates that this thread has one or
+   *   more pending events.

This needs a warning that if it returns "false", there could still be pending events due to races.

+  // Wait for thread to call ThreadManager::SetCurrentThread, which completes

I think you mean SetupCurrentThread.

+  // XXX we could still end up with other events being added after the shutdown task

This seems like a big hole. Isn't this fairly easy to fix with a new flag on the thread?

For synchronous dispatch, do we need some kind of notification to be posted back to the dispatcher thread? Right now we have this:
+    while (wrapper->IsPending())
+      NS_ProcessNextEvent(thread);
but what if no-one ever happens to post events to the current thread? I think perhaps nsThreadSyncDispatch should post a do-nothing event back to the dispatcher thread.

Up to xpcom/threads/nsThreadManager.cpp...
> That's a bad assumption. The issue is not whether the write is atomic, but
> whether other stores can be moved around this one (as seen by other threads).
> We should discuss this further elsewhere. In the meantime, since you're 
> adding this issue in a new place, please leave an XXX comment so we can find 
> it if we revisit.

Interesting.  Yes, we should chat about this.


> > > +NS_HasPendingEvents(nsIThread *thread)
...
> Why take a thread parameter then?

This is just a shortcut for nsIThread::HasPendingEvents.  I suppose I could make the thread parameter optional.  Then, it'd just be a slight optimization to specify it.
> Even if you only use it on the current thread, you should still document that
> when it return PR_FALSE, the caller should be aware that someone may in fact
> have just dispatched an event to the thread (unless extra knowledge is
> available).

OK
Comment on attachment 217587 [details] [diff] [review]
patch: widget v1

r+ with the following stuff fixed up

+  if (mDelegate)
+    [mDelegate release];

Null check is unnecessary (making such a call has a built-in nil check) - if it is nil it won't matter, if it is a bad pointer you're screwed anyway.

+  [NSBundle loadNibFile:
+                     [NSString stringWithUTF8String:(const char*)nibPath.get()]
+      externalNameTable:
+           [NSDictionary dictionaryWithObject:[NSApplication sharedApplication]
+                                       forKey:@"NSOwner"]
+               withZone:NSDefaultMallocZone()];

That is really strange indentation - if you need to, to make it more readable, you can break some of the internal values into variables right before the call.

+ * Runs the main native Cocoa run loop, interrupting it as needed to process

Not correct as that comment is in widget/src/mac/nsAppShell.h

Otherwise looks good.
Attachment #217587 - Flags: review?(joshmoz) → review+
> > > > +NS_HasPendingEvents(nsIThread *thread)
> ...
> > Why take a thread parameter then?
> 
> This is just a shortcut for nsIThread::HasPendingEvents.  I suppose I could
> make the thread parameter optional.  Then, it'd just be a slight optimization
> to specify it.

OK, I made the |thread| parameter optional for both NS_HasPendingEvents and NS_ProcessNextEvent.  I also moved the implementation of those functions into nsThreadUtils.cpp since they became non-trivial.
Attachment #219264 - Flags: review?(benjamin) → review+
Comment on attachment 219268 [details] [diff] [review]
patch: mail v1.1

I had to pass in "" to NS_NewThread in nsImapProtocol.cpp for the thread name arg.

It looks to me that to fix all the assertions, the channel listener will need to be released on the UI thread, which is tricky.
Attached patch fix thread-safety assertions (obsolete) — Splinter Review
Darin, this fixes the thread-safety assertions every time you read an imap message. 

It would be nicer if the tee stream listener or whatever it is used thread-safe addref/releases, but I don't know how feasible that is. This patch introduces two more proxied roundtrips between the imap thread and the ui thread, which I'm not crazy about.
> That is really strange indentation - if you need to, to make it more readable,
> you can break some of the internal values into variables right before the 
> call.

Since Mark wrote the file (and its an entirely new file), I'm going to defer to him for coding style preference.
> Is this page structure for the queue really important? What does it buy you
> over a singly-linked list, possibly with a recycling freelist for the cells?

I chose to implement a deque structure to minimize allocation overhead.  I suppose I could have used a recycling freelist, but I think this implementation will perform better under heavy load (i.e., fewer mallocs).

The old PLEvent system avoided the need to allocate a container for the PLEvent by storing a pointer to the next PLEvent in the PLEvent itself.  This new event queue system has the property of allowing a nsIRunnable to be dispatched multiple times at once.  I take advantage of that in nsBaseAppShell.


> In nsIEventTarget:
> +  /**
> +   * Check to see if this event target is associated with the current thread.
> +   *
> +   * @returns
> +   *   A boolean value that if "true" indicates that events dispatched to this
> +   *   event target will run on the current thread (i.e., the thread calling
> +   *   this method).
> +   */
> 
> What does this return if you call it on a thread pool from a thread that's a
> member of the pool?

Yes, that is a good question.  The thread pool returns true if called on any one of the threads in the pool.


> In nsIThread:
> 
> When this function returns,
> +   * the thread will be shutdown, and it will no longer be possible to
> dispatch
> +   * events to the thread.
> 
> What happens if two threads call shutdown() on the same third thread? Better
> document this.

Yeah, badness... I'll document it.  PR_DestroyLock, PR_JoinThread, and other functions have similar behavior, so I think it is okay.  It would be costly to protect against the badness, and so I'd like to avoid doing so.


> +  /**
> +   * This method may be called to determine if there are any events ready to
> be
> +   * processed.  It may only be called when this thread is the current thread.
> +   *
> +   * @returns
> +   *   A boolean value that if "true" indicates that this thread has one or
> +   *   more pending events.
> 
> This needs a warning that if it returns "false", there could still be pending
> events due to races.

This seems pretty obvious to me, but okay ;-)


> +  // XXX we could still end up with other events being added after the
> shutdown task
> 
> This seems like a big hole. Isn't this fairly easy to fix with a new flag on
> the thread?

I think I would need to synchronize across setting that flag and calling PutEvent.  Otherwise, it is difficult to ensure that the shutdown event is the last event added to the queue.  I suppose I could process all pending events after the shutdown event runs.  Yeah, that combined with a flag should do the trick.


> For synchronous dispatch, do we need some kind of notification to be posted
> back to the dispatcher thread? Right now we have this:
> +    while (wrapper->IsPending())
> +      NS_ProcessNextEvent(thread);
> but what if no-one ever happens to post events to the current thread? I think
> perhaps nsThreadSyncDispatch should post a do-nothing event back to the
> dispatcher thread.

Good catch!  In fact, I originally intended for nsThreadSyncDispatch to dispatch a do-nothing event back to the dispatcher thread ;-)  I've changed it to dispatch itself back to the originating thread (with appropriate flagging to make the event do nothing).
sometimes the folder sink will be null - better to assert than crash in that situation.
Attachment #219382 - Attachment is obsolete: true
the other problem I'm having with this patch is that TB doesn't shut down, I suspect because I've set some prefs to do some cleanup at shutdown time, which pumps events to get the cleanup work done. I'll need to investigate a bit.
(In reply to comment #94)
> > What does this return if you call it on a thread pool from a thread that's a
> > member of the pool?
> Yes, that is a good question.  The thread pool returns true if called on any
> one of the threads in the pool.

Please make that clear by tightening up the language in the comment.
Attached patch hack to fix shutdown problems. (obsolete) — Splinter Review
If I add this little bit of sleep, it allows the imap thread(s) to shutdown cleanly and thus TB shuts down, even when doing cleanup on exit.  This is a pretty sucky hack, but maybe it will help in coming up with a better fix? The UI thread basically signals a monitor that an imap thread is waiting on, and when the imap thread wakes up, it exits. w/o this sleep call, I think the UI thread gets too far in shutting down the app such that we can't kill the imap threads.

The imap thread is here waiting on the m_urlReadyToRunMonitor

http://lxr.mozilla.org/mozilla/source/mailnews/imap/src/nsImapProtocol.cpp#1118

which TellThreadToDie has notified here:

http://lxr.mozilla.org/mozilla/source/mailnews/imap/src/nsImapProtocol.cpp#1051
It's quite possible that we could sleep for a lot less time and still have this work. On the other hand, we could have a lot of imap threads waiting to shutdown, up to five per imap server in the normal configuration.
Comment on attachment 219629 [details] [diff] [review]
hack to fix shutdown problems.

never mind, sleeping doesn't help - cleanup on exit was turned off
Attachment #219629 - Attachment is obsolete: true
Comment on attachment 217591 [details] [diff] [review]
patch: other v1

>Index: chrome/src/nsChromeProtocolHandler.cpp

>-NS_IMPL_THREADSAFE_ISUPPORTS2(nsChromeProtocolHandler, nsIProtocolHandler, nsISupportsWeakReference)
>+NS_IMPL_THREADSAFE_ISUPPORTS2(nsChromeProtocolHandler, nsIProtocolHandler,
>+                              nsISupportsWeakReference)

If you're going to do this can you put each iface on its own line?

>Index: directory/xpcom/base/src/nsLDAPConnection.cpp

>+            // XXX(darin): We need to shutdown this thread at some point.
>+            //             Otherwise, it will stick around until shutdown.

File a followup?

>Index: extensions/irc/js/lib/dcc.js

Need to get a MOA here... checking for the existence of interfaces instead of contracts makes me slightly nervous, but I guess it's ok since the existing code does it.

>Index: extensions/xmlextras/base/src/nsDOMParser.h

extensions/xmlextras is in the process of being copied to content/, coordinate this with peterv.

>Index: modules/plugin/base/src/nsPluginNativeWindowWin.cpp

I'd love an extra set of eyes on this plugin code, it makes me shiver.

>Index: storage/src/mozStorageService.cpp

>-static const char* gQuitApplicationMessage = "quit-application";
>+static const char* gShutdownMessage = "xpcom-shutdown-threads";

static const char kShutdownMessage[];
Attachment #217591 - Flags: review?(benjamin) → review+
Comment on attachment 217591 [details] [diff] [review]
patch: other v1

ssieb: please review extensions/irc changes.  thanks!
Attachment #217591 - Flags: review?(jessiebarnes72)
Attachment #217591 - Flags: review?(jessiebarnes72) → review?(samuel)
Comment on attachment 219267 [details] [diff] [review]
patch: xpcom v1.3

+    if (!name.IsEmpty()) {
+      // make sure thread name is unique
+      nsCOMPtr<nsIThread> temp;
+      GetThread(name, getter_AddRefs(temp));
+
+      if (temp && temp != previous) {
+        NS_ERROR("thread name is not unique");
+        return NS_ERROR_INVALID_ARG;
+      }
+      mThreadsByName.Put(name, thread);

There's a race here where setting two threads with the same name at the same time could both succeed, but only one ends up in the table. There's a similar, in fact bigger issue with the interval between NewThread checking the name for uniqueness and SetupNewThread filling in the name table. This could probably be fixed with an atomic "check name for uniqueness and preallocate entry in name table" operation, or maybe you just want to document that the thread manager will not necessarily give an error if two threads with the same name are created at the same time.

SetupCurrentThread as written requires either current or previous to be null. Either this should be documented or it should handle changing the nsIThread object from one non-null value to another. Actually I think the two usages should be split into two different methods ... SetupCurrentThread and TeardownCurrentThread, say.

+nsThreadPool::ShutdownThread(nsIThread *thread)
+  // This method is responsible for calling Shutdown on the thread.  This must
+  // be done from the main thread of the application.
+
+  NS_ASSERTION(!NS_IsMainThread(), "wrong thread");

The comment is wrong, I guess. "This must be called by 'thread'" (which can't be the main thread?)

It seems like it would be convenient for a thread to be able to call shutdown on itself. Of course that would require the behaviour to be special-cased to not block on the thread exit. So perhaps another method, shutdownCurrent would be better. Then nsThreadPool::ShutdownThread, which is really nsThreadPool::ShutdownCurrent, wouldn't require proxying, and same goes for any worker-thread type setup where a thread goes through a bunch of work until it's done
In nsThreadPool::PutEvent don't you need to do a mon.Notify() to wake up a waiting idle thread if you didn't spawn one?

+          PRIntervalTime delta = timeout - (now - idleSince);
Could this be zero, and if so, what happens? I think your test "(now - idleSince) > timeout" should be >=.

sr+ with all those comments addressed.
Attachment #219267 - Flags: superreview?(roc) → superreview+
Comment on attachment 217591 [details] [diff] [review]
patch: other v1

r=samuel@sieb.net on the chatzilla changes
Attachment #217591 - Flags: review?(samuel) → review+
> There's a race here where setting two threads with the same name at the same
> time could both succeed, but only one ends up in the table. There's a similar,
> in fact bigger issue with the interval between NewThread checking the name for
> uniqueness and SetupNewThread filling in the name table.

Good catch.  I definitely want to fix that race by pre-populating mThreadsByName with some kind of dummy entry.


> +nsThreadPool::ShutdownThread(nsIThread *thread)
> +  // This method is responsible for calling Shutdown on the thread.  This must
> +  // be done from the main thread of the application.
> +
> +  NS_ASSERTION(!NS_IsMainThread(), "wrong thread");
> 
> The comment is wrong, I guess. "This must be called by 'thread'" (which can't
> be the main thread?)

The comment is not very clear.  It is hard to mix phrases that speak of calling a method on an object named "thread" and calling a method on a particular thread of execution.  In this case, I'm trying to say that we cannot call Shutdown directly on the given object named "thread".  Instead, we must proxy the Shutdown call to some other thread, and so we just go with the main application thread.


> It seems like it would be convenient for a thread to be able to call shutdown
> on itself. Of course that would require the behaviour to be special-cased to
> not block on the thread exit. So perhaps another method, shutdownCurrent would
> be better. Then nsThreadPool::ShutdownThread, which is really
> nsThreadPool::ShutdownCurrent, wouldn't require proxying, and same goes for 
> any worker-thread type setup where a thread goes through a bunch of work until 
> it's done

bsmedberg also wanted this.  If I'm going to add it, then I'd probably just make Shutdown support it.  What is gained by introducing another method (besides simiplifying the documentation task)?


> In nsThreadPool::PutEvent don't you need to do a mon.Notify() to wake up a
> waiting idle thread if you didn't spawn one?

That happens inside the nsEventQueue::PutEvent call :-)


> +          PRIntervalTime delta = timeout - (now - idleSince);
> Could this be zero, and if so, what happens? I think your test "(now -
> idleSince) > timeout" should be >=.

I think that Wait(0) is okay (in that it doesn't wait), but OK... I like the idea of exiting the thread if (now - idleSince) >= timeout.

(In reply to comment #105)
> > It seems like it would be convenient for a thread to be able to call shutdown
> > on itself. Of course that would require the behaviour to be special-cased to
> > not block on the thread exit. So perhaps another method, shutdownCurrent would
> > be better. Then nsThreadPool::ShutdownThread, which is really
> > nsThreadPool::ShutdownCurrent, wouldn't require proxying, and same goes for 
> > any worker-thread type setup where a thread goes through a bunch of work until 
> > it's done
> 
> bsmedberg also wanted this.  If I'm going to add it, then I'd probably just
> make Shutdown support it.  What is gained by introducing another method
> (besides simiplifying the documentation task)?

If someone calls Shutdown(thread) and then carries on expecting 'thread' to have terminated, they'll get a surprise if 'thread' is the current thread. It would be better for them to get an error.

> > In nsThreadPool::PutEvent don't you need to do a mon.Notify() to wake up a
> > waiting idle thread if you didn't spawn one?
> That happens inside the nsEventQueue::PutEvent call :-)

Interesting. Can we make it do Notify(), not NotifyAll()? You're waking up all threads even though you only need one.
> If someone calls Shutdown(thread) and then carries on expecting 'thread' to
> have terminated, they'll get a surprise if 'thread' is the current thread. It
> would be better for them to get an error.

Yeah, that does seem awkward.  It isn't meant to work like pthread_exit, but people might expect that.  I'll give this API some thought.  I'm not sure what I want to do.


> > > In nsThreadPool::PutEvent don't you need to do a mon.Notify() to wake up a
> > > waiting idle thread if you didn't spawn one?
> > That happens inside the nsEventQueue::PutEvent call :-)
> 
> Interesting. Can we make it do Notify(), not NotifyAll()? You're waking up all
> threads even though you only need one.

Hmm... maybe.
Comment on attachment 217587 [details] [diff] [review]
patch: widget v1

+  /**
+   * Exit the handle event loop
+   */
+  void exit();

Comment that this should be called from the same thread that calls "run".

+  /**
+   * Called by subclasses from a native event.  See CallFromNativeEvent.
+   */
+  void NativeEventCallback();
+
+  /**
+   * Implemented by subclasses.  Invoke NativeEventCallback from a native
+   * event.  This method may be called on any thread.
+   */
+  virtual void CallFromNativeEvent() = 0;
+

These comments aren't very clear about who's responsible for what or what the methods do.
Comment on attachment 217587 [details] [diff] [review]
patch: widget v1

I can't give good review on a lot of this platform-specific stuff.

In nsAppShell::Init (GTK2)
+failed:
+    return NS_ERROR_FAILURE;
You need to close mPipeFDs[0] and [1] here.

+    ioc = g_io_channel_unix_new(mPipeFDs[0]);
Can this fail? The GTK docs aren't clear.

+    mTag = g_io_add_watch_full(ioc, G_PRIORITY_DEFAULT, G_IO_IN,
+                               EventProcessorCallback, this, nsnull);
Not sure about this one either.

What's the deal with OS/2? is someone else going to take care of it?

I don't like the name "NativeEventCallback". How about "ProcessGeckoEventsFromNativeQueue"? I don't much like that either...

I don't like "CallFromNativeEvent" either. How about "NotifyGeckoEventAvailable"?

+nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal *thr, PRBool mayWait,
+                                   PRUint32 recursionDepth)
+{
+  PRIntervalTime start = PR_IntervalNow();
+  PRIntervalTime limit = THREAD_EVENT_STARVATION_LIMIT;
+
+  if (mFavorPerf <= 0 && start > mSwitchTime + mStarvationDelay) {

Suppose I call FavorPerformanceHint(PR_FALSE, 0) and then FavorPerformanceHint(PR_FALSE, 1000). I would expect to be favouring interaction immediately, but the second call has the effect of favouring performance for one second. Is this what you intend? It seems odd. I guess I don't know how the delay is supposed to be used yet.

The changes to reindent most of nsXULWindow were unfortunate...

-    if (getenv("MOZ_NO_REMOTE"))
+    if (1) //getenv("MOZ_NO_REMOTE"))

Why is this change here?
> I can't give good review on a lot of this platform-specific stuff.

That's okay.  I'm hopeful that the platform-specific widget stuff has been appropriately reviewed: mark+josh on mac/cocoa, bsmedberg on windows, and you on gtk2/gtk.


> In nsAppShell::Init (GTK2)
> +failed:
> +    return NS_ERROR_FAILURE;
> You need to close mPipeFDs[0] and [1] here.

Thanks.


> +    ioc = g_io_channel_unix_new(mPipeFDs[0]);
> Can this fail? The GTK docs aren't clear.

I don't know.  The old nsAppShell.cpp seemed to think it could not fail.  I bet it can fail in low-memory situations though.


> +    mTag = g_io_add_watch_full(ioc, G_PRIORITY_DEFAULT, G_IO_IN,
> +                               EventProcessorCallback, this, nsnull);
> Not sure about this one either.

Right.  Again, I'm just doing what the old code did.


> What's the deal with OS/2? is someone else going to take care of it?

I'm not sure.  I'll remind the OS/2 folks again about this.


> I don't like the name "NativeEventCallback". How about
> "ProcessGeckoEventsFromNativeQueue"? I don't much like that either...

That name is inaccurate though since the callback only processes gecko events in the embedding case (when nsAppShell::Run was not called).


> I don't like "CallFromNativeEvent" either. How about
> "NotifyGeckoEventAvailable"?

Hrm... maybe.  The point of the function is to schedule a callback from the native event system.  What nsBaseAppShell does from that callback is up to nsBaseAppShell.

 
> +nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal *thr, PRBool mayWait,
> +                                   PRUint32 recursionDepth)
> +{
> +  PRIntervalTime start = PR_IntervalNow();
> +  PRIntervalTime limit = THREAD_EVENT_STARVATION_LIMIT;
> +
> +  if (mFavorPerf <= 0 && start > mSwitchTime + mStarvationDelay) {
> 
> Suppose I call FavorPerformanceHint(PR_FALSE, 0) and then
> FavorPerformanceHint(PR_FALSE, 1000). I would expect to be favouring
> interaction immediately, but the second call has the effect of favouring
> performance for one second. Is this what you intend? It seems odd. I guess I
> don't know how the delay is supposed to be used yet.

I agree that FavorPerformanceHint is really wacky.  Bsmedberg asked about this too.  I'm just preserving the old logic of PL_FavorPerformanceHint.  It did exactly this, and so I'm keeping the logic the same.  I don't want to deal with so many moving parts at once.  There's already enough changes to the system being made by this patch.  Someone determined that that wacky behavior of FavorPerformanceHint is good, so I'm going to go with it for now.  It makes sense to me to file a follow-up bug to investigate doing something else.


> The changes to reindent most of nsXULWindow were unfortunate...

Yes, sorry.  That's why I posted a -w version of nsXULWindow changes separately.  I probably shouldn't have included the full version in this patch.  The whitespace changes are good in my opinion because the old code was difficult to read and therefore difficult to maintain.


> -    if (getenv("MOZ_NO_REMOTE"))
> +    if (1) //getenv("MOZ_NO_REMOTE"))
> 
> Why is this change here?

Please ignore that.  It was there for debugging, and I forgot to remove it.
How about if I rename CallFromNativeEvent to ScheduleNativeEventCallback?  And then keep the NativeEventCallback method named as it is.
Comment on attachment 217587 [details] [diff] [review]
patch: widget v1

sr+ with those changes
Attachment #217587 - Flags: superreview?(roc) → superreview+
+class nsCachedChromeChannel : public nsIChannel, public nsIRunnable

I'm not a big fan of tacking nsIRunnable onto an existing object so you can use it as an event (you do this in a few other places too). It's a little bit wasteful of memory and it conflates two things that are really different.

How about a helper nsIRunnable implementation, something like this:
class nsRunnableMethod : public nsIRunnable {
public:
  typedef void (nsISupports::*Method)();
  nsRunnableMethod(nsISupports* aTarget, Method aMethod);
  void Run() { (aTarget->*aMethod)(); }
private:
  nsCOMPtr<nsISupports> mTarget;
  Method                mMethod;
};

Then nsCachedChromeChannel can just do
  nsIRunnable* runnable = new nsRunnableMethod(this, &nsCachedChromeChannel::HandleLoadEvent);
Well, that doesn't typecheck, so you might need a template, but you know what I mean.
> I'm not a big fan of tacking nsIRunnable onto an existing object so you can 
> use it as an event (you do this in a few other places too). It's a little bit
> wasteful of memory and it conflates two things that are really different.

Isn't it wasteful of memory to allocate the extra object?  That said, I think what you are saying is that the lifetime of the two objects is not the same, so they should be allocated separately.  OK.  I like your suggestion.  I'll try to come up with a templatized way of doing that and add it to nsThreadUtils.
My guess is that the event objects are have a much shorter lifetime than the objects they're being grafted into.

You could just typedef Method to a static function that takes a single void* parameter, and manually add a static helper function to the class to bounce to the right method. That's probably the best way to go without calling on too much C++-fu.

This could be used for NotifyObserverEvent too.
I did the C++ foo for HTTP, and it works nicely.  I'll do the same for this.
http://lxr.mozilla.org/mozilla/source/netwerk/protocol/http/src/nsHttpChannel.h#143
+  NS_DispatchToMainThread(ev, NS_DISPATCH_SYNC);
 }

If this fails, you leak ev.

+    rv = NS_DispatchToMainThread(ev);
   }
   return rv;
 }

Ditto

+        NS_DispatchToMainThread(ev);
         return;
       }

Ditto.

Perhaps nsThread::Dispatch should, on failure, check whether the runnable's refcount is zero, and if so delete the runnable? You'd need to document that, of course, but it would make the common case safe and convenient.

This keeps on happening, actually, so I'm not even going to list them all :-).
I guess the other option is to make 'ev' etc an nsRefPtr.
ooops, you already did. Disregard the last two comments!
Comment on attachment 217591 [details] [diff] [review]
patch: other v1

+nsresult nsNSSEventPostToUIThread(nsIRunnable *event);

Might as well make this inline.

sr+ assuming the nsCachedChromeChannel separation looks OK.
Attachment #217591 - Flags: superreview?(roc) → superreview+
Attachment #219264 - Flags: superreview?(roc) → superreview+
Comment on attachment 218208 [details] [diff] [review]
patch: necko v1.1

@@ -96,14 +96,14 @@ PendingPACQuery::Start()
   rv = mURI->GetAsciiHost(host);
   if (NS_FAILED(rv))
     return rv;
 
-  nsCOMPtr<nsIEventQueue> eventQ;
-  rv = NS_GetCurrentEventQ(getter_AddRefs(eventQ));
+  nsCOMPtr<nsIThread> thread;
+  rv = NS_GetCurrentThread(getter_AddRefs(thread));

Why didn't you use do_GetCurrentThread here?

 class nsPACMan : public nsIStreamLoaderObserver
                , public nsIInterfaceRequestor
                , public nsIChannelEventSink
+               , public nsIRunnable

Separation...

+  /*XXX
   PR_STATIC_CALLBACK(void *) LoadEvent_Handle(PLEvent *);
   PR_STATIC_CALLBACK(void) LoadEvent_Destroy(PLEvent *);
+  */

Why not just remove it?
Attachment #218208 - Flags: superreview?(roc) → superreview+
> > > What does this return if you call it on a thread pool from a thread that's a
> > > member of the pool?
> > Yes, that is a good question.  The thread pool returns true if called on any
> > one of the threads in the pool.
> 
> Please make that clear by tightening up the language in the comment.

Actually, I was wrong.  I forgot that I borrowed the same implementation from nsIOThreadPool's IsOnCurrentThread, which is to always return false.  In practice, that results in potentially extra dispatches when something could have happened synchronously.  That's not so bad.  Moreover, I have a NS_NOTREACHED assertion in that function that has to the best of my knowledge never been hit.  We just don't call that on the thread pools.  I'm probably going to preserve that NS_NOTREACHED for now and deal with a providing a better implementation later if the need arises.
> +nsresult nsNSSEventPostToUIThread(nsIRunnable *event);
> 
> Might as well make this inline.

Actually, I meant to CVS remove both the header and source since that function is no longer used :)
> -  nsCOMPtr<nsIEventQueue> eventQ;
> -  rv = NS_GetCurrentEventQ(getter_AddRefs(eventQ));
> +  nsCOMPtr<nsIThread> thread;
> +  rv = NS_GetCurrentThread(getter_AddRefs(thread));
> 
> Why didn't you use do_GetCurrentThread here?

In this case, I wanted to be very sure that the function succeeds since it would be bad to call AsyncResolve with a null event target.  That causes the DNS resolver to call you back on its thread.


>  class nsPACMan : public nsIStreamLoaderObserver
>                 , public nsIInterfaceRequestor
>                 , public nsIChannelEventSink
> +               , public nsIRunnable
> 
> Separation...

Done, and I'm really liking nsRunnableMethod... actually NS_NewRunnableMethod.
 

> Why not just remove it?

removed
+    // Holds a weak reference to ourselves that RestorePresentationEvent objects
+    // will use to reference us.  We can sever this weak ref at any time to
+    // effectively "revoke" any pending RestorePresentationEvent objects.
+    nsDocShellWeakRef          mWeakSelf;

Maybe you should call this something other than a weak ref. If you just wanted a weak ref there's nsSupportsWeakReference which would be more generally useful. What you actually want here is something you can cancel specifically to revoke those RestorePresentationEvents.

How about just storing a pointer to the RestorePresentationEvent here, and an nsDocShell* in RestorePresentationEvent, with a method on RestorePresentationEvent to null out its docshell pointer? To revoke the event from nsDocShell, call that method and null out the docshell's pointer to the event. Also null out the docshell's pointer to the event when the event fires. This would also be slightly more efficient because you don't need another object.

+  nsCSSFrameConstructorWeakRef mWeakSelf;

Same deal. And here you wouldn't need the pending flag, you'd know that a non-null event reference meant an event was pending.

+  PresShellWeakRef              mWeakSelf;
+  nsComboboxControlFrameWeakRef mWeakSelf;
+  nsGfxScrollFrameInnerWeakRef mWeakSelf;
+  nsTypedSelectionWeakRef mWeakSelf;
+  nsViewManagerWeakRef mWeakRef;

Same here, you wouldn't need additional pending flags.

+    // This creates a reference cycle between this and the event that is
+    // broken when the event fires.

You haven't really made a reference cycle because the pointer to the event is not a strong reference.

+void nsParser::HandleParserContinueEvent(nsParserContinueEvent *ev) {
+  // Ignore any revoked continue events...

IMHO it would be better to poke the event to drop its reference to the parser when it gets revoked. Then HandleParserContinueEvent would never get called, wouldn't need the extra parameter, and we wouldn't have the potential issue where a revoked event keeps its parser alive longer than necessary.
Maybe something like this could help:

template <class T> class nsRevocableEventPtr<T> {
public:
  nsRevocableEventPtr() : mEvent(nsnull) {}
  ~nsRevocableEventPtr() { Revoke(); }
  const nsRevocableEventPtr& operator=(T* aEvent) { Revoke(); mEvent = aEvent; }
  void Revoke() { if (mEvent) { mEvent->Revoke(); mEvent = nsnull; } }
  void Forget() { mEvent = nsnull; }
  PRBool IsPending() { return mEvent != nsnull; }
private:
  T* mEvent;
};

nsTWeakRef seems useful if you have a possibly unbounded number of events referring back to the same object, but it seems that we don't.
Comment on attachment 219265 [details]
patch: layout v1.3

Other than that issue, this looks good.
Attachment #219265 - Flags: superreview?(roc) → superreview+
> How about just storing a pointer to the RestorePresentationEvent here, and an
> nsDocShell* in RestorePresentationEvent

If we do that, we should ensure that we can't have two presentation restore events for the same docshell in flight at once.  Right now I don't think we guarantee that.... (but we probably should).
That's easy enough. Either suppress the second one, or revoke the first one by clearing its docshell pointer.
Attachment #219268 - Flags: superreview?(roc) → superreview+
Comment on attachment 219268 [details] [diff] [review]
patch: mail v1.1

I noticed you only updated the uuid of nsIImapService but not nsIImapProtocol or nsIImapIncomingServer - I doubt it matters since all you've changed is the type signature of some methods, and those interfaces will have their uuid changed anyway between 2.0 and 3.0, I'm sure.

nsImapProxyEvent.cpp/h shouldn't be used anymore. My bad for leaving the include in nsImapProtocol.h - I'll fix that, and cvs remove the files.

Are there tabs here? If so, can you remove them?

+	NS_GetProxyForObject(NS_PROXY_TO_MAIN_THREAD, NS_GET_IID(nsIStringBundle),
+											 m_pBundle, NS_PROXY_SYNC | NS_PROXY_ALWAYS,
+											 (void **) &strProxy);
 
(I know these files are a tab paradise :-( )

r=bienvenu, modulo the three issues we've talked about:

1. The streamListener thread safety assertions
2. The imap thread name needs to be null or empty string (whichever one works, I've forgotten)
3. The hang on shutdown if we cleanup on exit.
Attachment #219268 - Flags: review?(bienvenu) → review+
> I noticed you only updated the uuid of nsIImapService but not nsIImapProtocol
> or nsIImapIncomingServer - I doubt it matters since all you've changed is the
> type signature of some methods, and those interfaces will have their uuid
> changed anyway between 2.0 and 3.0, I'm sure.

It's probably best to change them.


> nsImapProxyEvent.cpp/h shouldn't be used anymore. My bad for leaving the
> include in nsImapProtocol.h - I'll fix that, and cvs remove the files.

I'll cvs remove them :)


> Are there tabs here? If so, can you remove them?
> 
> +       NS_GetProxyForObject(NS_PROXY_TO_MAIN_THREAD,
> NS_GET_IID(nsIStringBundle),
> +                                                                              
>          m_pBundle, NS_PROXY_SYNC | NS_PROXY_ALWAYS,
> +                                                                              
>          (void **) &strProxy);
> 
> (I know these files are a tab paradise :-( )

OK... will do.


> r=bienvenu, modulo the three issues we've talked about:
> 
> 1. The streamListener thread safety assertions

Fixed with NS_ProxyRelease.


> 2. The imap thread name needs to be null or empty string (whichever one works,
> I've forgotten)

Fixed.  Threads no longer have names actually.  I removed that feature because it was making the implementation of nsThreadManager overly complex, and there were no consumers depending on thread naming.


> 3. The hang on shutdown if we cleanup on exit.

I'm working on this now.
> Maybe something like this could help:
> 
> template <class T> class nsRevocableEventPtr<T> {

Yeah, that seems nice.


> nsTWeakRef seems useful if you have a possibly unbounded number of events
> referring back to the same object, but it seems that we don't.

Agreed.
Blocks: 301560
Attached file patch for check-in
This is the latest candidate patch for check-in.  Sorry, but I had to gzip it.
Attachment #221530 - Attachment is patch: false
Attachment #221530 - Attachment mime type: text/plain → application/x-gzip
I really missed thsi bug at the time. So it is really chance that BeOS will be broken.
Trying to do what i can this weekend
(In reply to comment #136)
> I really missed thsi bug at the time. So it is really chance that BeOS will be
> broken.
> Trying to do what i can this weekend
> 

Confirming, landing patch has broken BeOS builds.  Bug 337489 opened for regression.
No longer blocks: 337489
Depends on: 337489
OK, it's in.  Modulo some possible perf regressions on Windows, things seem well.

Marking FIXED :)

Thanks to everyone who helped make this possible!  I especially appreciate the code reviews and the help from those who downloaded the early test builds.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
It seems to me that the call to NS_ProcessPendingEvents(mMainThread) in nsThreadManager::Shutdown should be moved down below where all the other threads are shut down. And the comment should be fixed, of course.
I guess it doesn't matter much. I think it's actually only needed if there are no non-main threads and there are some events for the main thread on entry to nsThreadManager::Shutdown. (If there's at least one non-main thread, we would process all previously pending and newly-posted main thread events while shutting down the non-main threads.) But I'd move it down to avoid giving the impression that the main thread processes no more events after where the ProcessingPendingEvents call is now.
Depends on: 337516
Blocks: 336985
Blocks: 281269
Blocks: 297294
Blocks: 230610
This apparently causes Camino trunk builds to be unable to load pages after having been idle for a brief but fluctuating time period (bug 337550).
Depends on: 337711
In SeaMonkey and Firefox of trunk, Japanese cannot be input by IME.
(Camino is not confirmed for the reasons for bug337550. )

I file bug337740.

OK:
Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.9a1) Gecko/20060510
Minefield/3.0a1
NG:
Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.9a1) Gecko/20060511
Minefield/3.0a1
Depends on: 337740
Depends on: 337879
Blocks: 323262
Depends on: 337993
Blocks: 197341
Depends on: 338491
https://bugzilla.mozilla.org/process_bug.cgi#c2 suggests this might be causing the inability to shutdown after closing the last window on linux.  any thoughts?
Depends on: 338343
Depends on: 339683
Depends on: 50104
Blocks: 55403
Blocks: 62604
Blocks: 76075
Blocks: 134723
Blocks: 340283
Depends on: 341430
FYI, this check-in seems to have initiated the problem with the tab DnD indicator not working the first time you drag (bug 338302).
Blocks: 74331
Depends on: 338302
What branches/versions will this fix be available for?
trunk only
Blocks: 123202
Depends on: 339210
Depends on: 344861
Blocks: 319857
Depends on: 347266
Depends on: 347750
Depends on: 348318
Depends on: 346973
Depends on: 354087
Depends on: 338549
Since there are multiple dependent bugs still open - should this bug be re-opened?
No, this is fixed...  The open dependent bugs are basically either regressions or followups or bugs that we thought might get fixed by this but didn't....
Depends on: 360219
Depends on: 342810
Depends on: 363585
Depends on: 337761
Depends on: 371374
Depends on: 374957
No longer depends on: 374957
Depends on: 374957
Depends on: 376306
Depends on: 368325
Depends on: 378358
Alias: nsIThreadManager
Depends on: 382186
Depends on: 386614
Depends on: 389615
Depends on: 389994
Depends on: 389931
Depends on: 381699
Depends on: 338225
No longer depends on: 389615
Depends on: 358202
Depends on: 359515
No longer depends on: 354087
Depends on: 389675
Depends on: 404694
Depends on: 417750
Depends on: 430800
Depends on: 430567
Depends on: 444322
Blocks: 469667
Depends on: 678573
Depends on: 620806
Depends on: 749145
Depends on: 900711
You need to log in before you can comment on or make changes to this bug.