Closed Bug 215085 Opened 22 years ago Closed 13 years ago

Cleanup EventQueue(Service)

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: timeless, Assigned: timeless)

References

Details

(Keywords: crash)

testcase (for xpcshell): eqs=Components.classes["@mozilla.org/event-queue-service;1"].getService(Components.interfaces.nsIEventQueueService) try { for (a=0; a<10000; print(++a)) { eqs.popThreadEventQueue(eqs.pushThreadEventQueue()); gc(); } } finally { quit(); } I'd like to start w/ danm's review of the patch in attachment 129042 [details] [diff] [review]. Note that this test case should *not* accumulate event queues since each iteration pushes, pops, and ideally garbage collects a single event queue. This flavor is a crasher in various ways (out of file handles, stack depth exceeded).
There we go. Time to get this review off my hard disk. I think the patch is mostly good guards and checks that are currently missing. But I do feel the need to pick at it piecemeal: Index: nsEventQueue.cpp =================================================================== RCS file: /cvsroot/mozilla/xpcom/threads/nsEventQueue.cpp,v retrieving revision 3.38 diff -up6 -r3.38 nsEventQueue.cpp --- nsEventQueue.cpp +++ nsEventQueue.cpp @@ -60,12 +60,16 @@ static PRThread *gEventQueueLogThread = @@ -110,17 +114,12 @@ nsEventQueueImpl::nsEventQueueImpl() @@ -141,12 +140,16 @@ nsEventQueueImpl::Init(PRBool aNative) @@ -169,12 +172,16 @@ nsEventQueueImpl::InitFromPRThread(PRThr all fine-'n-good @@ -493,33 +500,31 @@ nsEventQueueImpl::Create(nsISupports *aO ... - nsCOMPtr<nsPIEventQueueChain> queueChain(do_QueryInterface(aQueue)); + nsCOMPtr<nsPIEventQueueChain> queueChain(do_QueryInterface(aQueue, &rv)); + if (NS_FAILED(rv)) + return rv; ... - nsCOMPtr<nsPIEventQueueChain> endChain(do_QueryInterface(end)); + nsCOMPtr<nsPIEventQueueChain> endChain(do_QueryInterface(end, &rv)); But I do have to grouse about this one. What, am I the only guy who doesn't trust return codes? Really, there are accessors in the code which return an NS_OK error code but a null interface. Couldn't name one right now of course but I've been bitten by that one before. In my apparently minority opinion, what I really care about is whether I got the interface I wanted, not what the stinking result code was. The pattern nsCOMPtr<nsIThing> thing(do_QueryInterface(protoThing,&rv)) if (NS_SUCCEEDED(rv)) thing->Method() makes me totally nervous. In this case it's just a QI so |returned value| and |returned interface| are equivalent but still I prefer this section the way it was. But since it seems like I'm in the minority I'm prepared to cave. Brendan? Index: nsEventQueue.h =================================================================== RCS file: /cvsroot/mozilla/xpcom/threads/nsEventQueue.h,v retrieving revision 3.19 diff -up6 -r3.19 nsEventQueue.h --- nsEventQueue.h +++ nsEventQueue.h @@ -77,12 +77,13 @@ private: ... + Unlink(); ... This Unlink() is premature. I hope this isn't the crux of your patch because it can't be that way. There are clients of this code (necko) which continue to post events to cached event queues even after they've been deactivated. This is legal; we deal with it by punting the tardy event up to an active elder queue. But this Unlink cuts the queue out of the chain so it loses track of the elder queue(s) before its last owner (and potential poster) has released it. If you run with this change and exercise networking a bit I promise in time you'll hit the "event dropped because event chain is dead" (nsEventQueue.cpp;3.38#251) assertion. It's not a bad idea though, pulling the queue out of the chain at this point. I've been toying with it myself. It's just a little trickier than that. Index: nsEventQueueService.cpp =================================================================== RCS file: /cvsroot/mozilla/xpcom/threads/nsEventQueueService.cpp,v retrieving revision 1.39 diff -up6 -r1.39 nsEventQueueService.cpp --- nsEventQueueService.cpp +++ nsEventQueueService.cpp @@ -182,13 +182,14 @@ nsEventQueueServiceImpl::CreateEventQueu @@ -259,13 +260,12 @@ NS_IMETHODIMP @@ -277,64 +277,66 @@ nsEventQueueServiceImpl::PushThreadEvent again fine-'n-good, except for that curious reliance on return values :). Note in this case the value-of-interest returned from MakeNewQueue (aQueue) can actually be non-null though the value-of-no-interest returned (rv) indicates failure, which breaks my own minority world-view. However that's another error: the queue is broken and shouldn't be returned and in fact will leak because of queue's unusual ownership model but that's another issue... And in the case of the third change (@@ -277...) if the incompletely initialized queue mistakenly returned by MakeNewQueue didn't leak then with this change from the patch: nsCOMPtr<nsIEventQueue> newQueue; + rv = MakeNewQueue((PRThread*)key.GetValue()... + *aNewQueue = newQueue; + if (NS_SUCCEEDED(rv)) { ... + NS_ADDREF(*aNewQueue); aNewQueue would return a dangling pointer. It's important to return null pointers! The int return code; not so much. I know. I stand alone there. So mostly good. I'd like some changes and I have in mind two additions: MakeNewQueue I think should also be fixed to not return an incompletely initialized object (and to fix the leak this will require a hack to artificially decrement the refcount because Queues, and I admit this is an error of mine, ref themselves in their constructor (that could also be fixed but it wants care)). Also, a DEBUG-only check to warn us when the event queue stack gets too deep. That'll be more important now. I've just checked that in, though.
Shouldn't the logging in the constructor be moved to one of the Init functions? Isn't it just logging the uninitialized mEventQueue, well now initialized to null? And while your in there, the Init functions share some code that could potentially be put in an inline helper function and shared. Something like below. Have Init and InitFromPRThread call this helper, and then put the logging in InitFromPLQueue. That should get the initial logging in a central place. Timeless, up to you if you want to fix the logging or not. inline nsresult nsEventQueueImpl::InitImpl(PRThread* thread, PRBool aNative) { PLEventQueue* queue; if (aNative) queue = PL_CreateNativeEventQueue("Thread event queue...", thread); else queue = PL_CreateMonitoredEventQueue("Thread event queue...", thread); if (queue) return InitFromPLQueue(queue); else return NS_ERROR_OUT_OF_MEMORY; }
It's just logging. It was exactly where I wanted it when I put it in there, (I imagine!). Seemed worthwhile checking in at the time but I doubt anyone has used it since. Seems senseless to move it unless you want to use it and find it's in an inconvenient place for the problem at hand.
Danm may have a point: if a getter can return NS_OK with a null out param, you have to test both. If you care more about null than the rv, test the out param pointer after the call (make sure it's null before -- use an nsCOMPtr) and bail with rv if it's null. I don't know about the particular interface methods called in the patch; someone could look. But in many cases a one-size-fits-all rule is best: null check and fail with the rv, or don't even bother capturing the callee's rv, make your own up (don't hide errors such as NS_ERROR_OUT_OF_MEMORY, tho). /be
Yay! Just checked in; an example of the subject of comments 4 and (mostly) comment 1, where the returned interface was null but the error code was NS_OK. Part of the patch checked in (ignore the patch posted in the bug which used a more clumsy NS_ENSURE_TRUE(sectionNode) form) went: nsCOMPtr<nsIDOMNode> sectionNode; nsresult rv = GetParentNode(getter_AddRefs(sectionNode)); - NS_ENSURE_SUCCESS(rv, rv); + if (!sectionNode) { + return rv; + } So there's an illustration of my point. The patched version (nsHTMLTableRowElement.cpp 1.78) was a return to how the code originally read. In rev 1.69 it was changed to the "trust rv" version. The result was crash bug 211357. Now of course QueryInterface isn't the same thing as an accessor method. If a QI returned a null interface you'd also expect an error return. So in the case of the patch for this bug, both versions are pretty much equivalent. Still, I find the "if (!sectionNode)" code more ad rem and therefore more readable, more elegant and more trustworthy. So I'd hate to see perfectly good interface checking code lose any more ground to that misbegotten "trust rv" stuff. There. All done ranting. For a few minutes.
QA Contact: scc → xpcom
This bug has no information on current versions and no clear lead as to what the crashes that it tracks are. This should either be updated with current info and reopened or new bugs be filed on concrete actions/crashes on current code.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.