Closed Bug 287861 Opened 19 years ago Closed 19 years ago

event queue code ignores creation error, crashes on null mEventQueue

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: brendan, Assigned: brendan)

References

Details

(Keywords: crash, Whiteboard: [greasemonkey])

Attachments

(2 files, 2 obsolete files)

I ran into this by installing greasemonkey and surfing around.  Patch with
narrative analysis:

#0  0x0058b7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x007f1734 in raise () from /lib/tls/libpthread.so.0
#2  0x0805556a in nsProfileLock::FatalSignalHandler ()
#3  <signal handler called>

Crashing on null |queue| here:

#4  0xf6f4f9be in PL_IsQueueNative ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libxpcom.so

Called from here with null |mEventQueue|:

#5  0xf6f50c00 in nsEventQueueImpl::NotifyObservers ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libxpcom.so

Called from here:

#6  0xf6f50d79 in nsEventQueueImpl::InitFromPRThread ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libxpcom.so

with aNative true, so PL_CreateNativeEventQueue must have failed.  Looking
at PL_CreateNativeEventQueue, the only way its _pl_CreateEventQueue can fail
other than OOM is if _pl_SetupNativeNotifier fails.  Looking there, I see a
pipe(2) call, which can fail on file table overflow.

Are we (in combination with greasemonkey, but it can't be solely to blame)
leaking fds?

In any event, nsEventQueueImpl::InitFromPRThread is clearly broken -- it fails
to check for PL_Create*EventQueue failure and therefore pass let a null
mEventQueue in |this| flow into NotifyObservers, etc.  It should propagate the
error to its caller with an early return.

#7  0xf6f51c77 in nsEventQueueServiceImpl::MakeNewQueue ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libxpcom.so
#8  0xf6f521a4 in nsEventQueueServiceImpl::PushThreadEventQueue ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libxpcom.so
#9  0xf6657f78 in nsXMLDocument::Load ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libgklayout.so
#10 0xf6f685c9 in XPTC_InvokeByIndex ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libxpcom.so
#11 0xf6c09e2e in XPCWrappedNative::CallMethod ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libxpconnect.so
#12 0xf6c0fb26 in XPC_WN_CallMethod ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libxpconnect.so
#13 0xf6fc97bc in js_Invoke ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libmozjs.so

GreaseMonkey (chrome://greasemonkey/content/greasemonkey.js), trying to load a
default-config.xml user script at the *'d line:

        this.load = function() {
          var doc = document.implementation.createDocument("", "", null);
          doc.async = false;
          try {
            doc.load(getScriptChrome("config.xml"));
          } catch (exc) {
*           doc.load(getScriptChrome("default-config.xml"));
          }
          . . .
        }

#14 0xf6fc0c1a in js_Interpret ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libmozjs.so
#15 0xf6fc9881 in js_Invoke ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libmozjs.so
#16 0xf6c0630a in nsXPCWrappedJSClass::CallMethod ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libxpconnect.so
#17 0xf6c024e6 in nsXPCWrappedJS::CallMethod ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libxpconnect.so
#18 0xf6f690e1 in PrepareAndDispatch ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libxpcom.so
#19 0xf65784f7 in nsEventListenerManager::HandleEventSubType ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libgklayout.so
#20 0xf6578f69 in nsEventListenerManager::HandleEvent ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libgklayout.so
#21 0xf66ff6fc in nsXULElement::HandleDOMEvent ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libgklayout.so
#22 0xf66ff7dd in nsXULElement::HandleDOMEvent ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libgklayout.so

Patch next.

/be
Attached patch proposed fix (obsolete) — Splinter Review
Note that if we are leaking fds (possibly due to greasemonkey attempting to
load lots of XML docs), I should have reached a state where I couldn't load
other types of docs.  I didn't see that (but of course, I crashed right after
pipe presumably failed, possibly due to too many open files per process being
reached, so there *could* be a leak).

So I've added a printf to plevent.c right after pipe fails.  I'll report here
if I see that printf.

/be
Attachment #178694 - Flags: superreview?(darin)
Attachment #178694 - Flags: review?(danm.moz)
Comment on attachment 178694 [details] [diff] [review]
proposed fix

This is a 1.0.1 branch patch, but it applies to the trunk too.

/be
Whiteboard: [greasemonkey]
this is a subset of the changes from bug 215085. which i was planning on
resurrecting because i recently revisited eventqueue's (there's some live bug
about onfocus=alert which results in us exhausting a system resource and the
eventqueue code crashes because it makes a really bogus assertion about windows
never failing to create windows...).
Comment on attachment 178694 [details] [diff] [review]
proposed fix

looks good to me.  do callers of these functions handle these errors properly?
Attachment #178694 - Flags: superreview?(darin) → superreview+
Comment on attachment 178694 [details] [diff] [review]
proposed fix

I forgot to patch this hunk, which comes just before the one hunk shown for
nsEventQueue.cpp:

@@ -145,16 +145,18 @@ nsEventQueueImpl::~nsEventQueueImpl()
 NS_IMETHODIMP
 nsEventQueueImpl::Init(PRBool aNative)
 {
   PRThread *thread = PR_GetCurrentThread();
   if (aNative)
     mEventQueue = PL_CreateNativeEventQueue("Thread event queue...", thread);
   else
     mEventQueue = PL_CreateMonitoredEventQueue("Thread event queue...",
thread);
+  if (!mEventQueue)
+    return NS_ERROR_FAILURE;
   NotifyObservers(gActivatedNotification);
   return NS_OK;
 }

 NS_IMETHODIMP
 nsEventQueueImpl::InitFromPRThread(PRThread* thread, PRBool aNative)
 {
   if (thread == NS_CURRENT_THREAD)

darin: the callers can be chased from
http://lxr.mozilla.org/mozilla/search?string=InitFromPRThread -- I looked until
I got sleepy, all was well (if you include the stylized checking done after the
fact at
http://lxr.mozilla.org/mozilla/source/xpfe/appshell/src/nsXULWindow.cpp#1736).

Asking dougt for r=, I don't think danm.moz is doing Mozilla stuff right now
(or ever again -- wahhh).

/be
Attachment #178694 - Flags: review?(danm.moz) → review?(dougt)
Assignee: dougt → brendan
the failure is due to a lack of resources, we tend to use oom for that instead
of generic failure.
Comment on attachment 178694 [details] [diff] [review]
proposed fix

Be sure to indent mEventQTable.Put in nsEventQueueService.cpp.	I can't tell if
you missed it.

There is a bit of inconsistency in error checking after MakeNewQueue.  Is it
enough to just check to see if queue is non-null before putting it into the
mEventQTable or do we also have to ensure that MakeNewQueue doesn't return
failure?  Either way, lets make both calls to MakeNewQueue do the same error
check.
timeless: pipe(2) failure is not "out of memory" except in a strained sense.  If
we don't have an EMFILE (see errno(3)) analogue, I'd rather just "fail".

dougt: that patch is a diff -w (I cleaned up whitespace elsewhere in the file),
see the diff options at the top -- hence the appearance of unindented code. 
Everything's lined up right in vi (and no tabs).

In the patch, all cases of MakeNewQueue callers propagate failing rv from that
method.  The differences are due to my attempt to minimize change, and to avoid
duplicating PR_ExitMonitor call where possible (one of the two cases).  If you
think it's important to have the same early-return-or-not control flow in both
cases, I'll have to use an else in the second ("or-not" -- can't early return
there because of a later nsCOMPtr in the enclosing block scope).

/be
Attached patch alterna-patch (obsolete) — Splinter Review
and an else in the second case will overindent a whole bunch of existing code,
but hey, it's a diff -w.  Here you go.

/be
Attachment #178893 - Flags: superreview?(darin)
Attachment #178893 - Flags: review?(dougt)
i think consistency is more important than the early return.  Either check the
out, return, or both.
dougt: then how about the second patch?

/be
Keywords: crash
Comment on attachment 178893 [details] [diff] [review]
alterna-patch

Oh, dougt wanted rv tested in both call sites.	New patch in a trice (doug, I
didn't deCOMtaminate -- I wanted to avoid owning too much code here, and I
wanted MakeNewQueue to be in charge of its own failure code).

/be
Attachment #178893 - Attachment is obsolete: true
Attachment #178893 - Flags: superreview?(darin)
Attachment #178893 - Flags: review?(dougt)
Remember, this is a diff -w.

/be
Attachment #178969 - Flags: superreview?(darin)
Attachment #178969 - Flags: review?(dougt)
Sorry, changed patch filename conventions mid-stream.

/be
Attachment #178694 - Attachment is obsolete: true
Attachment #178969 - Flags: review?(dougt) → review+
Attachment #178969 - Flags: superreview?(darin) → superreview+
Fixed.

/be
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Comment on attachment 178694 [details] [diff] [review]
proposed fix

(Clearing old review flag)
Attachment #178694 - Flags: review?(dougt)
*** Bug 241146 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: