event queue code ignores creation error, crashes on null mEventQueue

RESOLVED FIXED

Status

()

RESOLVED FIXED
14 years ago
7 years ago

People

(Reporter: brendan, Assigned: brendan)

Tracking

({crash})

Trunk
crash
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [greasemonkey])

Attachments

(2 attachments, 2 obsolete attachments)

(Assignee)

Description

14 years ago
I ran into this by installing greasemonkey and surfing around.  Patch with
narrative analysis:

#0  0x0058b7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x007f1734 in raise () from /lib/tls/libpthread.so.0
#2  0x0805556a in nsProfileLock::FatalSignalHandler ()
#3  <signal handler called>

Crashing on null |queue| here:

#4  0xf6f4f9be in PL_IsQueueNative ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libxpcom.so

Called from here with null |mEventQueue|:

#5  0xf6f50c00 in nsEventQueueImpl::NotifyObservers ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libxpcom.so

Called from here:

#6  0xf6f50d79 in nsEventQueueImpl::InitFromPRThread ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libxpcom.so

with aNative true, so PL_CreateNativeEventQueue must have failed.  Looking
at PL_CreateNativeEventQueue, the only way its _pl_CreateEventQueue can fail
other than OOM is if _pl_SetupNativeNotifier fails.  Looking there, I see a
pipe(2) call, which can fail on file table overflow.

Are we (in combination with greasemonkey, but it can't be solely to blame)
leaking fds?

In any event, nsEventQueueImpl::InitFromPRThread is clearly broken -- it fails
to check for PL_Create*EventQueue failure and therefore pass let a null
mEventQueue in |this| flow into NotifyObservers, etc.  It should propagate the
error to its caller with an early return.

#7  0xf6f51c77 in nsEventQueueServiceImpl::MakeNewQueue ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libxpcom.so
#8  0xf6f521a4 in nsEventQueueServiceImpl::PushThreadEventQueue ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libxpcom.so
#9  0xf6657f78 in nsXMLDocument::Load ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libgklayout.so
#10 0xf6f685c9 in XPTC_InvokeByIndex ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libxpcom.so
#11 0xf6c09e2e in XPCWrappedNative::CallMethod ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libxpconnect.so
#12 0xf6c0fb26 in XPC_WN_CallMethod ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libxpconnect.so
#13 0xf6fc97bc in js_Invoke ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libmozjs.so

GreaseMonkey (chrome://greasemonkey/content/greasemonkey.js), trying to load a
default-config.xml user script at the *'d line:

        this.load = function() {
          var doc = document.implementation.createDocument("", "", null);
          doc.async = false;
          try {
            doc.load(getScriptChrome("config.xml"));
          } catch (exc) {
*           doc.load(getScriptChrome("default-config.xml"));
          }
          . . .
        }

#14 0xf6fc0c1a in js_Interpret ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libmozjs.so
#15 0xf6fc9881 in js_Invoke ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libmozjs.so
#16 0xf6c0630a in nsXPCWrappedJSClass::CallMethod ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libxpconnect.so
#17 0xf6c024e6 in nsXPCWrappedJS::CallMethod ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libxpconnect.so
#18 0xf6f690e1 in PrepareAndDispatch ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/libxpcom.so
#19 0xf65784f7 in nsEventListenerManager::HandleEventSubType ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libgklayout.so
#20 0xf6578f69 in nsEventListenerManager::HandleEvent ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libgklayout.so
#21 0xf66ff6fc in nsXULElement::HandleDOMEvent ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libgklayout.so
#22 0xf66ff7dd in nsXULElement::HandleDOMEvent ()
   from /home/brendan/src/firefox1.0.1/mozilla/dist/bin/components/libgklayout.so

Patch next.

/be
(Assignee)

Comment 1

14 years ago
Created attachment 178694 [details] [diff] [review]
proposed fix

Note that if we are leaking fds (possibly due to greasemonkey attempting to
load lots of XML docs), I should have reached a state where I couldn't load
other types of docs.  I didn't see that (but of course, I crashed right after
pipe presumably failed, possibly due to too many open files per process being
reached, so there *could* be a leak).

So I've added a printf to plevent.c right after pipe fails.  I'll report here
if I see that printf.

/be
(Assignee)

Updated

14 years ago
Attachment #178694 - Flags: superreview?(darin)
Attachment #178694 - Flags: review?(danm.moz)
(Assignee)

Comment 2

14 years ago
Comment on attachment 178694 [details] [diff] [review]
proposed fix

This is a 1.0.1 branch patch, but it applies to the trunk too.

/be
(Assignee)

Updated

14 years ago
Whiteboard: [greasemonkey]

Comment 3

14 years ago
this is a subset of the changes from bug 215085. which i was planning on
resurrecting because i recently revisited eventqueue's (there's some live bug
about onfocus=alert which results in us exhausting a system resource and the
eventqueue code crashes because it makes a really bogus assertion about windows
never failing to create windows...).

Comment 4

14 years ago
Comment on attachment 178694 [details] [diff] [review]
proposed fix

looks good to me.  do callers of these functions handle these errors properly?
Attachment #178694 - Flags: superreview?(darin) → superreview+
(Assignee)

Comment 5

14 years ago
Comment on attachment 178694 [details] [diff] [review]
proposed fix

I forgot to patch this hunk, which comes just before the one hunk shown for
nsEventQueue.cpp:

@@ -145,16 +145,18 @@ nsEventQueueImpl::~nsEventQueueImpl()
 NS_IMETHODIMP
 nsEventQueueImpl::Init(PRBool aNative)
 {
   PRThread *thread = PR_GetCurrentThread();
   if (aNative)
     mEventQueue = PL_CreateNativeEventQueue("Thread event queue...", thread);
   else
     mEventQueue = PL_CreateMonitoredEventQueue("Thread event queue...",
thread);
+  if (!mEventQueue)
+    return NS_ERROR_FAILURE;
   NotifyObservers(gActivatedNotification);
   return NS_OK;
 }

 NS_IMETHODIMP
 nsEventQueueImpl::InitFromPRThread(PRThread* thread, PRBool aNative)
 {
   if (thread == NS_CURRENT_THREAD)

darin: the callers can be chased from
http://lxr.mozilla.org/mozilla/search?string=InitFromPRThread -- I looked until
I got sleepy, all was well (if you include the stylized checking done after the
fact at
http://lxr.mozilla.org/mozilla/source/xpfe/appshell/src/nsXULWindow.cpp#1736).

Asking dougt for r=, I don't think danm.moz is doing Mozilla stuff right now
(or ever again -- wahhh).

/be
Attachment #178694 - Flags: review?(danm.moz) → review?(dougt)
(Assignee)

Updated

14 years ago
Assignee: dougt → brendan

Comment 6

14 years ago
the failure is due to a lack of resources, we tend to use oom for that instead
of generic failure.

Comment 7

14 years ago
Comment on attachment 178694 [details] [diff] [review]
proposed fix

Be sure to indent mEventQTable.Put in nsEventQueueService.cpp.	I can't tell if
you missed it.

There is a bit of inconsistency in error checking after MakeNewQueue.  Is it
enough to just check to see if queue is non-null before putting it into the
mEventQTable or do we also have to ensure that MakeNewQueue doesn't return
failure?  Either way, lets make both calls to MakeNewQueue do the same error
check.
(Assignee)

Comment 8

14 years ago
timeless: pipe(2) failure is not "out of memory" except in a strained sense.  If
we don't have an EMFILE (see errno(3)) analogue, I'd rather just "fail".

dougt: that patch is a diff -w (I cleaned up whitespace elsewhere in the file),
see the diff options at the top -- hence the appearance of unindented code. 
Everything's lined up right in vi (and no tabs).

In the patch, all cases of MakeNewQueue callers propagate failing rv from that
method.  The differences are due to my attempt to minimize change, and to avoid
duplicating PR_ExitMonitor call where possible (one of the two cases).  If you
think it's important to have the same early-return-or-not control flow in both
cases, I'll have to use an else in the second ("or-not" -- can't early return
there because of a later nsCOMPtr in the enclosing block scope).

/be
(Assignee)

Comment 9

14 years ago
Created attachment 178893 [details] [diff] [review]
alterna-patch

and an else in the second case will overindent a whole bunch of existing code,
but hey, it's a diff -w.  Here you go.

/be
Attachment #178893 - Flags: superreview?(darin)
Attachment #178893 - Flags: review?(dougt)

Comment 10

14 years ago
i think consistency is more important than the early return.  Either check the
out, return, or both.
(Assignee)

Comment 11

14 years ago
dougt: then how about the second patch?

/be

Updated

14 years ago
Keywords: crash
(Assignee)

Comment 12

14 years ago
Comment on attachment 178893 [details] [diff] [review]
alterna-patch

Oh, dougt wanted rv tested in both call sites.	New patch in a trice (doug, I
didn't deCOMtaminate -- I wanted to avoid owning too much code here, and I
wanted MakeNewQueue to be in charge of its own failure code).

/be
Attachment #178893 - Attachment is obsolete: true
Attachment #178893 - Flags: superreview?(darin)
Attachment #178893 - Flags: review?(dougt)
(Assignee)

Comment 13

14 years ago
Created attachment 178969 [details] [diff] [review]
true alterna-patch

Remember, this is a diff -w.

/be
Attachment #178969 - Flags: superreview?(darin)
Attachment #178969 - Flags: review?(dougt)
(Assignee)

Comment 14

14 years ago
Created attachment 178971 [details] [diff] [review]
real -w patch here

Sorry, changed patch filename conventions mid-stream.

/be

Updated

14 years ago
Attachment #178694 - Attachment is obsolete: true

Updated

14 years ago
Attachment #178969 - Flags: review?(dougt) → review+

Updated

14 years ago
Attachment #178969 - Flags: superreview?(darin) → superreview+
(Assignee)

Comment 15

14 years ago
Fixed.

/be
Status: NEW → RESOLVED
Last Resolved: 14 years ago
Resolution: --- → FIXED

Comment 16

13 years ago
Comment on attachment 178694 [details] [diff] [review]
proposed fix

(Clearing old review flag)
Attachment #178694 - Flags: review?(dougt)
*** Bug 241146 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.