Closed Bug 389931 Opened 17 years ago Closed 16 years ago

Can't use menus in other windows while native filepicker is open

Categories

(Core :: Widget, defect, P1)

defect

Tracking

()

VERIFIED FIXED
mozilla1.9beta4

People

(Reporter: MatsPalmgren_bugz, Assigned: MatsPalmgren_bugz)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

(Keywords: regression)

Attachments

(3 files, 5 obsolete files)

Can't interact with other windows while modal dialog is open.

STEPS TO REPRODUCE
1. start Firefox
2. File->New Window
3. File->Save Page As  (Gnome File chooser dialog appears)
4. click on the other Firefox window (the window is focused)
5. click on its menubar File/Edit/View etc (nothing happens)
6. click on "Getting Started" button in the toolbar (throbber starts,
   but the page is not rendered)
7. type something in the location bar (works)
8. dismiss the dialog

ACTUAL RESULTS
Mouse clicks (5) and page load (6) seems to be "queued" - after closing
the dialog, the menus opens all at once (see screenshot) and the page
is rendered and the throbber stops.
This is the first time I've seen this bug, it's reproducible.

PLATFORMS AND BUILDS TESTED
Bug occurs in Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9a7pre) Gecko/2007072704 Minefield/3.0a7pre
Bug does not occur in Firefox 2.0.0.5
Flags: blocking1.9?
Attached image Screenshot #1
Can you narrow down a regression range?  Is this Firefox-specific?  Or specific to the GTK filepicker?  This worksforme with a SeaMonkey from 2 days ago (though using the XUL filepicker).
The bug also occurs in SeaMonkey:
Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9a7pre) Gecko/200707240202 SeaMonkey/2.0a1pre

The bug does not occur with the XUL filepicker, both in Firefox and SeaMonkey.

I noticed a difference with regards to modality between GTK/XUL filepickers:
while the GTK dialog is open, clicking on the window from which it was opened
does not transfer keyboard focus to that window.  Using the XUL one I can
click on the window from which it was opened and the window manager focus
it and I can type for example CTRL+R to reload the page.
I can also use keyboard commands to load new pages and when I type
CTRL+S a second filepicker is opened.
This seem like a bug, even the XUL filepicker should be window modal, right?
I can verify this. Filepicker dialog must be opened in the second window, then the menubar in the first one doesn't work.
I backed out bug 386802 locally, but that didn't fix this bug.
> This seem like a bug, even the XUL filepicker should be window modal, right?

See bug 99728.

Note that the gtk filepicker uses gtk_dialog_run, which blocks GTK events on thew parent if I read http://developer.gimp.org/api/2.0/gtk/GtkDialog.html#gtk-dialog-run right.  That really should make it app-modal; I'm a little surprised it wasn't before.

Out of curiousity, is opening the filepicker from the menu required to reproduce?  Or is using Ctrl-S enough?  If the menu is required, that points to bug 279703...
Summary: Can't interact with other windows while modal dialog is open → Can't interact with other windows while GTK filepicker is open
CTRL+S is enough.  FWIW, I also see that menus flickers more starting
in build 2007070504.  Clicking on File and then moving the mouse to
the right over the menubar, for example.
It's a regression from bug 279703.  I did a clean MOZ_CO_DATE=2007-07-03
build (which works) and then applied the patches in bug 279703 and
that makes the bug occur.  Let me know if you want me to do any
experiments with this tree before I remove it...
Blocks: 279703
Another observation: clicking on the X in the upper right corner to close
the first window works (the window is closed).  So, if I follow steps 1-7
above and then click the X to close the window, then dismiss the dialog,
that triggers a bunch of these assertions:
###!!! ASSERTION: failed to get the nsIScriptGlobalObject. bug 95465?: 'boundGlobal', file nsXBLPrototypeHandler.cpp, line 462
(presumably one per event that was queued up)
> FWIW, I also see that menus flickers more

Please file a separate bug on that.

Just to make sure we're on the same page... you describe two issues in comment 0: the menu not opening and the load not happening.  Are both regressions?  I would think that the former is and the latter was a problem all along...
(In reply to comment #10)
Filed bug 389996 on the menu flickering.

You're right, menus not opening is a regression of bug 279703, but
load not happening is an earlier regression: it works in Firefox 2
but not in 2007070404, I've filed bug 389994 on it.
Summary: Can't interact with other windows while GTK filepicker is open → Can't use menus in other windows while GTK filepicker is open
Attached patch fix? (obsolete) — Splinter Review
The problem is that when we get a NativeEventCallback() while
DoProcessNextNativeEvent() is on the call stack, we ignore it:
http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/widget/src/xpwidgets/nsBaseAppShell.cpp&rev=1.7&root=/cvsroot&mark=96,105,136,138#82

This patch fixes the bug for GTK2 and it also fixes bug 389994!

The odd thing is that this method looks exactly the same before the menu
changes, so I don't understand how it worked before that.
It's the same issue.  Menus now open off an event, so if event processing is blocked (which is what it sounds like gtk_dialog_run does), you don't get menus opening.  And since networking uses events, no networking happens while the dialog is up.
Depends on: 389994
Er, I guess there's a patch here.
Blocks: 389994
Component: General → Widget: Gtk
No longer depends on: 389994
QA Contact: general → gtk
This bug also occurs on Windows.
FWIW, enabling the "#if 0"-ed block here:
http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/widget/src/xpwidgets/nsBaseAppShell.cpp&rev=1.7&root=/cvsroot&mark=97-104#82
fixes this bug.  (It also fixes bug 389994 for the most part but not
completely, see bug 389994 comment 1.)
Component: Widget: Gtk → Widget
QA Contact: gtk → general
Summary: Can't use menus in other windows while GTK filepicker is open → Can't use menus in other windows while native filepicker is open
Darin isn't really active anymore... you probably want to ask the relevant reviewers from bug 326273 to look at this.
(In reply to comment #15)
> This bug also occurs on Windows.
> FWIW, enabling the "#if 0"-ed block here:
> http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/widget/src/xpwidgets/nsBaseAppShell.cpp&rev=1.7&root=/cvsroot&mark=97-104#82
> fixes this bug.  (It also fixes bug 389994 for the most part but not
> completely, see bug 389994 comment 1.)
> 

Yeah, removing that #if 0 fixes a pile of other bugs (such as 337761) too, but it causes a large impact on startup time.
Can we figure out why it's impacting startup?  That is, under what conditions we hit this code?  Perhaps we should only do that block in some of those cases but not others?
Flags: blocking1.9? → blocking1.9-
Whiteboard: [wanted-1.9]
I think we need to fix that #if 0, since it will probably fix bug 338401 which is a blocker.
Assignee: nobody → mats.palmgren
No longer blocks: 338401
Flags: blocking1.9- → blocking1.9?
Blocks: 338401
+'ing w/ P3.
Flags: blocking1.9? → blocking1.9+
Priority: -- → P3
Is this bug being used as the one to track for the events not happening in certain cases (regression from thread manager)?
Blocks: 337761
Blocks: 405198
Damon, this bug was brought up by QA in the Firefox/Gecko meeting today, as it is blocking major functionality (bug 405198). Can you bump priority?
OS: Linux → All
Hardware: PC → All
> reed: dietrich: as per the firefox/gecko meeting, you can just change the priority and add a comment

bumping to P1, given the number of dependent bugs and their severity.
Priority: P3 → P1
Flags: wanted1.9+
Whiteboard: [wanted-1.9]
Mats, is this going to make it before the beta 3 freeze on the 29th?
Would this be a duplicate of bug 257121 ?
Moving TM to Beta4
Target Milestone: --- → mozilla1.9beta4
Mats you still on this?
Attached patch Patch, rev. 2 (obsolete) — Splinter Review
This was a lot harder than I expected.  Changing this code to make it work
on one platform frequently regressed something on another platform... :-(

This patch seems to work on Windows XP, Linux and MacOSX.  I've tested it
on several of the bugs blocking bug 338225, on all three platforms, and it
fixed the ones I tested.  Help with further testing would be appreciated...
Attachment #274321 - Attachment is obsolete: true
Blocks: 338225
I can confirm that the new patch fixes bugs 203573, 337761, 338401 and 343729 on Windows and Mac.

Blocks: 203573
Also fixed are bugs 359515, 359566, 402219 and presumably the pseudo-duplicates of these bugs as well.
Mats, your patch works like a charm. Thanks!
Only thing I noticed is that the drag cursor seems to flicker between two states while dragging something on a page with marquees (like https://bugzilla.mozilla.org/attachment.cgi?id=165640 for example)
This is on windowsXP. This is a minor issue, though.
Hrm, with this patch included, my debug build also seems to chew most of the cpu power. I didn't see that without this patch, I think.
(In reply to comment #31)
> Only thing I noticed is that the drag cursor seems to flicker between two
> states while dragging something on a page with marquees

Ok, I see that too on Windows, but not Linux or MacOSX.  It's probably
a separate issue.  I also saw another DnD bug that only occurs on Windows:
when dragging a bookmark and hovering it over a closed folder it opens
them correctly but if I cancel the operation it didn't auto-close them.

(In reply to comment #32)
> Hrm, with this patch included, my debug build also seems to chew most
> of the cpu power.

Good catch, I didn't really notice I pegged one of my cores. ;-)
Attached patch Patch rev. 3 (obsolete) — Splinter Review
This fixes the performance problem for me.
Attachment #303246 - Attachment is obsolete: true
Thanks Mats for this important fix, i'm starting testing that on Vista
Ok, with the latest patch, the cpu peg seems to be gone.
I noticed that right-clicking on a flash plugin, the flash animation now continues, instead that it freezes. That's probably an improvement, I guess.
Except, it makes it sometimes difficult to click on one of the context menu-items while the animation is playing.
For example: http://www.mooves.nl/ , choose "animatie" and choose one of the clips, then right-click on one of the clips and zoom in/zoom out. Sometimes, it doesn't respond to that. But I guess it's probably not related to the patch (or only sideways).
Blocks: 404694
Blocks: 417750
With this patch I see a regression in drag&drop within the bookmarks menu under windows.  However, I am not sure that is because of anything this patch is doing wrong.  Without this patch, under windows I am able to click on Bookmarks in the main browser window and drag and drop a single item to a new location at which point the menu closes.  Upon reopening the menu the item is at the new location.  Under Linux without this patch this really does not work.  While I am dragging the item the menu dismisses itself.

With this patch the Linux behavior is unchanged.  Under windows, with this patch, the windows behavior regresses to how the behavior has always been under Linux.

SO,k I really don't know if this patch broke something that should work, or if there was some kind of hack in pace to make this work better that really only ever worked for windows and no longer works now that this fix is in place.
It sounds to me like being able to drag+drop within the bookmarks menu under Windows was just a bug this patch is fixing. A real fix to that issue would fix both Windows and Linux. 
(In reply to comment #38)
> It sounds to me like being able to drag+drop within the bookmarks menu under
> Windows was just a bug this patch is fixing. A real fix to that issue would fix
> both Windows and Linux. 
> 

I would think so too.  Especially since even the pre-patch windows behavior does not seem correct.  I would expect the browser menu to remain open after a drag and drop of an item within the menu from one place to another, and be dismissed only when you either open an item by clicking on it or click outside the menu.  This makes sense to me and is also the Windows behavior under Firefox 2.

Oddly, there did not seem to be any drag and drop functionality within this menu under Linux Firefox 2.
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.13) Gecko/20060414] (release) (W2Ksp4)
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.1.12) Gecko/20080201 SeaMonkey/1.1.8] (release) (W2Ksp4)
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9a5pre) Gecko/20070515 SeaMonkey/1.5a] (nightly) (W2Ksp4)
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9b4pre) Gecko/2008021702 SeaMonkey/2.0a1pre] (nightly) (W2Ksp4)

(In reply to comment #38)
> It sounds to me like being able to drag+drop within the bookmarks menu under
> Windows was just a bug this patch is fixing.

It may be. Yet I always considered it a feature.
While it's unusual to modify a "menu" in this way, it doesn't bother me as I'm used to this in the Personal Toolbar.

At least, if it's a bug, it's a longstanding one, not a ThreadManager regression.

(In reply to comment #39)
> I would think so too.  Especially since even the pre-patch windows behavior
> does not seem correct.  I would expect the browser menu to remain open after a

(You already said so in comment 37.)

> drag and drop of an item within the menu from one place to another, and be
> dismissed only when you either open an item by clicking on it or click outside
> the menu.  This makes sense to me and is also the Windows behavior under
> Firefox 2.

Yes, on Windows, MozillaAS v1.7.13 to SeaMonkey v2.0a1pre are working this "correct/legacy" way...

[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9b4pre) Gecko/2008021404 Minefield/3.0b4pre] (nightly) (W2Ksp4)

I can't test this: I get an exception and the bookmark is not moved.

Based on what you say, I would consider this a Firefox "regression", but I'm no (FF) expert.
(In reply to comment #36)
> I noticed that right-clicking on a flash plugin, the flash animation now
> continues, instead that it freezes. That's probably an improvement, I guess.

I think it really is the plugin's job to stop the animation while the
context menu is posted if that is the desired behaviour.
It's a minor issue, but I would prefer if it were not random.

> right-click on one of the clips and zoom in/zoom out. Sometimes, it
> doesn't respond to that. 

Yes, clicking in the Flash context menu sometimes doesn't perform the
command, it's a regression from the patch.
Highlighting the items in the menu always seems to work though.

(In reply to comment #37)
> Without this patch, under windows I am able to click on Bookmarks
> in the main browser window and drag and drop a single item to a new
> location at which point the menu closes.

I think it's a bug that the menu closes (Firefox 2 doesn't).

> Under Linux without this patch this really does not work.  While I
> am dragging the item the menu dismisses itself.

Well, it doesn't work the first time.  It works fine the second time and
the menu even stays up.  The patch makes Windows/Linux behave the same.
I think it's an improvement and I'll file a separate bug to make it work
the first time and other DnD problems I've discovered along the way.

Thanks for all the feedback!
Attached patch Patch rev. 4 (obsolete) — Splinter Review
This patch should improve the click problem in the Flash context menu
slightly (note that this is only a problem on Windows).
There are builds available for testing here:
https://build.mozilla.org/tryserver-builds/2008-02-18_14:18-mpalmgren@mozilla.com-1203373028/

I've tested with the latest Flash plugin (r115) on Tier-1 platforms
and also playing some Java games without seeing any problems except the
one noted above.
If someone can test with an older Flash plugin and other plugins
that would be appreciated.  https://pfs.mozilla.org/plugins/
There are also Patch rev.3 builds available if you want to compare:
https://build.mozilla.org/tryserver-builds/2008-02-17_07:28-mpalmgren@mozilla.com-1203262053/
(In reply to comment #37)
> With this patch I see a regression in drag&drop within the bookmarks menu under
> windows.  However, I am not sure that is because of anything this patch is
> doing wrong.  Without this patch, under windows I am able to click on Bookmarks
> in the main browser window and drag and drop a single item to a new location at
> which point the menu closes.  Upon reopening the menu the item is at the new
> location.  Under Linux without this patch this really does not work.  While I
> am dragging the item the menu dismisses itself.
> 
> With this patch the Linux behavior is unchanged.  Under windows, with this
> patch, the windows behavior regresses to how the behavior has always been under
> Linux.
> 
> SO,k I really don't know if this patch broke something that should work, or if
> there was some kind of hack in pace to make this work better that really only
> ever worked for windows and no longer works now that this fix is in place.
> 

If you're trying to describe a bug where the menu closes after a drag and drop, that is bug 395176. Otherwise, I can't read from this comment what you think is a bug.
(In reply to comment #44)
> If you're trying to describe a bug where the menu closes after a drag and drop,
> that is bug 395176. Otherwise, I can't read from this comment what you think is
> a bug.
> 
I see that issue without this patch installed.  With this patch insatlled, the menu closes about 2 seconds after you begin a drag operation within the menu.
bookmarks menu onDragExit code (in browser_places.js) is bogus, i don't think this patch is involved at all, removing the timer code from onDragExit mostly fixes the problem, so we should first fix this, then going to check that code (i've taken a look there yesterday while fixing the dropmarker code)
(In reply to comment #41)
> > Under Linux without this patch this really does not work.  While I
> > am dragging the item the menu dismisses itself.
> 
> Well, it doesn't work the first time.  It works fine the second time and
> the menu even stays up.  The patch makes Windows/Linux behave the same.
Actually, the first time Linux issue is worse than that.  Please see bug 418156.
Mats, we need to land this ASAP if we're going to take it at all (and I think we should!). Is it OK for me and bsmedberg to review now?
Sure, I don't plan any further changes unless testing reveals new problems.
I think we can take the minor regression with the Flash context menu, given
the benefits.  I agree it's a risky change that should land soon.
I think rev.4 is the better of the two, but I can live with rev.3 too,
if for example rev.4 has perf regressions and rev.3 doesn't.
(I worry about the comment Darin made about Ts/Txul).
+  if (NS_UNLIKELY(!mInitialized)) // GetCurrentThread during/after Shutdown
+    return nsnull;

Why? We still process events after mInitialized is set to false. Returning null from GetCurrentThread during all that processing seems wrong.

+      mayWait = PR_FALSE;

So we don't actually want to process the message that just arrived? Why not?

Might want to use a local variable to hold PR_IntervalNow so we don't have to call it twice. It's quite expensive in some systems (especially VMs).

+      PR_IntervalNow() - mLastGeckoEventTime < 2 * THREAD_EVENT_STARVATION_LIMIT) {

Why 2* ?
(In reply to comment #46)
> bookmarks menu onDragExit code (in browser_places.js) is bogus, i don't think
> this patch is involved at all, removing the timer code from onDragExit mostly
> fixes the problem, so we should first fix this, then going to check that code
> (i've taken a look there yesterday while fixing the dropmarker code)
> 
I have a big patch I am testing now that i plan to attach to bug 418156 that removes a lot of bogus code from browser-places.js.  So, unless you have something already working.  I can do that part leaving you to concentrate on the dropmarker issue.
comments about bookmarks menu closing should go in bug 418156. Please, don't spam about bookmarks menu in this report since menu closing problems are not related and Mats has to do his work here.
(In reply to comment #42)
> Created an attachment (id=304209) [details]
This patch no longer applies cleanly. It conflicts with the code checked in for bug 380454.
(In reply to comment #53)
> (In reply to comment #42)
> > Created an attachment (id=304209) [details] [details]
> This patch no longer applies cleanly. It conflicts with the code checked in for
> bug 380454.
> 

Sorry was wrong there.  It is the backout of bug 380454 that caused the issue.
(In reply to comment #54)
> (In reply to comment #53)
> > (In reply to comment #42)
> > > Created an attachment (id=304209) [details] [details] [details]
> > This patch no longer applies cleanly. It conflicts with the code checked in for
> > bug 380454.
> > 
> 
> Sorry was wrong there.  It is the backout of bug 380454 that caused the issue.
> 
Evidently I was right the first time.  The patch for bug 380454 is back in the tree and that is what causes the conflict.
(In reply to comment #50)
> +  if (NS_UNLIKELY(!mInitialized)) // GetCurrentThread during/after Shutdown
> +    return nsnull;
> 
> Why? We still process events after mInitialized is set to false.

Calling NS_GetCurrentThread() leads to:
http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/xpcom/threads/nsThreadManager.cpp&rev=3.5&root=/cvsroot&mark=205#195
creating a new nsThread object which tries to register itself:
http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/xpcom/threads/nsThread.cpp&rev=1.71&root=/cvsroot&mark=349#342
but 'mLock' is null since nsThreadManager is shutdown:
http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/xpcom/threads/nsThreadManager.cpp&rev=3.5&root=/cvsroot&mark=174#169
(I have a stack if you want the details. Olli fixed this in bug 380454.)
Attached patch Patch rev. 5 (obsolete) — Splinter Review
(In reply to comment #50)
> +      mayWait = PR_FALSE;
> 
> So we don't actually want to process the message that just arrived?
> Why not?

It seemed to fix a "hang" I saw on Windows in the following case:
Run "firefox -profilemanager" then hit Enter real fast as soon
as the PM window opens.  This caused a long delay before the main
window opened.  Now that I tested it again it seems this is an
existing bug, but I think the patch made it worse (at least some
versions I tested, so I convinced myself it was my regression).
To be honest, I wasn't sure exactly why the above line fixed it,
so I had to dig a bit...

What happens is that the (modal) PM window exits (ExitModal) in a nested
event loop, but the loop blocking with 'mayWait' == 1 at the end of
OnProcessNextEvent() needs to be fed native events to unblock...
http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/widget/src/xpwidgets/nsBaseAppShell.cpp&rev=1.7&root=/cvsroot&mark=240-249#213

Leaving the event (more importantly returning false) unblocked the loop,
we inserted a dummy thread event, then the event loop in ShowModal() saw
'mContinueModalLoop' was false...

I tried a few solutions to that problem, the first one was to simply
unblock the loop if more than 200ms had passed since we started the loop.
This worked well, we rarely hit it and only if the browser is idle so
adding a dummy event wont matter in this case.

Another solution is to make the inner nested event loops set 'mayWait'
to false.  I think I like this approach better.
Builds for Patch rev. 5 are available for testing:
https://build.mozilla.org/tryserver-builds/2008-02-23_07:28-mpalmgren@mozilla.com-1203780477/
Attachment #303446 - Attachment is obsolete: true
Attachment #304209 - Attachment is obsolete: true
Attachment #305231 - Flags: review?(roc)
Attachment #304209 - Flags: review?(roc)
Attachment #304209 - Flags: review?(benjamin)
Attachment #305231 - Flags: review?(benjamin)
BTW, there is still a remaining problem om MacOSX which I'm not addressing.
I filed bug 419199 on it.
(In reply to comment #57)
> It seemed to fix a "hang" I saw on Windows in the following case:
> Run "firefox -profilemanager" then hit Enter real fast as soon
> as the PM window opens.  This caused a long delay before the main
> window opened.  Now that I tested it again it seems this is an
> existing bug, but I think the patch made it worse (at least some
> versions I tested, so I convinced myself it was my regression).

Bug 378358 !!?
+// The mEventloopNestingState values
+#define NS_EVENTLOOP_RUN    0
+#define NS_EVENTLOOP_NATIVE 1
+#define NS_EVENTLOOP_NESTED 2

You should document the meaning of these carefully... "NESTED" doesn't seem like a good name, since "NATIVE" can be nested too, inside NESTED even.

+  if (mEventloopNestingState == NS_EVENTLOOP_NATIVE) {
+    mEventloopNestingState = NS_EVENTLOOP_NESTED;
     return;
   }

Actually I don't understand how this works. We set the state to NESTED and then just return, so does this mean we need another XPCOM event to be posted, to trigger another native event so that NativeEventCallback will get called again?

It seems what we're really doing is tracking what kind of native event loop we're in: either our own ProcessNextNativeEvent, or some internal OS one that we entered somehow. So NS_EVENTLOOP_RUN means we're not in one (top level thread execution), NS_EVENTLOOP_NATIVE means the innermost native event loop belongs to ProcessNextNativeEvent, and NS_EVENTLOOP_NESTED means the innermost native event loop belongs to someone else. Right?

In that case I'd name them something like NS_NATIVE_EVENT_LOOP_NONE, NS_NATIVE_EVENT_LOOP_XPCOM, NS_NATIVE_EVENT_LOOP_OTHER.

Then it seems to me that the best way to set NS_NATIVE_EVENT_LOOP_OTHER would be to set it every time we dispatch an XPCOM event (and clear it afterwards). If a native event loop is triggered inside that XPCOM event, it will have the right status. If it's not triggered, we won't look at the status. How does that sound?

mBlockWait needs documentation too although the approach there seems fine.
(In reply to comment #56)
> (In reply to comment #50)
> > +  if (NS_UNLIKELY(!mInitialized)) // GetCurrentThread during/after Shutdown
> > +    return nsnull;
> > 
> > Why? We still process events after mInitialized is set to false.
> 
> (I have a stack if you want the details. Olli fixed this in bug 380454.)

Yeah, I misread the code, sorry.
Blocks: 331088
Comment on attachment 305231 [details] [diff] [review]
Patch rev. 5

I don't think I'm a good reviewer for this, because I don't have any familiarity with our native-eventloop code. roc should give r+sr or somebody else (Neil?) should give the sr.

As a drive-by nit, however, why is mEvenloopNestingState not an enum, instead of a set of defines?
Attachment #305231 - Flags: review?(benjamin)
(In reply to comment #60)

I'll change the names and document them as requested.
(I'll use your state names below)

> ... so does this mean we need another XPCOM event to be posted, to
> trigger another native event so that NativeEventCallback will get
> called again?

Yes.  FWIW, I've not seen this happen during testing, possibly because
launching a dialog always triggers some XPCOM events which will schedule
the native events we need.

If we want to guarantee there is always a native event we would need
to add a timer for that purpose I think...  I'll try that, arming it
before calling ProcessNextNativeEvent and Cancel it after.
In practice it would rarely fire I think.
(I'm testing it right now on Linux and haven't been able to fire it yet)

> It seems what we're really doing is tracking what kind of native event loop
> we're in: either our own ProcessNextNativeEvent, or some internal OS one...

Correct.

> Then it seems to me that the best way to set NS_NATIVE_EVENT_LOOP_OTHER would
> be to set it every time we dispatch an XPCOM event (and clear it afterwards).

Not sure where you mean... doesn't that mean NativeEventCallback will always
see OTHER as the state?


(In reply to comment #62)
> why is mEvenloopNestingState not an enum

Yeah, I realized that after posting the patch... old habit of trying
to save a few bytes... except this is a singleton, doh!
How about setting the state to NS_NATIVE_EVENT_LOOP_OTHER just before we return from nsBaseAppShell::OnProcessNextEvent, and then reset it to NS_NATIVE_EVENT_LOOP_XPCOM in nsBaseAppShell::AfterProcessNextEvent? So basically it's only set to OTHER during the call to event->Run().
I still wonder about the following situation:
-- Multiple XPCOM events queued
-- First XPCOM event spins up an inner native event loop
-- No native events received and no XPCOM events posted ... so the queued events are not processed

Maybe that's so unlikely to happen that we don't care right now.
Attached file Gtk print dialog stack
(In reply to comment #65)
> How about setting the state to NS_NATIVE_EVENT_LOOP_OTHER just before we
> return from nsBaseAppShell::OnProcessNextEvent

Hmm, that sounds too late to me.  Wouldn't NativeEventCallback() from a
nested event loop always see XPCOM state in that case?  Here's the stack
for the Gtk print dialog on Linux -- it's the DoProcessNextNativeEvent()
call we need to watch out for I think...

(In reply to comment #66)
With XPCOM events queued that is unlikely, but as you pointed out there
is a tiny risk the scheduled NativeEventCallback has been consumed before
entering the inner loop (in theory).
Setting up a timer around ProcessNextNativeEvent will fix that.
Oh yeah, I forgot about Gecko code that's invoked via native events. Silly me.

Ho hum, this is ugly. How about we go ahead with your patch, without the timer? It'll improve things a lot, should handle 99% of cases with less complexity and lower risk of perf regressions.
Attached patch Patch rev. 6Splinter Review
Sure, we can add the timer after beta4.
Changes since rev.5: enum + name changes + documentation.
Attachment #305231 - Attachment is obsolete: true
Attachment #305677 - Flags: superreview?(roc)
Attachment #305677 - Flags: review?(roc)
Attachment #305231 - Flags: review?(roc)
Comment on attachment 305677 [details] [diff] [review]
Patch rev. 6

Make it so!

+    eEventloopNone,  // top level thread execution
+    eEventloopXPCOM, // innermost event loop is ProcessNextNativeEvent
+    eEventloopOther  // innermost event loop is a native library/plugin etc

Make it clear we're talking about *native* event loop here.
Attachment #305677 - Flags: superreview?(roc)
Attachment #305677 - Flags: superreview+
Attachment #305677 - Flags: review?(roc)
Attachment #305677 - Flags: review+
Given the deepness of this patch, the sooner you can land, the better :-). E.g. now would be good :-)
Thanks for wrapping this up, Mats.  :)
mozilla/widget/src/xpwidgets/nsBaseAppShell.cpp 	1.8
mozilla/widget/src/xpwidgets/nsBaseAppShell.h 	1.8 

It passed Tinderbox regression tests, now watching perf...

I filed bug 419630 on adding the timer (for separate tracking).

-> FIXED
Status: NEW → RESOLVED
Closed: 16 years ago
Depends on: 419630
Flags: in-litmus?
Resolution: --- → FIXED
Looks like there was no perf impact. Congratulations!!!
Depends on: 419740
Blocks: 389105
Depends on: 420148
Definitely verified in the latest trunk build!
The issue I mentioned in comment 36, I filed as bug 420148.
Status: RESOLVED → VERIFIED
Especially on OS X I can open the menu but none of the entries are working as long as the native dialog is open. They are working again after the dialog is closed. Actions done in between aren't queued anymore. Should this be filed as a new bug? I cannot find an existing one.
Please file regressions separately and CC me. Thanks.
Depends on: 420315
I filed bug 420315 for the issue mentioned in comment 31.
Blocks: 240709
No longer blocks: 389615
Blocks: 389615
Depends on: 420341
Depends on: 420886
Depends on: 421083
Depends on: 421214
Depends on: 421486
No longer depends on: 421486
No longer blocks: 381255
Depends on: 426324
Depends on: 429304
Blocks: 386099
Is this bug required by bug 331088? nominating for branch while we check.
Flags: blocking1.9.0.6?
Flags: blocking1.9.0.6?
Any chance this might be a potential cause of the delayed plugin tear down crash in bug 470487? Plugin tear down relies on the results of this event nesting tracking.
Depends on: 575515
You need to log in before you can comment on or make changes to this bug.