Closed Bug 53989 Opened 24 years ago Closed 24 years ago

XIM: over-the-spot hangs with VJE3.0 and ATOK X

Categories

(Core :: Internationalization, defect, P3)

x86
Linux
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: masaki.katakai, Assigned: masaki.katakai)

References

Details

(Keywords: hang, inputmethod, intl, Whiteboard: [rtm-])

Attachments

(8 files)

Japanese user reported this problem as

 http://bugzilla.mozilla.gr.jp/show_bug.cgi?id=285

When over-the-spot style, Mozill hangs when

 - click hyper link

or

 - start Composer

It seems that focused window is destroyed and
it causes this problem, some japanese IMEs,
VJE3.0 and ATOK X for Linux do not handle this case
properly and finally Mozilla hangs.

I made a patch already and the japanese user had verified.
I'll ask tajima@eng.sun.com to review the patch,
then post the patch.
Status: NEW → ASSIGNED
The patch was sent to a module owner for code review. Teruko-san,
please update the status whiteboard accordingly for both this
bug and 53990. Thanks.
Though I could not reproduce the problem on HPUX, neither could I see any problem after
I apply the patch. I have no objection to the patch. 
Thanks Shanjian for testing on HP-UX. Can I send
a request for check-in approval with r=shanjian, then?

r=somebody@netscape.com has been wanted.
the code looks ok to me, although I don't have any good way of testing it.
sr=blizzard pending pav looking at it
chris, pavlov already looked at it and he
seemed to give r= with his 10/02 comment.
Keywords: review
No wonder but it has been pending for code review and approval
more than two weeks.

Keywords: approval
pdt, pleaes approve this as RTM++
Keywords: rtm
Whiteboard: [rtm+]
Keywords: crash
Need a real review, a real super review, a description of why this matters for 
the Netscape 6 release, and some information about testing done on the patch.   
 
Whiteboard: [rtm+] → [rtm need info]
shanjian and erik - can you review the code again.
tajima- can you explain why this is important ? what will happen if we don't fix 
it ? what will happen if something wrong in the patch ?
How many people have test it ? Any people from Japan have test the patch ?
In the patch, there is a line like this:

  if(oldw != mBounds.width && oldh != mBounds.height) {

Should it be as follows:

  if ((oldw != mBounds.width) || (oldh != mBounds.height)) {

Also, oldy used to be set to spot.y, which had mCursorPosition.height added to
it, but now it doesn't have the height added. Is this needed?

Has this fix been tested with popular input methods on popular OS's in Japan?
What about Korean and Chinese?
Many users using VJE for Japanese input would hate mozilla without fixing the
problem, at lease 5 people(shanjian, Katakai, 2 japanese users, and myself) has
tested the patch, specifically I tested English/French/Germany locales, too.
HP-UX, Solaris, Redhat6.1/6.2 and FreeBSD are covered.

Erik, the if condition should be corrected as you pointed,
thanks.

As to oldy, now it has previous mCursorPosition.y, and is compared
to new mCursorPosition.y. This logic is correct. But, before the patch,
it has previous mCursorPosition.y + mCursorPosition.height and 
is compared to new mCustorPosition.y. That was wrong, and is corrected.

mark it as rtm+ again. pdt, does tajima's answer ok for you. Call him at 
650-786-9182 if you still have question
Whiteboard: [rtm need info] → [rtm+]
Is erik giving his a= ? Can anyone give a good r= ? Frankly, neither blizzard's
comments nor pav's make me feel confident in this patch to take it on the N6
branch. Marking [rtm need info]
Whiteboard: [rtm+] → [rtm need info]
Let's see a new patch (diff -u format please) with the fix to that logic bug 
Erik found.

/be
pav and blizzard should be able to review this code to make sure it does
not break the non-IME code paths.

erik should be able to review for this including the IME impact.

Masaki (katakai), Will you please review this too?
+    // In over-the-spot style, pre-edit can not be drawn properly when
+    // IMESetFocusWidget() is called at height=1 and width=1

Who sets height and width to 1? Where is that code?

+    // After resizing, we need to call SetPreeditArea() again
+    if(oldw != mBounds.width && oldh != mBounds.height) {
+      GdkWindow *gdkWindow = (GdkWindow*)this->GetNativeData(NS_NATIVE_WINDOW);
+      if (mXIC && gdkWindow) {
+        mXIC->SetPreeditArea(0, 0,
+           (int)((GdkWindowPrivate*)gdkWindow)->width,
+           (int)((GdkWindowPrivate*)gdkWindow)->height);
+      }
+    }
+    oldw = mBounds.width;
+    oldh = mBounds.height;

The 2 statements above should be inside the "if" statement to avoid setting them
redundantly.

+    oldx = compEvent.theReply.mCursorPosition.x;
+    oldy = compEvent.theReply.mCursorPosition.y;

Here again, these 2 should be inside the "if".

!     gPreeditFontset = gdk_fontset_load("-*-*-*-r-*-*-16-*-*-*-*-*-*-*");

Is this change needed to solve the hanging problem? Please limit the patch to
the bare minimum.

  nsWidget::IMEDestroyIC()
  {
    if (!mXIC) return;
+
    if (mIsToplevel == PR_TRUE) {
      delete mXIC;
+   } else {
+     nsWidget *widget = mXIC->GetFocusWidget();
+     if (widget && widget == this && mIMEShellWidget) {
+       mXIC->SetFocusWidget(mIMEShellWidget);
+       mXIC->UnsetFocusWidget();
+     }
    }
    mXIC = 0;
  }

What is this patch doing? Why is this necessary? Is there a bug in VJE? Is this
a workaround for a bug in VJE? Do we need the resizing part as well as this
part? Or is this part enough to solve the hanging problem? What is the
mIMEShellWidget?
The width and height are set to 1 when the widget is set to size 0x0.  That's
illegal in X so we set it to 1x1.
Erik and Chris, thank you very much for reviewing and comments.
Yes, the patch contains changes for 53990. I've attached patch
only for this problem. Only IMEDestroyIC() part is necessary.

I'd like to explain how this problem happens,
when a hyper link is clicked,

(1) nsWidget::IMESetFocusWidget() is invoked for setting focus
Mozilla sends focus_window (the widget) to IME server
IME server creates pre-edit window as a child of the focus_window

(2) the widget (contains the hyper link) will be destroyed
but, IME server keeps the focus_window of (1)

(3) new widget will be created for new page
nsWidget::IMESetFocusWidget() is invoked for setting focus
In ATOKX case, IME server does XReparentWindow() of pre-edit
window for previous focus_window to new focus_window,
however, the previous focus_window is invalid, already
destroyed in (2) and it causes X error.

I think problems exist in IME server, VLE and ATOKX. IME server
may be able to check the focus_window is valid or not before
calling XReparentWindow(). I understand kinput2 does the check.
However, we should put a workaround into Mozilla because VJE
and ATOKX are already shipped and used, and it would be difficult
to ask IME vendor to fix this problem.

mIMEShellWidget is a shell widget of Mozilla window. IC is shared
among widgets on the shell widget.

nsWidget::IMEDestroyIC()
{
  if (!mXIC) return;
  if (mIsToplevel == PR_TRUE) {
    delete mXIC;
  } else {
    nsWidget *widget = mXIC->GetFocusWidget();
    // when focused widget == this widget
    // and shell widget exists
    if (widget && widget == this && mIMEShellWidget) {
      // set focus window is the shell widget
      // to clear the previous focus_window
      mXIC->SetFocusWidget(mIMEShellWidget);
      // call gdk_im_end()
      mXIC->UnsetFocusWidget();
    }
  }
  mXIC = 0;
}

Masaki, thanks for the explanation. I still have questions.

> when a hyper link is clicked,
>
> (1) nsWidget::IMESetFocusWidget() is invoked for setting focus
> Mozilla sends focus_window (the widget) to IME server
> IME server creates pre-edit window as a child of the focus_window

Which hyperlink are you clicking? Any hyperlink?

Which widget is being focussed? A text field? Which one? Or a shell?

In the fix, why are you setting the focus to the shell widget and then unsetting
the focus? Would it work if you only unset the focus (without setting focus to
the shell)? Or do you have to do both of those (set focus to shell and unset
focus)?

How popular do you believe VJE and ATOK are on Unix? What percentage of users
are using them? (Please give us a rough estimate.)
Hi Erik,

> Which hyperlink are you clicking? Any hyperlink?

Yes, Any hyperlink.

> Which widget is being focussed? A text field? Which one? Or a shell?

Widget which has a hyperlink. Not only text field.
Currently widget that may take an input focus
will have an IC. The IC is shared in shell widget.

> Would it work if you only unset the focus (without setting focus to
the shell)? Or do you have to do both of those (set focus to shell and
unsetfocus)?

good question. I missed to mention about this. I was also thinking
this problem could be solved by only unsetfocus. However, it couldn't.
It seems that IM server still keep the previous focus_window even
when we call unsetfocus.

> How popular do you believe VJE and ATOK are on Unix? What percentage of users
are using them? (Please give us a rough estimate.)

I guess about 5%. kinput2 based IME engines are free so bundled
in Linux distribution. VJE and ATOK X are not free. I think
it is important to fully support such commercial IMEs. In addition,
this problem was posted in the major mailing-list of Linux in Japan.
So many people may think such serious problem will be fixed by RTM.

Is it better to ask submitter of the original problem to verify the
revised patch again? The original patch was verified one month ago.
I've verified the patch with ATOK X. However, I don't have VJE.
This may also affect non-Linux Unices because some commercial Unices
have licensde and bundled ATOK and/or VJE.

cc'ing jdunn and jaworski.
> > Which widget is being focussed? A text field? Which one? Or a shell?
>
> Widget which has a hyperlink.

Let's suppose we have an HTML document with a hyperlink. Which widget is then
being focussed? The widget that surrounds the entire HTML document?

> Is it better to ask submitter of the original problem to verify the
> revised patch again? The original patch was verified one month ago.
> I've verified the patch with ATOK X. However, I don't have VJE.

Yes, please ask the original bug reporter to verify the latest fix with VJE.

When a document presentation is being destroyed, are the widgets destroyed in a
particular order? For example, small internal widgets (subwindows) first, and
then the shell last? I'm just concerned about setting the focus on the shell
widget (mIMEShellWidget) when IMEDestroyIC is being called. Are you absolutely
sure that mIMEShellWidget is valid at that point and not in the process of being
destroyed itself?
> Let's suppose we have an HTML document with a hyperlink. Which widget is then
> being focussed? The widget that surrounds the entire HTML document?

not exact HTML widget. It will be called from

http://lxr.mozilla.org/mozilla/source/layout/html/base/src/nsPresShell.cpp#2167

I don't have any idea what this widget is. What is domWindow?

> Yes, please ask the original bug reporter to verify the latest fix with VJE.

I've asked. But I couldn't get any reply from him. I'll ask again.

> When a document presentation is being destroyed, are the widgets destroyed in a
> particular order? For example, small internal widgets (subwindows) first, and
> then the shell last? I'm just concerned about setting the focus on the shell

I verified the shell widget is always destroyed at last.

> widget (mIMEShellWidget) when IMEDestroyIC is being called. Are you absolutely
> sure that mIMEShellWidget is valid at that point and not in the process of being
> destroyed itself?

If mIMEShellWidget (mIsToplevel=PR_TRUE) is destroyed, IMEDestroyIC() just
call delete mXIC;

 if (mIsToplevel == PR_TRUE) {
    delete mXIC;
  } else {
  }
  mXIC = 0;

Even if mIMEShellWidget is destroyed before (by above codes),
mXIC is 0 so that the following code will return. The codes
about setting focus will not be executed.

  if(!mXIC) return;

r=erik

OK, I am satisfied that the 2nd patch (10/29/00 21:47) is safe and necessary.
I am also satisfied with Masaki's testing on ATOK. I believe we should get this
into the branch. If desired, we can get this into the trunk first for further
testing. Brendan, do we have permission to land the 2nd patch on the trunk? PDT,
do we have permission to land it on the branch? Marking [rtm+] to get it on
PDT's radar.
Whiteboard: [rtm need info] → [rtm+] [r=erik]
Umm, you need a SR= as well to get on rtm+.

Get it quickly before the PDT bitchslaps you back to rtm need info.
Whiteboard: [rtm+] [r=erik] → [rtm+] [r=erik], NEED SR=
Erik was not explicit but meant to ask Brendan for sr= in Erik's last comment.
The second patch has been carefully reviewed by Erik:
  http://bugzilla.mozilla.org/showattachment.cgi?attach_id=18269
Changing back to [rtm need info] since we don't have an SR= yet. Brendan? Chris?
Whiteboard: [rtm+] [r=erik], NEED SR= → [rtm need info] [r=erik], NEED SR=
I can sr=brendan@mozilla.org the second patch.  Is the first patch unnecessary
even on the trunk?

/be
Whiteboard: [rtm need info] [r=erik], NEED SR= → [rtm+] [r=erik], NEED SR=
Whiteboard: [rtm+] [r=erik], NEED SR= → [rtm+] [r=erik], [rtm+]
The first patch had pieces that were necessary for bug 53990. I've added comments
to that bug.
Whiteboard: [rtm+] [r=erik], [rtm+] → [rtm+] [r=erik], sr=brendan
Erik, Please check into the trunk.
Xianglan, this is the bug I was talking before.  I do not have ATOK installed in
my linux machine.  After erik checks in trunk build, would you verify this in
the trunk build?  Thanks.
Fix checked into trunk. Not marking FIXED yet, since it's not in the branch yet.
This second minimal patch is very safe:
 (1) It's Unix only
 (2) Will only be executed if an IME input context is being used (i.e. only
     for Japanese, Chinese or Korean input)
 (3) Just sets and unsets focus on an IME widget to make sure 3rd party IMEs
     know the input focus window is invalid

ATOK and VJE are 2 of the most popular Windows Japanese IMEs that have been
ported to Linux and other Unix systems.  Without this fix, these Mozilla is
unusable with these IMEs.

In case PDT is reviewing without the diff attachment, here is the diff:
diff -c -r1.242 nsWidget.cpp
*** nsWidget.cpp        2000/10/28 22:16:45     1.242
--- nsWidget.cpp        2000/10/30 05:42:31
***************
*** 3146,3151 ****
--- 3146,3157 ----
    if (!mXIC) return;
    if (mIsToplevel == PR_TRUE) {
      delete mXIC;
+   } else {
+     nsWidget *widget = mXIC->GetFocusWidget();
+     if (widget && widget == this && mIMEShellWidget) {
+       mXIC->SetFocusWidget(mIMEShellWidget);
+       mXIC->UnsetFocusWidget();
+     }
    }
    mXIC = 0;
  }
Thank you for integration, Erik.

Xianglan, to set over-the-spot, please specify in your
prefs.js as below,

user_pref("xim.input_style", "over-the-spot");

Clicking a hyper link causes Mozilla hangs without this fix
because ATOKX XIM server exits by X error.
Masaki,
Please add comments on any other testing you or others have done.  Thanks.
Compared 2000110308-Mtrunk and 2000103009-mn6 build.
On my RH6.2J, where atok12 is installed, Netscape 6 doesn't hang even without
the fix.
And after the fix, Netscape 6 doesn't have this problem either. 
Shanjian,
How about HPUX? Could you check today's trunk build on HPUX?
Thanks.
Shanjian, When you tried this on HP earlier were you using over-the-spot:
  user_pref("xim.input_style", "over-the-spot");
Actually, the pref for over-the-spot style doesn't make any difference for linux build on my environment.
IME still behaves in the same way as on-the-spot after I include the pref in prefs.js file.
As I mentioned before, this problem does not exist on HPUX. I did set
over-the-spot for
HPUX, that is the only IME option working for HP. I am trying to reproduce the
problem on
Linux now.
I just tried atok12 which I found from redhat6.2J commercial CD, 
I could not reproduce the problem. We might have to ask Katakai
to verify it.
Katakai-san and Tajima-san,
would you please verify this fix in today's trunk build? Thanks.
I've verified this problem on 2000110221 with ATOK X.

I can reproduce on 2000102621 but not on 2000110221.

ji and Shanjian, ATOK12 in Redhat62J CD is "ATOK12SE",
which uses kinput2 interface and it's not "ATOK X".
I'll try Solaris build on Monday. Also I'll get VJE IME.
PDT marking [rtm++]. Please check in to the branch ASAP.
Whiteboard: [rtm+] [r=erik], sr=brendan → [rtm++] [r=erik], sr=brendan
Erik is gone for the day. Bob Jung asked me to check this in.

I'm working with Robert Churchill to build a current branch with the patch.

As soon as we build and do normal checking testing it will be checked in.
Fixed.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Masaki (katakai), per Teruko's request I am letting you know that this got 
into the branch build.
Robert and Brian,  Thanks for the getting this checked-in on very short notice.

Masaki and/or Toshi,
We do not have an environment internally to test this bug.  We don't have
ATOK X or VJE for Linux.  Please test and mark this bug VERIFIED (assuming
you can verify it).

We can regress to make sure kinput2 still works.

Also, please make sure the original bugzilla-j bug is updated, so folks in
Japan know this has been fixed on the trunk.
I'm very sorry I found there is a theoretical error in nsWindow.cpp.

> I verified the shell widget is always destroyed at last.

My comment above is true for general widgets, however the order
of destroying ICs is changed when the widgets have child widgets.

IMEDestroyIC() is called from nsWindow::DestroyNative() at

  http://lxr.mozilla.org/mozilla/source/widget/src/gtk/nsWindow.cpp#304

But it's *before* DestroyNativeChildren() which would call
nsWindow::DestroyNative() again.

  http://lxr.mozilla.org/mozilla/source/widget/src/gtk/nsWindow.cpp#309

That means IMEDestroyIC() is called for parent, then IMEDestroyIC()
is called for child. The order is not valid. In that case, only mXIC of
shell widget is destroyed and set to 0, mXIC of child widget is just a
reference and not set to 0, so it may cause critical error.

So, I had to change nsWindow.cpp with the patch. IMEDestroyIC()
should be called after calling DestroyNativeChildren() from
nsWindow::DestroyNative(). I sincerely appoligize for that.
I'll attach an additional patch.
That looks reasonable to me.

What is the behaviour when there is no shell widget that is managed by Mozilla?
 I'm specifically talking about an embedding context where the toplevel window
is not a mozilla window.  Use TestGtkEmbed as an example if you want to see.
The theory sounds reasonable to me, but if it is true, why didn't the build
with the previous fix (2nd patch in this report) ever crash?

My gut feeling is that we haven't really tested all this with enough variations
yet. I'm somewhat uncomfortable about leaving it checked into the branch. Maybe
we should consider backing it out.

Or we could check in the new fix (3rd patch), but we've had even less soak time
on that one.

Reopening. Sorry.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Masaki (or anyone),
Please create and try a testcase for your theory and see what actually happens?
When I put some debug message in IMEDestroyIC() as below,

void
nsWidget::IMEDestroyIC()
{
printf("IMEDestroyIC %x\n", this);

 if (!mXIC) return;
printf("mXIC = %x\n", mXIC);
 if (mIsToplevel == PR_TRUE) {
printf("delete mXIC %x\n", mXIC);
  delete mXIC;
 } else {
  // see discussion in bug 53989
printf("accessing mXIC %x\n", mXIC);
  nsWidget *widget = mXIC->GetFocusWidget();
  if (widget && widget == this && mIMEShellWidget) {
    mXIC->SetFocusWidget(mIMEShellWidget);
    mXIC->UnsetFocusWidget();
  }
 }
 mXIC = 0;
}

And I tried Search->Find In This Page and clicked Cancel button,
which shows the following,

IMEDestroyIC 862d178
mXIC = 86be988
delete mXIC 86be988        # mXIC is deleted here
IMEDestroyIC 86a8058
mXIC = 86be988
accessing mXIC 86be988 # however, trying mXIC->GetFocusWidget()

It is obviously an access error.  I've verified this also happens at
closing Preference Dialog. I've never seen any crash but it should be fixed.
If I could run runtime checker of Sun's compiler (build fails with the compiler
now...),
it could notice the error easily.

With the patch of nsWIndows.cpp, I've verified the order of destroying
nsWidget is correct for all possible windows and dialogs of Mozilla,
e.g. dialogs from Composer, Mail and Address Book.

I'm seeing the irregular case in Bookmark Properties dialog that
three popup windows (for menu) are destroyed after the shell widget
is destroyed. However, it this case, I've also verified that popup window
will be never focused. So it does not cause problem because mXIC will
be never created.
per email from Erik, this fix will be backed out for RTM.  Marking rtm-.
Erik - pls note in this bug report when you have done that.
Whiteboard: [rtm++] [r=erik], sr=brendan → [rtm-] [r=erik], sr=brendan
I backed out the fix from the RTM branch.
For the trunk, I think we should consider keeping the patch, and applying the
additional patch to make sure the objects are destroyed in the correct order.
I will provide review comments for that patch shortly.
I agree and was just going to add that comment myself!
I had a look at the 3rd patch (11/04/00 01:35). I understand that calling
IMEDestroyIC after calling DestroyNativeChildren will cause IMEDestroyIC to be
called for the children before it is called for the top-level parent, and since
they all share the same mXIC, we only delete that object when IMEDestroyIC is
called for the top-level parent, but it seems like the code might still be a bit
fragile even after this change.

Since several objects are all pointing to one object, we need to be very careful
about who is going to actually delete it, so that the other objects are not
holding on to something that has already been deleted. In Mozilla, we often use
normal ref-counting to make sure objects are not freed until there are no more
objects referring to that one. Mozilla also has "weak references". Look for
WeakReference.

Are we sure that all child objects will always be destroyed before a top-level
parent is destroyed? For example, what if we re-parent a child or group of
children? Or, if we are not doing that now, what if we do that in the future?
If this is really the case I would suggest that the object be cached at the
toplevel window.  This way, no child window will ever have a stale reference and
there won't be any of these kinds of race conditions.

Is there a chance that this could be a singleton object?  Does more than one of
these exist for the entire application?
It is my understanding that mXIC is a wrapper around GdkIC, which is itself a
wrapper around XIC. You can only associate one "client" window with an XIC, and
it may not be changed. Apps generally have one XIC per top-level window or one
XIC per widget inside a top-level window. For Mozilla, the Sun folks have
implemented one IC per top-level window.

In the embedding case, Mozilla may be embedded inside an app that does not use
nsWidgets. In this case, we may wish to cache the mXIC at the highest nsWidget.
The child widgets could access the IC by calling a method that goes up the
hierarchy to the top nsWidget. Or each child could store its own pointer to the
IC, but it should then be ref-counted for safety. I'd be happy to go either way
on this. Comments? Blizzard? Katakai?
Thanks for comment, Erik. Yes, we can use the highest widget
as the shell widget to control XIC. Chris has suggested
GetMozArea() to retrieve the widget. I attached the patch
in

http://bugzilla.mozilla.org/show_bug.cgi?id=50130

and it works in both Mozilla and GtkEmbed.
OK, but the patch in bug 50130 does not address the fact that all widgets have
a pointer to a single top-level mXIC object, which is dangerous, as we have
seen. At the very least, we need to apply the 3rd patch in this report
(11/04/00 01:35) to destroy objects in the right order. We may wish to have an
additional patch that either stores the mXIC in *one* place (highest widget)
and has the children fetch that pointer by climbing the hierarchy, or changes
mXIC to a ref-counted object (e.g. XPCOM object that implements nsISupports).

Perhaps there is no need to go for the ref-counting method, so how about
climbing the hierarchy. For example (pseudo-code):

  nsXIC
  nsWidget::GetInputContext(void)
  {
    nsWidget* parent = GetParent();
    if (parent) {
      return parent->GetInputContext();
    }
    else {
      return mXIC;
    }
  }
Erik, Thanks for suggestion. Yes, we should store XIC in one place. However,
I remember that getParent() of nsWidget method can not be used to
retrieve the highest widget so we made the current GetShellWidget() codes.
I don't think the current GetShellWidget() codes and new codes of
using GetMozArea() work properly when the highest widget is destroyed
before child. Can we use hashtable to store ShellWidget and XIC pair?
We can store XIC value in the hashtable and retrieve by ShellWidget key.
Even when shell widget is destroyed before child irregularly, we can
prevent the access error.

nsIMEGtkIC*
nsWidget::GetInputContext()
{
  if (mIMEShellWidget) {
    return (nsIMEGtkIC*) g_hash_table_lookup(mXICLookupTable, mIMEShellWidget);
  }
  return nsnull;
}

void
nsWidget::IMEDestroyIC()
{
  nsIMEGtkIC *xic = GetInputContext();
  // we can prevent bad access (xic->GetFocusWidget()) when
  // mIMEShellWidget is destroyed before
  if (!xic) return;
  if (mIsToplevel == PR_TRUE) {
    // remove XIC from hashtable by mIMEShellWidget key
    g_hash_table_remove(mXICLookupTable, mIMEShellWidget);
    delete xic;
  } else {
    // see discussion in bug 53989
    nsWidget *widget = xic->GetFocusWidget();
    if (widget && widget == this && mIMEShellWidget) {
      xic->SetFocusWidget(mIMEShellWidget);
      xic->UnsetFocusWidget();
    }
  }
}
I got a report that the original submitter has verfied
XIM works without any errors on the latest build on
Debian and FreeBSD.

 http://bugzilla.mozilla.gr.jp/show_bug.cgi?id=285
Maybe I'm misunderstanding, but that suggestion seems very strange to me. When
would a widget be destroyed before one of its children? Should that ever
happen?

Also, your hash table is called mXICLookupTable, implying that it is a member
of nsWidget. Does this mean that each widget contains such a hash table?

Or would the hash table be global? If so, how can we tell whether an entry in
the hash table refers to a valid shell widget?

I don't like this idea at all. I'd rather solve the "parent dying before child"
problem (if there is such a problem).
I'm also misunderstanding, sorry. I was thinking you concerned about
the irregular case. Yes, I never seen such case when I tried
possible window and dialog. So, the exact problems are,
 - each nsWidget has a reference of mXIC
 - each nsWidget has a reference of mIMEShellWidget
and we should try to store mXIC into one place, right?
How about the following scenario?
 - Child nsWidget does not have a reference of mIMEShellWidget.
   When nsWidget wants to acess the shell widget,
   call nsWindow::GetMozArea() for each time
 - Child nsWidget does not have a reference of mXIC.
   To get mXIC, get shell widget first by above,
   then return widget->mXIC if exists
   To store mXIC, get shell widget first by above,
   then store as shhe widget->mXIC
However, GetMozArea() has a cache of the widget, which
means we will still have a reference of IMEShellWidget.
Is this a problem? We will never meet the irregular case,
so can we trust IMEShellWidget is always valid?
Or should we use another GetMozArea() which doesn't have
a cache? But it would cause performance regression because
gdk_window_get_parent() is repeated every time.
It seems like it's OK to assume that the child widgets are destroyed before the
parent widget, right? If so, GetMozArea can cache the shell widget, and it
should be OK.
Erik, don't make the assumption that widgets are destroyed top down in all
cases.  In the case where it's embedded top toplevel window ( not the mozarea )
will be destroyed before the child widgets are and there's a good chance that
the native underlying windows will be destroyed as well.

This means that you have to make sure that you remove all of the native
references in a way that it can handle either case - top down or bottom up.
The shell widget is passed to XCreateIC() as client_window
at XIC creation. The window should be valid while the XIC
is being used. So, from XIM point of view, in either case,
top down or bottom up, the mXIC should be destroyed when
the shell widget is destroyed.

So, how about using hashtable again? We need to have a way to
check shellwidget is valid or not at destroying. I understand
we can not use the reference of shellwidget at creation, also
can not use GetMozArea() at that time. If we could manage
a shellwidget - XIC pair in hashtable, we can remove the entry
at destruction so that we can check the shellwidget is valid or not.
(nsWidget will still have a reference of shellwidget, but will
not use it if no entry is in the hashtable) I think using hashtable
is reasonable solution.

 - hashtable is global
 - when XIC is created, insert the XIC with shellwidget key to hashtable
 - each nsWidget has a reference of the shellwidget
 - each nsWidget can retrieve XIC from the hashtable with shellwidget key
 - when shellwidget is destroyed, remove XIC from hashtable
 - after shellwidget is destroyed, no entry in hashtable for XIC
   child widget has a reference of shellwidget, but does not understand
   it's valid or not. child widget can know the shellwidget is valid
   by checking the hashtable. If entry is in hashtable, it means
   the shellwidget is valid, also XIC is valid. If not, the reference of
   shellwidget can not be used, which means XIC is already destroyed.

I'll attach temp patch.

I don't know enough about widgets and their hierarchies to say whether the hash
table approach is the best design. Chris, what do you think?
Updated Status Whiteboard and removed "[r=erik], sr=brendan", since we are
working on a new patch.
Whiteboard: [rtm-] [r=erik], sr=brendan → [rtm-]
Chris, any comments on the patch of
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=19148 ?
This also can solve the embeded problem, bug 50130.

Added intl and nsbeta1 keywords.
Keywords: intl, nsbeta1
Keywords: crashfreeze
Keywords: freezehang
Changed QA contact to katakai@japan.sun.com
QA Contact: teruko → katakai
I updated the patch. This also can solve bug 50130 (gtkembed problem),
 - use hashtable to manage xic
 - proper way to get shell widget (see bug 50130)
 - correct destroying order
I'll review this soon but I'm in the process of moving.  It will be a few days.
Katakai-san,

It seems that with this code that there is a lot of room for improvement.  For
example, there are a few different ways to get an Input Context:

nsIMEGtkIC *xic = IMEGetInputContext();
xic = nsIMEGtkIC::GetXIC(mIMEShellWidget, gPreeditFontset,

Can't we move that all to a single method with one way to get the XIC?  I also
see this code:

  nsIMEGtkIC *xic = IMEGetInputContext();

  if (!xic) {
    GetXIC();
  }

  xic = IMEGetInputContext();

Which says "get the XIC, if it failed do initialization via GetXIC() and reget
the XIC."

Can't you move that implicit initialization into IMEGetInputContext() and remove
the check?  There might be other places where this isn't done and that check
isn't made.  Might that cause problems?

You also might want to use the mozilla hash table or an nsVoidArray which I'm
pretty sure is a hash table underneath.  In my humble opinion it's much more
readable.

Are there any cases where we are doing XIM pre-edit over anything that is not a
native Mozilla window?  That is, is this something that can be moved into the
nsWindow class?  We are doing some bad casting here back and forth and it gives
me the willies.
brendan informs me that the void array isn't a hash table and he's right.  You
should probably use the has table implementation in xpcom if possible.
Some comments that I hope are helpful:

You can make mXICLookupTable be statically allocated, not just a static pointer
to a dynamically allocated glibc hash table.  BTW, you should use the 'g' rather
than 'm' prefix for such static "globals": gXICLookupTable.

Last time I checked, glib hash tables used chaining, so were malloc-intensive,
and used a division hash.  xpcom/ds/pldhash.[ch] uses double hashing (one
malloc for the all entries in the table) and multiplicative hashing (which beats
division hash in cycles and effectiveness in spreading keys to avoid
collisions).

While you could use xpcom/ds/nsHashtable.h, until the fix for bug 72722 goes in,
that incurs lots of malloc overhead, not only per entry but per entry key!  If
you can tolerate the complexity, pldhash.h usage wins on memory, minimizing
malloc overhead on all fronts.  Here is a sketch, showing the glibc hash table
code in comments, followed by the pldhash alternative:

#include "pldhash.h"

struct nsXICLookupEntry : public PLDHashEntry {
  nsWidget*   mWidget;
  nsIMEGtkIC* mXIC;
};

//GHashTable *nsWidget::mXICLookupTable = NULL;
PLDHashTable nsWidget::gXICLookupTable;

//if (mXICLookupTable == NULL) {
//  mXICLookupTable = g_hash_table_new(g_direct_hash, g_direct_equal);
//}
//XXX we don't check for out-of-memory errors here, in either version.
  if (gXICLookupTable.ops == NULL) {
    PL_DHashTableInit(&gXICLookupTable, JS_DHashGetStubOps(), NULL,
                      sizeof(nsXICLookupEntry), 16))
  }

   // g_hash_table_insert(mXICLookupTable, mIMEShellWidget, xic);
      nsXICLookupEntry* entry =
        NS_STATIC_CAST(nsXICLookupEntry*,
                       PL_DHashTableOperate(&gXICLookupTable, mIMEShellWidget,
                                            JS_DHASH_ADD));
      if (entry) {
        entry->mWidget = gIMEShellWidget;
        entry->mXIC = xic;
      }

//  return (nsIMEGtkIC*) g_hash_table_lookup(mXICLookupTable, mIMEShellWidget);
    nsXICLookupEntry* entry =
      NS_STATIC_CAST(nsXICLookupEntry*,
                     PL_DHashTableOperate(&gXICLookupTable, mIMEShellWidget,
                                          PL_DHASH_LOOKUP));
    return entry->mXIC;

//  g_hash_table_remove(mXICLookupTable, mIMEShellWidget);
    PL_DHashTableOperate(&gXICLookupTable, mIMEShellWidget, PL_DHASH_REMOVE);

Where should the hash table be destroyed?  It appears to be leaked in the patch.

/be
Thank you very much for comments, Chris and Brendan.
I'll post new patch if ready.

The reason why I used GHashTable is I found the hashtable
is used in nsWindows.cpp. Should we file new bug and
try to fix this as well?

// this is a hash table that contains a list of the
// shell_window -> nsWindow * lookups
GHashTable *nsWindow::mWindowLookupTable = NULL;

The hash table that's in nsWindow.cpp is my code and I probably should have used
the correct hash table when I did it.  I'm not that worried about fixing the
code that is in there right now but I am worried about making sure that the
pattern isn't replicated if possible.
*** Bug 71796 has been marked as a duplicate of this bug. ***
Thanks for comments, Chris and Brendan,

I've updated my patch, sorry for late, please review.

- Use PLDHashTable instead of GHashTable
- Most XIM codes have been moved to nsWindows from nsWidget
- XIC is retrieved by IMEGetInputContext() and is only created
  by IMEGetInputContext() when it's called from IMESetFocusWindow(),
  which is being used as a trigger to create XIC
- mIMEShellWindow is also retrieved only in IMESetFocusWindow()
- removed unnecessary retrieving XIC, for example,

  a() {
    nsIMEGtkIC *xic=IMEGetInputContext();
    b();
  }

  b() {
    nsIMEGtkIC *xic=IMEGetInputContext();
    ...
  }

  has been changed to the following if possible case,

  a() {
    nsIMEGtkIC *xic=IMEGetInputContext();
    b(xic);
  }

  b(nsIMEGtkIC *xic) {
    ...
  }

And I've verified the following with the new patch,

- verified no errors and warnings with gcc, with -DUSE_XIM without -DUSE_XIM
- verified japanese input works on Linux and Solaris
- verified the original bug does not happen on Linux with ATOKX
- verified japanese input works on gtkEmebed on Linux and Solaris,
  on both on-the-spot and over-the-spot mode (with patch of bug 66951,
  but I found focus problem on gtkEmebed on nightly. see bug 76617)

But, I didn't put deletion codes of the hash table.
Chris, do you have idea about the deletion point ? 

Attached patch updated patchSplinter Review
Since, this is P3 | Critical, and it looks like we may have a patch, can we get
this in for nsbeta1? Please assign for M0.9.1
Updated the patch for today's nightly. (some updates in nsWidget.cpp and
nsWindow.cpp). Please review.
Attached patch updated patch.Splinter Review
You should change this code:

+  GtkWidget *top_mozarea = aWindow->GetMozArea();
+  if (top_mozarea) {
+    mozAreaWindow = (nsWindow *)gtk_object_get_data(GTK_OBJECT(top_mozarea), "n
sWindow");
+  }
 
to use NS_STATIC_CAST() like so:

mozAreaWindow = NS_STATIC_CAST(nsWindow *,
gtk_object_get_data(GTK_OBJECT(top_mozarea), "nsWindow"));

That's the pattern we use other places.

@@ -723,14 +609,11 @@
 /* virtual */ void
 nsWidget::LoseFocus(void)
 {
+exit(1);
   // doesn't do anything.  needed for nsWindow housekeeping, really.
   if (mHasFocus == PR_FALSE)
     return;
 
You probably want to remove that exit(1) in there. :)

Why are nsWidget::IMEComposeStart(), IMECommitEvent(), and IMEComposeText()
still included if they aren't called and are just assertions?  Are there other
dependencies that I don't see?  If they are truly unused, please remove them.

+struct nsXICLookupEntry : public PLDHashEntryHdr {
+  nsWindow*   mShellWindow;
+  nsIMEGtkIC* mXIC;
+};

This is a valid C+ but it's kind of clunky to me.  Since you are using a struct
can you use:

struct nsXICLookupEntry {
  PLDHashEntryHdr  keyHash;
  nsWindow        *mShellWindow;
  nsIMEGtkIC      *mXIC;
};

When I looked at the PLD hash table header file I was kind of confused by this
because of the use of public: instead of just putting it in the structure.

+/* virtual */ void
+nsWindow::LoseFocus(void)
+{
+  // doesn't do anything.  needed for nsWindow housekeeping, really.
+  if (mHasFocus == PR_FALSE)
+    return;
+
+#ifdef USE_XIM
+  IMEUnsetFocusWindow();
+#endif // USE_XIM
+  
+  sFocusWindow = 0;
+  mHasFocus = PR_FALSE;
+
+}

You should move the sFocusWindow = 0 and the mHasFocus related code back down to
nsWidget if possible in case it ends up getting used from there again.

In nsWindow::IMEGetShellWindow(void) you have another old style case that should
be an NS_STATIC_CAST().

+  textEvent.widget = (nsWidget *)this;

Another old style cast.  There are a few of those that you should fix.

Other than that the patch looks OK to me.  Let me do some testing.

The patch doesn't seem to cause any problems that I can see.  r/sr=blizzard
Thank you very much for comments.

I have fixed the codes except LoseFocus(). I'm sorry
I don't have good idea for LoseFocus(). Do you have?

I'll attach the diffs from the *previous* patch just for quick
reference.

- fixed old style cast
- removed exit()
- change to the following for nsXICLookupEntry,
struct nsXICLookupEntry {
  PLDHashEntryHdr mKeyHash;
  nsWindow*   mShellWindow;
  nsIMEGtkIC* mXIC;
}; 
- removed Unnecessary virtual functions from nsWidget
  only IMECommitEvent() is needed

I'll commit the changes after testing is completed on Linux.
Thank you very much for review.
fix checked in
Assignee: tajima → katakai
Status: REOPENED → NEW
change status to FIXED
Status: NEW → RESOLVED
Closed: 24 years ago24 years ago
Resolution: --- → FIXED
marked this as verified. Mozilla with ATOK12 works fine also
I've never heard such problem from users.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: