Closed Bug 297943 Opened 19 years ago Closed 4 years ago

Unable to enter Plane 1 (SMP, Supplemental Multilingual Plane) text - typing is ignored.

Categories

(Core :: DOM: UI Events & Focus Handling, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla1.9alpha1

People

(Reporter: crookedcricker, Assigned: smontagu)

Details

(Keywords: intl)

Attachments

(3 files, 3 obsolete files)

User-Agent:       Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.8b2) Gecko/20050615
Build Identifier: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.8b2) Gecko/20050615

Using a plane 1 character map, such as Shavian, does not work in any Mozilla (or
Firefox or Thunderbird) text editor.  All plane 1 characters are ignored when
typed.  Such characters are displayed correctly within web pages and emails, and
you can copy from another editor and paste into Mozilla text boxes, location
bar, or the mail editor, etc. without a problem.  

Reproducible: Always

Steps to Reproduce:
1. Download plane1test keymap for xkb and copy it to /etc/X11/xkb/symbols/pc/
2. $ setxkbmap plane1test
3. Type some random text into any Mozilla text entry location.

Actual Results:  
Text is ignored.  Cursor does not move.

Expected Results:  
Various plane 1 characters should appear.

This is with the gtk2 and xft build, which is needed for good plane 1 support.
Severity: major → normal
Keywords: intl
I looked into this issue some time ago when it was mentioned on one of the
Mozilla newsgroups, but I didn't have the resources then to debug it properly. I
remember finding a couple of places by code inspection where Unicode values from
keyboard entry were being truncated to the 0-0xFFFF range.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Attached patch Patch (obsolete) — Splinter Review
With this patch unshifted and shifted keys work, but CapsLock doesn't. I'm not
sure if that's a fault with the patch or with the keymap.
Assignee: mozeditor → smontagu
Status: NEW → ASSIGNED
Attachment #186735 - Flags: superreview?(neil.parkwaycc.co.uk)
Attachment #186735 - Flags: review?(daniel)
Comment on attachment 186735 [details] [diff] [review]
Patch

r=daniel@glazman.org
looks ok to me (editor part); I recommend getting another review for the gtk2
part.
Attachment #186735 - Flags: review?(daniel) → review+
(In reply to comment #3)
> With this patch unshifted and shifted keys work, but CapsLock doesn't.

That is, what CapsLock does is return the unshifted mapping converted to
uppercase. If that is expected behaviour on gtk, then fine.

Target Milestone: --- → mozilla1.9alpha
(In reply to comment #3)
> Created an attachment (id=186735) [edit]
> Patch
> 
> With this patch unshifted and shifted keys work, but CapsLock doesn't. I'm not
> sure if that's a fault with the patch or with the keymap.

If you're using the keymap I supplied, then it's probably the keymap.  I didn't
bother with capslock handling.  Besides, many of the SMP scripts don't even have
upper or lower case, so I'm not sure how to handle the capslock key with e.g.
Shavian, which has no case.

I'm attaching the Shavian keymap I use, for an example of a complete map for one
script.
Attached file Shavian keymap (xkb)
here are the results of typing with the shavian keymap:

unshifted:
That extra replacement character at the end is probably an effect of bug 237585,
which makes editing SMP text very difficult, particularly on pango, because the
whole string will disappear if it contains any illegal codepoints.
(In reply to comment #9)
> That extra replacement character at the end is probably an effect of bug 237585,
> which makes editing SMP text very difficult, particularly on pango, because the
> whole string will disappear if it contains any illegal codepoints.

The Shavian all appears correct, except for that lone surrogate at the end,
which I'm familiar with - I sometimes delete Shavian text when composing an
email, and have to hit delete or backspace twice per character.  That bug is
still around for some reason, but can be worked around easily enough in a
standard xft/gtk2 build.  I haven't had to use pango since I don't use complex
scripts.  I will try this myself as soon as I get a chance to compile it,
perhaps tomorrow.
I finally got Firefox compiled with your patch, and it works great for me, other
than the effects of bug 237585, which hopefully is going to be fixed sometime.

Attached patch Patch version 2 (obsolete) — Splinter Review
I changed the gtk part to accept only codepoints in the full Unicode range
0x1-0x10FFFF.

In case anybody was wondering, non-BMP keyboard input on Windows works out of
the box since all characters are sent as UTF-16.
Attachment #186735 - Attachment is obsolete: true
Attachment #186735 - Flags: superreview?(neil.parkwaycc.co.uk)
Attachment #187026 - Flags: superreview?(neil.parkwaycc.co.uk)
Attachment #187026 - Flags: review+
Comment on attachment 187026 [details] [diff] [review]
Patch version 2

I can't tell from our event model whether we're expecting UCS-32 or UTF-16
event codes, but I expect that's outside of the scope of this bug.

>-    if ((ucs != -1) && (ucs < 0x10000))
>+    if ((ucs > 0) && (ucs <= MAX_UNICODE))
Did you mean >= 0?

>+      if (IS_IN_BMP(character)) {
>+        nsAutoString key(character);
>+        return TypedText(key, eTypedText);
>+      }
>+      else
>+      {
>+        nsAutoString surrogatePair(H_SURROGATE(character));
>+        surrogatePair.Append(L_SURROGATE(character));
>+        return TypedText(surrogatePair, eTypedText);
>+      }
I was wondering whether it would be simpler to use this:
nsAutoString text;
if (IS_IN_BMP(character))
{
  text.Append(character);
}
else
{
  text.Append(H_SURROGATE(character));
  text.Append(L_SURROGATE(character));
}
return TypedText(text, eTypedText);
Also I seem to recall that editor brace style is all braces on their own line,
whereas you have a { after your if condition.
Attachment #187026 - Flags: superreview?(neil.parkwaycc.co.uk) → superreview+
Attached patch Revised patch (obsolete) — Splinter Review
Arising from IRC discussion, this patch adds a modification to Windows to
dispatch a single key event with the UTF-32 codepoint.

Seamonkey typeaheadfind breaks on non-BMP characters either way, but I'd rather
fix that in a separate bug.
Attachment #187026 - Attachment is obsolete: true
Attachment #187334 - Flags: review?(neil.parkwaycc.co.uk)
Comment on attachment 187334 [details] [diff] [review]
Revised patch

When you say seamonkey typeaheadfind, are you saying that firefox typeaheadfind
works? Only there are lots of other places with a similar mechanism such as
menus and trees which will presumably break in a similar way.

Also, did you really mean to ask me for r rather than sr? I don't normally use
an IME let alone type non-plane-0 characters especially since my XP PC just
died :-/
Firefox typeaheadfind works with the single keyevent with a UTF-32 charcode
(i.e. the latest patch). Both flavours of typeaheadfind break with the pair of
UTF-16 keyevents (status quo on Windows and a version of the patch which I
tested but didn't attach) in a case where a page doesn't contain the character
typed but does contain a character whose high surrogate is the same. My test
scenario was searching for Shavian characters in
http://alanwood.net/unicode/deseret.html.

You're right about menus and trees, but I think fixing them can be even lower
priority, unless Ethan is planning to release Firefox with the UI in Shavian ;-)

I did mean to ask you for r on the widget/src/windows part, but no problem if
you want to defer on that.
Simon,

Is this patch sufficient to enter non plane-0 characters in form controls and
urlbar? With your patch and my patches from bug 306627 I still get two symbols.
I have not set up correct fonts, though. I am trying to input some musical
symbols (e.g. U+1D11E) and see that in nsPlaintextEditor::HandleKeyPress the
character 0x1D11E is correctly converted in UTF16 string with codes 0xD834 and
0xDD1E.

I think that Win32 nsWindow::OnKeyDown should take care to determine if
character is the high surrogate and fetch the successive low surogate character
and not expect that ProcessMessage with WM_CHAR will handle this. This is
required to cancel _both_ WM_CHAR messages if WM_KEYDOWN with non plane 0
character was canceled.

Please also consider to use the helper functions UCS4CharToUTF16 and
UTF16ToUCS4Char from bug 306627 to avoid code duplication. Whether to put them
in intl/unicharutil or xpcom/string is up to you. You can suggest a better
function names, too :)
QA Contact: editor
What's the status on this bug?  As of Firefox 3.0.3, I still cannot type non-bmp text, it is still ignored. I can copy and paste, and edit by moving things around, but I cannot type a single plane-1 character.
Attached patch Gtk patchSplinter Review
Revisiting this on current trunk. OS X seems to work fine, and Windows seems to work fine without the bugs in typeaheadfind that I found before (comment 16). Seamonkey typeaheadfind is still broken, but I think this is a bug on their side -- it's broken with IME as well.

So I don't think we need to patch editor at all, but we do need at least to patch widget/src/gtk2 to make it not drop SMP characters on the floor. This patch makes gtk behave similarly to OS X and send a text event with a UTF-16 string.
Attachment #187334 - Attachment is obsolete: true
Attachment #455621 - Flags: review?
Attachment #187334 - Flags: review?(neil)
Attachment #455621 - Flags: review? → review?(karlt)
Comment on attachment 455621 [details] [diff] [review]
Gtk patch

(In reply to comment #19)
> This patch makes gtk behave similarly to OS X and send a text event with a
> UTF-16 string.

A TEXT_EVENT doesn't seem ideal because it will prevent keyboard accelerators from working.  It sounds like something should be fixed to interpret nsKeyEvent::charCode properly.

But this can't be much worse than skipping the event as we do now.
Attachment #455621 - Flags: review?(karlt) → review+
I'm not going to lose any sleep over keyboard accelerators not working for these characters -- they don't work for CJK characters either!
Simon, can this be now marked as RESOLVED FIXED?
Component: Editor → Widget: Win32
QA Contact: editor → win32

Moving all keyboard/IME open bugs to DOM: UI Events & Focus Handling component.

Component: Widget: Win32 → DOM: UI Events & Focus Handling

(In reply to Simon Montagu :smontagu from comment #22)

http://hg.mozilla.org/mozilla-central/rev/d2d570a50ddf

(In reply to :ehsan akhgari from comment #23)

Simon, can this be now marked as RESOLVED FIXED?

Looks like we can to it.

And also, currently, we send the event to editor with a set of composition events. This is different behavior from other platforms. But it's not scope of this bug.

Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: