Make sure nsIDOMKeyEvent::GetCharCode () works fine with non-0-plane characters

NEW
Unassigned

Status

()

Core
DOM: Events
13 years ago
9 years ago

People

(Reporter: Dainis Jonitis, Unassigned)

Tracking

Trunk
All
Windows XP
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 2 obsolete attachments)

(Reporter)

Description

13 years ago
This is a followup bug 295228.

The problem is that current code throws away Unicode plane information by
casting the UCS4 character code that is returned by GetCharCode () function to
PRUnichar.

Boris sugests (295228#c18 and 295228#c20) that it is not a good idea to extend
the nsIDomNSUIEvent interface with new methods. Instead I created new methods
that perform conversion from UCS4 character to UTF16 string and back. This
solution is endian independent.

At least on Windows there is also a problem in nsWindow.cpp widget code that is
not ready to receive the high surogate character codes from WM_CHAR message. It
is easy to fix and I will later attach patch.

Now I want to hear some feadback whether changes that I've made are ok. I've put
UCS4<->UTF16 conversion routines in intl/unicharutil, but probably they better
fit in xpcom/string.
(Reporter)

Comment 1

13 years ago
Created attachment 194479 [details] [diff] [review]
Possible solution
See bug 297943. It looks as if we have been approaching the problem from
different ends :)
(Reporter)

Comment 3

13 years ago
Great, I will use your version of Win32 nsWindow::OnChar. 
(Reporter)

Comment 4

13 years ago
Note to self:
Boris in bug 316394 added function AppendUCS4ToUTF16 () that can be used instead of UCS4CharToUTF16 (). Look if there is equivalent for UTF16ToUCS4Char (). Need to update the patch with new functions.
> Look if there is equivalent for UTF16ToUCS4Char

Yes, I moved those in the same patch -- see macros in nsCharTraits.h.
(Reporter)

Comment 6

13 years ago
Created attachment 203309 [details] [diff] [review]
Possible solution

Updated the patch to use the new function AppendUCS4ToUTF16 ().
I do not see how using only macros from nsCharTraits.h extract UCS4 code from nsAString (so that caller is only one line). So I moved my UTF16ToUCS4Char () helper function from nsUniCharUtil.cpp to nsReadableUtils.cpp.
I think this function can be helpful. I searched Mozilla source code for users of SURROGATE_TO_UCS4. Majority of them work with PRUnichar* params, but there are a couple of users that take nsAString as a source:
  nsUnicodeNormalizer.cpp, mdn_normalize () 
  nsTextBoxFrame.cpp, nsTextBoxFrame::CalculateTitleForWidth ()
  nsIDNService.cpp, utf16ToUcs4
Some of callers use it in loop and for them it could be helpful to also return the number of UTF16 codepoints that were used to construct the returned UCS4 character.
Probably it could be more useful instead of:
  NS_COM PRUint32 UTF16ToUCS4Char (const nsAString& aSource);
have something like:
  NS_COM PRUint32 UTF16ToUCS4Char (const nsAString& aSource, PRUint32& aDest)

This will return number of UTF16 codepoints consumed, but result will be in aDest. It will better match the AppendUCS4ToUTF16 (). Still not sure about best name for this function.

Boris, what do you think?
Attachment #194479 - Attachment is obsolete: true
Before I start trying to dig into this, does the UTF16CharEnumerator enumerator class in nsUTF8Utils.h (yeah, kinda odd place for it to be) do what you're trying to do?  Seems to me like it might...
(Reporter)

Comment 8

13 years ago
Yeah, seems that PRUint32 UTF16CharEnumerator::NextChar(nsAString::const_iterator& iter, const nsAString::const_iterator& end, PRBool *err = nsnull) is doing what I need. The only problem that each caller then should use four lines instead of one. 

 nsAString::const_iterator start, end;
 aSource.BeginReading(start);
 aSource.EndReading(end);
 PRUint32 ucs4 = UTF16CharEnumerator::NextChar(start, end);

Not a big deal, but on the onther hand we can add third implementation of NextChar to UTF16CharEnumerator that takes "const nsAString&" as a param for those callers who do no care about updated iterator values.

Something like that:

static PRUint32 NextChar(const nsAString& aSource)
{
    nsAString::const_iterator start, end;
    aSource.BeginReading(start);
    aSource.EndReading(end);
    return NextChar (start, end);
}

Is it worth adding a new method?
Probably worth it, yeah.  Might want to call it FirstChar() (or FirstUCS4Char()?), though, since that's what it returns.
(Reporter)

Comment 10

13 years ago
Created attachment 203674 [details] [diff] [review]
Possible solution

Add function FirstUCS4Char () to nsReadableUtils.cpp that internally calls UTF16CharEnumerator::NextChar (). It avoids the need to callers to adittionally include nsUTF8Utils.h that is not an obvious place where to expect this function. I've intentionally have not made it inline in nsReadableUtils.h to avoid every user of nsReadableUtils.h implicitly including nsUTF8Utils.h and thus reducing build time.
Attachment #203309 - Attachment is obsolete: true
Assignee: events → nobody
QA Contact: ian → events
You need to log in before you can comment on or make changes to this bug.