Closed Bug 26473 Opened 25 years ago Closed 13 years ago

info needed for new Euro character?

Categories

(Core :: JavaScript Engine, defect, P3)

x86
Linux
defect

RESOLVED DUPLICATE of bug 652771

People

(Reporter: rginda, Unassigned)

Details

BugSplat doesn't seem to want to move this bug, manually copying instead...
********************************************************************************

This may not be a bug, and it may depend on some language in the specification
over just what version of Unicode we claim to support.

Java is bowing to international pressure, and adding support for the 'Euro'
character in the next minor release.  I think this is towards conformance with
Unicode 2.1; ECMA-262 v. 1 mentions Unicode 2.0, I think.  Maybe this'll be a
version 2 feature.

If we support this character, we need to make a few modifications to the table
in jsstr.c; it might also be good to look at any other changes in Unicode 2.1 at
that time.

Here's the letter detailing the change to Java:


        Alok Rishi <alok@Eng.Sun.COM>   06/26/98 19:51

Subject: Review: Java Language Specification - Unicode 2.1/Euro support
     To: java-lic@blort.Eng.Sun.COM, alok@javahome.Eng.Sun.COM

[Kindly note the change in review period end date: now July 13th]

Summary:

The European Monetary Union (EMU) is introducing a new common
currency, the euro, on January 1, 1999 in eleven European countries.
Licensees and developers alike have expressed a strong desire that
euro support be included in the JDK as soon as possible, beginning
with JDK 1.1.7. This would entail updating the Java Language
Specification to reflect the Unicode 2.1 standard, along with
implementing the necessary changes in JDK 1.1.7. Kindly review the
proposed changes and send your feedback to
javasoft-jls-comments@eng.sun.com, by July 13th.

Details:

While the euro is being added to JDK 1.2, we normally would not
consider such a change in a JDK maintenance release (JDK 1.1.X).
However, there are several key factors for adding the euro in JDK
1.1.7 release.  First, the euro represents a unique circumstance with
a profound global impact on all companies doing business in
Europe. Second, the impact is imminent; applications written for the
European market must be ready to handle the euro by January 1, 1999.
Third, the euro changes are based on the Unicode standard and thus,
changes to the JDK will be minimal (see details below).  Finally, many
of our licensees and developers have expressed a strong desire for
euro support now to ensure adequate time to incorporate it into
existing applications.  

To reflect the Unicode 2.1 standard, the changes required in the Java
Language Specification involve updates to the java.lang.Character
class. (Note that in the actual implementation, it will require
modifying more than just the Character class.)  

The Unicode 2.1 version has a number of corrections to the character
property data, including the new character for the euro.  However,
only some of these will be visible to Java programs.  The
java.lang.Character API only exposes selected attributes of Unicode
characters.  (For details, see ftp.unicode.org/Public/2.1-Update.)
For illustrative purposes, we provide a table below that compares the
visible changes from Unicode 2.0.14 (used in JDK 1.1 through 1.1.6) to
Unicode 2.1.2 (latest published Unicode version) for JDK 1.1.7 as follows:


Method                  When given   Under JDK 1.1-1.1.6,     Under JDK 1.1.7,
java.lang.Character.*   character    returns                  returns
---------------------   ----------   ----------------------   -------------------
toLower                 \u018e       \u0258                   \u01dd
toLower                 \u019f       \u019f                   \u0275
toUpper                 \u01dd       \u01dd                   \u018e
toTitle                 \u01dd       \u01dd                   \u018e
toUpper                 \u0258       \u018e                   \u0258
toTitle                 \u0258       \u018e                   \u0258
toLower                 \u0275       \u0275                   \u019f
toUpper                 \u03c2       \u03c2                   \u03a3
toTitle                 \u03c2       \u03c2                   \u03a3
toUpper                 \u1e9b       \u1e9b                   \u1e60
toTitle                 \u1e9b       \u1e9b                   \u1e60
getType                 \u20ac       UNASSIGNED(0)            CURRENCY_SYMBOL(26)
isDefined               \u20ac       false                    true
isJavaIdentifierPart    \u20ac       false                    true
isJavaIdentifierStart   \u20ac       false                    true
getType                 \u301f       START_PUNCTUATION(21)    END_PUNCTUATION(22)
getType                 \ufffc       UNASSIGNED(0)            OTHER_SYMBOL(28)
isDefined               \ufffc       false                    true

Note:  The most important of these is the inclusion of 20AC, the Euro
character.
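
For anyone wanting to spot-check the case-mapping rows of the table from
script, here is a minimal sketch. It is not part of the original letter. It
assumes a current JavaScript engine, whose Unicode tables are far newer than
2.1, and uses String.prototype.toLowerCase/toUpperCase as stand-ins for the
java.lang.Character methods. The `rows` array and the `escapeChar` and
`checkRows` helpers are hypothetical names chosen for illustration. Only a
subset of the toLower/toUpper rows is covered; the toTitle rows are skipped
because JavaScript has no title-case method, and the getType/isDefined rows
have no direct equivalent.

```javascript
// Each entry: [String method standing in for the Character method,
//              input character, value the table lists for JDK 1.1.7].
const rows = [
  ["toLowerCase", "\u018e", "\u01dd"],
  ["toLowerCase", "\u019f", "\u0275"],
  ["toUpperCase", "\u01dd", "\u018e"],
  ["toUpperCase", "\u0258", "\u0258"],
  ["toUpperCase", "\u03c2", "\u03a3"],
  ["toUpperCase", "\u1e9b", "\u1e60"],
];

// Format a one-character string as a \uXXXX escape for readable output.
function escapeChar(ch) {
  return "\\u" + ch.charCodeAt(0).toString(16).padStart(4, "0");
}

// Apply each mapping and record whether the engine agrees with the table.
function checkRows(table) {
  return table.map(([method, input, expected]) => {
    const actual = input[method]();
    return {
      method,
      input: escapeChar(input),
      expected: escapeChar(expected),
      actual: escapeChar(actual),
      matches: actual === expected,
    };
  });
}

const results = checkRows(rows);
const mismatches = results.filter((row) => !row.matches);
console.log(mismatches.length === 0 ? "all rows match" : mismatches);
```

Any row reported as a mismatch would indicate that the engine's case tables
differ from the JDK 1.1.7 column for that character.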

Although Unicode 2.1.2 is shown above, we plan to update to the latest
published Unicode 2.1.x version available for the JDK 1.1.7 release.
Therefore, the final diffs in the table above may undergo minor
variations reflecting changes between these Unicode 2.1.x releases.

We welcome your feedback.  Please send it to
javasoft-jls-comments@eng.sun.com, by July 13th.

Thanks.


------- Additional Comments From mccabe  09/21/98 15:15 ------- 

I've checked the current ECMAScript v1 (edition 2) draft, and the conformance
section states that characters will be interpreted in conformance with the
Unicode v2.0 standard - which doesn't include the above Unicode 2.1 changes.  We
might want to consider adopting Unicode 2.1 (Or higher?  Perhaps Unicode will
have 3.0 by then) for ECMAScript v2, in which case we should make these changes.
This will put us technically out of conformance with ECMAScript 1, but I don't
think anyone will care.

------- Additional Comments From mikeang  09/21/98 16:44 ------- 

The changes from Unicode 2.0 to 2.1 seem to be mostly minor corrections and 
clarifications.  There are some changes to the bidirectional algorithms.  The 
differences are on the unicode website at 
http://www.unicode.org/unicode/reports/tr8.html
I would guess that our European customers really want the Euro character, and 
that the only people who get upset will be those who want conformance for its 
own sake.  Maybe if we bundle this with getting rid of \u it will seem like less 
of a change.

------- Additional Comments From mccabe  09/21/98 17:05 ------- 

Reassigning to our i18n delegate.
Keywords: js1.5
a. The Java Unicode 2.1 tables are quite different and need to be validated
before being slapped right in.
b. Share with Mozilla international code to reduce footprint?
c. Very little or no impact on Euro character by leaving things alone for now.
Status: NEW → ASSIGNED
Target Milestone: M20
We puzzled over this recently and couldn't find a way to detect the properties
specified in the table, at least for the politically-sensitive euro.  Seems OK
to punt for the time being.

Severity: normal → minor
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → REMIND
Target Milestone: M20 → Future
reopening and marking Future...
Status: RESOLVED → REOPENED
Resolution: REMIND → ---
Updating QA contact -
QA Contact: rginda → pschwartau
Keywords: js1.5
cc'ing waldemar for comment. The current spec does seem to require 2.1
compatibility but specifically only includes '$' as an identifier start. Clearly
the changes in toLower/toUpper need to be reflected in our implementation, but
could you comment on how the Euro symbol should be treated?

...and by the way, if you have any idea how the tables in jsstr.c function, feel
free to work up a patch :-)
Assignee: rogerl → general
Status: REOPENED → NEW
QA Contact: pschwartau → general
Target Milestone: Future → ---
From an ECMAScript perspective this character is meaningless: it has no special mapping and belongs to no identifier or other special character class.
Status: NEW → RESOLVED
Closed: 24 years ago → 13 years ago
Resolution: --- → DUPLICATE
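
The closing comment's claim, that the euro sign has no special mapping and is
not an identifier character in ECMAScript, can be checked with a short sketch
along these lines. It assumes a current JavaScript engine in which the Function
constructor is available; `hasCaseMapping` and `parsesAsVariableName` are
hypothetical helper names, not engine APIs.

```javascript
const EURO = "\u20ac";

// True if either case conversion changes the character.
function hasCaseMapping(ch) {
  return ch.toUpperCase() !== ch || ch.toLowerCase() !== ch;
}

// True if `name` parses as a variable name. The Function constructor is used
// only to invoke the parser; the generated function is never called.
function parsesAsVariableName(name) {
  try {
    new Function("var " + name + ";");
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

console.log("euro has a case mapping:", hasCaseMapping(EURO));
console.log("euro parses as an identifier:", parsesAsVariableName(EURO));
console.log("$ parses as an identifier:", parsesAsVariableName("$"));
```

The '$' check is included for contrast, since the spec names it explicitly as
an identifier start while the euro sign gets no such treatment.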