Closed Bug 198475 Opened 22 years ago Closed 17 years ago

SpiderMonkey needs to upgrade to Unicode 4.0

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 394604
mozilla1.9alpha1

People

(Reporter: ernestcline, Unassigned)

Details

(Keywords: helpwanted)

Attachments

(3 files)

User-Agent: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.3) Gecko/20030312
Build Identifier: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.3) Gecko/20030312

In the Unicode 4.0 beta, the Mongolian Vowel Separator (U+180E) is being reclassified as a whitespace character (general category Zs). This may affect other components as well, but it certainly affects the JavaScript engine, since the ECMAScript specification refers to Unicode white space characters.

Reproducible: Always
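To make the effect of the reclassification concrete, here is a minimal C sketch of a whitespace predicate that treats U+180E as a space separator, as the Unicode 4.0 beta data described in this report would. The name `is_space_separator_sketch` and the short code point list are illustrative assumptions; this is neither SpiderMonkey's actual table nor the complete Zs set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: an illustrative subset of space separators (Zs).
 * U+180E is listed because this report says Unicode 4.0 classifies it
 * as Zs; a real engine would generate its table from the Unicode data
 * files rather than hand-maintain a list like this one.
 */
static const uint16_t zs_subset[] = {
    0x0020, /* SPACE */
    0x00A0, /* NO-BREAK SPACE */
    0x180E, /* MONGOLIAN VOWEL SEPARATOR */
    0x3000  /* IDEOGRAPHIC SPACE */
};

static bool
is_space_separator_sketch(uint16_t c)
{
    size_t i;

    for (i = 0; i < sizeof zs_subset / sizeof zs_subset[0]; i++) {
        if (zs_subset[i] == c)
            return true;
    }
    return false;
}
```

An engine built on Unicode 3 tables would answer false for 0x180E here, which is the behavioral difference this bug is about.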
Changing summary from "Unicode 4.0 and Mongolian Vowel Separator U+180E" to "SpiderMonkey needs to upgrade to Unicode 4.0" It is my understanding that SpiderMonkey still uses the Unicode 3 tables (please correct me if I'm wrong on that).
Assignee: rogerl → khanson
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Unicode 4.0 and Mongolian Vowel Separator U+180E → SpiderMonkey needs to upgrade to Unicode 4.0
Wouldn't it be best to wait until Unicode 4.0 is finalized? I'd file an additional bug in Internationalization to update the rest of the browser when 4.0 final is out.
While it is possible that some minor changes in the character properties could happen between now and September 2003, when Unicode 4.0 is due to be approved, the new characters are set in stone and will fail to be adopted only in the unlikely event that the proposed Unicode 4.0 is rejected in its entirety. Still, I can see planning this improvement for the 1.5 release, which should be released at about the same time as Unicode 4.0. The comment period for the 4.0 beta ends today (21 March), and it is doubtful they will change much. Wouldn't it be nice for Mozilla to be the first browser to support U4.0? By planning now, Mozilla could do so.
Assignee: khanson → general
QA Contact: pschwartau → general
Keywords: helpwanted
Target Milestone: --- → mozilla1.9alpha
UTF-8 encoded. The functions 'validate1' ... 'validate14' MUST all return true.
This patch updates the character table to Unicode 4.1.0. The lookup tables use a three-array method similar to the original source, but the content has been optimized for ECMA-262. These changes were built from the latest version of SpiderMonkey (1.5); I can provide the full jsstr.h and jsstr.c files to the Mozilla team on request.
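For readers unfamiliar with the general idea of a three-array lookup, the following C sketch shows one common form: the first array maps the high bits of a code point to a block, the second maps the block plus the low bits to an attribute index, and the third holds the attribute words. All names, the 16-entry block size, and the ASCII-only toy data are assumptions made for illustration; they are not the arrays from the attached patch or from jsstr.c.

```c
#include <stdint.h>

/* Hypothetical attribute flags for this toy example. */
#define TOY_DIGIT 0x01
#define TOY_UPPER 0x02
#define TOY_LOWER 0x04

/* Array 3: the distinct attribute words. */
static const uint8_t toy_attrs[] = { 0, TOY_DIGIT, TOY_UPPER, TOY_LOWER };

/* Array 1: high bits (c >> 4) -> block number; identical blocks are shared. */
static const uint8_t toy_stage1[8] = { 0, 0, 0, 1, 2, 3, 4, 5 };

/* Array 2: (block << 4) | low bits -> index into toy_attrs. */
static const uint8_t toy_stage2[6 * 16] = {
    /* block 0: 0x00-0x2F, nothing classified */
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    /* block 1: 0x30-0x3F, digits 0-9 then punctuation */
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0,
    /* block 2: 0x40-0x4F, '@' then A-O */
    0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
    /* block 3: 0x50-0x5F, P-Z then punctuation */
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0,
    /* block 4: 0x60-0x6F, '`' then a-o */
    0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
    /* block 5: 0x70-0x7F, p-z then punctuation and DEL */
    3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0
};

/* Look up the attribute word for a code point; the toy data covers ASCII only. */
static uint8_t
toy_attr(uint16_t c)
{
    if (c > 0x7F)
        return 0;
    return toy_attrs[toy_stage2[(toy_stage1[c >> 4] << 4) | (c & 0x0F)]];
}
```

The benefit of the layout is that repeated blocks (here, the first three rows of ASCII) are stored once, which matters far more when the table spans the whole Basic Multilingual Plane.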
Comment on attachment 211325: Diffs for source file

First, use prevailing style (see the Emacs and often vim modelines in the license comment; see also the prevailing style ;-): four space indentation. Also, unindent function names in function definition to start in column 1, with the type, mode, and any storage class on the previous line. So

jschar
JS_TOUPPER(jschar c)

not

>+jschar JS_TOUPPER(jschar c)
>+{

Second, no else after return non-sequiturs. In particular, that could mean:

>+ if (ccode & 0x80)
>+ return (c - (ccode >> 10));
>+ else
>+ return (c + (ccode >> 10));

becomes:

    return (ccode & 0x80) ? c - (ccode >> 10) : c + (ccode >> 10);

More detailed comments in a bit. Feel free to attach a single patch containing unified diffs (cvs diff -pu8 is good) for both files, instead of separate patches per file. Thanks for patching this bug!

/be
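As a rough illustration of the two style points in this review, the C sketch below puts the return type on its own line, starts the function name in column 1, uses four-space indentation, and replaces the if/else pair with a single conditional expression. The names `jschar_sketch` and `apply_case_offset_sketch` are hypothetical, and the attribute layout (bit 0x80 selecting subtraction, bits 10 and up holding the offset) is inferred only from the snippet quoted in the review, not from the actual patch.

```c
#include <stdint.h>

/* Stand-in for SpiderMonkey's 16-bit jschar type (assumption). */
typedef uint16_t jschar_sketch;

/*
 * Hypothetical helper: apply a case-mapping offset packed into an
 * attribute word. Bit 0x80 means "subtract"; otherwise the offset
 * stored in bits 10 and up is added.
 */
static jschar_sketch
apply_case_offset_sketch(jschar_sketch c, uint32_t ccode)
{
    uint32_t offset = ccode >> 10;

    return (jschar_sketch) ((ccode & 0x80) ? c - offset : c + offset);
}
```

With an offset of 32 and the 0x80 bit set, 'a' (0x61) maps to 'A' (0x41); with the bit clear, 'A' maps back to 'a'.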
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → DUPLICATE