Closed
Bug 54135
Opened 24 years ago
Closed 23 years ago
conversion (fromU/toU) problem- Sjis code x'81ca' becomes x'fa54'
Categories
(Core :: DOM: Editor, defect, P3)
Tracking
()
VERIFIED
FIXED
mozilla0.9.6
People
(Reporter: hobbit_mak, Assigned: ftang)
References
()
Details
(Keywords: intl)
Attachments
(20 files)
4.69 KB, patch
1.09 KB, text/plain
4.75 KB, patch
820 bytes, patch
16.51 KB, application/octet-stream
35.80 KB, patch
313.27 KB, text/plain
479.19 KB, patch
8.00 KB, text/plain
145.66 KB, application/octet-stream
928.55 KB, patch
35.80 KB, text/plain
5.20 KB, text/plain
483.29 KB, text/html
490.65 KB, text/html
483.38 KB, text/html
483.38 KB, text/html
62.78 KB, patch
164.38 KB, application/octet-stream (ftang: review+)
146.12 KB, application/octet-stream
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; m18) Gecko/20000924
BuildID: 2000092408

If you edit a Shift JIS page and save it, the proper character x'81ca' becomes x'fa54'.

Reproducible: Always

Steps to Reproduce:
1. Edit the page http://homepage1.nifty.com/hobbit/html/utf8.html
2. Save it to a local file.

Actual Results: x'81ca' (proper code) is changed to x'fa54' (Windows code).

Expected Results: x'81ca' is retained.

Maybe related to bug 35166. http://bugzilla.mozilla.org/show_bug.cgi?id=35166
Assignee | ||
Comment 2•24 years ago
|
||
Minor issue. Marking it as assigned.
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Target Milestone: --- → Future
Reporter | ||
Comment 3•24 years ago
|
||
x'fa54'(Windows code) cannot be displayed by Mozilla itself. (Build 2000112704)
Comment 4•24 years ago
|
||
It is reported that Linux build also had this problem. http://bugzilla.mozilla.gr.jp/show_bug.cgi?id=474
Reporter | ||
Comment 5•24 years ago
|
||
Also, SJIS code 0x81E0 becomes 0x8790, and SJIS code 0x81E6 becomes 0xFA5B.
Reporter | ||
Comment 6•24 years ago
|
||
Reporter | ||
Comment 7•24 years ago
|
||
Patch above was verified on Windows 2000 environments.
Reporter | ||
Comment 8•24 years ago
|
||
Reporter | ||
Comment 9•24 years ago
|
||
Attached list is in utf-8 encoding.
Assignee | ||
Comment 10•24 years ago
|
||
remove Future from the target milestone.
Keywords: intl
Target Milestone: Future → ---
Reporter | ||
Comment 11•24 years ago
|
||
This problem is fixed in Build 2001011720.
Reporter | ||
Comment 12•24 years ago
|
||
Sorry, I had tested with modified modules. This problem is still reproducible on Build ID 2001012304.
Reporter | ||
Comment 13•24 years ago
|
||
Reporter | ||
Comment 14•24 years ago
|
||
Because http://bugzilla.mozilla.org/show_bug.cgi?id=44374 was fixed, the following changes now occur:
81BE becomes 879C.
81BF becomes 879B.
81DA becomes 8797.
81DB becomes 8796.
81DF becomes 8791.
81E3 becomes 8795.
81E7 becomes 8792.
Patch is also updated.
Assignee | ||
Updated•24 years ago
|
Summary: Sjis code x'81ca' becomes x'fa54' → conversion problem- Sjis code x'81ca' becomes x'fa54'
Assignee | ||
Comment 15•24 years ago
|
||
hobbit.makoto@nifty.ne.jp: How did you generate these patches? Did you change the source table and use the ufrom and uto tools to generate them? If so, can you give us the changes to the source table?
Summary: conversion problem- Sjis code x'81ca' becomes x'fa54' → conversion (fromU/toU) problem- Sjis code x'81ca' becomes x'fa54'
Reporter | ||
Comment 16•24 years ago
|
||
I could not find out how to use the tool, so I changed both the source comments and the object by hand.
Reporter | ||
Comment 17•24 years ago
|
||
Reporter | ||
Comment 18•24 years ago
|
||
Mozilla converts U+FFE2 to 7C7B (ISO-2022-JP). It must be 224C (ISO-2022-JP).
Reporter | ||
Comment 19•23 years ago
|
||
How can I change the source table and use the ufrom and uto tool to generate it? I could not find these tools in source file.
Assignee | ||
Comment 20•23 years ago
|
||
The tools are at mozilla/intl/uconv/tools/umaptable.c. nhotta, can you help drive this? I am overloaded.
Assignee: ftang → nhotta
Status: ASSIGNED → NEW
Comment 21•23 years ago
|
||
hobbit.makoto@nifty.ne.jp, could you summarize the current remaining problem?
Reporter | ||
Comment 22•23 years ago
|
||
The problem remaining in build 2001050804 is:
- Ten characters are changed if you edit a Shift JIS source and save it as Shift JIS.
0x81be becomes 0x879c
0x81bf becomes 0x879b
0x81ca becomes 0xfa54
0x81da becomes 0x8797
0x81db becomes 0x8796
0x81df becomes 0x8791
0x81e0 becomes 0x8790
0x81e3 becomes 0x8795
0x81e6 becomes 0xfa5b
0x81e7 becomes 0x8792
The problem with ISO-2022-JP was fixed. I could not download the latest source yet, so I could not use the tool yet.
Comment 23•23 years ago
|
||
Updated•23 years ago
|
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9.1
Comment 24•23 years ago
|
||
hobbit.makoto@nifty.ne.jp: Please try the attached file and update your patch, thanks.
Reporter | ||
Comment 25•23 years ago
|
||
I downloaded mozilla/intl/uconv/tools/, but I could not find out how you made sjis.ut and sjis.uf. I went to mozilla/intl/uconv/tools/, ran nmake on make.win, and got umaptable.exe. Maybe you made sjis.ut and sjis.uf with umaptable and the original conversion table, but I could not find where and how sjis.ut and sjis.uf are made.
Comment 26•23 years ago
|
||
Let me ask Frank and I will update.
Assignee | ||
Comment 27•23 years ago
|
||
To convert from SJIS into Unicode, I run /intl/uconv/tools/cp932tojdx.pl against http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT and it generates source/intl/uconv/ucvja/jis0208.ump. This file is shared by the SJIS/EUC/ISO-2022-JP to Unicode conversions.

To convert from Unicode into Shift JIS, I run intl/uconv/tools/jis0208fromcp932.pl against http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT. It generates a file, which I then pipe into
umaptable -uf > 0208.uf
to generate jis0208.uf.
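The first half of the procedure above (reading a vendor mapping file and building the SJIS to Unicode table) can be sketched as follows. This is only an illustration in Python, not the cp932tojdx.pl or umaptable code: the function names are hypothetical, the input layout (hex SJIS code, hex Unicode code, `#` comment) is an assumption about CP932.TXT, and the sample rows are a hand-picked excerpt rather than the real file.

```python
def parse_mapping(text):
    """Parse lines of the assumed form '0xSJIS 0xUCS2 # comment' into pairs."""
    pairs = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # entry with no Unicode column (undefined code)
        pairs.append((int(fields[0], 16), int(fields[1], 16)))
    return pairs


def to_unicode_table(pairs):
    """SJIS -> Unicode is unambiguous: each SJIS code has one target."""
    return {sjis: ucs for sjis, ucs in pairs}


def duplicates(pairs):
    """Unicode code points reachable from more than one SJIS code (N:1)."""
    by_ucs = {}
    for sjis, ucs in pairs:
        by_ucs.setdefault(ucs, []).append(sjis)
    return {ucs: sorted(codes) for ucs, codes in by_ucs.items() if len(codes) > 1}


# Illustrative excerpt only; the real CP932.TXT is much larger.
SAMPLE = """\
# SJIS    Unicode
0x8140  0x3000  # IDEOGRAPHIC SPACE
0x81CA  0xFFE2  # FULLWIDTH NOT SIGN
0xEEF9  0xFFE2  # FULLWIDTH NOT SIGN
0xFA54  0xFFE2  # FULLWIDTH NOT SIGN
"""

if __name__ == "__main__":
    pairs = parse_mapping(SAMPLE)
    for ucs, codes in duplicates(pairs).items():
        print(f"U+{ucs:04X} <- " + ", ".join(f"{c:04X}" for c in codes))
```

The N:1 entries reported by `duplicates` are exactly the ones for which the Unicode to SJIS direction needs a policy decision, which is what the rest of this bug is about.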
Reporter | ||
Comment 28•23 years ago
|
||
Reporter | ||
Comment 29•23 years ago
|
||
Reporter | ||
Comment 30•23 years ago
|
||
I got cp932.txt from unicode.org and made sjis.uf from it. But some characters are mapped to two SJIS positions, so I commented out the SJIS locations that are not proper in JIS X 0208 and 0212. I attached the diff list and sjis.uf, and confirmed that this sjis.uf solves the problems.
Updated•23 years ago
|
Target Milestone: mozilla0.9.1 → mozilla0.9.2
Updated•23 years ago
|
Target Milestone: mozilla0.9.2 → mozilla0.9.1
Comment 31•23 years ago
|
||
Comment 32•23 years ago
|
||
I put up a diff for sjis.uf; it's very big. I expected something similar to the patch of 02/14/01 06:13. hobbit.makoto@nifty.ne.jp, do you have any idea why the diff is so large? What characters did you actually change? Please list the character codes of the changed characters.
Reporter | ||
Comment 33•23 years ago
|
||
I suppose that original table is not derived from CP932.txt. I would like to know the original table also, but I could not find it.
Comment 34•23 years ago
|
||
I am going to ask Frank. The characters you changed are the same as listed in your comment 2001-05-08 18:07?
Reporter | ||
Comment 35•23 years ago
|
||
No, the characters I changed from cp932.txt are listed in 05/15/01 07:30. None of the characters in 2001-05-08 18:07 were changed. They are the same as in cp932.txt.
Reporter | ||
Comment 36•23 years ago
|
||
It is strongly recommended to record which tool, table, or other resource each source was created from, preferably in the source file itself. Maybe the lack of this record is why this bug is difficult to solve. In http://bugzilla.mozilla.org/show_bug.cgi?id=35166 you concluded that you use cp932 for Unicode to SJIS conversion.
Comment 37•23 years ago
|
||
Bug 67374 - sources and tools to build unicode converters not in tree.
Depends on: 67374
Updated•23 years ago
|
Whiteboard: ftang to provide a source file for the current sjis.uf
Comment 38•23 years ago
|
||
TM to 0.9.2 per PDT triage (it's OK to check it in by Friday or after 0.9.1 branch is made).
Target Milestone: mozilla0.9.1 → mozilla0.9.2
Assignee | ||
Updated•23 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 40•23 years ago
|
||
pdt+ based on the 6/11 pdt meeting.
Assignee | ||
Updated•23 years ago
|
Whiteboard: ftang to provide a source file for the current sjis.uf → [PDT+]ftang to provide a source file for the current sjis.uf
Assignee | ||
Comment 41•23 years ago
|
||
I don't think we have time to address this problem by moz0.9.2. Push to moz0.9.3
Target Milestone: mozilla0.9.2 → mozilla0.9.3
Assignee | ||
Comment 42•23 years ago
|
||
remove PDT+
Whiteboard: [PDT+]ftang to provide a source file for the current sjis.uf → ftang to provide a source file for the current sjis.uf
Assignee | ||
Updated•23 years ago
|
Whiteboard: ftang to provide a source file for the current sjis.uf → no progress yet. ftang to provide a source file for the current sjis.uf
Comment 44•23 years ago
|
||
I read part of the program for Japanese-Unicode conversion, but I could not identify the sources and methods used to generate some of the mapping tables. So I made a tool to generate jis0201.uf, jis0208.uf, jis0208.ump, jis0208ext.uf and sjis.uf from CP932.TXT and SHIFTJIS.TXT. The *.uf files are generated with 'umaptable'.

The diffs are so large because the original mapping policy for codes where SJIS:UCS2 = N:1 is to use the HIGHER SJIS code. That is not a good idea. They should be mapped to the LOWER SJIS code (without IBM ext codes: bug 82678). See http://support.microsoft.com/support/kb/articles/Q170/5/59.ASP.

testpage: http://rh.vinelinux.org/~shom/sjis-cp932.html

----------

In addition, this tool can generate tables from APPLE_JAPANESE.TXT.
# ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/JAPANESE.TXT

If it is possible to add "Shift_JIS (Macintosh)", some problems will be resolved:

1) SJIS in/out problems (bookmark import, saving mail draft, compose, etc.) on Mac OS 8, 9
SJIS 815C (U+2014) EM DASH
SJIS 8160 (U+301C) WAVE DASH
SJIS 8161 (U+2016) DOUBLE VERTICAL LINE
SJIS 817C (U+2212) MINUS SIGN
SJIS 8191 (U+00A2) CENT SIGN (questionable: U+FFE0?)
SJIS 8192 (U+00A3) POUND SIGN (questionable: U+FFE1?)
SJIS 81CA (U+00AC) NOT SIGN (questionable: U+FFE2?)

2) Apple extended ShiftJIS codes (SJIS 8540-886D, EB41-ED96)
# partly, because APPLE defined some codes as Unicode Sequences.
# mozilla cannot process Unicode Sequences.

testpage: http://rh.vinelinux.org/~shom/sjis-mac.html
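The N:1 policy question described in this comment can be illustrated with a small Python sketch. It is not the mkjpconv.pl logic; `from_unicode_table` and `round_trip` are hypothetical helpers, and the pairs are limited to two characters whose duplicate codes appear in this bug. With the "higher code" policy 0x81CA does not survive a round trip, while with the "lower code" policy it does.

```python
def from_unicode_table(pairs, prefer=min):
    """Build Unicode -> SJIS, resolving N:1 entries with `prefer`.

    prefer=max models the old "use the HIGHER SJIS code" policy,
    prefer=min models the proposed "use the LOWER SJIS code" policy.
    """
    candidates = {}
    for sjis, ucs in pairs:
        candidates.setdefault(ucs, []).append(sjis)
    return {ucs: prefer(codes) for ucs, codes in candidates.items()}


def round_trip(sjis, to_unicode, from_unicode):
    """Decode one SJIS code to Unicode and encode it back."""
    return from_unicode[to_unicode[sjis]]


# Illustrative (SJIS, Unicode) pairs; several SJIS codes share one code point.
PAIRS = [
    (0x81CA, 0xFFE2), (0xEEF9, 0xFFE2), (0xFA54, 0xFFE2),
    (0x81E0, 0x2252), (0x8790, 0x2252),
]

if __name__ == "__main__":
    to_u = dict(PAIRS)
    for name, policy in (("higher", max), ("lower", min)):
        from_u = from_unicode_table(PAIRS, prefer=policy)
        for code in (0x81CA, 0x81E0):
            back = round_trip(code, to_u, from_u)
            print(f"{name}: {code:04X} -> {back:04X}")
```

Choosing the lowest code also means that codes in the IBM extension range are never produced when a JIS or NEC code exists for the same character, since those ranges sort lower.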
Comment 45•23 years ago
|
||
Comment 46•23 years ago
|
||
usage: mkjpconv.pl SHIFTJIS.TXT CP932.TXT
(or mkjpconv.pl SHIFTJIS.TXT APPLE_JAPANESE.TXT
APPLE_JAPANESE.TXT is generated (CR->LF) from APPLE/JAPANESE.TXT)
SHIFTJIS.TXT is: ftp://ftp.unicode.org/Public/MAPPINGS/EASTASIA/JIS/SHIFTJIS.TXT
Comment 48•23 years ago
|
||
Matsumoto san, could you put sjis.uf generated by your tool? I think the current problem is that it is hard to identify modifications. For example, if we want to change the mapping for Shift_JIS 0x81ca, we want to identify that change in sjis.uf. Then we can make sure the change won't affect other characters.
Comment 49•23 years ago
|
||
Comment 50•23 years ago
|
||
Comment 51•23 years ago
|
||
The diffs are very large, but I can see all glyphs defined in SHIFTJIS.TXT on http://rh.vinelinux.org/~shom/sjis-cp932.html. I think the current mapping table has many (hidden) problems, especially with the dual-mapped codes in CP932.TXT. Do you have a tool (or method) to generate the SJIS->UCS2, UCS2->SJIS, JIS->UCS2 and UCS2->JIS mapping tables?
Comment 52•23 years ago
|
||
Comment 53•23 years ago
|
||
I made a tool to check all codes in CP932.TXT.

# to generate Shift JIS encoded HTML page
perl mksjistest.pl CP932.TXT > sjis-cp932.html
# to generate UTF-8 encoded HTML page
perl mksjistest.pl CP932.TXT UTF-8 > sjis-cp932-utf8.html

I modified sjis-cp932-utf8.html with 0.9.2 and with 0.9.2 + generated maps, and did 'Save As Charset' with Shift_JIS. (I'm using Linux. Please check on Windows.)

diffs are:
SRC = SJIS, ORG = modified by 0.9.2, NEW = modified by newmap

SRC   ORG   NEW
------------------ JIS defined region
81BE  879C  81BE
81BF  879B  81BF
81CA  FA54  81CA
81DA  8797  81DA
81DB  8796  81DB
81DF  8791  81DF
81E0  8790  81E0
81E3  8795  81E3
81E6  FA5B  81E6
81E7  8792  81E7
------------------ NEC specific codes
8754  FA4A  8754
8755  FA4B  8755
 :     :     :
875D  FA53  875D
8782  FA59  8782
8784  FA5A  8784
878A  FA58  878A
8790  8790  81E0
8791  8791  81DF
8792  8792  81E7
8795  8795  81E3
8796  8796  81DB
8797  8797  81DA
879A  FA5B  81E6
879B  879B  81BF
879C  879C  81BE
------------------ NEC selected IBM ext region
ED40  FA5C  ED40
 :     :     :
EEF8  FA49  EEF8
EEF9  FA54  81CA
EEFA  FA55  EEFA
EEFB  FA56  EEFB
EEFC  FA57  EEFC
------------------ IBM ext region
FA40  FA40  EEFA
 :     :     :
FA49  FA49  EEF8
FA4A  FA4A  8754
 :     :     :
FA53  FA53  875D
FA54  FA54  81CA
FA55  FA55  EEFA
 :     :     :
FA57  FA57  EEFC
FA58  FA58  878A
FA59  FA59  8782
FA5B  FA5B  81E6
FA5C  FA5C  ED40
 :     :     :
FC4B  FC4B  EEEC
-------------------------

I think the new mapping policy is the same as OE. (I heard OE maps codes in the IBM ext region to the NEC selected region.)
Comment 54•23 years ago
|
||
Comment 55•23 years ago
|
||
Comment 56•23 years ago
|
||
Comment 57•23 years ago
|
||
Comment 58•23 years ago
|
||
Comment 59•23 years ago
|
||
Assignee | ||
Comment 60•23 years ago
|
||
roy yokoyama, can you help check in the changes? shoji-san, which diffs should we pick?
Assignee: ftang → yokoyama
Status: ASSIGNED → NEW
Comment 62•23 years ago
|
||
Please use the *.uf and *.ump files in the next attachment (the old newmap.zip does not include jisx0208ext.uf, sorry) or create them with mkjpconv.pl (from SHIFTJIS.TXT and CP932.TXT). 'jisx0201gl.uf' is obsolete (not used in any source). If these are acceptable (I'll make testcases), please add mkjpconv.pl to intl/uconv/tools. # cp932tojdx.pl and jis0208fromcp932.pl will be obsolete. I don't know where the source of jis0212.{uf,ump} is. I want to change mkjpconv.pl to make jis0212.{uf,ump}.
Comment 63•23 years ago
|
||
Comment 65•23 years ago
|
||
nsbranch- since Frank moved it to 0.9.5
Comment 66•23 years ago
|
||
shoji-san: what is the status of this bug? Are we waiting for ftang to provide the sjis.uf source as stated in the whiteboard? Note: I'd appreciate it if you could change the status of patches which are already obsolete. === cc'ing ftang
Comment 67•23 years ago
|
||
Comment 68•23 years ago
|
||
Please test the new maps on Windows, Mac and OS/2. testcases.zip has SJIS encoded texts to test.

1. display
ALL chars in raw.txt must be shown.
On Windows, ALL chars in rawext.txt and rawibmext.txt must be shown.

2. compose (round trip)
1) edit raw{,ext,ibmext}.txt.html in composer
2) save as with ShiftJIS
3) rawdump.pl <saved html>
"<ORG>:<NEW>:DIFF" lines are codes that did not round trip.
New codes must be "SJIS lower" in http://bugzilla.mozilla.org/attachment.cgi?id=44509&action=view
(see http://support.microsoft.com/support/kb/articles/Q170/5/59.ASP)

3. mail
1) compose new mail
2) CUT & PASTE all chars in raw.txt
3) send
ALL chars in the mail with raw.txt must be shown.
On Windows, ALL chars in the mail with raw{ext,ibmext}.txt must be shown.

------
If any problem occurs on Mac or OS/2, especially with the 9 chars in http://rh.vinelinux.org/~shom/sjisprob.html , it should not be corrected by changing the mapping tables.
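The round-trip check in step 2 can be approximated with a short Python sketch. The actual rawdump.pl is not shown in this bug, so this is a hypothetical stand-in: it splits two Shift JIS byte strings into characters using the usual lead-byte ranges (0x81-0x9F and 0xE0-0xFC) and reports each position where the saved code differs, in an `<ORG>:<NEW>:DIFF` style. It assumes both inputs contain the same number of characters.

```python
def split_sjis(data: bytes):
    """Split Shift JIS bytes into one- and two-byte characters."""
    chars = []
    i = 0
    while i < len(data):
        b = data[i]
        is_lead = 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC
        if is_lead and i + 1 < len(data):
            chars.append(data[i:i + 2])  # double-byte character
            i += 2
        else:
            chars.append(data[i:i + 1])  # ASCII or half-width kana
            i += 1
    return chars


def round_trip_diffs(original: bytes, saved: bytes):
    """List codes that changed between the original and the saved file."""
    return [
        f"{org.hex().upper()}:{new.hex().upper()}:DIFF"
        for org, new in zip(split_sjis(original), split_sjis(saved))
        if org != new
    ]


if __name__ == "__main__":
    # Hypothetical data: two of the codes reported in this bug, plus a
    # full-width space (8140) that is expected to survive unchanged.
    original = bytes.fromhex("81BE81CA8140")
    saved = bytes.fromhex("879CFA548140")
    for line in round_trip_diffs(original, saved):
        print(line)
```

An empty result would mean every code round-tripped; any reported line points at a character whose saved code should be compared against the "SJIS lower" column.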
Comment 69•23 years ago
|
||
nhotta is back from sabbatical. Assigning back to him.
Assignee: yokoyama → nhotta
Status: ASSIGNED → NEW
Comment 70•23 years ago
|
||
move to 0.9.6
Status: NEW → ASSIGNED
Target Milestone: mozilla0.9.5 → mozilla0.9.6
Comment 71•23 years ago
|
||
I think the tool has to be reviewed first. Frank, please review mkjpconv.pl included in the attachment of 08/08/01 03:17.
Assignee: nhotta → ftang
Status: ASSIGNED → NEW
Whiteboard: no progress yet. ftang to provide a source file for the current sjis.uf
Assignee | ||
Updated•23 years ago
|
Status: NEW → ASSIGNED
Comment 72•23 years ago
|
||
Viewing the following diff in 4.x, we can see that Mozilla is generating codes which 4.x cannot show, so adding the 4xp keyword. Diff between ...-sjis-0.9.2.html and ..--sjis-new.html: http://bugzilla.mozilla.org/attachment.cgi?id=44534&action=view
Keywords: 4xp
Assignee | ||
Comment 73•23 years ago
|
||
Comment on attachment 45060 [details]
newmap.zip (mkjpconv.pl, jis0208.uf, jis0208ext.uf, jis0201.uf, sjis.uf, IBMNEC.map )
rs=ftang.
Attachment #45060 -
Flags: review+
Assignee | ||
Comment 74•23 years ago
|
||
Please check them in.
Assignee | ||
Comment 75•23 years ago
|
||
give back to nhotta for check in.
Assignee: ftang → nhotta
Status: ASSIGNED → NEW
Updated•23 years ago
|
Status: NEW → ASSIGNED
Comment 76•23 years ago
|
||
rs=blizzard
Comment 77•23 years ago
|
||
should someone from international QA be the qa_contact for this bug ?
Comment 79•23 years ago
|
||
Checked in to the trunk. The tool still needs to be checked in. Frank, please review the tool. http://bugzilla.mozilla.org/attachment.cgi?id=51199&action=view
Comment 80•23 years ago
|
||
The tool issue to be handled by bug 67374. Mark this as FIXED.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED