Closed Bug 30176 Opened 25 years ago Closed 24 years ago

Fail to print Chinese webpages

Categories

(Core :: Internationalization, defect, P3)

x86
Linux
defect

VERIFIED FIXED

People

(Reporter: wenzhuo, Assigned: yueheng.xu)

mozilla either hangs or prints out funny chars and sometimes blank pages.
Component: All → Internationalization
Product: Architecture → Browser
Version: 5.0 → other
boing
Assignee: waterson → ftang
QA Contact: shaver → teruko
reassign to PostScript owner dcone
Assignee: ftang → dcone
Status: NEW → ASSIGNED
In order to print Chinese characters through PostScript, the Chinese AFM (Adobe 
Font Metrics) file appropriate for that font has to be installed or 
accessible. Otherwise it will fall back to the default AFM files. I am not sure 
where to go from here or how to resolve this bug.
Assignee: dcone → ftang
Status: ASSIGNED → NEW
dcone - please read the following newsgroup postings.
Adding yueheng.xu@intel.com to the cc list.
Reassigning the bug back to dcone. We can chat about this issue after beta1.

news://news.mozilla.org/38C567F3.6FF7F61%40intel.com
news://news.mozilla.org/38598C72.F0B2946%40netscape.com
news://news.mozilla.org/385FF708.CBE89408%40netscape.com
news://news.mozilla.org/7DAA70BEB463D211AC3E00A0C96B7AB20359F002%40orsmsx41.jf.intel.com
Assignee: ftang → dcone
George, note that this bug is also about Unix printing I18N. If someone at Sun
is working on this, it would be a good idea to coordinate with Intel.
Okay, Erik, well-heard.
Status: NEW → ASSIGNED
Target Milestone: M16
This bug, together with bugs #31356, #31351 and #31360, is under consideration
by me now. The solution needs the following steps:

1. Change the Unicode to PostScript string conversion hack in 
   function PostScriptTextout() of file nsRenderingContextPS.cpp 
   so that non-Latin characters are also converted correctly
   (e.g. a non-Latin Unicode char gets represented as octal numbers like 
     \215\312, etc.)

2. Write a post-processing filter which redefines the "show" command
   so that it recognizes the above \xxx\xxx representation as a single char
   and calls the correct rendering procedure, with the help of a properly
   installed font file usable by the PostScript interpreter.

   In TurboLinux, the approach is to install PostScript fonts for the
   char set used. 

   In our Intel Consumer Electronics Distribution of Linux, we insert
   the font info into the PostScript file to make it portable to any
   PostScript interpreter.
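
   The octal representation in step 1 could be produced along these lines.
   This is only a rough sketch, not the code in nsRenderingContextPS.cpp:
   EscapeForPostScript is a hypothetical helper, and it assumes the Unicode
   text has already been converted to bytes in whatever encoding the
   installed PostScript font expects.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not from the Mozilla tree): turn an already-encoded
// byte string into the body of a PostScript string literal.
std::string EscapeForPostScript(const std::string& bytes) {
  std::string out;
  for (unsigned char c : bytes) {
    if (c == '(' || c == ')' || c == '\\') {
      // These three characters are special inside a PostScript (...) string.
      out += '\\';
      out += static_cast<char>(c);
    } else if (c < 0x20 || c >= 0x7f) {
      // Non-printable and high-bit bytes become three-digit octal escapes.
      char buf[5];
      std::snprintf(buf, sizeof buf, "\\%03o", static_cast<unsigned>(c));
      out += buf;
    } else {
      out += static_cast<char>(c);
    }
  }
  return out;
}
```

   A two-byte character then shows up in the output as two escapes, which
   is the pair the redefined "show" in step 2 has to treat as one char.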

   
Make sure you also change the GetWidth function so that it measures 
reasonably. Fixing DrawString itself is necessary but not sufficient.
Frank,

   You commented on bug #30176, which I am working on
now. Can you point me to all the locations where GetWidth should be
changed, as far as you know?

    Thanks.

-Yueheng Xu-
yueheng.xu@intel.com

xu- I believe the current problem code is in gfx/src/ps/nsAFMObject.cpp
754 nsAFMObject :: GetStringWidth(const PRUnichar *aString,nscoord& aWidth,nscoord aLength)
755 {
756 PRUint8   asciichar;
757 PRUnichar *cptr;
758 PRInt32   i ,fwidth,idx;
759 float     totallen=0.0f;
760 
761  //XXX This needs to get the aString converted to a normal cstring  DWC
762  aWidth = 0;
763  cptr = (PRUnichar*)aString;
764 
765   for(i=0;i<aLength;i++,cptr++){
766     asciichar = (*cptr)&0x00ff;
767     idx = asciichar-32;
768     fwidth = (PRInt32)(mPSFontInfo->mAFMCharMetrics[idx].mW0x);
769     totallen += fwidth;
770   }
771 
772   totallen = NSFloatPointsToTwips(totallen * mFontHeight)/1000.0f;
773   aWidth = NSToIntRound(totallen);
774 }

Notice that line 766 truncates the upper 8 bits of the Unicode value. Read my 
newsgroup posting about some approximate methods for when *cptr >= 0x0100:
1. Latin scripts- find a "similar width character" in ASCII. For example, U+0100 
(LATIN CAPITAL LETTER A WITH MACRON) can take the width of its "similar width 
character", "A". We can generate this table from the Unicode database 
(ftp://ftp.unicode.org/Public/3.0-Update/UnicodeData-3.0.0.txt) by taking the first 
character in the 6th field (field 5, character decomposition mapping, in 
ftp://ftp.unicode.org/Public/3.0-Update/UnicodeData-3.0.0.html) as the "similar 
width character".
2. Combining mark characters- the width is 0. We can generate an IsWidth0 function 
from the Unicode database by taking the 3rd field (field 2, General Category, 
in ftp://ftp.unicode.org/Public/3.0-Update/UnicodeData-3.0.0.html#General 
Category) and treating all characters in the Mn/Mc/Cc/Cf categories as characters of 
width 0.
3. For CJK ideographs (U+3400-U+9FFF, U+F900-U+FAFF) and Korean Hangul 
(U+AC00-U+D7FF), we can assume the appearance is square. Therefore, we can 
approximate their width as a ratio (1 ???) of their height. We probably need to 
get this ratio from somewhere a power user can change (prefs.js ???) 
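
The three rules above could be combined roughly as follows. This is only a sketch under assumptions, not the fix that was checked in: every name here is hypothetical, the tables are tiny stand-ins for ones generated from UnicodeData, the CJK ratio is assumed to be 1.0, and the fallback width is an arbitrary placeholder. Widths are in AFM units (1/1000 of the font size), and asciiWidths is assumed to hold 95 entries indexed by (char - 32), like mAFMCharMetrics.

```cpp
#include <cstdint>

// Assumption: square CJK glyphs, i.e. width == font height (ratio 1.0).
constexpr int kAssumedCjkWidth = 1000;
// Assumption: arbitrary placeholder for characters no rule covers.
constexpr int kFallbackWidth = 500;

// Rule 3: ranges quoted in the comment above.
bool IsSquareCjk(char16_t ch) {
  return (ch >= 0x3400 && ch <= 0x9FFF) ||  // CJK ideographs
         (ch >= 0xF900 && ch <= 0xFAFF) ||  // CJK compatibility ideographs
         (ch >= 0xAC00 && ch <= 0xD7FF);    // Korean Hangul
}

// Rule 2: stand-in for a generated Mn/Mc/Cc/Cf table.
bool IsWidth0(char16_t ch) {
  return ch < 0x0020 ||                     // C0 controls (Cc)
         (ch >= 0x007F && ch <= 0x009F) ||  // DEL and C1 controls (Cc)
         (ch >= 0x0300 && ch <= 0x036F);    // combining diacritical marks
}

// Rule 1: stand-in for a generated decomposition table; 0 means "no entry".
char16_t SimilarWidthAscii(char16_t ch) {
  switch (ch) {
    case 0x0100: return u'A';  // LATIN CAPITAL LETTER A WITH MACRON
    case 0x0101: return u'a';  // LATIN SMALL LETTER A WITH MACRON
    default:     return 0;
  }
}

int ApproximateWidth(char16_t ch, const int* asciiWidths) {
  if (ch >= 0x0020 && ch < 0x007F) return asciiWidths[ch - 32];
  if (IsWidth0(ch)) return 0;
  if (IsSquareCjk(ch)) return kAssumedCjkWidth;
  if (char16_t similar = SimilarWidthAscii(ch)) return asciiWidths[similar - 32];
  return kFallbackWidth;
}
```

Checking the printable ASCII range first keeps the index (ch - 32) inside the table, which the masked lookup on line 766 does not guarantee.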


This is a duplicate of 31360, which was WORKSFORME as of this morning.


*** This bug has been marked as a duplicate of 31360 ***
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → DUPLICATE
The hang at print time has been resolved (it was related to #31350 and
#31360). The funny-character printout is a duplicate of #31356 and #31351, for 
which I have a fix pending review and check-in. However, the GetWidth() part (as 
pointed out by ftang in the comments) has not been resolved yet, so the 
cn.yahoo.com printout will have overlapping lines. I am still working on it, so 
keep this bug open. 
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
I am working on the GetWidth() part for CJK chars and am assigning this reopened 
bug to myself.
Assignee: dcone → yueheng.xu
Status: REOPENED → NEW
The fix was checked in on April 4th, 2000. I tested it using a base build
of 2000030708.  The design documentation is at http://linux.webchina.org/
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
I verified this as fixed.  The page is sent to the printer. The remaining problem 
is in bug 35910.
Status: RESOLVED → VERIFIED