The PA_HackTable in the parser, which converts CP 1252 to Unicode, is redundant and buggy: it is missing some of the newer CP 1252 characters, such as the euro sign. It might be a good idea to remove this table from the parser and have the parser use the Unicode converter (mozilla/intl/uconv) instead.
From: Brendan Eich <email@example.com> Look at http://lxr.mozilla.org/mozilla/search?string=PA_HackTable -- PA_HackTable not only duplicates mozilla/intl's table, it is also duplicated in nsString.cpp and nsString2.cpp. The nsString definition and uses were the ones I was questioning: if NCRs and Windows page compatibility are handled by the parser, why should nsString have to do anything?
Please notify me when this is fixed, and I'll strip UCS2 stuff out of string.
Should the duplication of PA_HackTable in nsString.cpp (or nsString2.cpp, or both, whatever is in use) be a separate bug? It can't be firstname.lastname@example.org's bug because the solution there is not to switch nsString to use mozilla/intl's table -- it's to eliminate the use of any table. The CP 1252 hacking should be confined to htmlparser, and not done in a generic string class. /be
Status: NEW → RESOLVED
Last Resolved: 19 years ago
Resolution: --- → FIXED
Removed the hack table from nsString, and removed the ToUCS2() method as well. Note that the hack table must remain in the token class in order to map NCRs correctly. Also removed the call to ToUCS2() in the scanner.
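For readers unfamiliar with why the token class still needs a table: numeric character references in the range 128 to 159 are commonly authored as if they were CP 1252 bytes, so the parser has to remap them to the intended Unicode code points. The following is a minimal sketch of that idea, not the actual Mozilla code. The names kCp1252High, RemapCp1252, and ResolveNcr are hypothetical, and leaving the five unassigned CP 1252 positions (0x81, 0x8D, 0x8F, 0x90, 0x9D) unchanged is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the hack table: Unicode code points for
// CP 1252 values 0x80..0x9F. Unassigned positions map to themselves.
static const char16_t kCp1252High[32] = {
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178};

// Treats values 0x80..0x9F as CP 1252 and returns the Unicode code point;
// every other value is returned unchanged.
char32_t RemapCp1252(char32_t value) {
  if (value >= 0x80 && value <= 0x9F) {
    return kCp1252High[value - 0x80];
  }
  return value;
}

// Resolves a decimal ("&#128;") or hexadecimal ("&#x80;") NCR.
// Returns false if the reference is malformed or out of Unicode range.
bool ResolveNcr(const std::string& ncr, char32_t* out) {
  if (ncr.size() < 4 || ncr[0] != '&' || ncr[1] != '#' || ncr.back() != ';') {
    return false;
  }
  std::size_t pos = 2;
  unsigned base = 10;
  if (ncr[pos] == 'x' || ncr[pos] == 'X') {
    base = 16;
    ++pos;
  }
  char32_t value = 0;
  bool sawDigit = false;
  for (; pos + 1 < ncr.size(); ++pos) {
    const char c = ncr[pos];
    unsigned digit;
    if (c >= '0' && c <= '9') {
      digit = static_cast<unsigned>(c - '0');
    } else if (base == 16 && c >= 'a' && c <= 'f') {
      digit = static_cast<unsigned>(c - 'a') + 10;
    } else if (base == 16 && c >= 'A' && c <= 'F') {
      digit = static_cast<unsigned>(c - 'A') + 10;
    } else {
      return false;  // not a digit in the selected base
    }
    value = value * base + digit;
    if (value > 0x10FFFF) {
      return false;  // beyond the Unicode code space
    }
    sawDigit = true;
  }
  if (!sawDigit) {
    return false;
  }
  *out = RemapCp1252(value);
  return true;
}
```

With this sketch, ResolveNcr("&#128;", &cp) yields U+20AC (the euro sign), while a reference such as "&#65;" passes through unchanged as U+0041. Keeping the remapping in the NCR path matches the point made above: it is a parser concern, so a generic string class needs no table.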
Updated QA contact.
QA Contact: janc → bsharma
Verified on:
build: 2001-04-02-09-Mtrunk
platform: WinNT
Marking verified as per the developer comments above.
Status: RESOLVED → VERIFIED