Closed Bug 40350 Opened 25 years ago Closed 23 years ago

nsPlainTextSerializer not work if input is NCR

Categories

(Core :: DOM: HTML Parser, defect, P3)

x86
Windows NT
defect

RESOLVED FIXED
mozilla1.1beta

People

(Reporter: nhottanscp, Assigned: nhottanscp)

Details

(Keywords: helpwanted)

Attachments

(2 files, 2 obsolete files)

1) Create an HTML file that contains NCRs (numeric character references) for "AÁ".
2) Open the file in the browser; it shows "AÁ" (capital 'A' and AACUTE).
3) In the Messenger account setup, use this HTML file as the signature file and uncheck the option to compose HTML mail.
4) Create a new message. A plain text mail window opens and the signature is shown as square boxes (on WinNT).
In the code, Messenger uses nsIHTMLToTXTSinkStream to convert HTML to plain text. When the input HTML contains an NCR, the NCR is turned into 0xFFFF by the conversion.
This bug was separated from 34373.
This is probably related to bug 13401: nsHTMLToTXTSinkStream needs to be converted to use nsISaveAsCharset, just as nsHTMLContentSinkStream already was. Marking M19 (13401 is M20 because no one seemed to care about it), but if this is important, let me know and I'll do it sooner. It shouldn't be difficult.
Status: NEW → ASSIGNED
Target Milestone: --- → M19
nsISaveAsCharset generates NCRs, but this bug is about interpreting NCRs, so I am not sure nsISaveAsCharset should be used.
*** Bug 13401 has been marked as a duplicate of this bug. ***
Per beppe, marking future. Is this blocking anyone?
Target Milestone: M19 → Future
This was a problem because the HTML composer generated NCRs for Japanese HTML. I think that no longer happens (text is now saved in the target charset instead of as NCRs).
adding help wanted keyword
Keywords: helpwanted
Anthonyd is taking over Output. I'm not sure if this is still a problem or not -- Naoki, can you comment?
This is still reproducible with win32 branch build 2000-10-24-09-MN6.
The reassign apparently didn't work -- trying again.
Assignee: akkana → anthonyd
Status: ASSIGNED → NEW
updated qa contact.
QA Contact: janc → bsharma
--> kin
Assignee: anthonyd → kin
QA Contact: bsharma → moied
Blocks: 115643
removing myself from the cc list
kin, could you review the patch?
Assignee: kin → nhotta
Status: NEW → ASSIGNED
Summary: nsIHTMLToTXTSinkStream not work if input is NCR → nsPlainTextSerializer not work if input is NCR
Target Milestone: Future → ---
akkana, could you review the patch?
Comment on attachment 91430: Changed nsPlainTextSerializer to try NCR if CER conversion fails.
Looks reasonable, r=akkana. Does str have the '#' in it, and if so, would it be safer to skip over that character if ToInteger isn't guaranteed to? But if it works this way, it can only make things better.
Attachment #91430 - Flags: review+
ToInteger() skips '#'; I actually copied the code from the parser. http://lxr.mozilla.org/seamonkey/source/htmlparser/src/nsHTMLTokens.cpp#2183
Comment on attachment 91430: Changed nsPlainTextSerializer to try NCR if CER conversion fails.
So I'm a little curious: if an entity is not in our conversion table, say &foobar;, we will try to convert it to an int. Will ToInteger() return zero in that case? Is that OK?
I think it's better to check if the first character is '#'.
Attachment #91430 - Attachment is obsolete: true
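The approach discussed in the comments above (try the named-entity table first, and fall back to numeric parsing only when the string starts with '#') can be sketched roughly as follows. This is a minimal standalone illustration, not the checked-in patch: `lookupNamedEntity`, `parseNcr`, `entityToCodePoint`, and the tiny entity table are hypothetical names invented for the example, and it uses `std::string` rather than Mozilla's string classes and `ToInteger()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the named-entity (CER) table.
// Returns -1 when the name is unknown.
inline std::int32_t lookupNamedEntity(const std::string& name) {
    static const std::unordered_map<std::string, std::int32_t> kTable = {
        {"amp", 0x26}, {"lt", 0x3C}, {"gt", 0x3E}, {"Aacute", 0xC1},
    };
    auto it = kTable.find(name);
    return it == kTable.end() ? -1 : it->second;
}

// Parses an NCR body such as "#193" or "#xC1".
// Returns -1 for malformed input or values above U+10FFFF.
inline std::int32_t parseNcr(const std::string& body) {
    if (body.size() < 2 || body[0] != '#') return -1;
    int base = 10;
    std::size_t pos = 1;
    if (body[1] == 'x' || body[1] == 'X') { base = 16; pos = 2; }
    if (pos >= body.size()) return -1;  // "#x" with no digits
    std::int32_t value = 0;
    for (; pos < body.size(); ++pos) {
        const char c = body[pos];
        int digit;
        if (c >= '0' && c <= '9') digit = c - '0';
        else if (base == 16 && c >= 'a' && c <= 'f') digit = c - 'a' + 10;
        else if (base == 16 && c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        else return -1;
        value = value * base + digit;
        if (value > 0x10FFFF) return -1;
    }
    return value;
}

// Try the CER table first; only attempt NCR parsing when the first
// character is '#', so an unknown name like "foobar" stays -1.
inline std::int32_t entityToCodePoint(const std::string& str) {
    std::int32_t entity = lookupNamedEntity(str);
    if (entity == -1 && !str.empty() && str.front() == '#') {
        entity = parseNcr(str);
    }
    return entity;
}
```

With the '#' check in place, an unknown named entity such as `foobar` is never passed to the numeric parser, which avoids the question of what an integer conversion would return for it.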
akkana, could you review the new patch?
Comment on attachment 91838: Check if the first character is '#'.
I have a vague and possibly incorrect memory that it's better (faster/more efficient) to use str.First() rather than str[0]. Can you check with jag or scc on that? Or maybe Kin knows. Otherwise r=akkana.
Attachment #91838 - Flags: review+
Comment on attachment 91838: Check if the first character is '#'.
sr=kin@netscape.com with the |str[0]| to |str.First()| change, as akkana and scc suggest. As a side note, I'm wondering if we should just write out the entity string in the case where |entity == -1 && str.First() != '#'|? Not required for this bug, but just a thought.
Attachment #91838 - Flags: superreview+
Attachment #91838 - Attachment is obsolete: true
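The side note above (writing the entity string out unchanged when it is neither a known name nor an NCR) was not part of this fix, but the idea might look something like the sketch below. All names here (`toUtf8`, `renderEntity`) are hypothetical, the output is assumed to be UTF-8, and dropping a malformed NCR is an arbitrary choice made only to keep the example short.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Encodes one code point as UTF-8 (assumes 0 <= cp <= 0x10FFFF).
inline std::string toUtf8(std::int32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// `str` is the entity text without '&' and ';'; `entity` is the code point
// already resolved by the caller, or -1 if resolution failed.
inline std::string renderEntity(const std::string& str, std::int32_t entity) {
    // Unknown named entity: write the original text back out verbatim.
    if (entity == -1 && (str.empty() || str.front() != '#')) {
        return "&" + str + ";";
    }
    // Malformed NCR: dropped here (an assumption, not the patch's behavior).
    if (entity < 0) return "";
    return toUtf8(entity);
}
```

For example, `renderEntity("foobar", -1)` yields `&foobar;`, so an unrecognized entity survives in the plain text output instead of turning into an unrelated character.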
Comment on attachment 91886: Change str[0] to str.First().
Copying r/sr.
Attachment #91886 - Flags: superreview+
Attachment #91886 - Flags: review+
Comment on attachment 91886: Change str[0] to str.First().
a=asa (on behalf of drivers) for checkin to 1.1
Attachment #91886 - Flags: approval+
checked in to the trunk
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.1beta