Closed
Bug 40350
Opened 25 years ago
Closed 23 years ago
nsPlainTextSerializer not work if input is NCR
Categories
(Core :: DOM: HTML Parser, defect, P3)
RESOLVED
FIXED
mozilla1.1beta
People
(Reporter: nhottanscp, Assigned: nhottanscp)
Details
(Keywords: helpwanted)
Attachments
(2 files, 2 obsolete files)
53 bytes, text/html
777 bytes, patch (nhottanscp: review+, nhottanscp: superreview+, asa: approval+)
1) Create an HTML file that contains NCRs for "AÁ" (for example, &#65;&#193;).
2) Open it in the browser; it shows "AÁ" (capital 'A' and AACUTE).
3) In the Messenger account setup, use this HTML file as the signature file and
uncheck the option to compose HTML mail.
4) Create a new message. A plain text mail window opens and the signature turns
into square boxes (on WinNT).
In the code, Messenger uses nsIHTMLToTXTSinkStream to convert from HTML to plain
text. When the input HTML contains NCRs, they turn into 0xFFFF after the
conversion.
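A note on the 0xFFFF symptom, offered as an assumption rather than a confirmed diagnosis: if the entity lookup signals failure with -1 (comment 25 mentions |entity == -1|) and that value is then stored in a 16-bit character, the result is exactly 0xFFFF. The sketch below is standalone C++, not Mozilla code, and NarrowToCodeUnit is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical illustration only: narrow a lookup result to a 16-bit code
// unit, the way a serializer might if it did not check for failure first.
// ASSUMPTION: failure is reported as -1; this is not taken from the patch.
char16_t NarrowToCodeUnit(std::int32_t lookupResult) {
  // Conversion to an unsigned 16-bit type is modulo 2^16, so -1 wraps
  // around to 0xFFFF, while a valid value such as 193 is preserved.
  return static_cast<char16_t>(lookupResult);
}
```

If this is what happens, checking the lookup result before emitting the character would be enough to avoid the bogus 0xFFFF output.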
Comment 1 • 25 years ago (Assignee)
This bug was separated from 34373.
Comment 2 • 25 years ago
This is probably related to bug 13401: nsHTMLToTXTSinkStream needs to be
converted to use nsISaveAsCharset, just as nsHTMLContentSinkStream
already was.
Marking M19 (13401 is M20 because no one seemed to care about it), but if this
is important, let me know and I'll do it sooner. It shouldn't be difficult.
Status: NEW → ASSIGNED
Target Milestone: --- → M19
Comment 3 • 25 years ago (Assignee)
nsISaveAsCharset generates NCRs, but this bug is about interpreting NCRs, so I
am not sure nsISaveAsCharset should be used.
Comment 5 • 25 years ago
Per beppe, marking future. Is this blocking anyone?
Target Milestone: M19 → Future
Comment 6 • 25 years ago (Assignee)
This was a problem because the HTML composer generated NCRs for Japanese HTML.
I think that no longer happens (text is now saved in the target charset instead
of as NCRs).
Comment 8 • 25 years ago
Anthonyd is taking over Output. I'm not sure if this is still a problem or not
-- Naoki, can you comment?
Comment 9 • 25 years ago (Assignee)
This is still reproducible with win32 branch build 2000-10-24-09-MN6.
Comment 10 • 25 years ago (Assignee)
Comment 11 • 25 years ago
The reassign apparently didn't work -- trying again.
Assignee: akkana → anthonyd
Status: ASSIGNED → NEW
Comment 14 • 23 years ago
removing myself from the cc list
Comment 15 • 23 years ago (Assignee)
Updated • 23 years ago (Assignee)
Status: NEW → ASSIGNED
Summary: nsIHTMLToTXTSinkStream not work if input is NCR → nsPlainTextSerializer not work if input is NCR
Target Milestone: Future → ---
Comment 17 • 23 years ago (Assignee)
akkana, could you review the patch?
Comment 18 • 23 years ago
Comment on attachment 91430 [details] [diff] [review]
Changed nsPlainTextSerializer to try NCR if CER conversion fails.
Looks reasonable, r=akkana.
Does str have the '#' in it, and if so, would it be safer to skip over that
character if ToInteger isn't guaranteed to? But if it works this way, it can
only make things better.
Attachment #91430 - Flags: review+
Comment 19 • 23 years ago (Assignee)
ToInteger() skips '#'; I actually copied the code from the parser.
http://lxr.mozilla.org/seamonkey/source/htmlparser/src/nsHTMLTokens.cpp#2183
Comment 20 • 23 years ago
Comment on attachment 91430 [details] [diff] [review]
Changed nsPlainTextSerializer to try NCR if CER conversion fails.
So I'm a little curious: if an entity is not in our conversion table, say
&foobar;, we will try to convert it to an int. Will ToInteger() return
zero in that case? Is that ok?
Comment 21 • 23 years ago (Assignee)
I think it's better to check if the first character is '#'.
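For illustration, a minimal sketch of the guarded fallback discussed here, in standalone C++ rather than the actual patch. The names ResolveEntity, NamedEntities and ParseWithoutGuard are hypothetical, the two-entry table stands in for the real conversion table, std::strtol stands in for ToInteger(), and only decimal NCRs are handled.

```cpp
#include <cctype>
#include <cstdlib>
#include <map>
#include <string>

// Tiny stand-in for the real entity table (illustration only).
static const std::map<std::string, long>& NamedEntities() {
  static const std::map<std::string, long> table = {
      {"amp", 0x26}, {"Aacute", 0xC1}};
  return table;
}

// What an unguarded numeric parse does with a non-numeric name:
// std::strtol performs no conversion and returns 0.
long ParseWithoutGuard(const std::string& name) {
  return std::strtol(name.c_str(), nullptr, 10);
}

// Try the named entity (CER) first; fall back to a numeric reference (NCR)
// only when the text starts with '#'. Returns -1 when nothing matches.
long ResolveEntity(const std::string& name) {
  auto it = NamedEntities().find(name);
  if (it != NamedEntities().end()) return it->second;

  // The guard: "&foobar;" is rejected here instead of being parsed as 0.
  if (name.size() < 2 || name[0] != '#') return -1;
  if (!std::isdigit(static_cast<unsigned char>(name[1]))) return -1;

  char* end = nullptr;
  long value = std::strtol(name.c_str() + 1, &end, 10);
  // Reject trailing garbage and values beyond the Unicode range.
  if (*end != '\0' || value > 0x10FFFF) return -1;
  return value;
}
```

With the guard in place, ResolveEntity("foobar") reports failure, whereas the unguarded std::strtol parse of the same string yields 0.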
Comment 22 • 23 years ago (Assignee)
Attachment #91430 - Attachment is obsolete: true
Comment 23 • 23 years ago (Assignee)
akkana, could you review the new patch?
Comment 24 • 23 years ago
Comment on attachment 91838 [details] [diff] [review]
Check if the first character is '#'.
I have a vague and possibly incorrect memory that it's better (faster/more
efficient) to use str.First() rather than str[0]. Can you check with jag or
scc on that? Or maybe Kin knows. Otherwise r=akkana.
Attachment #91838 - Flags: review+
Comment 25 • 23 years ago
Comment on attachment 91838 [details] [diff] [review]
Check if the first character is '#'.
sr=kin@netscape.com
With the |str[0]| to |str.First()| change as akkana and scc suggest.
As a side note, I'm wondering if we should just write out the entity string in
the case where |entity == -1 && str.First() != '#'|? Not required for this bug,
but just a thought.
Attachment #91838 - Flags: superreview+
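The side note above could look roughly like the following sketch: when an entity cannot be resolved, its text is copied to the output unchanged. This is standalone C++ under stated assumptions, not the serializer's code: ExpandEntities and ResolveNumeric are hypothetical names, the input is assumed to be ASCII, and only decimal NCRs are resolved.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for the entity lookup: resolves decimal NCRs only
// and returns -1 for anything else (including named entities).
static long ResolveNumeric(const std::string& name) {
  if (name.size() < 2 || name[0] != '#') return -1;
  if (!std::isdigit(static_cast<unsigned char>(name[1]))) return -1;
  char* end = nullptr;
  long value = std::strtol(name.c_str() + 1, &end, 10);
  return (*end == '\0' && value <= 0x10FFFF) ? value : -1;
}

// Replace resolvable "&...;" references with their code points and copy
// everything else, including unresolved entities, through verbatim.
std::u32string ExpandEntities(const std::string& text) {
  std::u32string out;
  std::size_t i = 0;
  while (i < text.size()) {
    if (text[i] == '&') {
      std::size_t semi = text.find(';', i + 1);
      if (semi != std::string::npos) {
        long cp = ResolveNumeric(text.substr(i + 1, semi - i - 1));
        if (cp != -1) {
          out.push_back(static_cast<char32_t>(cp));
          i = semi + 1;  // skip past the whole reference
          continue;
        }
      }
    }
    // Not a resolvable reference: keep the character as written.
    out.push_back(static_cast<char32_t>(static_cast<unsigned char>(text[i])));
    ++i;
  }
  return out;
}
```

Under these assumptions, an input such as "&foobar;" comes out exactly as it went in, rather than being dropped or turned into a bogus character.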
Comment 26 • 23 years ago (Assignee)
Updated • 23 years ago (Assignee)
Attachment #91838 - Attachment is obsolete: true
Comment 27 • 23 years ago (Assignee)
Comment on attachment 91886 [details] [diff] [review]
Change str[0] to str.First().
copy r/sr
Attachment #91886 - Flags: superreview+
Attachment #91886 - Flags: review+
Comment 28 • 23 years ago
Comment on attachment 91886 [details] [diff] [review]
Change str[0] to str.First().
a=asa (on behalf of drivers) for checkin to 1.1
Attachment #91886 - Flags: approval+
Comment 29 • 23 years ago (Assignee)
checked in to the trunk
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Updated • 23 years ago (Assignee)
Target Milestone: --- → mozilla1.1beta