Closed
Bug 57047
Opened 24 years ago
Closed 24 years ago
We are missing <pre> when there are other tags inside the <pre>
Categories
(Core :: DOM: Serializers, defect, P3)
Tracking
()
VERIFIED
FIXED
mozilla0.9
People
(Reporter: bratell, Assigned: bratell)
Details
(Keywords: dataloss, Whiteboard: fix in hand)
Attachments
(1 file)
3.13 KB,
patch
|
Details | Diff | Splinter Review |
If you for instance cut'n'paste from a page looking like below we drop the line breaks. That is because we, in the plaintext serializer (nsHTMLToTXTSinkStream mk.2) only look for <pre> at the top of the stack and in this case the <font> is above the <pre>. I've no idea if this is legal HTML but it is used and we fail on it. I think it might be too expensive to look at the whole stack, but maybe look at the top 3-4? <pre> <font color="green">// The system has IP address from Private Address // area,only. Internet from the computer can be accessable // via Proxy. If user turn on proxy connection flag, the // function believe Internet accessable.</font> </pre>
Assignee | ||
Comment 1•24 years ago
|
||
Marking dataloss since we drop the text's new lines and sometimes those are really important.
Whiteboard: dataloss
Comment 2•24 years ago
|
||
Yes, I've often wondered when/if this might become a problem. Easy enough to look at the top few. I had previously written a routine that looked all the way up to the top of the stack (IsInPre), but someone else did a rewrite and got rid of it.
Status: NEW → ASSIGNED
Updated•24 years ago
|
Comment 3•24 years ago
|
||
Are these hard numbers (look only for the nenxt few) really necessary? <pre> can contain line tags, but not block tags. So, I propose to look (down) into the stack until you find a block-level tag. Just exactly this, if it is <pre> or not.
Comment 4•24 years ago
|
||
> line tags s/line/inline/ See <http://www.w3.org/TR/REC-html40/struct/text.html#edef-PRE>.
Assignee | ||
Comment 5•24 years ago
|
||
It might be much more expensive to check agains a list of inline tags than to just traverse the whole stack looking for <pre>. Easiest might be to, in OpenContainer, count the number of <pre>:s on the stack and keep the number in a varable which is decreased which <pre>:s are closed in CloseContainer. Then we only have to check if "m_pre_on_stack>0".
Assignee | ||
Comment 6•24 years ago
|
||
I've constructed a fix for this. It's to look for a pre in the stack until we reaches another block level tag. Since the check if a tag is block level or not is quite expensive it reduces performance by about 2%. Nothing that anyone will notice I guess. I've also rearranged the two if statements to make the IsPre the final condition since it's the most expensive. I'm looking for a review of this. Akkana? Ben?
Assignee: akkana → bratell
Status: ASSIGNED → NEW
Whiteboard: fix in hand
Target Milestone: Future → mozilla0.9
Assignee | ||
Comment 7•24 years ago
|
||
Comment 8•24 years ago
|
||
r=akkana. Looks good, thanks!
Comment 9•24 years ago
|
||
Wouldn't it be a bit cheaper to just maintain state in the serializer indicating whether a PRE element had been opened? In fact, shouldn't that be what mPreformatted indicates? That's not the case currently - mPreformatted is set based on checks for the existence of style properties related to preformatted block. It seems, though, that it could be subsumed to generally indicate that we are in a preformatted blocks. It might need to change from a boolean to an integer count for this.
Assignee | ||
Comment 10•24 years ago
|
||
I thought about that too, but that would be difficult to implement if there are for instance a table or something else non-pre inside a <pre>. My reasoning was that this was the way that would break least. If I'm wrong I can do it the other way. I'm sure you know those things better than me. For performance my measurements showed about 90 ms to make plain text from the url in this bug and this code added somewhere close to 2 ms to that.
Comment 11•24 years ago
|
||
The other reason not to subsume mPreformatted is that it could be toggled off prematurely: for instance, if the style says the whole document is formatted, or if the whole document is wrapped in a <pre>, then somewhere inside there is a <pre> tag (legal and not too unlikely in the first case, illegal in the second but could still happen), when the inner <pre> was closed we would prematurely toggle off mPreformatted. With Daniel's stack check, this can't happen.
Assignee | ||
Comment 12•24 years ago
|
||
I think that was why Vidur said we would have to switch from a bool to an integer (counter). Then we increase the count every time we see a <pre> and decrease when we see a </pre>. If it's preformatted from the beginning we start the count at 1 and it will never reach zero. But I still like my patch better unless someone has any more arguments against it.
Comment 13•24 years ago
|
||
Oh, sorry, I missed the integer counter part of the comment. Yes, that would work, and in fact, I think one of the many iterations of this code was implemented that way (as are list counters). It's really just a matter of taste. I tend to prefer Daniel's way because it's straightforward and a little more reliable (no worry about forgetting to add the increment or decrement code where it needs to be). And besides, we have his fix in hand, another advantage over the other solutions. :-)
Comment 14•24 years ago
|
||
I can live with Daniel's fix as well. sr=vidur.
Assignee | ||
Comment 15•24 years ago
|
||
Fix checked in.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Comment 16•24 years ago
|
||
Please verify and mark VERIFIED-FIXED...thanks!
Assignee | ||
Comment 18•24 years ago
|
||
Just FYI, this bug also exists in MSIE 5.5. Bad Microsoft. :-)
Comment 19•7 years ago
|
||
1. the font tag requires a close tag and 2. should be interpreted as a font change inside pre because they are HTML tags (albeit old). if < and > are desired to be seen, they should be encoded with < and > some groups do not understand this yet and new beginners to HTML find this a difficulty. it should be in the manual on the pre element.
You need to log in
before you can comment on or make changes to this bug.
Description
•