Closed Bug 64190 Opened 24 years ago Closed 24 years ago

&sometext is rendered into character entities

Categories

(Core :: Layout, defect)

defect
Not set
normal

Tracking

()

VERIFIED INVALID

People

(Reporter: bugzilla, Assigned: clayton)

Details

if you have the string &lang in your html, the &lang is rendered as something 
that looks like a big less than sign.
why?

to reproduce:
<html>&lang</html>
&lang; is a valid HTML 4 character entity reference: 

<!ENTITY lang     CDATA "&#9001;" -- left-pointing angle bracket = bra,
                                     U+2329 ISOtech -->
<!-- lang is NOT the same character as U+003C 'less than' 
     or U+2039 'single left-pointing angle quotation mark' -->
I'm not referring to &lang; but just &lang
This is a more global bug:
<html>&amp</html> is rendered into a &
<html>&oelig</html> is rendered into a œ
etc...
Summary: &lang is rendered to a big less than sign → &sometext is rendered into character entities
Well, "&lang" is invalid html, since that should be "&amp;lang".

Of course, invalid html is the norm, so for "&word" (where "word" is any word)
we get to guess whether the author meant "&word;" or "&amp;word", both mistakes
happen all the time and we're expected to deal with both in a reasonable
fashion. I strongly suspect "&amp" rendered as "&" is by design, not a bug.

I guess the mechanism for "&word" is to try it as an entity first, and only if
it can't find "word" in the list of known entities to render it as "&word".
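
To make that guess concrete, here is a minimal sketch of the "try it as an entity first, else leave it alone" lookup. It is not Gecko's actual parser code: the function name expand_entities and the regular expression are hypothetical, and Python's html.entities.name2codepoint table stands in for the browser's list of known HTML 4 entities.

```python
import re
from html.entities import name2codepoint  # HTML 4.01 entity names -> code points

# Hypothetical pattern: "&", an entity-like name, then an optional ";".
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*)(;?)")


def expand_entities(text: str) -> str:
    """Replace "&name" or "&name;" when name is a known entity.

    Unknown names are left exactly as written, so "&sometext"
    stays "&sometext".
    """
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in name2codepoint:
            # Known entity: emit the character, with or without the ";".
            return chr(name2codepoint[name])
        # Unknown name: fall back to the literal source text.
        return match.group(0)

    return _ENTITY.sub(replace, text)
```

Under these assumptions "&lang" and "&amp" are expanded, "&sometext" is left untouched, and "&amp;lang" comes out as the literal text "&lang" because the replaced text is not scanned again. The sketch only matches whole names, so something like "&ampfoo" is left alone.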
Peter, I disagree: <sometag>&amp</sometag> *is* a valid HTML fragment. The 
semicolon is not required (except for XHTML documents).

'Note. In SGML, it is possible to eliminate the final ";" after a character 
reference in some cases (e.g., at a line break or immediately before a tag). In 
other circumstances it may not be eliminated (e.g., in the middle of a word). 
We strongly suggest using the ";" in all cases to avoid problems with user 
agents that require this character to be present.'

Richard: You're right. Thanks for correcting me on that point (learn something
new every day).

I suggest this report be marked invalid. I doubt we'll break support for valid
entity references to accommodate those cases where people wrote "&sometext" when
they meant "&amp;sometext".
thanx for the insight...
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → INVALID
yup
Status: RESOLVED → VERIFIED