&sometext is rendered into character entities

VERIFIED INVALID

Status

Core
Layout
VERIFIED INVALID
17 years ago
17 years ago

People

(Reporter: Henrik Gemal, Assigned: clayton)

Tracking

Trunk

Details

(Reporter)

Description

17 years ago
If you have the string &lang in your HTML, the &lang is rendered as something 
that looks like a big less-than sign.
Why?

to reproduce:
<html>&lang</html>

Comment 1

17 years ago
&lang; is a valid HTML 4 character entity reference: 

<!ENTITY lang     CDATA "&#9001;" -- left-pointing angle bracket = bra,
                                     U+2329 ISOtech -->
<!-- lang is NOT the same character as U+003C 'less than' 
     or U+2039 'single left-pointing angle quotation mark' -->
(Reporter)

Comment 2

17 years ago
I'm not referring to &lang; but just &lang
(Reporter)

Comment 3

17 years ago
This is a more global bug:
<html>&amp</html> is rendered into a &
<html>&oelig</html> is rendered into a œ
etc...
Summary: &lang is rendered to a big less than sign → &sometext is rendered into character entities

Comment 4

17 years ago
Well, "&lang" is invalid html, since that should be "&amp;lang".

Of course, invalid html is the norm, so for "&word" (where "word" is any word)
we get to guess whether the author meant "&word;" or "&amp;word", both mistakes
happen all the time and we're expected to deal with both in a reasonable
fashion. I strongly suspect "&amp" rendered as "&" is by design, not a bug.

I guess the mechanism for "&word" is to try it as an entity first, and only if
it can't find "word" in the list of known entities to render it as "&word".
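
The lookup guessed at above can be sketched roughly as follows. This is only an
illustration of the "try it as an entity first, otherwise leave it alone" idea,
not the actual layout/parser code: the function name expand_entities, the
regular expression, and the tiny KNOWN_ENTITIES table are all assumptions made
for the example (a real browser carries the full HTML 4 entity list).

```python
import re

# Hypothetical, deliberately tiny entity table used only for this sketch.
KNOWN_ENTITIES = {
    "amp": "&",
    "lt": "<",
    "gt": ">",
    "lang": "\u2329",   # left-pointing angle bracket
    "oelig": "\u0153",  # latin small ligature oe
}

# An ampersand, a name made of letters/digits, and an optional ";".
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*)(;?)")


def expand_entities(text):
    """Replace "&name" or "&name;" when name is known, else keep the text."""

    def replace(match):
        name = match.group(1)
        if name in KNOWN_ENTITIES:
            # Known entity: emit the character, with or without the ";".
            return KNOWN_ENTITIES[name]
        # Unknown name: render the original characters unchanged.
        return match.group(0)

    return ENTITY_RE.sub(replace, text)
```

With this sketch, "&lang" and "&lang;" both come out as the angle bracket,
"&amp" comes out as "&", and "&sometext" is left as typed. The name is matched
as the whole run of letters and digits, so "&langx" is not treated as "&lang"
followed by "x".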

Comment 5

17 years ago
Peter, I disagree: <sometag>&amp</sometag> *is* a valid HTML fragment. The 
semicolon is not required (except for XHTML documents).

'Note. In SGML, it is possible to eliminate the final ";" after a character 
reference in some cases (e.g., at a line break or immediately before a tag). In 
other circumstances it may not be eliminated (e.g., in the middle of a word). 
We strongly suggest using the ";" in all cases to avoid problems with user 
agents that require this character to be present.'

Comment 6

17 years ago
Richard: You're right. Thanks for correcting me on that point (learn something
new every day).

I suggest this report be marked invalid. I doubt we'll break support for valid
entity references to accommodate those cases where people wrote "&sometext" when
they meant "&amp;sometext".
(Reporter)

Comment 7

17 years ago
thanx for the insight...
Status: NEW → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → INVALID
yup
Status: RESOLVED → VERIFIED