Closed Bug 278404 Opened 20 years ago Closed 20 years ago

&prod causes ∏ to be displayed

Categories

Product: SeaMonkey
Component: General
Type: defect
Version: 1.7 Branch
Platform: x86, All
Priority: Not set
Severity: normal

Tracking

(Not tracked)

VERIFIED DUPLICATE of bug 155047

People

(Reporter: dan, Unassigned)

Details

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-GB; rv:1.7.6) Gecko/20050112
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-GB; rv:1.7.6) Gecko/20050112
When the term &prod is used in a dynamic URL, it displays the ∏ symbol. Surely it shouldn't do that unless there really is a semicolon in the URL?
Reproducible: Always
Your URL should use &amp;prod. It's true that Mozilla doesn't require the ; anymore, but unfortunately that was done for IE-compatibility.
Summary: &prod causes ∏ to be displayed → &prod causes ∏ to be displayed
I don't understand what you mean by IE-compatibility?? IE (v6) displays that page correctly!?
Version: unspecified → 1.7 Branch
Confirmed with Mozilla Suite 1.8a6 release build on Win-2K.
> Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8a6) Gecko/20050111
This is *NOT* a problem with a real URL in an anchor tag. The status bar displays the link as expected ("&prod=" is properly displayed as "&prod="). The problem in the test case (the URL: field of this bug) is with the text string "&prod" in the [URL-format-plain-text] part of the following HTML source:
> <a href="...">[URL-format-plain-text]</a>
I know that "&" in HTML text is recommended to be written as "&amp;". But I think the text string "&prod=" should not be interpreted as "&prod;=". Jo Hermans, do you know the number of the bug that introduced the "IE-compatibility" you mention?
Status: UNCONFIRMED → NEW
Ever confirmed: true
The bug also appears in standards compliance mode (where, I think, dirty workarounds that implement IE bugs, like document.all, should disappear). By the way, I see it on Linux too, so OS -> All?
Confirmed. I see this with Mozilla on Linux and on Solaris, so changing OS to "All".
OS: Windows 2000 → All
Invalid; see bug 155047 comment 2. Also see http://www.w3.org/TR/html4/charset.html#entities:
> "Note. In SGML, it is possible to eliminate the final ";" after a character reference in some cases (e.g., at a line break or immediately before a tag). In other circumstances it may not be eliminated (e.g., in the middle of a word). We strongly suggest using the ";" in all cases to avoid problems with user agents that require this character to be present."
The = sign is one of those cases.
*** This bug has been marked as a duplicate of 155047 ***
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → DUPLICATE
(In reply to comment #6)
> the = sign is one of those cases.
I agree that "=" can be one of the cases where the ";" of a character reference can be eliminated, if it occurs in data within a tag itself, such as attribute=value. But this bug's case is "[preceding_text]&[entity_name]=[trailing_text]" in the plain text data portion of the HTML source. I think the following passage from the "Note" in "5.3 Character references" of the HTML 4.01 specification should apply, although "=" can often act as a word separator in natural language:
> In other circumstances it may not be eliminated
> (e.g., in the middle of a word).
I believe this passage does not mean "; can be eliminated at the end of any word". And I believe the next passage means that ";" is normally required, except in special cases such as just before CR+LF or a tag start character:
> We strongly suggest using the ";" in all cases to avoid problems with user
> agents that require this character to be present."
I feel the current logic applies "; can be eliminated" too widely. Christian Biesinger, what do you think?
ok, HTML 4.01 normatively references SGML. since SGML is an ISO standard, it's unfortunately not free...
(In reply to comment #8)
> ok, HTML 4.01 normatively references SGML. since SGML is an ISO standard, it's
> unfortunately not free...
Is it caused by the use of SGML? I cannot believe it. Bug 155047 comment #0 says:
> It happens for: &amp;nbsp, &amp;pound, &amp;yen, &amp;deg, &amp;cent, &amp;#123
> but not for: &amp;plus, &amp;period, &amp;equals, &amp;dollar
If this is due to SGML or its definitions, I believe the same logic will be (should be) applied to both "&amp;yen" and "&amp;dollar". But it is not. Christian Biesinger, why ";" after "&amp;yen" can be eliminated even though ";" after "&amp;dollar" should not be eliminated?
Verified. SGML clearly spells out what "can be eliminated" means -- in brief, any character that's not a valid entity name character indicates the stop of the entity. Note that any page that doesn't escape the '&' is depending on the wholly buggy behavior of HTML browsers which show the entity name when they don't know the entity... a real SGML processor would simply treat the document as being in error at that point instead.
Status: RESOLVED → VERIFIED
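The termination rule from the comment above can be illustrated with a minimal Python sketch. It is not Mozilla's parser: the entity table is a tiny hypothetical subset, the function name `expand_entities` is made up for this example, and entity names are simplified to ASCII letters and digits. It only shows that the first character that cannot be part of an entity name (such as "=") ends the reference, and that unknown names are left as literal text.

```python
import re

# Hypothetical, tiny entity table for illustration; HTML 4 defines many more.
KNOWN_ENTITIES = {"prod": "\u220f", "yen": "\u00a5", "amp": "&", "nbsp": "\u00a0"}

# An entity name is a run of name characters; an optional ';' closes it.
# Restricting names to ASCII letters and digits is a simplification.
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);?")


def expand_entities(text: str) -> str:
    """Expand known entity references, with or without the trailing ';'."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        # Unknown names are left as literal text, mirroring the lenient
        # browser behaviour described in the comment above.
        return KNOWN_ENTITIES.get(name, match.group(0))

    return ENTITY_RE.sub(replace, text)


print(expand_entities("page.php?id=1&prod=42"))      # '=' ends the name "prod"
print(expand_entities("page.php?id=1&dollar=42"))    # unknown entity, unchanged
print(expand_entities("page.php?id=1&amp;prod=42"))  # escaped '&' stays literal
```

In this sketch "&product=42" is also left alone, because the whole run of name characters ("product") is looked up and it is not a known entity.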
> why ";" after "&amp;yen" can be eliminated even though ";" after "&amp;dollar"
> should not be eliminated?
Because HTML defines an entity named "yen", but not an entity named "dollar". "&dollar;" (with the ';') will also just show as plaintext. See the part about handling unknown entities in comment 10. And please test things before making claims about when ';' can be eliminated (that is, put the ';' in, and see what happens).
(In reply to comment #11)
Sorry for my bad question based on an undefined entity name. My concern is the case of "&" with a valid entity name followed by "=" without ";". I understand that using "&amp;" is always recommended. But I also think that accepting the omission of ";" should be based on SGML, since Mozilla uses SGML. http://www.isgmlug.org/sgmlhelp/g-sg17.htm says:
>Once an entity has been declared, it may be referenced anywhere within a document.
>This is done by supplying its name prefixed with the ampersand character and
>followed by the semicolon.
>The semicolon may be omitted if the entity reference is followed by a space
>or record end.
This document is not the exact SGML standard definition, but I think this description is the basic concept of entity references in an SGML document. (Sorry, but I still don't know where the official SGML standard definition is.) I think the most natural understanding of "record end" in HTML is a line break (end of line) and, additionally, the tag start character ("<" in HTML). I think the following passage in the HTML specification corresponds to "record end" in SGML:
> it is possible to eliminate the final ";" after a character reference in some
> cases (e.g., at a line break or immediately before a tag)
and the following corresponds to "space" in SGML:
> In other circumstances it may not be eliminated
> (e.g., in the middle of a word).
(In other words, "If not followed by a space, ';' is required".) "=" is apparently not a "space". Boris Zbarsky, is "=" in the text between <a> and </a> in HTML source a "record end" in SGML?
"record end", in this context, is what I said -- anything that's not a valid entity name character.
Oh, I see. I can now explain to any complaining users why "&" should be written as "&amp;" :-) Boris, thanks for teaching me about the SGML spec.
OK, wow! That certainly was a learning experience for me :-) I now need to go and raise a bug on the forum software I was using for not translating "&" into "&amp;" and go through all this again! Keep up the great work, people, and sorry for wasting time with a duplicate bug :-\
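For the forum-software side, a minimal sketch of the fix might look like the following. It assumes the software builds links in Python and uses the standard library's `html.escape`. The helper name `link_html` and the example URL are hypothetical and not taken from the software in question.

```python
from html import escape


def link_html(url: str) -> str:
    """Build an anchor whose href and visible text both carry the escaped URL."""
    # escape() turns '&' into '&amp;' (and also escapes <, >, and quotes),
    # so "&prod=" in the URL can no longer be read as the entity "prod".
    safe = escape(url, quote=True)
    return f'<a href="{safe}">{safe}</a>'


print(link_html("http://example.com/page.php?id=1&prod=42"))
```

Escaping both the attribute value and the link text means the browser never sees a bare "&" followed by an entity name.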