Uppercase elements are being treated as XHTML elements in XML documents




18 years ago
17 years ago


(Reporter: chrispetersen, Assigned: pierre)


(Keywords: css1, testcase, xhtml)

Firefox Tracking Flags

(Not tracked)


(Whiteboard: (py8ieh: evil tests needed))


(4 attachments)



18 years ago
Platforms: All
Expected Results: An XML parser error should be displayed because the document is not

What I got: Document's content is displayed.

Steps to reproduce:

1) Open the XML test case.

2) The test case contains elements that are in uppercase instead of lowercase.

3) The document's content is rendered.

4) According to the XHTML 1.0 spec, XHTML documents must use lower case for all
HTML element and attribute names. Our XML parser should trap for this.

Comment 1

18 years ago
Created attachment 22299 [details]
An XHTML document that contains uppercase element names.

Comment 2

18 years ago
Created attachment 22301 [details]
A revised version of the original XML file (please use this one)
Please mark xhtml bugs with the xhtml keyword.

Also, could you make a regular XML file that has just a couple of elements from
the XHTML namespace to see if this problem appears in that case as well?

I suspect that in this case we might actually be going through the HTML parser
in which case we would not care about upper/lower case... It depends at least on
the file suffix (local file) and mime type (http). I am not completely sure
about .xml (or text/xml mime type) document containing only XHTML tags...
Keywords: xhtml
We are not a validating parser, so we should not spit an error on this -- the
file is well formed, just invalid. This is why we are not doing anything with
the attribute or the <TITLE> element -- they are not valid HTML, so we treat
them like generic, non-magical tags.

Whiteboard: INVALID?
Wrong, the file is not well formed. For example "<p>Blah</P>" is missing an end
tag "p" and is also missing a start tag "P" (notice the case).
Whiteboard: INVALID?
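
The well-formedness point above can be checked with any non-validating XML parser. A minimal sketch, assuming Python's standard xml.etree.ElementTree (which wraps expat) as a stand-in for the parser discussed here; the helper name is_well_formed is made up for illustration:

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the string parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Raised for any well-formedness error, including a start tag and
        # end tag whose names differ only in case.
        return False


print(is_well_formed("<p>Blah</p>"))  # True
print(is_well_formed("<p>Blah</P>"))  # False: "p" and "P" are different names
print(is_well_formed("<P>Blah</P>"))  # True: uppercase is well formed, merely not valid XHTML
```

The last case is the one this bug is about: consistent uppercase tags pass a well-formedness check, so only validation or an explicit case check would catch them.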

Comment 6

18 years ago
Created attachment 22371 [details]
An XML doc that uses the HTML namespace.

Comment 7

18 years ago
Ok, the new XML test case uses the HTML namespace in its root element. The document
contains six HTML elements (3 uppercase names, 3 lowercase names). The three
lowercase-named elements (P, H1, H3) are rendered correctly. The uppercase
elements are rendered as inline text.
Heikki: None of the test cases have mismatched case start and end tags as far
as I can tell. On which line of which attachment did you see that?

ChrisP: The first testcase doesn't work because no namespace is given (since,
as you obviously noticed and then corrected, the xmlns attribute is misspelt).

The second testcase is showing an error that I had missed before -- somehow,
one or both of the HEAD and TITLE elements are being wrongly recognised as 
XHTML elements and thus hidden. This is wrong, since XHTML is case-sensitive,
and so upper case tags should not be recognised as XHTML elements.

This third testcase is showing correct behaviour, since the "html" and "HTML"
namespace prefixes are not the same, and so should not (and do not) both
resolve to the same thing, resulting in the HTML:P, HTML:H1 and HTML:H3 elements
being treated as generic XML elements. (BTW, the namespace you used in the third
testcase is deprecated, you should use the one you used in the other two files.)

Therefore, I am retitling this bug to address the issue given in the second 
testcase -- why is the Style System matching the elements in the HTML namespace
in an XML document case insensitively?
Assignee: heikki → pierre
Component: XML → Style System
Keywords: css1
Summary: Uppercase elements and attributes are not being detected by XML parser → Uppercase elements are being treated as XHTML elements in XML documents
Am I missing something here? In the second testcase the head, title, body and p
elements have start and end tags in different case.

Also, I don't think this is a style system bug but rather the case that XHTML
documents are treated differently based on the file suffix/mime type - in some
cases they go through the HTML parser in which case it is treated as normal HTML
(so case does not matter for example) and in other cases the document goes
through the XML parser which has incomplete support for XHTML namespace (for
example title and style tags don't work).

suffix or mime type    parser
.htm*                  html
.xml                   xml
.xhtml                 ?
text/html              html
text/xml               xml

and so on...
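
The table above can be read as a small dispatch function. This is only a sketch of the behaviour as described in this comment, not the actual Mozilla code path; choose_parser and both lookup tables are hypothetical names, and the .xhtml entry is left as None because the comment leaves it open:

```python
import os
from typing import Optional

# Entries copied from the table above. None marks the case the comment
# leaves as "?"; it is not a claim about real behaviour.
PARSER_BY_MIME = {"text/html": "html", "text/xml": "xml"}
PARSER_BY_SUFFIX = {".xml": "xml", ".xhtml": None}


def choose_parser(filename: Optional[str] = None,
                  mime_type: Optional[str] = None) -> Optional[str]:
    """Pick a parser from the MIME type (http) or the file suffix (local file)."""
    if mime_type is not None:
        return PARSER_BY_MIME.get(mime_type.lower())
    if filename is not None:
        suffix = os.path.splitext(filename)[1].lower()
        if suffix.startswith(".htm"):  # the ".htm*" row of the table
            return "html"
        return PARSER_BY_SUFFIX.get(suffix)
    return None


print(choose_parser(mime_type="text/xml"))   # xml
print(choose_parser(filename="test.html"))   # html
print(choose_parser(filename="test.xhtml"))  # None (undecided in the table)
```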
Heikki: Where do you see that they are different case??? They look like the 
same case to me:

 <?xml version="1.0"?>
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1- strict
 <html xmlns="http://www.w3.org/1999/xhtml">
 <TITLE>Elements and attributes are uppercase</TITLE>
 <P ALIGN="CENTER">This text uses the align attribute with a value of center</P>

  -- http://bugzilla.mozilla.org/showattachment.cgi?attach_id=22301

Also, yes, the MIME type is important; but in this case the MIME type is 
text/xml so it is definitely the XML parser that is being used. Or at least it
should be.
AARGH! Ignore my comments and shoot me please. I was looking at the attachment
in NS6, and View > Page Source really showed the starting tags of the
aforementioned elements in lower case. When I looked at it in latest Mozilla, or
saved the file in NS 6 I actually saw that the file was as you described.

Ok, so now that I know the testcase is correct... I believe this is an XML bug
still. Hmm... or maybe there are bugs both in XML and the style system...
Specifically in XML, in nsXMLContentSink::OpenContainer() (maybe elsewhere as
well, like attributes) we check for HTML namespace and if so, create an HTML
element with NS_CreateHTMLElement. So we do not check for the element case, and
NS_CreateHTMLElement is case insensitive because it has to be for normal HTML.

So the fix on the XML side looks simple: just check that the
element name (attributes?) does not contain uppercase characters used in the
XHTML DTD (A-Z, for simplicity?).
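
A rough sketch of that check, written in Python rather than the C++ of the real content sink. The function names are hypothetical and this is not the patch itself; it only illustrates the proposed rule that an element in the XHTML namespace gets an HTML element only when its name contains no A-Z characters:

```python
XHTML_NAMESPACE = "http://www.w3.org/1999/xhtml"


def has_ascii_uppercase(name: str) -> bool:
    """True if the name contains any of the characters A-Z."""
    return any("A" <= ch <= "Z" for ch in name)


def element_kind(namespace_uri: str, local_name: str) -> str:
    """Decide whether to create an HTML element or a generic XML element.

    Assumption: a name with uppercase letters can never be an XHTML
    element name, so it always falls through to the generic path.
    """
    if namespace_uri == XHTML_NAMESPACE and not has_ascii_uppercase(local_name):
        return "html"
    return "generic"


print(element_kind(XHTML_NAMESPACE, "title"))  # html
print(element_kind(XHTML_NAMESPACE, "TITLE"))  # generic
print(element_kind("urn:example", "title"))    # generic
```

The sketch does not check that the lowercase name is a real XHTML element; it only covers the case test discussed here.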

The fix in the style system side is probably needed as well, although it is not
as important. Even though the fix in the content sink would fix normal paths, it
would still be possible to create XHTML elements in wrong case using the DOM and
XPCOM. I don't know how the style system works, but if it looks at content
objects it can get a pointer to the content's document (sometimes null so this
is not foolproof?), and if it is not HTML the element case should matter.
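
The style-system side could look roughly like the following. This is an illustration only: the Element class and type_selector_matches are hypothetical, and treating a missing document as "not HTML" (and therefore case sensitive) is an assumption, since the comment above notes that the null case is not foolproof:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    tag_name: str
    # True for an HTML document, False for an XML document, None when the
    # element has no document (the "sometimes null" case mentioned above).
    owner_is_html: Optional[bool]


def type_selector_matches(selector_name: str, element: Element) -> bool:
    """Match a CSS type selector against an element's tag name."""
    if element.owner_is_html:
        # Normal HTML: tag names are matched case insensitively.
        return selector_name.lower() == element.tag_name.lower()
    # XML document, or no document at all (assumption): case matters.
    return selector_name == element.tag_name


print(type_selector_matches("p", Element("P", owner_is_html=True)))   # True
print(type_selector_matches("p", Element("P", owner_is_html=False)))  # False
print(type_selector_matches("p", Element("p", owner_is_html=None)))   # True
```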
Whiteboard: (py8ieh: evil tests needed)
Nominating for nsbeta1 because of standards compliance, this could introduce bad
habits for web developers.

By the way, bug 29171 is related if not dupe (see my description there of the
2-3 bugs these bug reports describe).
Keywords: nsbeta1

Comment 13

18 years ago
Created attachment 25041 [details]
A upper case P element with a lowercase attribute

Comment 14

18 years ago

How should empty elements (BR, HR, IMG) or elements like object, applet or form
be treated in this case? Just not be rendered?
Upper case empty HTML elements should be treated the same as any other empty XML
element that has no formatting associated with it - I think it does not affect
the formatting in any way.

By the way, I have the fix for this on the XML side in the fix to the STYLE

Comment 16

18 years ago
Two additional test cases with uppercase element examples:



Comment 17

18 years ago
Reassigned to heikki who's got a fix.
Assignee: pierre → heikki
No, this is about the fix to the style system.

There is a bug on Nisheeth's list to fix things on the XML side (bug 29171). I
suspect that once that bug is fixed this bug is hidden (so with luck we might
never see this again).

I am giving this back to pierre, but moving it to Future.
Assignee: heikki → pierre
Keywords: nsbeta1 → nsbeta1-
Target Milestone: --- → Future

Comment 19

18 years ago
Works great in the 6/07 branch build. Uppercase elements are no longer being
rendered as HTML elements.
WFM, yesterday's Linux CVS build.
Keywords: testcase
Please note that this bug will be hidden because bug 29171 was fixed. This is
not fixed, AFAIK. But as long as this does not cause problems this can be
futured, or if you really want to remove this from people's radar, mark it
worksforme, and if it some day raises its head, try to remember to reopen this one
rather than open a new one.


Comment 22

18 years ago
For the record, this bug is under Style System because of the question raised by 
Ian on [ 2001-01-13 04:09] which is... Why is the Style System matching the 
elements in the HTML namespace in an XML document case insensitively?
Priority: -- → P4
> Why is the Style System matching the 
> elements in the HTML namespace in an XML document case insensitively?

Presumably because the old content model code was changing the case of those
elements.  I doubt there was ever a style system bug here.
See previous comment.

*** This bug has been marked as a duplicate of 29171 ***
Last Resolved: 17 years ago
Resolution: --- → DUPLICATE