Closed Bug 286132 Opened 20 years ago Closed 19 years ago

xml-stylesheet PI doesn't handle href attribute as in specification

Categories

(Core :: XSLT, defect)

1.0 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: blum, Assigned: peterv)

References

Details

Attachments

(5 files, 6 obsolete files)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0 Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0 When processing the xml-stylesheet processing instruction, the href attribute should be treated as a complete URI as specified by W3C http://www.w3.org/TR/xml-stylesheet/ Firefox is not processing it as full URI, because querystring parameters are not being received by the stylesheet Problem : I can't generate a dynamic stylesheet Example: <?xml version="1.0" encoding="iso-8859-1"?> <?xml-stylesheet type="text/xsl" href="dynamic_stylesheet.asp?key=value"?> dynamic_stylesheet.asp just don't receive the parameters. Reproducible: Always Steps to Reproduce: <?xml version="1.0" encoding="iso-8859-1"?> <?xml-stylesheet type="text/xsl" href="dynamic_stylesheet.asp?key=value"?> dynamic_stylesheet.asp just don't receive the parameters. Actual Results: Since I can't generate dynamic stylesheets, I can't create a full xml+xsl driven site without using sessions states, because I rely on parameter-passing throught querystring. Application scenario is : User selects option in listbox. Value is submited to dynamic XML file. It then returns a specially made <?xml-stylesheet href="dynamic_sheet?listbox=value">. Firefox would then access the xsl generator which would receive parameters and draw form fields accordinging, e.g., the listbox 'selected' attribute would be correct. With this incorrect behavior, I have no option but to use MSIE OR put my query parameters in the XML, which I don't want Expected Results: It should have called the stylesheet passing all parameters.
Assignee: firefox → peterv
Component: General → XSLT
OS: Windows XP → All
Product: Firefox → Core
QA Contact: general → keith
Hardware: PC → All
Version: unspecified → 1.0 Branch
I do receive parameters as of 1.04, but &amp; does not get translated to &. Depending on the backend application this might get handled properly or not (I'm using always Apache and PHP, but different versions). Here is the example page: http://my.sitebar.org/news.php You will find a reference to dynamically generated stylesheet there: <?xml-stylesheet href="xsl.php?file=xbel2news&amp;skin=Modern" type="text/xsl"?> When using & instead of &amp; it works. But it does not work in IE. If I use &amp; then FF will ask for it. In the Apache log file I will always see a request with &amp;, however, I should see a request with single & to get properly handled by the CGI program.
(In reply to comment #1) > I do receive parameters as of 1.04, but &amp; does not get translated to &. Yes, you are right. I've designed a test suit myself, and you are dead right. FF request the url with &amp; on the query string. Problem is that ASP 3.0 + IIS does NOT understand &amp; it I don't know if this is the correct behavior. Right now I haven't looked at HTTP specifications but my feeling is that it SHOULD be translated... I'll attach an archive with my source files, the IIS log and the picture of the whole thing
Attached file bug in FF1.04 + IIS 5
the screenshot shows that the application fails to parse the querystring parameters when FF uses &amp; (instead of translating them back to &) before doing the request.
Please try a more recent build.
(In reply to comment #4) > Please try a more recent build. So this should be fixed by the expat landing? Or are you just wondering if it would be fixed?
It might have been fixed, so I'm wondering.
Reproducible with: Deer Park Alpha 1 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b2) Gecko/20050531 Firefox/1.0+
http://www.w3.org/TR/REC-xml/#IDAP51S : "Parameter entity references MUST NOT be recognized within processing instructions."
Status: UNCONFIRMED → RESOLVED
Closed: 20 years ago
Resolution: --- → INVALID
I humbly think you misinterpreted the document, mainly because it might have been uncarefully written. Look: "Parameter entity references MUST NOT be recognized ..." Does the document mean "Parameter-entity reference"? or "Parameter entity-reference" ? These have different meanings that CRITICALLY matter in this subject. from http://www.w3.org/TR/REC-xml/#sec-references [Definition: An entity reference refers to the content of a named entity.] [Definition: References to parsed general entities use ampersand (&) and semicolon (;) as delimiters.] -> this was the &amp; we are talking about Now, the IMPORTANT part [Definition: Parameter-entity references use percent-sign (%) and semicolon (;) as delimiters.] This is what, in my humble opinion, what must not be recognized. This is NOT an entity-reference like &amp; Also section 3.3.3 describes the algorithm by which attibute-values should be normalized ( http://www.w3.org/TR/REC-xml/#AVNormalize ), and this include recognition of entity-references (step 3) Also, in section 4.4.5 you can see how the document differentiates "entity references" from "parameter entity references" When an entity reference appears in an attribute value, or a parameter entity reference appears in a literal entity value, its replacement text MUST be processed in place of the reference itself (...) At last, for curiosity, note that the document always use the expression "parameter-entity reference" to avoid problems in the interpretation. I think the absense of the dash in the expression "Parameter entity references MUST NOT be recognized within processing instructions" is what ultimately lead to this bug.
Status: RESOLVED → UNCONFIRMED
Resolution: INVALID → ---
3.3.3 is about attributes in elements, nothing to do with PIs. I still think this is invalid, see also 2.4: "The ampersand character (&) and the left angle bracket (<) MUST NOT appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they MUST be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively"
Actually, this bug is valid. Entities shouldn't be parsed during XML parsing so we're doing the right thing as far as all the arguments brought up so far goes. However when we extract the href "attribute" from the PI we should expand entities according to the xml-stylesheet spec: http://www.w3.org/TR/xml-stylesheet/#NT-StyleSheetPI
Status: UNCONFIRMED → NEW
Ever confirmed: true
Aaaah. We'll probably need our own routine for doing that :-/.
Attached patch v1 (obsolete) — Splinter Review
Attachment #187957 - Flags: review?(bugmail)
There is a test case that works for me on Firefox 1.0.4: <https://bugzilla.mozilla.org/attachment.cgi?id=189035>. I suggest that the reporter check his server configuration and that the Mozilla authorities close this bug, perhaps changing the “summary”, “product”, and “component” fields to something more appropriate. A good summary would be ‘URI/URL query dropped before getting “xml-stylesheet” style sheet (processing instruction, PI, “href” pseudo-attribute)’.
(In reply to comment #16) > I suggest that the reporter check his server configuration ... We have got little further with this bug (but yes better summary would be welcome). You cannot reproduce this that simple. You will have to use stylesheet URI like href="https://bugzilla.mozilla.org/attachment.cgi?id=189034&amp;color=green" and inside of the stylesheet parse CGI GET parameter "color". But you will not get that parameter, you will get "amp;color" instead. If you used only & instead of &amp; in the href attribute, then it would work, but then it will not work in IE and apparently IE sticks better to standards here.
(In reply to comment #16) > I suggest that the reporter check his server configuration and that the Mozilla > authorities close this bug, perhaps changing the “summary”, “product”, and > “component” fields to something more appropriate. A good summary would be > ‘URI/URL query dropped before getting “xml-stylesheet” style sheet (processing > instruction, PI, “href” pseudo-attribute)’. There is no need for these testcases or comments. This bug is valid and I have a fix. Please don't spam this bug anymore.
Summary: <?xml-stylesheet href attribute not processed as URI → xml-stylesheet PI doesn't handle href attribute as in specification
Comment on attachment 187957 [details] [diff] [review] v1 The expat stuff looks fine, however I think you've added this to the wrong place in the sink. The "entity expansion" doesn't only apply to the href attribute, but to all attributes. It might be better to add the call in nsParserUtils::GetQuotedAttributeValue That way it'll work for dynamic changes too.
Attachment #187957 - Flags: review?(bugmail) → review-
Comment on attachment 187957 [details] [diff] [review] v1 Oh, i forgot: >+ if (tok == XML_TOK_ENTITY_REF) { >+ const XML_Char *name; Name seems unused. >+ XML_Char ch = (XML_Char)XmlPredefinedEntityName(enc, ptr, *next - 2); Shouldn't that say |ptr + 2| ?
Attached patch v2 (obsolete) — Splinter Review
Attachment #187957 - Attachment is obsolete: true
Attachment #211128 - Flags: review?(bugmail)
(In reply to comment #20) > >+ XML_Char ch = (XML_Char)XmlPredefinedEntityName(enc, ptr, *next - 2); > > Shouldn't that say |ptr + 2| ? I don't think so, since ptr points to the character after the ampersand.
Status: NEW → ASSIGNED
Comment on attachment 211128 [details] [diff] [review] v2 This doesn't work.
Attachment #211128 - Flags: review?(bugmail) → review-
Attached patch v2.1 (obsolete) — Splinter Review
Attachment #211128 - Attachment is obsolete: true
Attachment #211190 - Flags: review?(bugmail)
Comment on attachment 211190 [details] [diff] [review] v2.1 >Index: content/base/src/nsParserUtils.cpp ... >+ AppendUnicodeTo(start, chunkEnd, aValue); Hmm.. I can't find which AppendUnicodeTo implementation that this is calling. The only implementation I can find lives in nsReadableUtils.cpp, but that takes |nsAString::const_iterator|s which is not what you have here. If that is in fact the version that ends up being called, it seems simpler and faster to just use |aValue.Append(Substring(start, chunkEnd))|. Could you attach a -w version of the patch, should make reviewing the contentsink changes a lot simpler. Also, in general it's nice with a bit more context then just three lines.
Attached patch v2.2 (obsolete) — Splinter Review
With better iterators.
Attachment #211190 - Attachment is obsolete: true
Attachment #211256 - Flags: review?(bugmail)
Attachment #211190 - Flags: review?(bugmail)
Attached patch v2.2 (diff -w) (obsolete) — Splinter Review
Comment on attachment 211257 [details] [diff] [review] v2.2 (diff -w) I would've attached this if I thought it was useful. Anyway, since you seem to insist: here it is :-).
Attachment #211257 - Attachment description: v2.2 () → v2.2 (diff -w)
Attachment #211256 - Flags: superreview?(bzbarsky)
So.. I have two questions: 1) This is changing the behavior of GetAttrValue for all nsXMLProcessingInstructions. That seems wrong. Then again, the method is only called by nsXMLStylesheetPI. Perhaps it should simply live in nsXMLStylesheetPI? Then the changes would be fine.... Alternately, there should be some clear documentation that this "attr value" is following the xml-stylesheet specification and other PIs should NOT use it unless their syntax coincides with this one. 2) Why are we adding stuff to parser/expat/lib/moz_extensions.c that's not actually used by expat? Shouldn't that code live elsewhere? Or is the idea that we're reusing expat's predefinedEntityName?
for 1 I'd like to just document what the function does and leave it up to users if they want to use it or not. While the exact syntax is only currently defined for xml-stylesheet, nothing about it is specific to stylesheets so hopefully other specs will use it.
XBL2 currently claims to use the same syntax.
(In reply to comment #29) > 1) This is changing the behavior of GetAttrValue for all > nsXMLProcessingInstructions. That seems wrong. Then again, the method is only > called by nsXMLStylesheetPI. Perhaps it should simply live in > nsXMLStylesheetPI? Then the changes would be fine.... Alternately, there > should be some clear documentation that this "attr value" is following the > xml-stylesheet specification and other PIs should NOT use it unless their > syntax coincides with this one. Maybe we should just drop GetAttrValue. It's not very efficient: it'd be better to only call GetData once and nsParserUtils::GetQuotedAttributeValue multiple times in nsXMLStylesheetPI::GetStyleSheetInfo, instead of calling GetAttrValue multiple times. > 2) Why are we adding stuff to parser/expat/lib/moz_extensions.c that's not > actually used by expat? Shouldn't that code live elsewhere? Or is the idea > that we're reusing expat's predefinedEntityName? We're using little2_/big2_scanRef, XmlCharRefNumber, XmlUtf16Encode and XmlPredefinedEntityName. I don't really want to start exporting xmltok.h, which is an internal Expat header and I don't want to rewrite all those routines. Besides, where would you rather put code to parse an XML entity? I don't see any issue with having it there. Ideally we'd be able to call MOZ_XMLTranslateEntity directly, without having to go through the parser service, but we need to link the parser into layout then.
Comment on attachment 211256 [details] [diff] [review] v2.2 >Index: content/base/src/nsParserUtils.cpp >+ nsString::const_char_iterator dataStart, start, end; >+ dataStart = aSource.BeginReading(start); Why bother? You treat them as PRUnichar* anyway below, so might as well use PRUnichar* throughout the method... >Index: content/xml/content/src/nsXMLProcessingInstruction.cpp >PRBool >nsXMLProcessingInstruction::GetAttrValue(nsIAtom *aName, nsAString& aValue) ... >+ return nsParserUtils::GetQuotedAttributeValue(data, aName, aValue); You changed GetQuotedAttributeValue to return nsresult, so something's wrong here, no? >Index: content/xml/content/src/nsXMLProcessingInstruction.h >+ PRBool GetAttrValue(nsIAtom *aName, nsAString& aValue); Document per our discussion in this bug? >Index: content/xml/document/src/nsXMLContentSink.cpp >+ // If it's not a CSS stylesheet PI... >+ nsAutoString type; >+ rv = nsParserUtils::GetQuotedAttributeValue(data, nsGkAtoms::type, type); >+ NS_ENSURE_SUCCESS(rv, rv); I don't think this method should be returning failure just because there was a "malformed type attr" on the PI.... Why should that be a failure? >+ rv = ParsePIData(data, href, title, media, isAlternate); >+ NS_ENSURE_SUCCESS(rv, rv); Same here. >+nsXMLContentSink::ParsePIData(const nsString &aData, >+ aTitle.CompressWhitespace(); I know the code used to do this... but why? I see no reason we should be doing that; certainly not in the spec. >+ ToLowerCase(aMedia); Again, I know we used to do this, but why bother? >Index: content/xul/document/src/nsXULContentSink.cpp >+ rv = nsParserUtils::GetQuotedAttributeValue(data, nsGkAtoms::href, href); >+ NS_ENSURE_SUCCESS(rv, rv); Again, not sure we want to throw if we can't parse it... Similar elsewhere in this method. >+ if (!FindInReadable(NS_LITERAL_STRING("xml-stylesheet"), targetStart, >+ targetEnd)) { I know the code did this already.... but it's pretty bogus. File a followup bug to fix it, please? >Index: parser/expat/lib/moz_extensions.c >+PRBool MOZ_XMLTranslateEntity(const char* ptr, const >+ int n = XmlCharRefNumber(enc, ptr); So there are two interesting cases here that this code doesn't handle well: 1) n == 0 2) n >= 0x10000 For the former, we'll happily stick a null in the middle of our attribute value, which I'm not sure if desirable. For the former we'll bail out completely (which means that we'll screw up titles in some languages; in particular anything not in the BMP). I think we need to fix both of those issues.
Attachment #211256 - Flags: superreview?(bzbarsky) → superreview-
> >+ nsString::const_char_iterator dataStart, start, end; > >+ dataStart = aSource.BeginReading(start); > > Why bother? You treat them as PRUnichar* anyway below, so might as well use > PRUnichar* throughout the method... well, nsString::const_char_iterator *is* a PRUnichar*, so that logic can be applied to every user of nsString::const_char_iterator. No?
(In reply to comment #33) > >+ int n = XmlCharRefNumber(enc, ptr); > > So there are two interesting cases here that this code doesn't handle well: > > 1) n == 0 > 2) n >= 0x10000 > > For the former, we'll happily stick a null in the middle of our attribute > value, which I'm not sure if desirable. For the former we'll bail out > completely (which means that we'll screw up titles in some languages; in > particular anything not in the BMP). For the former, 0x0 is defined as BT_NONXML (see http://lxr.mozilla.org/seamonkey/source/parser/expat/lib/asciitab.h), as are all the other 'illegal' characters. This is where reusing the Expat code pays off, it already has to deal with all this when parsing. For the latter, Expat defines int to be int32. Ugly, but it works.
> so that logic can be applied to every user of nsString::const_char_iterator. Sure. And in code we're adding we're just using PRUnichar*. We're not doing a treewide cleanup jihad, but that's not really relevant here. > For the former, 0x0 is defined as BT_NONXML Ah, ok. So checkCharRefNumber() saves us there, right? Might be worth to document (and assert? Not sure we can assert in this code) that n != 0 when we return from XmlCharRefNumber > For the latter, Expat defines int to be int32 I don't see what that has to do with the issue I raised about non-BMP chars. The point is, those can't be expressed in a single PRUnichar, and MOZ_XMLTranslateEntity only outputs a single PRUnichar. It should probably output a PRUint32 or equiv and the caller should break it up into PRUnichars itself as needed. Or something.
Oh, right, I misunderstood the second one. Good point, I guess I could pass in a PRUnichar[2] buffer instead.
Sure. That would work too. Or PRUnichar[3] and demand null-termination or something.
Attached patch v2.3 (obsolete) — Splinter Review
Attachment #211256 - Attachment is obsolete: true
Attachment #211257 - Attachment is obsolete: true
Attachment #214678 - Flags: superreview?(bzbarsky)
Attachment #214678 - Flags: review+
I'll try to review in detail on Sunday, but is GetQuotedAttributeValue returning nsresult or PRBool? It doesn't seem to be consistent...
(In reply to comment #40) > I'll try to review in detail on Sunday, but is GetQuotedAttributeValue > returning nsresult or PRBool? It doesn't seem to be consistent... I changed it back to returning PRBool, since we almost always ignore the result anyway. I checked all the callers now, don't see a problem. Please point it out in your review if I missed one.
Comment on attachment 214678 [details] [diff] [review] v2.3 Actually, I seem to have started off with the wrong patch.
Attachment #214678 - Flags: superreview?(bzbarsky) → superreview-
Attached patch v2.3Splinter Review
Attachment #214678 - Attachment is obsolete: true
Attachment #214792 - Flags: superreview?(bzbarsky)
Attachment #214792 - Flags: review+
Comment on attachment 214792 [details] [diff] [review] v2.3 >Index: content/base/src/nsParserUtils.cpp >+nsParserUtils::GetQuotedAttributeValue(const nsString& aSource, nsIAtom *aName, >- NS_ASSERTION(!aAttribute.IsEmpty(), "Empty attribute name cannot be searched for usefully"); How about we assert that aName is non-null? >Index: content/xul/document/src/nsXULContentSink.cpp >+ if (!FindInReadable(NS_LITERAL_STRING("xml-stylesheet"), targetStart, >+ targetEnd)) { We're still filing a followup on this, right? >Index: parser/expat/lib/moz_extensions.c >+ XML_Char ch = (XML_Char)XmlPredefinedEntityName(enc, ptr, *next - 2); "2" because that's sizeof(XML_Char) or something? Maybe this should just use said sizeof here and document why you're subtracting it? It's certainly not clear to me why you are. I assume we're backing up past the ';'? sr=bzbarsky with the nits picked.
Attachment #214792 - Flags: superreview?(bzbarsky) → superreview+
In what you checked in I noticed this snippet, which is in attachment 214678 [details] [diff] [review] but not in attachment 214792 [details] [diff] [review] : + nsAString::iterator dest; + aValue.BeginWriting(dest); Did you check in the right thing? (Reason I noticed it is 'coz there's nothing there to make sure aValue's buffer is actually big enough, nor do you call SetLength() afterwards to update the string's internals. Of course this is a moot point since your last patch doesn't use BeginWriting().)
(In reply to comment #45) > Did you check in the right thing? No, I used the wrong tree again, see also comment 42. I've nuked that tree now :-).
Status: ASSIGNED → RESOLVED
Closed: 20 years ago19 years ago
Resolution: --- → FIXED
Comment on attachment 214792 [details] [diff] [review] v2.3 It'd be nice to have this fixed for FF2 too since it's a spec compliance thing
Attachment #214792 - Flags: approval-branch-1.8.1?(jst)
Attachment #214792 - Flags: approval-branch-1.8.1?(jst) → approval-branch-1.8.1+
Comment on attachment 214792 [details] [diff] [review] v2.3 Actually, the patch currently can't land on the branch since it modifies nsIParserService :( Peterv: Is there some way we can do this without modifying the parser service? Or should we just not port this?
Attachment #214792 - Flags: approval-branch-1.8.1+ → approval-branch-1.8.1-
Yeah, I tried to port this a couple of days ago and realized the same thing. I don't really know how to do this without changing nsIParserService or copying all the entity decoding code from Expat (which I'd rather not do).
You could create a nsIParserService_BRANCH181 or whatever those are called instead.
Ah, yeah, a new interface that just contains the new function would work.
Attachment #220573 - Flags: approval-branch-1.8.1?(bugmail)
Comment on attachment 220573 [details] [diff] [review] v2.3 ported to 1.8 branch So I take it that this is the same as the other but just uses a new interface?
Attachment #220573 - Flags: approval-branch-1.8.1?(bugmail) → approval-branch-1.8.1+
Keywords: fixed1.8.1
*** Bug 346359 has been marked as a duplicate of this bug. ***
Depends on: 347879
Removing fixed1.8.1 keyword since we reverted that change on the 1.8 branch in bug 347879
Keywords: fixed1.8.1
Can we check in some tests for this?
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: