[FIX] XML serializer serializes namespace prefixes unnecessarily

RESOLVED FIXED in mozilla1.8alpha5

Status

()

defect
P2
normal
RESOLVED FIXED
15 years ago
15 years ago

People

(Reporter: bzbarsky, Assigned: bzbarsky)

Tracking

Trunk
mozilla1.8alpha5
x86
Linux
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments, 1 obsolete attachment)

Blocks: 155723
Priority: -- → P2
Target Milestone: --- → mozilla1.8alpha5
Posted patch Patch (obsolete) — Splinter Review
Comment on attachment 161297 [details] [diff] [review]
Patch

peterv, would you review?
Attachment #161297 - Flags: superreview?(peterv)
Attachment #161297 - Flags: review?(peterv)

Comment 3

15 years ago
What about this change that I found useful earlier:

 nsXMLContentSerializer::ConfirmPrefix(nsAString& aPrefix,
				       const nsAString& aURI)
 {
-  if (aPrefix.Equals(kXMLNS)) {
+  if (aPrefix.Equals(kXMLNS) ||
+     (aPrefix.Equals(kXMLPrefix) && aURI.Equals(kXMLNameSpaceURI))) {
     return PR_FALSE;
   }

It stops redeclaring xmlns:xml since the "xml" prefix is standardized. 

Try a testcase with xml:lang to see.

The above will stop, e.g., <p xml:lang="en"> from becoming <p xml:lang="en"
xlmns:xml="http://www.w3.org/XML/1998/namespace">

Also, I found this comment of peterv more clear (bug 155723 comment 12):

"The problem that generated prefixes fix is that anyone can insert nodes through
the DOM that have a namespace URI and a random or empty or previously existing
prefix that's totally unrelated to the prefixes declared at that point through
xmlns attributes."

Care to fold it (or a variartion of it) somewhere in your documentation?
Comment on attachment 161297 [details] [diff] [review]
Patch

>Index: content/base/src/nsXMLContentSerializer.h
>===================================================================

>+  /**
>+   * ConfirmPrefix is needed for cases where scripts have moved/modified
>+   * elements/attributes and for cases when we serialize ranges.  It ensures that we can
>+   * map aPrefix to the namespace URI aURI (for example, that the prefix is not already
>+   * mapped to some other namespace).  aPrefix will be adjusted, if necessary, so the
>+   * value of the prefix _after_ this call is what should be serialized.
>+   * @param aPrefix the prefix that may need adjusting
>+   * @param aURI the namespace URI we want aPrefix to point to
>+   * @param aMustHavePrefix PR_TRUE if we the output prefix must be nonempty whenever a

s/we//

>Index: content/base/src/nsXMLContentSerializer.cpp
>===================================================================

>@@ -371,23 +369,24 @@ nsXMLContentSerializer::ConfirmPrefix(ns
>   }
> 
>   // There are no namespace declarations that match the prefix, uri pair. 
>   // If there's another prefix that matches that URI, us it.

While you're here, change 'us' to 'use'

>   if (uriMatch) {
>     aPrefix.Assign(closestURIMatch);
>     return PR_FALSE;
>   }
>-  // If we don't have a prefix, create one
>-  else if (aPrefix.IsEmpty()) {
>+  // If we don't have a prefix but used to have one, create one
>+  else if (aPrefix.IsEmpty() && aMustHavePrefix) {

While you're here, drop the else.


I'm worried that this code is (still) broken.

I think aMustHavePrefix is not enough and you need to decouple
|aPrefix.IsEmpty()| from |we can't use this prefix| in ConfirmPrefix. I think
the following will break with your change:

element (qname: foo, namespace URI: whatever)
  attr (qname: xmlns, value: bar)

We'll call |ConfirmPrefix("", "whatever", PR_FALSE)| with in the stack a
mapping from "" to "bar". Looping over the stack, we'll find a prefix match but
the URIs don't match. We don't make up a prefix (|aPrefix.IsEmpty() &&
aMustHavePrefix| is false) and so we'll push a mapping from "" to "whatever" on
the stack. We'll output two attributes on the same element declaring the
default namespace to a different URI.

I also don't understand how this code is supposed to work with the same prefix
being declared on an element and on one of its ancestors with a different URI.
As I read the code, if the URI we're trying to use is the same as the one
declared in the ancestor we'll end up using that prefix, even if the current
element maps it to a different URI.

We never check if the synthetized prefix doesn't already exist. Granted, it's
unlikely but it could happen when repeatedly loading/changing/serializing the
same doc.

Regarding the xml namespace, we don't handle that really well either. I'm also
in favor of not outputting a declaration for it (we're allowed to but not don't
need to). I did add code in most DOM methods to disallow creating things
that'll cause us to map the 'xml' prefix to a different URI or to map a
different prefix to http://www.w3.org/XML/1998/namespace, but we might want to
assert that we never do so during serialization.

Feel free to correct me if I misunderstood some of this code, and to file new
bugs or fix it here if I didn't.
(In reply to comment #4)
> I also don't understand how this code is supposed to work with the same prefix
> being declared on an element and on one of its ancestors with a different URI.
> As I read the code, if the URI we're trying to use is the same as the one
> declared in the ancestor we'll end up using that prefix, even if the current
> element maps it to a different URI.

Actually, this is not if the same prefix is declared but if we try to output an
element with no prefix and a namespace URI and the default namespace on one of
its ancestors is set to a different URI and on another ancestor further up to
the same URI. Though I'm not sure, this code is a bit hard to follow :-/.

element (qname: baz, namespace URI: bar)
  element (qname: foo, namespace URI: whatever)
    element (qname: something, namespace URI: bar)
Comment on attachment 161297 [details] [diff] [review]
Patch

(In reply to comment #4)
> I think the following will break with your change:

You're right.  I'll write a testcase showing that.

> As I read the code, if the URI we're trying to use is the same as the one
> declared in the ancestor we'll end up using that prefix, even if the current
> element maps it to a different URI.

I don't think that can happen, since we push our own namespaces on before
calling ConfirmPrefix().

I do think that we can run into trouble in that code due to truncating
aPrefix(), however.  I'll write a testcase for that too.

> We never check if the synthetized prefix doesn't already exist.

Indeed.  Very much indeed.

I'll work on fixing these issues up.
Attachment #161297 - Flags: superreview?(peterv)
Attachment #161297 - Flags: superreview-
Attachment #161297 - Flags: review?(peterv)
Attachment #161297 - Flags: review-
> I do think that we can run into trouble in that code due to truncating
> aPrefix(), however.  I'll write a testcase for that too.

Nevermind.  I don't think that works, too easily.  I agree with the problem
cited in comment 5, however.
Hmm... So I just realized that the problem is that 

  attr (qname: xmlns, value: bar)

won't get pushed on the namespace stack (because it's not in the xmlns
namespace).  I think we should push such attrs anyway.  Once we serialize them,
they'll _look_ like namespace decls.
Comment on attachment 161442 [details] [diff] [review]
Updated patch

I think this fixes almost all the issues raised in this bug.  The one exception
is the part about DOM doing things that map the "xml" prefix to a different URI
or map a different prefix to the XML namespace...
Attachment #161442 - Flags: superreview?(rbs)
Attachment #161442 - Flags: review?(peterv)
(In reply to comment #9)
> Hmm... So I just realized that the problem is that 
> 
>   attr (qname: xmlns, value: bar)
> 
> won't get pushed on the namespace stack (because it's not in the xmlns
> namespace).  I think we should push such attrs anyway.  Once we serialize them,
> they'll _look_ like namespace decls.

Most of them will be in the xmlns namespace: the sinks do put them in the right
namespace and the *NS methods to create/set attributes do too.
We might have a problem with the older non-namespace-aware DOM methods like
createAttribute. Of course, the DOM spec totally ignores the problem of
serialization, it notes that the created nodes "do not have any namespace
prefix, namespace URI, or local name" and "In the DOM, all namespace declaration
attributes are by definition bound to the namespace URI:
'http://www.w3.org/2000/xmlns/'. These are the attributes whose namespace prefix
or qualified name is "xmlns" as introduced in [XML Namespaces 1.1].".
Creating a xmlns attribute with createAttribute will probably work but I'm not
sure that technically it's a namespace declaration attribute. So either we treat
them as a namespace declaration attribute or we don't output them, I don't think
there are other posibilities :-/.
Comment on attachment 161442 [details] [diff] [review]
Updated patch

> Index: content/base/src/nsXMLContentSerializer.h
> ===================================================================

> +  /**
> +   * The problem that ConfirmPrefix fixes is that anyone can insert nodes
> +   * through the DOM that have a namespace URI and a random or empty or
> +   * previously existing prefix that's totally unrelated to the prefixes
> +   * declared at that point through xmlns attributes.  So what ConfirmPrefix
> +   * does is ensure that we can * map aPrefix to the namespace URI aURI (for
> +   * example, that the prefix is not already * mapped to some other namespace).
> +   * aPrefix will be adjusted, if necessary, so the * value of the prefix

You have a couple of stray * there.

> Index: content/base/src/nsXMLContentSerializer.cpp
> ===================================================================

> +    if (!uriMatch && aURI.Equals(decl->mURI)) {
> +      // Need to check that decl->mPrefix is not declared anywhere closer to
> +      // us.  If it is, we can't use it.
> +      PRBool prefixOK = PR_TRUE;
> +      for (PRInt32 index2 = count-1; index2 > index && prefixOK; --index2) {

I prefer index2 declared before for.

> +        NameSpaceDecl* decl2 = (NameSpaceDecl*)mNameSpaceStack.ElementAt(index2);

Long line.

> +        prefixOK = !(decl2->mPrefix == decl->mPrefix);

= (decl2->mPrefix != decl->mPrefix)

I think this handles every case correctly now. Thanks.
Attachment #161442 - Flags: review?(peterv) → review+
Will fix those before checking in.

Comment 16

15 years ago
Comment on attachment 161442 [details] [diff] [review]
Updated patch

sr=rbs
Attachment #161442 - Flags: superreview?(rbs) → superreview+
Fixed in with peterv's comments fixed.
Status: NEW → RESOLVED
Last Resolved: 15 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.