Closed Bug 520581 Opened 12 years ago Closed 11 years ago

[HTML5][Patch] HTML5 parser reverses attributes on some elements

Categories

(Core :: DOM: HTML Parser, defect, P1)

x86
macOS
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: bzbarsky, Assigned: hsivonen)

References

Details

Attachments

(2 files)

Attached file Testcase
STEPS TO REPRODUCE:

1)  Enable HTML5 parser.
2)  Load attached testcase.
3)  Examine the string shown.

EXPECTED RESULTS:  <img height="10" width="20">

ACTUAL RESULTS: <img width="20" height="10">

I've tried some other elements, like <br> and <span> and <hr> and <link>.  <hr> has the same issue.  The others don't seem to.
Priority: -- → P4
Assignee: hsivonen → nobody
This breaks a whole bunch of tests.
Priority: P4 → P1
The test case isn't showing the problem for me on Mac. However, I do see order reversals on Mochitest.

How about I make the HTML5 parser add the attributes to the node in the reversed order and remove the pref-based special casing from the serializer? This way at least it wouldn't be necessary to add more pref-based special casing to the serializer. (If we later want the nodes to get the attributes in the forward order, that change could be made once the old HTML parser isn't used anymore.)
OK, on trunk (using mac nightlies) the behavior on the attached testcase changed in this range: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=91e00d39570f&tochange=f1c0c5a52b29

Of course that range has no HTML5 parser changes in it.  Which is the point: why is the ordering different for different elements to start with, when I was testing it?  Why are you seeing differences between mochitest and not?

It basically smells like there's an uninitialized variable somewhere.  Or something.
And in particular, rev f1c0c5a52b29 shows the bug for me in a Linux debug build I just did, but not in the Mac nightly built from that rev!
(In reply to comment #3)
> It basically smells like there's an uninitialized variable somewhere.  Or
> something.

The obvious candidate (the variable that controls the loop direction) is always initialized:
http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsHTMLContentSerializer.cpp#124

Very odd.

I'm going to try removing the reversal from the serializer and adding the reversal to the HTML5 tree op execution code.
(In reply to comment #3)
> why is the ordering different for different elements to start with, when I was
> testing it?

Other than <span>, it could have been an issue with void elements.

> Why are you seeing differences between mochitest and not?

The Mochitest failures I've looked at so far don't use the innerHTML getter but instead use Gecko-specific API to get an nsIDocumentEncoder.
http://hg.mozilla.org/mozilla-central/rev/c2843401d2f9 looks interesting. Perhaps before that patch there was platform-dependent linker behavior?
Hmm. Seems too implausible.

Anyway, I'm going to try making the iteration not depend on the choice of parser.
The patch makes the new parser behave like the old parser removing the need for pref-based special-casing in the serializers (for test cases that don't run the HTML serializer). 

I didn't find a satisfactory explanation for the partial failures, yet, but I filed bug 536383 about unifying the HTML vs. XML behavior once there is no longer a need to unify things with the old HTML parser. It seems simpler to conduct an investigation when the target is the same iteration order for all possible serializer modes.
Assignee: nobody → hsivonen
Status: NEW → ASSIGNED
Attachment #418851 - Flags: review?(bzbarsky)
Summary: [HTML5] HTML5 parser reverses attributes on some elements → [HTML5][Patch] HTML5 parser reverses attributes on some elements
> Other than <span>, it could have been an issue with void elements.

Yes, I thought it was void elements... until I tried <span>.
OK.  I just stepped through this in a debug tip build, where I still see the problem.  The serializer part is fine: it serializes the attributes in the order in which the element has them.  The SetAttr calls also happen in the "right" order (first height, then width).  However after the second SetAttr call, the element reports its attributes as being set in the order width, then height.

I wonder whether this is some sort of mapped attribute problem.
Looks like mapped attributes are sorted, to make the uniquing work right, and they're sorted effectively by nsIAtom* value (that is, the actual nsGkAtom pointer value).  So if I take a build in which the testcase passes, and reverse the attrs in the testcase without restarting the build, it starts to fail.  This behavior is independent of the HTML5 parser.  The exact behavior will depend on where the atoms end up in memory during startup.

I think we should just mark this bug invalid, at least as far as the HTML5 parser is concerned.  Sorry for the confusion.  :(
Thank you for figuring out what's really going on here.

Still, I'd like to land the patch, because it makes Mochitests fail less. Considering the alternatives of removing tests and having dual pass conditions for the old and the new parser, it seems the easiest to align the behavior of the new parser here as long as both parsers are around. Once the old parser is no longer around, I'd like to fix bug 536383.
Comment on attachment 418851 [details] [diff] [review]
Reverse attributes when executing HTML5 tree ops

So with this patch, what attribute order will be produced for:

  <span foo="a" bar="b" foo="c" bar="d">

?  The current parser does has the attributes bar="b" and foo="a" in that order.  Is that what your new setup will do?

What about:

  <span foo="a" bar="b" bar="d" foo="c">

?  In this case, current parser has foo="a" and bar="b" in that order.

Maybe it doesn't much matter, though...
Or put another way, it's not clear to me that you can actually align with the existing setup for the duplicate-attribute cases without actually using the same uniquing algorithm that the current parser uses...
(In reply to comment #15)
> Or put another way, it's not clear to me that you can actually align with the
> existing setup for the duplicate-attribute cases without actually using the
> same uniquing algorithm that the current parser uses...

My main concern isn't aligning with the old parser for the duplicate attribute cases but to narrow the difference in common cases so that Mochitests that compare serializations fail less and leave less stuff to special case as long as the tests need to support two parsers.
(In reply to comment #14)
> (From update of attachment 418851 [details] [diff] [review])
> So with this patch, what attribute order will be produced for:
> 
>   <span foo="a" bar="b" foo="c" bar="d">
> 
> ?  The current parser does has the attributes bar="b" and foo="a" in that
> order.  Is that what your new setup will do?

Yes.

> What about:
> 
>   <span foo="a" bar="b" bar="d" foo="c">
> 
> ?  In this case, current parser has foo="a" and bar="b" in that order.

With the patch, also this case gives bar=b followed by foo=a.
Blocks: 539684
This patch indeed made bug 539684 easier to patch over.
Comment on attachment 418851 [details] [diff] [review]
Reverse attributes when executing HTML5 tree ops

r=bzbarsky.  Sorry for the lag.
Attachment #418851 - Flags: review?(bzbarsky) → review+
Thanks. Pushed:
http://hg.mozilla.org/mozilla-central/rev/8bfa6babb89b
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Blocks: 507872
You need to log in before you can comment on or make changes to this bug.