<rp> auto-closing ruby tags breaks content

RESOLVED FIXED in Firefox 33

Status

()

defect
RESOLVED FIXED
5 years ago
3 years ago

People

(Reporter: zefling, Assigned: yuki.sekiguchi)

Tracking

({regression})

33 Branch
mozilla34
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(firefox33+ fixed, firefox34+ fixed)

Details

(Whiteboard: [mozfr-community])

Attachments

(3 attachments)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:34.0) Gecko/20100101 Firefox/34.0 (Beta/Release)
Build ID: 20140723080654

Steps to reproduce:

I use massively ruby on my website with a CSS formatting for Firefox & Chrome. But, with Nightly the render is broken.

-----

Got as far as we can go bisecting nightlies...
Ensuring we have enough metadata to get a pushlog...
Last good revision: 366b5c0c02d3 (2014-06-23)
First bad revision: e86b84998b18 (2014-06-24)
Pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=366b5c0c02d3&tochange=e86b84998b18



Actual results:

The difference between Fx31 (left) and Nightly (right)
http://ikilote.net/Galeries/Autres/Divers/Fx_Ruby.png
The page :
http://ikilote.net/fr/Collection/Personnalit%C3%A9.html?id=394


Expected results:

The broken render is normal ?
[Tracking Requested - why for this release]:

That regression range contains bug 664104 which implemented spec changes to ruby parsing.

Looking at the page, it has:

  <ruby class="ruby_h"><rbc><rb>雨宮</rb><rb> 

etc, with the "rb" marked as an error in view-source because there is no <ruby> element open at that point as far as the parser is concerned.

Presumably the issue is that there are no actual rbc/rtc elements in HTML?
Blocks: 664104
Flags: needinfo?(yuki.sekiguchi)
Keywords: regression
And in particular, the <rtc> ends up empty inthe DOM as far as I can tell.  The <rp> and <rt> bits are hoisted to be kids of the <ruby> itself.  Presumably this then confuses whatever the CSS is doing.
Yes, I use the XHTML 1.1 ruby element on my page. It's the problem, for CSS the parent of all elements is <ruby>.
Minimal testcase:

http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=3075

It looks like "rt" doesn't close "rtc", but "rp" does.  That seems broken.
Status: UNCONFIRMED → NEW
Ever confirmed: true
> I use the XHTML 1.1 ruby element on my page

Except you're not asking for your page to be parsed as XHTML, of course...
(In reply to Boris Zbarsky [:bz] from comment #5)
> > I use the XHTML 1.1 ruby element on my page
> 
> Except you're not asking for your page to be parsed as XHTML, of course...

It's true, with application/xhtml+xml the render is correct...
(In reply to Boris Zbarsky [:bz] from comment #1)
> etc, with the "rb" marked as an error in view-source because there is no
> <ruby> element open at that point as far as the parser is concerned.

That error message is due to an inverted check (we currently error about not seeing a ruby tag when we want to error about unclosed ruby children), this bug has been around for a while and shouldn't affect the resulting tree:
http://dxr.mozilla.org/mozilla-central/source/parser/html/nsHtml5TreeBuilder.cpp#1331

(In reply to Boris Zbarsky [:bz] from comment #4)
> Minimal testcase:
> 
> http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=3075
> 
> It looks like "rt" doesn't close "rtc", but "rp" does.  That seems broken.

This is actually the specified w3 parser behavior:
http://www.w3.org/html/wg/drafts/html/CR/syntax.html#parsing-main-inbody (see 'A start tag whose tag name is one of: "rb", "rp", "rtc"' and 'A start tag whose tag name is "rt"')

rtc is meant to contain rt. I'm not sure how much sense it makes to allow rp or other ruby elements in there, but this is a spec question.
Okay, with the current parser, the complex ruby is impossible, except with the XML parser. Ruby like this :

<ruby>
  <rbc><rb>壱岐 </rb><rb>ひより</rb></rbc>
  <rtc><rp>(</rp><rt>いき </rt><rt class="ruby_hide">ひより</rt><rp>)</rp></rtc>
  <rtc><rp>(</rp><rt>IKI </rt><rt>Hiyori</rt><rp>)</rp></rtc>
</ruby>	


The result :

<ruby class="ruby_double">
  <rbc>
    <rb>壱岐 </rb>
    <rb>ひより</rb>
  </rbc>
  <rtc></rtc>
  <rp>(</rp>
  <rt>いき </rt>
  <rt class="ruby_hide">ひより</rt>
  <rp>)</rp>
  <rtc></rtc>
  <rp>(</rp>
  <rt>IKI </rt>
  <rt>Hiyori</rt>
  <rp>)</rp>
</ruby>
If you move the <rp> out to a child of <ruby>, the result is probably closer to what you expect. e.g.

<ruby>
  <rbc><rb>壱岐 </rb><rb>ひより</rb></rbc>
  <rp>(</rp><rtc><rt>いき </rt><rt class="ruby_hide">ひより</rt></rtc><rp>)</rp>
  <rp>(</rp><rtc><rt>IKI </rt><rt>Hiyori</rt></rtc><rp>)</rp>
</ruby>

It's my understanding that <rp> is semantically ignored by user agents that properly support ruby, so I'm not sure how much sense it would make to allow it in <rtc>.

I don't feel I understand ruby well enough to tell if this is a spec bug or not.
Thanks wchen, your understanding matches with mine.

If HTML5 parser reads "<ruby><rtc><rp>(</rp></rtc></ruby>", closing rtc element before rp element is expected behaviour.

If XHTML parser reads "<ruby><rtc><rp>(</rp></rtc></ruby>", not closing rtc element before rp element is expected behaviour because the parser is just XML parser and doesn't auto-close tags. However, the content is not valid XHTML because rtc element cannot have rp element as a child.
http://www.w3.org/TR/ruby/#abstract-def
> Elements   Attributes   Minimal Content Model
> rtc        Common       rt+

XHTML validator also complains about it:
http://validator.w3.org/check?uri=http%3A%2F%2Fikilote.net%2Ffr%2FCollection%2FPersonnalit%25C3%25A9.html%3Fid%3D394&charset=%28detect+automatically%29&doctype=Inline&group=0

If you want to fix the content, as wchen said, moving rp element outside of rtc element is an answer.
If you want to use rp elements inside rtc element, you should request HTML Working Group[1] because the spec author doesn't know this use case.

[1]: http://lists.w3.org/Archives/Public/public-html/
Flags: needinfo?(yuki.sekiguchi)
The real question is whether we think there is other content out there that uses <rp> inside <rtc> this way....

> this bug has been around for a while 

Filed?
I will change my HTML in the future, but more five years this HTML structure works perfectly, and today not. Rewrite all my articles is not simple.
When I looked at the content corpora we have, I found usage of ruby constructs (that guided the design) but not a single instance of <rp> inside <rtc>. I'm happy to change the spec if we have any reason to believe that either there's a good use case or enough content to justify it.
I don't think there's any reason to restrict <rp> locations at the parser level. We might not think it should go in certain places, but we don't know for sure what conventions are sensible to use for the more complex cases. IMHO it's better to have <rp> not trigger any auto-closing behavior.

Fwiw, placing <rp> inside <rtc> seems perfectly sensible to me: it gives you better control over styling the <rp> belonging to that particular level, and if you remove that level of annotation, the associated <rp> are removed with it.
Summary: Ruby is broken on Nightly → <rp> auto-closing ruby tags breaks content
(In reply to Boris Zbarsky [:bz] from comment #11)
> The real question is whether we think there is other content out there that
> uses <rp> inside <rtc> this way....

Agreed. This is not about use-cases, but how much we try not to break invalid-but-existing content, though it looks like Robin did his best to assure such content are very rare.

Filed at https://www.w3.org/Bugs/Public/show_bug.cgi?id=26424

> > this bug has been around for a while 
> 
> Filed?

It's not an implementation bug, because rb not being a direct child of ruby should be a parser error as per the current spec, but rather a spec issue.

Filed at https://www.w3.org/Bugs/Public/show_bug.cgi?id=26426
Version: 31 Branch → 33 Branch
OK, it is a regression, so we are going to track it but we won't block the release for that.
I believe Robin is going to fix the parsing spec so that <rp> does not trigger auto-close behavior. (He mentioned he'd fix it this week, so I will pester him about it if he does not. :) Do we have someone to write a patch for this?
I fixed the spec so that rp no longer autocloses rtc. It is in master: http://www.w3.org/html/wg/drafts/html/master/.
I'll take a look on Monday.
It seems that discussion about rp auto close rt and rb is on-going[1]. Should I fix soon or wait for the conclusion?

I locally fix this, and it looks well for the original site. We should fix html5lib-tests because mochitest-plain fails at test_html5_tree_construction_part2.html.

[1]: https://www.w3.org/Bugs/Public/show_bug.cgi?id=26424#c9
I think we should update the implementation to match Robin's recent spec change because its a minor change that better matches old parser behavior, the other proposed solution has behavior with more backwards compatibility risk. If we decide on another solution for w3c bug 26424, we can implement that later.

As for the tests, if it's one of the new ruby tests failing, we can fix it to have the correct output for now and Robin will probably get around to fixing it in the html5lib pull request.

Thanks for working on this Yuki.
Attachment #8471381 - Flags: review?(wchen)
(In reply to William Chen [:wchen] from comment #21)
> I think we should update the implementation to match Robin's recent spec
> change because its a minor change that better matches old parser behavior,
> the other proposed solution has behavior with more backwards compatibility
> risk. If we decide on another solution for w3c bug 26424, we can implement
> that later.

I see. Uploaded patches.

> As for the tests, if it's one of the new ruby tests failing, we can fix it
> to have the correct output for now and Robin will probably get around to
> fixing it in the html5lib pull request.

I sent pull request to Robin's new-ruby branch:
https://github.com/darobin/html5lib-tests/pull/2
This is same as part 3 of this bug.

> Thanks for working on this Yuki.

You are welcome. Thank you for good information in the W3C bug.
Attachment #8471380 - Flags: review?(wchen) → review+
Attachment #8471381 - Flags: review?(wchen) → review+
Attachment #8471383 - Flags: review?(wchen) → review+
Hi William,

Thank you for reviewing. Could you commit these patches?
Using the checkin-needed patch should be enough to get the patches landed.

Could you request an uplift request for aurora (33)? Thanks
Flags: needinfo?(yuki.sekiguchi)
Keywords: checkin-needed
We need to get a push to the tryserver with these patches before we can check it in, to help ensure it doesn't break things for everyone when it gets checked in.
Keywords: checkin-needed
AFAICT, only parts 2 and 3 need to land on our tree. Please correct me if that's wrong.
https://tbpl.mozilla.org/?tree=Try&rev=182fedec319f

If this is green, I'll go ahead and push it to inbound (no need to set checkin-needed again).
Assignee: nobody → yuki.sekiguchi
(In reply to Sylvestre Ledru [:sylvestre] from comment #27)
> Using the checkin-needed patch should be enough to get the patches landed.

Thank you. I'll try next time.

> Could you request an uplift request for aurora (33)? Thanks

I'm not sure uplift is normal, but since safari-600 branch[1] contains ruby parser change, I think we should uplift to aurora.
If uplift this patch, please uplift bug 664104, too.

[1]: http://trac.webkit.org/log/branches/safari-600.1-branch
Flags: needinfo?(yuki.sekiguchi)
Comment on attachment 8471381 [details] [diff] [review]
Part2: Translate html parser.

Approval Request Comment
[Feature/regressing bug #]: Bug 664104
[User impact if declined]: Rendering of ruby in some pages may break due to having a different DOM (e.g. pages using a <rp> element within a <rtc> element).
[Describe test coverage new/current, TBPL]: Parser test is included in one of patches on this bug. Bug 664104 includes many parser tests of the new ruby parsing behavior that is being modified by this change.
[Risks and why]: No known risk, this patch changes behavior to be more backwards compatible with behavior before the regression.
[String/UUID change made/needed]: None
Attachment #8471381 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/f18e821acdf3
https://hg.mozilla.org/mozilla-central/rev/c50473a8b795
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Attachment #8471381 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Whiteboard: [mozfr-community]
You need to log in before you can comment on or make changes to this bug.