Closed
Bug 664104
Opened 13 years ago
Closed 10 years ago
Implement spec changes to ruby parsing
Categories
(Core :: DOM: HTML Parser, defect)
Core
DOM: HTML Parser
Tracking
()
RESOLVED
FIXED
mozilla33
Tracking | Status | |
---|---|---|
firefox33 | - | --- |
People
(Reporter: hsivonen, Assigned: yuki.sekiguchi)
References
Details
(Keywords: dev-doc-needed)
Attachments
(9 files, 4 obsolete files)
69.11 KB,
patch
|
Details | Diff | Splinter Review | |
15.73 KB,
patch
|
Details | Diff | Splinter Review | |
7.30 KB,
patch
|
wchen
:
review+
|
Details | Diff | Splinter Review |
2.94 KB,
patch
|
wchen
:
review+
|
Details | Diff | Splinter Review |
6.95 KB,
patch
|
wchen
:
review+
|
Details | Diff | Splinter Review |
1.62 KB,
patch
|
wchen
:
review+
|
Details | Diff | Splinter Review |
14.86 KB,
patch
|
wchen
:
review+
|
Details | Diff | Splinter Review |
69.64 KB,
patch
|
wchen
:
review+
|
Details | Diff | Splinter Review |
7.27 KB,
patch
|
Details | Diff | Splinter Review |
The parsing spec got changed in response to http://www.w3.org/Bugs/Public/show_bug.cgi?id=12935 and will probably change a bit more. Implement the changes.
Reporter | ||
Comment 1•13 years ago
|
||
Reporter | ||
Comment 2•13 years ago
|
||
Reporter | ||
Comment 3•13 years ago
|
||
Attachment #539181 -
Attachment is obsolete: true
Reporter | ||
Comment 4•13 years ago
|
||
The new patches implement the proposal from
www.w3.org/Bugs/Public/show_bug.cgi?id=13113
(Test cases not included before seeing how exactly the spec settles.)
Attachment #539180 -
Attachment is obsolete: true
Comment 5•11 years ago
|
||
(In reply to Henri Sivonen (:hsivonen) from comment #4)
> The new patches implement the proposal from
> www.w3.org/Bugs/Public/show_bug.cgi?id=13113
>
> (Test cases not included before seeing how exactly the spec settles.)
The new ruby markup model is now part of the W3C HTML specification:
http://lists.w3.org/Archives/Public/public-html/2013Dec/0010.html
Reporter | ||
Comment 6•11 years ago
|
||
I don't see https://github.com/w3c/html/commit/e2ddb663fd04803d2be7f16026e2117ced167c01 changing the parsing. And even if it did, I think we should not implement W3C-only parsing changes. The parsing algorithm is too important to play along with WHATWG and W3C diverging on parsing.
Comment 7•11 years ago
|
||
It does change the parsing, see https://github.com/w3c/html/commit/e2ddb663fd04803d2be7f16026e2117ced167c01#diff-36cd38f49b9afa08222c0dc9ebfe35ebL98664 and below. A perhaps more readable view of the (small) changes is in: https://github.com/html5lib/html5lib-python/pull/126/files
I wouldn't call this a "W3C-only parsing change", it certainly isn't what it intends to be. The only reason I moved ahead with this is because it had implementer support. (It also matches what every single user I have spoken to wants, and reflects existing content I could find.)
Reporter | ||
Comment 8•11 years ago
|
||
(In reply to Robin Berjon from comment #7)
> The only reason I moved ahead with this is because it had
> implementer support.
From which implementors?
Comment 9•10 years ago
|
||
I was pinged by a volunteer developer willing to help this bug. Henri, or anyone else watching this bug, any concerns to make this work progress?
Here's the current status updates in case you missed any:
* We've got good coverage of test cases submitted here[1] and here[2].
* Though these cases are not merged to html5lib-tests yet, Robin in HTML WG said he can merge them once Mozilla is happy to this change and the developer is assigned.
* WebKit has landed this change 2 months ago[3].
* I talked with Jet in the last TPAC, and then 2 months ago with fantasai. He looked fine for this change.
* I asked dbaron to pay attention to this bug a month ago at CSS WG F2F since a developer might be on board in June.
Yuki, the developer, is currently learning your WIP patch, and is working on to integrate test cases. Hopefully he'll speak up in this bug in a week or so, but if you have any concerns on this work, could you please let us know? I'd be more than happy to discuss further.
[1] https://github.com/html5lib/html5lib-tests/pull/27/files
[2] https://github.com/darobin/html5lib-tests/pull/1/files
[3] http://trac.webkit.org/changeset/167437
Reporter | ||
Comment 10•10 years ago
|
||
(In reply to Koji Ishii from comment #9)
> I was pinged by a volunteer developer willing to help this bug. Henri, or
> anyone else watching this bug, any concerns to make this work progress?
My main concern is that I think it's very unhealthy for the WHATWG and HTML WG parsing algorithms to diverge. It also worries me that the WebKit change was reviewed by someone who I don't recognize as having been active in parsing algorithm spec matters.
I'd like there to be a plan for making sure that the WHATWG and HTML WG parsing algorithms and the implementations converge.
Is there such a plan?
Note that I'm not saying that Hixie always right or anything like that, but if he hasn't changed his mind yet, I'd like to know what the plan for getting the WHATWG spec changed is if there is a seriously good reason to believe that doing the HTML WG version is the Right Thing in this case. What's the reason to believe that the HTML WG has made the right call here?
Reporter | ||
Updated•10 years ago
|
Assignee: hsivonen → nobody
Reporter | ||
Updated•10 years ago
|
Status: ASSIGNED → NEW
Comment 11•10 years ago
|
||
Good point, I agree with you, I do not want to make a diverge either. Here's what I think.
Several folks in i18n world have been working on investigating use cases and which markup supports which use cases. This work resulted in a wiki page[1] and then a W3C Note[2]. The previous investigations and discussions missed two important ruby stakeholders; publishers and educations. I've reached to publishers a few years ago during my work on EPUB3, and then to people in educational market only one year or less ago. Educational materials in Japan use ruby heavily, and it became pretty obvious that the previous spec was not good enough for them.
With such investigation, additional information, and the wiki/note, I think we can respond to any objections except one point that implementers are not interested in the presented use cases. IIUC, that was one of the biggest points for a long time.
So the plan is that, once it's proved that at least two implementers are good with the change, I'd like to work with W3C (Robin and Richard specifically) to re-start the conversation between W3C and WHATWG on this matter, and resolve the diverge. I won't leave this issue just by landing on Gecko, rather, I'm willing to follow up until WHATWG and all other implementations match. Does this plan sound good to you?
I preferred working on code first because the time Yuki can work on the patch is limited that we may loose him if we'd wait for the conversation. That said, if we start the conversation first, I cannot commit implementations, but I know WHATWG requires commitments of implementations to reach consensus, so it's a bit chicken-and-egg problem for me. Working on code first then to discuss will help me to resolve this chicken-and-egg problem.
The reviewer of the WebKit change is the engineering manager of Safari and WebKit. In the last TPAC discussion, Ryosuke Niwa was invited from WebKit team and he was good with the change. I've been chatting with him on and off since then on this topic. If I understand his words correctly, his team is all on board with adding rb back. Our plan was Ryosuke to find appropriate reviewer once the patch was submitted, but Darin responded quicker while Ryosuke was off to Asia. I hope this clarifies your concern at least a bit.
[1] http://www.w3.org/International/wiki/Rb
[2] http://www.w3.org/TR/ruby-use-cases/
Reporter | ||
Comment 12•10 years ago
|
||
(In reply to Koji Ishii from comment #11)
> Good point, I agree with you, I do not want to make a diverge either. Here's
> what I think.
>
> Several folks in i18n world have been working on investigating use cases and
> which markup supports which use cases. This work resulted in a wiki page[1]
> and then a W3C Note[2]. The previous investigations and discussions missed
> two important ruby stakeholders; publishers and educations. I've reached to
> publishers a few years ago during my work on EPUB3, and then to people in
> educational market only one year or less ago. Educational materials in Japan
> use ruby heavily, and it became pretty obvious that the previous spec was
> not good enough for them.
>
> With such investigation, additional information, and the wiki/note, I think
> we can respond to any objections except one point that implementers are not
> interested in the presented use cases. IIUC, that was one of the biggest
> points for a long time.
>
> So the plan is that, once it's proved that at least two implementers are
> good with the change, I'd like to work with W3C (Robin and Richard
> specifically) to re-start the conversation between W3C and WHATWG on this
> matter, and resolve the diverge. I won't leave this issue just by landing on
> Gecko, rather, I'm willing to follow up until WHATWG and all other
> implementations match. Does this plan sound good to you?
Sounds good.
> I preferred working on code first because the time Yuki can work on the
> patch is limited that we may loose him if we'd wait for the conversation.
OK. Let's not block him.
> The reviewer of the WebKit change is the engineering manager of Safari and
> WebKit. In the last TPAC discussion, Ryosuke Niwa was invited from WebKit
> team and he was good with the change. I've been chatting with him on and off
> since then on this topic. If I understand his words correctly, his team is
> all on board with adding rb back. Our plan was Ryosuke to find appropriate
> reviewer once the patch was submitted, but Darin responded quicker while
> Ryosuke was off to Asia. I hope this clarifies your concern at least a bit.
This addresses my WebKit review concern. Thanks.
Assignee | ||
Comment 13•10 years ago
|
||
Thank you for clarifying the situation.
I'll upload patches soon.
Assignee | ||
Comment 14•10 years ago
|
||
Implemented new ruby model in the following HTML5 parser algorithm to htmlparser.
http://www.w3.org/html/wg/drafts/html/CR/syntax.html#parsing-main-inbody
> A start tag whose tag name is one of: "rb", "rp", "rtc"
> If the stack of open elements has a ruby element in scope, then generate implied end tags. If the current node is not then a ruby element, this is a parse error.
>
> Insert an HTML element for the token.
> A start tag whose tag name is "rt"
> If the stack of open elements has a ruby element in scope, then generate implied end tags, except for rtc elements. If the current node is not then a ruby element or an rtc element, this is a parse error.
>
> Insert an HTML element for the token.
Attachment #8441172 -
Flags: review?(hsivonen)
Assignee | ||
Comment 15•10 years ago
|
||
Translated code of Part1.
Attachment #8441173 -
Flags: review?(hsivonen)
Assignee | ||
Comment 16•10 years ago
|
||
Current Mozilla considered ruby related elements as HTMLUnknownElement, but they should be HTMLElement.
http://www.w3.org/html/wg/drafts/html/CR/text-level-semantics.html#the-ruby-element
> 4.5.21 The ruby element
> Uses HTMLElement.
http://www.w3.org/html/wg/drafts/html/CR/text-level-semantics.html#the-rb-element
> 4.5.22 The rb element
> Uses HTMLElement.
http://www.w3.org/html/wg/drafts/html/CR/text-level-semantics.html#the-rt-element
> 4.5.23 The rt element
> Uses HTMLElement.
http://www.w3.org/html/wg/drafts/html/CR/text-level-semantics.html#the-rtc-element
> 4.5.24 The rtc element
> Uses HTMLElement.
http://www.w3.org/html/wg/drafts/html/CR/text-level-semantics.html#the-rp-element
> 4.5.25 The rp element
> Uses HTMLElement.
Attachment #8441179 -
Flags: review?(hsivonen)
Assignee | ||
Comment 17•10 years ago
|
||
Part3 patch will fix expected failure about ruby editing.
Attachment #8441180 -
Flags: review?(hsivonen)
Assignee | ||
Comment 18•10 years ago
|
||
Since I don't know the rule of updating parser/htmlparser/tests/mochitest/html5lib_tree_construction/, this patch may violate the rule.
My main concern is that the original pull request is not merged to master.
Attachment #8441182 -
Flags: review?(hsivonen)
Assignee | ||
Comment 19•10 years ago
|
||
Same as part5, my main concern is that the original pull request is not merged to master.
Attachment #8441183 -
Flags: review?(hsivonen)
Assignee | ||
Comment 20•10 years ago
|
||
Hi Henri,
Could you review this patch?
I used "suggested reviewers" feature in bugzilla, and it suggests you. I confused that the suggestion was different from module owners. Please feel free to tell me if I send to wrong reviewer.
Comment 21•10 years ago
|
||
(In reply to Yuki Sekiguchi from comment #19)
> Created attachment 8441183 [details] [diff] [review]
> Part6: Update html5lib-tests to
> https://github.com/html5lib/html5lib-tests/pull/27.
>
> Same as part5, my main concern is that the original pull request is not
> merged to master.
It's sort of a chicken-and-egg problem here. That pull request won't be merged unless there's implementer interest. But it hasn't met with technical objections beyond a divergence discussion very similar to the one here, with the same conclusion (clearly, it's a concern we all have and wish to address).
Reporter | ||
Comment 22•10 years ago
|
||
Comment on attachment 8441172 [details] [diff] [review]
Part1: Parse new ruby model.
Unfortunately, I won't have the time to review this properly before I head out of office and will be back the week after next.
wchen, can you take a look?
Attachment #8441172 -
Flags: review?(hsivonen) → review?(wchen)
Reporter | ||
Updated•10 years ago
|
Attachment #8441173 -
Flags: review?(hsivonen) → review?(wchen)
Reporter | ||
Updated•10 years ago
|
Attachment #8441179 -
Flags: review?(hsivonen) → review?(wchen)
Reporter | ||
Updated•10 years ago
|
Attachment #8441180 -
Flags: review?(hsivonen) → review?(wchen)
Reporter | ||
Updated•10 years ago
|
Attachment #8441182 -
Flags: review?(hsivonen) → review?(wchen)
Reporter | ||
Updated•10 years ago
|
Attachment #8441183 -
Flags: review?(hsivonen) → review?(wchen)
Comment 23•10 years ago
|
||
Comment on attachment 8441172 [details] [diff] [review]
Part1: Parse new ruby model.
Review of attachment 8441172 [details] [diff] [review]:
-----------------------------------------------------------------
You also need to handle rt, rb, rtc and rp in the 'An end tag whose tag name is "html"' case in the "in body" insertion mode.
::: src/nu/validator/htmlparser/impl/TreeBuilder.java
@@ +2433,5 @@
> + generateImpliedEndTagsExceptFor("rtc");
> + }
> + if (eltPos != currentPtr) {
> + int rtcPos = findLastInScope("rtc");
> + if (rtcPos != currentPtr) {
!isCurrent("rtc")
Attachment #8441172 -
Flags: review?(wchen) → review-
Comment 24•10 years ago
|
||
Comment on attachment 8441180 [details] [diff] [review]
Part4: Remove expected failure about ruby editing.
yay
Attachment #8441180 -
Flags: review?(wchen) → review+
Updated•10 years ago
|
Attachment #8441182 -
Flags: review?(wchen) → review+
Updated•10 years ago
|
Attachment #8441183 -
Flags: review?(wchen) → review+
Comment 25•10 years ago
|
||
Comment on attachment 8441173 [details] [diff] [review]
Part2: Translate new ruby model parser.
Needs new translation after addressing review comments.
Attachment #8441173 -
Flags: review?(wchen)
Comment 26•10 years ago
|
||
Comment on attachment 8441179 [details] [diff] [review]
Part3: Make ruby related element HTMLElement.
Review of attachment 8441179 [details] [diff] [review]:
-----------------------------------------------------------------
::: parser/htmlparser/public/nsHTMLTagList.h
@@ +132,5 @@
> +HTML_HTMLELEMENT_TAG(rb)
> +HTML_HTMLELEMENT_TAG(rp)
> +HTML_HTMLELEMENT_TAG(rt)
> +HTML_HTMLELEMENT_TAG(rtc)
> +HTML_HTMLELEMENT_TAG(ruby)
Let's move all these entries below HTML_TAG(q, Shared) to keep the order consistent.
Attachment #8441179 -
Flags: review?(wchen) → review+
Assignee | ||
Comment 27•10 years ago
|
||
Attachment #8441172 -
Attachment is obsolete: true
Attachment #8443354 -
Flags: review?(wchen)
Assignee | ||
Updated•10 years ago
|
Attachment #8443354 -
Attachment description: Part1-1: Support html end tag. Use isCurrent(). → Part1-2: Support html end tag. Use isCurrent().
Assignee | ||
Comment 28•10 years ago
|
||
Attachment #8441173 -
Attachment is obsolete: true
Attachment #8443355 -
Flags: review?(wchen)
Assignee | ||
Comment 29•10 years ago
|
||
Assignee | ||
Comment 30•10 years ago
|
||
Hi William,
Thank you for reviewing.
I fixed your comments. If all patches are r+, Could you commit these patches?
Updated•10 years ago
|
Attachment #8443355 -
Flags: review?(wchen) → review+
Updated•10 years ago
|
Attachment #8443354 -
Flags: review?(wchen) → review+
Comment 31•10 years ago
|
||
Thanks for working on this.
https://hg.mozilla.org/integration/mozilla-inbound/rev/6e2c2a52ae8d
https://hg.mozilla.org/integration/mozilla-inbound/rev/9861af1d5d3e
https://hg.mozilla.org/integration/mozilla-inbound/rev/557f6ff322cf
https://hg.mozilla.org/integration/mozilla-inbound/rev/317dedadd516
https://tbpl.mozilla.org/?tree=Try&rev=d4e22c52856b
Assignee: nobody → yuki.sekiguchi
Flags: in-testsuite+
Comment 32•10 years ago
|
||
Comment 33•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/6e2c2a52ae8d
https://hg.mozilla.org/mozilla-central/rev/9861af1d5d3e
https://hg.mozilla.org/mozilla-central/rev/557f6ff322cf
https://hg.mozilla.org/mozilla-central/rev/317dedadd516
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
Assignee | ||
Comment 34•10 years ago
|
||
Thank you all!
Comment 35•10 years ago
|
||
Comment 36•10 years ago
|
||
Where is the intent to implement for this change? We should not land things without intent to implement.
(Also, it seems if anything layout has to improve first here.)
Comment 37•10 years ago
|
||
I was under the impression that the only issue here was parser algorithm divergence (comment 21) and that we were OK with the new changes as long as we follow up to solve the divergence (comment 12).
The patches in this bug are relatively small, isolated and located in code that doesn't change very often. If we reach the consensus that we don't want the additional elements, it should be easy to revert the changes.
(In reply to Anne (:annevk) from comment #36)
> Where is the intent to implement for this change?
I didn't find one, and I wasn't aware that it was necessary here. It looked like implementation started years ago on this bug. Also, we have someone currently looking at the layout side of ruby.
Comment 38•10 years ago
|
||
The parser divergence is just a symptom of the overall ruby design. See discussion in bug 33339 (it has pointers to spec bugs where this is discussed further). The tl;dr is that as far as I can tell, the additional complexity of the proposed changes is not warranted by the relatively minor gain. There's lots of features we haven't added to HTML because the cost:benefit ratio doesn't justify it; IMHO this particular feature is just getting less push-back from the average Jane because it's not an area that most English-speaking people know or care about, so they ignore it.
Comment 39•10 years ago
|
||
Hixie, thank you for your comment, but I don't think this is the right place to discuss how much a ruby feature is important. I'm in sync with Richard and Robin to find the right way to restart the conversation, and it's my goal to resolve the temporary divergence. So please stay tuned, and I look forward to talking to you in the right place.
Comment 40•10 years ago
|
||
Please backout and send an intent to implement. Landing new features without one is not acceptable.
Flags: needinfo?(yuki.sekiguchi)
Updated•10 years ago
|
tracking-firefox33:
--- → ?
(In reply to :Ms2ger from comment #40)
> Please backout and send an intent to implement. Landing new features without
> one is not acceptable.
I don't think this requires backing out, but an intent to implement should be sent.
Comment 42•10 years ago
|
||
Thank you David, we'll be working on it.
Assignee | ||
Comment 43•10 years ago
|
||
(In reply to :Ms2ger from comment #40)
> Please backout and send an intent to implement. Landing new features without
> one is not acceptable.
According to David's comment, I won't request backing out.
Koji sent an intent to implement:
https://groups.google.com/forum/#!topic/mozilla.dev.platform/VqSAwVuskm8
Flags: needinfo?(yuki.sekiguchi)
Updated•10 years ago
|
Comment 44•10 years ago
|
||
So where has the conversation been reopened? There was no response to my comments on dev.platform.
Why are we going down this path?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Making the bug status not reflect the actual state of the tree doesn't help anything.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Comment 46•10 years ago
|
||
We need to check what if any developer documentation changes need to be made here.
Keywords: dev-doc-needed
You need to log in
before you can comment on or make changes to this bug.
Description
•