The default bug view has changed. See this FAQ.

View Source highlighting of entities off by one; When a semicolon has been tokenized in a character reference, should return to the return state eagerly

RESOLVED FIXED in mozilla11

Status

()

Core
HTML: Parser
P3
normal
RESOLVED FIXED
7 years ago
5 years ago

People

(Reporter: hsivonen, Assigned: hsivonen)

Tracking

Trunk
mozilla11
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(firefox10 wontfix)

Details

Attachments

(1 attachment)

(Assignee)

Description

7 years ago
Spinoff from bug 535499:

If document.write ends with the semicolon of a named character reference, the tokenizer state is the data state when returning to the parser thread. (Instead, the return state is the data state and the next input character will cause the return to the data state.) This causes a useless speculation failure.

Should move to the return state eagerly to put the tokenizer into the data state in time for the state transfer over the thread boundary.
(Assignee)

Updated

7 years ago
Priority: P2 → P3
(Assignee)

Updated

6 years ago
Blocks: 482921

Comment 1

6 years ago
Could you please update the summary and also add comment that this also causes one more character after semicolor highlighted as part the preceding HTML entity reference. You linked to this bug in http://hsivonen.iki.fi/view-source/ . But people not reading that article will file bugs against it and will never find this bug.
(Assignee)

Updated

6 years ago
Summary: [HTML5] When a semicolon has been tokenized in a character reference, should return to the return state eagerly → View Source highlighting of entities off by one; When a semicolon has been tokenized in a character reference, should return to the return state eagerly
(In reply to :aceman from comment #1)
> Could you please update the summary and also add comment that this also
> causes one more character after semicolor highlighted as part the preceding
> HTML entity reference. You linked to this bug in
> http://hsivonen.iki.fi/view-source/ . But people not reading that article
> will file bugs against it and will never find this bug.

Well if that IS in fact the same bug as this, then the importance is completely wrong.  Fixing incorrect highlighting is NOT an enhancement.
(Assignee)

Comment 3

5 years ago
(In reply to Bill Gianopoulos from comment #2)
> Well if that IS in fact the same bug as this, then the importance is
> completely wrong.  Fixing incorrect highlighting is NOT an enhancement.

Good point.
Severity: enhancement → normal
It occurs to me that this might be confusing for those not familiar with US-English marketing-speak.

"Enhancement" in software bug reporting context means "Request for a new feature".  Therefore, it is not appropriate for fixing an existing feature.
Probably related: bug 701775. Not a dupe since current bug (535530) was reported far before HTML5 parser has been first used for View Source.
(Assignee)

Updated

5 years ago
Duplicate of this bug: 701775
(Assignee)

Updated

5 years ago
Assignee: nobody → hsivonen
Status: NEW → ASSIGNED
Depends on: 704058
(Assignee)

Comment 7

5 years ago
Created attachment 576479 [details] [diff] [review]
Transition eagerly when a semicolon matches
(Assignee)

Comment 8

5 years ago
Comment on attachment 576479 [details] [diff] [review]
Transition eagerly when a semicolon matches

This patch first fixes a major brainfart that turns out not to have much impact for Gecko: it moves the setting of the reconsume variable to immediately before the calls to transition() so that transition() sees the right value for reconsume.

Then it fixes the actual problem by breaking the named character reference matching loop early when the semicolon is the last character that matches. Normally, the loop is broken by the first character that doesn't match, because HTML requires longest match. However, since the semicolon never appears in the middle of a named character name, there can't be a longer match after a semicolon has been seen.

When the loop is broken early, there's no need to reconsume the semicolon and this needs to be taken into account when setting the start of the next text run when applicable.
Attachment #576479 - Flags: review?(bugs)

Updated

5 years ago
Duplicate of this bug: 705733

Updated

5 years ago
Attachment #576479 - Flags: review?(bugs) → review+
(Assignee)

Comment 10

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/eb1badf94821
status-firefox10: --- → affected
Flags: in-testsuite+
Target Milestone: --- → mozilla11
Looks like you're leaking on OSX Crashtests:

https://tbpl.mozilla.org/php/getParsedLog.php?id=7657775&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=7658113&tree=Mozilla-Inbound
(Assignee)

Comment 12

5 years ago
(In reply to Ms2ger from comment #11)
> Looks like you're leaking on OSX Crashtests:
> 
> https://tbpl.mozilla.org/php/getParsedLog.php?id=7657775&tree=Mozilla-Inbound
> https://tbpl.mozilla.org/php/getParsedLog.php?id=7658113&tree=Mozilla-Inbound

Must be bug 699365. That bug added a crashtest. *This* bug neither adds a crashtest nor changes anything that affects refcounting.
https://hg.mozilla.org/mozilla-central/rev/eb1badf94821
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
(Assignee)

Comment 14

5 years ago
Comment on attachment 576479 [details] [diff] [review]
Transition eagerly when a semicolon matches

(This comment has been posted on all the bugs mentioned in this comment, except bug 710142, so that the release drivers see it regardless of the order in which they process approval requests.)

The new View Source implementation landed before Firefox 10 moved to Aurora. Afterwards, a bunch of regressions were identified. Many of the regression fixes didn't land before Firefox 10 moved to Aurora but they have now been fixed on trunk except for bug 710142.

To avoid shipping with regressions, we either need to land all the regression fixes on Aurora for Firefox 10 (followed by a fix for bug 710142 in Beta if it doesn't make it before Dec 20th) or switch back to the old View Source implementation on Aurora. The new View Source implementation provides much better diagnostics for Web developers than the old View Source implementation.

So I'd like to ask the release drivers to either approve bug 535530, bug 699356, bug 699365, bug 700034, bug 700361, bug 703965, bug 704667 and bug 705473 plus bug 695640, which is a non-regression tweak, or to approve bug 710175 for reverting to the old View Source implementation for Firefox 10.
Attachment #576479 - Flags: approval-mozilla-aurora?

Comment 15

5 years ago
Comment on attachment 576479 [details] [diff] [review]
Transition eagerly when a semicolon matches

[triage comment]
We decided to back out the new view source parser (bug 710175) rather than take this fixup for Firefox 10.

Denying for Aurora.
Attachment #576479 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora-
(Assignee)

Comment 16

5 years ago
Firefox 10 no longer affected for View Source due to bug 710175 landing. WONTFIX for Firefox 10 as far as speculative parsing goes.
status-firefox10: affected → wontfix
You need to log in before you can comment on or make changes to this bug.