Note: There are a few cases of duplicates in user autocompletion which are being worked on.

in a utf-8 page, a utf-8 encoded NBSP (c2a0) from a linked css with unspecified encoding is decoded and rendered as latin

RESOLVED FIXED in mozilla8

Status

()

Core
HTML: Parser
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: al_9x, Assigned: hsivonen)

Tracking

unspecified
mozilla8
x86
Windows XP
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [inbound], URL)

Attachments

(2 attachments)

(Reporter)

Description

6 years ago
Fx 5.0.1, new profile
load http://es5.github.com/

"NOTEÂ Â Â " will be seen at the beginning and throughout

the three utf-8 encoded NBSP following NOTE are coming from a linked css whose encoding is not specified (http, bom, @charset)

.note::after {
  content: "   ";
}

according to http://www.w3.org/TR/CSS2/syndata.html#charset

the css should inherit the encoding (if not specified) of the doc, which is utf-8

Comment 1

6 years ago
Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0            Mozilla/5.0 (Windows NT 5.1; rv:8.0a1) Gecko/20110729 Firefox/8.0a1

> "NOTEÂ Â Â " will be seen at the beginning
cannot reproduce
Component: Layout: Text → General
QA Contact: layout.fonts-and-text → general
(Reporter)

Comment 2

6 years ago
Created attachment 549671 [details]
new profile
(Reporter)

Comment 3

6 years ago
8.0a1 (2011-07-31) as well
wfm with Mozilla/5.0 (Windows NT 6.1; rv:8.0a1) Gecko/20110731 Firefox/8.0a1 SeaMonkey/2.5a1
(Reporter)

Comment 5

6 years ago
If I copy and load the page (and css) locally (file:) I can still repro, but if I remove all but the first DIV from the body, the problem goes away.

The page is large, perhaps it's a question of timing: the determination of the  page's charset is not being made quickly enough (due to its size) so that when the presumably async load of the css tries to determine its own charset, the page still thinks it's latin (which is the default)
the <meta charset="utf-8"> is at the top jut before the css link.
Component: General → Style System (CSS)
QA Contact: general → style-system
(Reporter)

Comment 7

6 years ago
(In reply to comment #6)
> the <meta charset="utf-8"> is at the top jut before the css link.

Nevertheless, the problem is present when the page is large and slower to load and absent when small.

Comment 8

6 years ago
Henri, it looks like the style preload (off of nsHtml5LoadFlusher::Run calling FlushSpeculativeLoads) happens before the document charset is set (that comes via nsHtml5ExecutorFlusher::Run calling nsHtml5TreeOpExecutor::RunFlushLoop).

Is that sort of expected?  I can add the current document charset charset to the hashtable key, I suppose, but it's a little weird to have the preload start (and presumably get as far as OnDetermineCharset, so the data is already partially available!) before the charset is set on the document.
Status: UNCONFIRMED → NEW
Component: Style System (CSS) → HTML: Parser
Ever confirmed: true
QA Contact: style-system → parser
(In reply to comment #8)
> Henri, it looks like the style preload (off of nsHtml5LoadFlusher::Run
> calling FlushSpeculativeLoads) happens before the document charset is set
> (that comes via nsHtml5ExecutorFlusher::Run calling
> nsHtml5TreeOpExecutor::RunFlushLoop).
> 
> Is that sort of expected?

Given the code, one can expect this to happen, but this is an implementation error on my part. eTreeOpSetDocumentCharset shouldn't be a tree op in the first place. It should be in the preload op queue instead.
Assignee: nobody → hsivonen
Created attachment 550040 [details] [diff] [review]
Move charset setting to the other op queue

I don't see the problem in action on my computer, but by code inspection, the bug is clearly there. Due to the race condition nature of the bug, there isn't a way to write a deterministic test case for this.

Anyway, let's use this fix if the tryserver is OK with it.
Attachment #550040 - Flags: review?(bzbarsky)

Comment 11

6 years ago
I can in fact reproduce the problem, and the attached patch fixes it.

Comment 12

6 years ago
Comment on attachment 550040 [details] [diff] [review]
Move charset setting to the other op queue

>+++ b/parser/html/nsHtml5SpeculativeLoad.cpp
>+        aExecutor->SetDocumentCharsetAndSource(narrowName,
>+                                             intSource);

Fix indent, please?

>+++ b/parser/html/nsHtml5SpeculativeLoad.h
>+     * queue as opposed to tree operation queue is that the manifest must get

"manifest"?  You mean "charset change"?

>+     * processed before any actually speculative loads such as style sheets.

s/actually/actual/ ?

>+     * If mOpCode is eSpeculativeLoadImage, this is the value of the
>+     * "crossorigin" attribute.  If mOpCode is eSpeculativeLoadStyle
>+     * or eSpeculativeLoadScript then this is the value of the
>+     * "charset" attribute.  Otherwise it's empty.

This needs fixing for the new eSpeculativeLoadSetDocumentCharset opcode.

r=me with those nits.
Attachment #550040 - Flags: review?(bzbarsky) → review+
Thanks. Pushed with the nits fixed:
http://hg.mozilla.org/integration/mozilla-inbound/rev/8e4e181a9ce9
Status: NEW → ASSIGNED
Whiteboard: [inbound]
Landed followup to fix maemo build bustage:
https://bugzilla.mozilla.org/show_bug.cgi?id=675499
e.g.
http://tinderbox.mozilla.org/showlog.cgi?log=Mozilla-Inbound/1312310251.1312310578.31642.gz
> /builds/slave/m-in-lnx-maemo5-gtk/build/parser/html/nsHtml5SpeculativeLoad.h:54: error: comma at end of enumerator list
http://hg.mozilla.org/mozilla-central/rev/8e4e181a9ce9
http://hg.mozilla.org/mozilla-central/rev/c1f5f3220d2f
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla8
You need to log in before you can comment on or make changes to this bug.