Closed
Bug 272815
Opened 20 years ago
Closed 8 years ago
On page reloads, MS quotes alternately rendered, butchered
Categories
(Core :: Networking: Cache, defect)
Core
Networking: Cache
Tracking
()
RESOLVED
FIXED
People
(Reporter: klbehren, Unassigned)
References
()
Details
Attachments
(1 file)
3.62 KB,
patch
|
bzbarsky
:
review+
jst
:
superreview+
|
Details | Diff | Splinter Review |
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0 Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0 http://evergreenleaders.org/elcc/tiki-newsletters.php?nlId=7&editionId=2 uses a UTF-8 character set and includes Microsoft smart quotes. These include fancy left and right slanted quotation marks and stylized apostrophes. Such characters are often present in content which originated in MS Word. Their character codes are as follows: apostrophe: 92h, 146 decimal open quote: 93h, 147 decimal close quote: 94h, 148 decimal Upon the initial page load, Firefox fails to render the smart quotes. Instead it interprets the UTF-8 code as three separate characters: apostrophe displays as ’ open quote displays as “ close quote displays as †However, a FireFox-induced page reload causes the smart quotes to render legibly. In subsequent reloads, the page alternates predictably between rendered and butchered smart quotes. In all cases, FireFox page source displays all quotes legibly. A file compare indicates that reloading does not change any relevant page source content. Reproducible: Always Steps to Reproduce: 1. load URL 2. observe butchered smart quotes 3. click FireFox "Reload" button 4. observe pretty smart quotes 5. click FireFox "Reload" button 6. observe butchered smart quotes 7. click FireFox "Reload" button 8. observe pretty smart quotes 9. repeat as desired Actual Results: Microsoft smart quotes were butchered, then rendered, then butchered, then rendered. Expected Results: Render the Microsoft smart quotes legibly on the first page load and all subsequent reloads.
Summary: On repeated reloads, MS smart quotes alternate being rendered or butchered → On page reloads, MS quotes alternately rendered, butchered
Updated•20 years ago
|
Assignee: firefox → nobody
Component: General → Layout
Product: Firefox → Core
QA Contact: firefox.general → core.layout
Version: unspecified → Other Branch
Comment 1•20 years ago
|
||
The page is seriously broken - it contains two <html> blocks, one of them containing a <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> tag, and the other a <meta http-equiv=Content-Type content="text/html; charset=windows-1252">. Still, Firefox's behaviour seems weird.
(In reply to comment #1) The outer <html> tag was generated by my CMS. The inappropriate inner tag is from user-stored content (the entire contents of a MS Word html file). Note also that the URL I included in the description is incomplete. It should read: http://evergreenleaders.org/elcc/tiki-newsletters.php?nlId=7&editionId=204
Comment 3•20 years ago
|
||
Not a layout issue... over to networking for a start, but this could also be parser. When we get into the parser's first OnDataAvailable, loading from cache, the amount of data available is 2 or 3 bytes (the amount varies). So the parser naturally doesn't find that first meta tag in that 2-3 byte chunk, which ends up relying on the meta charset observer to set the charset; it's not quite clear to me why that's screwing up and not seeing the first meta tag. Oh, and I can't seem to reproduce this in a debug build, of course.... :( Over to necko for now, I guess -- it'd be nice to get more than 2 bytes out of the cache. Blake, do you have time to look at this from the charset observer end?
Assignee: nobody → darin
Status: UNCONFIRMED → NEW
Component: Layout → Networking: Cache
Ever confirmed: true
QA Contact: layout → networking.cache
Updated•20 years ago
|
OS: Windows 2000 → All
Hardware: PC → All
Version: Other Branch → Trunk
Comment 4•20 years ago
|
||
Taking. I will have a wallpaper over this (very strange cache behavior) from the parser's end sometime soon.
Assignee: darin → mrbkap
Target Milestone: --- → mozilla1.8beta2
Updated•20 years ago
|
Status: NEW → ASSIGNED
Comment 5•20 years ago
|
||
This is a slightly dirty way to fix this bug and it would be nice to know why the cache was returning two bytes in the first chunk. Basically, on the first load, we would see the first <meta> tag and process the charset. This meant that on the first load, we'd always get the right charset. However, on subsequent loads from the cache, we'd ignore the first <meta> tag, since the charset was already the same as the one we'd obtained from the cache (and there was no reason to reload). However, this meant that we'd never notice the first charset, so when we came across the second charset, we'd use that, thinking that our only charset hint was from the cache. With this patch, we notify the parser that we've seen a meta tag giving us charset information, so we pass over the second <meta> tag.
Attachment #174645 -
Flags: review?(bzbarsky)
Comment 6•20 years ago
|
||
Comment on attachment 174645 [details] [diff] [review] patch v1 [checked in] r=bzbarsky
Attachment #174645 -
Flags: review?(bzbarsky) → review+
Comment 7•19 years ago
|
||
Comment on attachment 174645 [details] [diff] [review] patch v1 [checked in] Darin, can you sr?
Attachment #174645 -
Flags: superreview?(darin)
Comment 8•19 years ago
|
||
Comment on attachment 174645 [details] [diff] [review] patch v1 [checked in] punting sr request to jst. i don't know this code.
Attachment #174645 -
Flags: superreview?(darin) → superreview?(jst)
Comment 9•19 years ago
|
||
If the cache tells you the charset, then it means that you analyzed the content previously and determined that that is the best charset. Why do you need to analyze the stream again when you have a hint from the cache? It should be the correct charset if it was the correct charset last time. If you trust the charset from the cache, then you don't have to worry about the 2 bytes of data from the cache. That 2 bytes could come for any number of reasons. Maybe we got a context switch at an inopportune moment while reading from the cache file (on the background thread). NOTE: the act of writing to the nsPipe causes an event to be dispatched to the UI thread, which could trigger a context switch immediately. so, this scenario is possible. we could say that we will delay first ODA until we have at least 4k of data (or the entire document), but that shouldn't be necessary for things to function properly.
Comment 10•19 years ago
|
||
it is the parser that sets this value on the cache btw. it owns this entire process :)
Comment 11•19 years ago
|
||
ok, upon further investigation it appears that the HTML for the site is served with "cache-control: no-store" so i don't believe that our cached copy is ever used. it would be a major bug if it were used.
Comment 12•19 years ago
|
||
Darin, for the reason that we re-detect the charset (over the cache), see bug 81253 comment 75 and beyond. The short of it is a worry that if a user turns off charset autodetection and we guess wrong, turning autodetection back on should detect the charset, even if we have a cached charset. Simon, is this still an issue? I also set a breakpoint in nsHTMLDocument::TryCacheCharset(), and after the first load, I invariably see the function returning PR_TRUE (and a real charset), so it looks like the cache is, in fact, storing the charset despite the headers. I'm not sure where the rest of the data comes from.
Comment 13•19 years ago
|
||
Comment on attachment 174645 [details] [diff] [review] patch v1 [checked in] sr=jst
Attachment #174645 -
Flags: superreview?(jst) → superreview+
Comment 14•19 years ago
|
||
The attachment 174645 [details] [diff] [review] has been checked in. Giving the bug back to Darin, so he can look at the charset issue. Darin, Boris was wondering on IRC what it means for the cache entry to exist and have a charset (and whether cache entries are reused).
Assignee: mrbkap → darin
Status: ASSIGNED → NEW
Comment 15•19 years ago
|
||
This bug might give a clue about the cause of bug 278571. I failed to reproduce it the other day with a debug build running within a gdb session, but I'll take another look.
Updated•19 years ago
|
Priority: -- → P4
Updated•19 years ago
|
Target Milestone: mozilla1.8beta2 → mozilla1.9alpha
Comment 16•18 years ago
|
||
-> default owner
Assignee: darin → nobody
Target Milestone: mozilla1.9alpha → ---
Updated•18 years ago
|
Priority: P4 → --
Updated•17 years ago
|
Attachment #174645 -
Attachment description: patch v1 → patch v1 [checked in]
Comment 17•8 years ago
|
||
this patched is checked in (c14) and other than that seems to have bitrotted..
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•