Closed
Bug 275650
Opened 20 years ago
Closed 17 years ago
grammatical-evolution.org - Mozilla fails to render page, just shows raw HTML, I.E. renders page OK
Categories
(Tech Evangelism Graveyard :: English US, defect)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: xanthian, Unassigned)
References
()
Details
Attachments
(4 files)
User-Agent: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.8a6) Gecko/20041127
Build Identifier: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.8a6) Gecko/20041127
For the subject URL, the Mozilla browser begins to render the page, but quickly reverts to just displaying the raw HTML code. The same page displays appropriately rendered when Internet Explorer is used instead. Reloading the page several times had no different effect.
Reproducible: Always
Steps to Reproduce:
1. Open URL
Actual Results:
After a brief flash of the header of the rendered page, the browser reverts to just displaying the HTML. A brief glance at the raw code shows no obvious problems.
Expected Results:
Rendered the page or showed an error message saying why it could not do so.
Nothing fancy going on, this is a fairly new download, with a theme that has caused no problems in months of use.
Comment 1•20 years ago
WFM 20041220 PC/WinXP
WFM Mozilla 1.8a6 2004121205 on WinNT4
Version: unspecified → Trunk
Comment 3•20 years ago
Also cannot reproduce in a current trunk build on Linux....
| Reporter | ||
Comment 4•20 years ago
(In reply to comment #3)
> Also cannot reproduce in a current trunk build on Linux....
Okay, I've since upgraded to nightly build 2005011605 on MS-Win98SE, and the bug is still reproducible for me. I wish I could do a movie for you; the page starts to render, renders the visible window full (and presumably keeps parsing and rendering "offscreen"), the rendering finds something it cannot digest, and the page is repainted top down with the raw HTML text. Attempts later to reload get just the raw HTML seen, probably as I'd expect if I understood the caching better.
This in a freshly rebooted system in which I'd only done my DSL connection, and opened KevTerm to read the email copy of your remark, before trying this test, so I don't think the odds that something is corrupted in fragile Win98 before this test run are too high, given that the error recurs with different startups.
I'm very suspicious that there is indeed something broken in the HTML on that page, but I've lost track of (or perhaps invented the existence of) the W3C page that proofreads HTML for you, so I don't have an automated way to "parse it for errors", that would indicate the location of the error, and the page is a bit big to proofread accurately by eye.
Of course, Mozilla should fail more gently than this even if the HTML is in error, I'd think, preferably with an error note, either separately via a message box to the user or else embedded in the rendered page.
Let me know if there's more info I can supply.
FWIW
xanthian.
Comment 5•20 years ago
Kent, do you have HTTP pipelining turned on, by any chance? As for the W3C HTML validator, it's at http://validator.w3.org/ but I agree that we shouldn't be displaying garbage even on somewhat invalid HTML... ;)
| Reporter | ||
Comment 6•20 years ago
(In reply to comment #5)
Hi, Boris.
> Kent, do you have HTTP pipelining turned on, by
> any chance?
Not unless Mozilla comes with it enabled "out of the box"; I had to go read your FAQ even to know what the term meant. I surely haven't played with wherever it is controlled.
> As for the W3C HTML validator, it's at
> http://validator.w3.org/
Thanks for that, at least I didn't imagine it out of whole cloth.
> but I agree that we shouldn't be displaying
> garbage even on somewhat invalid HTML... ;)
Wouldn't that be an ideal world, though! If you've read Hofstadter's Goedel, Escher, Bach, and the record player parable there, you know what the odds of that ever happening look like.
xanthian.
Comment 7•20 years ago
OK. Kent, if you save the file to disk (using wget or another browser or something), can Mozilla load it from disk correctly? If it can, could you do an HTTP log of the failing pageload per the instructions at http://www.mozilla.org/projects/netlib/http/http-debugging.html and attach it to this bug?
| Reporter | ||
Comment 8•20 years ago
Per request from Boris, per instructions at link he provided.
| Reporter | ||
Comment 9•20 years ago
Just a screen capture to show this bug is "real", since it is a "WFM" on some testers' systems, but not on my MS-Win98SE system.
| Reporter | ||
Comment 10•20 years ago
(In reply to comment #7)
> OK.
> Kent, if you save the file to disk (using wget or
> another browser or something), can Mozilla load it
> from disk correctly?
I downloaded a local copy with wget, and modulo that the image local references no longer worked for the two images on the page, the page rendered correctly (to the eye) from my local copy.
The two images are located toward the top of the page, and though when rendering from the remote copy is failing, events happen too fast (a fraction of a second) to see whether they are displayed, I suspect they are not involved in the failure, simply because successful rendering goes well past that location before failing. That may not be a correct inference on my part, though.
To make sure that the problem hadn't changed out from under my testing, I also then went back and confirmed that the failure still occurred as previously described, when the remote copy was accessed, and such is the case.
> If it can, could you do an HTTP log of the failing
> pageload per the instructions at
> http://www.mozilla.org/projects/netlib/http/http-debugging.html
> and attach it to this bug?
Okay, I have done this, cutting and pasting from a non-mozilla copy of the instructions to assure I got it correct (kudos to the author, it worked as described) and will attach log.txt before adding this comment.
The "instrumented" version didn't behave exactly the same, though, there was no "flash of a rendered version", only the final display of the raw HTML. It would be good if whoever reviews log.txt can check that some local cached/already (not)rendered version was not the source of the different behavior, if I said that so it makes sense.
Since this bug doesn't reproduce for everyone, I am also attaching a screen capture of the failure to confirm its reality.
And by the way, with the logging no longer occurring, when I made that screen capture, again there was no "flash of rendered text" as in the first access attempt, just a "straight to the raw HTML" display.
HTH
xanthian.
Comment 11•20 years ago
The log shows us just reading from the cache.... Does clearing cache and then loading the site show the problem too?
| Reporter | ||
Comment 12•20 years ago
(In reply to comment #5)
> As for the W3C HTML validator, it's at
> http://validator.w3.org/
I ran that validator on the failing page, it told me two things:
"No Character Encoding Found! Falling back to UTF-8. I was not able to extract a character encoding labeling from any of the valid sources for such information. Without encoding information it is impossible to reliably validate the document. I'm falling back to the "UTF-8" encoding and will attempt to perform the validation, but this is likely to fail for all non-trivial documents."
and
"Sorry, I am unable to validate this document because on line 176 it contained one or more bytes that I cannot interpret as utf-8 (in other words, the bytes found are not valid values in the specified Character Encoding). Please check both the content of the file and the character encoding indication."
A look at the indicated line of my local copy with vim() shows nothing scarier than a couple of raw umlauted letter "a"s (which vim() is happy to render as intended using whatever glyph code sheet is its default). There may be something invisible, but more than likely, this is what choked Mozilla. That also is far enough down the document to be out of sight in the original partial rendering attempt, so it is consistent with the evidence of the failure.
Granted, the page author should have used HTML entities rather than the raw non-ASCII characters from his/her European keyboard, but is this an expected "Mozilla killer"?
FWIW
xanthian.
Re: comment 11
That's what I was afraid of; let me try again after some puttering around to get back to a test situation.
xanthian.
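To illustrate the two points in comment 12 (the missing encoding label, and the suggestion to avoid raw non-ASCII characters), here is a minimal Python sketch. The helper names `declared_charset` and `to_ascii_with_char_refs` and both sample byte strings are hypothetical and not taken from the page in question. The regex only looks for a `charset=` label near the top of the document; the validator itself presumably also consults the HTTP headers and other sources, which this sketch does not.

```python
import re

# Hypothetical helper: look for a "charset=" label near the top of an HTML file.
_CHARSET_RE = re.compile(rb"""charset\s*=\s*["']?([A-Za-z0-9_.:-]+)""", re.IGNORECASE)


def declared_charset(html_bytes, scan_limit=1024):
    """Return the lowercase charset named in the first scan_limit bytes, or None."""
    match = _CHARSET_RE.search(html_bytes[:scan_limit])
    return match.group(1).decode("ascii").lower() if match else None


def to_ascii_with_char_refs(text):
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Illustrative inputs only (assumptions, not the real page content).
unlabelled = b"<html><head><title>Publications</title></head><body>"
labelled = b'<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">'

print(declared_charset(unlabelled))          # None
print(declared_charset(labelled))            # iso-8859-1
print(to_ascii_with_char_refs("Jyväskylä"))  # Jyv&#228;skyl&#228;
```

Either fix on the author's side (declaring the encoding, or writing the umlauts as character references) would leave the browser with nothing to guess.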
| Reporter | ||
Comment 13•20 years ago
Re: comment 11 But before I go... Even read out of cache, the rendering fails, while when read from a local copy, it succeeds. Why the difference? Moreover, the failure mode changes when read from cache. Please tell me you don't cache the already (failed to be-)rendered version! My comments online and off about bug 271239 are just dying to re-erupt.
Comment 14•20 years ago
We cache the exact bytes the server sent. So if the server sent "bogus" bytes once, we'd cache it till the expiration time for that data.
| Reporter | ||
Comment 15•20 years ago
(In reply to comment #14)
> We cache the exact bytes the server sent. So if the server sent "bogus" bytes
> once, we'd cache it till the expiration time for that data.
Well, that makes me quite a bit happier, thanks.
xanthian.
| Reporter | ||
Comment 16•20 years ago
This looks much nicer, being almost six times as long, but the failure mode still omitted the initial partial rendering, so something remains changed from the failure under a completely clean start; was edit=>preferences=>advanced=>cache=>clear cache;OK insufficient? Sorry for the delay returning this, but something, perhaps Mozilla, since that's what I was using at the time, forced me to reboot [twice, after the first reboot neither Mozilla nor my DSL connection would come up]. Playing with Mozilla can be hazardous to one's free time.
Comment 17•20 years ago
No, clearing the cache should be sufficient (and this shows HTTP requests being made). Looks like we're getting ok data from the site, though. If I may ask, what does View > Character Coding say for you on this page? If it doesn't show ISO-8859-1, what are your charset autodetect settings?
| Reporter | ||
Comment 18•20 years ago
(In reply to comment #17)
> No, clearing the cache should be sufficient (and
> this shows HTTP requests being made). Looks like
> we're getting ok data from the site, though.
Then why no "flash of rendered version" even after I did that? Shouldn't the behavior have reverted to "first time seen" behavior?
> If I may ask, what does View => Character Coding
> say for you on this page? If it doesn't show
> ISO-8859-1, what are your charset autodetect
> settings?
If I understand what I'm seeing, the button with center filled is opposite "Western Windows 1252", and the filled button in the charset autodetect menu is (!!!) Japanese!
I didn't do that, I don't read Japanese, and no one has been in the house since I last replaced my current nightly build, but some few weeks ago a Japanese friend who visits regularly installed into the MS-Windows environment Japanese character ability so he could use his Hotmail/Yahoo! Mail Japanese account and read the text here.
Would that have changed a browser setting, or would he have been instructed to change a browser setting? (why), and would that have lingered (how, though I probably can guess in a general sense), and would that have led to other pages being broken on arrival despite that they were _not_ Japanese (oops)?
Remember also the one complaint from the W3C validator that there wasn't a source of character set information, which is independent of my whole browser setup (if that complaint makes sense at all), and thus shouldn't depend on my settings, though perhaps my settings are indeed making that lack unsurvivable for Mozilla's rendering pass.
Curiouser and curiouser. Nice query, though, I'd _never_ have gone looking there.
Oh, and if I set that to something else, what should it be, and will doing so break his ability to read his Kanji(?) email?
xanthian.
| Reporter | ||
Comment 19•20 years ago
(In reply to comment #18)
Okay, not to sit here passive, I changed the settings to
view => character encoding => autodetect => off
cleared the cache, reloaded, and the problem remained; then I additionally changed
view => character encoding => Western (ISO-8859-1)
again cleared the cache, reloaded, and the page now displayed correctly!
Thus, the settings you asked about were indeed the immediate cause of the symptoms.
Now, is that purely a "user error", not a bug, or should Mozilla:
1) Have rendered the page correctly anyway, or
2) Failed to render the page correctly, but put up a useful message explaining why?
If a "user error", not a bug, how does the user who needs dual language use avoid that error?
Since Mozilla with those settings has been rendering other pages without problems, I assume that the non-ASCII raw text on that page is interacting with the character encoding settings in an unhappy way. Should that be the case? Should that case elicit a "check your character encoding settings" prompt?
HTH
xanthian.
| Reporter | ||
Comment 20•20 years ago
(In reply to comment #12)
> A look at the indicated line of my local copy with
> vim() shows nothing scarier than a couple of raw
> umlauted letter "a"s (which vim() is happy to render
> as intended using whatever glyph code sheet is its
> default).
I went back to check, and the word "Jyväskylä" with the two raw text umlauted "a"s was indeed correctly rendered on the page, specifically.
HTH
xanthian.
Comment 21•20 years ago
> or would he have been instructed to change a browser setting?
He almost certainly changed the browser setting, yes.
So what's happening here is that the page says absolutely nothing about the
encoding it's in. Since autodetect is enabled, and set to Japanese, when we hit
non-ASCII bytes we try to guess which exact Japanese encoding the page is in by
looking at the non-ASCII byte pattern and picking the Japanese encoding that is
most likely to have that byte pattern. Then we reparse the page from the
beginning (this is the "flash" thing you saw), in this new encoding.
It seems that the particular encoding in question (the one that was most closely
resembled by the decidedly non-Japanese text on the page) leads to garbage
display in this case, most likely because a '<' char somewhere is treated as
part of a character in this encoding or something along those lines.
I'm afraid this is in fact user error -- the user has told us to try decoding
all pages as Japanese unless the pages say otherwise, and we're doing our best
to do that...
ccing some intl folks to confirm, though.
Comment 22•20 years ago
I'll look at this more closely in a little while, but I notice right off that the server is sending a header "Content-Type: text/plain", which can't be helping :)
Comment 23•20 years ago
Hmm. this is very odd. The page in question has only two 8bit octets (with MSB
set), both of which are 0xE4. They're followed by non-8bit octets('s' and ','),
which are invalid in ISO-2022-JP and EUC-JP. Because of that, they're rendered
as 'an invalid char' when the encoding is set to EUC-JP or ISO-2022-JP. In
Shift_JIS, they're valid and rendered as Japanese Kanjis as expected.
With the cache cleared and Japanese autodetector on, however, I could reproduce
the problem. If earlier on it was detected as EUC-JP / ISO-2022-JP, 0xE4 + 's'
and 0xE4 + '.' should be just rendered as 'an invalid char'. If it was
Shift_JIS, everything is valid...
Simon, how did you get a 'text/plain' response? attachment 171927 and the result I
got at 'http://websniffer.org' have 'Content-Type: text/html'. If the server
'sometimes' emits 'text/plain' (although most times, it emits 'text/html'), that
may explain what we have here.
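The byte-level argument in comment 23 can be checked with a short Python sketch. The sample bytes are an assumption: the word "Jyväskylä" followed by a comma, encoded as ISO-8859-1, so that it contains two 0xE4 octets each followed by a 7-bit octet. Python's codecs are not Mozilla's decoders, so this only shows which encodings reject the sequence outright; the helper name `try_decode` is hypothetical.

```python
# Assumed sample: ISO-8859-1 bytes containing two 0xE4 octets,
# the first followed by 's' and the second by ','.
SAMPLE = "Jyväskylä,".encode("latin-1")


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


for name in ("latin-1", "euc_jp", "iso2022_jp", "shift_jis"):
    result = try_decode(SAMPLE, name)
    shown = repr(result) if result is not None else "invalid byte sequence"
    print(name, "->", shown)
```

With these codecs, the ISO-8859-1 decode yields the intended text, while EUC-JP and ISO-2022-JP reject the 0xE4 sequences, which matches the "invalid char" rendering described above. The Shift_JIS line is printed without a stated expectation, since the result depends on the codec's mapping table and on which octet follows each 0xE4.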
Comment 24•20 years ago
(In reply to comment #23)
> If the server
> 'sometimes' emits 'text/plain' (although most times, it emits 'text/html'), that
> may explain what we have here.
This seems to be the case. First load of http://sniffuri.org/view.cgi?url=http%3A%2F%2Fwww.grammatical-evolution.org%2Fpubs.html gives me text/html in the headers, but reloading gives text/plain. I'm guessing this is some misconfiguration on the server which is being triggered by the reload that can occur with charset auto-detection.
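The check described in comment 24 (load the same URL more than once and compare the Content-Type headers) could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, and plain repeated requests may not reproduce whatever the browser's charset-triggered reload sends, so a consistent result here would not rule out the server misconfiguration.

```python
import urllib.request


def media_type(content_type_header):
    """Return the bare media type, e.g. 'text/html' from 'text/html; charset=x'."""
    return content_type_header.split(";", 1)[0].strip().lower()


def inconsistent_types(header_values):
    """Return the sorted distinct media types if there is more than one, else []."""
    seen = sorted({media_type(value) for value in header_values})
    return seen if len(seen) > 1 else []


def fetch_content_types(url, attempts=3):
    """Request the URL several times and collect each Content-Type header.

    Needs network access; it is defined here but not called.
    """
    values = []
    for _ in range(attempts):
        with urllib.request.urlopen(url) as response:
            values.append(response.headers.get("Content-Type", ""))
    return values


# Offline demonstration with the two values reported in this bug.
print(inconsistent_types(["text/html", "text/plain"]))
```

Calling `inconsistent_types(fetch_content_types(url))` against a live server would return a non-empty list only if the responses disagreed on the media type.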
Comment 25•20 years ago
Ah, yes. Over to evangelism.
Assignee: general → english-us
Status: UNCONFIRMED → NEW
Component: General → English US
Ever confirmed: true
Product: Mozilla Application Suite → Tech Evangelism
QA Contact: general → english-us
Summary: Mozilla fails to render page, just shows raw HTML, I.E. renders page OK → grammatical-evolution.org - Mozilla fails to render page, just shows raw HTML, I.E. renders page OK
Version: Trunk → unspecified
| Reporter | ||
Comment 26•20 years ago
(In reply to comment #25)
> Ah, yes.
> Over to evangelism.
Well, thanks for that at least, I was expecting an INVALID! So despite it being a user-self-inflicted problem, it's still interesting. That's good, Boris, all your efforts haven't been a waste.
xanthian.
| Reporter | ||
Comment 27•20 years ago
(In reply to comment #21)
> So what's happening here is that the page says
> absolutely nothing about the encoding it's in.
> Since autodetect is enabled, and set to Japanese,
> when we hit non-ASCII bytes we try to guess which
> exact Japanese encoding the page is in by looking
> at the non-ASCII byte pattern and picking the
> Japanese encoding that is most likely to have that
> byte pattern. Then we reparse the page from the
> beginning (this is the "flash" thing you saw), in
> this new encoding.
Hmm, later comments indicate that there is also some interaction with the server's claim for what it is sending, so once it stops claiming to be sending text/html, and starts claiming to be sending text/plain, the initial attempt to render (the "flash") doesn't happen any more. It would be of some interest to see whether there is _ever_ an attempt to render when the file is first downloaded by some non-Mozilla means, and then accessed from local disk, there being then no "server" involved, and since as noticed the file contains no internal typifying of its own contents.
> It seems that the particular encoding in question
> (the one that was most closely resembled by the
> decidedly non-Japanese text on the page) leads to
> garbage display in this case, most likely because
That's a bit misleading, no biggie, but it isn't displaying "garbage" (except possibly offscreen where I don't see it, and depending as noted in another comment perhaps on which flavor of "Japanese" encoding it guesses it is seeing). Instead, what ends up finally on the screen is just the raw text of the HTML code, no "garbage" at all, even where the two offending umlauted "a"s are.
> a '<' char somewhere is treated as part of a
> character in this encoding or something along
> those lines.
Such seems to be the case.
> I'm afraid this is in fact user error -- the user
> has told us to try decoding all pages as Japanese
> unless the pages say otherwise, and we're doing
> our best to do that...
Sigh. Sorry for the inconvenience, then.
I'm still a bit promoting that Mozilla should comment on what went wrong, though, when it decided not to render the page at all, just display the raw HTML, after detecting from the contents that the page _was_ HTML (if such is actually the case), so the user has a clue.
Something like "You told me to interpret this web page as [Japanese, e.g.], but then when I did that, some of the characters came out invalid in that encoding, so I'm just displaying the page uninterpreted as a fallback." would have been nice, let the user know _why_ what happened had happened, pointed the user mildly toward a fix ("what do you _mean_, 'Japanese'?!?") and avoided this bug report. Mozilla going into a failure mode fallback silently isn't as helpful.
One might want to condition such warnings on a "verbose" preferences setting, but there's then an issue of how to make the naive user aware that enabling that setting is a good first start when Mozilla does "spooky stuff".
> ccing some intl folks to confirm, though.
Thanks, since that seems to have raised other issues among folks more familiar with the internationalization pitfalls.
xanthian.
Comment 28•20 years ago
> "You told me to interpret this web page as
> [Japanese, e.g.], but then when I did that,
> some of the characters came out invalid in
> that encoding, so I'm just displaying the
> page uninterpreted as a fallback."
But that's not what happened. Mozilla does not do that. it's probably the fact
that the server sometimes sends the page as text/plain, as mentioned above...
Comment 29•20 years ago
re comment #27
You're misinterpreting what we figured out (comment #23 and comment #24). bz's analysis in comment #21 was not quite right. It's NOT invalid octets that make Mozilla render HTML as 'text/plain'. If Mozilla did that, it would be a very serious bug and we should fix it.
| Reporter | ||
Comment 30•20 years ago
(In reply to comment #28)
> But that's not what happened. Mozilla does not do
> that. it's probably the fact that the server
> sometimes sends the page as text/plain, as
> mentioned above...
Well, no, that wouldn't explain why the failure still occurred when the HTML was being read from a local copy I downloaded with wget() per Boris' request, still showing me the raw HTML, just without the prior rendered flash.
Whatever the problem is, the behavior of showing the raw HTML is associated with the file contents, I just turned the "Japanese" stuff back on and checked other simple and complex local HTML pages that don't have a text type declaration, and they render without problems, as expected.
The behavior of _flashing first_ does seem to be "server" related, since I cannot duplicate that at all from the local copy of the same web page.
FWIW
xanthian.
Comment 31•20 years ago
I can't reproduce the problem if the file (pubs.html) is on my local disk or on my server ( http://jshin.net/moztest/275650.html ).
Comment 32•17 years ago
(In reply to comment #31)
> Created an attachment (id=172079)
> the html file in question
>
> I can't reproduce the problem if the file (pubs.html) is on my local disk or on
> my server ( http://jshin.net/moztest/275650.html ).
ditto. and the URL
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.2) Gecko/20070222 SeaMonkey/1.1.1 XpcomViewer/0.8.9
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → WORKSFORME
Updated•10 years ago
Product: Tech Evangelism → Tech Evangelism Graveyard