Closed Bug 501953 Opened 15 years ago Closed 15 years ago

[HTML5][Patch] Content encoding error (invalid or unsupported form of compression) at http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-June/020620.html

Categories

(Core :: DOM: HTML Parser, defect, P2)

RESOLVED FIXED

People

(Reporter: u88484, Unassigned)

Details

With HTML5.enabled set to true, I get the following XUL error page when loading http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-June/020620.html: "Content Encoding Error. The page you are trying to view cannot be shown because it uses an invalid or unsupported form of compression."
OS: Windows Vista → All
Hardware: x86 → All
This seems like a Necko-layer error. How could a stream listener do things wrong and trigger this on the lower layer?
This happens with or without the HTML5 parser and can be resolved by clearing the cache. This seems to be a common occurrence with the trunk and the WHATWG list archives. Likely dupe of bug 453988, bug 366023 or bug 469352.
Component: HTML: Parser → Networking: HTTP
QA Contact: parser → networking.http
Summary: [HTML5] Content encoding error (invalid or unsupported form of compression) at http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-June/020620.html → Content encoding error (invalid or unsupported form of compression) at http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-June/020620.html
Henri, yes I've seen reports of this without the HTML5 parser but I can only get that page to do that with HTML5 parser enabled. Disabling it, I can access the page with no problem.
> How could a stream listener do things wrong and trigger this on the lower layer
It couldn't.
Marking blocking1.9.2?, because if Firefox ships with this bug, site admins may feel that gzip encoding is poisoned for a long time.
Flags: blocking1.9.2?
This needs steps to reproduce. I got the error precisely once when loading this page. Ever since then it's been fine. Is this reliably reproducible for someone?
I have only unreliable steps to reproduce: Navigating forward in the WHATWG list archives until this bug is hit.
Once you hit it, do you reliably get the error for that page unless you reload or shift-reload?
OK. Steps to reproduce (on Mac; adjust as needed for other OSes):
1) rm -r /tmp/test-profile
2) mkdir /tmp/test-profile
3) env NSPR_LOG_MODULES=nsHttp:5 NSPR_LOG_FILE=/Users/bzbarsky/log.txt firefox -profile /tmp/test-profile about:blank
4) Load about:config
5) Toggle the HTML5 parser pref to true
6) Load the URI from this bug

I see us making a GET request for the page; the server responds with:

    -1340452864[545020]: HTTP/1.1 200 OK
    -1340452864[545020]: Date: Fri, 04 Sep 2009 13:43:17 GMT
    -1340452864[545020]: Server: Apache
    -1340452864[545020]: Last-Modified: Wed, 01 Jul 2009 04:42:20 GMT
    -1340452864[545020]: Etag: "34190e3-9009-4a4ae92c"
    -1340452864[545020]: Accept-Ranges: bytes
    -1340452864[545020]: Keep-Alive: timeout=2, max=100
    -1340452864[545020]: Connection: Keep-Alive
    -1340452864[545020]: Content-Type: text/html
    -1340452864[545020]: Content-Encoding: gzip
    -1340452864[545020]: Content-Length: 13162

Then I see the channel get canceled (not sure why yet) and we do a second GET on the URI, this time with "Range: bytes=1149-" (because that's how much we now have in the cache). The server responds with:

    -1340452864[545020]: HTTP/1.1 206 Partial Content
    -1340452864[545020]: Date: Fri, 04 Sep 2009 13:43:18 GMT
    -1340452864[545020]: Server: Apache
    -1340452864[545020]: Last-Modified: Wed, 01 Jul 2009 04:42:20 GMT
    -1340452864[545020]: Etag: "34190e3-9009-4a4ae92c"
    -1340452864[545020]: Accept-Ranges: bytes
    -1340452864[545020]: Content-Length: 35724
    -1340452864[545020]: Content-Range: bytes 1149-36872/36873
    -1340452864[545020]: Keep-Alive: timeout=2, max=100
    -1340452864[545020]: Connection: Keep-Alive
    -1340452864[545020]: Content-Type: text/html

Note that this response is not compressed, whereas the original response was. So we abort the load (see bug 247334). Note that this is a server bug, by the way. It would be rather nice to fix the broken server in this instance.
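To illustrate why the load has to be aborted, here is a minimal sketch of the kind of consistency check a cache could apply before appending a 206 body to a partially cached 200 body. This is not Necko's actual code; the function name `can_merge_partial` and the dictionary representation of headers are assumptions made for illustration. The header values are taken from the log above.

```python
def header(headers, name):
    """Case-insensitive header lookup; a missing header is treated as empty."""
    for key, value in headers.items():
        if key.lower() == name.lower():
            return value.strip()
    return ""


def can_merge_partial(cached_headers, partial_headers):
    """Return True if the 206 bytes can be appended to the cached 200 bytes.

    Hypothetical check: both responses must describe the same entity (same
    Etag) in the same representation (same Content-Encoding). Otherwise the
    byte offsets of one response mean nothing in the other.
    """
    same_entity = header(cached_headers, "Etag") == header(partial_headers, "Etag")
    same_encoding = (
        header(cached_headers, "Content-Encoding").lower()
        == header(partial_headers, "Content-Encoding").lower()
    )
    return same_entity and same_encoding


# Headers from the two responses in the log above.
first = {
    "Etag": '"34190e3-9009-4a4ae92c"',
    "Content-Encoding": "gzip",
    "Content-Length": "13162",
}
second = {
    "Etag": '"34190e3-9009-4a4ae92c"',
    "Content-Length": "35724",
    "Content-Range": "bytes 1149-36872/36873",
}

print(can_merge_partial(first, second))
```

In this case the Etag matches but the first response is gzip-encoded and the second is not, so the 1149 cached bytes (compressed) cannot be joined with bytes 1149-36872 of the uncompressed entity.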
The only interesting thing here is that the HTML5 parser might be triggering that second GET somehow whereas we didn't use to do that with the non-HTML5 parser. Checking why now. In any case, doesn't look like it blocks 1.9.2.
(In reply to comment #10)
> The only interesting thing here is that the HTML5 parser might be triggering
> that second GET somehow whereas we didn't use to do that with the non-HTML5
> parser. Checking why now.
Probable cause (unverified): Charset meta triggering the renavigation code path.
The meta doesn't fall within the first 512 bytes on this page.
Flags: blocking1.9.2?
nsHtml5Parser::PerformCharsetSwitch is what stops the old load and starts the new one. In the old parser, in this testcase, the charset source is set to kCharsetFromMetaTag by the SetDocumentCharset call in ParserWriteFunc. So when the meta charset observer sees the <meta> tag, it does nothing, hence no reload.

In the HTML5 parser we also have a <meta> prescan, but it is limited to 512 bytes. In this particular case, due to the long REL="made", the <meta> tag doesn't finish before 512 bytes. The 512th byte is the 's' in 'us-ascii'.

Obvious options include:
1) Somehow recover from this server bug instead of just reporting an error message.
2) Get the server bug fixed.
3) Doom the cache entry on charset reload, or otherwise forbid a Range request, so we don't do a Range request and don't trigger this server bug.
4) Raise the size of the HTML5 sniffing buffer so we don't have to reload in this case.
5) Don't reload on switch from ISO-8859-1 to us-ascii.

I rather like option 4 myself. Over to parser for now. If we decide that we absolutely must do option 1 above and we have no existing bug on that, put this back into HTTP-land.
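The effect of the prescan limit can be sketched as follows. This is a simplified illustration, not the HTML5 parser's real prescan algorithm: it uses a regular expression that only recognizes a complete <meta> tag, and the names `prescan_charset` and `limit` are assumptions. The sample page is synthetic, padded so that byte 512 falls on the 's' of 'us-ascii' as described above.

```python
import re

# Matches a complete <meta ...charset=NAME...> tag. The closing '>' is
# required, so a tag cut off by the prescan limit is not recognized.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)[^>]*>',
    re.IGNORECASE,
)


def prescan_charset(data, limit=512):
    """Return the charset declared within the first `limit` bytes, or None."""
    match = META_CHARSET.search(data[:limit])
    return match.group(1).decode("ascii").lower() if match else None


# Build a synthetic page whose long REL="made" link pushes the <meta> tag
# across the 512-byte boundary.
prefix = b'<html><head><link rel="made" href="mailto:'
meta = b'<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">'
meta_start = 511 - (meta.index(b"us-ascii") + 1)  # byte 512 is the 's'
padding = b"x" * (meta_start - len(prefix) - 2)
page = prefix + padding + b'">' + meta + b"</head><body></body></html>"

print(page[511:512])                      # the 512th byte
print(prescan_charset(page))              # default 512-byte limit
print(prescan_charset(page, limit=1024))  # option 4: a larger buffer
```

With the 512-byte limit the prescan finds nothing, so the charset is only discovered later by the tree builder, which is what triggers the reload. With a larger buffer (option 4) the declaration is found up front and no reload is needed for this page.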
Component: Networking: HTTP → HTML: Parser
QA Contact: networking.http → parser
Summary: Content encoding error (invalid or unsupported form of compression) at http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-June/020620.html → [HTML5]Content encoding error (invalid or unsupported form of compression) at http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-June/020620.html
4) is what WebKit does.
5) needs to happen in any case.
6) What the spec says: perform the renavigation so that the original HTTP response keeps writing to a cache entry and the renavigation reads from the cache entry without hitting the network again.
Ah, indeed. (6) might be interesting. Jason, you think we can add such an API to necko? How is (6) supposed to work when the content isn't cached, though? I guess just redo the request?
I guess we shouldn't depend on (6) for getting the HTML5 parser turned on by default. Is anyone seeing this elsewhere (that is, somewhere other than the WHATWG list archives)?
Priority: -- → P2
Henri, I haven't and the page I reported this on is now a 404 so hopefully you guys have enough information to fix this.
(In reply to comment #17)
> Henri, I haven't
Great. (I've seen this fairly recently when browsing the WHATWG archives.)
> the page I reported this on is now a 404 so hopefully you
> guys have enough information to fix this.
Yes, the browser side of this issue is now understood.
> I guess we shouldn't depend on (6) for getting the HTML5 parser turned on
Indeed.
We have had a similar issue with users of our website. The problem seems to involve Range responses that are gzip encoded. I used the HttpFox plugin and noticed that every time the error appeared, the request included Range and If-Range headers, and the response was 206 Partial Content (even though I was hard-refreshing). I'm not sure whether this is a bug in mod_deflate, whether range responses shouldn't be compressed, or whether the problem is in how Firefox handles these responses. To solve the problem we updated our Apache config to disable Range requests using mod_headers, since compression seems to provide more value:

    RequestHeader unset Range early
    RequestHeader unset If-Range early
    Header unset Accept-Ranges
Jeremy, that's unrelated to this bug. In general, though, if the 206 response is compressed when the 200 response is not, or vice versa, then you'd get this error (because there's no sane way to map from one to the other). That's almost certainly what you're running into.
It also happens on this site: http://www.station-drivers.com/ On even-numbered reloads the site displays correctly; on the first load and on odd-numbered reloads it displays wrong.
Yep, exactly. The meta tag that sets the charset starts at byte 905 or so.
I expect the patch for bug 545658 to paper over this. (Solution #4 from comment 13.)
Depends on: 545658
Summary: [HTML5]Content encoding error (invalid or unsupported form of compression) at http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-June/020620.html → [HTML5][Patch] Content encoding error (invalid or unsupported form of compression) at http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-June/020620.html
Looks fixed for me. Can someone confirm, so we can close it?
Oops. Sorry. I forgot to mark this fixed when I landed the patch for bug 545658.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED