Closed Bug 687295 Opened 13 years ago Closed 13 years ago

ThinVNC screen not updating with Firefox due to use of deflate-stream in websockets; works with Chrome and Opera

Categories

(Core :: Networking: WebSockets, defect)

x86
Windows XP
defect
Not set
major

Tracking

()

VERIFIED FIXED
mozilla8
Tracking Status
firefox7 + ---
firefox9 --- fixed

People

(Reporter: fehe, Assigned: mcmanus)

References

Details

(Keywords: dev-doc-complete, regression, Whiteboard: inbound)

Attachments

(2 files)

Firefox's image handling makes breaks functionality for ThinVNC browser-based client to update the screen representation of the remote computer.  Instead the following appears in error console and the screen remains static:

Error: Image corrupt or truncated: http://192.168.1.20:8080/css/images/ui-bg_inset-hard_100_fcfdfd_1x100.png
Source File: http://192.168.1.20:8080/css/images/ui-bg_inset-hard_100_fcfdfd_1x100.png
Line: 0

Chrome, Opera, and I suspect also IE 9 (if I was running Windows 7) work without issue.

ThinVNC is a VNC implementation that does not require a separate client -- just an HTML5 capable browser -- to manage a remote computer.

STR:

1. Visit http://www.thinvnc.com/thinvnc/html5-vnc.html
2. Download and install ThinVNC 2.0 on another computer or a VM.
3. Verify that the ThinVNC service is started.  You will see an icon in the System Tray / Notification area.
4. Configure ThinVNC as necessary, noting what port and protocol you must connect with
5. Use the browser to connect to the ThinVNC server.
6. Notice that the initial screen displays, but nothing happens beyond that, with mouse movement.  Note the relevant error message(s) in Error Console.
7. If the ThinVNC server host is within view and has a display, glance at it and note that it is indeed responsive.
8. Compare with any of the other HTML5-capable browsers.  Note that they work without fuss.
Works with June 11 nightly: http://hg.mozilla.org/mozilla-central/rev/dc8d154f3710
Fails with June 12 nightly: http://hg.mozilla.org/mozilla-central/rev/e37011790a8a

Regression range: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=dc8d154f3710&tochange=e37011790a8a

I suspect this is caused by bug 661045
Blocks: 661045
Keywords: regression
Summary: ThinVNC cannot update remote screen with Firefox - get error: Image corrupt or truncated; works with Chrome and Opera → ThinVNC screen not updating with Firefox [Image corrupt or truncated]; works with Chrome and Opera
Can you upload one of these corrupt images to bugzilla (or somewhere else) so that I can test?
The image is corrupt; however, the other browsers appear to be able able to skip it somehow and continue processing the rest of the page, whereas Firefox simply stop displaying anymore updates.

I believe you really need to install ThinVNC, as indicated in comment 0, to be able to test this properly.
Bug 661045 seems pretty unlikely to cause this.

Bug 663194 on the other hand, seems like it could well have this effect....

Do you see the error message about corrupt or truncated images in the June 11 nightly too?  Or just in the June 12 and later ones?
(In reply to Boris Zbarsky (:bz) from comment #4)
> Do you see the error message about corrupt or truncated images in the June
> 11 nightly too?  Or just in the June 12 and later ones?

Error message does also appear in June 11 nightly.
Then yeah, I bet the issue is bug 663194.

Can you build yourself, or do you want me to set up a try build with that patch backed out?
Would likely be a lot easier if you created the build.  Thanks.
Worth keeping in mind that we have less than 48 hours to diagnose and fix this for 7.
No longer blocks: 661045
1. How common is this?
2. Does it still exist if we back out 663194?
3. What's the downside of backing out bug 663194?
(In reply to Christian Legnitto [:LegNeato] from comment #9)
> 2. Does it still exist if we back out 663194?

If somebody creates a try build with bug 663194 backed out, I could confirm.
I'd be fine with backing out yes. I'll leave to others what to do as far as raising this on WHATWG and seeing what other browsers do.
(In reply to Boris Zbarsky (:bz) from comment #12)
> OK, there's a test build...

Tested and still does not work.  Definitely not bug 663194.  Thanks.
OK, I'm going to spin up several builds spread over the regression range in comment 1 to see whether we can narrow it down (several because the latency is so high, so a pure binary search would take too long).

I should have the links to the builds tomorrow morning....
Works with the first build, 55359cf599c8, does not work with any of the others.

Thanks
If you flip the "network.websocket.extensions.stream-deflate" preference to "false" in about:config (and possibly restart), does the problem go away?
Yes it does go away when that is set to "false" (no restart required).
I would guess this means that Thin VNC has a broken implementation of the websockets stream-deflate extension that they have opted in to during the handshake.

We are lucky in this case that the path forward is easy: disable our support for that extension by default. We should do this because recently (in an ietf draft later than we support) that extension has been deprecated - so we know now that it likely has no future. Time spent on determining exactly where the error is won't pay off down the road. This is an extension, so turning it off shouldn't really impact anything.

There currently is no other compression extension to replace it with. I argued that was a mistake in the IETF, but cest la vie. We'll go compressionless for now rather than spreading this around the net when it has already been deprecated.
this patch just flips the pref and adjusts the test case which expected the pref to be on. It has been sent to try.
Attachment #561199 - Flags: review?(bzbarsky)
Disabling this before we have an alternative on the implementation side does mean that some websites will have to create their own compression logic on top of WebSockets.

It does seem like a shame to disable the protocol just because of one buggy implementation, but I'll leave that decision to you.
Comment on attachment 561199 [details] [diff] [review]
deflate-stream-off 1

r=me, but it's interesting that this works in Chrome.  Does it not support deflate-stream?
Attachment #561199 - Flags: review?(bzbarsky) → review+
Summary: ThinVNC screen not updating with Firefox [Image corrupt or truncated]; works with Chrome and Opera → ThinVNC screen not updating with Firefox due to use of deflate-stream in websockets; works with Chrome and Opera
(In reply to Boris Zbarsky (:bz) from comment #22)

> r=me, but it's interesting that this works in Chrome.  Does it not support
> deflate-stream?

right. we were all alone on that front. and now it has been removed :(
Per conversation with johnath it looks like this fix is going to miss 7.
Assignee: nobody → mcmanus
Component: ImageLib → Networking: WebSockets
QA Contact: imagelib → networking.websockets
I think that's really bad.  The result will be us being the only ones to ship this compression method, and then we'll drop support in the next release, never reinstate it.... and no one else will ever ship it.  So it's pure loss, not only for us but for the web (which then has to deal with the fact that this code got shipped sometime, and probably has to do so in perpetuity).

I assume we have really good reasons to not take this pref change, to outweigh those negatives...
johnath told me to email release-drivers if I thought this needed to be taken.  I'll let Boris or Patrick do that.
(In reply to Boris Zbarsky (:bz) from comment #25)
> I think that's really bad.  The result will be us being the only ones to
> ship this compression method, and then we'll drop support in the next
> release, never reinstate it.... and no one else will ever ship it.  So it's
> pure loss, not only for us but for the web (which then has to deal with the
> fact that this code got shipped sometime, and probably has to do so in
> perpetuity).

normally I would full throated-ly agree. but in this case the change is really going to be transparent to just about every situation. Anyone who wanted to treat it non-transparently was going to have to do feature detection with it anyhow by looking at the extensions attribute (because other clients didn't implement anything) and when we pull it back the feature detection will still work fine. So I don't see one release of it followed by disabling as something servers will need to support forever (or at all - the server always has to opt into the extension at handshake time.)

OTOH, the deflate-stream extension has no future. If we've identified even one site that doesn't operate with it then yanking it is still preferred imo,.

I'll check into inbound and set the various triage flags after the try run completes.
I mailed release-drivers.

Apparently final builds of Fx7 have already been created.  If we end up respinning for a stop-ship issue sometime in the next week, we will take this fix, but we're not treating this bug as a stop-ship.

I mailed the ThinVNC folks about the problem, on the off chance that they can fix whatever is broken on their end...
I have to say that I'm not quite as optimistic about the transparent-ness as you are. This bug is an excellent example of the extension not being transparent, right?
Comment on attachment 561199 [details] [diff] [review]
deflate-stream-off 1

beta-7 approval request acking that it won't trigger a respin
Attachment #561199 - Flags: approval-mozilla-beta?
Attachment #561199 - Flags: approval-mozilla-aurora?
Whiteboard: inbound
Target Milestone: --- → mozilla9
Try run for 005bb98b23bb is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=005bb98b23bb
Results (out of 63 total builds):
    success: 56
    warnings: 7
Builds available at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mcmanus@ducksong.com-005bb98b23bb
(In reply to Jonas Sicking (:sicking) from comment #29)
> I have to say that I'm not quite as optimistic about the transparent-ness as
> you are. This bug is an excellent example of the extension not being
> transparent, right?

I see an implementation bug, sure.

By transparent I mean that shipping the extension and then later removing it or changing to a different one is not going to make applications running on compliant clients and servers break.
(In reply to Mozilla RelEng Bot from comment #31)
> Try run for 005bb98b23bb is complete.

when you account for random-orange and the perma breakage of android on try, this is actually green.
I think the assumption that clients and servers on the web are compliant is more or less unwarranted....
Indeed, the distinction between compliant and non-compliant servers and clients isn't really meaningful. What matters are the servers and clients that are popular enough on the web.

Again, the bug is a good case in point, the server here is clearly not compliant, but we still decided to spend engineering hours to dealing with it due to its popularity.
(In reply to Jonas Sicking (:sicking) from comment #35)

> Again, the bug is a good case in point

please summarize the case in point. would it be fair to say you mean "all negotiable protocol extensions are inherently bad"?

It doesn't seem any different than feature-detection driven html 5 to me.
The main point is that the distinction between compliant and non-compliant implementations doesn't make much sense. As far as I know we've never taken that into account when triaging bugs. What we care about is their popularity. This has been the case in all areas of Firefox for as long as it has existed.

This is somewhat simplified. Of course we always push non-compliant implementations into fixing their stuff. But many many times we've simply had to deal anyway.

The other is "Any disparities between UAs are bad". (Restricted to disparities exposed to webpages/servers).

Sure, some disparities will always happen as we move forward with the web platform as not all browsers implement features at the same time. That is bad, but we really don't have an option short of halting development or eliminating all competition.

Feature detection is far from a silver bullet. There are lots of websites out there that forget to do it or do it wrong. For example apple.com used to render almost blank when we implemented CSS transitions since they did the feature detection wrong and detected our support but then ended up using either a webkit-prefixed CSS property or a wrongly prefixed moz property, i forget which.

Short term we'll absolutely need to have disparities between UAs as we develop new features, there simply is no alternative. However long term we should always AIM for having as much alignment between UAs as possible.
https://hg.mozilla.org/mozilla-central/rev/6bcc3a001bb3
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
The ThinVNC folks have confirmed that they can reproduce the problem and that disabling deflate-stream on their end also fixes things.  They're looking into it from their end as well.
Attachment #561199 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/c84ac4a98e07
Target Milestone: mozilla9 → mozilla8
Attachment #561199 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Documentation updated:

https://developer.mozilla.org/en/WebSockets#Gecko_notes

Mentioned on Firefox 8 for developers.
Verified fixed: http://hg.mozilla.org/mozilla-central/rev/90575e23ea93

By the way, the latest build of ThinVNC 2.0.0.16 has fixed the websockets issue.  I'm not sure if that means they've fixed their implementation or simply removed support for deflate-stream, as I don't know how to test for it; however, enabling deflate-stream in Firefox no longer breaks functionality.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: