Closed Bug 584092 Opened 14 years ago Closed 12 years ago

multipart/x-mixed-replace stream terminated on bad first frame

Categories

(Core :: Graphics: ImageLib, defect)

1.9.2 Branch
x86_64
Windows 7
defect
Not set
critical

Tracking

()

RESOLVED DUPLICATE of bug 787899

People

(Reporter: rodneyp, Assigned: joe)

References

()

Details

(Keywords: regression, stackwanted)

User-Agent:       Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.2; OfficeLiveConnector.1.5; OfficeLivePatch.1.3; .NET4.0C)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8

There seems to be a problem with mjpeg support.  On selecting to display an mjpeg video in a movable configuration I get a connection error.  I can view the direct feed at http://tpm.ce.unr.edu/feeds/Local_Feed/Camera%20-%20E/mjpeg/1
but can not open it in the normal browser window.  

Reproducible: Always

Steps to Reproduce:
1. Go to webpage
2. Click on active Local_Feed link on left
Actual Results:  
video box is displayed on page, but see "Connection error!" in place of mjpeg video. 

Expected Results:  
Video box is displayed on page with live video feed in box.  

Same problem on Mac.  Same problem with 3.6.7.  Worked in 3.6.6.  Right-clicking where the feed should be and selecting "View Image" shows the feed, confirming that it is active.  Web page works in IE and Safari.  Workaround is to instruct all viewers to use a different browser.
On 1.9.2 this regressed within http://hg.mozilla.org/releases/mozilla-1.9.2/pushloghtml?fromchange=4b8165a8d602&tochange=7dca7be78aad
=> likely Bug 542096. This is consistent with the reporter saying that it worked on 3.6.6 but not on 3.6.7/3.6.8.

The Issue is the same on Minefield/4.0b3pre ID:20100803055502.

If I choose Right-Click/View Image on the error message, the images are shown.
Blocks: 542096
Status: UNCONFIRMED → NEW
Component: General → ImageLib
Ever confirmed: true
Keywords: regression
Product: Firefox → Core
QA Contact: general → imagelib
Version: unspecified → 1.9.2 Branch
Regression window on m-c build:
Works:
http://hg.mozilla.org/mozilla-central/rev/520f23ddf196
Mozilla/5.0 (Windows; Windows NT 6.1; WOW64; en-US; rv:2.0b2pre) Gecko/20100711 Minefield/4.0b2pre ID:20100711154123
Fails:
http://hg.mozilla.org/mozilla-central/rev/97b8d1dd4c65
Mozilla/5.0 (Windows; Windows NT 6.1; WOW64; en-US; rv:2.0b2pre) Gecko/20100710 Minefield/4.0b2pre ID:20100711192444
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=520f23ddf196&tochange=97b8d1dd4c65
The "connection error" thing is coming from the site; we don't show anything like that....  Not only that, but the m-c regression range has nothing resembling what the 1.9.2 regression range looks like.

Bobby, any idea what's going on here?
In addition to Comment #2,
I confirm that changeset 4e1d168e50cf of Bug 553982 causes the problem on trunk.
Blocks: 553982
(In reply to comment #3)
> The "connection error" thing is coming from the site; we don't show anything
> like that....  Not only that, but the m-c regression range has nothing
> resembling what the 1.9.2 regression range looks like.
> 
> Bobby, any idea what's going on here?

Thanks to Alice's precise regression window, I do.

So "Connection Error" here is the alt-text of the _first_ image in the stream that the embedded viewer on the web page tries to display (viewing the stream directly doesn't appear to display this naughty frame). We don't sniff an appropriate mimetype in this frame, and so we don't initialize the imgContainer. When the imgContainer isn't initialized, we determine in OnStopRequest that we got bad data and cancel the request (this code is slightly different after bug 572520 landed, but the idea is the same):

http://mxr.mozilla.org/mozilla-central/source/modules/libpr0n/src/imgRequest.cpp#876

The solution here is probably to check whether the request is multipart, and if it is give it another chance. This would probably be a good followup bug to bug 514033.

Given the extent to which the server is misbehaving here, I don't think we should do anything about it on branch. I'm open to being convinced otherwise though.
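The proposed check could be sketched roughly as follows. This is a Python illustration of the decision only, not the actual imgRequest code; `should_cancel_on_stop`, `container_initialized`, and `is_multipart` are hypothetical names chosen for this sketch.

```python
def should_cancel_on_stop(container_initialized: bool, is_multipart: bool) -> bool:
    """Decide whether to cancel a request once a part has finished loading.

    Hypothetical sketch: the names here do not correspond to real
    imgRequest members.
    """
    if container_initialized:
        # The part sniffed as a known image type, so there is nothing to cancel.
        return False
    # Bad data: a single-part request is cancelled, but a multipart stream
    # gets another chance because a later part may still be a valid image.
    return not is_multipart
```

With this rule, a multipart/x-mixed-replace stream whose first part fails to sniff would stay open, while an ordinary single-part request with bad data would still be cancelled.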
No longer blocks: 542096, 553982
Depends on: 514033
Summary: Opening mjpeg produces connection error and not video → multipart/x-mixed-replace stream terminated on bad first frame
Comments from the server developer:  
It is correct, the initial image is of MIME type JPEG. That is the correct MIME type for the data and I am not setting that MIME type, Apache is. That first image is not part of the MJPEG stream. From what I can remember, I set up the video box, and set the insides to be the single JPEG (mime type JPEG) and then replace that element with the MJPEG. An MJPEG stream is not displayed until the first frame in the multipart message is received. There can be a delay, so rather than showing a blank box, it shows the "Connecting..." JPEG. Firefox should then replace this once the first complete JPEG from the MJPEG stream is received.

This seems like a reasonable way to code this.  From your comments above, you would expect the placeholder image to also be an mjpeg?
There is no separate "mjpeg" type.  All an "mjpeg" is is a multipart stream with jpeg parts.  

And bobby is saying that the first jpeg in the stream in this case isn't actually a jpeg.  Furthermore, the response is different depending on how the data is linked to, apparently (which is a little odd.... wonder why... different accept header?).
(In reply to comment #6)
> Comments from the server developer:  
> It is correct, the initial image is of MIME type JPEG. That is the correct MIME
> type for the data and I am not setting that MIME type, Apache is. 

Mozilla doesn't pay any attention to the MIME type sent by the server for images, because servers tend to lie. Instead, it examines the first few bytes of the data stream to see if it can make sense of them.
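As a rough illustration of what sniffing by leading bytes means, here is a minimal Python sketch. It is not Mozilla's sniffer; the function name `sniff_image_type` and the short signature table are assumptions made for this example, and a real sniffer knows more formats.

```python
from typing import Optional

# Leading "magic" bytes for a few common image formats (simplified list).
_SIGNATURES = (
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
)


def sniff_image_type(data: bytes) -> Optional[str]:
    """Return a MIME type guessed from the first bytes, or None if unknown."""
    for magic, mime in _SIGNATURES:
        if data.startswith(magic):
            return mime
    return None
```

Under this scheme, a part whose first bytes are text rather than an image signature yields `None`, regardless of what Content-Type the server declared for it.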

I haven't looked into this too closely, so I could be wrong here. However, the regression changeset is a very simple one, and gives me a lot of confidence in my theory.

That first
> image is not part of the MJPEG stream. From what I can remember, I set up the
> video box, and set the insides to be the single JPEG (mime type JPEG) and then
> replace that element with the MJPEG. 

Anything to do with setting <img src="foo"> and then later setting <img src="bar"> shouldn't matter here, because we instantiate an entirely new imgRequest when the source is changed. I'm saying that this appears to be a case of setting <img src="foo">, where "foo" is a multipart/x-mixed-replace stream whose first frame fails to parse as a jpeg.

> An MJPEG stream is not displayed until the
> first frame in the multipart message is received. There can be a delay, so
> rather than showing a blank box, it shows the "Connecting..." JPEG. Firefox
> should then replace this once the first complete JPEG from the MJPEG stream is
> received.

What are you using to trigger the switch? onload? Are you switching by setting src=bar, or doing something else?

Are you sure that the server is delivering the exact same multipart stream to the embedded player as it delivers to the standalone request? I could fire up a debugger and see what's going on, but I'm guessing there's something very explicit happening server-side.
We are looking at the following fix:  

The issue is with the X-Resource-Status header being sent in the JPEG stream.  The newer versions of Firefox will display the streams correctly if that tag is removed from the HTTP headers.  I'm reviewing the impact of removing that header, but I can send along the updated FlexTPS file to anyone that's interested in testing the changes further.

Is there a reason that this would flag the jpeg stream as invalid?  Please note as well that I have had Firefox crash 6 times while working on this problem, and have submitted the crash reports.
(In reply to comment #9)
> We are looking at the following fix:  
> 
> The issue is with the X-Resource-Status header being sent in the JPEG stream. 
> The newer versions of Firefox will display the streams correctly if that tag is
> removed from the HTTP headers.

The only thing I can find about this on google is from FlexTPS, so I'm assuming it's a home-grown header. bz - any ideas on how this might be causing problems?

> 
> Is there a reason that this would flag the jpeg stream as invalid?  Please note
> as well that I have had Firefox crash 6 times while working on this problem,
> and have submitted the crash reports.

That sounds bad. Do you have the crash-report id for any of them? I'd be happy to take a look...
> bz - any ideas on how this might be causing problems?

Is it being dumped into the data instead of the headers?  We do nothing with that header on our end....
(In reply to comment #11)
> > bz - any ideas on how this might be causing problems?
> 
> Is it being dumped into the data instead of the headers?  We do nothing with
> that header on our end....

Given that we're failing to sniff the mimetype, that seems like a likely story.
Here's what wireshark has to say for that packet:

0000  00 1f 5b cb af a1 00 18  01 92 3c 80 08 00 45 00   ..[..... ..<...E.
0010  01 59 49 dd 40 00 30 06  8f 47 86 c5 27 ca c0 a8   .YI.@.0. .G..'...
0020  01 43 00 50 fd df ce 8d  79 f8 ec 0a 7a 85 80 18   .C.P.... y...z...
0030  00 37 21 f7 00 00 01 01  08 0a 71 c8 30 37 00 8f   .7!..... ..q.07..
0040  d8 e7 48 54 54 50 2f 31  2e 31 20 32 30 30 20 4f   ..HTTP/1 .1 200 O
0050  4b 0d 0a 50 72 61 67 6d  61 3a 20 6e 6f 2d 63 61   K..Pragm a: no-ca
0060  63 68 65 0d 0a 43 61 63  68 65 2d 43 6f 6e 74 72   che..Cac he-Contr
0070  6f 6c 3a 20 6d 75 73 74  2d 72 65 76 61 6c 69 64   ol: must -revalid
0080  61 74 65 2c 20 6e 6f 2d  63 61 63 68 65 2c 20 6e   ate, no- cache, n
0090  6f 2d 73 74 6f 72 65 0d  0a 44 61 74 65 3a 20 54   o-store. .Date: T
00a0  75 65 2c 20 31 30 20 41  75 67 20 32 30 31 30 20   ue, 10 A ug 2010 
00b0  31 39 3a 31 34 3a 34 30  20 47 4d 54 0d 0a 45 78   19:14:40  GMT..Ex
00c0  70 69 72 65 73 3a 20 54  75 65 2c 20 31 30 20 41   pires: T ue, 10 A
00d0  75 67 20 32 30 31 30 20  31 39 3a 31 34 3a 34 30   ug 2010  19:14:40
00e0  20 47 4d 54 0d 0a 43 6f  6e 6e 65 63 74 69 6f 6e    GMT..Co nnection
00f0  3a 20 63 6c 6f 73 65 0d  0a 43 6f 6e 74 65 6e 74   : close. .Content
0100  2d 54 79 70 65 3a 20 6d  75 6c 74 69 70 61 72 74   -Type: m ultipart
0110  2f 78 2d 6d 69 78 65 64  2d 72 65 70 6c 61 63 65   /x-mixed -replace
0120  3b 20 62 6f 75 6e 64 61  72 79 3d 2d 2d 2d 2d 2d   ; bounda ry=-----
0130  2d 2d 2d 4a 50 45 47 5f  46 52 41 4d 45 5f 42 4f   ---JPEG_ FRAME_BO
0140  55 4e 44 41 52 59 0d 0a  0d 0a 58 2d 52 65 73 6f   UNDARY.. ..X-Reso
0150  75 72 63 65 2d 53 74 61  74 75 73 3a 20 61 63 74   urce-Sta tus: act
0160  69 76 65 0d 0a 0d 0a                               ive....          

Note the "0d 0a 0d 0a" coming before the X-Resource-Status.  So that is in fact being injected in the wrong place: at the beginning of the data, not at the end of the headers.  rodneyp@unr.edu, sounds like a bug in whatever is injecting the header, there.
And the next packet has some of that stuff repeated, as well as a Content-Type and Content-Length header... but it's too late, since end of headers came in this packet.
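To make the effect of the early "0d 0a 0d 0a" concrete, here is a small Python sketch that splits a response at the first blank line, the way an HTTP client would. The `split_http_response` helper is hypothetical, and `raw` is an abbreviated copy of the packet above (some headers left out), not the full capture.

```python
def split_http_response(raw: bytes):
    """Split a raw HTTP response into (status line, headers dict, body)."""
    # The first CRLF CRLF ends the header block; everything after is body.
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no end-of-headers marker found")
    lines = head.split(b"\r\n")
    status_line = lines[0].decode("latin-1")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(b":")
        headers[name.decode("latin-1").strip().lower()] = value.decode("latin-1").strip()
    return status_line, headers, body


# Abbreviated version of the captured packet.
raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Connection: close\r\n"
    b"Content-Type: multipart/x-mixed-replace; boundary=--------JPEG_FRAME_BOUNDARY\r\n"
    b"\r\n"
    b"X-Resource-Status: active\r\n"
    b"\r\n"
)
status_line, headers, body = split_http_response(raw)
```

Here `headers` contains no `x-resource-status` entry, and `body` begins with the text `X-Resource-Status: active`, so the first bytes of the stream data are not a boundary or an image.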
Thanks for helping identify the problem.  The "0d 0a 0d 0a" wasn't supposed to be there.  I have a patch for our server and will be distributing it tomorrow.  I don't know if this will fix the crashes I was experiencing as well, but I would suggest closing this ticket, and I will open another if I get another crash.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → WORKSFORME
We shouldn't be crashing on input from the web, no matter what.  So I'd still appreciate links to those crash incident ids.

Bobby don't we still want to do comment 5, though?
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
(In reply to comment #16)
> Bobby don't we still want to do comment 5, though?

Good catch - taking.
Assignee: nobody → bobbyholley+bmo
Status: REOPENED → ASSIGNED
How do I find the crash incident IDs?  I just clicked the standard submit button when prompted.  I didn't ever see the IDs.
OS: All → Windows 7
Thanks. These should be the possibly related crash reports:
143acf92-fc74-4111-866b-0bef8516b6c9    8/9/2010    4:48 PM
1039f2de-b4a9-47b5-b5e0-a195b321787b    8/9/2010    3:54 PM
46f69978-f40e-4a4f-86dd-52191c479448    8/9/2010    3:05 PM
14b0d219-9cc7-4360-8355-fc5e0d144aea    8/4/2010    2:28 PM
9724ae28-921e-42c3-90d1-f063c2b6ef7b    8/4/2010    1:27 PM
bp-d3335b25-9041-4a42-8cf9-b70ea2100803    8/3/2010    4:30 PM
8a60f23c-821e-4d97-994f-78be192edf0b    8/2/2010    6:26 PM
rodney, the incident starting with "bp-" was submitted, and looks like a crash inside the java plug-in as far as I can tell.  The others were not submitted (presumably due to the throttling thing); you can click on those links to submit them.  Once they turn into ids starting with "bp-", please paste those ids here?
for reference, bz/I tried loading those, there seems to be something wrong w/ the server.

if you have the time (it's actually fairly easy), 

https://developer.mozilla.org/en/How_to_get_a_stacktrace_with_WinDbg
you can install the older version of windbg (which is ~16mb instead of a CD/DVD) -- I will get someone to fix the wiki to suggest that at some point -- please be sure to install the 32bit version.

here's a link for it:
http://msdl.microsoft.com/download/symbols/debuggers/dbg_x86_6.11.1.404.msi
Hey Bobby,

I am the developer for flexTPS. I think we chatted via e-mail during debugging of this issue.

The spurious "\r\n" after X-Resource-Status has been corrected. I have been looking into the issue and what I have found is that the stream issue (with or without the incorrect "\r\n" injection) is only occurring when the "status" JPEG frame being prepended/appended to the source MJPEG stream is not the same size (WIDTHxHEIGHT) even if they are the same aspect ratio.

Basic system layout:
<SOURCE>---<flexTPS proxy>---<flexTPS HTTPd portal>----<client/Firefox>

Prior to sending the collected stream from the proxy to Firefox, an image showing a "Connecting..." status is sent as part of the MJPEG stream. This is done so that the user doesn't think things are stalled if it is taking a while for the proxy to acquire a connection to the SOURCE. This stream continues forever unless there is an unrecoverable connection failure or a set timeout, in which case the multipart message is concluded with a "Stream Failure" or "Disconnected" JPEG so that the user is given information rather than something that looks like a hung frame.
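For readers unfamiliar with how such a stream is framed, the following Python sketch shows one way to emit the parts of a multipart/x-mixed-replace body. It is not the flexTPS implementation; `mixed_replace_parts` and the default boundary `frame` are made up for the example, and the closing delimiter is omitted because the stream is normally open-ended.

```python
from typing import Iterable, Iterator


def mixed_replace_parts(frames: Iterable[bytes], boundary: str = "frame") -> Iterator[bytes]:
    """Yield one encoded multipart part per JPEG frame."""
    delimiter = b"--" + boundary.encode("ascii") + b"\r\n"
    for jpeg in frames:
        yield (
            delimiter
            + b"Content-Type: image/jpeg\r\n"
            + b"Content-Length: " + str(len(jpeg)).encode("ascii") + b"\r\n"
            + b"\r\n"  # blank line ends the part headers
            + jpeg
            + b"\r\n"
        )
```

A status frame is then just the first (or last) element of `frames`, for example a "Connecting..." JPEG followed by the camera frames. Each part carries its own headers, and any per-part header has to appear before the blank line that precedes the image bytes.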

The "Connecting...", "Disconnected", "Stream Failure" images are all 704x480 which is what the most common Axis camera frame size was back in the day. Some sites had cameras that returned 640x480. 

So a stream could have frames of varying sizes:
Frame
1: 704x480 (Connecting...)
2: 640x480
3: 640x480
4: 640x480
5: 640x480
...
X: 704x480 (Disconnected/Stream Failure)

And Firefox handled this without complaining.

Currently (3.6.16) Firefox is able to handle flexTPS streams if the "status" frames are the same size.

There are source streams that are 704x480, 704x288, 640x480, etc.
If any of the component images is a different size, Firefox treats it as a failure and displays the alt text, which is "Connection Error!".

The code for the script that outputs the MJPEG stream to the client is here:
http://trac.codaxus.com/flexTPS/browser/2.x.x/trunk/portal/2.0.x/site/perl/nph-mjpeg_stream.pl

This current code works in Chrome and in the Java applet used to display the stream in IE. The "spec" for MJPEG and the MIME multipart/replace format doesn't specify, from what I can tell, that all parts have to be the same size (or even the same type; one could be JPEG and the next TEXT).
> is not the same size

Sounds related to bug 639303.
Depends on: 639303
I don't have viewing privileges to bug 639303.

One interesting thing I have tracked down about this issue is that it works even if the status frames are the same size as the stream frames, but it fails if the MJPEG stream is the source of an <img> tag in a page.

Displaying it directly works (i.e. view image):
http://127.0.0.1/feeds/feed_1/stream_1/mjpeg/1?status_frame=true&random=0.9764673116119299


Displaying the same stream as the source of an image tag fails:
<html>
<head>
</head>
<img
src="http://127.0.0.1/feeds/feed_1/stream_1/mjpeg/1?status_frame=true&amp;random=0.9764673116119299"
alt="BAD">
</html>
The above comment should say:
"it works even if the status frames are not the same size as the stream frames, but it fails if the MJPEG stream is the source of an <img> tag in a page."
Hi Christopher,

I'm not actively working on Mozilla stuff at the moment, so I'm punting this one over to joe. ;-)
Assignee: bobbyholley+bmo → joe
I'm almost sure I'm fixing this in bug 787899.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago → 12 years ago
Resolution: --- → DUPLICATE