Crash displaying multipart JPEG

RESOLVED FIXED

Status

()

Core
ImageLib
RESOLVED FIXED
11 years ago
11 years ago

People

(Reporter: roc, Assigned: roc)

Tracking

({crash, fixed1.8.0.4, fixed1.8.1})

Trunk
x86
Linux
crash, fixed1.8.0.4, fixed1.8.1
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments, 1 obsolete attachment)

I will attach a testcase that loads a JPEG which refreshes using multipart and causes a crash on Linux/GTK2. What happens is that the imgRequest sees an initial content type of image/jpeg, and decides that this means it's not multipart, so calls SetMutable(PR_FALSE) on the gfxImageFrame. Later it sees a multiper/x-mixed-replace content type, calls SetMutable(PR_TRUE), and starts mutating the data. This eventually causes a crash in GTK, apparently because we've thrown away some data structure.
Created attachment 213271 [details] [diff] [review]
fix?

This patch fixes the bug for this testcase. I don't know if it's correct without studying this code a lot more, so I'd appreciate some advice. In particular, I'm not sure if it's a bug in GTK (should it be able to handle a SetMutable(PR_TRUE) after a SetMutable(PR_FALSE)?), imgRequest (is this patch the correct way to detect possible multipart-ness?) or necko (is the multipart channel returning the right initial content type?).
Attachment #213271 - Flags: superreview?
Attachment #213271 - Flags: review?(pavlov)
Created attachment 213272 [details] [diff] [review]
fix?

This patch fixes the bug for this testcase. I don't know if it's correct without studying this code a lot more, so I'd appreciate some advice. In particular, I'm not sure if it's a bug in GTK (should it be able to handle a SetMutable(PR_TRUE) after a SetMutable(PR_FALSE)?), imgRequest (is this patch the correct way to detect possible multipart-ness?) or necko (is the multipart channel returning the right initial content type?).
Attachment #213272 - Flags: superreview?
Attachment #213272 - Flags: review?(pavlov)
Attachment #213272 - Attachment is obsolete: true
Attachment #213272 - Flags: superreview?
Attachment #213272 - Flags: review?(pavlov)
Attachment #213271 - Flags: superreview? → superreview?(darin)

Comment 3

11 years ago
Hmm.. the channel's content type is not supposed to change after it calls OnStartRequest.  Could you please upload a NSPR_LOG_MODULES=nsHttp:5 log?
So is there a testcase?  I bet it's a multipart with another multipart in it, right?
You should be able to reproduce with this image:

http://www.airport-nuernberg.de/_/tools/webcam.html?_FRAME=64&refresh=0&datei=webcam-0-0.jpg

The original bug report is here:
https://bugzilla.novell.com/show_bug.cgi?id=140416
Created attachment 213373 [details]
nsHttp log

Here's the log. I actually made it with a 1.8 branch build, because the crash is much harder to reproduce on trunk (but I'm quite sure that I have reproduced it there).

Comment 7

11 years ago
NOTE: The original channel is redirected.  The redirected channel has a content-type of text/html.  Maybe image lib is reading the content type from the redirected channel instead of the newly created channel?
The reason I was having trouble reproducing the bug on trunk is because I was running my patched build :-)

(In reply to comment #7)
> NOTE: The original channel is redirected.  The redirected channel has a
> content-type of text/html.  Maybe image lib is reading the content type from
> the redirected channel instead of the newly created channel?

imgRequest is getting the type from mChannel, which is the base channel for the multipart. http://lxr.mozilla.org/mozilla/source/modules/libpr0n/src/imgRequest.cpp#620
Maybe that's set wrong further up the stack, but I can't see it from reading the code.

If we agree that it's a necko issue I'll debug further to try to figure out where the initial content type is coming from.
> imgRequest is getting the type from mChannel

Which is set when we start the load.  So comment 7 is right on target.  Sounds to me like image loader wants to observe redirects, no?

That said, making this not depend on the type like that is also a good idea, I suspect.
Attachment #213271 - Flags: review?(pavlov) → review?(darin)
I see. I think that means my patch is correct.

Comment 11

11 years ago
Comment on attachment 213271 [details] [diff] [review]
fix?

Two possible correct fixes:

1) Observe redirects, and update mChannel accordingly.

2) Use aRequest instead of mChannel to read the content type.

NOTE: #1 probably needs to be done to avoid another bug.  imgRequest::Cancel cancels mChannel, but it does not cancel any channels that result from redirects.  Bad libpr0n!
Attachment #213271 - Flags: superreview?(darin)
Attachment #213271 - Flags: superreview-
Attachment #213271 - Flags: review?(darin)
Attachment #213271 - Flags: review-

Comment 12

11 years ago
The image loader is so in need of an overhaul.  This is just one of many related bugs.  It does not use necko correctly.

Comment 13

11 years ago
indeed. if anyone has lots of spare time, i can make some good suggestions on how it should be overhauled :)
Created attachment 213411 [details] [diff] [review]
fix

How about this? It seems to work.
Attachment #213411 - Flags: superreview?
Attachment #213411 - Flags: review?
Attachment #213411 - Flags: superreview?(darin)
Attachment #213411 - Flags: superreview?
Attachment #213411 - Flags: review?(darin)
Attachment #213411 - Flags: review?

Comment 15

11 years ago
I don't understand how that works.  Are you sure that OnChannelRedirect is being called?  The channel's notification callbacks are set to point at the load group's notification callbacks.  I don't see how the channel would access the imgRequest as a nsIChannelEventSink.
No, I'm not sure. I guess I need to study this code some more.
Attachment #213411 - Flags: superreview?(darin)
Attachment #213411 - Flags: superreview-
Attachment #213411 - Flags: review?(darin)
Attachment #213411 - Flags: review-

Comment 17

11 years ago
roc: I think you need to set some image lib object as the notification callbacks for the channel before calling AsyncOpen (see the call to NS_NewChannel).  Notification callbacks is a nsIInterfaceRequestor.  That object will be asked for a nsIChannelEventSink when a redirect occurs.  That gives you the opportunity to observe the redirect.  That is a bit of a complex solution.  Another solution is as follows:

Construct a load group for the channel.  The channel will then insert any channels that it creates via redirect into the load group.  When imgRequest::Cancel is called, cancel the load group instead of mChannel.  That will take care of canceling all channels.  In fact, it is probably not necessary to keep mChannel (and that might be a good thing to do to keep people from being tempted into using it).  Then, in imgRequest::OnStartRequest, QI aRequest to nsIChannel to read the content type.

I think this second solution might be better.

Comment 18

11 years ago
nsLoadGroup implements nsISupportsPriority, so it'd be easy to replace those calls on mChannel with similar calls on "mLoadGroup."  Perhaps a delicate issue is with the calls to mChannel->GetOriginalURI.
darin, what do you think of taking https://bugzilla.mozilla.org/attachment.cgi?id=213271 as a branch fix (and on trunk)? It seems to be both correct and a simplification, although it doesn't solve all the problems.

I'm going to try this loadgroup approach for trunk, but I'm afraid that it's going to be a significant risk for branch.

One question: if I set the loadgroup for the channel, won't that cause problems if it already belongs to a loadgroup?

Comment 20

11 years ago
> darin, what do you think of taking
> https://bugzilla.mozilla.org/attachment.cgi?id=213271 as a branch fix (and on
> trunk)? It seems to be both correct and a simplification, although it doesn't
> solve all the problems.

Let's do this as a last resort... let's try for the trunk patch first?


> I'm going to try this loadgroup approach for trunk, but I'm afraid that it's
> going to be a significant risk for branch.

Well, it occured to me that I can basically implement <a ping> today using "new Image" combined with this bug.  So, from the point of view of ensuring that all content loads stop when a page transition occurs, we should probably fix the "cancelation" bug for 2.0 as well.


> One question: if I set the loadgroup for the channel, won't that cause 
> problems if it already belongs to a loadgroup?

Image lib currently does not put channels into a loadgroup!  *gasp*
> > darin, what do you think of taking
> > https://bugzilla.mozilla.org/attachment.cgi?id=213271 as a branch fix (and
> > on trunk)? It seems to be both correct and a simplification, although it
> > doesn't solve all the problems.
> 
> Let's do this as a last resort... let's try for the trunk patch first?

Apart from it not fixing all problems, what's wrong with it? It seems cleaner to do a QI than to check the MIME type.

(In reply to comment #20)
> > One question: if I set the loadgroup for the channel, won't that cause 
> > problems if it already belongs to a loadgroup?
> 
> Image lib currently does not put channels into a loadgroup!  *gasp*

Yeah, so why doesn't it?
http://lxr.mozilla.org/mozilla/source/modules/libpr0n/src/imgLoader.cpp#207
You went to all the trouble of passing a load group in there, but decided not to use it?

But even though no-one currently puts image load channels in a loadgroup, is it really right to rely on that for the future? Shouldn't we be fixing NewImageChannel to use the loadgroup?

Comment 22

11 years ago
> Apart from it not fixing all problems, what's wrong with it? It seems cleaner
> to do a QI than to check the MIME type.

Agreed.  It's important that aRequest be QI'd instead of mChannel, which is what you are doing.  So, OK... yeah, let's go for it, and then spin off the other issue to another bug (or whatever).


> > > One question: if I set the loadgroup for the channel, won't that cause 
> > > problems if it already belongs to a loadgroup?
> > 
> > Image lib currently does not put channels into a loadgroup!  *gasp*
> 
> Yeah, so why doesn't it?
> http://lxr.mozilla.org/mozilla/source/modules/libpr0n/src/imgLoader.cpp#207
> You went to all the trouble of passing a load group in there, but decided not
> to use it?
> 
> But even though no-one currently puts image load channels in a loadgroup, is 
> it really right to rely on that for the future? Shouldn't we be fixing
> NewImageChannel to use the loadgroup?

So, we have these imgRequestProxy things that are put into the loadgroup of the corresponding docshell.  Since we may have multiple requests for the same URL, we chain them all together on a single imgRequest that actually talks to necko.  Given that there is not a 1-1 relationship between imgRequestProxy and necko channel, we cannot put the necko channel in the docshell's loadgroup.  If we did, then stop a load in one page would stop the same image load that may be happening in another page.  By the way, I'm not defending this architecture -- only explaining it ;-)  The fact that we do this across documents is very broken in my opinion.

Comment 23

11 years ago
Comment on attachment 213271 [details] [diff] [review]
fix?

r+sr=darin for this patch.  but please file a bug on the cancelation issue.  you can assign it to me.
Attachment #213271 - Flags: superreview-
Attachment #213271 - Flags: superreview+
Attachment #213271 - Flags: review-
Attachment #213271 - Flags: review+
Filed bug 328903. Thanks!
checked in on trunk
Status: NEW → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → FIXED
Comment on attachment 213271 [details] [diff] [review]
fix?

I think we should take this on branches too, since it fixes a crasher and appears to be low risk.
Attachment #213271 - Flags: approval-branch-1.8.1?(darin)

Updated

11 years ago
Attachment #213271 - Flags: approval-branch-1.8.1?(darin) → approval-branch-1.8.1+
Checked into 1.8.1
Keywords: fixed1.8.1
Comment on attachment 213271 [details] [diff] [review]
fix?

approval for 1.8.0.3?
Attachment #213271 - Flags: approval1.8.0.3?

Comment 29

11 years ago
This is a crasher.  It's been baking ont he trunk since 2/28/06.  My pre-vote for  the 1.5.0.3 Release Team is to approve.

Just to claify, we want attachment 213271 [details] [diff] [review] ("fix?") for 1.8.0.3 even though attachement 213411 ("fix") was created more recently and does not have a "?" in the attachement name.
Keywords: crash
Comment on attachment 213271 [details] [diff] [review]
fix?

approved for 1.8.0 branch, a=dveditz for drivers
Attachment #213271 - Flags: approval1.8.0.3? → approval1.8.0.3+
Checking in modules/libpr0n/src/imgRequest.cpp;
/cvsroot/mozilla/modules/libpr0n/src/imgRequest.cpp,v  <--  imgRequest.cpp
new revision: 1.84.8.1.2.1; previous revision: 1.84.8.1
done
Keywords: fixed1.8.0.3
You need to log in before you can comment on or make changes to this bug.