Bug 648568 - <script src> before <img> which returns non-image data causes double-download of the non-image data
Status: NEW
Keywords: dev-doc-needed, perf
Product: Core
Classification: Components
Component: ImageLib
Version: Trunk
Platform: x86 Windows XP
Importance: -- normal
Assigned To: Nobody; OK to take it and work on it
Duplicates: 654401 694326 760826
Blocks: 727754 654401
Reported: 2011-04-08 10:23 PDT by Nicholas C. Zakas
Modified: 2015-01-10 06:47 PST


Description Nicholas C. Zakas 2011-04-08 10:23:54 PDT
User-Agent:       Mozilla/5.0 (Windows NT 5.1; rv:2.0) Gecko/20100101 Firefox/4.0
Build Identifier: Mozilla/5.0 (Windows NT 5.1; rv:2.0) Gecko/20100101 Firefox/4.0

If a page includes an external JavaScript file before an <img> whose src has a non-image content-type, Firefox downloads the <img> src twice. Reproducible in Firefox 3.6 and Firefox 4.

Simple example:

<!DOCTYPE html>
<html>   
<head>
    <script type="text/javascript" src="http://yui.yahooapis.com/combo?3.3.0/build/yui/yui-min.js&3.3.0/build/loader/loader-min.js"></script>
</head>
<body>
    <img src="http://www.yahoo.com/" width=1 height=1 border=0>
</body>
</html>

This also shows the issue:

<!DOCTYPE html>
<html>   
<head>
</head>
<body>
    <script type="text/javascript" src="http://yui.yahooapis.com/combo?3.3.0/build/yui/yui-min.js&3.3.0/build/loader/loader-min.js"></script>
    <img src="http://www.yahoo.com/" width=1 height=1 border=0>
</body>
</html>

Reproducible: Always

Steps to Reproduce:
1. Include external JS in a page
2. Include an <img> with a src set to a non-image
3. Load page and watch HTTP traffic
Actual Results:  
<img> src is downloaded twice

Expected Results:  
<img> src is downloaded once

This is important because a lot of ad-tracking beacons don't use the image content-type, and this issue can cause double-counting of ads for any page where an external JavaScript is included before the beacon.
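
For anyone who wants to watch the HTTP traffic from step 3 without a proxy, here is a minimal sketch of a local Node.js server that counts requests per URL and always answers with non-image data. It is only an illustration: the names createCountingHandler and startServer, the /beacon path, and port 8080 are assumptions made for this example and are not part of the original report.

```javascript
// Minimal sketch of a request-counting test server (Node.js).
// All names here (createCountingHandler, startServer, /beacon, port 8080)
// are hypothetical and chosen only for this example.
const http = require("http");

// Returns a request handler plus the Map it uses to count hits per URL.
function createCountingHandler() {
  const hits = new Map();
  function handler(req, res) {
    const count = (hits.get(req.url) || 0) + 1;
    hits.set(req.url, count);
    console.log(`${req.method} ${req.url} (request #${count})`);
    // Deliberately answer with non-image data, like the beacon in the report.
    res.writeHead(200, { "Content-Type": "text/html" });
    res.end("<html><body>not an image</body></html>");
  }
  return { handler, hits };
}

// Nothing listens until startServer() is called explicitly.
function startServer(port = 8080) {
  const { handler, hits } = createCountingHandler();
  const server = http.createServer(handler);
  server.listen(port);
  return { server, hits };
}
```

To use it, call startServer(), point the <img src> in either test page at http://localhost:8080/beacon, and load the page. If the same URL is logged as request #1 and then request #2 for a single page load, the double download has been reproduced.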
Comment 1 Boris Zbarsky [:bz] (still a bit busy) 2011-04-08 10:47:40 PDT
Presumably what happens here is that we kick off the image preload, and by the time the real image load starts we've already detected that the result is not an image (I assume _that_ is what matters, not the content-type; please correct me if I'm wrong) and killed the preload, so we have to make a new request.

It might be nice to cache the "this is not an image" bit in imagelib (by not dropping the imgRequest from the cache just because it's not an image?).  Would improve performance, I guess.  Joe, what do you think?
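
The two behaviours discussed above can be compared with a toy model. This is a rough sketch only, not imagelib code: ToyImageCache, keepNonImages, and countRequests are hypothetical names, and the real preload/load sequence is asynchronous, which the sketch ignores.

```javascript
// Toy model of an image cache. Not imagelib code; every name is hypothetical.
class ToyImageCache {
  constructor(fetchFn, { keepNonImages }) {
    this.fetchFn = fetchFn;          // simulates one network request per call
    this.keepNonImages = keepNonImages;
    this.entries = new Map();        // url -> { isImage }
  }

  load(url) {
    if (this.entries.has(url)) {
      return this.entries.get(url);  // served from cache, no new request
    }
    const entry = { isImage: this.fetchFn(url) };
    // With keepNonImages === false, a non-image result is dropped, so the
    // next load() for the same URL has to hit the network again.
    if (entry.isImage || this.keepNonImages) {
      this.entries.set(url, entry);
    }
    return entry;
  }
}

// Count network requests for a preload followed by the real <img> load.
function countRequests(keepNonImages) {
  let requests = 0;
  const fetchFn = () => { requests += 1; return false; }; // always "not an image"
  const cache = new ToyImageCache(fetchFn, { keepNonImages });
  cache.load("http://www.yahoo.com/"); // speculative preload
  cache.load("http://www.yahoo.com/"); // real load from the <img> element
  return requests;
}

console.log(countRequests(false)); // 2
console.log(countRequests(true));  // 1
```

In this model, remembering the "this is not an image" result is what turns the second network request into a cache hit.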

> This is important because a lot of ad-tracking beacons don't use the image
> content-type, and this issue can cause double-counting of ads for any page
> where an external JavaScript is included before the beacon.

I don't think we should be constrained by that broken setup.  If we think we can get a better browsing experience for our users by making two requests for the image (or none!) we should do just that.
Comment 2 Boris Zbarsky [:bz] (still a bit busy) 2011-04-08 10:48:06 PDT
Er, ccing Joe too.  Joe, see comment 1?
Comment 3 Nicholas C. Zakas 2011-04-08 11:31:10 PDT
You could be right. I've seen two scenarios:

1. A response that is missing a content-type and doesn't contain an image
2. A response that has a non-image content-type and also doesn't contain an image

I've not built out a test to see if an image served with a non-image content-type would cause the issue. 

For some background: the tracking beacons used by ads are frequently included as images in the page because there's no cross-domain restriction. They sometimes return images (1x1 tracking pixels) but don't necessarily always return images, since the response isn't necessary to the experience; it's only necessary that the request reaches the server for tracking.
Comment 4 Boris Zbarsky [:bz] (still a bit busy) 2011-04-08 11:35:51 PDT
> I've not built out a test to see if an image served with a non-image
> content-type would cause the issue. 

OK.  For <img> loads, the Content-Type header is more or less completely ignored, so all that matters is the data that was returned.
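
As a rough illustration of deciding by the returned data rather than by the header, here is a sketch that checks the leading bytes of a response body against three well-known image signatures (PNG, GIF, JPEG). It is not Gecko's sniffer; sniffImageType and isImageResponse are hypothetical names, and a real implementation recognises more formats.

```javascript
// Illustrative content sniffing by magic bytes. This is not Gecko's sniffer;
// it only checks three well-known signatures.
function sniffImageType(bytes) {
  const startsWith = (sig) => sig.every((b, i) => bytes[i] === b);
  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "image/png";
  if (startsWith([0x47, 0x49, 0x46, 0x38])) return "image/gif"; // "GIF8"
  if (startsWith([0xff, 0xd8, 0xff])) return "image/jpeg";
  return null; // not recognised as an image
}

// The declared Content-Type is never consulted, only the body bytes.
function isImageResponse(response) {
  return sniffImageType(response.body) !== null;
}

const gifServedAsHtml = {
  contentType: "text/html",
  body: new Uint8Array([0x47, 0x49, 0x46, 0x38, 0x39, 0x61]), // "GIF89a"
};
const htmlServedAsGif = {
  contentType: "image/gif",
  body: new TextEncoder().encode("<!DOCTYPE html>"),
};
console.log(isImageResponse(gifServedAsHtml)); // true
console.log(isImageResponse(htmlServedAsGif)); // false
```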
Comment 5 David Murdoch 2011-04-08 11:37:41 PDT
If the prefetched (non-)image's cache entry hasn't expired by the time the real <img /> load starts, the "image" should probably NOT be downloaded again. If it has expired, or has no expiration directive, the (non-)image SHOULD be re-downloaded, as the contents of the URL may have changed in the brief interim between the prefetch and load start. Right?
Comment 6 Boris Zbarsky [:bz] (still a bit busy) 2011-04-08 12:37:39 PDT
David, once we discover that the data is not an image we close the connection.  So the non-image data never gets cached in its entirety.
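
A toy model of why nothing complete ends up in the cache: the load below stops reading as soon as the first chunk fails the image check, so only a partial body is ever seen. The names looksLikeGif and loadImage are hypothetical, and the "GIF8" prefix test is just a stand-in for real sniffing.

```javascript
// Toy model of a load that stops reading once the data is known not to be
// an image. All names are hypothetical; the check is a stand-in for sniffing.
function looksLikeGif(chunk) {
  return chunk.startsWith("GIF8");
}

// chunks: array of strings arriving in order from the network.
function loadImage(chunks) {
  let received = "";
  for (const [index, chunk] of chunks.entries()) {
    received += chunk;
    if (index === 0 && !looksLikeGif(chunk)) {
      // Equivalent of closing the connection: later chunks are never read.
      return { isImage: false, complete: false, received };
    }
  }
  return { isImage: true, complete: true, received };
}

const result = loadImage(["<html><head>", "<title>Yahoo</title>", "</head>...</html>"]);
console.log(result.complete); // false
console.log(result.received); // <html><head>
```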
Comment 7 Joe Drew (not getting mail) 2011-04-11 23:04:00 PDT
I think that caching a non-image is just fine, actually! I'm also 100% certain that it will require some changes in the notifications sent by imagelib, because we never assume we're going to keep in-error images around.
Comment 8 Christopher Blizzard (:blizzard) 2011-05-26 17:15:34 PDT
Hey, guys.  I'm poking you about figuring out what we want to do here.  Maybe nothing is the right answer, but I'd like some forward progress and ownership.  Thanks!
Comment 9 Boris Zbarsky [:bz] (still a bit busy) 2011-05-26 18:06:13 PDT
There's no figuring out to do.  We need to do what comment 1 suggested and comment 7 agreed with.

As for ownership... we need someone other than Joe working in imagelib.  Can you make that happen?
Comment 10 Christopher Blizzard (:blizzard) 2011-05-27 15:38:25 PDT
Let me see what I can do. :)
Comment 11 Henrik Blase 2011-06-11 02:14:18 PDT
I'd like to add another variation of this bug: even if the <img> tag is written by the external JavaScript, it is loaded twice:
document.write('<img src="http://www.yahoo.com" width="1" height="1">')

But if the <img> tag in the external script is written this way, it is loaded only once:
document.write(unescape('%3Cscript') + ' type="text/javascript">');
document.write('document.write(\'<img src="http://www.yahoo.com" width="1" height="1">\');');
document.write(unescape('%3C%2Fscript>'));
Comment 12 Henrik Blase 2011-06-11 02:18:24 PDT
It also happens if the src URL is a non-image that redirects to an image (e.g. in those tracking pixels).
Comment 13 Boris Zbarsky [:bz] (still a bit busy) 2011-10-13 11:40:08 PDT
*** Bug 694326 has been marked as a duplicate of this bug. ***
Comment 14 Joe Drew (not getting mail) 2012-01-20 14:11:01 PST
*** Bug 654401 has been marked as a duplicate of this bug. ***
Comment 15 Boris Zbarsky [:bz] (still a bit busy) 2012-06-26 08:49:26 PDT
*** Bug 760826 has been marked as a duplicate of this bug. ***
