Closed Bug 1121693 Opened 9 years ago Closed 6 years ago

SVG Fragment downloaded for every image

Categories

(Core :: Graphics: ImageLib, defect, P3)

34 Branch
x86_64
Windows 7
defect

Tracking

()

RESOLVED DUPLICATE of bug 1027106

People

(Reporter: kylemit, Unassigned)

References

Details

(Keywords: testcase, Whiteboard: [gfx-noted])

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36

Steps to reproduce:

1. Open up the developer tools 
2. Load any page that contains multiple images that differ only by the fragment ID porition of the URL
 - Here's a demo page: http://kylemitofsky.com/StackSVG/
3. You should see multiple requests made for the same SVG image to the same URL


Actual results:

Multiple requests are made for the same image


Expected results:

When loading a page with multiple svg images, that differ only by the fragment id at the end, the browser should make a single GET request for all images.

When fetching any image type (png / svg / gif) that doesn't contain a fragment ID, the requests are all bundled into a single request so the same image doesn't need to be requested more than once.  

Since the fragment ID is not part of the URL that is sent to sent to the server, it should not impact the way resources are fetched.

The fragment identifier is strictly for the client to utilize in determining which portion or fragment of the document to display once returned.

The expected behavior is that fragmented URLs would be fetched as efficiently as their non-fragmented counterparts since this is obtainable from a server perspective.

This is a popular technique called SVG Stacks which was first widely publicized by on [http://simurai.com/](http://simurai.com/blog/2012/04/02/svg-stacks/)

Here's a demo page:

http://kylemitofsky.com/StackSVG/

This emerged out of a discussion on StackOverflow here:

http://stackoverflow.com/a/27861428/1366033

Fragment ID's are part of the SVG spec here:

http://www.w3.org/TR/SVG/linking.html#SVGFragmentIdentifiers
Robert, do you know what's causing this?

(this also reproduces on Nightly)
Flags: needinfo?(longsonr)
Keywords: testcase
Product: Firefox → Core
We seem to strip off the ref when loading in http://mxr.mozilla.org/mozilla-central/source/dom/base/nsDocument.cpp#909 so off the top of my head I don't know what's wrong. bzbarsky is more familiar with this than me given that he wrote this.
Flags: needinfo?(longsonr)
Resolved-Incomplete due to time since last communication/update by reporter.
Please feel free to reopen if the error occurs in a current Firefox version.
Status: UNCONFIRMED → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE
(In reply to Robert Longson from comment #2)
> We seem to strip off the ref when loading in
> http://mxr.mozilla.org/mozilla-central/source/dom/base/nsDocument.cpp#909 so
> off the top of my head I don't know what's wrong. bzbarsky is more familiar
> with this than me given that he wrote this.

Boris, do you have ideas about where to look for this? FWIW, this also affects things like bug 1191230 inside chrome code...
Status: RESOLVED → REOPENED
Component: Untriaged → SVG
Ever confirmed: true
Flags: needinfo?(bzbarsky)
Resolution: INCOMPLETE → ---
The link given in comment 2 is where we load SVG external resources (paint servers and the like).  That's NOT the codepath where we load SVG images.

The codepath for SVG images and coalescing of their loads would be in imagelib code.  dholbert knows more about that stuff, but I expect we need to strip the ref when looking up things in the image cache and storing things in it, and we then need to make sure that multiple hits on the same imgRequest with different refs actually work properly.
Flags: needinfo?(bzbarsky) → needinfo?(dholbert)
It looks like we issue separate network requests for whatever.jpg#foo and whatever.jpg#bar, too -- i.e. this isn't SVG-specific.  (Though it's less useful for authors to distinguish between #foo and #bar for raster images vs. SVG images, so that's a less common use case)

Reclassifying this bug as ImageLib.
Component: SVG → ImageLib
Some strawman proposals, with coalescing done at progressively higher levels:

Strawman Proposal #1: Coalesce stack.svg#foo and stack.svg#bar into a single object in the image cache -- a single VectorImage object.

That can't easily work, that VectorImage would receive "Draw()" calls, and it wouldn't have any idea that it needs to navigate to #foo or #bar and update the document before it draws.  Plus, we store cached rasterizations of VectorImages, and those are only indexed by SVG-viewport-size -- we'd need to disable that functionality, or make the indexing more complicated, for this to work.

Strawman Proposal #2: Make stack.svg#foo and stack.svg#bar into separate VectorImage objects, but let them share a SVG document under the hood.  So when ImageLib receives the second request, it'd somehow find the first VectorImage and ask it for a reference to its document, and set up the second VectorImage to use that same document.  This might work, but it would be complicated to manage -- and we'd need to navigate that document back and forth between #foo and #bar for our draw operations (at least until we've got cached rasterizations), which is also a little complex/heavyweight.

Strawman Proposal #3: Use separate VectorImages & separate SVG Documents, but simply strip off the #ref when we issue the network request.  From imagelib's perspective, we won't be sharing anything, but hopefully this should allow the networking code to figure out that it's received 2 requests for the same resource, and it can serve the second one from the network cache (I think?) without actually hitting the network a second time.

Assuming this works [I don't know our networking code well, but I think it supports this], this seems like the easiest option & the least likely to break stuff.
Flags: needinfo?(dholbert)
(In reply to :Gijs Kruitbosch from comment #4)
> Boris, do you have ideas about where to look for this? FWIW, this also
> affects things like bug 1191230 inside chrome code...

Can you clarify what you're getting at RE how this impacts chrome code here? As I understand it, this bug is about trying to avoid hitting the network twice, and for icons in chrome code, we've already got all of the resources locally.
Flags: needinfo?(gijskruitbosch+bugs)
(I think the changes here would mostly be in imgLoader.cpp or imgRequest.cpp -- maybe in imgLoader::LoadImage() or something that it calls.  I don't know that code well enough to speculate further on how this should work & make use of the various caches, beyond the strawman suggestions in comment 9. I think Seth or Timothy know that code better than I do; tagging seth to get his thoughts here, when he's got a chance.)
Flags: needinfo?(seth)
(In reply to Daniel Holbert [:dholbert] from comment #10)
> (In reply to :Gijs Kruitbosch from comment #4)
> > Boris, do you have ideas about where to look for this? FWIW, this also
> > affects things like bug 1191230 inside chrome code...
> 
> Can you clarify what you're getting at RE how this impacts chrome code here?
> As I understand it, this bug is about trying to avoid hitting the network
> twice, and for icons in chrome code, we've already got all of the resources
> locally.

So in bug 1191230 we started using SVG icons for something. The SVG icons were meant to be different for win10 vs other versions of Windows. The typical way to do this with a raster image is:

- package foo.png
- package foo-preWin10.png
- add a chrome manifest directive:

override foo.png foo-preWin10.png osversion<10

to override foo.png with foo-preWin10.png iff the OS version < 10.

Now even if foo.png is a spritesheet with 6 or 10 or N different things in it, no matter which bit of it I need, I will get the right image on pre-win10 versions of Windows.

So far so good.

If you do this with SVG, and instead of a sprite and -moz-image-region or whatever, you use #refs to use different parts of the image, this doesn't work because we actually request them with the #ref, and so we look for chrome overrides with the #ref as well. So now I have to write 6/10/N override lines like:

override foo.svg#first foo.pre-Win10.svg#first osversion <10
override foo.svg#second foo.pre-Win10.svg#second osversion <10
override foo.svg#nth foo.pre-Win10.svg#nth osversion <10

etc.

Which is annoying and repetitive. While we could try to update the 'override' directive to ignore #refs itself (ie strip them out when doing lookups), and that might be a sensible thing to do anyway, fixing the SVG issue here would at least for the SVG case (which is by far the most common one) resolve the problem as well.


Updating just the network portion here seems sensible to me - let's see what Seth thinks.
Flags: needinfo?(gijskruitbosch+bugs)
Whiteboard: [gfx-noted]
The problem here is bug 1118926. We cannot merge requests for different fragment id's *in general* without removing #-moz-resolution.

That said: it used to be that hacking around that was extremely difficult, but now that ImageCacheKey exists we *can* just check if the URI contains the string "-moz-resolution" and treat the ImageCacheKey's as equivalent whenever it doesn't.

(In reply to :Gijs Kruitbosch from comment #12)
> Which is annoying and repetitive. While we could try to update the
> 'override' directive to ignore #refs itself (ie strip them out when doing
> lookups), and that might be a sensible thing to do anyway, fixing the SVG
> issue here would at least for the SVG case (which is by far the most common
> one) resolve the problem as well.

For this particular issue I think the 'override' directive is the problem. ImageLib definitely strips the fragment id out before doing the actual network request.
Depends on: 1118926
Flags: needinfo?(seth)
Whoops, I meant "-moz-sample-size" above. "-moz-resolution" was also an issue, but it has been removed now.
Depends on: 1120056
No longer depends on: 1118926
(In reply to Seth Fowler [:seth] [:s2h] from comment #14)
> Whoops, I meant "-moz-sample-size" above. "-moz-resolution" was also an
> issue, but it has been removed now.

It looks like the blocking usecase for the removal of -moz-sample-size (ie for bug 1120056) fixing is b2g-only, which is now tier-3. Is it possible to make the splitting of request + implementation of -moz-sample-size specific to b2g or something, so we can improve this for desktop, android etc. ?
Flags: needinfo?(seth)
(In reply to :Gijs Kruitbosch from comment #15)
> It looks like the blocking usecase for the removal of -moz-sample-size (ie
> for bug 1120056) fixing is b2g-only, which is now tier-3. Is it possible to
> make the splitting of request + implementation of -moz-sample-size specific
> to b2g or something, so we can improve this for desktop, android etc. ?

I'd be happy to straight up forbid -moz-sample-size except on B2G, yeah, which would make solving this problem everywhere else trivial. (At the cost of some uglification of ImageCacheKey, but at least we can contain it there.)

What I need to do is make sure that -moz-sample-size hasn't crept into any non-B2G code, and then we can decide how to move forward. I think what you propose is actually a better solution than what I proposed in comment 13, as parsing the URI to check for -moz-sample-size has a cost that we'd be better off not paying, so let's do it if we can.

I'm going to leave the needinfo request on and come back to this, because this is something better done at the start of a release cycle rather than this close to uplift.
Depends on: 1311246
No longer depends on: 1120056
Hello,

sorry that I have made a duplicate issue. Now I am wondering, why this is not fixed? It is still here after 2 years so I guess it is happening on much more sites. Is there a way how I can help with this bug to get fixed?
There's a guide to getting started as a developer if you want to work on a patch to fix this bug: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide
I also recently stumbled upon this issue. Such fragments could be a huge performance improvement if only Firefoy would handle them correctly (Chrome, Edge and Safari do nicely).

Now that Firefox is "Quantum", is there any ETA on this or any way how I can help to fix this? If I understood the discussion correctly, the needed changes are not that huge.
Now that bug 1118926 is fixed, we should definitely revisit this.

As far as comment 9 goes...  Strawman proposal #3 doesn't seem different from what we do now, since necko itself strips the ref before doing cache lookups and server access, right?  If this mechanism isn't working now, we should figure out why not.  Mostly likely because we send the second request to necko before we have managed to load and cache the first one, right?  Maybe worth checking with the necko folks.  For the particular testcase in this bug, of course, the '?' in the URL means that caching is going to not work very well to start with.

Given the other straman proposals, I think there are two plausible options here:

Strawman proposal #4: Hook up both images to the same necko channel, have them both receive the raw XML and parse it separately.   This proposal is not clear on what happens if one image is fully loaded and then the other one is requested.

Strawman proposal #5: Hook up both images to the same data structure that receives the data from the channel and parses it.  Then .cloneNode() the document around as needed.  This also works for the "fully loaded" case.

Daniel, thoughts?  Including on the question about how someone can help in comment 23?
Flags: needinfo?(dholbert)
As I can see from just looking into the Developer Tools, Chrome and Edge detect that the file should be only requested once while Firefox requests the file multiple times and sometimes (I did not yet figure out how the behavior changes) uses the cache for the response and sometimes waits for the server to reply.
This supports Boris' claim that the second request leaves before the first one is loaded and cached.

My test case is similar to #1421595.
(In reply to Boris Zbarsky [:bz] from comment #24)
> As far as comment 9 goes...  Strawman proposal #3 doesn't seem different
> from what we do now, since necko itself strips the ref before doing cache
> lookups and server access, right?  If this mechanism isn't working now, we
> should figure out why not.

Agreed (I have the same gut feeeling & curiosity here).

> Strawman proposal #4: Hook up both images to the same necko channel, have
> them both receive the raw XML and parse it separately.   This proposal is
> not clear on what happens if one image is fully loaded and then the other
> one is requested.
> 
> Strawman proposal #5: Hook up both images to the same data structure that
> receives the data from the channel and parses it.

[interjection: That'd be SVGDocumentWrapper, or possibly one level below it.)

> Then .cloneNode() the
> document around as needed.  This also works for the "fully loaded" case.
> 
> Daniel, thoughts?  Including on the question about how someone can help in
> comment 23?

RE how someone can help: I don't know that this is easily mentorable until we've figured out for sure why we're still sending separate requests here (i.e. why our existing kinda-implementation of strawman #3 isn't working).  Answering that question likely requires some knowledge of imagelib/necko internals & some time spent on debugging this.

RE strawman proposal #4 vs #5: I don't have a good handle on which would be more straightforward / maintainable. I guess #5 is more robust in the "already loaded" case, as you noted, so maybe that's what we should pursue.  I'm also not sure what happens in that scenario if we need to paint both copies of the image before we've finished receiving all of the data. (It sounds like under that proposal, we have to wait for the full document before we can clone, I think?)  I don't have a great handle on how parsing and/or cloning into multiple documents can/would work, so I'm kinda hand-waving in my thoughts here, obviously. :)
Flags: needinfo?(dholbert)
> I'm also not sure what happens in that scenario if we need to paint both copies of the image before we've finished receiving all of the data.

Do we do that with SVG images right now?

> I don't have a great handle on how parsing and/or cloning into multiple documents can/would work

Not very well.  You want to finish parsing, then start cloning.
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #27)
> > I'm also not sure what happens in that scenario if we need to paint
> > both copies of the image before we've finished receiving all of the data.
> 
> Do we do that with SVG images right now?

Hmm, I *thought* we did paint SVG images incrementally [as they download] in some cases, but I can't seem to make that happen in a testcase now.  Maybe I'm misremembering, or we did in the past at some point & had to disable it.

So, I suppose this part is less of a practical concern, which makes strawman #5 more tenable than I was initially imagining it to be.
On my site with:

> <img src="image.svg#id-1">
> ...
> <img src="image.svg#id-20">

Chrome, Opera performs only 1 request to image.svg (200 Kb).

But Firefox performs 20 requests! This is really terrible behaviour :(

It also looks like a duplicate of https://bugzilla.mozilla.org/show_bug.cgi?id=1027106
(In reply to x_O from comment #30)
> It also looks like a duplicate of
> https://bugzilla.mozilla.org/show_bug.cgi?id=1027106

Looks like you're right. Marking as such.
Status: REOPENED → RESOLVED
Closed: 8 years ago6 years ago
Flags: needinfo?(seth.bugzilla)
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: