Closed Bug 1051117 Opened 10 years ago Closed 2 years ago

Symlinked SVG images (using "svg stack" trick w/ #ref) in chrome:// URIs may not render, due to nsFileChannel stripping off #ref when resolving symlink

Categories

(Core :: Networking: File, defect, P3)

defect

Tracking

()

RESOLVED FIXED

People

(Reporter: gps, Assigned: dholbert)

References

()

Details

(Whiteboard: [fixed by bug 1054289][necko-backlog])

Attachments

(3 files)

In bug 1045415, we tried to remove a build warning by removing preprocessing of an SVG file. Unfortunately, this caused the SVG file to not render. See bug 1016405 comment 57.

A side-effect of preprocessing SVG files is that the objdir will contain a raw file as opposed to a symlink back into the source directory. Since the symlink should be the only difference between the two modes, the theory is that is somehow related. Personally, I suspect some kind of extra permissions checking on SVG files that isn't liking the symlink.

I have no clue who would know about these things. needinfo ehsan to triage this.

If symlinks are the cause, we can either fix Gecko. Or, if that is not possible or is too much work, we can change the build system to not symlink SVG files (as hacky as that is).

Don't rule out something with the way this particular svg (browser/themes/*/content-contextmenu.svg) is used - I don't think anyone has looked into this too closely.
Flags: needinfo?(ehsan)
I don't have any information on what's going on here.  I think Dao does.

This file is used like the below here: <http://mxr.mozilla.org/mozilla-central/source/browser/themes/shared/contextmenu.inc.css#12>

list-style-image: url("chrome://browser/skin/content-contextmenu.svg#back");

(Same as in other places in that file.)

So, my guess is either the symlink breaks the chrome packaging stuff, or the SVG image rendering.  Moving to Core::SVG for now...
Component: General → SVG
Flags: needinfo?(ehsan) → needinfo?(dao)
Summary: Symlinked SVG files may not render → Symlinked SVG images embedded in chrome:// URIs may not render
Bug 1016405 Comment 72 suggests a solution
(In reply to Tim Nguyen [:ntim] from comment #2)
> Bug 1016405 Comment 72 suggests a solution

I don't know what + does, but that seems like a hack at best, not a solution.
'+' in jar.mn forces the creation of a new file. See https://ci.mozilla.org/job/mozilla-central-docs/Tree_Documentation/buildsystem/jar-manifests.html

Use of a '+' is a workaround, not a proper fix.
Do we have STR to see this with current Nightly or current mozilla-central?

(i.e. is this breaking any images in nightly/mozilla-central right now, or do you need to make a local change to trigger the bug?)
I think I can reproduce this by simply visiting chrome://browser/skin/content-contextmenu.svg#back in a debug build. That renders blank in my debug build. (Is that this bug?)  In an opt build, it loads the backarrow just fine.

I think there might be some sort of race condition here, because it seems to load correctly under some circumstances, like if I load the SVG file *first* and *then* add the #suffix. (Or if I navigate somewhere else and then go back to it.)

So e.g. if I do:
 1. Load chrome://browser/skin/content-contextmenu.svg (with optional #anythingYouLike) --> BLANK
 2. Edit location bar to make it end in #back --> RENDERS CORRECTLY
Unsurprisingly, this style rule...
> use:not(:target) {
>   display: none;
> }
...seems to be failing to apply. So, this is a bug at the style-system level or earlier.

In particular, when this reproduces, the frame tree for the SVG document is empty.  (So, everything is still "display:none", including the <use id="back"> element.)

If I edit the rule to have a different property set here (e.g. I tried "fill:red" on a simplified document), then I see that property being universally set on all the <use> elements, regardless of the "#back" suffix in the URL bar.  (though as noted in comment 6, it's only broken on first load / on reload; not when simply editing the URL)
Two other points:
 (1) At least on my system, this doesn't seem to depend on the fact that content-contextmenu.svg is a symlink. I see the same bug in my debug build if I move the symlink out of the way and create a new file called "content-contextmenu.svg" file with the same contents.
 (2) If I view the file directly, I don't see the bug. So there *is* something special about viewing the file via a chrome:// URI.
 (3) If I replace the file's contents with some XHTML and try to reproduce the bug with an equivalent <style> tag in that XHTML (e.g. styling <divs> as display:none unless they're :target'ed), I can't reproduce -- I get correct behavior, with :target'ed divs showing up.  This only seems to happen with SVG for some reason.
(er, "three other points)
(In reply to Daniel Holbert [:dholbert] from comment #6)
> I think I can reproduce this by simply visiting
> chrome://browser/skin/content-contextmenu.svg#back in a debug build. That
> renders blank in my debug build.

(Unsurprisingly, I need to have bug 1045415's patch applied locally to reproduce this, too. In comment 6, I was using a few-days-old build that had that patch in the tree.)

(In reply to Daniel Holbert [:dholbert] from comment #8)
> Two other points:
>  (1) At least on my system, this doesn't seem to depend on the fact that
> content-contextmenu.svg is a symlink. I see the same bug in my debug build
> if I move the symlink out of the way and create a new file called
> "content-contextmenu.svg" file with the same contents.

This part isn't actually true. Now if I repeat those steps while reproducing the bug (converting content-contextmenu.svg to an actual file instead of a symlink, in my objdir) and reload, it fixes the bug.
(And actually, I can reproduce this with XHTML after all. So comment 8 is mostly-incorrect. Sorry for confusion.)

Reclassifying as a CSS bug, since this doesn't seem to be SVG-dependent, and seems to be an issue with us figuring out which CSS rules apply or don't. (It may actually end up belonging somewhere like Document Navigation, but the non-SVG-dependent aspect means it's not appropriate for SVG at least.)
Component: SVG → CSS Parsing and Computation
Version: unspecified → Trunk
Here's a diagnostic patch, which breaks this bug (applying bug 1045415's patch) and replaces the SVG with some XHTML that triggers the same issue.

With this patch applied, when I visit chrome://browser/skin/content-contextmenu.svg#back or chrome://browser/skin/content-contextmenu.svg#forward , the first load renders nothing; but if I subsequently edit the URL to tweak the #suffix, and then navigate back/forward, *then* the content actually renders. (And if I reload while things are working, that breaks it.)  The file extension is still SVG, but I've replaced its contents with XHTML, to demonstrate that this isn't SVG-specific.

(I've only tested this on Linux, but I've copied my XHTML-masquerading-as-content-contextmenu.svg file into other platforms' theme directories, so this should be usable for poking at the bug on any platform.)
Attachment #8471852 - Attachment description: diagnostic patch 1 (linux-only): replace SVG with XHTML → diagnostic patch 1: replace SVG with XHTML
This patch uses #back and #forward to seek to targets at the bottom of the page, instead of to display content.

If I load the file directly (via file:///) with #back or #forward, I jump to the right part of the page. I stay at the right part of the page if I shift+reload.

BUT if I load the file via its chrome:// URI (with #back or #forward on the end), I'm not taken to the right part of the page.

So, it's looking like we completely ignore #suffixes on chrome:// URLs, on first load. Or something like that. I wonder if this is a security measure of some sort.
(diagnostic patch 2 demonstrates that this is not a CSS bug. Moving to General for the moment)
Component: CSS Parsing and Computation → General
What does nsChromeRegistry::Canonify (which is called from nsChromeProtocolHandler::NewURI) do to the #?
It doesn't do anything to it -- the URI isn't changed by that function. (The #suffix remains intact.)

Poking around to see what happens after that, though...
I think the ref might be stripped in something inside of nsFileChannel::nsFileChannel().  At least, that gets called with the arg being the full file:// equivalent of our chrome URI (*including* the ref), but by the time we get to its SetURI call, we're working with targetURI = the file:// URI *without* the ref.

*And* the logic there depends on "symlink" (a bool indicating whether our file is a symlink or not).

In particular, we get a call with uri="file://$OBJDIR/.../content-contextmenu.svg#back", and the logic flow there is:
>254   nsCOMPtr<nsIFileURL> fileURL = do_QueryInterface(uri);
>255   if (fileURL && 
>256       NS_SUCCEEDED(fileURL->GetFile(getter_AddRefs(file))) &&
>257       NS_SUCCEEDED(file->IsSymlink(&symLink)) && 
>258       symLink &&
>259       NS_SUCCEEDED(file->GetNativeTarget(fileTarget)) &&
>260       NS_SUCCEEDED(NS_NewNativeLocalFile(fileTarget, PR_TRUE, 
>261                                          getter_AddRefs(resolvedFile))) &&
>262       NS_SUCCEEDED(NS_NewFileURI(getter_AddRefs(targetURI), 
>263                    resolvedFile, nullptr))) {
>264     SetURI(targetURI);

...which produces targetURI = "file://$SRDIR/.../content-contextmenu.svg", which is notably:
 - now in my srcdir (the actual file path, after traversing the symlink)
 - lacking the "#back" suffix

This is the bug, I think -- we're inadvertantly stripping the #ref when taking this "symlink" codepath here.

(We don't hit this when I view the file:// URI directly because I think some other code handles the symlink translation. In particular: when I load the file directly, the location bar is updated to show the actual path (not the symlink path that I told it to load). So I think something higher-up than nsFileChannel() is doing that translation.)
Component: General → Networking: File
Summary: Symlinked SVG images embedded in chrome:// URIs may not render → Symlinked SVG images (using "svg stack" trick w/ #ref) in chrome:// URIs may not render, due to nsFileChannel stripping off #ref when resolving symlink
So, if we want to make this SVG file a symlink (i.e. if we want to fix bug 1016405), then first we need to either:
 (1) fix the nsFileChannel bug here, laid out in comment 17, and prevent the #ref from being removed from the translated URI
...or:
 (2) split this image into actual standalone SVG images, instead of an "svg stack" that depends on having a #ref in the URI

(Part (1) is a real bug and is worth fixing, but I kinda think we should do (2) as well...  Note that SVG stacks may be nice for combining images into a single file, but they don't actually confer any perf benefit in imagelib.  Unlike a traditional raster-image sprite (with a grid of images that gets positioned & masked), I think an SVG stack (with CSS and #refs to hide/show things) will generates a different image in imagelib for each #ref version that gets used, and each version will have its own full copy of the SVG document (since the URI is different).  And each copy has the baggage of the full DOM, including all of the pieces that aren't being rendered in that version. So, not very scalable.)
Flags: needinfo?(dao)
(In reply to Daniel Holbert [:dholbert] from comment #19)
> (Part (1) is a real bug and is worth fixing, but I kinda think we should do
> (2) as well...  Note that SVG stacks may be nice for combining images into a
> single file, but they don't actually confer any perf benefit in imagelib.

I've been working to make sure that SVG stacks perform much better than multiple separate image files, so I'm not too keen on (2).
(In reply to Daniel Holbert [:dholbert] from comment #17)
> This is the bug, I think -- we're inadvertantly stripping the #ref when
> taking this "symlink" codepath here.

Sounds like something bz might have an opinion on.
Flags: needinfo?(bzbarsky)
Propagating the ref to the new filename makes total sense.  The fact that it doesn't happen now is just an oversight.
Flags: needinfo?(bzbarsky)
Can we please get this rolling?
Flags: needinfo?(dholbert)
Sorry, I didn't realize the ball was still in my court here. Anyway -- sure, this shouldn't be too hard to finish off.
Assignee: nobody → dholbert
Flags: needinfo?(dholbert)
dao, can you actually reproduce this bug? I can't, rendering-wise, via the context-menu or via directly loading the chrome:// URI from comment 6.

(Though I can see in GDB that the #ref is still failing to be copied in the code referenced in comment 17.)

It looks like this file has been made back into a symlink, in bug 1109826, and I verified that it's a symlink in my objdir, so I'm not sure why I can't reproduce it. Maybe it's already been hacked around elsewhere somehow?
Flags: needinfo?(dao)
(I think this should fix the issue described in comment 17, though I want to make sure this is still reproducible somehow before landing this.)
I think this may have been fixed by bug 1054289's patch, actually. (mozilla-central changeset bdcf3a2da55c)

That line tweaked a "ScrollToRef()" call to pass in the original document URI (which in this case, includes the #ref suffix), instead of the nsContentSink's "mDocumentURI" (which does not).

If anyone can still reproduce this bug, then we may want my attached "fix v1"; but if not, I tend to think we should just close this as FIXED by bug 1054289.
Depends on: 1054289
(In reply to Daniel Holbert [:dholbert] from comment #27)
> I think this may have been fixed by bug 1054289's patch, actually.
> (mozilla-central changeset bdcf3a2da55c)
> 
> That line tweaked a "ScrollToRef()" call to pass in the original document
> URI (which in this case, includes the #ref suffix), instead of the
> nsContentSink's "mDocumentURI" (which does not).

More specifically: that changeset switched us from using nsContentSink's "mDocumentURI" to instead using "mDocument->GetDocumentURI()", and I believe that fixed this bug.

Here's where those values come from (to explain why the former is wrong and the latter is right, for the purposes of this bug):

 - nsContentSink's "mDocumentURI" is taken directly from the channel's "GetURI" method (called in XMLDocument::StartDocumentLoad) -- that's passed from there into NS_NewXMLContentSink, which is how the content sink gets it.  And as noted in comment 17, the channel's "GetURI()" is missing the hash, because it gets stripped while we set up the channel.

 - mDocument->GetDocumentURI() is a getter for nsDocument::mDocumentURI, which comes from the 'uri' variable in nsDocument::Reset(), which is populated via NS_GetFinalChannelURI(), which in this case returns the "original" URI (with the #ref).

So, mDocument->GetDocumentURI() includes the #ref (in this case at least), so I expect this should be fixed now that we're using that. (as of bug 1054289)
Yeah, this seems to work now. The reason for why I wasn't seeing the SVG in the patch I'm working on seems to have been a different one.
Flags: needinfo?(dao)
Attachment #8538609 - Attachment description: fix v1 → fix v1 [probably correct, but ultimately not needed for this bug anymore]
Cool. I'm going to call this fixed by bug 1054289 then.

(Not pushing to land my attached patch -- it's probably correct, but I'm not aware of any actual broken behavior that it fixes, and I don't really know its area of code well enough to be sure that it won't cause subtle brokenness/regressions somehow.  So, my internal cost/benefit analysis leans against taking it, unless we discover something that it actually fixes in practice.)
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Whiteboard: [fixed by bug 1054289]
> - nsContentSink's "mDocumentURI" is taken directly from the channel's "GetURI" method 

That seems pretty broken.  It should, imo, do the same exact thing that nsDocument does.
Flags: needinfo?(dholbert)
Just filed bug 1148173, which is related to this one.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
See Also: → 867407
Whiteboard: [fixed by bug 1054289] → [fixed by bug 1054289][necko-backlog]
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P1
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P1 → P3
Severity: normal → S3

Just re-ran across this bug, when triaging old needinfo's.

Two notes:
(1) I've lost the relevant context for my pending needinfo from comment 31 here (and it's about a question from nearly a decade ago). I'm not sure if things still behave as-described there, and I don't immediately have cycles to dive in to investigate, so I'm canceling that needinfo at this point.
(2) It looks like this bug was left open by accident. Looks like we called this fixed in comment 30, and then it was reopened in comment 32, but I suspect that may've been an accident? (e.g. maybe BillM was prepping to reopen and then thought better of it and filed the separate bug, but forgot that he had toggled the "reopen" option when hitting submit.)

In any case, we've since fixed the associated bug that BillM linked when reopening this bug. (He filed bug 1148173, which was duped to bug 867407, which was later fixed.) So I don't think there's any reason for this to be open at this point.

Flags: needinfo?(dholbert)
Status: REOPENED → RESOLVED
Closed: 10 years ago2 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: