Open Bug 153281 Opened 18 years ago Updated 1 year ago
No way to reach original, untransformed XML document from DOM
A friend and I are able to reach an XML document transformed by XSLT in its post-transformation form. However, we are having a great deal of trouble reaching the pre-transformation form of the document via the DOM. For instance, document.styleSheets.length == 0, despite the fact that the second line in the XML document is an <?xml-stylesheet ?> PI. View Source gives us the source document, untransformed. DOM Inspector sees only the transformed document. I have considered filing this bug as an RFE to add an "originalDocument" property to the Document node, as another Document node reflecting the untransformed document. I do not believe this is the correct solution, given that the count of stylesheets is incorrect.
Testcase: Open the above URI. Check document.styleSheets.length.
Actual results: document.styleSheets.length == 0.
Expected results: document.styleSheets.length == 1, or some reference to an untransformed XML document existing, with that document having styleSheets.length == 1.
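To make the expected result concrete: in an untransformed XML document, the <?xml-stylesheet ?> PI is an ordinary child node of the Document, so script could count it directly. The following is only a minimal sketch, assuming an untransformed Document is somehow available (which is what this bug asks for). countStylesheetPIs is a hypothetical helper name, not an existing API.

```javascript
// Hypothetical helper: counts <?xml-stylesheet ?> processing instructions
// that are direct children of a Document node.
function countStylesheetPIs(doc) {
  const PROCESSING_INSTRUCTION_NODE = 7; // same value as Node.PROCESSING_INSTRUCTION_NODE
  let count = 0;
  for (const node of Array.from(doc.childNodes)) {
    if (node.nodeType === PROCESSING_INSTRUCTION_NODE && node.target === "xml-stylesheet") {
      count += 1;
    }
  }
  return count;
}
```

Run against the transformed result document, this would typically find nothing, since the PI belongs to the source document.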
document refers to the result document, which is what most use cases of js need. Exposing the original document is an RFE, and I'd say this is rather a Mozilla DOM extension than anything else. I am not sure what kind of nsIDOMStyleSheet implementation we should have for XSLT, btw. What would we do on dynamic stylesheet removal? Heikki had issues with not transforming the source document, so should we do that? Hrm. I guess nobody tried to add an XSLT stylesheet PI via js yet. If we do it, I'd expect the originalDocument to be hooked up to the window object. I doubt we should expose it for documents that have been transformed by js. Use cases like the p3p viewer leave me puzzled. Not sure if we really wanna do that, putting more folks on CC. Heikki for the stylesheet adding and removing stuff, vidur as he's co-author of the DOM Style spec. (I keep QA and owner as is, as that'll end up on peterv anyway, the way the world turns) Lots of issues.
Severity: normal → enhancement
Component: XSLT → DOM Mozilla Extensions
OS: Windows 98 → All
Hardware: PC → All
I really think you should implement that. If there is no standard way to access the original doc, then create one. I can see tons of cases where this would be needed. For example, I add an XSLT stylesheet for the case that somebody opens that document in a browser, but the primary use is to get the information out of the original doc via DOM and JS for some fancy webapp. Or I have several uses for the same document (maybe in different browsers or in different apps), and in one case I need the transformed doc and in another I need the untransformed one.
FYI, I've filed bug 153799 for a crash bug caused when I click on the URI above.
I personally like the idea of grabbing the original, unmodified document's DOM, at least from a perspective of data (not rendering) and the DOM. Sometimes we might be able to garner data from the original document which the XSLT stylesheet has, deliberately or otherwise, obscured. Such data might be more efficiently accessible through the original document's DOM.
Alas, this bug's URL is dead.
Thanks to bz for advice on where to look. New URL field reflecting the file in LXR. I still want to see this implemented, but I just realized that we may pay a heavy price for this one extension. As I see it, we have three options. The first is to make the original document available as a DOM document node. If we go that route, that will result in a sizable increase in memory usage for a feature. Despite the enthusiasm of comment 3, this makes me think twice. The second is to instead store the text of the original document, and have the application developer call a DOMParser on, say, window.originalDocumentSource. Though this would use much less memory, bz says this might be much harder to implement. The third is WONTFIX. I really do not want to see that happen, but if the first and second options are unsavory to DOM module owners, then that's all she wrote. I'm willing to do the legwork on this one. I'd like module owners to advise me on which route they prefer.
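For illustration, the second option might look roughly like this from the application developer's side. This is only a sketch: window.originalDocumentSource is the hypothetical property floated above and exists in no browser, and parseOriginalDocument is a made-up helper name. DOMParser.parseFromString() is the only real API used.

```javascript
// Sketch of option two. `originalDocumentSource` is the hypothetical property
// proposed in this comment; it is not implemented anywhere.
function parseOriginalDocument(win, parser) {
  const source = win.originalDocumentSource;
  if (typeof source !== "string") {
    return null; // nothing stored for this window
  }
  // DOMParser.parseFromString(text, mimeType) builds a new Document from text.
  return parser.parseFromString(source, "application/xml");
}
```

In a browser this would be called as parseOriginalDocument(window, new DOMParser()). The DOM is built only when someone asks for it, which is where the memory saving over option one would come from.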
I must admit I don't see a huge need for this. If you need access to some data that is in the source document, bring it over to the result document while creating it. I.e. let the XSLT copy over all the data you need. For the use case in comment 2, where you are primarily using a document as data but want to be able to display it if it's opened in a browser, just put a PI at the top of it. Then we'll apply the stylesheet if the document is displayed, but we won't if you load it from js. Though I'm a little confused whether comments 0 to 2 are asking for more. It seems like they're saying that dynamic modifications to the source document would trigger automatic retransforms to create new result documents. Please take that to a new bug, although it does sound a little like bug 18722. Anyhow, back to the RFE in the summary. As a convenience (or possibly purity) feature I can see uses of this. But as the previous comment stated, keeping around a DOM that is very likely not to be used is a waste of memory. One way to possibly implement this without using more memory, though, would be to drop the DOM but still keep a handle to the cached datastream. Then when someone requests the original DOM, reparse it using the cached data. The problem is what to do if the data is removed from the cache. We could download it again, but that won't work for POST documents and is in general pretty evil. Another solution is to have some way in the XSLT to state that you want access to the original document. For example, something like <xsl:output moz:keep-source-document="true" .../>. I think this is the best solution from a Mozilla perspective, but it's nonstandard, with a small chance of getting support from any other browser.
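As a rough illustration of the "load it from js" route mentioned above: a document fetched with XMLHttpRequest is parsed into responseXML without being displayed, so, per the comment above, the stylesheet PI would not be applied to it. This is a minimal sketch. loadUntransformed is a hypothetical helper name, and the XHR parameter exists only so the function can be exercised outside a browser.

```javascript
// Sketch: fetch the XML from script instead of displaying it.
// `loadUntransformed` is a hypothetical helper; `XHR` defaults to the
// browser's XMLHttpRequest and is injectable for testing.
function loadUntransformed(url, onLoad, XHR = XMLHttpRequest) {
  const request = new XHR();
  request.open("GET", url);
  // responseXML is the parsed Document for an XML response.
  request.onload = () => onLoad(request.responseXML);
  request.send();
}
```

Note that this issues a new request for the URL, so it shares the drawbacks of re-downloading noted above (it does not help for documents generated from POST).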
jst's opinion, per #developers, is as follows: (1) This is not a high-priority bug. But if you want to work on it, go for it and good luck. (2) Does any other software product implement something like this? If so, we should mimic their behavior as closely as is practical. ("where it makes sense to") As I said before, I'm willing to work on an implementation, but don't expect me to submit a patch for review anytime soon. I'm taking this bug and cc'ing peterv, but this bug may still end up a WONTFIX.
Assignee: peterv → ajvincent
It seems IE is using window.document.XMLDocument : XMLDocument and window.document.XSLDocument : XMLDocument, based on the code for "Internet Explorer Tools for Validating XML and Viewing XSLT Output" found at http://www.microsoft.com/downloads/details.aspx?FamilyId=D23C1D2C-1571-4D61-BDA8-ADF9F6849DF9&displaylang=en
According to the DOM Inspector, the original XML (in text form) can be accessed from the textContent property in the view-source: URI conversion. For example, the following code: document.location = "view-source:test.xml"; alert(document.documentElement.textContent); should pop up a text box with the original XML content. Feeding this text through parseFromString() should produce the XML DOM. Is this something that could be wrapped into a convenience function whose result would only be generated on demand?
Unfortunately that won't work reliably. First off, it relies on caching. If the URL does not happen to be cached for some reason (too big, accessed through https, etc.), the page will be redownloaded and you might end up with a different document. Second, it won't work for pages that are generated from POST, since those aren't identified solely by their URL. And I don't think that anyone is interested in using a js function that just works 'sometimes'. As I've stated before, if you're interested in the original document, just copy it over to the result document. It'll just take a single XSLT instruction: <xsl:copy-of select="/"/>
*** Bug 305121 has been marked as a duplicate of this bug. ***
After four years, I'm moving this bug back to DOM default owner.
Assignee: ajvincent → general
QA Contact: keith → ian
Component: DOM: Mozilla Extensions → DOM
https://bugzilla.mozilla.org/show_bug.cgi?id=1472046 Move all DOM bugs that haven't been updated in more than 3 years and have no one currently assigned to P5. If you have questions, please contact :mdaly.
Priority: -- → P5
Component: DOM → DOM: Core & HTML