Closed Bug 128326 Opened 23 years ago Closed 23 years ago

Impossible to save XML files without dataloss

Categories

(Core Graveyard :: File Handling, defect, P2)

Tracking

(Not tracked)

VERIFIED FIXED
mozilla1.0

People

(Reporter: bzbarsky, Assigned: hjtoi-bugzilla)

References

Details

(Keywords: dataloss, regression)

Attachments

(2 files, 1 obsolete file)

BUILD: anything since after the "save page, complete" changes landed STEPS TO REPRODUCE: 1) Load an XML document in the browser (or an XHTML page served as application/xhtml+xml) 2) File > Save Page As... 3) Select the only option in the filter list ("xml") 4) Save ACTUAL RESULTS: Browser calls SaveDocument(), since we are saving a document and the selected option in the filter list is not the second one (this is where the logic error is). XML is saved from the DOM instead of from cache, causing dataloss (the XML saved is invalid, for one thing). EXPECTED RESULTS: Be able to save the original data.
Keywords: dataloss, nsbeta1
Status: UNCONFIRMED → NEW
Ever confirmed: true
cc'ing heikki... also notice that when i tried saving simple.xml (from content/xml/tests/), that the css file docbook.css isn't saved --the only file in simple_files/ is some file called mzcolor.html, which doesn't really exist orgininally. (simple.xml does refer to mzcolor.gif, but that file doesn't seem to be availble in the tests dir afaict...) fwiw, i saved the xml tests to my internal webserver at http://hopey.mcom.com/tests/xml/
Is this failure to save from cache a generic problem? Don't we still have a dup already open?
No, this problem is specific to XML, because we detect it as a "document" type so it follows the "document" path. But not being "text/html" it does not get a "web page, complete" option in the filepicker, it only gets a single "xml" option. that option is the first one in the list, just like the "web page, complete" option is for HTML, so the code decides that the user wants to save the complete page and just saves the DOM instead of saving the file. In the process of saving the DOM things like XML declarations are removed (due to the XML encoder being broken). The point is that the only way to save XML is in "complete" mode, and the user has no indication this is happening.
Status: NEW → ASSIGNED
Priority: -- → P3
Target Milestone: --- → mozilla1.0.1
nsbeta1- per Nav triage team, ->1.2/helpwanted.
Keywords: nsbeta1helpwanted, nsbeta1-
Target Milestone: mozilla1.0.1 → mozilla1.2
Boris, is the saved XML really invalid? Missing XML declaration is a known problem in the DOM. There is no DOM object for the XML declaration. IE uses PI, which is against the spec but the most reasonable thing to do in lack of a correct object. So what I am saying is the encoder is not in error in this case, it is just that we never even put the XML declaration into the DOM in Mozilla.
Well... the saved XML _is_ technically valid in that the XMLDecl production is optional in the prolog. However, charset information needed to decode the document is in the XML declaration. So we can save valid XML but only as long as said XML is encoded as UTF-8. More to the point, this particular bug is not about the document encoder. The bug is that we save the DOM instead of the original file without telling the user that and don't offer the option of saving the original file. This is purely an XP Apps problem. In addition to the charset issue, the current save method will modify the attributes of some elements in the file if those elements happen to be in the XHTML namespace... So there are dataloss concerns other than the encoding here.
See bug 131620. This causes serious dataloss (no way to open the page anymore), not just theoretical dataloss. Renominating. Proposed simple fix -- make XML types not get detected as "document types" in contentAreaUtils.js until we actually properly save them as "document types"
Keywords: nsbeta1-nsbeta1
dup of bug 126669
Boris, when renominating bugs for nsbeta1, you need to give a compelling reason why it has high impact for the MachV release. Nearly all of our customers will never do this, which is why it is nsbeta1-.
Keywords: nsbeta1nsbeta1-
This has to be fixed! I have one very crude fix to disable save as complete happening at all for XML types, but I'd like something better if I can figure out how. I also noticed another error, anyone know if there is a bug for this: from context menu Save Link As... will always offer to save "Web Page, HTML Only", regardless of the actual mime type. This scares the heck out of me, and the most annoying thing is it actually appends .html to the end of the filename unless I enclose the whole name in quotes (this on Windows).
Assignee: ben → heikki
Status: ASSIGNED → NEW
Priority: P3 → P2
Target Milestone: mozilla1.2alpha → mozilla1.0
So unless I can figure out how to add another option to the filepicker (I haven't found how or where this happens) I am going to disable the whole save as complete feature for XML. Oh, and I'll do some testing to see if the save as complete actually works (I have already fixed some bugs there so I know some cases where it works). Looking for reviews for this one in case I need to go with this approach.
Seems like Save As Complete in XML seems to save images and external scripts OK. For some reason stylesheets don't get saved.
> I also noticed another error, anyone know if there is a bug for this: from > context menu Save Link As... will always offer to save "Web Page, HTML Only", > regardless of the actual mime type. We use a HTTP HEAD request to get the mime type of the document on the other end of the link and at the moment we unconditionally bypass cache to do it (though changing that will not really help). If you're getting text/html, it sounds like the web server is sending text/html for all HEAD requests (I know the bugzilla web server does this, regardless of the type of the actual content, but other servers are better configured).
Ok, that HEAD thing explains the save target as weirdness. Anyway, I have identified enough problems with save as complete to make me decide I really want to disable it for XML. I opened up a bug for enabling it, see bug 136717. So, reviews for the patch please...
Status: NEW → ASSIGNED
I think the code here is broken: http://lxr.mozilla.org/seamonkey/source/xpfe/communicator/resources/content/contentAreaUtils.js#285 since it only provides post data for text/html. So if the XML came from a POST submit, then it won't work right. But it seems like that line of code is broken anyway since the same problem exists right now with text/plain, for example. Boris?
Attachment #78406 - Flags: review+
Comment on attachment 78406 [details] [diff] [review] Disables Save As Complete for XML r=rbs
Yep. What Bill says in comment 16 is definitely true. Basically, we should not be using |isDocument| to determine whether to grab the POST data, but should rather use aData.document.
So shall we fix that here, or via another bug? I say here, since it is a trivial fix and likely to go unfixed otherwise.
So is this the fix? I don't have a clue, I am just writing what I think others are saying...
well, you still want to return false for the XML case... But the first part of the patch is correct, yes.
...regression from the earlier patch :-)
Attached patch Correct patchSplinter Review
Sorry, I was testing things and forgot it...
Attachment #78795 - Attachment is obsolete: true
Comment on attachment 78804 [details] [diff] [review] Correct patch r=bzbarsky
Attachment #78804 - Flags: review+
Comment on attachment 78804 [details] [diff] [review] Correct patch sr=blake
Attachment #78804 - Flags: superreview+
Fixed on trunk, leaving open so it is easier to track. I mailed drivers earlier today, but have not yet heard back...
Whiteboard: [fixed on trunk]
When it's been tested on the trunk, please update this bug.
Comment on attachment 78804 [details] [diff] [review] Correct patch a=asa (on behalf of drivers) for checkin to the 1.0 branch
Attachment #78804 - Flags: approval+
tested using today's commercial trunk builds (linux, mac os x, win2k): 1. opened http://hopey.mcom.com/tests/xml/simple.xml 2. hit accel+S to do a "save page as" --the resulting file picker has "extensible markup language" as the only save option. 3. hit save button. results: the only file saved is "simple.xml" --the css file "docbook.css" is not saved. (there is no subsidiary directory created either.) is this expected behavior? (btw, source for saved file is the same as the one on the server.)
Yes, that is the expected behavior. Once we enable save as complete again you will have two options (save as complete or same XML only), see bug 136717 for that.
Adding adt1.0.0+ Please check in to branch asap, & mark fixed.
Keywords: adt1.0.0adt1.0.0+
Checked in on branch as well.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Keywords: adt1.0.0+fixed1.0.0
Resolution: --- → FIXED
Whiteboard: [fixed on trunk]
sairuh verified this on trunk earlier, marking as such.
Status: RESOLVED → VERIFIED
*** Bug 141741 has been marked as a duplicate of this bug. ***
verified fixed on the branch using 2002.06.27.0x comm bits on all platforms.
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: