User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7) Gecko/20040803 Moonpartridge/0.9.3 (Firefox/0.9.3 polymorph) Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7) Gecko/20040803 Moonpartridge/0.9.3 (Firefox/0.9.3 polymorph) While this may be related to 181456, I don't think that it is exactly the same. The Save Page As... [Web Page, complete] function turns a series of into a series of Â/#194 characters (the actual character, not the ampersand entity) separated by spaces. Consequently the page looks bad when it is loaded back up in Firefox. I think that this may be a consequence of an encoding scheme mismatch (the local filesystem comes up with ISO-8859-1, whereas the W3 website uses UTF-8). Reproducible: Always Steps to Reproduce: 1. Load a webpage that has a long sequence of (the URL provided is a case where I noticed the problem) 2. File -> Save Page As... 3. Choose Web Page, complete as the file type 4. Save the file. 5. Load the file back in the browser and observe the ickyness. Actual Results: The sequence of " " looks like "A A A A A" (except the Acirc character instead). Expected Results: I don't mind that the save as function loses the &xxx; character sequences when it saves the file, as long as it saves the correct character. Ideally though, any unusual characters should be saved using the &#xxx; sequence to opportunitythe chance for such errors to occur. Now this could be a critical bug for two reasons, but I'll let someone else decide that. 1. It causes data loss (but it does have a workaround, albeit a painful and tedious one. 2. It should just work - this is not an advanced feature and the impact on users who see such a bug is fairly negative.
On your example URL, on which line appears this long "nbsp;..." string in the source code? I can't find it.
This is basically like bug 119146. I wonder why that patch doesn't help here..
Well, I have made some testcases. They might be useful (or not): http://martijn.heelveel.info/test/mozilla/headers/ (I don't have time right now to test and compare them)
Martijn's test cases show that this is only a problem when only the HTTP headers report the charset. If the charset is reported in the <meta ...> tags, the page displays correctly. The &xxx; sequences are still modified though. It would appear that the suggestion at the end of bug 119146 should be adopted.
> It would appear that the suggestion at the end of bug 119146 should be adopted. There's a suggestion there on how to fix the bug? Again, the fix in that bug should have handled " " (that's what that bug was about). It was fixed. I just realized that reporter is using Firefox, though (and damn extensions that hide the fact). And firefox has forked all the download code, and the fix for bug 119146 was NOT ported to firefox. Over to firefox; this works fine in Mozilla builds.
Assignee: file-handling → bugs
Status: UNCONFIRMED → NEW
Component: File Handling → File Handling
Ever confirmed: true
Product: Browser → Firefox
QA Contact: ian → bmo
14 years ago
Flags: blocking-aviary1.0? → blocking-aviary1.0+
Created attachment 160463 [details] [diff] [review] patch add addtl encoding flag for rich text saves.
br & trunk fixed.
Status: NEW → RESOLVED
Last Resolved: 14 years ago
Resolution: --- → FIXED
looks like it has been fixed on the branch, so could somebody please add the fixed-aviary1.0 keyword? (i know this is trivial but it would be useful for querying the overall number of remaining bugs)
You need to log in before you can comment on or make changes to this bug.