Closed Bug 115174 Opened 23 years ago Closed 16 years ago

Save Page As Web Page, HTML Only will attempt to resubmit cgi to save instead of using cached or displayed content

Categories

(Core Graveyard :: File Handling, defect, P2)

x86
All
defect

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: bugzilla, Unassigned)

References

(Blocks 1 open bug)

Details

(4 keywords)

Attachments

(1 file)

bill has a counter cgi test to illustrate this, but here's a summary test case he told me about:
1. go to a page that has a counter. let's say the counter displays "1013 visits to date!"
2. save the page [ctrl+S, File > Save Page As, etc.].
3. look at the page that you had saved locally.
results: the counter now reads "1014 visits to date!"
feel free to resummarize this bug --my language skills are kinda woggly today. ;)
Keywords: regression
Would trying PERSIST_FLAGS_FROM_CACHE before falling back on PERSIST_FLAGS_NONE work?
Summary: saving page w/counter will resubmit cgiand save that rather than the cached version → saving page w/counter will resubmit cgi and save that rather than the cached version
Status: NEW → ASSIGNED
Priority: -- → P2
Target Milestone: --- → mozilla0.9.8
the way the idl is documented suggests that using PERSIST_FLAGS_FROM_CACHE would result in potential failures if the document was not cached. Isn't the default behaviour of WBP to use cached data if it's present (which in the counter case it certainly is) - Adam?
Adding the PERSIST_FLAGS_FROM_CACHE should ensure the cached copy is saved if it exists but it depends how the counter data is being embedded. If the site uses JS to generate the counter, this might not work, e.g. my own site (http://www.iol.ie/~locka/mozilla/mozilla.htm).
the problem with doing that, though, is that according to the documentation this could cause other saves to fail if the data isn't in the cache... based on my interpretation of the interface, the default behaviour should be best, if it uses cached data when it exists, falling back on retrieving from the network. Is this what happens?
I believe the default behaviour should fetch from cache if it exists or from the net otherwise. Doug can you just confirm that this code below should do just that - when I fetch a URI it will get it first from cache unless the flags say otherwise? http://lxr.mozilla.org/seamonkey/source/embedding/components/webbrowserpersist/src/nsWebBrowserPersist.cpp#458
I discussed this a bit with Ben yesterday and suggested that we may not be able to use the same cache strategy for all downloads. In the case of "save page" (and the various frame-related flavors of that) and "save image," we most likely want whatever copy is in the cache (if there is one). That's "validate never" in terms of the lower-level (and older?) necko caching flags (which are distinct from those supported on the nsIWebBrowserPersist interface). But in the case of "save link as," we need to do standard cache validation, I think (i.e., respecting the caching control specified by the web server via http response headers, and the user prefs for validating once-per-session, etc.). If we don't do that, then "save link as" will sometimes result in saving stale data. I think "save link as" should save the same information the user would see if they clicked on the link. Note that this is how nsIStreamTransfer::selectFileAndTransferLocationSpec used to be utilized. It has a "doNotValidate" argument that was PR_TRUE for saving pages and images, and PR_FALSE for saving links.
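To make the two strategies concrete, here is a minimal server-side sketch, assuming PHP and a hypothetical file name (validate.php; nothing here comes from the Mozilla tree). Standard cache validation means the client revalidates with a conditional GET and may receive a 304, reusing its cached copy; "validate never" would skip that round trip and take whatever the cache holds.

<?php
// validate.php -- hypothetical endpoint; illustrates standard HTTP/1.1
// cache validation (the "save link as" path), as opposed to "validate never".
$lastModified = filemtime(__FILE__);

// Conditional GET: if the client's cached copy is still current, answer
// 304 so it reuses that copy instead of re-downloading the body.
if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) &&
    strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) >= $lastModified) {
    header('HTTP/1.1 304 Not Modified');
    exit;
}

header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $lastModified) . ' GMT');
header('Cache-Control: max-age=0, must-revalidate'); // always revalidate
echo 'fresh copy generated at ' . gmdate('c');
?>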
so my question is, is it likely that you could miss in the cache in any of the save situations proposed? (and thus fail to do anything - not retrieve from network)
Keywords: nsbeta1
Target Milestone: mozilla0.9.8 → mozilla0.9.9
OK, I pushed this out to .9.9 because even when I pass in the unconditional USE_CACHE flag, the counter at the URL specified is incremented when I load the saved version and compare it to the one on screen. Adam, this seems to imply something is incorrect inside webbrowserpersist, unless I'm missing something?
Doug can you confirm that webbrowserpersist is doing everything it can to ensure data gets pulled from the cache first? http://lxr.mozilla.org/seamonkey/source/embedding/components/webbrowserpersist/src/nsWebBrowserPersist.cpp#458

The http spec <http://www.ietf.org/rfc/rfc2616.txt> does allow servers to specify cache control and expiry headers, so downloaded data might not be in the cache, but in this instance there appears to be none of that. The log of data being sent back and forth indicates that this is a straightforward operation and the data should be retrieved from cache if it is there.

D:\personal\junkbuster\junkbstr.exe: accept connection ... OK
scan: GET /cgi-bin/counter.exe HTTP/1.1
scan: Host: law.mcom.com
scan: User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.4) Gecko/20011019 Netscape6/6.2
scan: Accept: text/xml, application/xml, application/xhtml+xml, text/html;q=0.9, image/png, image/jpeg, image/gif;q=0.2, text/plain;q=0.8, text/css, */*;q=0.1
scan: Accept-Language: en-us
scan: Accept-Encoding: gzip, deflate, compress;q=0.9
scan: Accept-Charset: ISO-8859-1, utf-8;q=0.66, *;q=0.66
scan: Keep-Alive: 300
scan: Proxy-Connection: keep-alive
crunch!
addh: Proxy-Connection: Keep-Alive
D:\personal\junkbuster\junkbstr.exe: GPC law.mcom.com/cgi-bin/counter.exe
D:\personal\junkbuster\junkbstr.exe: connect to: law.mcom.com ... OK
scan: HTTP/1.1 200 OK
scan: Date: Wed, 16 Jan 2002 18:51:08 GMT
scan: Server: Apache/1.3.14 (Win32)
scan: Transfer-Encoding: chunked
scan: Content-Type: text/plain
D:\personal\junkbuster\junkbstr.exe: accept connection ...
darin should look at that.
yes, it appears that the improper load flags are being set on the http channel used to request the document. nsIRequest::LOAD_FROM_CACHE should be set on the channel to avoid hitting the net for the document. in this case it appears that this flag is not being set, and because the page is served up w/ a zero freshness lifetime, it is re-requested from the server instead of being pulled from the cache (just as it would be if the user visited the page via a href).
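For reference, a minimal sketch of a counter CGI with a zero freshness lifetime, assuming PHP and hypothetical file names (the original counter.exe is not available). With Cache-Control: max-age=0, any request issued without LOAD_FROM_CACHE goes back to the server, and the counter increments exactly as described above.

<?php
// counter.php -- hypothetical stand-in for the counter.exe CGI in the log.
// max-age=0 gives the response a zero freshness lifetime, so a re-request
// made without LOAD_FROM_CACHE hits the server and bumps the count.
header('Content-Type: text/plain');
header('Cache-Control: max-age=0');

$file = 'count.txt';
$count = is_file($file) ? (int) file_get_contents($file) : 0;
$count++;
file_put_contents($file, (string) $count);

echo $count . ' visits to date!';
?>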
Ben, webbrowserpersist will pass nsIRequest::LOAD_FROM_CACHE if PERSIST_FLAGS_FROM_CACHE is passed to it, therefore I think this is the thing to try first. If that doesn't work, I don't know what the problem could be. The load flag setting coding in webbrowserpersist is pretty straightforward.
*** Bug 118487 has been marked as a duplicate of this bug. ***
OS: Linux -> All (according to duped bug 118487)
OS: Linux → All
nsbeta1+ per ADT triage team, assuming this means an incorrect page could be saved in the general case. Adding topembed keyword in case this is an embedding issue too.
Keywords: nsbeta1 → nsbeta1+, topembed
-> 1.0
Target Milestone: mozilla0.9.9 → mozilla1.0
isn't this a really old bug, related to view source doing the same thing as well?
Whiteboard: dupeme
nsbeta1- per ADT
Keywords: nsbeta1+ → nsbeta1-
Target Milestone: mozilla1.0 → mozilla1.2
topembed- per EDT triage.
Keywords: topembed → embed, topembed-
This is fixed in recent builds. It was related to the famous view source/CGI/cache problem, I think. I tested this successfully at http://www.sdsu.edu/~boyns/counter.html with 1.0 RC1 branch Build ID: 2002050706 Windows 98. Updating URL, as the old one is dead. Can we resolve this worksforme?
As a note, the view-source fix should not have affected this... so if it now works, it's for some other reason.
nominating for buffy. but should this go to adam or rpotts?
Keywords: nsbeta1- → nsbeta1
WFM.
*** Bug 166584 has been marked as a duplicate of this bug. ***
QA Contact: sairuh → petersen
Depends on: 177329
nsbeta1- per the nav triage team.
Keywords: nsbeta1 → nsbeta1-
Despite whatever "Cache-Control" header the server has given you, there needs to be a way to save the CURRENT PAGE. I.e., whatever source is being rendered in the browser right now. I use this feature to save invoices / order confirmations from several vendors, but I've "regressed" to using Internet Explorer for all orders, because Mozilla will always resubmit the order if I try to "Save" the confirmation / invoice. "Save Page" should save what's being displayed in the browser window, just like "View Source" should show you the source that's being rendered, rather than reconnecting to the website and downloading new source. This is a Mozilla anachronism.
mozilla@acct.whipsnap.org: are you saying that mozilla will automatically POST data to a server without asking you for confirmation first? if so, have you tried increasing the size of your memory and/or disk cache to remedy the problem? (i'm not claiming that this is a great solution, but i'm just trying to understand the problem you are seeing.) thx!
*** Bug 151124 has been marked as a duplicate of this bug. ***
> are you saying that mozilla will automatically POST data to a server

Yes.

> without asking you for confirmation first?

Yes. The problem is that until bug 170722 got fixed you could pass the postData to the persistence object (so we did), but you could not pass a cache key (so it never set the channel to only read from cache). The result was that trying to save a POST result page would repost silently. Now that the persistence object takes a cache key, we should be able to fix this.... (contentAreaUtils.js, just pass around the nsIWebPageDescriptor along with the postdata stream....)
I spoke too soon... the postdata handling in there is _also_ broken; it just uses the postdata for the toplevel page and breaks if you do, e.g., "save frame as". <sigh>.
*** Bug 193110 has been marked as a duplicate of this bug. ***
I would argue for a severity increase (dataloss). This problem can result in people losing things like order numbers, inadvertently placing duplicate orders and similar. As a quick fix, warning the user that the page will be resubmitted is essential.

Moz 1.2.1 doesn't appear to suffer this problem, but 1.3b does. URL needs updating as it is broken; how about http://www.sys3175.co.uk/try.php ?

The description for this bug is terrible, basically unfindable by people about to file this as a bug. My experience is this only happens with POST pages - is this the case? (If so, the description needs to reflect this, to make it easier for people to find => fewer dupes.) Target milestone also needs updating.
Depends on: 84106
Removing past target. Adam, didn't this get worked on recently on a separate bug report?
Target Milestone: mozilla1.2alpha → ---
The patch against bug 177329 would add some extra flags to allow finer cache control behaviour. This bug covers a lot of the same ground as that one.
Blocks: 210065
I've seen this behavior when making payments to Discover Card (http://www.discovercard.com) and Citicards (http://www.citicards.com). Of course I can't give complete URLs without revealing personal banking information. The Citicards web server gives me an error message saying essentially that my payment has already been submitted. This indicates to me that the form may be being submitted again when "Save Page As, HTML Only" is selected.

What really concerns me about this is that, on a less smart web page, a user's payment may be submitted twice, when all the user wanted to do was save a copy of the payment receipt. This is more than just an annoyance; it could cause people's checks to bounce unexpectedly. I've seen web pages (can't remember URLs off the top of my head, though) which say something to the effect of "Press the Submit button ONLY ONCE, or your account may be debited or charged twice." Selecting "Save As" should NEVER cause a form to be re-submitted, at least not without a warning to the user.

I first made this comment in bug 144541, but it probably belongs here. I'm surprised this bug isn't higher severity, as in my opinion it makes Mozilla unsafe for doing ANYTHING involving money, as the risk of duplicate transactions and lost confirmation pages (invoices, order numbers, etc.) is too great. I'll use Mozilla for browsing, but Internet Explorer gets opened when I go to my bank's web site or an online store.
*** Bug 144541 has been marked as a duplicate of this bug. ***
Correct. Setting dataloss/critical per Comment #33, etc. I've saved pages that turned out to be important and were not actually saved. Unfortunately the two bugs this depends on need fixing; it'll take a bunch of developer work and time. Perhaps the Release Notes need an entry.

Replacing summary "saving page w/counter will resubmit cgi and save that rather than the cached version" with: "Save Page As... will attempt to resubmit cgi/reload (e.g. bank/financial/purchase/post transaction is re-attempted and may fail, or even worse, succeed; page counter will increase) and try to save that rather than the cached or the displayed version."

Removing URL (http://www.sdsu.edu/~boyns/counter.html); it's now 404. Marking Bug 144541 a duplicate. 'Save Page As...' should NEVER cause a form to be re-submitted. Bug 185368 is a symptom/dupe of this bug and the now deprecated frame.
Severity: normal → critical
Keywords: dataloss
Summary: saving page w/counter will resubmit cgi and save that rather than the cached version → Save Page As... will attempt to resubmit cgi/reload (e.g. bank/financial/purchase/post transaction is re-attempted and may fail, or even worse, succeed; page counter will increase) and try to save that rather than the cached or the displayed version.
http://bugzilla.mozilla.org/show_bug.cgi?id=160454#c6 perhaps explains how this bug can be fixed, though I wonder if printing a page that's being displayed but isn't cached and mustn't be reloaded over the 'net is going to require a change in how printing works. Perhaps this bug should be filed under printing, not file handling? (Just a guess; perhaps Chris Lyon, who I note has experience assigning, can tell us if I'm FOS.)
what does printing have to do with this? printing can just use the in-memory DOM and re-lay it out, while saving the page really does require the original page from somewhere.
I haven't verified this, but it looks like maybe encrypted pages aren't cached.
Useless-UI, as "File | Save Page As" fails to save the page in these instances. Ben Goodger is the assignee. Considering his hard work on MozillaFirebird, he might not have time to work on this bug. Should we reassign it?
Flags: blocking1.7a?
Keywords: useless-UI
To whom, exactly? Ben did check in the code in question, you know...
Clobbering useless ui. Useless and not working 100% aren't the same. For the people who are whining in this bug instead of posting patches: does File > Save Page As... > Save as type: Web Page, complete give you the same problem?
Keywords: useless-UI
Yes, same problem. The problem seems to occur before you even get the chance to select 'Web page, complete'. I'm using 1.3a, btw. To reproduce, go to www.theprinterworks.com and add something to your shopping cart. When I try to save the shopping cart page, I get the error "The link could not be saved. The web page might have been removed or had its name changed." 'View:Page Source' on the page seems to work, though. If 'Work Offline' is selected, 'Save Page As...' just silently fails. Please ignore my earlier assertion. I have no clue how the cache works.
Erm. Nothing before 1.4 should be considered remotely supported, unless you're on MacOS 9, in which case you should be using a vendor for tech support. For anyone using a non-branded mozilla you should be using 1.5 or later. If you're using something older than those you should provide a reason ("Bug XXXXX made mozilla 1.4 and later unusable").
1.6b does the same thing, except, when working offline, I now get the same error message as I do when online.
We should really hold the full, original source of any page that we're still displaying in memory. Considering our current expansion factor, it wouldn't increase our footprint too much, and it would allow us to solve this class of bugs with saving pages and similar bugs with character encoding switching.
looks like we are going to miss the window for 1.6 on this one.
Flags: blocking1.7b?
Flags: blocking1.7a?
Flags: blocking1.7a-
Flags: blocking1.7b?
Flags: blocking1.7b-
Flags: blocking1.7?
It would be great to have a testcase for this bug.
Flags: blocking1.8a?
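Since the comment above asks for a testcase, here is a minimal sketch of one, assuming PHP and a hypothetical file name (post-test.php): the page counts how many times the form has been POSTed, so a saved copy showing a higher number than the on-screen page proves the save re-submitted the form.

<?php
// post-test.php -- hypothetical testcase. Submit the form once, then use
// File > Save Page As > Web Page, HTML Only on the result page. If the
// saved file shows a higher number than the screen did, the POST was resent.
session_start();
if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    $_SESSION['n'] = isset($_SESSION['n']) ? $_SESSION['n'] + 1 : 1;
    echo 'This is POST submission number ' . $_SESSION['n'] . '.';
    exit;
}
?>
<form method="post" action="post-test.php">
  <input type="submit" value="Submit once">
</form>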
not likely for 1.7.
Flags: blocking1.7? → blocking1.7-
Flags: blocking1.8a?
Flags: blocking1.8a-
Flags: blocking1.7a-
Flags: blocking1.8a2?
from posting to the newsgroup...

Felix Miata wrote:

>> In a generic sense, this has happened to me before. Today in 1.7rc2 was different than I remember before. When trying to save the "thank you for placing your order" page at https://servicesales.sel.sony.com/..., which came up after placing an online order with a credit card, I was greeted with this alert:
>>
>> The link could not be saved. The web page might have been removed or had its name changed.
>>
>> I got the same result doing view source and trying to save that. I was able to save the page by selecting all on view source and copying that to some file.
>>
>> I have previously been unable to save thank you pages from https://secure.computergate.com/ online purchases. There, instead of the alert I got today from Sony, the page does save, but differently than displayed, losing all relevant content and instead including the following content:
>>
>> "Unauthorized Access Detected, Access Denied
>>
>> You attempted to access a secure page without logging in first, OR you attempted to access a secure page in the incorrect sequence. . . ."
>>
>> The only existing bugs that look at all related to this are bug 166786 & bug 172429, but neither seem to be on point. Does anyone know of an existing bug on point? If not, can someone explain to me what special information needs to be collected the next time in order to file a good bug?
I ran into this bug weekly in the same scenario, as I buy vitamins/music/books/tickets/etc. on the web and save the page off for later. This is another one of those bugs that, once I ran into it and saw that the bug (this one) had been entered and had gone unfixed for 3 years, sent me back to using IE. As I commented on another bug, if 1.7 is to be the "long lived" branch, it seems that major deficiencies like this in a common use case should really be fixed.
On http://www.pcimicro.com/ on the post-order-submission (confirmation) page it isn't even possible to view source, much less save the page. Trying either gets an empty shopping basket page.
This bug and the same one concerning view source make it hard to use mozilla for intranet applications. The source never matches what is displayed. Often, the source is the error page for unauthenticated users or for a lost session. Today I was simply unable to save an error page to send it to the developer. In the same way, it is impossible to save a page that summarizes a payment transaction, because those cannot be done twice (hopefully most servers don't make you pay twice...).
Workaround: If you only need to save a copy of the rendered page view, use the Print function. Print to a .ps file, or use something like PDFcreator for .pdf. Useful for saving invoices, banking statements, etc. Not useful for seeing actual page sources. Note: In no way is this intended to discourage squashing of this bug. It is a bad one and IMO really needs to be fixed. <http://sourceforge.net/projects/pdfcreator/>
Flags: blocking1.8a2? → blocking1.8a2-
Another workaround is to save the page as text. You lose all the formatting but the text is OK. I've been doing this for years. But, yes, I agree that this is a nagging problem and needs to be fixed soon.
critical, dataloss: seems a candidate for "blocking-aviary1.0PR"
Flags: blocking-aviary1.0PR?
Flags: blocking1.7b-
Flags: blocking-aviary1.0PR? → blocking-aviary1.0PR-
I believe that I have been the victim of this bug, using Firefox 0.9.1 on MacOS 10.3. I used "Save Page As..." on a .jsp page containing an order confirmation, and I ended up with a duplicate credit card charge that was a pain to rescind. I just now saw the problem on another .jsp page using Firefox 0.9.3; luckily it was not a financial transaction. I am rather shocked to see the long history of this bug. If it is so hard to fix, at least warn the users -- "Saving this page as html may re-submit the information used to generate it: OK/Cancel".
I really can't understand why it takes years to fix this problem, which for me (and many others) is critical and almost a showstopper for a real Firefox 1.0 or whatever non-beta...
Present in Firefox 1.0 release (it's caught me twice): Mozilla/5.0 (Windows; U; Win98; en-GB; rv:1.7.5) Gecko/20041110 Firefox/1.0. This kind of bug could generate really bad PR, as it costs people money and it upsets the banks. I support the just-in-case dialogue box of Comment #59, if nothing better is possible. For Firefox's sake a link is going in my signature. Time to get this one squashed.
There seems to be a completely *weird* philosophy in saving / viewing source. Under no circumstances should either function have to access the site again. The chosen philosophy causes an endless list of problems, ranging from annoying to dataloss/moneyloss critical. A security patch really should be issued for Moz and Firefox ASAP.
Can someone provide a link to a page that shows this problem in a nightly build? And no "go buy something on e-bay then try to save" or "sign up for a bank account then transfer your money". A page that we can get to without signing up for anything and can submit over and over without fear of going bankrupt.
http://www.duckbytes.com/dbstore_demo.php is a demo for an online store system. It acts like a real online store, but doesn't actually charge money (and you can enter fake data for everything, I entered "123" for the credit card number). I placed an order, attempted to save the confirmation page, and the saved page was not the displayed confirmation page but a page which said "cart is empty". This is exactly what you get if you hit the reload button, only you don't get a warning when you attempt to save the page. I'm also adding this bug to be tracked in bug 288462.
Blocks: 288462
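A minimal sketch of why the saved copy reads "cart is empty", assuming PHP sessions and hypothetical names (the actual store's code is not available): if the confirmation page consumes the session cart on first view, any silent re-request, whether a reload or a save that re-submits, finds nothing left to show.

<?php
// confirm.php -- hypothetical sketch. First view of the confirmation
// consumes the cart; a later silent re-request sees an empty cart.
session_start();
if (isset($_GET['add'])) {
    $_SESSION['cart'][] = $_GET['add'];   // e.g. confirm.php?add=widget
    echo 'Added. Now request confirm.php with no parameters.';
    exit;
}
if (!empty($_SESSION['cart'])) {
    $items = $_SESSION['cart'];
    unset($_SESSION['cart']);             // order placed; cart is consumed
    echo 'Order confirmed: ' . htmlspecialchars(implode(', ', $items));
} else {
    echo 'cart is empty';
}
?>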
Both Save As and View Source work for me on that page in Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b2) Gecko/20050326 Firefox/1.0+. Have you tried (can you try) with a nightly build (http://www.mozilla.org/developer/)? Also, what are your cache settings?
Ok, I downloaded the nightly build (I had been using Firefox 1.0.2). I was able to reproduce the problem. However, I did notice it only happens when using "Web Page, HTML only". "Web Page, complete" (and also "Text Files") correctly saves the page. I usually like to save HTML only because I don't care about all the image files (logos, etc.) that are typically on these types of confirmation pages. My disk cache is set at 50 MB and I cleared the cache before doing this test. I still would be wary about even using "Web Page, complete" (and thus I'll continue to use IE for these types of transactions) until I can be assured that Mozilla will never, ever, ever, ever repost form data without warning me and without my confirmation. To me this behavior is about as bad as an e-mail program which opens executable attachments without warning.
Gotcha, I can reproduce with HTML only.
Summary: Save Page As... will attempt to resubmit cgi/reload (e.g. bank/financial/purchase/post transaction is re-attempted and may fail, or even worse, succeed; page counter will increase) and try to save that rather than the cached or the displayed version. → Save Page As Web Page, HTML Only will attempt to resubmit cgi/reload (e.g. bank/financial/purchase/post transaction is re-attempted and may fail, or even worse succeed; page counter will increase) and try to save that rather than the cached or the displayed version.
Asking to block since it is a dataloss bug.
Flags: blocking-aviary1.1?
Flags: blocking1.8a2-
Flags: blocking1.8a1-
Flags: blocking1.7-
Flags: blocking-aviary1.1?
Flags: blocking-aviary1.1-
Flags: blocking-aviary1.0PR-
*** Bug 293470 has been marked as a duplicate of this bug. ***
*** Bug 235142 has been marked as a duplicate of this bug. ***
(In reply to comment #68)
> Asking to block since it is a dataloss bug.

I'm very surprised this was minused, as it may also be a security risk.
*** Bug 300037 has been marked as a duplicate of this bug. ***
This happened to me again today (Firefox 1.0.5), and this time it has actually cost me money (unlike any security bug thus far). Maybe the talk of CGI counters in the beginning of this bug distracts people into thinking this is unimportant. At the very least it should refuse to do anything if it cannot save the cached page. Submitting a URL is not a free operation! I now have the regretful task of calling the supplier's premium-rate number from another continent to undo the duplicate charge (if that is at all possible). This bug should have the 'ecommerce' keyword, but I am just a lowly user and cannot add that.
Flags: blocking-aviary2.0?
Flags: blocking-aviary1.1?
Flags: blocking-aviary1.1-
Flags: blocking-aviary1.0.7?
Flags: blocking-aviary1.0.6?
Not a regression. Minusing for aviary1.0.6/1.0.7.
Flags: blocking-aviary1.0.7?
Flags: blocking-aviary1.0.7-
Flags: blocking-aviary1.0.6?
Flags: blocking-aviary1.0.6-
Flags: blocking-aviary1.1? → blocking-aviary1.1-
For a clear example, go to http://www.cartex.co.uk and add some items to your basket, then save the basket page as Web Page, HTML only. Open the saved page and - voila, another item magically appears. In the past I've actually had a duplicate order with this company because I saved the confirmation page - and it resubmitted my order. Judging by how long this bug's been around, the underlying problem will take some time to fix properly. In the meantime, as others have said, the browser should definitely warn when you try to save this kind of page. Would it be helpful if I filed a separate bug for the warning dialog?
Can this be fixed please?
Darin, biesi, I think this is something consumers of webbrowserpersist have to handle (try with the "only from cache" flag, and if that fails request permission from the user to retry for real)... webbrowserpersist itself has no way to know what you mean when you call saveURI, and imo should not be posing any prompts. Thoughts?
Shouldn't we instead have a way to guarantee that we have the source available for any Web page that we're displaying? Why should the fact that we don't ever have to be "handled"?
We could just keep two copies of the source, I suppose (in cache and not in cache). That's the only way I can see to always have the source available given that the cache can be disabled.
I think that this should be done:
- bug 262350 should be fixed
- store the cache token for the currently-loaded page on the docshell or contentviewer or something
- make "save page as" use that function
- then do comment 77
Depends on: 262350
I don't think we should rely on cache tokens. For example, the cache will refuse data that exceeds a certain byte length, so the cache token will be invalid. There are other cases too where we might not have a usable cache token.
Assignee: bugs → file-handling
Status: ASSIGNED → NEW
QA Contact: chrispetersen → ian
Whiteboard: dupeme
As a user, the expected result of saving or printing a page currently being displayed is to do exactly that -- not re-fetch the page first. As others have already pointed out, a multitude of problems can be caused by the way Mozilla currently handles this. Call it a technical issue, call it an HCI issue... regardless, it needs to be fixed ASAP.
Another user here. I heard about this from Slashdot. Anyone care to give a user-friendly status update, when we might see a fix etc? Thanks.
Please, NO MORE "USER COMMENTS" like comment 82 or comment 83. Don't respond to this comment either. These make it significantly harder for developers to discuss and fix the bug since it's harder to find the technical discussion amid all the noise.
Then how about giving people a status update every so often so that other people don't have to post these kinds of messages? I'm fully aware that "me too" comments shouldn't be posted, but how else can we find out what's going on? I mean, it has been 4 years since this bug was filed. Bugzilla isn't just for the developers. Perhaps there needs to be a new feature added to Bugzilla to separate technical comments from other comments?
If there were technical progress being made on this bug, you would likely be reading about it here and seeing patches attached. Community forums like Mozillazine are a more appropriate place for comments not directly related to fixing a bug. It's probably not the best use of developer time for them to go around to every bug that hasn't been worked on to say "nope, haven't gotten to this yet" every so often.
*** Bug 321037 has been marked as a duplicate of this bug. ***
I really doubt this is branch-friendly, but moving it over to the core nominations for evaluation.
Flags: blocking-aviary2? → blocking1.8.1?
Annoying, and if complete and text-only work it sounds like it won't be too invasive: 1.8.1+.
Flags: blocking1.8.1? → blocking1.8.1+
why do you say that? web page, complete and text only just serialize the DOM, while HTML only should save exactly what the server sent. so they are rather different...
How about a status update? This bug was first reported over FIVE years ago. My god, this is a CRITICAL issue. I thought this was only a pain in the ass for web developers, but seeing that this affects normal users, and has implications for people's financial information?! What does it take to light a fire under someone's ass and get some progress on this bug? How about some news?
Note Bug 136633. Also note for historical purposes, Bug 40867, which shows that this problem was marked as a Blocker, and fixed back in 2002.
Jonas - can you take a look at this?
Assignee: file-handling → bugmail
The only way I can see fixing this is to serialize the DOM like "Web page complete" does. Or at least use that as a fallback when the original source is not available in cache. I'm not excited about the idea of pinning the source in cache using cache tokens. Even disregarding the fact that it won't work for source files that are too large, it seems like a waste of memory when for 99% of the pages you go to you're not interested in getting to the source.
What's worse than re-getting the source is resubmitting form data. Maybe fix it in two steps (1: don't re-submit, 2: don't re-get) if it cannot be done in one step (have a smart cache). Or at least pop up a warning as "step 0" before harmful operations are carried out.
(In reply to comment #94)
> it seems like a waste of memory when for 99% of the pages you go to you're not interested in getting to the source.

You mean a waste of disk space, right?

(In reply to comment #95)
> What's worse than re-getting the source is resubmitting form-data.

But that's exactly the same thing, is it not?
Well, if we save to disk it's a waste of cycles, which are hard to come by during page load.
? We already do that. It's called "disk cache". Most pages do not end up in the memory cache.
Given recent comments it doesn't look like there is an easy low-risk fix. If that changes please re-nom.
Flags: blocking1.8.1+ → blocking1.8.1-
So is pinning in cache and hoping it'll end up in disk really what we want to do? Darin?
I think that we need to not rely on the disk cache for proper functionality here. If the network cache cannot supply the document, then we should just serialize the DOM like IE does.
*** Bug 357084 has been marked as a duplicate of this bug. ***
Summary: Save Page As Web Page, HTML Only will attempt to resubmit cgi/reload (e.g. bank/financial/purchase/post transaction is re-attempted and may fail, or even worse succeed; page counter will increase) and try to save that rather than the cached or the displayed version. → Save Page As Web Page, HTML Only will attempt to resubmit cgi to save instead of using cached or displayed content
http://127.0.0.1:1051/bug.cgi WHATEVER THIS IS, IT'S RUINING MY LIFE!! PLEASE HELP!! I ALSO GET A LITTLE SQUARE BOX WITH AN X IN IT. NO CLUE AS TO WHAT ANY OF THIS MEANS...DO YOU???? THANK YOU, CAROL
It is not JUST a case of using the cache rather than refreshing. Using Firefox 2.0.0.3, I keep finding that you do a 'Save As' on the simplest web pages (even text-only ones) and the software 'appears' to do the save and reports no error, yet you find that none of the pages have in fact been saved. So there is also a problem with using a temp folder, in that the existence/validity of that temp folder needs to be confirmed, or AT LEAST an error message should be given to the user that the page was NOT saved. I've repeated this test over 200 times, and Firefox has not saved ANY of the pages, as html-only or even text-only, yet IE6 has consistently saved those pages.

Thinking outside the square, not being able to save a web page as viewed into a single file (like IE does with '.mht' files) remains the biggest single obstacle to wide use of Firefox 2. Instead of going down a proprietary MHT route, it would be great to be able to save any html page as an ODT file, or if that is too hard, a PDF, as either of those are more portable/standardised than MHT. And you wouldn't want the interactive code stored, but simply the current screen paint of such a request from a server, so that you could keep a simple permanent record of that database retrieval (airline booking or whatever) as it was presented to you at that time, rather than the code, which would never work the same again, as the server's data would be different if re-run. In other words, you almost want what you get with a print-screen, except that of course it would be best to have the text as text, rather than pixels... but the goal is a perfect archive of what a user saw at that time... THAT would beat the pants off what IE6 can do!!!
I agree that Firefox needs the ability to save a web page as shown on the screen, including all text typed in input fields, selection of radio buttons, etc. This is often vital for use as a receipt of online transactions, and currently the only way to get this is to print the page (note that solutions that allow you to print to a PDF or PostScript file are common and need not require specific support from Firefox -- they're generally done as a virtual printer). Problems with printing a page to use as a receipt include inability to copy and paste the text later, inability to see all the text contained in TEXTAREAs with scrollbars, issues with content getting cut off on the right side of the page, etc.

Therefore, my proposed solution for this (bug 293834) is to save as HTML but to have the form inputs filled in as currently shown on the screen. I don't see any pressing need for a single-file archive like IE's .MHT -- an HTML file plus an accompanying directory of images and other support files generally works well.

However, I don't think further discussion of this should be held on this bug, which is focused on the problem of page requests getting resubmitted when saving. (I do think it's good for those working on this bug to know that work on bug 293834 is underway, though, since they do relate.)
After six years, still present in Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9. Very irritating, and it can compromise security in some situations.
I just wrote a simple PHP script to test the scenario described in the initial comment of this issue, and the test showed that this bug doesn't exist (at least, in Firefox 3). Both saving and viewing source (a different bug) work as expected. I think this issue should be closed. If this bug still shows up, conditions must be different from what's described in the initial comment and should be described in more detail, and a new issue should be opened, in my opinion. If somebody wants to test this, here's the code (to save you time):

<?php
// load counter data
$info = simplexml_load_file("counter.xml");
$counter = $info->counter + 1;
echo "you are visitor #$counter";

// update the counter
$info->counter = $counter;

// save to file
file_put_contents("counter.xml", $info->asXML());
?>

counter.xml:

<?xml version="1.0" ?>
<root>
  <counter>11</counter>
</root>
If I open a page with 300 images, some of which do not exist, but I let the page load fully, then turn off my internet connection and hit 'Save Page As', my browser freezes completely, even the OS becomes nearly unresponsive, and many of the to-be-saved images and the html file are never saved anywhere. (On XP & FF 3.0.3.)
Assignee: jonas → nobody
QA Contact: ian → file-handling
When I save pages or images, which are visible on the screen, Firefox 3.0.4 re-downloads them. Especially bad when I try to save confirmation pages after purchasing something with a credit card. I have ceased using Firefox for all credit card transactions due to this bug, instead using any other browser all of which have no trouble saving the currently viewed page.
There are some reasons that a resource may not be retrieved from cache, or stored in the cache at all. This includes a range of cache-control directives that an HTTP 1.1 server may send. A client needs to obey these to be compliant with the HTTP spec. (There may be security implications to changing any of Firefox's behaviour in regard to this, which we should consider.)

It is acceptable for the client to use a cached copy in situations such as pressing the back button or saving/printing the current page, even if the cache-control headers originally sent with that resource indicate that it is no longer fresh or should not be retrieved from the cache for any reason. However, cache directives such as "must-revalidate" and "no-store", if I understand correctly, may need to be honoured even in these trivial situations. For instance, a "no-store" directive sent by the server will cause the original version of the page to never enter the cache. Viewing the source or saving such a resource would not be possible without resubmitting the request, as the original response was not stored. There may also be other reasons that something cannot be sent to the cache.

The original post in this bug refers to a "web counter" which increments as the result of a GET request. Despite other comments above, the correct behaviour is not a security risk brought about by the browser, because a browser is supposed to be able to re-submit a non-POST request at any time, without notifying the user, and without adverse effects on a web application (in this case, the counter). The fact that this web counter increments as a result of such a re-request, even a duplicate request coming from the same client, is a result of the way the counter is written and is not a bug in Firefox or the HTTP spec; it is correct behaviour that such a counter be incremented again if:

- The client needs to request the resource again, and
- The original response was uncacheable, and
- It's just a GET request, and
- The counter increments the count value on repeat requests.

The example in comment #109 does not take into account that the server may be sending different cache-control directives and may even fall into a different zone (Firefox treats localhost differently to external addresses with regard to caching). The design of such a web counter could be improved such that it tries to detect repeat requests, though as long as it updates without POST-ing a form, it will be susceptible to this kind of thing.

On the other hand, if there are ways in which we could improve Firefox's re-use of requests without disobeying the HTTP 1.1 spec or weakening the security of any online application that depends on current caching behaviour, then such an improvement would be good. Particularly, if the browser were re-submitting POST requests under any circumstances without warning the user, then this is a problem that would need to be fixed.
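A minimal demo of the "no-store" case described above, assuming PHP and a hypothetical file name (no-store.php): the response never enters the cache, so saving or viewing source has nothing to reuse and must issue a fresh request; the changing timestamp makes the re-fetch visible.

<?php
// no-store.php -- hypothetical demo. "no-store" keeps the response out of
// the cache entirely, so a save cannot reuse it and must re-request.
header('Content-Type: text/html; charset=utf-8');
header('Cache-Control: no-store');
echo 'Generated at ' . date('H:i:s') . ' -- save this page and compare times.';
?>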
Thomas Rutter: "There are some reasons that a resource may not be retrieved from cache, or stored in the cache at all. This includes a range of cache control directives that an HTTP 1.1 server may send. A client needs to obey these to be compliant with the HTTP spec. (there may be security implications to changing any of Firefox's behaviour in regard to this, which we should consider)"

Correct. But the cache should have nothing to do with printing the currently viewed page/image, saving the currently viewed page/image, viewing the source of the currently viewed page, or any other similar functions. The cache is relevant when a person wishes to go to a page a second time -- not the first time. The page being viewed in the here and now is the page being viewed in the here and now, not some to-be-downloaded-again page; that would be a *different* page, not the one being viewed when the user hits Save As, Print, View Source, etc.

Similarly, when the Back button is hit, the user expects to go "back" to the page as it was when they were first viewing it, not having it reloaded from the network, which would actually be going forward to a *new* page. This is all the more pronounced today, when most web pages are dynamically generated content and change when reloaded. Why should a Back (or Forward) button not go to the page the person was on previously? Is this not how everyone understands the functioning of Back/Forward? It is how I have always understood it.

If I do any of the functions mentioned when I am using other browsers, such as Opera or ELinks (the two I use the most apart from Firefox), they will never reload the pages from the network -- the pages are already there being viewed. This seems pretty clear and straightforward to me and numerous others in the comments on these many interrelated bug reports.
(In reply to comment #113)
> Similarly, when the Back button is hit, the user expects to go "back" to the page as it was when they were first viewing it, not having it reloaded from the network, which would actually be going forward to a *new* page. This is all the more pronounced today, when most web pages are dynamically generated content and change when reloaded. Why should a Back (or Forward) button not go to the page the person was on previously? Is this not how everyone understands the functioning of Back/Forward? It is how I have always understood it.

In fact, that is also how the W3C or the IETF understand it--one of the W3C or IETF specifications (probably the HTTP RFC) says that the Back button is expected to show what was shown before, and _not_ re-retrieve the page to show its _current_ content.
Well, "view source" is mainly used by web developers. Usually, they want to see the downloaded source code, not the current serialized DOM - especially because the DOM may have been parsed badly from improperly formatted source. However, for printing and saving, it makes much more sense to save the modified DOM. For example, many times people "save" pages to show them to people, and when those pages utilize JavaScript - as is becoming more common - that doesn't really help anyone. Printing is similar, in that you would want the current DOM printed - just with the correct CSS applied to it. Even when a page is not cached, the current DOM should be available of course. So this seems like a viable solution. Still, there's a separate bug for View Source, and I think it's a completely separate issue. If I want to see the current source, I would use "View Selection Source" or Firebug, DOMi, etc. -[Unknown]
Firebug doesn't display the actual source, it displays the "DOM source". There can be a significant difference between the two, as I have discovered more than a few times, to much pain and teeth-gnashing.

I noticed that some of the pre-1.0 bugs were mentioned recently. It's a sad state of affairs: this *was* fixed in the days before 1.0, but was broken again shortly after 1.0 was released. It's incredibly frustrating and painful that 6 years later this bug still exists, and that people have refused to pay attention to RFCs, security warnings and the like, and have supported the broken functionality we continue to see today. This is a feature that worked in the earliest days of browser development, and was (and is) a continued critical piece of functionality in some circles. What will it take to get it fixed?
(In reply to comment #116)
> What will it take to get it fixed?

Making this widely known. If a popular publication shows the world that somebody using Firefox has paid twice for a product or something along these lines, Mozilla developers will jump on this. Their ego won't let them allow their product to be worse than all other available browsers in an essential area. They criticize MS for having IE violate some CSS standards, but they let Firefox have bugs that can cost people money.
Peter,

That is my point. What Firebug shows is what you want to print, and what you want to save, but not what you want to see as View Source. In other words, the same solution cannot be used on all three of these together.

(In reply to comment #117)
> Making this widely known. If a popular publication shows the world that somebody using Firefox has paid twice for a product or something along these lines, Mozilla developers will jump on this. [...]

I am sorry. I am a developer and I've made a few ecommerce-type sites myself. If any site ever charges someone twice due to them posting twice, that site is violating many public and well-established standards, and is not designed properly. Even if this bug were fixed and no browser had this flaw, it would be a huge and CRITICAL flaw in the site for a long list of reasons. Should such a situation exist, I would firstly and rightly blame the site author, not Firefox. Even so, Firefox should be fixed, but only because this affects other forms.

-[Unknown]
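For reference, a minimal sketch of the well-established pattern alluded to above, Post/Redirect/Get, assuming PHP sessions and hypothetical names (order.php is not any real store's code): the charge happens at most once during the POST, and the confirmation is reached by a plain GET that can be reloaded or saved harmlessly.

<?php
// order.php -- hypothetical Post/Redirect/Get sketch. The POST never
// produces a savable page itself; it redirects to an idempotent GET.
session_start();
if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    if (!isset($_SESSION['order_id'])) {
        $_SESSION['order_id'] = uniqid('order-'); // charge exactly once here
    }
    header('Location: order.php?confirm=1', true, 303); // 303 See Other
    exit;
}
if (isset($_GET['confirm']) && isset($_SESSION['order_id'])) {
    // Plain GET: reloading or saving this confirmation is harmless.
    echo 'Thank you! Your order number is '
       . htmlspecialchars($_SESSION['order_id']);
}
?>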
OK. This bug as originally filed has been fixed forever, ever since the UI actually started passing LOAD_FROM_CACHE into the core code when saving. Comment 109 confirms this.

If you are having an issue with saving things not hitting the cache, please take the following steps:

1) Make sure that your issue is reproducible with a current trunk nightly from 2009-02-20 or later. That means that you have the fix for bug 84106 in your build, as well as the new image cache that no longer stomps on the necko memory cache.
2) File a bug with a description of your problem, including step-by-step directions to reproduce. Don't just say "save the image", for example; tell me which exact menu items you're using in which exact menu, or which keyboard shortcuts you're using.
3) cc me on the bug you file; it seems that some people have been marking bugs about save issues with the cache as duplicates of unrelated bugs, so if you don't do this there's no guarantee that anyone competent will actually see your bug.

This code has been more or less abandoned for the last 6 years or so, but that doesn't mean it can't get improved. It does mean that there are several bugs on file which are so confused (due to covering a host of unrelated problems, some of which have been fixed for years) as to be completely unusable. Let's do a good job of tracking the problems that remain, with one bug per problem, so that we can get them fixed, get testcases added for them, and make sure the bugs don't reappear in the future.

Thanks, all.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → WORKSFORME
(In reply to comment #114)

I wrote:
> In fact, that is also how the W3C or the IETF understand it--one of the W3C or IETF specifications (probably the HTTP RFC) says that the Back button is expected to show what was shown before, and _not_ re-retrieve the page to show its _current_ content.

Someone found what I was remembering partially: RFC 2068 (HTTP 1.1), section 13.13 (History Lists).
(In reply to Boris Zbarsky [:bz] from comment #119)
> If you are having an issue with saving things not hitting the cache, please take the following steps: [...]

Is there a list of these follow-up bugs somewhere?
I'm not aware of anyone filing any.
Product: Core → Core Graveyard