Last Comment Bug 293834 - Save Page As should save form inputs' state
: Save Page As should save form inputs' state
Status: RESOLVED FIXED
:
Product: Core Graveyard
Classification: Graveyard
Component: File Handling (show other bugs)
: Trunk
: All All
: -- normal with 2 votes (vote)
: mozilla1.9.1a2
Assigned To: Ben Karel [eschew]
: Hixie (not reading bugmail)
Mentors:
Depends on: 448445 449692
Blocks: 115107
  Show dependency treegraph
 
Reported: 2005-05-11 16:23 PDT by Dan Harkless
Modified: 2016-06-22 12:16 PDT (History)
16 users (show)
jonas: wanted‑next+
bzbarsky: in‑testsuite+
See Also:
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
approach 1: update attrs before serialization (5.91 KB, patch)
2007-08-20 17:08 PDT, Ben Karel [eschew]
no flags Details | Diff | Review
As much as possible in nsWebBrowserPersist (6.29 KB, patch)
2007-08-22 03:08 PDT, Ben Karel [eschew]
no flags Details | Diff | Review
Like so? (15.77 KB, patch)
2007-08-22 22:59 PDT, Ben Karel [eschew]
bzbarsky: review-
Details | Diff | Review
update to review comments 1 (14.08 KB, patch)
2007-08-24 08:34 PDT, Ben Karel [eschew]
bzbarsky: review+
Details | Diff | Review
Goes back to SetAttribute instead of SetDefaultValue (14.13 KB, patch)
2007-08-25 01:03 PDT, Ben Karel [eschew]
eschew: review+
jonas: superreview+
Details | Diff | Review
Same as last patch, updated to mozilla-central (13.29 KB, patch)
2008-07-26 08:53 PDT, Ben Karel [eschew]
eschew: review+
eschew: superreview+
Details | Diff | Review
Semi-manual testcase for form state (435 bytes, text/html)
2008-07-29 09:10 PDT, Ben Karel [eschew]
no flags Details
Mochitest version of above testcase (8.88 KB, patch)
2008-08-11 23:12 PDT, Ben Karel [eschew]
no flags Details | Diff | Review
Updated mochitest patch (8.88 KB, patch)
2008-08-11 23:19 PDT, Ben Karel [eschew]
bzbarsky: review+
Details | Diff | Review

Description Dan Harkless 2005-05-11 16:23:00 PDT
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.7) Gecko/20050414
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.7) Gecko/20050414

When saving a page as an HTML file, it should be the page as it currently
appears in the browser.  Specifically, any text that's been entered in form
fields should be preserved, radio button and checkbox state be retained, etc.

One place this would be very desirable is in saving pages that you're filling
out as part of an online transaction.  Right now you only have the option to
print the pages to have a record of your inputs, but that approach has many
limitations, including not being able to copy & paste data from the saved page
later, having text that appears in TEXTAREAs with scrollbars be truncated, etc.

This came up for me today when I was trying to enter a bunch of text in an
online evaluation form and the site was giving me a generic insert error.  When
I reported the problem to the webmaster, he requested I send him a screenshot of
the data I was trying to submit, but I had to reply that this would be
insufficient because the form in question was many screenfulls long and I had
typed much text in TEXTAREAs that would be truncated in screenshots.  Mozilla
not giving me any way to save the form with my inputs, I was forced to save the
basic HTML and then manually edit it to add SELECTED attributes to OPTION tags,
CHECKED to checkboxes and radio buttons, VALUE attributes to TEXT inputs, and
paste typed text into the TEXTAREAs.

This feature would also be very useful to proactively protect oneself from data
loss on those pages that give you an error (e.g. because you've been typing so
long that your session expired) and then when you hit Back all the form fields
have been cleared to their default state.  If you had typed a large amount of
text that you wanted to make sure you didn't lose, you could simply save the
page, and then if you got an error you could easily copy and paste your text
into a new instance of the page when retrying.

Reproducible: Always

Steps to Reproduce:
1. Change some inputs on a form page.
2. Do Save Page As.
3. Open the file.

Actual Results:  
The file does not include any changes you made to the form inputs.

Expected Results:  
It should preserve your entered text and other form input changes.
Comment 1 Boris Zbarsky [:bz] (Out June 25-July 6) 2005-06-16 21:19:41 PDT
jst, biesi, darin, thoughts?  This seems like a reasonable idea to me...
Comment 2 Christian :Biesinger (don't email me, ping me on IRC) 2005-06-17 05:39:25 PDT
hm, yeah, I think this is a good idea.
Comment 3 Gervase Markham [:gerv] 2005-09-27 02:07:09 PDT
This is an automated message, with ID "auto-resolve01".

This bug has had no comments for a long time. Statistically, we have found that
bug reports that have not been confirmed by a second user after three months are
highly unlikely to be the source of a fix to the code.

While your input is very important to us, our resources are limited and so we
are asking for your help in focussing our efforts. If you can still reproduce
this problem in the latest version of the product (see below for how to obtain a
copy) or, for feature requests, if it's not present in the latest version and
you still believe we should implement it, please visit the URL of this bug
(given at the top of this mail) and add a comment to that effect, giving more
reproduction information if you have it.

If it is not a problem any longer, you need take no action. If this bug is not
changed in any way in the next two weeks, it will be automatically resolved.
Thank you for your help in this matter.

The latest beta releases can be obtained from:
Firefox:     http://www.mozilla.org/projects/firefox/
Thunderbird: http://www.mozilla.org/products/thunderbird/releases/1.5beta1.html
Seamonkey:   http://www.mozilla.org/projects/seamonkey/
Comment 4 Boris Zbarsky [:bz] (Out June 25-July 6) 2005-09-27 14:17:58 PDT
This needs to be done over in webbrowserpersist...
Comment 5 Boris Zbarsky [:bz] (Out June 25-July 6) 2007-02-26 21:55:49 PST
We could perhaps do this by comparing value to defaultvalue for textarea/input, and so forth for the other controls.

Ryan, do you want to take a stab at this?
Comment 6 Ryan Jones 2007-02-27 10:21:12 PST
(In reply to comment #5)
> We could perhaps do this by comparing value to defaultvalue for textarea/input,
> and so forth for the other controls.
> 
> Ryan, do you want to take a stab at this?
> 

Sure, I'll take a shot!

I've got a few questions though, should we deal with select's for example  (as in make sure the item that was selected will be selected when the saved file is loaded)? Also, should we save the file path for input type="file" for example?
Comment 7 Boris Zbarsky [:bz] (Out June 25-July 6) 2007-02-27 10:27:01 PST
Dealing with <select>s could be nice, but might be a bit of trouble; could do that in a followup if desired.

I don't think we should save file paths.  It wouldn't do any good anyway, since there's no way to prefill a file input by parsing the HTML, so...
Comment 8 Ryan Jones 2007-02-27 10:30:19 PST
OK, I'll leave the handling of selects too a follow up bug. I'll try to look at this sometime this week or weekend.
Comment 9 Ben Karel [eschew] 2007-08-20 17:08:54 PDT
Created attachment 277463 [details] [diff] [review]
approach 1: update attrs before serialization

Taking this bug with permission from Ryan.

The content serialization code works by recursively serializing nodes with stored attributes into a string of tags with key="value" pairs. The problem is that attributes are used to store default values, while DOM state, such as user input to form elements, is stored in the elements themselves. Thus, this patch puts some code in right before attributes are serialized and updates the attributes to reflect the current state of the given element, including any user input. The existing framework takes care of the rest.

Attached patch handles textarea, textbox, checkbox, radio, and option/select.
Comment 10 Boris Zbarsky [:bz] (Out June 25-July 6) 2007-08-20 19:44:36 PDT
That doesn't look like the right approach.  It changes the DOM you're saving, which is a problem, imo.  In particular, saving a website could break its functioning.

What you want to do instead is to munge things on the cloned "fixup" nodes that web browser persist creates.
Comment 11 Ben Karel [eschew] 2007-08-22 03:08:41 PDT
Created attachment 277698 [details] [diff] [review]
As much as possible in nsWebBrowserPersist

Okay, so doing stuff in webbrowserpersist works for checkboxes, radio buttons, textboxes, and selects... but not textareas.

The problem is that there's an assumption in the design of the current system, that attributes are the only thing that could need fixing. But textareas violate that rule: textareas may have child nodes in the original DOM that shouldn't be serialized as-is! Combined with the desire to leave the original DOM intact, that leaves us between a rock and a hard place.

Possible solutions:

1) Go ahead and modify the DOM when serializing textareas. The only thing that would be modified would be either adding or changing the content of a single #text node under each textarea. Advantages: simple to write and understand. Disadvantages: modifies uncloned DOM.

2) Add a special case to nsDocumentEncoder::SerializeToStringRecursive's handling of child node serialization. Advantages: simple to write and understand, does not modify uncloned DOM. Disadvantages: violates the document-language-agnosticism of nsDocumentEncoder, adds a QI before looping over child nodes (perf issue?).

3) Play fast and loose with the content serializer interface, add a method like EncoderShouldSkipChildSerialization (or some such). This would, at the end of the day, be rather like option number 2. It would be more flexible, but also harder to understand, code-wise, because it would split the work done in one place in option 2 into at least three separate places in at least two files. More, if you could the unused stubs in the non-HTML content serializer implementations. Advantages: arguably less hackish than option 2. Disadvantages: necessitates adding stubs to at least four content serializer implementations. Added complexity may not outweigh less hackish nature.

Other approaches I'm missing?

Since I already attached a patch that does option 1, this patch tries option 2 on for size. I actually started with approach 3 at the very beginning, but discarded it when I found the joys of nsContentUtils::SetNodeTextContent in patch #1.

Also, since it's not just URI-related attributes that are being fixed up now, perhaps the name of nsWebBrowserPersist::CloneNodeWithFixedUpURIAttributes should have the "URI" removed?
Comment 12 Boris Zbarsky [:bz] (Out June 25-July 6) 2007-08-22 08:11:32 PDT
Peter, Jonas, what do you think is a good approach for the textarea issue?

My instinct is that we should expand nsIDocumentEncoderNodeFixup to allow fixing up the kids too (that is, move the fixup up a level from SerializeNodeStart and allow fixupNode() to return a boolean or something indicating whether the clone's kids should be used instead of the original node's kids.
Comment 13 Jonas Sicking (:sicking) PTO Until July 5th 2007-08-22 16:51:15 PDT
Yeah, that sounds like a good idea.
Comment 14 Ben Karel [eschew] 2007-08-22 22:59:48 PDT
Created attachment 277856 [details] [diff] [review]
Like so?

* Extends the fixupNode method with an [out] boolean param
* Removes "URI" from "CloneNodeWithFixedUpURIAttributes", since we're now fixing up more than just URI attributes. Technically, with textareas, we're fixing up more than just attributes, too, but I think it's clearer with just URI removed.
* Moves a call to FixupNode out of SerializeNodeStart, along with a new default parameter so that all the other callers of SerializeNodeStart (who don't directly iterate over child nodes) can continue to assume that SerializeNodeStart will do fixup for them.
Comment 15 Boris Zbarsky [:bz] (Out June 25-July 6) 2007-08-23 14:49:33 PDT
Comment on attachment 277856 [details] [diff] [review]
Like so?

>Index: content/base/src/nsDocumentEncoder.cpp
> nsDocumentEncoder::SerializeToStringRecursive
>+  nsCOMPtr<nsIDOMNode> node, maybeFixedNode;

|node| can just be an nsIDOMNOde* here, no?

>Index: content/base/public/nsIDocumentEncoder.idl
>+   * apply recursive serialization to the children of the cloned node, instead

s/cloned/fixed up/

>Index: embedding/components/webbrowserpersist/src/nsWebBrowserPersist.cpp
>+nsWebBrowserPersist::CloneNodeWithFixedUpAttributes(
>+                    rv = nodeAsInput->GetValue(valueStr);
>+                    if (NS_SUCCEEDED(rv) && !valueStr.IsEmpty())

So... if NS_FAILED(rv) should we really go on?  Not that that will happen here.  On the other hand, why not do this if valueStr.IsEmpty()?

>+                      outElt->SetAttribute(NS_LITERAL_STRING("value"), valueStr);

Would it be clearer to SetDefaultValue(valueStr)?

>+                    if (checked)
>+                      outElt->SetAttribute(checkedAttr, NS_LITERAL_STRING("true"));
>+                    else
>+                      outElt->RemoveAttribute(checkedAttr);

Again, would |outElt->SetDefaultChecked(checked)| make more sense?  Especially since checked="true" is not actually valid HTML...

>+                    break;

Toss in a "default: break" to prevent build warnings?

>+    nsCOMPtr<nsIDOMHTMLTextAreaElement> nodeAsTextArea = do_QueryInterface(aNodeIn);
....
>+            nsCOMPtr<nsIDOMNode> out = do_QueryInterface(*aNodeOut);

You just need that for convenience, right?  Why not:

  nsIDOMNode* out = *aNodeOut;

then?

>+            // Remove the textarea's child text nodes (should be at most one)

Shouldn't be any, in fact...

>+            rv = doc->CreateTextNode(valueStr, getter_AddRefs(childText));

etc.  How about replacing all that with a QI to nsIDOM3Node followed by a call to SetTextContent(valueStr) on he result.  That will handle removing existing kids (none), creating the text node, setting it to the right string, appending it, etc, etc.

>+    nsCOMPtr<nsIDOMHTMLOptionElement> nodeAsOption = do_QueryInterface(aNodeIn);
....
>+            if (selected)
>+              outElt->SetAttribute(selectedStr, NS_LITERAL_STRING("true"));
>+            else
>+              outElt->RemoveAttribute(selectedStr);

  SetDefaultSelected(selected);

Rest looks good.
Comment 16 Ben Karel [eschew] 2007-08-24 08:34:43 PDT
Created attachment 278065 [details] [diff] [review]
update to review comments 1

Updated to review comments.

I believe it is actually possible for the fixed up text area element we get to have child nodes, if mPersistFlags has PERSIST_FLAGS_FIXUP_ORIGINAL_DOM. It usually won't, but look at
http://mxr.mozilla.org/seamonkey/source/embedding/components/webbrowserpersist/src/nsWebBrowserPersist.cpp#2932

>+           // Avoid superfluous value="" serialization
>+           if (valueStr.IsEmpty())
>+             outElt->RemoveAttribute(NS_LITERAL_STRING("value"));
>+           else
>+             outElt->SetDefaultValue(valueStr);

This bit isn't strictly necessary, it just makes the serialization a little bit cleaner. You're right, though, about the last patch's logic being a little dodgy. In the case of an un-cloned textbox with a default value which the user then erased, the last patch would have serialized the default value even though the user had explicitly changed it.
Comment 17 Boris Zbarsky [:bz] (Out June 25-July 6) 2007-08-24 15:32:25 PDT
Comment on attachment 278065 [details] [diff] [review]
update to review comments 1

>+                    if (valueStr.IsEmpty())
>+                      outElt->RemoveAttribute(NS_LITERAL_STRING("value"));
>+                    else
>+                      outElt->SetDefaultValue(valueStr);


I guess SetDefaultValue() doesn't remove the attr, yeah.  In that case, RemoveAttribute/SetAttribute is clearer, I think.

With that changed back, r=bzbarsky
Comment 18 Ben Karel [eschew] 2007-08-25 01:03:54 PDT
Created attachment 278185 [details] [diff] [review]
Goes back to SetAttribute instead of SetDefaultValue

Carrying over r+, requesting sr.
Comment 19 Ben Karel [eschew] 2007-08-28 15:36:30 PDT
Just a note, I was looking at bug 115107 and realized that script elements follow the textarea pattern. That is, we want serialize (in nsDocumentEncoder) a #text child for script elements generated from information stored in the script element, in nsWebBrowserPersist.
Comment 20 Jonas Sicking (:sicking) PTO Until July 5th 2007-08-28 16:27:04 PDT
You mean <style> elements, right?
Comment 21 Ben Karel [eschew] 2007-08-28 17:34:45 PDT
Err, yeah, those things. ;-)
Comment 22 Boris Zbarsky [:bz] (Out June 25-July 6) 2007-08-28 18:32:21 PDT
Or <link>.  Yes, that could be done.  We'd probably want to fix some CSS serializer bugs first...
Comment 23 Jonas Sicking (:sicking) PTO Until July 5th 2008-07-24 01:32:10 PDT
Comment on attachment 278185 [details] [diff] [review]
Goes back to SetAttribute instead of SetDefaultValue

Looks great, sorry for the extremely slow review on this one :(
Comment 24 Ben Karel [eschew] 2008-07-26 08:53:14 PDT
Created attachment 331425 [details] [diff] [review]
Same as last patch, updated to mozilla-central

This patch is the same content diff as patch 278185, but done with hg in a mozilla-central tree instead of Firefox 3; thus, a few of the line numbers are slightly different.

I don't have commit access, so I'm carrying over r+sr and marking this as checkin-needed.
Comment 25 Boris Zbarsky [:bz] (Out June 25-July 6) 2008-07-28 23:45:52 PDT
Checked in.  It'd be good to have some tests here too.....
Comment 26 Ben Karel [eschew] 2008-07-29 09:10:40 PDT
Created attachment 331631 [details]
Semi-manual testcase for form state

I don't know how to go about doing automated testing of webbrowserpersist, but here's the testcase I used when writing the patch, cleaned up and possibly suitable for adding to Litmus.

Steps to test:
1) Open testcase in browser
2) Click "Fill Form" button
3) File > Save Page As
4) Open saved page and verify that the form is still filled.
Comment 27 Boris Zbarsky [:bz] (Out June 25-July 6) 2008-07-30 01:05:52 PDT
Could write a test that saves a webpage to a file and make sure it saves the right thing...
Comment 28 Ben Karel [eschew] 2008-08-11 23:12:54 PDT
Created attachment 333367 [details] [diff] [review]
Mochitest version of above testcase

This patch adds a test (and the test directory) to /embedding/test/. The directory /embedding/tests/ already exists, but it contains compiled-code and xpcshell tests, and convention seems to be to put mochitests in a test/ dir.

The test itself uses wbp in a manner similar to File > Save Page As. It loads a form in an iframe, fills it out, and saves the form to a file in the temp dir. Because mochitests can't read file:/// URLs, the test then reads back the persisted file contents and loads the contents into a second iframe. Finally, the two iframes are confirmed to have the same form state.

As a last note, the "loads the contents into an iframe" bit seems to trigger a known (but still unfortunate) assertion:

###!!! ASSERTION: Adding child where we already have a child?  This will likely misbehave: 'Error', file c:/moz/mozilla-central/docshell/shistory/src/nsSHEntry.cpp, line 599
Comment 29 Ben Karel [eschew] 2008-08-11 23:19:08 PDT
Created attachment 333368 [details] [diff] [review]
Updated mochitest patch

Sorry for the bugspam, but the patch above just needed one minor fix:

changed
@@ -0,0 +1,141 @@
to
@@ -0,0 +1,142 @@
Comment 30 Boris Zbarsky [:bz] (Out June 25-July 6) 2008-08-20 19:46:09 PDT
Comment on attachment 333368 [details] [diff] [review]
Updated mochitest patch

This looks great!  Thanks!
Comment 31 Boris Zbarsky [:bz] (Out June 25-July 6) 2008-08-25 10:27:40 PDT
Test checked in.

Note You need to log in before you can comment on or make changes to this bug.