Closed Bug 293834 Opened 20 years ago Closed 16 years ago

Save Page As should save form inputs' state

Categories

(Core Graveyard :: File Handling, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.9.1a2

People

(Reporter: mozilla-bugzilla, Assigned: eschew)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 6 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.7) Gecko/20050414
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.7) Gecko/20050414

When saving a page as an HTML file, it should be the page as it currently
appears in the browser.  Specifically, any text that's been entered in form
fields should be preserved, radio button and checkbox state be retained, etc.

One place this would be very desirable is in saving pages that you're filling
out as part of an online transaction.  Right now you only have the option to
print the pages to have a record of your inputs, but that approach has many
limitations, including not being able to copy & paste data from the saved page
later, having text that appears in TEXTAREAs with scrollbars be truncated, etc.

This came up for me today when I was trying to enter a bunch of text in an
online evaluation form and the site was giving me a generic insert error.  When
I reported the problem to the webmaster, he requested I send him a screenshot of
the data I was trying to submit, but I had to reply that this would be
insufficient because the form in question was many screenfulls long and I had
typed much text in TEXTAREAs that would be truncated in screenshots.  Mozilla
not giving me any way to save the form with my inputs, I was forced to save the
basic HTML and then manually edit it to add SELECTED attributes to OPTION tags,
CHECKED to checkboxes and radio buttons, VALUE attributes to TEXT inputs, and
paste typed text into the TEXTAREAs.

This feature would also be very useful to proactively protect oneself from data
loss on those pages that give you an error (e.g. because you've been typing so
long that your session expired) and then when you hit Back all the form fields
have been cleared to their default state.  If you had typed a large amount of
text that you wanted to make sure you didn't lose, you could simply save the
page, and then if you got an error you could easily copy and paste your text
into a new instance of the page when retrying.

Reproducible: Always

Steps to Reproduce:
1. Change some inputs on a form page.
2. Do Save Page As.
3. Open the file.

Actual Results:  
The file does not include any changes you made to the form inputs.

Expected Results:  
It should preserve your entered text and other form input changes.
jst, biesi, darin, thoughts?  This seems like a reasonable idea to me...
hm, yeah, I think this is a good idea.
This is an automated message, with ID "auto-resolve01".

This bug has had no comments for a long time. Statistically, we have found that
bug reports that have not been confirmed by a second user after three months are
highly unlikely to be the source of a fix to the code.

While your input is very important to us, our resources are limited and so we
are asking for your help in focussing our efforts. If you can still reproduce
this problem in the latest version of the product (see below for how to obtain a
copy) or, for feature requests, if it's not present in the latest version and
you still believe we should implement it, please visit the URL of this bug
(given at the top of this mail) and add a comment to that effect, giving more
reproduction information if you have it.

If it is not a problem any longer, you need take no action. If this bug is not
changed in any way in the next two weeks, it will be automatically resolved.
Thank you for your help in this matter.

The latest beta releases can be obtained from:
Firefox:     http://www.mozilla.org/projects/firefox/
Thunderbird: http://www.mozilla.org/products/thunderbird/releases/1.5beta1.html
Seamonkey:   http://www.mozilla.org/projects/seamonkey/
This needs to be done over in webbrowserpersist...
Assignee: dom-to-text → file-handling
Status: UNCONFIRMED → NEW
Component: DOM to Text Conversion → File Handling
Ever confirmed: true
Keywords: helpwanted
QA Contact: ian
Whiteboard: [good first bug]
We could perhaps do this by comparing value to defaultvalue for textarea/input, and so forth for the other controls.

Ryan, do you want to take a stab at this?
(In reply to comment #5)
> We could perhaps do this by comparing value to defaultvalue for textarea/input,
> and so forth for the other controls.
> 
> Ryan, do you want to take a stab at this?
> 

Sure, I'll take a shot!

I've got a few questions though, should we deal with select's for example  (as in make sure the item that was selected will be selected when the saved file is loaded)? Also, should we save the file path for input type="file" for example?
Assignee: file-handling → sciguyryan
Dealing with <select>s could be nice, but might be a bit of trouble; could do that in a followup if desired.

I don't think we should save file paths.  It wouldn't do any good anyway, since there's no way to prefill a file input by parsing the HTML, so...
OK, I'll leave the handling of selects too a follow up bug. I'll try to look at this sometime this week or weekend.
Taking this bug with permission from Ryan.

The content serialization code works by recursively serializing nodes with stored attributes into a string of tags with key="value" pairs. The problem is that attributes are used to store default values, while DOM state, such as user input to form elements, is stored in the elements themselves. Thus, this patch puts some code in right before attributes are serialized and updates the attributes to reflect the current state of the given element, including any user input. The existing framework takes care of the rest.

Attached patch handles textarea, textbox, checkbox, radio, and option/select.
Assignee: sciguyryan → web+moz
Status: NEW → ASSIGNED
That doesn't look like the right approach.  It changes the DOM you're saving, which is a problem, imo.  In particular, saving a website could break its functioning.

What you want to do instead is to munge things on the cloned "fixup" nodes that web browser persist creates.
Okay, so doing stuff in webbrowserpersist works for checkboxes, radio buttons, textboxes, and selects... but not textareas.

The problem is that there's an assumption in the design of the current system, that attributes are the only thing that could need fixing. But textareas violate that rule: textareas may have child nodes in the original DOM that shouldn't be serialized as-is! Combined with the desire to leave the original DOM intact, that leaves us between a rock and a hard place.

Possible solutions:

1) Go ahead and modify the DOM when serializing textareas. The only thing that would be modified would be either adding or changing the content of a single #text node under each textarea. Advantages: simple to write and understand. Disadvantages: modifies uncloned DOM.

2) Add a special case to nsDocumentEncoder::SerializeToStringRecursive's handling of child node serialization. Advantages: simple to write and understand, does not modify uncloned DOM. Disadvantages: violates the document-language-agnosticism of nsDocumentEncoder, adds a QI before looping over child nodes (perf issue?).

3) Play fast and loose with the content serializer interface, add a method like EncoderShouldSkipChildSerialization (or some such). This would, at the end of the day, be rather like option number 2. It would be more flexible, but also harder to understand, code-wise, because it would split the work done in one place in option 2 into at least three separate places in at least two files. More, if you could the unused stubs in the non-HTML content serializer implementations. Advantages: arguably less hackish than option 2. Disadvantages: necessitates adding stubs to at least four content serializer implementations. Added complexity may not outweigh less hackish nature.

Other approaches I'm missing?

Since I already attached a patch that does option 1, this patch tries option 2 on for size. I actually started with approach 3 at the very beginning, but discarded it when I found the joys of nsContentUtils::SetNodeTextContent in patch #1.

Also, since it's not just URI-related attributes that are being fixed up now, perhaps the name of nsWebBrowserPersist::CloneNodeWithFixedUpURIAttributes should have the "URI" removed?
Attachment #277463 - Attachment is obsolete: true
Attachment #277698 - Flags: review?(bzbarsky)
Peter, Jonas, what do you think is a good approach for the textarea issue?

My instinct is that we should expand nsIDocumentEncoderNodeFixup to allow fixing up the kids too (that is, move the fixup up a level from SerializeNodeStart and allow fixupNode() to return a boolean or something indicating whether the clone's kids should be used instead of the original node's kids.
Yeah, that sounds like a good idea.
Attached patch Like so? (obsolete) — Splinter Review
* Extends the fixupNode method with an [out] boolean param
* Removes "URI" from "CloneNodeWithFixedUpURIAttributes", since we're now fixing up more than just URI attributes. Technically, with textareas, we're fixing up more than just attributes, too, but I think it's clearer with just URI removed.
* Moves a call to FixupNode out of SerializeNodeStart, along with a new default parameter so that all the other callers of SerializeNodeStart (who don't directly iterate over child nodes) can continue to assume that SerializeNodeStart will do fixup for them.
Attachment #277698 - Attachment is obsolete: true
Attachment #277856 - Flags: review?(bzbarsky)
Attachment #277698 - Flags: review?(bzbarsky)
Comment on attachment 277856 [details] [diff] [review]
Like so?

>Index: content/base/src/nsDocumentEncoder.cpp
> nsDocumentEncoder::SerializeToStringRecursive
>+  nsCOMPtr<nsIDOMNode> node, maybeFixedNode;

|node| can just be an nsIDOMNOde* here, no?

>Index: content/base/public/nsIDocumentEncoder.idl
>+   * apply recursive serialization to the children of the cloned node, instead

s/cloned/fixed up/

>Index: embedding/components/webbrowserpersist/src/nsWebBrowserPersist.cpp
>+nsWebBrowserPersist::CloneNodeWithFixedUpAttributes(
>+                    rv = nodeAsInput->GetValue(valueStr);
>+                    if (NS_SUCCEEDED(rv) && !valueStr.IsEmpty())

So... if NS_FAILED(rv) should we really go on?  Not that that will happen here.  On the other hand, why not do this if valueStr.IsEmpty()?

>+                      outElt->SetAttribute(NS_LITERAL_STRING("value"), valueStr);

Would it be clearer to SetDefaultValue(valueStr)?

>+                    if (checked)
>+                      outElt->SetAttribute(checkedAttr, NS_LITERAL_STRING("true"));
>+                    else
>+                      outElt->RemoveAttribute(checkedAttr);

Again, would |outElt->SetDefaultChecked(checked)| make more sense?  Especially since checked="true" is not actually valid HTML...

>+                    break;

Toss in a "default: break" to prevent build warnings?

>+    nsCOMPtr<nsIDOMHTMLTextAreaElement> nodeAsTextArea = do_QueryInterface(aNodeIn);
....
>+            nsCOMPtr<nsIDOMNode> out = do_QueryInterface(*aNodeOut);

You just need that for convenience, right?  Why not:

  nsIDOMNode* out = *aNodeOut;

then?

>+            // Remove the textarea's child text nodes (should be at most one)

Shouldn't be any, in fact...

>+            rv = doc->CreateTextNode(valueStr, getter_AddRefs(childText));

etc.  How about replacing all that with a QI to nsIDOM3Node followed by a call to SetTextContent(valueStr) on he result.  That will handle removing existing kids (none), creating the text node, setting it to the right string, appending it, etc, etc.

>+    nsCOMPtr<nsIDOMHTMLOptionElement> nodeAsOption = do_QueryInterface(aNodeIn);
....
>+            if (selected)
>+              outElt->SetAttribute(selectedStr, NS_LITERAL_STRING("true"));
>+            else
>+              outElt->RemoveAttribute(selectedStr);

  SetDefaultSelected(selected);

Rest looks good.
Attachment #277856 - Flags: review?(bzbarsky) → review-
Attached patch update to review comments 1 (obsolete) — Splinter Review
Updated to review comments.

I believe it is actually possible for the fixed up text area element we get to have child nodes, if mPersistFlags has PERSIST_FLAGS_FIXUP_ORIGINAL_DOM. It usually won't, but look at
http://mxr.mozilla.org/seamonkey/source/embedding/components/webbrowserpersist/src/nsWebBrowserPersist.cpp#2932

>+           // Avoid superfluous value="" serialization
>+           if (valueStr.IsEmpty())
>+             outElt->RemoveAttribute(NS_LITERAL_STRING("value"));
>+           else
>+             outElt->SetDefaultValue(valueStr);

This bit isn't strictly necessary, it just makes the serialization a little bit cleaner. You're right, though, about the last patch's logic being a little dodgy. In the case of an un-cloned textbox with a default value which the user then erased, the last patch would have serialized the default value even though the user had explicitly changed it.
Attachment #277856 - Attachment is obsolete: true
Attachment #278065 - Flags: review?(bzbarsky)
Comment on attachment 278065 [details] [diff] [review]
update to review comments 1

>+                    if (valueStr.IsEmpty())
>+                      outElt->RemoveAttribute(NS_LITERAL_STRING("value"));
>+                    else
>+                      outElt->SetDefaultValue(valueStr);


I guess SetDefaultValue() doesn't remove the attr, yeah.  In that case, RemoveAttribute/SetAttribute is clearer, I think.

With that changed back, r=bzbarsky
Attachment #278065 - Flags: review?(bzbarsky) → review+
Carrying over r+, requesting sr.
Attachment #278065 - Attachment is obsolete: true
Attachment #278185 - Flags: superreview?(jonas)
Attachment #278185 - Flags: review+
Just a note, I was looking at bug 115107 and realized that script elements follow the textarea pattern. That is, we want serialize (in nsDocumentEncoder) a #text child for script elements generated from information stored in the script element, in nsWebBrowserPersist.
You mean <style> elements, right?
Err, yeah, those things. ;-)
Or <link>.  Yes, that could be done.  We'd probably want to fix some CSS serializer bugs first...
Blocks: 115107
Comment on attachment 278185 [details] [diff] [review]
Goes back to SetAttribute instead of SetDefaultValue

Looks great, sorry for the extremely slow review on this one :(
Attachment #278185 - Flags: superreview?(jonas) → superreview+
This patch is the same content diff as patch 278185, but done with hg in a mozilla-central tree instead of Firefox 3; thus, a few of the line numbers are slightly different.

I don't have commit access, so I'm carrying over r+sr and marking this as checkin-needed.
Attachment #278185 - Attachment is obsolete: true
Attachment #331425 - Flags: superreview+
Attachment #331425 - Flags: review+
Checked in.  It'd be good to have some tests here too.....
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Flags: in-testsuite?
Keywords: checkin-needed
Resolution: --- → FIXED
Whiteboard: [good first bug]
Target Milestone: --- → mozilla1.9.1a2
I don't know how to go about doing automated testing of webbrowserpersist, but here's the testcase I used when writing the patch, cleaned up and possibly suitable for adding to Litmus.

Steps to test:
1) Open testcase in browser
2) Click "Fill Form" button
3) File > Save Page As
4) Open saved page and verify that the form is still filled.
Could write a test that saves a webpage to a file and make sure it saves the right thing...
Depends on: 448445
Depends on: 449692
This patch adds a test (and the test directory) to /embedding/test/. The directory /embedding/tests/ already exists, but it contains compiled-code and xpcshell tests, and convention seems to be to put mochitests in a test/ dir.

The test itself uses wbp in a manner similar to File > Save Page As. It loads a form in an iframe, fills it out, and saves the form to a file in the temp dir. Because mochitests can't read file:/// URLs, the test then reads back the persisted file contents and loads the contents into a second iframe. Finally, the two iframes are confirmed to have the same form state.

As a last note, the "loads the contents into an iframe" bit seems to trigger a known (but still unfortunate) assertion:

###!!! ASSERTION: Adding child where we already have a child?  This will likely misbehave: 'Error', file c:/moz/mozilla-central/docshell/shistory/src/nsSHEntry.cpp, line 599
Attachment #333367 - Flags: review?(bzbarsky)
Sorry for the bugspam, but the patch above just needed one minor fix:

changed
@@ -0,0 +1,141 @@
to
@@ -0,0 +1,142 @@
Attachment #333367 - Attachment is obsolete: true
Attachment #333368 - Flags: review?(bzbarsky)
Attachment #333367 - Flags: review?(bzbarsky)
Comment on attachment 333368 [details] [diff] [review]
Updated mochitest patch

This looks great!  Thanks!
Attachment #333368 - Flags: review?(bzbarsky) → review+
Test checked in.
Flags: in-testsuite? → in-testsuite+
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: