Closed Bug 947212 Opened 7 years ago Closed 7 years ago

Broadcast form data and move it out of tabData.entries[]

Categories

(Firefox :: Session Restore, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Firefox 29

People

(Reporter: ttaubert, Assigned: ttaubert)

References

Details

Attachments

(2 files, 1 obsolete file)

Same as bug 921942, just for form data.
Depends on: 947216
Depends on: 947262
This patch does:

1) Merge TextAndScrollData.jsm and DocumentUtils.jsm into FormData.jsm.
2) Moves form data out of tabData.entries[] as advertised.
3) Introduces a PrivacyLevelFilter that is used to filter collected data according to the privacy level as set by the user.
4) Keeps support for the old form data format.
Attachment #8357316 - Flags: review?(dteller)
Lots of test changes...
Attachment #8357317 - Flags: review?(dteller)
Comment on attachment 8357316 [details] [diff] [review]
0001-Bug-947212-Broadcast-form-data-and-move-it-out-of-ta.patch

Review of attachment 8357316 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. That should contribute to cleaning up sessionstore.js, too.

::: browser/components/sessionstore/content/content-sessionStore.js
@@ +235,5 @@
> +    addEventListener("change", this, true);
> +    gFrameTree.addObserver(this);
> +  },
> +
> +  handleEvent: function (event) {

Shouldn't we check event.persisted here?

::: browser/components/sessionstore/src/FormData.jsm
@@ +16,5 @@
> + * Returns whether the given URL very likely has input
> + * fields that contain serialized session store data.
> + */
> +function isRestorationPage(url) {
> +  return /^about:(sessionrestore|welcomeback)$/.test(url);

Why not simply
 url == "about:sessionrestore" || url == "about:welcomeback"
?

@@ +32,5 @@
> +  return false;
> +}
> +
> +/**
> + * Returns the given document's current URI.

Could you mention that we're stripping anchors here?

@@ +61,5 @@
> + * This module's internal API.
> + */
> +let FormDataInternal = {
> +  /**
> +   * Collect form data for a given |frame|.

Please mention that we're not collecting for subframes.

@@ +71,5 @@
> +   * The "innerHTML" key is used for editable documents (designMode=on).
> +   *
> +   * Example:
> +   *   {
> +   *     id: {input1: "value1"},

Could you make it clearer in your example that we can have several fields in |id| and in |xpath|.
Also, what happens if the id is |prototype| or any reserved name? That sounds like food for a followup bug.

@@ +132,5 @@
> +        value = { selectedIndex: node.selectedIndex, value: node.value };
> +      } else {
> +        // <select>s with the multiple attribute are easier to determine the
> +        // default value since each <option> has a defaultSelected
> +        let options = Array.map(node.options, function(aOpt, aIx) {

Names aOpt, aIx don't really match the style of the rest of the file.

@@ +151,5 @@
> +          generatedCount++;
> +          ret.xpath = ret.xpath || {};
> +          ret.xpath[XPathGenerator.generate(node)] = value;
> +        }
> +      }

Maybe
 if (hasDefaultValue) {
   continue;
 }

@@ +172,5 @@
> +    // We want to avoid saving data for about:sessionrestore as a string.
> +    // Since it's stored in the form as stringified JSON, stringifying further
> +    // causes an explosion of escape characters. cf. bug 467409
> +    if (isRestorationPage(ret.url)) {
> +      ret.id.sessionData = JSON.parse(ret.id.sessionData);

Note for followup bug: that looks slow.

@@ +198,5 @@
> +    // For about:{sessionrestore,welcomeback} we saved the field as JSON to
> +    // avoid nested instances causing humongous sessionstore.js files.
> +    // cf. bug 467409
> +    if (hasRestorationData(data)) {
> +      data.id.sessionData = JSON.stringify(data.id.sessionData);

Note for followup bug: that looks slow.

@@ +217,5 @@
> +      // may navigate away before the setTimeout handler runs. We do
> +      // a simple comparison against savedURL to check for that.
> +      let savedURL = doc.documentURI;
> +
> +      setTimeout(() => {

What's that setTimeout for?

@@ +236,5 @@
> +   * @param retrieve (function)
> +   *        The function used to retrieve the input field belonging to a key
> +   *        in the given |data| object.
> +   */
> +  restoreInputValues: function (data, retrieve) {

Nit: Could you rename this |restoreManyInputValues| to make it harder to confuse it with |restoreInputValue|? Alternatively, you could rename |restoreInputValue| to |restoreSingleInputValue|.

@@ +291,5 @@
> +    } else if (aValue && aValue.fileList && aValue.type == "file" &&
> +      aNode.type == "file") {
> +      aNode.mozSetFileNameArray(aValue.fileList, aValue.fileList.length);
> +      eventType = "input";
> +    } else if (aValue && typeof aValue.indexOf == "function" && aNode.options) {

What's that case? Shouldn't the test be |Array.isArray(aValue)|?

@@ +303,5 @@
> +        }
> +      });
> +    }
> +
> +    // Fire events for this node if applicable

So, why exactly do we need to fire an event?

::: browser/components/sessionstore/src/PrivacyLevelFilter.jsm
@@ +73,5 @@
> +
> +    for (let key of Object.keys(data)) {
> +      if (key === "children") {
> +        let recurse = child => this.filterFormData(child, isPinned);
> +        let children = data.children.map(recurse).filter(child => child);

That filter doesn't look useful, does it?

::: browser/components/sessionstore/src/SessionStore.jsm
@@ +2983,5 @@
>        PageStyle.restoreTree(aBrowser.docShell, pageStyle);
>      }
>  
> +    // We need to support the old form and scroll data for a while at least.
> +    for (let [frame, data] of frameList) {

I thought that style was deprecated?
Attachment #8357316 - Flags: review?(dteller) → review+
(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo?") from comment #3)
> ::: browser/components/sessionstore/content/content-sessionStore.js
> @@ +235,5 @@
> > +    addEventListener("change", this, true);
> > +    gFrameTree.addObserver(this);
> > +  },
> > +
> > +  handleEvent: function (event) {
> 
> Shouldn't we check event.persisted here?

No, event.persisted is only defined for the "pageshow" event. That is correctly checked above.

> ::: browser/components/sessionstore/src/FormData.jsm
> @@ +16,5 @@
> > + * Returns whether the given URL very likely has input
> > + * fields that contain serialized session store data.
> > + */
> > +function isRestorationPage(url) {
> > +  return /^about:(sessionrestore|welcomeback)$/.test(url);
> 
> Why not simply
>  url == "about:sessionrestore" || url == "about:welcomeback"
> ?

I felt like it was more elegant to do that with a regex but maybe it's not. Changed it back to your version.

> @@ +32,5 @@
> > +  return false;
> > +}
> > +
> > +/**
> > + * Returns the given document's current URI.
> 
> Could you mention that we're stripping anchors here?

Yes, good idea.

> @@ +61,5 @@
> > + * This module's internal API.
> > + */
> > +let FormDataInternal = {
> > +  /**
> > +   * Collect form data for a given |frame|.
> 
> Please mention that we're not collecting for subframes.

Done.

> @@ +71,5 @@
> > +   * The "innerHTML" key is used for editable documents (designMode=on).
> > +   *
> > +   * Example:
> > +   *   {
> > +   *     id: {input1: "value1"},
> 
> Could you make it clearer in your example that we can have several fields in
> |id| and in |xpath|.

Done.

> Also, what happens if the id is |prototype| or any reserved name? That
> sounds like food for a followup bug.

Irgs, yes. Filed bug 958499.

> @@ +132,5 @@
> > +        value = { selectedIndex: node.selectedIndex, value: node.value };
> > +      } else {
> > +        // <select>s with the multiple attribute are easier to determine the
> > +        // default value since each <option> has a defaultSelected
> > +        let options = Array.map(node.options, function(aOpt, aIx) {
> 
> Names aOpt, aIx don't really match the style of the rest of the file.

Yeah, I really wanted to change that but didn't want to make the review harder. I'll clean that up.

> @@ +151,5 @@
> > +          generatedCount++;
> > +          ret.xpath = ret.xpath || {};
> > +          ret.xpath[XPathGenerator.generate(node)] = value;
> > +        }
> > +      }
> 
> Maybe
>  if (hasDefaultValue) {
>    continue;
>  }

Done.

> @@ +172,5 @@
> > +    // We want to avoid saving data for about:sessionrestore as a string.
> > +    // Since it's stored in the form as stringified JSON, stringifying further
> > +    // causes an explosion of escape characters. cf. bug 467409
> > +    if (isRestorationPage(ret.url)) {
> > +      ret.id.sessionData = JSON.parse(ret.id.sessionData);
> 
> Note for followup bug: that looks slow.

Filed bug 958501. I agree that this looks rather suboptimal but seems better than blowing up sessionstore.js. about:sessionrestore tabs should be rather rare.

> @@ +217,5 @@
> > +      // may navigate away before the setTimeout handler runs. We do
> > +      // a simple comparison against savedURL to check for that.
> > +      let savedURL = doc.documentURI;
> > +
> > +      setTimeout(() => {
> 
> What's that setTimeout for?

I don't know for sure. It was in the old code and I assume it was there so that JavaScript running onload has a chance to set document.designMode=on before we check whether that is set and restore form data.

> @@ +236,5 @@
> > +   * @param retrieve (function)
> > +   *        The function used to retrieve the input field belonging to a key
> > +   *        in the given |data| object.
> > +   */
> > +  restoreInputValues: function (data, retrieve) {
> 
> Nit: Could you rename this |restoreManyInputValues| to make it harder to
> confuse it with |restoreInputValue|? Alternatively, you could rename
> |restoreInputValue| to |restoreSingleInputValue|.

Yes, I like the idea as I got a little confused when writing the patch, too. I renamed both functions adding "Many" and "Single".

> @@ +291,5 @@
> > +    } else if (aValue && aValue.fileList && aValue.type == "file" &&
> > +      aNode.type == "file") {
> > +      aNode.mozSetFileNameArray(aValue.fileList, aValue.fileList.length);
> > +      eventType = "input";
> > +    } else if (aValue && typeof aValue.indexOf == "function" && aNode.options) {
> 
> What's that case? Shouldn't the test be |Array.isArray(aValue)|?

Yeah, you're right. I adopted that old code and didn't think too much about it. Thanks for catching that.

> @@ +303,5 @@
> > +        }
> > +      });
> > +    }
> > +
> > +    // Fire events for this node if applicable
> 
> So, why exactly do we need to fire an event?

So that code on the page and sessionstore itself knows that these fields have been changed as if the user did that.

> ::: browser/components/sessionstore/src/PrivacyLevelFilter.jsm
> @@ +73,5 @@
> > +
> > +    for (let key of Object.keys(data)) {
> > +      if (key === "children") {
> > +        let recurse = child => this.filterFormData(child, isPinned);
> > +        let children = data.children.map(recurse).filter(child => child);
> 
> That filter doesn't look useful, does it?

I could turn that into |.filter(child => !!child)| if you think that's more expressive. It is used to filter "null" values from the array, i.e. subframes that don't have any form data to store.

> ::: browser/components/sessionstore/src/SessionStore.jsm
> @@ +2983,5 @@
> >        PageStyle.restoreTree(aBrowser.docShell, pageStyle);
> >      }
> >  
> > +    // We need to support the old form and scroll data for a while at least.
> > +    for (let [frame, data] of frameList) {
> 
> I thought that style was deprecated?

Iterator() is deprecated. frameList is actually a two-dimensional array.
Attachment #8357316 - Attachment is obsolete: true
Attachment #8358371 - Flags: review+
Depends on: 958809
Comment on attachment 8357317 [details] [diff] [review]
0002-Bug-947212-Tests-for-broadcasting-form-data.patch

Review of attachment 8357317 [details] [diff] [review]:
-----------------------------------------------------------------

Also, since you're changing the format of sessionstore.js, albeit slightly, blogging-needed!

::: browser/components/sessionstore/test/browser_346337.js
@@ +1,3 @@
> +/* Any copyright is dedicated to the Public Domain.
> + * http://creativecommons.org/publicdomain/zero/1.0/ */
> +

Much nicer, thanks. Could you take the opportunity to add "use strict" to all these tests?

@@ +3,5 @@
> +
> +const URL = ROOT + "browser_346337_sample.html";
> +
> +/**
> + * Bug 346337 - Generic form data restoration tests.

Maybe it's a good time to rename this test?

::: browser/components/sessionstore/test/browser_454908.js
@@ -12,5 @@
> -    passwd:   "pwd" + Date.now()
> -  };
> -
> -  // make sure we do save form data
> -  gPrefService.setIntPref("browser.sessionstore.privacy_level", 0);

Just in case, I would set that pref at the start of the test.

@@ +27,5 @@
> +  let username = yield getInputValue(browser, {id: "username"});
> +  is(username, usernameValue, "username was saved/restored");
> +  let passwd = yield getInputValue(browser, {id: "passwd"});
> +  is(passwd, "", "password wasn't saved/restored");
> +

It would be nice to also check whether the password appears in the sessionstore.js. Maybe in a followup mentored bug if you prefer.

::: browser/components/sessionstore/test/browser_463205.js
@@ -14,5 @@
> -
> -  let mainURL = testURL;
> -  let frame1URL = "data:text/html;charset=utf-8,<input%20id='original'>";
> -  let frame2URL = rootDir + "browser_463205_helper.html";
> -  let frame3URL = "data:text/html;charset=utf-8,mark2";

So why did you remove all of these?

@@ +22,5 @@
> +  is(value, "foobar", "value was restored");
> +
> +  // Restore form data with an invalid URL.
> +  ss.setTabState(tab, getState("http://example.com/"));
> +  yield promiseTabRestored(tab);

While we're here, could we test also with example.com/1 and example.com/2 ? Also, can we have subdomains with example.com?

Possibly for a followup bug if you prefer.

::: browser/components/sessionstore/test/browser_485482.js
@@ +6,5 @@
> +/**
> + * Bug 485482 - Make sure that we produce valid XPath expressions even for very
> + * weird HTML documents.
> + */
> +add_task(function todo_name_me() {

Interesting name :)

::: browser/components/sessionstore/test/browser_662743.js
@@ -18,5 @@
> -    { "#select_id" : 2 },
> -    // invalid index
> -    { "#select_id" : 8 },
> -    { "/xhtml:html/xhtml:body/xhtml:select" : 5},
> -    { "/xhtml:html/xhtml:body/xhtml:select[@name='select_name']" : 6},

So, remind me, why can we remove this?

@@ -40,5 @@
> -  // combinations
> -    { "#select_id" : 3, id:{ "select_id": {"selectedIndex":1,"value":"val1"} } },
> -    { "#select_id" : 5, xpath: { "/xhtml:html/xhtml:body/xhtml:select[@name='select_name']" : {"selectedIndex":4,"value":"val4"} } },
> -    { "/xhtml:html/xhtml:body/xhtml:select" : 5, id:{ "select_id": {"selectedIndex":1,"value":"val1"} }},
> -    { "/xhtml:html/xhtml:body/xhtml:select[@name='select_name']" : 2, xpath: { "/xhtml:html/xhtml:body/xhtml:select[@name='select_name']" : {"selectedIndex":7,"value":"val7"} } }

And this?

::: browser/components/sessionstore/test/browser_916390_form_data_loss.js
@@ -1,2 @@
> -/* Any copyright is dedicated to the Public Domain.
> - * http://creativecommons.org/publicdomain/zero/1.0/ */

Is that replaced by browser_form_restore_events.js?

::: browser/components/sessionstore/test/browser_formdata.js
@@ +11,5 @@
> +  const URL = "http://mochi.test:8888/browser/browser/components/" +
> +              "sessionstore/test/browser_formdata_sample.html";
> +
> +  const OUTER_VALUE = Math.random();
> +  const INNER_VALUE = Math.random();

Nit: Could you add the name of the test to these values? This makes for easier debugging whenever there's a session restore leak between tests.

@@ +110,5 @@
> + * This test ensures that a malicious website can't trick us into restoring
> + * form data into a wrong website and that we always check the stored URL
> + * before doing so.
> + */
> +add_task(function test_url_check() {

Nice.
I suspect that it would be interesting to test this also with an example.org url.

::: browser/components/sessionstore/test/browser_formdata_format.js
@@ +105,4 @@
>  
> +    // clean up
> +    gBrowser.removeTab(tab);
> +  }).then(aCallback);

Mmmh...
Ok, if the task throws, this will timeout hence fail. Could you add a comment on that point?

::: browser/components/sessionstore/test/head.js
@@ +524,5 @@
>  function sendMessage(browser, name, data = {}) {
>    browser.messageManager.sendAsyncMessage(name, data);
>    return promiseContentMessage(browser, name);
>  }
> +

Nit: A little doc here?
Attachment #8357317 - Flags: review?(dteller) → review+
(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo?") from comment #6)
> Also, since you're changing the format of sessionstore.js, albeit slightly,
> blogging-needed!

Yes, blog post is in the works. I hope to get that finished soon.

> Much nicer, thanks. Could you take the opportunity to add "use strict" to
> all these tests?

Sure.

> > +const URL = ROOT + "browser_346337_sample.html";
> > +
> > +/**
> > + * Bug 346337 - Generic form data restoration tests.
> 
> Maybe it's a good time to rename this test?

Yo.

> ::: browser/components/sessionstore/test/browser_454908.js
> > -  // make sure we do save form data
> > -  gPrefService.setIntPref("browser.sessionstore.privacy_level", 0);
> 
> Just in case, I would set that pref at the start of the test.

Done.

> @@ +27,5 @@
> > +  let username = yield getInputValue(browser, {id: "username"});
> > +  is(username, usernameValue, "username was saved/restored");
> > +  let passwd = yield getInputValue(browser, {id: "passwd"});
> > +  is(passwd, "", "password wasn't saved/restored");
> > +
> 
> It would be nice to also check whether the password appears in the
> sessionstore.js. Maybe in a followup mentored bug if you prefer.

Good idea, done.

> ::: browser/components/sessionstore/test/browser_463205.js
> @@ -14,5 @@
> > -
> > -  let mainURL = testURL;
> > -  let frame1URL = "data:text/html;charset=utf-8,<input%20id='original'>";
> > -  let frame2URL = rootDir + "browser_463205_helper.html";
> > -  let frame3URL = "data:text/html;charset=utf-8,mark2";
> 
> So why did you remove all of these?

The test was overly complicated and I just simplified it.

> @@ +22,5 @@
> > +  // Restore form data with an invalid URL.
> > +  ss.setTabState(tab, getState("http://example.com/"));
> > +  yield promiseTabRestored(tab);
> 
> While we're here, could we test also with example.com/1 and example.com/2 ?
> Also, can we have subdomains with example.com?

Um.. why? There's not much more coverage to gain from testing /1 and /2 or any subdomains. The domain check is just a simple string comparison.

> ::: browser/components/sessionstore/test/browser_485482.js
> > +add_task(function todo_name_me() {
> 
> Interesting name :)

Fixed.

> ::: browser/components/sessionstore/test/browser_662743.js
> @@ -18,5 @@
> > -    { "#select_id" : 2 },
> > -    // invalid index
> > -    { "#select_id" : 8 },
> > -    { "/xhtml:html/xhtml:body/xhtml:select" : 5},
> > -    { "/xhtml:html/xhtml:body/xhtml:select[@name='select_name']" : 6},
> 
> So, remind me, why can we remove this?

The form data format has changed 1.75yrs ago (Fx 15, bug 697903) and I thought that it's about time to get rid of the backwards compatibility now that we need to change it once again. This won't be the only breaking change in Fx 29 and I figured it would be better to batch that.

> @@ -40,5 @@
> > -  // combinations
> > -    { "#select_id" : 3, id:{ "select_id": {"selectedIndex":1,"value":"val1"} } },
> > -    { "#select_id" : 5, xpath: { "/xhtml:html/xhtml:body/xhtml:select[@name='select_name']" : {"selectedIndex":4,"value":"val4"} } },
> > -    { "/xhtml:html/xhtml:body/xhtml:select" : 5, id:{ "select_id": {"selectedIndex":1,"value":"val1"} }},
> > -    { "/xhtml:html/xhtml:body/xhtml:select[@name='select_name']" : 2, xpath: { "/xhtml:html/xhtml:body/xhtml:select[@name='select_name']" : {"selectedIndex":7,"value":"val7"} } }
> 
> And this?

Same.

> ::: browser/components/sessionstore/test/browser_916390_form_data_loss.js
> @@ -1,2 @@
> > -/* Any copyright is dedicated to the Public Domain.
> > - * http://creativecommons.org/publicdomain/zero/1.0/ */
> 
> Is that replaced by browser_form_restore_events.js?

No, I removed that test because the code that caused the failure doesn't exist anymore. We don't use __SS_data as a cache anymore in case only the hash of a URL changes.

> ::: browser/components/sessionstore/test/browser_formdata.js
> > +  const OUTER_VALUE = Math.random();
> > +  const INNER_VALUE = Math.random();
> 
> Nit: Could you add the name of the test to these values? This makes for
> easier debugging whenever there's a session restore leak between tests.

Done.

> @@ +110,5 @@
> > + * This test ensures that a malicious website can't trick us into restoring
> > + * form data into a wrong website and that we always check the stored URL
> > + * before doing so.
> > + */
> > +add_task(function test_url_check() {
> 
> Nice.
> I suspect that it would be interesting to test this also with an example.org
> url.

I don't think that's necessary. We check the URL and if that doesn't match we don't restore, no matter what the scheme or protocol that is used.

> ::: browser/components/sessionstore/test/browser_formdata_format.js
> @@ +105,4 @@
> >  
> > +    // clean up
> > +    gBrowser.removeTab(tab);
> > +  }).then(aCallback);
> 
> Mmmh...
> Ok, if the task throws, this will timeout hence fail. Could you add a
> comment on that point?

Done.

> ::: browser/components/sessionstore/test/head.js
> @@ +524,5 @@
> >  function sendMessage(browser, name, data = {}) {
> >    browser.messageManager.sendAsyncMessage(name, data);
> >    return promiseContentMessage(browser, name);
> >  }
> > +
> 
> Nit: A little doc here?

Done.
Blocks: 960903
https://hg.mozilla.org/mozilla-central/rev/14a4f50a4681
https://hg.mozilla.org/mozilla-central/rev/35b6230f10c1
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Whiteboard: [fixed-in-fx-team]
Target Milestone: --- → Firefox 29
Depends on: 864731
You need to log in before you can comment on or make changes to this bug.