Closed Bug 347165 Opened 18 years ago Closed 18 years ago

Provide a faster internal method for accessing form control elements

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: jlurz24, Assigned: jlurz24)

References

Details

Attachments

(5 files, 4 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1b1) Gecko/20060710 Firefox/2.0b1
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1b1) Gecko/20060710 Firefox/2.0b1

Follow up to bug 302186

bz's comment:

> >+nsHTMLFormElement::ResetDefaultSubmitElement(PRBool aNotify)
> >+  nsCOMPtr<nsISimpleEnumerator> formControls;
> 
> I really wish we had a faster "internal" way of doing it.  This is ok for now,
> but want to file a followup bug on improving both this and the code in
> nsHTMLInputElement, perhaps?

Reproducible: Always
Not a style system thing at all.  ;)
Assignee: dbaron → general
Status: UNCONFIRMED → NEW
Component: Style System (CSS) → DOM: HTML
Ever confirmed: true
Depends on: 302186
I shouldn't pick components before drinking coffee in the morning. I'll take care of this one after bug 302186 lands.
Status: NEW → ASSIGNED
Hmmm, I thought that would assign it to me. I guess I don't have the privileges to do that.
You have to change the assignee field to affect who the bug's assigned to.  I just gave you the bits to do that, for what it's worth.
Assignee: general → jlurz24
Status: ASSIGNED → NEW
So what the nsFormControlIterator does is to perform an insertion sort on the list of elements not-in-elements list, and then lazily merges the list with the existing sorted list of elements.

Unfortunately, because image input elements are in the non-in-elements list we can't just iterate over the elements list. However, we should be able to do slightly better than the nsFormControlIterator in this case because we only care about a subset of the possible elements. Hmmm.
Even having the same algorithm, but returning nsIContent* (no addref) instead of the COM enumerator stuff would be great.
There are currently three users of GetControlEnumerator

1) nsHTMLFormElement uses it to find the new default submit button.
2) nsHTMLInputElement uses it to find the number of text input elements.
3) Accessable code uses it to get the default submit button.

Interestingly, these uses are all fairly different, but none actually needs the fully sorted list of elements. I was thinking instead of adding a public function to return a sorted element array, I would add a public function to count the number of text input elements. This would satisfy usage 2. For 1, I would change the internal code to directly search through the two arrays.

What are the rules with regard to linking of accessable code with layout code? Should I leave the public COM enumerator function, or directly use the GetDefaultSubmitElement? If possible it'd be a nice code size win to remove all the enumerator code.

Or, as option 2 I could write a generic function which returns a sorted list of  form controls. I'd prefer option 1 however, as it should be faster and less code overall. Any thoughts?
Hmm.. getting rid of all this code is _really_ tempting, but right now it has a certain flexibility...  I'd be maybe kinda in favor of leaving _something_ that lets the caller pass in an nsCOMArray<nsIFormControl> and fills it in with the controls.  jst, sicking, peterv, what do you think?

Having a function that outputs the number of text controls, or even just whether there is exactly one, might be a good idea no matter what.

As for accessibility, it should just use GetDefaultSubmitElement().  That's a virtual method, so no linkage issues.
It seems like all those three uses could simply use the elements array. I.e. they never care about the nodes that end up in nsFormControlList::mNotInElements. If so I don't think we should worry about getting the sorted list of the merge of mNotInElements and mElements.
Use #1 and use #3 want nodes that are in mNotInElements -- image inputs.
Attached file testcase
Single input test cases for me not to break.
The WF2 patch in bug 345822 is also using the enumerator internally.
Yeah.  That use is not order-dependent, so it could just walk elements and non-elements.
Once this is fixed, bug 349355 needs revisiting.  Can't add a dep because it causes a dep loop.  :(
I'd really like to remove the enumerator as it appears to be a good way to iterate over form elements, but it usually does way more than you want.

Would an acceptable path forward be to remove the enumerator and fix current users(I have a patch for this), and then file a follow up bug to discuss if and what to replace it with? If we decide to replace it with a new method I'll commit to implementing it.
Yeah, that seems OK to me.
CCing Alex because his patch is using GetControlEnumerator
Attached patch remove_enumerator_v1 (obsolete) — Splinter Review
Attachment #236896 - Flags: review?(bzbarsky)
I realized that there was one last use of the enumerator, for form control submissions, that I thought should stay sorted. The HTML4 spec didn't specify this case, but maintaining existing behavior seemed best. I rewrote the sorting code to sort directly into an array instead of creating the enumerator. This function will also make it easy to add a public method that returns a sorted list of form controls.

This still won't be that fast because CompareFormControlPosition QIs its arguments and addrefs them.
Test case for form control ordering on submission. Requires looking at the query string, which is rather lame but I couldn't think of a better way.
The WF2 stuff is going to need access to a sorted array of controls too. So it'd be good to have some function that can be reused for different callsites.
Oh, and form submission absolutely has to keep the same order as it does now. Chaning the order in any way is known to break lots of sites.
Comment on attachment 236896 [details] [diff] [review]
remove_enumerator_v1

>Index: content/html/content/src/nsHTMLFormElement.cpp
>+nsHTMLFormElement::FindDefaultSubmit() const
>+  // Find the first submit control in the elements list. This list is
>+  // in document order.

So I was wondering... The only caller is ResetDefaultSubmitElemen(), and the only caller of that is RemoveElement().  And RemoveElement() knows whether it removed the element from mElements or mNotInElements and where it used to be.  So we know it's pointless to search earlier than that in the relevant array.  Could it be worth it to propagate that information into here and use it to optimize?  Or too much complexity for little win?

>+  // The not-in-elements list is not in document order.

I have to ask... why not?  It seems like it would be pretty easy to do in nsHTMLFormElement::AddElement something like:

nsVoidArray & controlList = ShouldBeInElements(aChild) ?
   mControls->mElements : mControls->mNotInElements;

and then just go through the existing "ShouldBeInElements(aChild)" case, but use |controlList| throughout.  You might need to make mControls->mNotInElements an nsVoidArray, since nsSmallVoidArray has private inheritance from nsVoidArray.  Or you could make both arrays nsTArray<nsIFormControl*> and either just preallocate some space or not.  That does need an extra allocation, probably, but I think that's ok in this case.

sicking, do you have any array wisdom here?

In any case, if this list were sorted it seems like you could do better here and in nsFormControlList::GetSortedFormControls.

I haven't reviewed the sorting code details pending this stuff, but the rest looks good.  ;)
(In reply to comment #21)
> The WF2 stuff is going to need access to a sorted array of controls too. So
> it'd be good to have some function that can be reused for different callsites.
> 
Sure, I'll add an internal function to get at this list. It'll just be a wrapper around getSortedFormControls.
 
> Could it be worth it to propagate that information into here and use it to
> optimize?  Or too much complexity for little win?

I'll try this out and see if it can be done in relatively little code. This could be important for loops removing elements one by one.

> 
> >+  // The not-in-elements list is not in document order.
> 
> I have to ask... why not? 

I just looked and couldn't see any reason why, other than to avoid sorting on insertion. I like this idea, this should be less code and faster. I'm a big fan of nsTArray, so I'm hoping the extra allocation overhead isn't too bad.

> I haven't reviewed the sorting code details pending this stuff, but the rest
> looks good.  ;)
> 

A wise plan, a lot of the sorting code should go away in the next version.
I don't think it's worth it to use nsSmallVoidArray for either of the two arrays, since we can expect that it's resonably uncommon for either to contain exactly 1 element (which is the only case where nsSmallVoidArray saves memory).

Using nsVoidArray or nsTArray<> should not matter size-wise, so it seems like a good idea to use the latter since it's more typesafe.
Attached patch remove_enumerator_v2(-w) (obsolete) — Splinter Review
Attachment #236896 - Attachment is obsolete: true
Attachment #238328 - Flags: review?(bzbarsky)
Attachment #236896 - Flags: review?(bzbarsky)
Attached patch remove_enumerator_v2 (obsolete) — Splinter Review
This addresses all the review comments so far.

I also realized that the last-element optimization in addElement wasn't right in the case where the new child and the old default submit control were not in the same lists(in-elements vs. not-in-elements). I'll attach a testcase that shows this. Since I was working on that function anyway I went ahead and fixed that in this patch too.
Attached file Add element testcase
Or I could split that into another bug, not sure which makes the most sense.
Comment on attachment 238328 [details] [diff] [review]
remove_enumerator_v2(-w)

A couple of nits, but in general this looks great!

>Index: content/html/content/src/nsHTMLFormElement.cpp
>+   * @param aPrevDefaultInElements Whether the previous default submit control
>+   *                               was in the elements list.

So if this is false, it was in the non-elements list, right?  That should be clearly documented here, I think.

>+   * @param aPrevDefaultIndex The index of the previous default submit control.

And here we should say that this index is within the list determined by aPrevDefaultInElements.

This applies to the comments for both ResetDefaultSubmitElement and FindDefaultSubmit.

>@@ -1287,19 +1295,22 @@ nsHTMLFormElement::AddElement(nsIFormCon
>     // default submit element. To speed up parsing, the special case
>     // of the last element in a form that already has a default submit
>-    // element is ignored as it cannot be the default submit element.
>+    // element where the previous default submit element resides in the
>+    // same list as the new child is ignored.

Maybe more like:

  "To speed up parsing, the special case of a form that already
   has a default submit that's in the same list as the new child
   is ignored, since the new child then cannot be the default
   submit element."

That said, maybe we should actually keep track of the first submits in both mInElements and mNotInElements?  With the current setup I worry that a single submit followed by a bunch of image inputs will make life unhappy.  For example, we could store the indices in both, rather than the element pointers.  We'd need to maintain the indices a little on insertion and removal, and change GetDefaultSubmitElement to compare the two indices and return the right thing... but then in this code we could simply do an index compare and be done with it.  A separate bug on this, though.  What you have here is good enough for this one.

>+    if (!mDefaultSubmitElement ||
>+        ((!lastElement ||
>+         ShouldBeInElements(mDefaultSubmitElement) != childInElements) &&

The "Should..." part should be indented one more space, no?

>         CompareFormControlPosition(aChild, mDefaultSubmitElement) < 0)) {

As should this line.

>+nsHTMLFormElement::HasSingleTextControl() const
>+    nsIFormControl* currentControl = mControls->mElements[i];

You don't really need this temporary.  Or does it help you avoid the 80-col thing?  If so, keep it.

> nsFormControlList::nsFormControlList(nsHTMLFormElement* aForm) :

I think you should init the size of mElements here, as we used to do.  The default size of an nsAutoVoidArray is 8, so just do:

   nsFormControlList::nsFormControlList(nsHTMLFormElement* aForm) :
     mForm(aForm),
     mElements(8)   
   {

> nsFormControlList::Clear()

>+    nsIFormControl* f = mElements.ElementAt(i);
>+    f->SetForm(nsnull, PR_FALSE, PR_TRUE);

Again, no need for the temporary, esp. if you use [i] instead of ElementAt(i)

>+    nsIFormControl* f = mNotInElements.ElementAt(i);

And here.

> nsFormControlList::Item(PRUint32 aIndex, nsIDOMNode** aReturn)
>+    nsIFormControl *control = mElements.ElementAt(aIndex);
>     return CallQueryInterface(control, aReturn);

And here, too.

>+nsFormControlList::GetSortedControls(nsTArray<nsIFormControl*>& aControls) const
>+  while (elementsIdx < elementsLen || notInElementsIdx < notInElementsLen) {

I think we can do a better job of optimizing the (common) case of different lengths here.  Something like this:

  while (elementsIdx < elementsLen ||
         notInElementsIdx < notInElementsLen) {
    // Check whether we're done with mElements
    if (elementsIdx == elementsLen) {
      NS_ASSERTION(notInElementsIdx < notInElementsLen,
                  "How'd that happen?");
      // Append the remaining mNotInElements elements
      if (!aControls.AppendElements(mNotInElements.Elements() +
                                      notInElementsIdx,
                                    notInElementsLen -
                                      notInElementsIdx)) {
        return NS_ERROR_OUT_OF_MEMORY;
      }
      break;
    }

Then similarly for the case of notInElementsIdx == notInElementsLen, and then fall through to the case when we still have both arrays to deal with.  That way once you get to the end of one of the arrays you'll just do a single append and be done with it.  If you think that this is sufficiently more complex than what you have, let's not worry about it, though; I'm not sure how much it really optimizes.  ;)  Let me know.
        
>+  for (PRUint32 i = 0; i < aControls.Length() - 1; ++i) {

Have to be a little careful here.  For example, if aControls.Length() == 0 this will go from 0 to PR_UINT32_MAX.  You probably want to put this whole for loop inside a |if (aControls.Length() > 0)|.
Comment on attachment 238328 [details] [diff] [review]
remove_enumerator_v2(-w)

r- just to get it out of my list, pending the nits being picked.
Attachment #238328 - Flags: review?(bzbarsky) → review-
Blocks: 352980
(In reply to comment #31)

> That said, maybe we should actually keep track of the first submits in both
> mInElements and mNotInElements?

I agree, parsing performance could be pretty bad for certain forms. bug 353980 filed for this issue.

> I think we can do a better job of optimizing the (common) case of different
> lengths here. 

Interesting. I don't think this will be a huge amount faster than the code I had as it doesn't avoid any control comparisons or allocations, but it isn't any more complicated than what I had, and certainly is slightly faster, so I went with it.

Everything else fixed per comments. Patch in a moment.
Attached patch remove_enumerator_v3 (obsolete) — Splinter Review
Attachment #238329 - Attachment is obsolete: true
Attachment #238832 - Flags: review?(bzbarsky)
Attachment #238328 - Attachment is obsolete: true
Attachment #238833 - Attachment is patch: true
Attachment #238833 - Attachment mime type: application/octet-stream → text/plain
Comment on attachment 238832 [details] [diff] [review]
remove_enumerator_v3

>Index: content/html/content/src/nsHTMLFormElement.cpp

>+nsHTMLFormElement::FindDefaultSubmit(PRBool aPrevDefaultInElements,

>+    if (currControl->IsSubmitControl() && (!defaultSubmit ||
>+        CompareFormControlPosition(currControl, defaultSubmit) < 0)) {
>+      defaultSubmit = currControl;
>+      break;

So... now that mNotInElements is sorted, we can break out as soon as we hit an submit control, no?  Something like:

  if (currControl->IsSubmitControl()) {
    if (!defaultSubmit ||
        CompareFormControlPosition(currControl, defaultSubmit) < 0) {
      defaultSubmit = currControl;
    }
    break;
  }

Should save us a bit of walking.  If you'd rather defer this to just fixing bug 352980 I'm OK with that too, but I think this is a simple change with a good win in edge cases of a long mNotInElements.

>+nsFormControlList::GetSortedControls(nsTArray<nsIFormControl*>& 
>+    else if (notInElementsIdx == notInElementsLen) {

No need for the |else|.  The previous |if| has an unconditional break.

>+    // Both lists have elements left.
>+    else {

And this |else| could be removed, and its body outdented.

r+sr=bzbarsky with those nits picked.  Thanks again for being patient, and I'm sorry I didn't catch the FindDefaultSubmit thing last time around.  :(
Attachment #238832 - Flags: superreview+
Attachment #238832 - Flags: review?(bzbarsky)
Attachment #238832 - Flags: review+
> r+sr=bzbarsky with those nits picked.  Thanks again for being patient, and I'm
> sorry I didn't catch the FindDefaultSubmit thing last time around.  :(
> 

Not a problem at all, I really appreciate the thorough reviews. Nits fixed, patch coming momentarily.
Attachment #238832 - Attachment is obsolete: true
Attachment #238965 - Flags: superreview+
Attachment #238965 - Flags: review+
Fix checked in on trunk for 1.9 alpha 1.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Depends on: 353415
Comment on attachment 238965 [details] [diff] [review]
remove_enumerator_v4

>+#ifdef DEBUG
>+  if (!aControls.IsEmpty()) {
>+    for (PRUint32 i = 0; i < aControls.Length() - 1; ++i) {
>+      NS_ASSERTION(CompareFormControlPosition(aControls[i],
>+                     aControls[i + 1]) < 0,
>+                   "Form controls sorted incorrectly");
>+    }
>+  }
>+#endif
What does it mean if I hit this assertion entering a new bug into Bugzilla?
> What does it mean if I hit this assertion entering a new bug into Bugzilla?

It means Bugzilla's HTML sucks.  See bug 353415 (which is on the regression list for this bug).
Flags: in-testsuite?
Component: DOM: HTML → DOM: Core & HTML
QA Contact: ian → general
Depends on: CVE-2012-1976
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: