Last Comment Bug 324918 - This testcase triggers "index out of range" through nsHTMLSelectElement::RemoveOptionsFromListRecurse
: This testcase triggers "index out of range" through nsHTMLSelectElement::Remo...
Status: RESOLVED FIXED
[sg:investigate]
: assertion, fixed1.8.1, testcase, verified1.8.0.4
Product: Core
Classification: Components
Component: Layout: Form Controls (show other bugs)
: Trunk
: PowerPC Mac OS X
: -- critical (vote)
: ---
Assigned To: Jonas Sicking (:sicking) No longer reading bugmail consistently
:
Mentors:
Depends on: 330917
Blocks: stirdom
  Show dependency treegraph
 
Reported: 2006-01-27 03:08 PST by Jesse Ruderman
Modified: 2007-12-16 16:57 PST (History)
4 users (show)
dveditz: blocking1.8.0.4+
jruderman: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
testcase (478 bytes, application/xhtml+xml)
2006-01-27 03:10 PST, Jesse Ruderman
no flags Details
stack trace for the assertion (9.13 KB, text/plain)
2006-01-27 03:11 PST, Jesse Ruderman
no flags Details
Patch to fix (18.60 KB, patch)
2006-02-16 23:31 PST, Jonas Sicking (:sicking) No longer reading bugmail consistently
bzbarsky: review+
jst: superreview+
jst: approval‑branch‑1.8.1+
dveditz: approval1.8.0.4+
Details | Diff | Splinter Review
missing bits (2.55 KB, patch)
2006-03-01 18:48 PST, Jonas Sicking (:sicking) No longer reading bugmail consistently
bzbarsky: review+
bzbarsky: superreview+
jst: approval‑branch‑1.8.1+
dveditz: approval1.8.0.4+
Details | Diff | Splinter Review
Patch that landed on 1.8.1 branch (17.20 KB, patch)
2006-04-20 00:42 PDT, Jonas Sicking (:sicking) No longer reading bugmail consistently
no flags Details | Diff | Splinter Review
Patch that really landed on 1.8.1 branch (17.14 KB, patch)
2006-04-21 16:05 PDT, Jonas Sicking (:sicking) No longer reading bugmail consistently
no flags Details | Diff | Splinter Review

Description Jesse Ruderman 2006-01-27 03:08:50 PST
Steps to reproduce:
1. Load the testcase.

Result:

###!!! ASSERTION: index out of range: '0 <= aIndex && aIndex < Count()', file /Users/admin/trunk/mozilla/xpcom/glue/nsVoidArray.h, line 71
Break: at file /Users/admin/trunk/mozilla/xpcom/glue/nsVoidArray.h, line 71

Sicking noticed this assertion failure; and I figured out how to reproduce it and reduced it.

Security-sensitive because nsVoidArray::FastElementAt is not run-time checked.
Comment 1 Jesse Ruderman 2006-01-27 03:10:07 PST
Created attachment 209820 [details]
testcase
Comment 2 Jesse Ruderman 2006-01-27 03:11:31 PST
Created attachment 209821 [details]
stack trace for the assertion
Comment 3 Jonas Sicking (:sicking) No longer reading bugmail consistently 2006-02-16 23:31:30 PST
Created attachment 212198 [details] [diff] [review]
Patch to fix

So I fixed this on a few levels, since there were a few different bugs.

First off we tried to support <option>s that were children of non-optgroup elements. I.e. something like <select><p><option/></p><option/></select> would produce a select with two options. However, if options were added or removed from 'foregin' children using DOM methods the <select> would never be notified and so the options array would get out of sync.

I fixed this by simply removing support for foregin children. In the example above we would simply ignore the <p> and all its children.


Second, whenever the option array would get out of sync with reality we would deal with this very poorly. Ideally this should never happen, but the code is complex enough that I think we should at least try to deal. So I added some guards for that and some code that would rebuild the options array in extream cases, like when InsertChildAt fails.

Third, nsCOMArray::RemoveObjectAt does not deal well with out-of-bounds removes. It wouldn't release an object at the out-of-bounds index, but it would access it. I fixed this by adding bounds checks. RemoveObjectAt in general has a pretty bad signature, i filed bug 327598 on that.
Comment 4 Boris Zbarsky [:bz] (TPAC) 2006-02-20 15:58:53 PST
Comment on attachment 212198 [details] [diff] [review]
Patch to fix

>Index: content/html/content/src/nsHTMLSelectElement.cpp
>+  /**
>+   * Rebuilds the options array from scratch as a fallback in error cases.
>+   */
>+  void RebuildOptionsArray();

What happens if the rebuild fails, if I might ask?  Is it ok to have an out-of-sync array (having too little stuff in it)?  The problem with this bug was too much stuff left in the array, right?

>@@ -458,27 +492,23 @@ nsHTMLSelectElement::nsHTMLSelectElement

I kinda wonder... Can't we give this sucker an Init() method that ensures a non-null mOptions?  And then nuke all those null-checks?  Followup bug, perhaps?

Also, no need to init mRestoreState to null here -- the nsRefPtr will do that on its own.

>+nsHTMLSelectElement::InsertChildAt(nsIContent* aKid,
>+  rv = nsGenericHTMLFormElement::InsertChildAt(aKid, aIndex, aNotify);

So say this threw NS_ERROR_OUT_OF_MEMORY (or say WillAddOptions did so).  Is it safe to call RebuildOptionsArray at that point?

>+  if (mOptGroupCount && !prevOptGroups) {
>+    DispatchDOMEvent(NS_LITERAL_STRING("selectHasGroups"));
>+  }

Could we have a followup bug to remove that, as well as all similar event firing in this code?  It was done for XBL form controls originally, and we don't do those.  And I really don't like firing sync events from this code.  Not safe at all.  If we want to keep the events, we should do them only when the XBL form controls pref is set, and fire async.  At least imo.  We may be able to remove mOptGroupCount too, if we do this.

>@@ -675,46 +739,39 @@ nsHTMLSelectElement::InsertOptionsIntoLi
>   if (optElement) {
>+    NS_ASSERTION(optElement, "not a real option");

That doesn't seem to be a useful assert, given that |if|.

>@@ -725,49 +782,43 @@ nsHTMLSelectElement::RemoveOptionsFromLi
>   if (optElement) {
>+    NS_ASSERTION(optElement, "not a real option");

Same here.

>@@ -1683,17 +1734,16 @@ nsHTMLSelectElement::DoneAddingChildren(
>   if (mRestoreState) {
>     RestoreStateTo(mRestoreState);
>-    NS_RELEASE(mRestoreState);

I'd still set mRestoreState = nsnull here.  No reason to keep it around.

>@@ -1805,17 +1855,16 @@ nsHTMLSelectElement::HandleDOMEvent(nsPr
>   nsSelectState* state = new nsSelectState();
>   if (!state) {
>     return NS_ERROR_OUT_OF_MEMORY;
>   }
>-  NS_ADDREF(state);

That looks wrong to me.  What am I missing?

r=bzbarsky with the issues fixed.
Comment 5 Jonas Sicking (:sicking) No longer reading bugmail consistently 2006-02-20 16:10:33 PST
(In reply to comment #4)
> (From update of attachment 212198 [details] [diff] [review] [edit])
> >Index: content/html/content/src/nsHTMLSelectElement.cpp
> >+  /**
> >+   * Rebuilds the options array from scratch as a fallback in error cases.
> >+   */
> >+  void RebuildOptionsArray();
> 
> What happens if the rebuild fails, if I might ask?  Is it ok to have an
> out-of-sync array (having too little stuff in it)?  The problem with this bug
> was too much stuff left in the array, right?

See second last pagagraph of comment 3. I hope that we deal more or less well with this now. At least well enough that there are no security issues or crashes.

> >@@ -458,27 +492,23 @@ nsHTMLSelectElement::nsHTMLSelectElement
> 
> I kinda wonder... Can't we give this sucker an Init() method that ensures a
> non-null mOptions?  And then nuke all those null-checks?  Followup bug,
> perhaps?

Yeah. Will file.

> Also, no need to init mRestoreState to null here -- the nsRefPtr will do that
> on its own.

Yup

> >+nsHTMLSelectElement::InsertChildAt(nsIContent* aKid,
> >+  rv = nsGenericHTMLFormElement::InsertChildAt(aKid, aIndex, aNotify);
> 
> So say this threw NS_ERROR_OUT_OF_MEMORY (or say WillAddOptions did so).  Is 
> it safe to call RebuildOptionsArray at that point?

Well.. it should be. That method should be compleatly OOM safe in the sense that it shouldn't crash. I bet rendering and such look wrong, but there's no way to behave corrently when OOM.

> >+  if (mOptGroupCount && !prevOptGroups) {
> >+    DispatchDOMEvent(NS_LITERAL_STRING("selectHasGroups"));
> >+  }
> 
> Could we have a followup bug to remove that, as well as all similar event
> firing in this code?  It was done for XBL form controls originally, and we
> don't do those.  And I really don't like firing sync events from this code. 
> Not safe at all.  If we want to keep the events, we should do them only when
> the XBL form controls pref is set, and fire async.  At least imo.  We may be
> able to remove mOptGroupCount too, if we do this.

Agreed. Will file.

> >@@ -675,46 +739,39 @@ nsHTMLSelectElement::InsertOptionsIntoLi
> >   if (optElement) {
> >+    NS_ASSERTION(optElement, "not a real option");
> 
> That doesn't seem to be a useful assert, given that |if|.

Doh! I had this code written another way, but I reverted back. Forgot to remove the assertion though. Will do.

> >@@ -1683,17 +1734,16 @@ nsHTMLSelectElement::DoneAddingChildren(
> >   if (mRestoreState) {
> >     RestoreStateTo(mRestoreState);
> >-    NS_RELEASE(mRestoreState);
> 
> I'd still set mRestoreState = nsnull here.  No reason to keep it around.

Ah, good catch.

> >@@ -1805,17 +1855,16 @@ nsHTMLSelectElement::HandleDOMEvent(nsPr
> >   nsSelectState* state = new nsSelectState();
> >   if (!state) {
> >     return NS_ERROR_OUT_OF_MEMORY;
> >   }
> >-  NS_ADDREF(state);
> 
> That looks wrong to me.  What am I missing?

Nothing, i goofed :)
Comment 6 Johnny Stenback (:jst, jst@mozilla.com) 2006-02-21 16:25:45 PST
Comment on attachment 212198 [details] [diff] [review]
Patch to fix

- In nsCOMArray_base::RemoveObjectAt(PRInt32 aIndex):

+    if (PRUint32(aIndex) < PRUint32(Count())) {
+        nsISupports* element = ObjectAt(aIndex);

This isn't safe if aIndex is a huge negative number since ObjectAt() calls into FastElementAt() on the internal array w/o any bounds checking. Not a big deal, nor likely to happen, but probably worth adding a < 0 check here too to be safe against that as well.

sr=jst with that and what bz pointed out fixed.
Comment 7 Jonas Sicking (:sicking) No longer reading bugmail consistently 2006-02-22 00:47:03 PST
Err.. that's what the boundscheck on the line above is for. Check the implementation of nsVoidArray::SafeElementAt
Comment 8 Jonas Sicking (:sicking) No longer reading bugmail consistently 2006-02-28 17:44:04 PST
We should let this bake on trunk for a while and then consider for 1.8.0.3
Comment 9 Jonas Sicking (:sicking) No longer reading bugmail consistently 2006-02-28 17:44:34 PST
Oh, and the patch has been checked in. Thanks for the reviews.
Comment 10 Robert O'Callahan (:roc) (email my personal email if necessary) 2006-03-01 16:39:39 PST
I backed out the changes to nsHTMLSelectElement.cpp ... they was responsible for comboboxes simply not working at all. (The COMarray changes are still there.)

Maybe a simpler yet still performant approach would be to just blow away the option list whenever an option is inserted or removed, and recreate it the next time we need it.
Comment 11 Boris Zbarsky [:bz] (TPAC) 2006-03-01 16:44:34 PST
Reopening, since this got backed out.
Comment 12 Jonas Sicking (:sicking) No longer reading bugmail consistently 2006-03-01 18:48:34 PST
Created attachment 213703 [details] [diff] [review]
missing bits

I forgot that the callpaths hasn't merged at this point. Sort of ugly to call InsertChildAt, but it's what nsGenericElement is doing anyway.

I also added some extra safety since we want to be able to deal with the array being wrong. This was the only unsafe place i could find that was using ObjectAt.
Comment 13 Boris Zbarsky [:bz] (TPAC) 2006-03-01 19:11:22 PST
Comment on attachment 213703 [details] [diff] [review]
missing bits

r+sr=bzbasky.  Should we just inline AppendChildTo on nsINode?  Or even non-inline, but still implement in terms of InsertChildAt?  I'm not sure how these GetChildCount() calls play with the EnsureContentsGenerated crap that nsXULElement does, btw... I'd hope we could rip out some of that...
Comment 14 neil@parkwaycc.co.uk 2006-03-16 04:02:09 PST
I'm not sure if this bug is causing my trunk to crash visiting http://www.ryanair.com/ but at line 2294 of nsHTMLSelectElement.cpp it calls into ReplaceChild, which by the time that reaches the end of nsGenericElement::doReplaceOrInsertBefore has managed to drop the last reference so that the NS_ADDREF is on a stale pointer. Switching line 2299 to an nsCOMPtr seems to resolve the problem for me.
Comment 15 Jonas Sicking (:sicking) No longer reading bugmail consistently 2006-03-16 10:46:45 PST
That's bug 330084
Comment 16 Jonas Sicking (:sicking) No longer reading bugmail consistently 2006-04-20 00:42:24 PDT
Created attachment 219119 [details] [diff] [review]
Patch that landed on 1.8.1 branch
Comment 17 Daniel Veditz [:dveditz] 2006-04-21 13:17:40 PDT
Comment on attachment 212198 [details] [diff] [review]
Patch to fix

approved for 1.8.0 branch, a=dveditz for drivers
Comment 18 Daniel Veditz [:dveditz] 2006-04-21 13:17:55 PDT
Comment on attachment 213703 [details] [diff] [review]
missing bits

approved for 1.8.0 branch, a=dveditz for drivers
Comment 19 Jonas Sicking (:sicking) No longer reading bugmail consistently 2006-04-21 16:05:15 PDT
Created attachment 219375 [details] [diff] [review]
Patch that really landed on 1.8.1 branch

The last one wasn't quite right, i had merged the wrong version of the patch.
Comment 20 Jonas Sicking (:sicking) No longer reading bugmail consistently 2006-04-21 16:44:05 PDT
Marking fixed since this relanded on trunk.
Comment 21 Tracy Walker [:tracy] 2006-05-11 09:15:42 PDT
verified with Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4
Comment 22 Alexander Sack 2006-09-15 16:09:00 PDT
Jesse,

on 1.0.x branch I only get the runtime-checked assertion when running the attached testcase

###!!! ASSERTION: nsVoidArray::ElementAt(index past end array) - note on bug 96108: 'aIndex < Count()', file ../ds/nsVoidArray.h, line 72

... so no problem I guess?
Comment 23 Jesse Ruderman 2007-12-16 16:57:52 PST
Testcase checked in as a crashtest.

Note You need to log in before you can comment on or make changes to this bug.