Closed Bug 720589 Opened 12 years ago Closed 9 years ago

mMatchCounts may be accessed with a nonexisting index (Assertion failure: i < Length() (invalid array index))

Categories

(Toolkit :: Autocomplete, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla45
Tracking Status
firefox43 --- wontfix
firefox44 --- fixed
firefox45 --- fixed
b2g-v2.5 --- fixed

People

(Reporter: mak, Assigned: mak)

References

Details

Attachments

(4 files, 6 obsolete files)

Attached file stack
While working on bug 720110 I kept hitting nsTArray assertions like 

###!!! ASSERTION: invalid array index: 'i < Length()', file c:\mozilla\mozilla-inbound\obj-i686-pc-mingw32\browser\dist\include\nsTArray.h, line 560

I tried to check all of our searches, the only particular thing is that our new autocomplete, with autoFill, has multiple searches, and one of them is async and invokes processResult multiple times for the same result (uses _ONGOING), while the other one is sync.

After lots of tests and debugging I think I figured out that mMatchCounts is just crazy. It's a cache of mMatchCount values for each search, we SetLength() it to the number of searches, though later we forget about that and call Clear() on it. Then we try to append to it using a resultIndex to figure out if we should append or not, that nothing has to do with the scope of the cache.

We should just be consistent with it.
Now, I have a patch, though I'm not sure if I can make a test that hits that, since out autocomplete is quite complex to reproduce. I'll try and see.
Attached patch patch v1.0 (obsolete) — Splinter Review
Attached patch patch v1.1 (obsolete) — Splinter Review
Attachment #590962 - Attachment is obsolete: true
I was actually hitting the assertion due to an uncatched exception in the delayed startSearch, so I don't need this bug to fix bug 720110.

Though, it would still be nice to fix it!
No longer blocks: 720110
Attached patch patch v1.2 (obsolete) — Splinter Review
I went back to the original bug (fixed in 1.9a8) to ensure exactly the scope for all of this.
I think this patch makes lot of sense, and it's what was intended originally. Though, I have no idea how to make a test for it, even because the assertion would not activate any test failure.
Attachment #590968 - Attachment is obsolete: true
Attachment #591497 - Flags: review?(gavin.sharp)
Summary: mMatchCounts doesn't know what's its scope → mMatchCounts may be accessed with a nonexisting index
Comment on attachment 591497 [details] [diff] [review]
patch v1.2

>diff --git a/toolkit/components/autocomplete/nsAutoCompleteController.cpp b/toolkit/components/autocomplete/nsAutoCompleteController.cpp

>+  // The following code supports incremental updating results.
>+  // The search may reuse the same result adding entries to it, or send a new
>+  // result each time.  In the former case, we have to adjust mRowCount,
>+  // subtracting the number of matches already added by the previous result.
>   PRInt32 resultIndex = mResults.IndexOf(aResult);
>   if (resultIndex == -1) {
>     // cache the result
>     mResults.AppendObject(aResult);
>     resultIndex = mResults.Count() - 1;
>+    // Store the matchCount in case the result is going to be reused.
>+    mMatchCounts[aSearchIndex] = matchCount;
>   }
>   else {
>+    // The result is being updated incrementally, save the current value to use
>+    // it later for correcting mRowCount.
>     oldMatchCount = mMatchCounts[aSearchIndex];
>+    mMatchCounts[aSearchIndex] = matchCount;
>   }

This code seems like it would break if a search actually does send a completely new result each time. You'd just always hit the first case, and never calculate the actual difference between the matchCount of the previous result and the current matchCount, so the RowCountChanged calls later on would be incorrect (doesn't matter in the location bar case where there's no tree, but it could impact e.g. form fill).

Kind of unrelated to this patch, but it also seems like http://hg.mozilla.org/mozilla-central/rev/f23efef7f616 also broke the "different results each time" case by removing the mResults.ReplaceObjectAt() call that updated mResults to point to the new result object. Before that patch, it seemed like both mMatchCounts and mResults were appended to as searches returned results (in the order they returned their first results), and so it made sense that you could index mMatchCounts by resultIndex.
Attachment #591497 - Flags: review?(gavin.sharp) → review-
(In reply to Marco Bonardo [:mak] from comment #4)
> Though, I have no idea how to make a test for it, even because
> the assertion would not activate any test failure.

Add a mochitest and vote for bug 404077? Hopefully the test will last longer than mochitest's lack of reporting assertion failures ;)
Blocks: 756861
Blocks: 760975
I crashed due to this fatal assertion today; this isn't the first time I've crashed in this way while typing quickly in the URL bar and using arrows to select autocompletion results.
Any chance of getting this fixed?  I've been hitting this multiple times a week for the last few months; it probably accounts for 90% of my crashes, despite having learned to be really careful every time I touch the awesomebar.
Flags: needinfo?(mak77)
Do you usually use a debug browser? doesn't look like we have top crash-stats for this in the wild. I don't have an estimate as of now, so it's hard for me to give you an answer, maybe you could help finding good steps to reproduce, that would help debugging and testing the fix.
My patch is likely still valid, even if I must go through comment 5 and solve those points.
Flags: needinfo?(mak77)
Yes, David Baron usually uses a debug browser.
Depends on: 989598
Attached file gdb debugging, again
I crashed again due to this yesterday; here's my debugging.

(It's hard to debug because it crashes while the autocomplete popup is grabbing all keyboard and mouse events, so I can't use my X session at all until I've killed the process.)
Summary: mMatchCounts may be accessed with a nonexisting index → mMatchCounts may be accessed with a nonexisting index (Assertion failure: i < Length() (invalid array index))
(In reply to :Gavin Sharp [email: gavin@gavinsharp.com] from comment #5)
> Kind of unrelated to this patch, but it also seems like
> http://hg.mozilla.org/mozilla-central/rev/f23efef7f616 also broke the
> "different results each time" case by removing the
> mResults.ReplaceObjectAt() call that updated mResults to point to the new
> result object. Before that patch, it seemed like both mMatchCounts and
> mResults were appended to as searches returned results (in the order they
> returned their first results), and so it made sense that you could index
> mMatchCounts by resultIndex.

This doesn't make sense to me, if the result is different, the indexOf would return -1 and we'd always hit the first case. The pointed at patch removed ReplaceObjectAt because it was replacing a result with itself, it was pointless.
Attached patch wip v2 (obsolete) — Splinter Review
this tries to "properly" handle mResults as a sparse array, that was the original intent.
I also tried to simplify the handling by supporting a single case, 1 search -> 1 result (the 1 search multiple results case is still supported, through a results merge).
finally, I removed mMatchCounts that is pretty much pointless since we have the results already.

This is not ready yet, I still have some failures to look into.
Attachment #591497 - Attachment is obsolete: true
Comment on attachment 8684882 [details] [diff] [review]
patch v2.2

Neil, do you have any time to look at this? It looks scary, but it shouldn't be that much.

The merge approach is not optimal, but I can't make it any better cause nsAutoCompleteSimpleResult is just one of the few possible nsIAutoCompleteResult implementations around. I don't know which implementation it will have to work on.

On the other side no internal consumer is using that path, and thus I'm also deprecating it through an assertion, it will work, but not so much if the nsIAutoCompleteResult uses fancy stuff like wrappedJSObject. I'd be happy to completely remove that and return an error in a couple versions from now.
Surely the patch would be simpler, if we would just throw that path away (I could avoid all the simpleresult changes), but I can't tell how many bogus consumers is has among add-ons.
Attachment #8684882 - Flags: review?(neil)
(In reply to Marco Bonardo from comment #16)
> On the other side no internal consumer is using that path,
That explains why nothing went wrong when I ported bug bug 389593 to XPFE - I also misread the bogus code and assumed that each result should replace the previous result sent for that search whether or not it was the same result object.

> Surely the patch would be simpler, if we would just throw that path away (I
> could avoid all the simpleresult changes), but I can't tell how many bogus
> consumers is has among add-ons.
I guess finding out which addons use RESULT_*_ONGOING would be a start.
(In reply to neil@parkwaycc.co.uk from comment #17)
> I guess finding out which addons use RESULT_*_ONGOING would be a start.

I fear that's not enough, a consumer reusing the same result can use RESULT_*_ONGOING and just change the searchResult value later (Places does this for example). Handling these consumers is not a big deal.

The problematic consumers are the ones that keep creating a new result everytime they have new matches for the same search.
note I'm not excluding all of this is just wrong from the beginning, in the end nobody ever really understood the scope of that. That's why I think it would be wise to throw away that code path completely.
Comment on attachment 8684882 [details] [diff] [review]
patch v2.2

Yeah, it's scary all right, but I guess it's the best you can do. Some nits:

>+      // Grow the array if needed. We can't just append cause we can't predict
>+      // the order searches will return results, and we can't insert an object
>+      // to an arbitrary index if we don't grow the array first.
>+      if (static_cast<int32_t>(mResults.Length()) < aSearchIndex + 1) {
>+        mResults.SetCount(aSearchIndex + 1);
>+      }
>+      mResults.InsertObjectAt(aResult, aSearchIndex);
This is an nsCOMArray, so ReplaceObjectAt does all this for you.

>+#ifdef DEBUG
>   else {
>-    oldMatchCount = mMatchCounts[aSearchIndex];
>-    mMatchCounts[resultIndex] = matchCount;
>+    MOZ_ASSERT(mResults.IndexOf(aResult) == aSearchIndex,
>+               "The result should have the same index as the search");
>   }
>+#endif
Isn't MOZ_ASSERT debug-only, in which case the #ifdef is unnecessary?

>-    nsIAutoCompleteResult *result = mResults[i];
>+    nsIAutoCompleteResult *result = mResults.SafeObjectAt(i);
[I know why you're doing it, but using SafeObjectAt for a known loop still looks silly...]

>+  if (_retval.Length() == 0) {
IsEmpty() [more than once]

>   /**
>    * Sets a listener for changes in the result.
>    */
>   void setListener(in nsIAutoCompleteSimpleResultListener aListener);
>+
>+  /**
>+   * Gets the listener for changes in the result.
>+   */
>+  nsIAutoCompleteSimpleResultListener getListener();
Please put getListener before setListener as this is how it would work as an attribute.

>-                datalistLabels = new Array(filteredEntries.length).fill("").concat(datalistLabels);
>+                // History entries have the label equal to the value.
>+                datalistLabels = filteredEntries.concat(datalistLabels);
[Won't the simple result automatically return the value if the label is empty?]

>     mObserver->OnUpdateSearchResult(mSearch, mResult);
[Ugh this sucks, although not as much as the way the form fill code unifies the satchel and datalist entries.]
Attachment #8684882 - Flags: review?(neil) → review+
(In reply to neil@parkwaycc.co.uk from comment #20)
> >+#ifdef DEBUG
> >   else {
> >-    oldMatchCount = mMatchCounts[aSearchIndex];
> >-    mMatchCounts[resultIndex] = matchCount;
> >+    MOZ_ASSERT(mResults.IndexOf(aResult) == aSearchIndex,
> >+               "The result should have the same index as the search");
> >   }
> >+#endif
> Isn't MOZ_ASSERT debug-only, in which case the #ifdef is unnecessary?

yes, but then I'd have an empty else branch and I fear some compiler may complain...

> >-    nsIAutoCompleteResult *result = mResults[i];
> >+    nsIAutoCompleteResult *result = mResults.SafeObjectAt(i);
> [I know why you're doing it, but using SafeObjectAt for a known loop still
> looks silly...]

yeah, it was mostly for consistency and avoid future people adding unsafe usage, if everything is consistent it's less likely to happen.

> >-                datalistLabels = new Array(filteredEntries.length).fill("").concat(datalistLabels);
> >+                // History entries have the label equal to the value.
> >+                datalistLabels = filteredEntries.concat(datalistLabels);
> [Won't the simple result automatically return the value if the label is
> empty?]

Simple result does, but form history has 2 (yes two!) different custom implementations of nsIAutoCompleteResult that do fancy stuff with wrappedJSObject. That's horrible.
This is the one in question:
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/satchel/nsFormAutoCompleteResult.jsm#113
I can also do that here instead of changing the datalistLabels above, it would be more consistent with simple result.
(In reply to Marco Bonardo from comment #21)
> (In reply to comment #20)
> > Isn't MOZ_ASSERT debug-only, in which case the #ifdef is unnecessary?
> 
> yes, but then I'd have an empty else branch and I fear some compiler may
> complain...

With the {}s it would have always been safe, but just to be sure, I checked the definition, and what happens in release builds is that it gets replaced with do {} while (0) which is always valid to use in the else part of an if.

> Simple result does, but form history has 2 (yes two!) different custom
> implementations of nsIAutoCompleteResult that do fancy stuff with
> wrappedJSObject. That's horrible.

[As per the end of comment 20] I'll take your word for it.
(In reply to neil@parkwaycc.co.uk from comment #24)
> With the {}s it would have always been safe, but just to be sure, I checked
> the definition, and what happens in release builds is that it gets replaced
> with do {} while (0) which is always valid to use in the else part of an if.

right, but in the end I replaced it with a MOZ_ASSERT_IF
https://hg.mozilla.org/mozilla-central/rev/432bea2c0c68
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
Blocks: 1222015
(Sorry didn't spot this comment before)
(In reply to Marco Bonardo from comment #18)
> (In reply to comment #17)
> > I guess finding out which addons use RESULT_*_ONGOING would be a start.
> 
> I fear that's not enough, a consumer reusing the same result can use
> RESULT_*_ONGOING and just change the searchResult value later (Places does
> this for example). Handling these consumers is not a big deal.

Places uses RESULT_*_ONGOING? Where? (It didn't show up when I tried to search for it.)

> The problematic consumers are the ones that keep creating a new result
> everytime they have new matches for the same search.

Sure, but if nobody uses RESULT_*_ONGOING then there are no consumers.
Comment on attachment 8685474 [details] [diff] [review]
patch v2.3

NOTE: This request if for uplifting early in BETA 44 cycle, not for Beta 43.

Approval Request Comment
[Feature/regressing bug #]: long standing crash
[User impact if declined]: crashes, datalists are broken (bug 1230872) and bug 1222015
[Describe test coverage new/current, TreeHerder]: existing automated tests cover the functionality, the crash fix has no test (hard to build a reliable one)
[Risks and why]: the patch went through a full Nightly cycle and the code path is hit by every autocomplete interaction. No regressions reported so far.
[String/UUID change made/needed]: backwards compatible changes to nsIAutoCompleteSimpleResult.idl
Attachment #8685474 - Flags: approval-mozilla-beta?
Blocks: 1230872
Marking 43 wontfix for clarity based on comment 31.
Jorge, can we take the IDL changes mentioned in comment 31? Do they affect break addons compatibility? If not, I would like to uplift soon (today/tomorrow).
Flags: needinfo?(jorge)
They look okay since they are adding a function and an optional argument. I think it's okay for uplift.
Flags: needinfo?(jorge)
Comment on attachment 8685474 [details] [diff] [review]
patch v2.3

This has been on Nightly for ~6 weeks and fixes a crash, taking it. Beta44+
Attachment #8685474 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Marco, could you please nominate this for uplift to esr38 as well? I noticed bug 1222015 has "status-firefox-esr38" set to "affected".
Flags: needinfo?(mak77)
(In reply to Ritu Kothari (:ritu) from comment #37)
> Marco, could you please nominate this for uplift to esr38 as well? I noticed
> bug 1222015 has "status-firefox-esr38" set to "affected".

I think the patch doesn't apply cleanly on 38, it is based on top of changes we did more recently... It's true the crash exists on 38 as well, but it exists from a long time (years) to be honest and I think it's only the recent changes who made it more likely to be hit.
If you think we should uplift it I can try to rebase the patch.
Flags: needinfo?(mak77) → needinfo?(rkothari)
Marco, I'll respond in bug 1222015.
Flags: needinfo?(rkothari)
You need to log in before you can comment on or make changes to this bug.