Closed Bug 675515 Opened 13 years ago Closed 13 years ago

Crash [@ TextUpdater::DoUpdate] with long text node

Categories

(Core :: Disability Access APIs, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla8
Tracking Status
firefox8 + fixed
firefox9 + fixed
status1.9.2 --- unaffected

People

(Reporter: jruderman, Assigned: surkov)

References

Details

(Keywords: crash, testcase, Whiteboard: [sg:critical?][qa!])

Crash Data

Attachments

(3 files, 2 obsolete files)

1. Start a debug build of Firefox.

2. Enable accessibility, e.g. by pasting the following into the js console:

Components.classes["@mozilla.org/accessibilityService;1"]
      .getService(Components.interfaces.nsIAccessibleRetrieval);

3. Load the testcase.

Result: Crash [@ TextUpdater::DoUpdate] trying to access e.g. 0x24db0000.
Attached file stack trace β€”
Attached patch patch (obsolete) β€” β€” Splinter Review
Assignee: nobody → surkov.alexander
Status: NEW → ASSIGNED
Attachment #549727 - Flags: superreview?(neil)
Attachment #549727 - Flags: review?(trev.saunders)
Comment on attachment 549727 [details] [diff] [review]
patch

I'm not sure this does the right thing when one of the lengths is zero. The old code would have ended up firing one of the two events, although somewhat inefficiently. However this code looks as if it always fires both events. But the good news is that having fixed that you can then special-case on one of the lengths being zero and take this "fast" path.

>   // Increase offset of the text leaf on skipped characters amount.
>   mTextOffset += aSkipStart;
> 
>+  if (strLen1 > kMaxStrLen || strLen2 > kMaxStrLen) {
>+    // Fire text change event for insertion and removal.
At first I wondered why you did that at this particular point in the code but then I realised that you don't fire the insertion or removal for the coinciding initial and final characters :-)

>+  const static PRUint32 kMaxStrLen = 2 << 6;
Did you mean 2^6? That's actually written 1 << 6. (I'm assuming you have a good feel as to how big a string you normally encounter in practice.)
(In reply to comment #3)rz;

> >+  const static PRUint32 kMaxStrLen = 2 << 6;
> Did you mean 2^6? That's actually written 1 << 6. (I'm assuming you have a
> good feel as to how big a string you normally encounter in practice.)

Yes, I meant 2^6. It doesn't sound like the user can change the text twice in one processing, i.e. in case where AT would want to announce user actions. For other changes I think it's ok if AT just updates the text on its side. The used number is rough approximation where it can be too expensive to start the string difference algorithm. I don't have enough data for fair judgement, I think we can tweak it in the future.
Attached patch patch2 (obsolete) β€” β€” Splinter Review
Attachment #549727 - Attachment is obsolete: true
Attachment #549727 - Flags: superreview?(neil)
Attachment #549727 - Flags: review?(trev.saunders)
Attachment #549758 - Flags: superreview?(neil)
Attachment #549758 - Flags: review?(trev.saunders)
Can you explain what causes the crash and why adding this seemingly-arbitrary limit fixes it?
(In reply to comment #6)
> Can you explain what causes the crash and why adding this
> seemingly-arbitrary limit fixes it?

The problem is it's supposed that the array should be larger than address space (while it's not). So while the array is iterated then you get the case when *(pointer+=increment) doesn't point to any array item.
What array? How could it possibly be larger than my 2^64 address space?
I don't know much about internals, so it's likely guesses reasonable at some precision.

I started from the point that entries array
PRUint32 len1 = 2^16, len2 = 2^16 + delta;
PRUint32* entries = new PRUint32[len1 * len2];
can't be larger than 2^32. I guessed that result len1 * len2 is casted to PRUint32 (the same time not sure though whether arrays larger than 2^32 are allowed in 32 builds).

And then when you iterate it
PRUint32* iter = entries;
for (PRUint32 i = 0; i < len2; i++) {
  iter += len1;
  for (PRUint32 j = 0; j < len1; j ++) {
    *(iter + j) should through us out of array
  }
}

sine it should be equivalent to PRUint32* iter = entries + (2^32 + something) what is overflow.
Ok, that makes sense.

What is the impact on accessibility to only compute the edit distance for tiny strings? Is there a bug on switching to a more efficient Levenshtein distance algorithm?
(In reply to comment #10)
> What is the impact on accessibility to only compute the edit distance for
> tiny strings?

I think it shouldn't end up with broken AT user experience (I put some thoughts in comment #4).

> Is there a bug on switching to a more efficient Levenshtein
> distance algorithm?

there's no bug, I appreciate if you could file it and share ideas.
Filed bug 675685.
Even in 64-bit builds the calculation is done in 32 bits. So for instance using the test case you get 0x10001 * 0x10003 = 0x40003, not 0x100040003 (which would at least not crash, although I guess it would be somewhat slow...)
Comment on attachment 549758 [details] [diff] [review]
patch2

>+  if (strLen1 > kMaxStrLen || strLen2 > kMaxStrLen) {
Nit: could check for either length being zero here as well, since in that case the edit is the obvious edit anyway.

>+  const static PRUint32 kMaxStrLen = 2 << 6;
I thought we had agreed on 2^6...
(In reply to comment #14)
> >+  const static PRUint32 kMaxStrLen = 2 << 6;
> I thought we had agreed on 2^6...

personally I think I'd like something a little  larger than 64 characters in the longer string, probably in the range 2^7 - 2^10, but I agree 2 << 6 is a odd way to write it, I'd probably use 0x80,  just 128, or as Neil sad 1 << 7.
(In reply to comment #14)
> Comment on attachment 549758 [details] [diff] [review] [diff] [details] [review]
> patch2
> 
> >+  if (strLen1 > kMaxStrLen || strLen2 > kMaxStrLen) {
> Nit: could check for either length being zero here as well, since in that
> case the edit is the obvious edit anyway.

uh, I don't think it should ever be 0 therebecause of the checks we do earlier see line 92 or so.
Comment on attachment 549758 [details] [diff] [review]
patch2


>+  if (strLen1 > kMaxStrLen || strLen2 > kMaxStrLen) {
>+    if (strLen1 > 0) {

isn't this always true because of the checks we do for a only appen case around line 91?

>+      // Fire text change event for removal.
>+      nsRefPtr<AccEvent> textRemoveEvent =
>+        new AccTextChangeEvent(mHyperText, mTextOffset, str1, PR_FALSE);
>+      mDocument->FireDelayedAccessibleEvent(textRemoveEvent);

why not use the UpdateTextnFireEvent method?

>+    }
>+
>+    if (strLen2 > 0) {
>+      // Fire text change event for insertion.
>+      nsRefPtr<AccEvent> textInsertEvent =
>+        new AccTextChangeEvent(mHyperText, mTextOffset, str2, PR_TRUE);
>+      mDocument->FireDelayedAccessibleEvent(textInsertEvent);

same, I gues it would mean to value change events, and a second update to our internal string, I'm not usre if that's a bad thing though since with the 2 text change events we're sort of claiming what happened was a remove and insert, so maybe we should infact fire 2 value change events.

>+  /**
>+   * The constant used to skip string difference calculation in case of long
>+   * strings.
>+   */
>+  const static PRUint32 kMaxStrLen = 2 << 6;

in addition to the existing discussion of this why not make it private?

>+        var eventItem = aEventList[i];

I'd probably name it just event, but its not particularly clear  to me that its nicer than just using aEventList[i].
(In reply to comment #16)
>(In reply to comment #14)
>>(From update of attachment 549758 [details] [diff] [review])
>>>+  if (strLen1 > kMaxStrLen || strLen2 > kMaxStrLen) {
>>Nit: could check for either length being zero here as well, since in that
>>case the edit is the obvious edit anyway.
>uh, I don't think it should ever be 0 therebecause of the checks we do
>earlier see line 92 or so.
I don't think that applies in the case of "XZ" -> "XYYY...YYYZ" does it?
(In reply to comment #18)
> (In reply to comment #16)
> >(In reply to comment #14)
> >>(From update of attachment 549758 [details] [diff] [review] [diff] [details] [review])
> >>>+  if (strLen1 > kMaxStrLen || strLen2 > kMaxStrLen) {
> >>Nit: could check for either length being zero here as well, since in that
> >>case the edit is the obvious edit anyway.
> >uh, I don't think it should ever be 0 therebecause of the checks we do
> >earlier see line 92 or so.
> I don't think that applies in the case of "XZ" -> "XYYY...YYYZ" does it?

if you mean the case of "xy" -> "xay" then yes, I think the current code issies the fast path possible, but I think the check at line 113 can be  changed to take aSkipStart into account to fix that.
(In reply to comment #19)
> (In reply to comment #18)
> > I don't think that applies in the case of "XZ" -> "XYYY...YYYZ" does it?
> 
> if you mean the case of "xy" -> "xay" then yes, I think the current code
> issies the fast path possible, but I think the check at line 113 can be 
> changed to take aSkipStart into account to fix that.

In the case of "xy" -> "xay" then aSkipStart is 1 and skipEnd is 1 but minLen is 2 so neither branch is taken.
(In reply to comment #20)
> (In reply to comment #19)
> > (In reply to comment #18)
> > > I don't think that applies in the case of "XZ" -> "XYYY...YYYZ" does it?
> > 
> > if you mean the case of "xy" -> "xay" then yes, I think the current code
> > issies the fast path possible, but I think the check at line 113 can be 
> > changed to take aSkipStart into account to fix that.
> 
> In the case of "xy" -> "xay" then aSkipStart is 1 and skipEnd is 1 but
> minLen is 2 so neither branch is taken.

yes, that's why I said could be changed, not  currently does :-)
(In reply to comment #14)
> Comment on attachment 549758 [details] [diff] [review] [diff] [details] [review]
> patch2
> 
> >+  if (strLen1 > kMaxStrLen || strLen2 > kMaxStrLen) {
> Nit: could check for either length being zero here as well, since in that
> case the edit is the obvious edit anyway.

Either length can be zero so that insert or delete or both events can be fired. I didn't catch your idea, could you give details please?

> >+  const static PRUint32 kMaxStrLen = 2 << 6;
> I thought we had agreed on 2^6...

Sorry, I missed correction when read it.
(In reply to comment #21)

> > In the case of "xy" -> "xay" then aSkipStart is 1 and skipEnd is 1 but
> > minLen is 2 so neither branch is taken.
> 
> yes, that's why I said could be changed, not  currently does :-)

this should require us to change the whole logic, currently we just fall out into levenshtein algorithm what's not expensive in case of single insertion or deletion.
(In reply to comment #13)
> Even in 64-bit builds the calculation is done in 32 bits. So for instance
> using the test case you get 0x10001 * 0x10003 = 0x40003, not 0x100040003
> (which would at least not crash, although I guess it would be somewhat
> slow...)

Thank you, that's why we get out of range of the array.
(In reply to comment #22)
>(In reply to comment #14)
>>(From update of attachment 549758 [details] [diff] [review])
>>>+  if (strLen1 > kMaxStrLen || strLen2 > kMaxStrLen) {
>>Nit: could check for either length being zero here as well, since in that
>>case the edit is the obvious edit anyway.
>Either length can be zero so that insert or delete or both events can be
>fired. I didn't catch your idea, could you give details please?
Well, you're adding some code which handles the case "xaaa...aaay" -> "xbbb...bbby". You're doing this when a and b are large because the distance is too inefficient to compute. However you of course have to be careful to handle the two cases "xaaa...aaay" -> "xy" and "xy" -> "xbbb...bbby", which are a single delete and single insert respectively, but a and b are still large. I'm just pointing out that you can also use this block for the "xay" -> "xy" and "xy" -> "xby" cases even when a and b are not large, but the other is zero.

[In fact, having done this, you can even get rid of the "xa" <-> "x" and the "x" <-> "bx" special cases, but that's beyond the scope of this bug.]
New code works for:

"xaaa...aaay" -> "xbbb...bbby" (removal for bbb...bbb, insertion for aaa...aaa)
"xaaa...aaay" -> "xy" (removal for aaa...aaa)
"xy" -> "xbbb...bbby" (insertion for bbb...bbb)

also for cases like
"xaaa...coincidence...aaay" -> "xbbb...coincidence...bbby"
(removal for aaa...coincidence...aaa, insertion for bbb...coincidence...bbb

to make this code working for "xay" -> "xy" and "xy" -> "xby" (when a and b are small) I should check if a and b doesn't have common parts (for example, if these aren't two insertions). Maybe it's ok to continue Levenstain algorithm usage to keep these cases.

I'd be nice to get rid "xa" <-> "x" and the "x" <-> "bx" special cases, but currently they serve to protect from case when endSkipOffset < skipStartOffset.

I'm still not sure what's the best way to implement your suggestion.
(In reply to comment #26)
> to make this code working for "xay" -> "xy" and "xy" -> "xby" (when a and b
> are small) I should check if a and b doesn't have common parts (for example,
> if these aren't two insertions).
I think you might have misunderstood me; x and y denote the common parts; a and b have nothing in common with either x or y (or each other, but then again these are separate cases, not to be confused with "xay" -> "xby").
Attached patch patch3 β€” β€” Splinter Review
Attachment #549758 - Attachment is obsolete: true
Attachment #549758 - Flags: superreview?(neil)
Attachment #549758 - Flags: review?(trev.saunders)
Attachment #550335 - Flags: superreview?(neil)
Attachment #550335 - Flags: review?(trev.saunders)
Comment on attachment 550335 [details] [diff] [review]
patch3

I didn't realise you could eliminate UpdateTextNFireEvent too :-)
Attachment #550335 - Flags: superreview?(neil) → superreview+
(In reply to comment #29)
> Comment on attachment 550335 [details] [diff] [review] [diff] [details] [review]
> patch3
> 
> I didn't realise you could eliminate UpdateTextNFireEvent too :-)

I didn't too until I understood what you talk about :)
> >+  /**
> >+   * The constant used to skip string difference calculation in case of long
> >+   * strings.
> >+   */
> >+  const static PRUint32 kMaxStrLen = 2 << 6;
> 
> in addition to the existing discussion of this why not make it private?
> 
> >+        var eventItem = aEventList[i];
> 
> I'd probably name it just event, but its not particularly clear  to me that
> its nicer than just using aEventList[i].

Alex, what about these two comments?
Attachment #550335 - Flags: review?(trev.saunders) → review+
(In reply to comment #31)
> > >+  /**
> > >+   * The constant used to skip string difference calculation in case of long
> > >+   * strings.
> > >+   */
> > >+  const static PRUint32 kMaxStrLen = 2 << 6;
> > 
> > in addition to the existing discussion of this why not make it private?

because it is private :)

> > >+        var eventItem = aEventList[i];
> > 
> > I'd probably name it just event, but its not particularly clear  to me that
> > its nicer than just using aEventList[i].

fine with me

> Alex, what about these two comments?

thank you for pointing it out, sorry I didn't addressed them early
(In reply to comment #15)
> (In reply to comment #14)
> > >+  const static PRUint32 kMaxStrLen = 2 << 6;
> > I thought we had agreed on 2^6...
> 
> personally I think I'd like something a little  larger than 64 characters in
> the longer string, probably in the range 2^7 - 2^10, but I agree 2 << 6 is a
> odd way to write it, I'd probably use 0x80,  just 128, or as Neil sad 1 << 7.

let's keep it small for now, we could tweak it based on telemetry results (if it makes sense)
(In reply to comment #17)

> same, I gues it would mean to value change events, and a second update to
> our internal string, I'm not usre if that's a bad thing though since with
> the 2 text change events we're sort of claiming what happened was a remove
> and insert, so maybe we should infact fire 2 value change events.

no, it doesn't matter how many times it was changed, all AT needs to know that value is different now (btw, they will be coalesced)
(In reply to comment #32)
> (In reply to comment #31)
> > > >+  /**
> > > >+   * The constant used to skip string difference calculation in case of long
> > > >+   * strings.
> > > >+   */
> > > >+  const static PRUint32 kMaxStrLen = 2 << 6;
> > > 
> > > in addition to the existing discussion of this why not make it private?
> 
> because it is private :)

Oh, I see, we have a redundant private: right after it.
inbound land http://hg.mozilla.org/integration/mozilla-inbound/rev/75c8a2eb9f87
Flags: in-testsuite+
OS: Mac OS X → All
Hardware: x86_64 → All
Whiteboard: [sg:critical?] → [sg:critical?] [inbound]
landed http://hg.mozilla.org/mozilla-central/rev/75c8a2eb9f87
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Whiteboard: [sg:critical?] [inbound] → [sg:critical?]
Target Milestone: --- → mozilla8
blocking1.9.2: --- → ?
Firefox 3.6 is not affected because the patched code path was introduced as part of Firefox 4 big a11y rework: technically Firefox 4 started to cache the rendered text and operate on it. This problem appears when this text is long enough and thus this bug can't exists in Firefox 3.6.
blocking1.9.2: ? → ---
Whiteboard: [sg:critical?] → [sg:critical?][qa+]
I'm trying to verify this fix but when I execute the code from comment 0 in the JS console I only get an error that it is undefined. Can't I use a debug nightly build from the FTP server for testing?
(In reply to Henrik Skupin (:whimboo) from comment #39)
> I'm trying to verify this fix but when I execute the code from comment 0 in
> the JS console I only get an error that it is undefined. Can't I use a debug
> nightly build from the FTP server for testing?

If you do that on mac then yes since accessibility is not a part of default build.
How do I enable accessibility in debug builds? Which flag do I have to set?
1. In mozconfig:

  ac_add_options --enable-accessibility

(It's disabled by default on Mac, but Tinderbox has a special mozconfig that turns it on for Mac debug builds. So you could download a Tinderbox debug build.)

2. At runtime, paste into the Error Console:

  Components.classes["@mozilla.org/accessibilityService;1"]
            .getService(Components.interfaces.nsIAccessibleRetrieval);

If you get an [xpconnect wrapped nsIAccessibleRetrieval], it worked.
Thanks for the Tinderbox hint, Jesse! I was optimistic that it [1] will work but sadly it doesn't. I still get the undefined error. I will now rebuilt my local debug build with the make option set.

[1] http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx-debug/1323288955/
Try a 64-bit Mac Tinderbox build.
http://mxr.mozilla.org/mozilla-central/search?string=enable-accessibility
Verified fixed with:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:9.0) Gecko/20111206 Firefox/9.0

Mozilla/5.0 (Windows NT 5.1; rv:8.0.1) Gecko/20100101 Firefox/8.0.1

Mozilla/5.0 (X11; Linux i686; rv:8.0.1) Gecko/20100101 Firefox/8.0.1

Can't test with the Firefox 8.0 release debug build on OS X because I still get the undefined error. But the 9.0 beta works pretty well and doesn't crash.
Status: RESOLVED → VERIFIED
Whiteboard: [sg:critical?][qa+] → [sg:critical?][qa!]
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: