Closed Bug 1885702 Opened 11 months ago Closed 5 months ago

Crash in [@ mozilla::detail::InvalidArrayIndex_CRASH | nsTArray_Impl<T>::ElementAt | nsTArray_Impl<T>::operator[] | ClusterIterator::ClusterIterator]

Categories

(Core :: Layout, defect)

Other
Windows
defect

Tracking

()

VERIFIED FIXED
132 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox-esr128 --- verified
firefox126 --- wontfix
firefox130 --- wontfix
firefox131 --- wontfix
firefox132 --- verified

People

(Reporter: release-mgmt-account-bot, Assigned: jfkthame)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: crash, regression, reproducible)

Crash Data

Attachments

(4 files)

Crash report: https://crash-stats.mozilla.org/report/index/9688803f-4512-44e3-b212-581a80240301

Reason: EXCEPTION_BREAKPOINT

Top 10 frames of crashing thread:

0  mozglue.dll  MOZ_Crash  mfbt/Assertions.h:301
0  mozglue.dll  mozilla::detail::InvalidArrayIndex_CRASH  mfbt/Assertions.cpp:50
1  xul.dll  nsTArray_Impl<RefPtr<nsTransformedCharStyle>, nsTArrayInfallibleAllocator>::ElementAt const  xpcom/ds/nsTArray.h:1219
1  xul.dll  nsTArray_Impl<RefPtr<nsTransformedCharStyle>, nsTArrayInfallibleAllocator>::operator[] const  xpcom/ds/nsTArray.h:1250
1  xul.dll  ClusterIterator::ClusterIterator  layout/generic/nsTextFrame.cpp:8095
2  xul.dll  nsTextFrame::PeekOffsetWord  layout/generic/nsTextFrame.cpp:8170
3  xul.dll  nsIFrame::PeekOffsetForWord  layout/generic/nsIFrame.cpp:9118
4  xul.dll  nsIFrame::PeekOffset  layout/generic/nsIFrame.cpp:9430
5  xul.dll  nsIFrame::PeekBackwardAndForward  layout/generic/nsIFrame.cpp:5069
6  xul.dll  nsIFrame::SelectByTypeAtPoint  layout/generic/nsIFrame.cpp:4988

By querying Nightly crashes reported within the last 2 months, here are some insights about the signature:

  • First crash report: 2024-02-09
  • Process type: Content
  • Is startup crash: No
  • Has user comments: No
  • Is null crash: No

The Bugbug bot thinks this bug should belong to the 'Core::Layout' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: General → Layout
Keywords: pernosco-wanted

The severity field is not set for this bug.
:jfkthame, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(jfkthame)

Not a new crash, but it does seem to have become more frequent in the last couple months. From the reports I looked at, it seems like it may be related to editing a field with password masking, though I don't know of any recent changes around that code. @masayuki, would you be able to take a look at what might be happening here?

Flags: needinfo?(jfkthame) → needinfo?(masayuki)

Well, I touched PeekOffsetForWord and its caller for solving some Selection::Modify bugs like in bug 1872302, but I don't touch directly around the crash point. The main change which I did is, making some code ignore native anonymous subtrees to get new caret position to avoid moving caret into <input> etc from its parent. Therefore, I guess that the behavior change made a hidden bug appear...

Flags: needinfo?(masayuki)

According to the type of the array which we access outbound, the crashed line in ClusterIterator::ClusterIterator() is here:

if (transformedTextRun->mStyles[skippedOffset]->mMaskPassword) {

The code loops are written as:

for (uint32_t i = 0; i < mFrag->GetLength(); ++i) {
  mIterator.SetOriginalOffset(i);
  uint32_t skippedOffset = mIterator.GetSkippedOffset();
  if (mFrag->IsHighSurrogateFollowedByLowSurrogateAt(i)) {
    if (transformedTextRun->mStyles[skippedOffset]->mMaskPassword) {
      maskedText.Append(kPasswordMask);
      maskedText.Append(kPasswordMask);
    } else {
      maskedText.Append(mFrag->CharAt(i));
      maskedText.Append(mFrag->CharAt(i + 1));
    }
    ++i;
  } else {

and transformedTextRun is initialized as:

const nsTransformedTextRun* transformedTextRun =
    static_cast<const nsTransformedTextRun*>(textRun);

textRun is initialized as:

gfxTextRun* textRun = aTextFrame->GetTextRun(nsTextFrame::eInflated);

However, mIterator is initialized with aTextFrame->EnsureTextRun(nsTextFrame::eInflated); which may be different from textRun. So, it seems that in the strictly speaking, the loop is wrong because used different offset for different TextRun. However, this code is almost 5-year-old. So, I guess that it was true before, but something was changed around the first crash report. On the other hand, I don't see any changes around nsTransformedTextRun, GetTextRun() and EnsureTextRun() recently.

Do you have any ideas about the loop or nsTransformedTextRun?

Flags: needinfo?(jfkthame)

(In reply to Masayuki Nakano [:masayuki] (he/him)(JST, +0900) from comment #5)

However, mIterator is initialized with aTextFrame->EnsureTextRun(nsTextFrame::eInflated); which may be different from textRun. So, it seems that in the strictly speaking, the loop is wrong because used different offset for different TextRun.

I don't think this can be a problem; textRun comes from aTextFrame->GetTextRun(nsTextFrame::eInflated) here, which simply returns the textframe's mTextRun. This is the same textrun was used to initialize mIterator here, because EnsureTextRun either uses an already-existing mTextRun or creates a new textrun and stores it there, so in either case, the subsequent call to GetTextRun will return the same textrun.

So I think there must be something broken in nsTransformedTextRun, such that its mStyles array is incorrect.

Oh..... re-reading code there, I suspect there may have been an error in the patch in bug 1852811, and it's a rare enough case that we didn't catch it in automation. I'll double-check that and see if we can verify with a crashtest, if my suspicions are right.

Assignee: nobody → jfkthame
Flags: needinfo?(jfkthame)

(In reply to Jonathan Kew [:jfkthame] from comment #6)

Oh..... re-reading code there, I suspect there may have been an error in the patch in bug 1852811, and it's a rare enough case that we didn't catch it in automation. I'll double-check that and see if we can verify with a crashtest, if my suspicions are right.

No, after refreshing my understanding of the code there, it wasn't broken (at least not in the way I was thinking). So I don't know what's going wrong here. Unassigning myself for now, as I don't have a way forward at this point and I'm not sure where the root of the issue lies.

Assignee: jfkthame → nobody

Marking this as S3 given the still-pretty-low crash rate, and that it doesn't appear to be a recent regression. (Bugbot's comment 0 saying "First crash report: 2024-02-09" is a bit misleading, as the 6-month graph clearly shows it was happening before that; at least back to 2023-10-23.)

Maybe if the fuzzers manage to find a way to hit this sometime, we can get a testcase to examine.

Severity: -- → S3
Attached video crash

Hello,
I was able to reproduce this crash on Firefox Nightly 132.0a1 (2024-09-19), using Windows 11 with the following steps:

  1. Open Firefox and go to facebook.com
  2. Enter some characters into the password field and then press the "Space" key / Select some random text along with the empty space following it (see the attached video for a better understanding)
  3. Double click inside the password field

AR: The tab is crashing

Crash report: https://crash-stats.mozilla.org/report/index/c0c1d47d-3d19-422e-8a74-c95810240919

Please let me know if I can provide more details!

The crash rate has gone up a lot (though it is still low in absolute values) in July, and there are some specific steps now. Maybe somebody could take a look at this again? Thanks.

Flags: needinfo?(jfkthame)
Keywords: regression
Regressed by: 1824671

Thanks for getting the regression range.

Set release status flags based on info from the regressing bug 1824671

(In reply to Bianca Hidecuti, Desktop Test Engineering [:bhidecuti] from comment #9)

Created attachment 9425920 [details]
crash

Hello,
I was able to reproduce this crash on Firefox Nightly 132.0a1 (2024-09-19), using Windows 11 with the following steps:

  1. Open Firefox and go to facebook.com
  2. Enter some characters into the password field and then press the "Space" key / Select some random text along with the empty space following it (see the attached video for a better understanding)
  3. Double click inside the password field

AR: The tab is crashing

Crash report: https://crash-stats.mozilla.org/report/index/c0c1d47d-3d19-422e-8a74-c95810240919

Please let me know if I can provide more details!

Thanks for the STR here, that's really helpful. I can reproduce the crash on all desktop platforms with these steps.

Curiously, the crash happens even if the trailing <space> character is deleted from the password field before double-clicking. But it doesn't happen if there is a space and then a following non-space character present.

So it seems to be dependent not only on the current content of the password field, but also on the edit history: there must have been a trailing space present to trigger the crash, but if the trailing space has just been removed by backspacing (and no new text added), the crash still occurs.

Flags: needinfo?(jfkthame)

I captured a pernosco trace of this crash, using the STR from comment 9: https://pernos.co/debug/4--xIZSqb51Rrm1sMneNAw/index.html

The password field contains the text "current " (including a trailing space) pasted in from the clipboard, and then I double-clicked in the field, resulting in the crash.

The frame dump for the password editing control looks like this:

ScrollContainer(div)(-1)@7fe69c869bd0 parent=7fe69c869620 (x=0, y=0, w=300, h=22) [content=7fe69f315a60] [cs=7fe6cb470708:-moz-text-control-editing-root] <
  Block(div)(-1)@7fe69c869e68 parent=7fe69c869bd0 (x=0, y=0, w=300, h=22) [content=7fe69f315a60] [cs=7fe6a5805808:-moz-scrolled-content] <
    line@7fe69c869fc0 count=2 state=inline,clean,prevmarginclean,not-impacted,not-wrapped,no-break,clear-before:none,clear-after:none(x=0, y=0, w=47.6, h=22) <
      Text(0)"current"@7fe69c869f20 parent=7fe69c869e68 next=7fe6a3b31110 next-continuation=7fe6a3b31110 (x=0, y=2.25, w=41.65, h=17.5) bidi(0,2,0) [content=7fe69f317800] [cs=7fe6a5805108:-moz-text] [run=7fe6a3bf3c80][0,7,F] 
      Text(0)" "@7fe6a3b31110 parent=7fe69c869e68 prev-continuation=7fe69c869f20 (x=41.65, y=2.25, w=5.95, h=17.5) bidi(0,0,255) [content=7fe69f317800] [cs=7fe6a5805108:-moz-text] [run=7fe6986e1c40][7,1,T] 
    >
  >
>

One interesting thing is that the trailing space is split off into a continuation with different bidi attributes than the rest of the text. I wonder if this is a result of some custom handling facebook is doing?

(It's worth noting that I have not been able to reproduce the crash using a simple <input type=password> field outside the context of the facebook login page.)

The root of the problem is that the masking code here in ClusterIterator does not allow for the possibility that the textframe may not cover the whole of the text fragment; this will happen in cases involving bidi continuations, where a single text fragment may map to multiple frames with differing bidi levels.

When this happens, the frame's textrun will only cover a part of the text, and so when we try to create the masked text fragment, we'll attempt to access style data that is outside the bounds of the transformed textrun's style array.

Assignee: nobody → jfkthame
Status: NEW → ASSIGNED

I've reduced this to a simpler testcase: just load

data:text/html,<input type=password value="foo&%23x202e;bar&%23x202c;">

in a new tab, then double-click in the input field. Result: crash.

Pushed by jkew@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/fa7ab4257a6f Fix initialization of masked text fragment in ClusterIterator. r=layout-reviewers,emilio https://hg.mozilla.org/integration/autoland/rev/59ea08f1b99d Add a WPT testcase that triggers the crash here. r=dholbert
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/48340 for changes under testing/web-platform/tests
Keywords: pernosco-wanted
Status: ASSIGNED → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → 132 Branch
Upstream PR merged by moz-wptsync-bot

The patch landed in nightly and beta is affected.
:jfkthame, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox131 to wontfix.

For more information, please visit BugBot documentation.

Flags: needinfo?(jfkthame)

Not sure it's justified to uplift this so late in the cycle, given the relatively low crash rate.

Flags: needinfo?(jfkthame)

Verified as fixed with Firefox 132.0b2 on macOS 14.6, Windows 11 and Ubuntu 22.04. The crash is no longer happening after following the steps from Comment 9 or from Comment 18.

Status: RESOLVED → VERIFIED
Flags: qe-verify+
Attachment #9430078 - Flags: approval-mozilla-esr128?

esr128 Uplift Approval Request

  • User impact if declined: possible crash (out-of-bounds access) interacting with password fields
  • Code covered by automated testing: yes
  • Fix verified in Nightly: yes
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: see comment 9, comment 18
  • Risk associated with taking this patch: low
  • Explanation of risk level: change contained to password-masking code, simple corrections to text fragment indexing
  • String changes made/needed: none
  • Is Android affected?: yes
Attachment #9430078 - Flags: approval-mozilla-esr128? → approval-mozilla-esr128+

Verified as fixed with Firefox 128.4.0esr on macOS 14.7, Windows 11 and Ubuntu 22.04.

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: