crash in mozilla::dom::TabParent::HandleQueryContentEvent(mozilla::WidgetQueryContentEvent&)

VERIFIED FIXED in Firefox 41

Status

()

defect
--
critical
VERIFIED FIXED
4 years ago
4 years ago

People

(Reporter: dmajor, Assigned: masayuki)

Tracking

({crash})

Trunk
mozilla41
x86_64
Windows
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox40 unaffected, firefox41+ verified)

Details

(crash signature)

Attachments

(19 attachments, 2 obsolete attachments)

9.20 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
9.59 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
12.31 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
13.60 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
10.78 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
3.69 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
10.35 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
3.91 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
35.84 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
15.00 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
13.37 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
4.50 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
25.47 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
1.54 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
10.36 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
2.61 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
2.44 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
41.53 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
11.26 KB, patch
m_kato
: review+
Details | Diff | Splinter Review
Reporter

Description

4 years ago
This bug was filed from the Socorro interface and is 
report bp-7cbb9804-ad13-47ed-ad8a-d409a2150516.
=============================================================

New top crash on nightly starting in build 20150515084717.

This happens on Windows 64-bit builds only. It looks like mIMECompositionRects is a bad pointer. 

User comments:
"I turned on IME and input Japanese charancters, then Firefox crashed."
"site: Google+"
"When I tried to input message into text box, then crashed."

Regression range: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=127a78bac3f1&tochange=1a8343f8ed83

I assume bug 1130935.
Flags: needinfo?(masayuki)
Flags: needinfo?(VYV03354)
Reporter

Comment 1

4 years ago
[Tracking Requested - why for this release]: New topcrash on 41
Reporter

Comment 2

4 years ago
> It looks like mIMECompositionRects is a bad pointer. 
Or maybe "i" is bad?
Reporter

Comment 3

4 years ago
      if (aEvent.mInput.mOffset < mIMECompositionRectOffset ||
          (aEvent.mInput.mOffset + aEvent.mInput.mLength >
            mIMECompositionRectOffset + mIMECompositionRects.Length())) {
        // XXX
        // we doesn't have cache for this request.
        break;
      }

Should the ">" be ">=" ?
Reporter

Comment 4

4 years ago
In the minidumps, "i" is something like 0xfffffff7. Since this happens only on Win64, maybe there is an overflow/underflow somewhere?
(In reply to David Major [:dmajor] from comment #3)
>       if (aEvent.mInput.mOffset < mIMECompositionRectOffset ||
>           (aEvent.mInput.mOffset + aEvent.mInput.mLength >
>             mIMECompositionRectOffset + mIMECompositionRects.Length())) {
>         // XXX
>         // we doesn't have cache for this request.
>         break;
>       }
> 
> Should the ">" be ">=" ?

I don't think so... It compares end offsets.
Flags: needinfo?(masayuki)
OS: Windows NT → Windows
Hardware: Unspecified → x86_64
Version: unspecified → Trunk
Hmm, I built x64 debug build locally, but I cannot reproduce this... We need STR for this.
I reproduced this problem last night.

Environment:
 Mac Book Air(OS X 10.10.3 Yosemite)
 Firefox Nightly 41.0a1(2015-05-20)

Reproduce step:
 1) open MDN edit page.[1]
 2) input Japanese charancters.

[1] https://developer.mozilla.org/ja/docs/JavaScript/Reference/Operators/yield
Tracking for FF41 as it's a top crash (#4) at the moment.
Reporter

Comment 9

4 years ago
Does STR from comment 7 work for you?
Flags: needinfo?(masayuki)
I could work on this bug soon (I'm still working on another crash bug).
Got it. aEvent.mInput.mOffset is sometimes -1.
Assignee: nobody → masayuki
Status: NEW → ASSIGNED
Flags: needinfo?(masayuki)
Hmm, I still don't get a good STR. Looks like that this is a bug of the interface between PuppetWidget and TabParent. PuppetWidget updates cache in TabParent with two or more IPC calls. Therefore, TabParent may have incomplete information in its cache. So, we need to send all information once.
So, I meant that it's difficult to test it after creating the patch if my idea is right.
I believe that we need big change for fixing this completely. However, if this should be fixed in the branches, i.e., we need to uplift the fix, I'll create a wallpaper to avoid the crash.
mIMECompositonRects is empty.  We should check whether it is empty.

0:000> dt -r xul!nsTArray<mozilla::gfx::IntRectTyped<mozilla::LayoutDevicePixel>
 > 00000028`9f33fc00+0x1a8
   +0x000 mHdr             : 0x00007ff9`5d93dcb0 nsTArrayHeader
      =00007ff9`5d93dcb0 sEmptyHdr        : nsTArrayHeader
         =00007ff9`5d93dcb0 sEmptyHdr        : nsTArrayHeader
         +0x000 mLength          : 0
         +0x004 mCapacity        : 0y0000000000000000000000000000000 (0)
         +0x004 mIsAutoArray     : 0y0
      +0x000 mLength          : 0
      +0x004 mCapacity        : 0y0000000000000000000000000000000 (0)
      +0x004 mIsAutoArray     : 0y0
   =00007ff9`5af00000 NoIndex          : 0x00000003`00905a4d
Another issue that it is related to this.

When getting focus on PuppetWidget on content process, we sends UINT32_MAX as mOffset to chrome process.  If you need update selection, we should query it correctly then we have to call NotifyIMEOfSelecitonChange.
Hmm, that looks like different reason when I reproduce this crash. When I reproduce this, the array isn't empty but mInput.mOffset is wrong because selection is collapsed to 0, but start offset of the cached rects are collect offset (larger than 0).
Anyway, both cases can be explained. If IME tries to query content while PuppetWidget is modifying TabParent's content cache. I believe that PuppetWidget should update all content cache with a call of method. I'm trying to separate content cache to a new class. Then, PuppetWidget can cache all information too. Then, PuppetWidget should send its cache to TabParent after modifying a part of the cache.
I think that this is easy reproduce step.

- Env
e10s with IMM32 on Windows 8.1

- Step
1. Open about:home
2. Input "aa" into editbox
3. Press left arrow.  Then, caret position is between a and a.
4. Turn on IME
5. Input "あ” by pressing 'a'

Comment 20

4 years ago
Not only happens with Japanese, on this site http://l10n.mozest.org/zh_CN/firefox-central/translate/ , when I try to input Chinese character, firefox have chance to crash. I think it's an extensive bug with all users with e10s on and IME enabled. Need to be fixed soon.
Flags: needinfo?(masayuki)
What are you asking me? I'm working on this bug now but there are some bugs to fix this. Note that this isn't a recent regression, actually, there are some reports from Mac.
Flags: needinfo?(masayuki)
First of all, we should create a capsule to store content process's information. I named it mozilla::ContentCache. First, it should be used in TabParent instead of it has a lot of members.

Note that WidgetUtils.cpp part is an existing bug. I'm not sure why it doesn't cause bustage, but |using namespace widget;| in ContentCache.cpp makes it bustage.
Attachment #8615308 - Flags: review?(m_kato)
Let's move the complicated "request to commit/cancel" code from TabParent to ContentCache.
Attachment #8615313 - Flags: review?(m_kato)
Let's move caret information and text rects information to ContentCache.
Attachment #8615314 - Flags: review?(m_kato)
Let's move the writing mode information of selected range to ContentCache.
Attachment #8615316 - Flags: review?(m_kato)
Let's move WidgetQueryContentEventHandler. This makes a lot of public methods of ContentCache gone later.
Attachment #8615320 - Flags: review?(m_kato)
Let's make ContentCache IPC-aware. Only the members should be copied from child process.
Attachment #8615322 - Flags: review?(m_kato)
PuppetWidget should cache the content into its mContentCache with Cache*() methods. Then, it should be sent to TabParent at every notification to IME.

With this approach, PuppetWidget can send all content date with a notification. This is safer because if it's not copied once, IME might query content (cache) before all data isn't modified yet.
Attachment #8615326 - Flags: review?(m_kato)
Now, we see some redundant communication at IME notification. PuppetWidget should send notification to TabParent only once for a notification to IME since it can send all content data with it.
Attachment #8615327 - Flags: review?(m_kato)
Now, the design of ContentCache is almost fixed. Let's remove or hide unnecessary public methods.
Attachment #8615329 - Flags: review?(m_kato)
Let's log the behavior! This must help us!!
Attachment #8615330 - Flags: review?(m_kato)
This is an improvement.

First, we currently store caret position as target clause offset if there is composition. However, this is different behavior from non-e10s mode.

IME composition string is selected by a couple of IME selections for separating to multiple clauses and/or changes the visual style. But the caret is the "normal selection". So, caret position should be always anchor or focus of the selection.

And also when caret position is queried, we should guess the caret rect if the offset's text rect is in the cache.
Attachment #8615331 - Flags: review?(m_kato)
Let's cache the text rect at selection anchor and focus too. These points may be queried by IME.

Then, the order of calls of Cache*() becomes more important. So, we should guarantee the order by calling other Cache*() methods from Cache*().

CacheText(public) -> CacheSelection(public) -> CacheCaret(private) -> CacheTextRects(private)

SetSelection(public) -> CacheCaret(private) -> CacheTextRects(private)

CacheEditorRect(public)

CacheAll(public) -> CacheText()... -> CacheEditorRect().
Attachment #8615333 - Flags: review?(m_kato)
I realized this bug.

CacheTextRects() calls TextComposition::String() but it may not be modified yet if it's called from one of editor observers (i.e., IMEContentObserver).

In such case, TextComposition::LastData() is the expected value which is modified when dispatching NS_COMPOSITION_CHANGE event from TextComposition. (TextComposition::String() is modified after all editor observers are performed)
Attachment #8615337 - Flags: review?(m_kato)
ContentCache is now a little bit complicated. Some members are chrome process only, others are content only and the others are available in both processes.

Let's restrict it with mIsChrome flag in each method. Then, we can prevent to land buggy patch.
Attachment #8615340 - Flags: review?(m_kato)
NS_QUERY_TEXT_RECT event's mInput.mLength may be 0. In such case, we should return caret rect instead of empty rect.
Attachment #8615342 - Flags: review?(m_kato)
I see a lot of failure of NS_QUERY_CARET_RECT which is next of the composition string. So, if GetCaretRect() cannot find a text rect at the offset, it should guess the result from its previous character's rect if there is it.
Attachment #8615343 - Flags: review?(m_kato)
That's all. I think that, we can make ContentCache smarter, but for now, we should land this as far as possible for x64 testers.

And the actual cause of the crash is, mInput.mOffset is sometimes UINT32_MAX which is larger than mIMECompositionRectOffset and mInput.mOffset + mInput.mLength (only when > 0) is less than mIMECompositionRectOffset + mIMECompositionRects.Length() due to overflown! Therefore,

> 2268   case NS_QUERY_TEXT_RECT:
> 2269     {
> 2270       if (aEvent.mInput.mOffset < mIMECompositionRectOffset ||
> 2271           (aEvent.mInput.mOffset + aEvent.mInput.mLength >
> 2272             mIMECompositionRectOffset + mIMECompositionRects.Length())) {
> 2273         // XXX
> 2274         // we doesn't have cache for this request.
> 2275         break;
> 2276       }

This check won't work. I'm not sure why this happens only on x64 build, though. And I found this with the log (part.12).
Attachment #8615308 - Flags: review?(m_kato) → review+
Attachment #8615309 - Flags: review?(m_kato) → review+
Attachment #8615313 - Flags: review?(m_kato) → review+
Comment on attachment 8615314 [details] [diff] [review]
part.4 mozilla::ContentCache should store text rects and caret rect and TabParent should use them

Review of attachment 8615314 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/ContentCache.cpp
@@ +97,5 @@
> +bool
> +ContentCache::GetCaretRect(uint32_t aOffset,
> +                           LayoutDeviceIntRect& aCaretRect) const
> +{
> +  if (mCaret.mOffset != aOffset) {

Also, if mCaret.mOffset is UINT32_MAX, should we return false?

::: widget/ContentCache.h
@@ +159,5 @@
> +    }
> +
> +    uint32_t StartOffset() const
> +    {
> +      NS_WARN_IF(mStart == UINT32_MAX);

We shouldn't use StartOffset() with mStart is UINT32_MAX.  So we should use NS_ASSERTION instead of NS_WARN_IF.
Attachment #8615314 - Flags: review?(m_kato) → review+
Comment on attachment 8615309 [details] [diff] [review]
part.2 mozilla::ContentCache should store a selection range and TabParent should use it

Review of attachment 8615309 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/ContentCache.cpp
@@ +24,5 @@
> +void
> +ContentCache::SetSelection(uint32_t aAnchorOffset, uint32_t aFocusOffset)
> +{
> +  mSelection.mAnchor = aAnchorOffset;
> +  mSelection.mFocus = aFocusOffset;

Both values shouldn't set UINT32_MAX.  We should add assertion or warning when aAnchorOffset or aFocusOffset is UINT32_MAX
Attachment #8615316 - Flags: review?(m_kato) → review+
Attachment #8615317 - Flags: review?(m_kato) → review+
I see, but instead of modifying the patch, I'll add a new patch for making invalid state of mSelection due to it's very early patch of the series.
Comment on attachment 8615320 [details] [diff] [review]
part.7 mozilla::ContentCache should handle WidgetQueryContentEvent

Review of attachment 8615320 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/ContentCache.cpp
@@ +41,5 @@
> +  aEvent.mWasAsync = false;
> +  aEvent.mReply.mFocusedWidget = aWidget;
> +
> +  switch (aEvent.message) {
> +    case NS_QUERY_SELECTED_TEXT:

When NS_QUERY_SELECTED_TEXT on non-e10s cause, can it return true if selection is invalid??

If not so, when mSelection.StartOffset() returns UINT32_MAX, we should return false. In the other word, we should add like IsValid() method in Selection struct, then this method should check StartOffset is UINT32_MAX.

::: widget/TextEvents.h
@@ +541,5 @@
>    bool mWithFontRanges;
>    struct
>    {
>      uint32_t EndOffset() const
>      {

Also should we add NS_WARN_IF(mOffset == UINT32_MAX)?  If no situation, please ignore.
Attachment #8615322 - Flags: review?(m_kato) → review+
Attachment #8615326 - Flags: review?(m_kato) → review+
Attachment #8615327 - Flags: review?(m_kato) → review+
Comment on attachment 8615329 [details] [diff] [review]
part.11 Remove unnecessary public methods of mozilla::ContentCache

Review of attachment 8615329 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/ContentCache.cpp
@@ +208,5 @@
>    WidgetQueryContentEvent queryText(true, NS_QUERY_TEXT_CONTENT, aWidget);
>    queryText.InitForQueryTextContent(0, UINT32_MAX);
>    aWidget->DispatchEvent(&queryText, status);
>    if (NS_WARN_IF(!queryText.mSucceeded)) {
> +    mText.Truncate(0);

nit: mText.Trucate();
Attachment #8615329 - Flags: review?(m_kato) → review+
Comment on attachment 8615330 [details] [diff] [review]
part.12 Log the behavior of mozilla::ContentCache

Review of attachment 8615330 [details] [diff] [review]:
-----------------------------------------------------------------

By bug 1165515, logging code is replaced.  This code causes bustage.
Attachment #8615330 - Flags: review?(m_kato)
Attachment #8615331 - Flags: review?(m_kato) → review+
Attachment #8615333 - Flags: review?(m_kato) → review+
Attachment #8615337 - Flags: review?(m_kato) → review+
Attachment #8615340 - Flags: review?(m_kato) → review+
Comment on attachment 8615342 [details] [diff] [review]
part.17 ContentCache::HandleQueryContentEvent() should return caret rect if mInput.mLength of NS_QUERY_TEXT_RECT event is 0

Review of attachment 8615342 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/ContentCache.cpp
@@ +262,5 @@
> +        if (NS_WARN_IF(!GetUnionTextRects(aEvent.mInput.mOffset,
> +                                          aEvent.mInput.mLength,
> +                                          aEvent.mReply.mRect))) {
> +          // XXX We don't have cache for this request.
> +          MOZ_LOG(sContentCacheLog, PR_LOG_ERROR,

Replace 2nd param with LogLevel::Error

@@ +272,5 @@
> +      } else {
> +        // If the length is 0, we should return caret rect instead.
> +        if (NS_WARN_IF(!GetCaretRect(aEvent.mInput.mOffset,
> +                                     aEvent.mReply.mRect))) {
> +          MOZ_LOG(sContentCacheLog, PR_LOG_ERROR,

Replace 2nd param with LogLevel::Error
Attachment #8615342 - Flags: review?(m_kato) → review+
Attachment #8615343 - Flags: review?(m_kato) → review+
Thank you for the quick review!

Here is the new logging code with LogLevel.

# I'll post part.19 which includes the fix of part.2 and part.7.
Attachment #8615330 - Attachment is obsolete: true
Attachment #8615827 - Flags: review?(m_kato)
This fixes your comment for part.2 and part.7.

And I include CheckedInt which I was advised by ehsan.
Attachment #8615834 - Flags: review?(m_kato)
Attachment #8615827 - Flags: review?(m_kato) → review+
Attachment #8615835 - Flags: review?(m_kato) → review+
Could you review part.7 again?
Attachment #8615320 - Flags: review?(m_kato) → review+
(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #51)
> Could you review part.7 again?

reviewed.  I confirm Part 19 fix
Thank you very much!! I'll land it after I confirm part.19 doesn't cause new assertion in automated tests.

Updated

4 years ago
Blocks: 1183873

Updated

4 years ago
Blocks: 1183875
You need to log in before you can comment on or make changes to this bug.