Closed Bug 1777277 Opened 3 years ago Closed 3 years ago

Crash with infinite recursion in RemoteContentController::NotifyScaleGestureCompleteInProcess

Categories

(Core :: Graphics: Layers, defect)

x86
Windows 7
defect

Tracking

RESOLVED FIXED
104 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox-esr102 --- unaffected
firefox102 --- unaffected
firefox103 --- fixed
firefox104 --- fixed

People

(Reporter: mccr8, Assigned: botond)

References

(Regression)

Details

(Keywords: crash, regression)

Attachments

(1 file)

Crash report: https://crash-stats.mozilla.org/report/index/d0b54c6a-625c-4656-81cb-8a7a40220629

Reason: EXCEPTION_STACK_OVERFLOW

Top 10 frames of crashing thread:

0 ntdll.dll RtlAcquireSRWLockExclusive 
1 mozglue.dll mozilla::detail::MutexImpl::lock mozglue/misc/Mutex_windows.cpp:22
2 xul.dll static mozilla::layers::CompositorBridgeParent::GetGeckoContentControllerForRoot gfx/layers/ipc/CompositorBridgeParent.cpp:1878
3 xul.dll mozilla::layers::RemoteContentController::NotifyScaleGestureCompleteInProcess gfx/layers/ipc/RemoteContentController.cpp:423
4 xul.dll mozilla::layers::RemoteContentController::NotifyScaleGestureCompleteInProcess gfx/layers/ipc/RemoteContentController.cpp:426
5 xul.dll mozilla::layers::RemoteContentController::NotifyScaleGestureCompleteInProcess gfx/layers/ipc/RemoteContentController.cpp:426
6 xul.dll mozilla::layers::RemoteContentController::NotifyScaleGestureCompleteInProcess gfx/layers/ipc/RemoteContentController.cpp:426
7 xul.dll mozilla::layers::RemoteContentController::NotifyScaleGestureCompleteInProcess gfx/layers/ipc/RemoteContentController.cpp:426
8 xul.dll mozilla::layers::RemoteContentController::NotifyScaleGestureCompleteInProcess gfx/layers/ipc/RemoteContentController.cpp:426
9 xul.dll mozilla::layers::RemoteContentController::NotifyScaleGestureCompleteInProcess gfx/layers/ipc/RemoteContentController.cpp:426

This crash has shown up in automation, but now it also appears to be happening in the wild.

Crash volume is very low. As I said in the other bug, this looks like it could be a regression from bug 1773865. I don't know if these various incarnations should be duped together or what.

Flags: needinfo?(mconley)
Keywords: regression
Regressed by: 1773865

Here are two more crashes that look the same, but with different and very generic-looking signatures:
bp-f9d99971-63d1-4098-a3e6-9c15a0220627
bp-48503b14-36f7-4393-850b-d901a0220625

Summary: Crash in [@ mozilla::detail::MutexImpl::lock | mozilla::layers::CompositorBridgeParent::GetGeckoContentControllerForRoot] → Crash with infinite recursion in RemoteContentController::NotifyScaleGestureCompleteInProcess

Maybe this needs a check that rootController != this?

  RefPtr<GeckoContentController> rootController =
      CompositorBridgeParent::GetGeckoContentControllerForRoot(aGuid.mLayersId);
  if (rootController) {
    rootController->NotifyScaleGestureComplete(aGuid, aScale);
  }

Thanks, mccr8! Hey botond, this is a little outside of my wheelhouse... is mccr8's suggestion the way to go for crashes like this?

Flags: needinfo?(mconley) → needinfo?(botond)

Set release status flags based on info from the regressing bug 1773865

Huh, that's unexpected. If the GPU process is disabled (which is the case in which NotifyScaleGestureCompleteInProcess() is called), I would expect GetGeckoContentControllerForRoot() to return a ChromeProcessController, not a RemoteContentController.

(In reply to Andrew McCreight [:mccr8] from comment #3)

Maybe this needs a check that rootController != this?

We can do that as a quick fix to avoid stack overflows in production, but we should also add a MOZ_ASSERT(rootController != this), and continue investigating why this happens.
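
To make the proposed guard concrete, here is a minimal, self-contained sketch of the pattern. It is not the landed patch and does not use Gecko types: Controller, ChromeController, RemoteController, and the root, forwarded, and skipped members are hypothetical stand-ins, and the root pointer stands in for whatever GetGeckoContentControllerForRoot() returns. The debug-only assertion is described in a comment rather than compiled in, so that the release-build fallback can be exercised.

```cpp
#include <cassert>

// Hypothetical stand-in for GeckoContentController (not the real Gecko class).
struct Controller {
  virtual ~Controller() = default;
  virtual void NotifyScaleGestureComplete(float aScale) = 0;
};

// Hypothetical stand-in for the controller we expect the root lookup to return.
struct ChromeController : Controller {
  int notified = 0;
  void NotifyScaleGestureComplete(float) override { ++notified; }
};

// Hypothetical stand-in for RemoteContentController.
struct RemoteController : Controller {
  // Assumption: models the result of the root-controller lookup.
  Controller* root = nullptr;
  int forwarded = 0;  // calls passed on to the root controller
  int skipped = 0;    // calls dropped because the root was null or ourselves

  void NotifyScaleGestureComplete(float aScale) override {
    Controller* rootController = root;
    // A debug build would additionally assert rootController != this here,
    // so the unexpected case is still noticed and can be investigated.
    if (rootController && rootController != this) {
      ++forwarded;
      rootController->NotifyScaleGestureComplete(aScale);
    } else {
      // Without the "!= this" check, forwarding to ourselves would recurse
      // until the stack overflows.
      ++skipped;
    }
  }
};
```

With the guard in place, a controller whose root lookup returns itself drops the notification instead of recursing, while the normal case still forwards to the root controller exactly once.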

I see bug 1776847 has a case of this failing on Linux debug, if we can catch this in Pernosco we may be able to get to the bottom of it.

Flags: needinfo?(botond)

However, also add a MOZ_ASSERT because GetGeckoContentControllerForRoot()
returning a RemoteContentController here is unexpected and should be
investigated further.

Assignee: nobody → botond
Status: NEW → ASSIGNED

Posted the temporary fix for now. I expect that will turn this rare stack overflow into a rare intermittent in debug builds where the MOZ_ASSERT fails, which we can then investigate further.

Pushed by bballo@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/7354cfc40054 Guard against infinite recursion in RemoteContentController::NotifyScaleGestureComplete(). r=mccr8
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 104 Branch

The patch landed in nightly and beta is affected.
:botond, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox103 to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(botond)
Crash Signature: [@ mozilla::detail::MutexImpl::lock | mozilla::layers::CompositorBridgeParent::GetGeckoContentControllerForRoot] → [@ mozilla::detail::MutexImpl::lock | mozilla::layers::CompositorBridgeParent::GetGeckoContentControllerForRoot] [@ mozilla::detail::MutexImpl::lock | xul.dll]
Regressions: 1777416

:botond could you submit a beta uplift request for this?

Comment on attachment 9283486 [details]
Bug 1777277 - Guard against infinite recursion in RemoteContentController::NotifyScaleGestureComplete(). r=mccr8

Beta/Release Uplift Approval Request

  • User impact if declined: Some users may experience a crash in the parent process, which brings down the entire application. It's rare and STR are not known.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The patch just adds a check that rootController != this, which avoids the infinite recursion and the resulting crash
  • String changes made/needed:
  • Is Android affected?: No
Flags: needinfo?(botond)
Attachment #9283486 - Flags: approval-mozilla-beta?

Comment on attachment 9283486 [details]
Bug 1777277 - Guard against infinite recursion in RemoteContentController::NotifyScaleGestureComplete(). r=mccr8

Approved for 103.0b7, thanks.

Attachment #9283486 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Crash Signature: [@ mozilla::detail::MutexImpl::lock | mozilla::layers::CompositorBridgeParent::GetGeckoContentControllerForRoot] [@ mozilla::detail::MutexImpl::lock | xul.dll] → [@ mozilla::detail::MutexImpl::lock | mozilla::layers::CompositorBridgeParent::GetGeckoContentControllerForRoot] [@ mozilla::detail::MutexImpl::lock | xul.dll] [@ _dl_deallocate_tls] [@ _dl_update_slotinfo] [@ libc.so@0x6437c | mozilla::layers::Compos…