Closed Bug 1824142 Opened 1 year ago Closed 1 year ago

Crash in [@ mozilla::net::DocumentLoadListener::RedirectToParentProcess::<T>::~] with Microsoft Teams

Categories

(Core :: DOM: Core & HTML, defect)

Unspecified
Windows
defect

Tracking

()

RESOLVED FIXED
114 Branch
Tracking Status
firefox-esr102 --- unaffected
firefox111 --- unaffected
firefox112 --- unaffected
firefox113 --- fixed
firefox114 --- fixed

People

(Reporter: aryx, Assigned: timhuang)

References

Details

(Keywords: crash)

Crash Data

Attachments

(1 file)

This is a crash signature on Windows with Microsoft Team urls and which got regular crash reports starting with this week (10 crash reports yesterday). Because it's a diagnostic assert, release is not affected but 111-113 pre-release builds report this behavior.

Please redirect the needinfo request if necessary. The bug has been put here because you might have contacts which can share details what changed.

Crash report: https://crash-stats.mozilla.org/report/index/233064a7-84f4-4dbe-8411-8a8810230321

MOZ_CRASH Reason: MOZ_DIAGNOSTIC_ASSERT(Request::mDisconnected)

Top 10 frames of crashing thread:

0  xul.dll  mozilla::MozPromise<bool, nsresult, 1>::ThenValueBase::AssertIsDead  xpcom/threads/MozPromise.h:526
1  xul.dll  mozilla::MozPromise<nsresult, nsresult, 1>::AssertIsDead  xpcom/threads/MozPromise.h:1119
2  xul.dll  mozilla::MozPromise<nsresult, nsresult, 1>::AssertIsDead  xpcom/threads/MozPromise.h:1122
3  xul.dll  mozilla::MozPromise<nsresult, nsresult, 1>::AssertIsDead  xpcom/threads/MozPromise.h:1119
4  xul.dll  mozilla::MozPromise<nsresult, nsresult, 1>::~MozPromise  xpcom/threads/MozPromise.h:1162
5  xul.dll  mozilla::MozPromise<nsresult, nsresult, 1>::Private::~Private  xpcom/threads/MozPromise.h:256
6  xul.dll  mozilla::MozPromiseRefcountable::Release  xpcom/threads/MozPromise.h:151
6  xul.dll  mozilla::RefPtrTraits<mozilla::MozPromise<nsresult, mozilla::ipc::ResponseRejectReason, 1>::Private>::Release  mfbt/RefPtr.h:50
6  xul.dll  RefPtr<mozilla::MozPromise<nsresult, mozilla::ipc::ResponseRejectReason, 1>::Private>::ConstRemovingRefPtrTraits<mozilla::MozPromise<nsresult, mozilla::ipc::ResponseRejectReason, 1>::Private>::Release  mfbt/RefPtr.h:381
6  xul.dll  RefPtr<mozilla::MozPromise<nsresult, mozilla::ipc::ResponseRejectReason, 1>::Private>::~RefPtr  mfbt/RefPtr.h:81
Flags: needinfo?(na-g)

Jim, could you or Jib (?) reach out to our contacts and inquire about this?

Flags: needinfo?(na-g) → needinfo?(jmathies)
Flags: needinfo?(jib)
Depends on: 1635953

The bug is marked as tracked for firefox112 (beta). We have limited time to fix this, the soft freeze is in 13 days. However, the bug still isn't assigned.

:jimm, could you please find an assignee for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.

For more information, please visit auto_nag documentation.

Flags: needinfo?(jmathies)

... but 111-113 pre-release builds report this behavior.

Shouldn't firefox111 be wontfix based on this?

Assert is here. The crashing stack mentions netwerk/ipc/DocumentLoadListener.cpp:2036, perhaps implicting MozPromise use in netwerk, but no mention of webrtc or webrtc threads, which suggests this is not happening during a call perhaps?

FWIW I don't see any implication of WebRTC here other than we use MozPromise a lot, but so do other components. Leaving the ni? for Jim to reach out.

Flags: needinfo?(jib) → needinfo?(rjesup)

The bug has been put here because you might have contacts which can share details what changed.

Do we suspect a change in Teams here or does the crash pattern suggest a regression in Firefox 111?

https://crash-stats.mozilla.org/signature/?signature=mozilla%3A%3Anet%3A%3ADocumentLoadListener%3A%3ARedirectToParentProcess%3A%3A%3CT%3E%3A%3A~&date=%3E%3D2023-02-27T20%3A08%3A00.000Z&date=%3C2023-03-27T20%3A08%3A00.000Z

Clearing up the tracking flags - this is a diagnostic assert so it'll never impact release.

We think this might be Networking or DOM related? Also probably not worth reaching out to Microsoft to try and address a diagnostic assert in Firefox.

Nika, any suggestions on where a DocumentLoadListener bug should go? We know it's associated with something the Microsoft Teams site is doing, but it's probably not specific to web conferencing.

Component: WebRTC → DOM: Content Processes
Flags: needinfo?(rjesup)
Flags: needinfo?(nika)
Flags: needinfo?(jmathies)
Component: DOM: Content Processes → DOM: Core & HTML

First hit for this signature was build 20230219095859.

Thanks to Jim for spotting the parent bug here. This is likely related to that, based on bug 1635953 comment 14.

The DocumentLoadListener frame on the stack seems a bit surprising to me. Looking further up the stack it seems like the std::function is being destroyed from StorageAccessPermissionRequest::~StorageAccessPermissionRequest() , which has no reason to be holding a reference to that function. ChildProcessChannelListener only destroys the corresponding objects in the case when it is destroyed, which should not happen outside of shutdown, making it seem unlikely to me that this exact closure & promise are the ones at fault here. The callback named (https://searchfox.org/mozilla-central/rev/736eb58cd30da3afc0310b58232a1e4d0dcc86a4/netwerk/ipc/DocumentLoadListener.cpp#2037) is also quite a small function which just forwards a nsresult to a promise resolver, so there's a chance the linker deduplicated the symbol leading to this confusion.

Looking at https://bugzilla.mozilla.org/show_bug.cgi?id=1635953#c14, it appears there is another crash which happens with ~StorageAccessPermissionRequest() on the stack, but on a different platform: https://crash-stats.mozilla.org/report/index/4feaab0c-a606-42b6-b475-1a3320230327. I wonder whether this might be the same issue, but on macOS.

I am inclined to think that the issue here is that the StorageAccessPermissionRequest type does not make sure to invoke the cancel or allow callbacks before it is destroyed if it happens to be destroyed by the cycle collector (as it is in this case). I think this can happen if we decide to show the prompt (which ends up passing the request to system JS), and then the system JS doesn't end up ever calling any of the lifecycle methods. I think it'd probably be OK to reject in that case.

Given that, it might be as simple as replacing the defaulted destructor (https://searchfox.org/mozilla-central/rev/736eb58cd30da3afc0310b58232a1e4d0dcc86a4/dom/base/StorageAccessPermissionRequest.h#62) with a call to Cancel(), ensuring mCancelCallback is called if nothing else can be fired.

Flags: needinfo?(nika)

bvandersloot, it appears that you've modified this code recently. Could you, please, try the approach suggested by Nika above?

Flags: needinfo?(bvandersloot)
Assignee: nobody → tihuang
Status: NEW → ASSIGNED
Flags: needinfo?(bvandersloot)

We invoke Cancel() when destructing StorageAccessPermissionRequest to
ensure a callback function is called before the request destroys. The
request could be destroyed because of the cycle collector.

Pushed by tihuang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e0b6b8f492eb
Invoke Cancel() when StorageAccessPermissionRequest is destructed. r=anti-tracking-reviewers,pbz
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 114 Branch

The patch landed in nightly and beta is affected.
:timhuang, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox113 to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(tihuang)

Comment on attachment 9329696 [details]
Bug 1824142 - Invoke Cancel() when StorageAccessPermissionRequest is destructed. r?#anti-tracking-reviewers

Beta/Release Uplift Approval Request

  • User impact if declined: Beta users will encounter crashes from time to time when visiting Microsoft Team.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The change is minimal and doesn't change any user behavior.
  • String changes made/needed: None
  • Is Android affected?: No
Flags: needinfo?(tihuang)
Attachment #9329696 - Flags: approval-mozilla-beta?

Comment on attachment 9329696 [details]
Bug 1824142 - Invoke Cancel() when StorageAccessPermissionRequest is destructed. r?#anti-tracking-reviewers

Approved for 113.0b9.

Attachment #9329696 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: