Closed Bug 1706975 Opened 3 years ago Closed 3 years ago

Crash in [@ mozilla::net::TRRService::Notify]

Categories

(Core :: Networking: DNS, defect, P2)

defect

Tracking

()

RESOLVED FIXED
90 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox88 --- wontfix
firefox89 --- wontfix
firefox90 --- fixed

People

(Reporter: valentin, Assigned: valentin)

References

(Blocks 1 open bug)

Details

(Keywords: crash, Whiteboard: [necko-triaged])

Crash Data

Attachments

(1 file)

Crash report: https://crash-stats.mozilla.org/report/index/44259f55-131d-4454-a643-848470210418

MOZ_CRASH Reason: MOZ_CRASH(Unknown timer)

Top 10 frames of crashing thread:

0 libxul.so mozilla::net::TRRService::Notify /build/firefox-mMnNhP/firefox-87.0+build3/netwerk/dns/TRRService.cpp:917
1 libxul.so {virtual override thunk} 
2 libxul.so nsTimerImpl::Fire /build/firefox-mMnNhP/firefox-87.0+build3/xpcom/threads/nsTimerImpl.cpp:565
3 libxul.so nsTimerEvent::Run /build/firefox-mMnNhP/firefox-87.0+build3/xpcom/threads/TimerThread.cpp:252
4 libxul.so nsThread::ProcessNextEvent /build/firefox-mMnNhP/firefox-87.0+build3/xpcom/threads/nsThread.cpp:1148
5 libxul.so NS_ProcessNextEvent /build/firefox-mMnNhP/firefox-87.0+build3/xpcom/threads/nsThreadUtils.cpp:548
6 libxul.so mozilla::ipc::MessagePumpForNonMainThreads::Run /build/firefox-mMnNhP/firefox-87.0+build3/ipc/glue/MessagePump.cpp:332
7 libxul.so MessageLoop::Run /build/firefox-mMnNhP/firefox-87.0+build3/ipc/chromium/src/base/message_loop.cc:310
8 libxul.so nsThread::ThreadFunc /build/firefox-mMnNhP/firefox-87.0+build3/xpcom/threads/nsThread.cpp:391
9 libnspr4.so _pt_root /build/firefox-mMnNhP/firefox-87.0+build3/nsprpub/pr/src/pthreads/ptthread.c:201

Low volume, but it seems we sometimes race between Notify and changing/cancelling the timer, so it leads to this crash.
We should be able to handle it gracefully.
Theoretically it just ignoring the notification should be enough.

We should be holding the lock when comparing the timer pointers otherwise
we have a race where the timer is replaced and cancelled under the lock
but we either still call the handler or simply crash.

Depends on D113310

Pushed by valentin.gosu@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/24fc2394dce7
TRRService::Notify should hold lock when accessing timer r=necko-reviewers,dragana
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 90 Branch

The patch landed in nightly and beta is affected.
:valentin, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(valentin.gosu)

Valentin is out, maybe Dragana can respond.

Flags: needinfo?(dd.mozilla)

Thanks, I will request an uplift.

Flags: needinfo?(valentin.gosu)
Flags: needinfo?(dd.mozilla)

Comment on attachment 9218287 [details]
Bug 1706975 - TRRService::Notify should hold lock when accessing timer r=#necko

Beta/Release Uplift Approval Request

  • User impact if declined: The bug fixes a crash. It its a low volume crash and it is only happening on release, therefore the fix cannot be verified on Nightly.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The patch only makes sure that a mTimer is only under a lock.
  • String changes made/needed:
Attachment #9218287 - Flags: approval-mozilla-beta?

Comment on attachment 9218287 [details]
Bug 1706975 - TRRService::Notify should hold lock when accessing timer r=#necko

This fixes a rare crash on release and we have only one beta left so it's probably better to =just let it ride the 90 train, thanks.

Attachment #9218287 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: