Closed Bug 1740551 Opened 1 year ago Closed 1 year ago

Assertion failure: info, at /dom/serviceworkers/ServiceWorkerManager.cpp:2311

Categories

(Core :: DOM: Service Workers, defect, P3)

x86_64
Linux
defect

Tracking

()

RESOLVED DUPLICATE of bug 1747445
97 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox94 --- wontfix
firefox95 --- wontfix
firefox96 --- wontfix
firefox97 --- fixed

People

(Reporter: jkratzer, Assigned: edenchuang)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression, testcase, Whiteboard: [bugmon:bisected,confirmed])

Attachments

(2 files)

Testcase found while fuzzing mozilla-central rev 333f08065c8c (built with: --enable-debug --enable-fuzzing).

Testcase can be reproduced using the following commands:

$ pip install fuzzfetch grizzly-framework
$ python -m fuzzfetch --build 333f08065c8c --debug --fuzzing -n firefox
$ python -m grizzly.replay ./firefox/firefox testcase.zip --repeat 10 --relaunch 1
Assertion failure: info, at /dom/serviceworkers/ServiceWorkerManager.cpp:2311

    ==817619==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f522d5c948d bp 0x7ffe8b3be330 sp 0x7ffe8b3be290 T817619)
    ==817619==The signal is caused by a WRITE memory access.
    ==817619==Hint: address points to the zero page.
        #0 0x7f522d5c948d in mozilla::dom::ServiceWorkerManager::CheckPrincipalQuotaUsage(nsIPrincipal*, nsTSubstring<char> const&) /dom/serviceworkers/ServiceWorkerManager.cpp:2311:3
        #1 0x7f522d620745 in mozilla::dom::ServiceWorkerRegistrationInfo::CheckQuotaUsage() /dom/serviceworkers/ServiceWorkerRegistrationInfo.cpp:905:8
        #2 0x7f522d6205ae in mozilla::dom::ServiceWorkerRegistrationInfo::MaybeScheduleUpdate() /dom/serviceworkers/ServiceWorkerRegistrationInfo.cpp:520:7
        #3 0x7f522c2af65b in operator() /dom/clients/manager/ClientSourceParent.cpp:167:39
        #4 0x7f522c2af65b in mozilla::detail::RunnableFunction<mozilla::dom::ClientSourceParent::RecvNoteDOMContentLoaded()::$_10>::Run() /builds/worker/workspace/obj-build/dist/include/nsThreadUtils.h:531:5
        #5 0x7f5228dedd7e in mozilla::RunnableTask::Run() /xpcom/threads/TaskController.cpp:468:16
        #6 0x7f5228dc7696 in mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /xpcom/threads/TaskController.cpp:771:26
        #7 0x7f5228dc6358 in mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /xpcom/threads/TaskController.cpp:607:15
        #8 0x7f5228dc65d3 in mozilla::TaskController::ProcessPendingMTTask(bool) /xpcom/threads/TaskController.cpp:391:36
        #9 0x7f5228df1376 in operator() /xpcom/threads/TaskController.cpp:124:37
        #10 0x7f5228df1376 in mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_0>::Run() /builds/worker/workspace/obj-build/dist/include/nsThreadUtils.h:531:5
        #11 0x7f5228ddc083 in nsThread::ProcessNextEvent(bool, bool*) /xpcom/threads/nsThread.cpp:1175:16
        #12 0x7f5228de326a in NS_ProcessNextEvent(nsIThread*, bool) /xpcom/threads/nsThreadUtils.cpp:467:10
        #13 0x7f522986f3a6 in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) /ipc/glue/MessagePump.cpp:85:21
        #14 0x7f522978e9f7 in MessageLoop::RunInternal() /ipc/chromium/src/base/message_loop.cc:331:10
        #15 0x7f522978e902 in RunHandler /ipc/chromium/src/base/message_loop.cc:324:3
        #16 0x7f522978e902 in MessageLoop::Run() /ipc/chromium/src/base/message_loop.cc:306:3
        #17 0x7f522d70a158 in nsBaseAppShell::Run() /widget/nsBaseAppShell.cpp:137:27
        #18 0x7f522f591616 in nsAppStartup::Run() /toolkit/components/startup/nsAppStartup.cpp:295:30
        #19 0x7f522f6bcb62 in XREMain::XRE_mainRun() /toolkit/xre/nsAppRunner.cpp:5294:22
        #20 0x7f522f6be421 in XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) /toolkit/xre/nsAppRunner.cpp:5479:8
        #21 0x7f522f6bec99 in XRE_main(int, char**, mozilla::BootstrapConfig const&) /toolkit/xre/nsAppRunner.cpp:5538:21
        #22 0x55613c027c95 in do_main /browser/app/nsBrowserApp.cpp:225:22
        #23 0x55613c027c95 in main /browser/app/nsBrowserApp.cpp:395:16
        #24 0x7f523e7f50b2 in __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/../csu/libc-start.c:308:16
        #25 0x55613c00350c in _start (/home/jkratzer/builds/mc-debug/firefox-bin+0x1550c)
    
    UndefinedBehaviorSanitizer can not provide additional info.
    SUMMARY: UndefinedBehaviorSanitizer: SEGV /dom/serviceworkers/ServiceWorkerManager.cpp:2311:3 in mozilla::dom::ServiceWorkerManager::CheckPrincipalQuotaUsage(nsIPrincipal*, nsTSubstring<char> const&)
    ==817619==ABORTING
Attached file Testcase
Regressed by: 1722502
Has Regression Range: --- → yes

Set release status flags based on info from the regressing bug 1722502

Flags: needinfo?(echuang)

(In reply to Bugmon [:jkratzer for issues] from comment #2)

Bugmon Analysis
Unable to reproduce bug 1740551 using build mozilla-central 20211110092453-333f08065c8c. Without a baseline, bugmon is unable to analyze this bug.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Hi Jason, I tried to reproduce it, but it fails for me as for the bot. If you can reproduce it, a pernosco session would be great.

Flags: needinfo?(echuang) → needinfo?(jkratzer)

So ServiceWorkerManager::RemoveScopeAndRegistration does first remove the data->mInfos.Remove(aRegistration->Scope(), getter_AddRefs(info)); and then calls swm->MaybeRemoveRegistrationInfo(scopeKey);.

But swm->MaybeRemoveRegistrationInfo(scopeKey); removes the entry only under certain conditions.

ServiceWorkerManager::CheckPrincipalQuotaUsage instead seems to assume that if the RegistrationDataPerPrincipal exists we will also find an entry for ServiceWorkerRegistrationInfo. It is not clear to me if this is a strong assumption we should enforce or if we can simply return instead of asserting.

Flags: needinfo?(echuang)
Severity: -- → S3
Priority: -- → P3

:jstutte, it looks like I messed up the build type in comment 0. This should be a build with --enable-debug rather than --enable-address-sanitizer. I will upload a pernosco session for this shortly.

Flags: needinfo?(jkratzer)
Keywords: bugmon

Bugmon Analysis
Verified bug as reproducible on mozilla-central 20211116093425-3890e2f0b025.
The bug appears to have been introduced in the following build range:

Start: 23f3a0afb545acb23a5927696df94085ea74fcfe (20210822084712)
End: 43eb6ce90ab079e96f70d77747c8c211c08c0cec (20210822212235)
Pushlog: https://hg.mozilla.org/mozilla-unified/pushloghtml?fromchange=23f3a0afb545acb23a5927696df94085ea74fcfe&tochange=43eb6ce90ab079e96f70d77747c8c211c08c0cec

Whiteboard: [bugmon:confirm] → [bugmon:bisected,confirmed]
Flags: needinfo?(jstutte)
Flags: needinfo?(jstutte)

My high-level understanding is that we get somehow in a situation where:

Context

  • we have a valid ServiceWorkerManager::RegistrationDataPerPrincipal entry RDpP in ServiceWorkerManager::mRegistrationInfos for a given scope
    • but RDpP contains an empty list mInfos and it is not (yet) clear to me if this is an expected state and how we got here

Crash path

  • we enter ServiceWorkerRegistrationInfo::MaybeCheckNavigationUpdate and find a living ControlledClientData that contains a living strong reference mRegistrationInfo referring to the same scope as RDpP above and we call MaybeScheduleUpdate on it
    • we reach the navigation fault threshold inside ServiceWorkerRegistrationInfo::MaybeScheduleUpdate
    • this triggers a CheckQuotaUsage on the same registration info
      • there we take the mPrincipal and Scope() of this registration info to call swm->CheckPrincipalQuotaUsage
      • Inside ServiceWorkerManager::CheckPrincipalQuotaUsage we use this scope to first lookup the (existing) RDpP from above and then the registration info that we expect to find in mInfo of RDpP. But that is empty and therefore we have a nullptr

Questions

  • Should we ever be in the configuration to have a living but empty RDpP for a given scope?
    • if yes we must deal with this possibility in CheckPrincipalQuotaUsage (and maybe other places)
      • there is a comment here that talks about re-use of RDpP instances, so it seems likely?
    • if no we need to continue investigations how we got there.
      • there are places in the code where it seems we explicitly do not want this to happen, instead
  • Is it expected that we still see a living ControlledClientData in ServiceWorkerManager::mControlledClients for a given client id after we apparently already unregistered the service worker?
  • Can't we directly use the already present registration info from ControlledClientData reducing the number of indirect (and error prone) lookups here?
    • I also assume the code could benefit from some wrapper lookup functions that handle well all possible nullptr constellations rather than having all these manual double lookups?

(In reply to Bugmon [:jkratzer for issues] from comment #7)

Bugmon Analysis
Verified bug as reproducible on mozilla-central 20211116093425-3890e2f0b025.
The bug appears to have been introduced in the following build range:

Start: 23f3a0afb545acb23a5927696df94085ea74fcfe (20210822084712)
End: 43eb6ce90ab079e96f70d77747c8c211c08c0cec (20210822212235)
Pushlog: https://hg.mozilla.org/mozilla-unified/pushloghtml?fromchange=23f3a0afb545acb23a5927696df94085ea74fcfe&tochange=43eb6ce90ab079e96f70d77747c8c211c08c0cec

The pushlog here suggests bug 1693074 to have introduced this, which sounds weird. I suspect the potential for this issue was always there but some timing was altered?

Navigation fault mitigation should not be performed when the ServiceWorkerRegistrationInfo::mUnregistered is true.

(In reply to Jens Stutte [:jstutte] from comment #10)

Questions

  • Should we ever be in the configuration to have a living but empty RDpP for a given scope?
    • if yes we must deal with this possibility in CheckPrincipalQuotaUsage (and maybe other places)
      • there is a comment here that talks about re-use of RDpP instances, so it seems likely?
    • if no we need to continue investigations how we got there.
      • there are places in the code where it seems we explicitly do not want this to happen, instead
  • Is it expected that we still see a living ControlledClientData in ServiceWorkerManager::mControlledClients for a given client id after we apparently already unregistered the service worker?

I think this is the expected behavior.
https://searchfox.org/mozilla-central/rev/b605f01915c5704a55e9f485101b7be7d20a55df/dom/serviceworkers/ServiceWorkerRegistrationInfo.h#62-68

  • Can't we directly use the already present registration info from ControlledClientData reducing the number of indirect (and error prone) lookups here?

mControlledClients and ScopeToRegistration maps have different meanings.
Scope, Clients and RegistrationInfo are not 1 to 1 to 1 relationships that is the reason why we perform things in such a indirect way.

  • I also assume the code could benefit from some wrapper lookup functions that handle well all possible nullptr constellations rather than having all these manual double lookups?
Assignee: nobody → echuang
Status: NEW → ASSIGNED

(In reply to Eden Chuang[:edenchuang] from comment #12)

Navigation fault mitigation should not be performed when the ServiceWorkerRegistrationInfo::mUnregistered is true.

I assume this would avoid the entire code path of the crash?

mControlledClients and ScopeToRegistration maps have different meanings.
Scope, Clients and RegistrationInfo are not 1 to 1 to 1 relationships that is the reason why we perform things in such a indirect way.

Does this mean we still should expect the situation to have a living but empty RDpP during lookups for a given scope (see first question)?

Pushed by echuang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/aa7acb67ca33
Do not perform navigation fault mitigation when the ServiceWorkerRegistrationInfo::mUnregistered is true. r=dom-worker-reviewers,jstutte
Flags: needinfo?(echuang)
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 97 Branch

Bugmon Analysis
Bug marked as FIXED but still reproduces on mozilla-central 20211221214151-f16661bf49d3.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Hi Jason, can we have the latest reproducing stack-trace to see if it is still the same? Thanks!

Flags: needinfo?(jkratzer)

Jen's, the attached testcase now produces the following assertion. If you'd like me to file a separate bug for this stack, please let me know.

Assertion failure: false, at /builds/worker/checkouts/gecko/netwerk/protocol/http/InterceptedHttpChannel.cpp:1388

    #0 0x7f51bc88244a in mozilla::net::InterceptedHttpChannel::InterceptionTimeStamps::RecordTime(mozilla::net::InterceptedHttpChannel::InterceptionTimeStamps::Status&&, mozilla::TimeStamp&&) /builds/worker/checkouts/gecko/netwerk/protocol/http/InterceptedHttpChannel.cpp:1388:9
    #1 0x7f51bc883c49 in mozilla::net::InterceptedHttpChannel::Cancel(nsresult) /builds/worker/checkouts/gecko/netwerk/protocol/http/InterceptedHttpChannel.cpp:504:15
    #2 0x7f51bca33230 in mozilla::net::DocumentLoadListener::Cancel(nsresult const&) /builds/worker/checkouts/gecko/netwerk/ipc/DocumentLoadListener.cpp:1118:15
    #3 0x7f51bcea2bce in RecvCancel /builds/worker/workspace/obj-build/dist/include/mozilla/net/DocumentChannelParent.h:37:30
    #4 0x7f51bcea2bce in mozilla::net::PDocumentChannelParent::OnMessageReceived(IPC::Message const&) /builds/worker/workspace/obj-build/ipc/ipdl/PDocumentChannelParent.cpp:271:65
    #5 0x7f51bcdb308b in mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const&) /builds/worker/workspace/obj-build/ipc/ipdl/PContentParent.cpp:6505:32
    #6 0x7f51bcbf6cdf in mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecycleProxy*, IPC::Message const&) /builds/worker/checkouts/gecko/ipc/glue/MessageChannel.cpp:2043:25
    #7 0x7f51bcbf3611 in mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&) /builds/worker/checkouts/gecko/ipc/glue/MessageChannel.cpp:1968:9
    #8 0x7f51bcbf4a95 in mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::MessageChannel::MessageTask&) /builds/worker/checkouts/gecko/ipc/glue/MessageChannel.cpp:1827:3
    #9 0x7f51bcbf56cd in mozilla::ipc::MessageChannel::MessageTask::Run() /builds/worker/checkouts/gecko/ipc/glue/MessageChannel.cpp:1855:14
    #10 0x7f51bc165a2e in mozilla::RunnableTask::Run() /builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp:468:16
    #11 0x7f51bc13f6c6 in mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp:771:26
    #12 0x7f51bc13e388 in mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp:607:15
    #13 0x7f51bc13e603 in mozilla::TaskController::ProcessPendingMTTask(bool) /builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp:391:36
    #14 0x7f51bc169096 in operator() /builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp:124:37
    #15 0x7f51bc169096 in mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_0>::Run() /builds/worker/workspace/obj-build/dist/include/nsThreadUtils.h:531:5
    #16 0x7f51bc154003 in nsThread::ProcessNextEvent(bool, bool*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:1183:16
    #17 0x7f51bc15b26a in NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:467:10
    #18 0x7f51bcbfcaf6 in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) /builds/worker/checkouts/gecko/ipc/glue/MessagePump.cpp:85:21
    #19 0x7f51bcb1bfd7 in MessageLoop::RunInternal() /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:331:10
    #20 0x7f51bcb1bee2 in RunHandler /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:324:3
    #21 0x7f51bcb1bee2 in MessageLoop::Run() /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:306:3
    #22 0x7f51c0d92b78 in nsBaseAppShell::Run() /builds/worker/checkouts/gecko/widget/nsBaseAppShell.cpp:137:27
    #23 0x7f51c2c8c0b6 in nsAppStartup::Run() /builds/worker/checkouts/gecko/toolkit/components/startup/nsAppStartup.cpp:295:30
    #24 0x7f51c2db4106 in XREMain::XRE_mainRun() /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:5362:22
    #25 0x7f51c2db59b0 in XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:5547:8
    #26 0x7f51c2db6229 in XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:5606:21
    #27 0x562f0a026f45 in do_main /builds/worker/checkouts/gecko/browser/app/nsBrowserApp.cpp:225:22
    #28 0x562f0a026f45 in main /builds/worker/checkouts/gecko/browser/app/nsBrowserApp.cpp:395:16
    #29 0x7f51d0d7d0b2 in __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/../csu/libc-start.c:308:16
    #30 0x562f0a0027bc in _start (/home/jkratzer/builds/mc-debug/firefox-bin+0x157bc)

UndefinedBehaviorSanitizer can not provide additional info.
SUMMARY: UndefinedBehaviorSanitizer: SEGV /builds/worker/checkouts/gecko/netwerk/protocol/http/InterceptedHttpChannel.cpp:1388:9 in mozilla::net::InterceptedHttpChannel::InterceptionTimeStamps::RecordTime(mozilla::net::InterceptedHttpChannel::InterceptionTimeStamps::Status&&, mozilla::TimeStamp&&)
Flags: needinfo?(jkratzer)

I logged it the other day :)

See Also: → 1747445
Status: REOPENED → RESOLVED
Closed: 1 year ago1 year ago
Resolution: --- → DUPLICATE
Regressions: 1747445

No valid actions for resolution (DUPLICATE).
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Keywords: bugmon
You need to log in before you can comment on or make changes to this bug.