Open Bug 1707477 Opened 3 years ago Updated 2 years ago

Crash in [@ mozilla::dom::cache::Manager::SetCacheIdOrphanedIfRefed]

Categories

(Core :: Storage: Cache API, defect, P3)

Unspecified
All
defect

Tracking

()

Tracking Status
firefox100 --- affected
firefox101 --- affected

People

(Reporter: gsvelto, Unassigned)

References

Details

(Keywords: crash)

Crash Data

Crash report: https://crash-stats.mozilla.org/report/index/e580bbc7-087a-41a0-b2d9-063fc0210424

MOZ_CRASH Reason: MOZ_DIAGNOSTIC_ASSERT(!foundIt->mOrphaned)

Top 10 frames of crashing thread:

0 libxul.so mozilla::dom::cache::Manager::SetCacheIdOrphanedIfRefed dom/cache/Manager.cpp:2019
1 libxul.so mozilla::dom::cache::Manager::StorageDeleteAction::Complete dom/cache/Manager.cpp:1412
2 libxul.so mozilla::dom::cache::Manager::BaseAction::CompleteOnInitiatingThread dom/cache/Manager.cpp:512
3 libxul.so mozilla::dom::cache::Context::ActionRunnable::Run dom/cache/Context.cpp:682
4 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1153
5 libxul.so mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:330
6 libxul.so MessageLoop::Run ipc/chromium/src/base/message_loop.cc:310
7 libxul.so nsThread::ThreadFunc xpcom/threads/nsThread.cpp:395
8 libnspr4.so _pt_root nsprpub/pr/src/pthreads/ptthread.c:201
9 libpthread.so.0 start_thread nptl/nptl/pthread_create.c:477

Low-volume, but that's most likely because it's a diagnostic assertion. First crash is from buildid 20210309094921, seems to happen on all platforms.

Component: DOM: Core & HTML → Storage: Cache API

The diagnostic assert has been introduced here in bug 1682542, two months earlier than build 20210309094921. So I'd tend to assume, that if there is a regression it happened elsewhere.

:asuth, do you know the rationale for that diagnostic assert?

Flags: needinfo?(bugmail)
See Also: → 1682542

The code is part of the original landing in bug 940273 with this being the original state in tree.

I understand the general situation here to be:

  • A Cache should be deleted at most once (from CacheStorage). When it's deleted, there are either outstanding references to it or not.
    • If there are references, it's still usable, but said to be "orphaned", and we need to do some cleanup when the references are finally dropped.
    • If there weren't still references, then it can be cleaned up immediately.
  • The diagnostic assertions here are asserting:
    • This one: Since there's at most one deletion, we should transition to orphaned at most once, so we can't possibly be deleting something twice. (The same string name can be used multiple times, but each successive Cache with the same name gets a different UUID, which is what we're dealing with here.)
      • This is very surprising since the caller (Manager::StorageDeleteAction) is consulting the database and handles the case the row no longer exists, and there's a transaction involved and everything.
    • The one that's not this: There should be refcounts if it's in the data-structure that contains the list of refcounts.
Flags: needinfo?(bugmail)

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #2)

I understand the general situation here to be:

  • A Cache should be deleted at most once (from CacheStorage). When it's deleted, there are either outstanding references to it or not.
    • If there are references, it's still usable, but said to be "orphaned", and we need to do some cleanup when the references are finally dropped.
    • If there weren't still references, then it can be cleaned up immediately.
  • The diagnostic assertions here are asserting:
    • This one: Since there's at most one deletion, we should transition to orphaned at most once, so we can't possibly be deleting something twice. (The same string name can be used multiple times, but each successive Cache with the same name gets a different UUID, which is what we're dealing with here.)
      • This is very surprising since the caller (Manager::StorageDeleteAction) is consulting the database and handles the case the row no longer exists, and there's a transaction involved and everything.

What harm can happen, if we delete it more than once? It looks like everybody can deal with orphaned entries, anyway. BTW, I read right above Manager::SetBodyIdOrphanedIfRefed:

// TODO: provide way to set body non-orphaned if its added back to a cache (bug
// 1110479)

Is "adding it back to the cache" something that can already happen (and lead to this situation) ?

In any case: Andrew, can you please triage this?

Flags: needinfo?(bugmail)

(In reply to Jens Stutte [:jstutte] from comment #3)

What harm can happen, if we delete it more than once? It looks like everybody can deal with orphaned entries, anyway.

We shouldn't actually be able to delete it more than once, though. Although CacheStorage.delete operates in terms of string names, under the hood we're operating in terms of UUIDs, and the assertion is basically asserting that a single UUID and related object transition to deleted at most once.

BTW, I read right above Manager::SetBodyIdOrphanedIfRefed:

// TODO: provide way to set body non-orphaned if its added back to a cache (bug
// 1110479)

Is "adding it back to the cache" something that can already happen (and lead to this situation) ?

AIUI, this would be something that happens (after implementing enhancement bug 1110479) at the Request/Response level of the [CacheStorage, Cache, Request/Response] hierarchy. We have marked all of those Request/Responses in the deleted (but still accessible via instances exposed to JS) Cache as orphaned too. They can still be retrieved from the orphaned Cache and then put into a live Cache. If bug 1110479 was implemented, this could/would reuse the same underlying object, in which case the body could be said to no longer be orphaned. However, we don't implement this de-duplication yet, so it's not actually an invariant problem yet (although there will be some wasteful disk I/O).

In any case: Andrew, can you please triage this?

Done.

Severity: -- → S4
Flags: needinfo?(bugmail)
Priority: -- → P3

I just experienced a crash (https://crash-stats.mozilla.org/report/index/83aa89aa-5d2f-4baf-93a8-6ea800211029#tab-details) and it links to this bug. I had just restarted after an update to 2021-10-27 (8 pinned tabs and presumably the other non-pinned tabs weren't reloaded) and went to https://thestar.com in a new tab. I doubt that will help track this down but you never know.

I just hit this again and this time I wasn't even at my computer :) I had a bunch of Google Docs tabs open + Gmail, Slack, Matrix, and a smattering of other pages.

You need to log in before you can comment on or make changes to this bug.