Closed Bug 1653276 Opened 4 years ago Closed 4 years ago

Unexpected error when accessing an object store "not a known object store name"

Categories

(Core :: Storage: IndexedDB, defect, P3)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
83 Branch
Tracking Status
firefox-esr78 --- fixed
firefox81 --- wontfix
firefox82 --- wontfix
firefox83 --- fixed
firefox89 --- affected

People

(Reporter: gregtatum, Assigned: sg, NeedInfo)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 1 obsolete file)

We've been encountering this error with various users on the Firefox Profiler front-end, and from what I can tell, we are accessing IndexedDB in the correct way, at least according to how MDN outlines it. I suspect there could be a browser bug here.

We had Bug 1650627 on file, which we moved to our front-end repo: https://github.com/firefox-devtools/profiler/issues/2637

I outlined the problem in firefox-devtools/profiler/#2637. I will copy it here. The expected results are that the onupgradedneeded will always ensure that the symbol store exists, and that onsuccess will let us use it without errors.


This is the error:

Encountered error while cleaning out database: DOMException: IDBDatabase.transaction: 'symbol-tables' is not a known object store name symbol-store-db.js:151:18
    onsuccess symbol-store-db.js:151
    (Async: promise callback)
    onsuccess symbol-store-db.js:150
    (Async: EventHandlerNonNull)
    _setupDB symbol-store-db.js:141
    _setupDB symbol-store-db.js:97
    SymbolStoreDB symbol-store-db.js:80
    SymbolStore symbol-store.js:172
    getSymbolStore receive-profile.js:823
    finalizeProfileView receive-profile.js:224
    Redux 2
    loadProfile receive-profile.js:141
    Redux 2
    getProfileFromAddon receive-profile.js:807
    retrieveProfileFromAddon receive-profile.js:916
    Redux 2
    getProfilesFromRawUrl receive-profile.js:1409
    Redux 2

Here's what is going on in the code:

_setupDB() calls const openReq = indexedDB.open(dbName, 2);.

openReq.onupgradeneeded is defined, and when run, creates the object store:

const store: SymbolStore = db.createObjectStore('symbol-tables', {
  keyPath: ['debugName', 'breakpadId'],
});

openReq.onsuccess is then definitely run, which calls _deleteAllBeforeDate

This error is then generated on:

const transaction = db.transaction('symbol-tables', 'readwrite');

According to MDN:

[The createObjectStore] method can be called only within a versionchange transaction.

We are doing this, then immediately, and synchronously with the onsuccess handler accessing the store.

Here's a permalink with valid line numbers:

https://github.com/firefox-devtools/profiler/blob/715ad6cf908abcb11b068ebe6e01b2f5d238b000/src/profile-logic/symbol-store-db.js#L97-L154

Plus here is the transaction that is generating the error:

https://github.com/firefox-devtools/profiler/blob/715ad6cf908abcb11b068ebe6e01b2f5d238b000/src/profile-logic/symbol-store-db.js#L253

OS: Unspecified → Linux
Hardware: Unspecified → x86_64
Version: unspecified → Trunk

Greg, thanks for reporting this!

Is this a regression, i.e. did it work reliably before? Or has the version 2 of the database only been introduced recently, so this isn't known.

Bug 1650627 indicated there are STR, but AFAIU, this isn't reproduced deterministically on every upgrade of the DB version?

I can imagine there might be some race when the onupgradeneeded handler deletes an object store, and then recreates an object store with the same name:

   openReq.onupgradeneeded = ({ oldVersion }) => { 
     const db = openReq.result; 
     db.onerror = reject; 
  
     if (oldVersion === 1) { 
       db.deleteObjectStore('symbol-tables'); 
     } 
     const store: SymbolStore = db.createObjectStore('symbol-tables', { 
       keyPath: ['debugName', 'breakpadId'], 
     }); 
     store.createIndex('lastUsedDate', 'lastUsedDate'); 
   }; 

I am not completely sure if the specification allows for this to fail by saying:

In some implementations it is possible for the implementation to run into problems after queuing a task to create the object store after the createObjectStore() method has returned. 

But in that case, the transaction would need to be aborted:

Such implementations must still create and return an IDBObjectStore object, and once the implementation determines that creating the object store has failed, it must abort the transaction using the steps to abort a transaction using the appropriate error.

But probably this is intended to work. I will try to reproduce this with a test case that deletes and re-creates an object store with the same name in onupgradeneeded.

The current WPT tests do not seem to have such a test case according to this quick check:

[simon@sigibln mozilla-unified]$ grep deleteObjectStore testing/web-platform/tests/IndexedDB/* -A8 | grep createObjectStore
testing/web-platform/tests/IndexedDB/idbdatabase_deleteObjectStore4-not_reused.htm-    var objStore2 = db.createObjectStore("resurrected", { autoIncrement: true });

idbdatabase_deleteObjectStore4-not_reused.htm does a similar thing: In its onupgradeneeded handler, the resurrected object store is created, deleted and then created again. Besides that, it does not create another transaction that tries to use it.

Assignee: nobody → sgiesecke

It's really hard to say if this is a regression. I don't THINK it's a recent regression at the very least. I've gotten maybe 5 or 6 reports of this issue over the course of a year, but only these two bugs filed form it. My memory is a little fuzzy as it usually only comes up in chat without anyway to reproduce, and often without a stack trace on our end.

To me this does sound likely to be something like a race condition because it only happens rarely. I think version 2 has been around for quite some time now. This db is used frequently with users across multiple platforms, so if there is a race somewhere, we have Mozilla users that will likely hit it. I think I may even have hit it at one point, and delete the database through DevTools. Again, my memory is a bit fuzzy.

Do you think the test in https://bugzilla.mozilla.org/attachment.cgi?id=9164092 reflects the situation? At least with a few runs I wasn't able to reproduce the issue, unfortunately.

Flags: needinfo?(gtatum)

When this happened to me today, I am fairly sure that there was no existing database. (I didn't look, but before capturing the profile, I had just changed the profiler frontend URL to http://localhost:4242, since the first try landed on the default URL of https://profiler.firefox.com). Seems like that would yield an oldVersion of 0, and so I don't think a deletion of symbol-tables was even attempted.
Unfortunately, I have been unsuccessful in reproducing this issue.

It might be worth noting that this happened in a local non-artifact build of Nightly, where I have --enable-debug and --enable-debug-symbols enabled. This seems to slow down the browser quite a bit, and what I experience now is a lot of crashes in the browser. Content tabs and other processes are very unstable and asks me to attach gdb all the time. I haven't looked closely at the stacktraces (yet) since they seem gnarly, but there are IPC things in there and I suspect I am running into race conditions. So perhaps all bets are off in my particular case.

Michael, how do you know there was no database? Were you using a brand new Firefox profile? Otherwise it's totally possible it was created when you worked on something else.

One possibility I was thinking of is that data eviction had a problem, that Firefox would evict the stores but not the DB. Also this could be an old Firefox bug that's already fixed, but that existed in the past.

I noticed that mostly people working with us for some time had the problem, but people that don't use the profiler that often. That's why I thought of data eviction.

Is that useful to have database without any object store at all? Should Firefox delete them?

(In reply to Julien Wajsberg [:julienw] from comment #6)

Michael, how do you know there was no database? Were you using a brand new Firefox profile? Otherwise it's totally possible it was created when you worked on something else.

In short, I don't. :-) But I did in fact create a new Firefox profile for that local build instance of Nightly, and I think that was just a few days ago. That's why I think there was no database there. But it is certainly possible that I've misunderstood something, misremembered, or have even loaded a profile on http://localhost:4242 which I then have forgotten. So you can take my assertion with a grain of salt.

But I am curious, if it indeed was the case that I had a database of version 1 stored, when would that one have been created? When did we switch from version 1 to version 2? A cursory look at the frontend code tells me that that should've happened years ago, no?

Update: And just to be clear, by creating a new Firefox profile in this context, I mean running ./mach run -P and create and use a profile in that dialog box that appears.

I think Michael was hitting this problem on a database that was never even on version 1 of the database, and was only for version 2.

I ran git blame on our version change, and this happened back in 2017.

Wed Apr 19 00:27:53 2017

So I guess the situation was that the database was always running on version 2... It would be interesting if we had a failure case we could inspect more closely. I don't feel like my investigations are making things any clearer. I don't have a lot of experience with the IndexedDB API.

Flags: needinfo?(gtatum)

(In reply to Greg Tatum [:gregtatum] from comment #8)

So I guess the situation was that the database was always running on version 2...

Yes exactly, the possibility is that the db was already at v2, and the data was badly evicted because of a bug in Firefox.
If the profile was made recently, this would mean that Firefox still has this problem. But this is just a guess that may be false, I don't know really how to test this possbility to rule it out.

(In reply to Julien Wajsberg [:julienw] from comment #9)

(In reply to Greg Tatum [:gregtatum] from comment #8)

So I guess the situation was that the database was always running on version 2...

Yes exactly, the possibility is that the db was already at v2, and the data was badly evicted because of a bug in Firefox.

Jan, do you think this is what might have happened? (i.e. IIUC the eviction would have removed the object store, but not the database?)

If the profile was made recently, this would mean that Firefox still has this problem. But this is just a guess that may be false, I don't know really how to test this possbility to rule it out.

Flags: needinfo?(jvarga)
See Also: → 1650627
See Also: 1650627

Another instance of this problem was reported in bug 1652540 comment 15.

I quickly went through some old bugs and found a similar bug 1478602. Can it happen that indexeddb.open() is called multiple times in parallel ?

Flags: needinfo?(jvarga)

I looked at our code paths and unless I missed something I don't think this can happen in our case.

Severity: -- → S3
Priority: -- → P3
Status: NEW → ASSIGNED
Status: ASSIGNED → NEW

Hello from YouTube,

we enabled an IDB feature on firefox today and we are seeing 10000+ of such exceptions per hour. The exact reason why it happens in our code base is still under investigation. Although, we have a minimal working example that can reproduce this error with some extreme setup.

https://gist.github.com/zeroliu/76cc3077f4eea678f4955e7f748484c2

This code tries to interrupt db upgrade by reloading the page. In the background, the page keeps creating and deleting the db. Eventually, reload kicks in between open request starts and upgradeneeded event is fired. We end up having a database with no object stores. You don't have to call window.location.reload inside of the upgradeneeded callback. Moving it to the for-loop can also reproduce the error but it happens less frequently. Based on this behavior, I guess in our case a small set of users close the tab or navigate away from our site in the middle of idb creation.

In Chrome, the database will not be created if the upgradeneeded callback does not finish.

Andrew, can you help with comment 14? Thanks.

Flags: needinfo?(bugmail)

I am going to give this a look. I think I can reproduce the issue with Nightly using the provided test case! Thank you very much for providing that!

Status: NEW → ASSIGNED
Flags: needinfo?(bugmail) → needinfo?(sgiesecke)

I have a Pernosco session now reproducing this with a slightly adjusted version of the test case at https://pernos.co/debug/oVO1nbmBGSf8zgJG_6GC8A/index.html

I have a patch that fixes this behaviour now, with a mochitest closely based on the provided reproduction test case.

Flags: needinfo?(sgiesecke)
Blocks: 1669253
Pushed by sgiesecke@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0f0d637acf76
Ensure that IndexedDB operations are cancelled when nsGlobalWindowInner::FreeInnerObjects is called. r=dom-workers-and-storage-reviewers,janv
Backout by abutkovits@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/98c09e82b3e9
Backed out changeset 0f0d637acf76 for failures complaining about ActorsChild.cpp. CLOSED TREE

(In reply to Atila Butkovits from comment #22)

Backed out for high frequency bc failures complaining about indexedDB.

These failures only occur on asan builds when running netwerk/cookie/test/browser/browser_indexedDB.js

Pushed by sgiesecke@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8d5f1790048c
Ensure that IndexedDB operations are cancelled when nsGlobalWindowInner::FreeInnerObjects is called. r=dom-workers-and-storage-reviewers,janv
Attachment #9164092 - Attachment is obsolete: true
Blocks: 1669671

Sorry this caused another backout. I adapted the patch and did a try run: https://treeherder.mozilla.org/#/jobs?repo=try&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Cusercancel%2Crunning%2Cpending%2Crunnable&revision=ee3f0f16f453623e6ca535a955a1f586d5a5ddae&selectedTaskRun=SIu3dhgDRKeJIk7tlKaDzA.1

I triggered all Linux browser-chrome tests a second time, but I am not sure which one contains the failing test, it seems to be in a different slice than on autoland. I can't reproduce the problem locally either. Since the tests look good here, I will try another landing. If that should cause failures again, which I do not hope for, we need to find a better way to reproduce it.

Flags: needinfo?(sgiesecke)
Pushed by sgiesecke@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0b329aa2c4e3
Ensure that IndexedDB operations are cancelled when nsGlobalWindowInner::FreeInnerObjects is called. r=dom-workers-and-storage-reviewers,janv

Backed out for assertion failures on ActorsChild.cpp

backout: https://hg.mozilla.org/integration/autoland/rev/053c19428efaa36a8e0279261813a6d143112740

push: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=0b329aa2c4e3bf684924e8cd87ccc9d8cffef1ab&group_state=expanded

push where fails appeared: https://treeherder.mozilla.org/#/jobs?repo=autoland&selectedTaskRun=TGrFdIEER3agIuyApij3qA.0&revision=2f38e26a143b8204abe0839a2818e1b71cb9688e&searchStr=windows%2C10%2Cx64%2Cdebug%2Cmochitests%2Ctest-windows10-64%2Fdebug-mochitest-browser-chrome-e10s%2Cbc3

failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=317944072&repo=autoland&lineNumber=15534

[task 2020-10-07T20:01:57.407Z] 20:01:57 INFO - GECKO(4664) | Assertion failure: !mOpenRequestActor, at /builds/worker/checkouts/gecko/dom/indexedDB/ActorsChild.cpp:1589
[task 2020-10-07T20:01:57.409Z] 20:01:57 INFO - GECKO(4664) | [Parent 3648, GMPThread] WARNING: Failed to delete GMP storage directory: file /builds/worker/checkouts/gecko/dom/media/gmp/GMPServiceParent.cpp:1553
[task 2020-10-07T20:01:57.469Z] 20:01:57 INFO - GECKO(4664) | [Parent 3648, QuotaManager IO] WARNING: QuotaManager failure: 'OkIf(outputStream)', file ActorsParent.cpp:2522
[task 2020-10-07T20:01:57.469Z] 20:01:57 INFO - GECKO(4664) | [Parent 3648, QuotaManager IO] WARNING: QuotaManager failure: 'GetBinaryOutputStream(file, kUpdateFileFlag)', file ActorsParent.cpp:8353
[task 2020-10-07T20:01:57.471Z] 20:01:57 INFO - GECKO(4664) | [Parent 3648, QuotaManager IO] WARNING: 'NS_FAILED(rv)', file /builds/worker/checkouts/gecko/dom/quota/ActorsParent.cpp:8217
[task 2020-10-07T20:01:57.471Z] 20:01:57 INFO - GECKO(4664) | [Parent 3648, QuotaManager IO] WARNING: 'NS_FAILED(rv)', file /builds/worker/checkouts/gecko/dom/quota/ActorsParent.cpp:8127
[task 2020-10-07T20:01:57.471Z] 20:01:57 INFO - GECKO(4664) | [Parent 3648, Jump List] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80520012 (NS_ERROR_FILE_NOT_FOUND): file /builds/worker/checkouts/gecko/widget/windows/WinUtils.cpp:1394
[task 2020-10-07T20:01:57.489Z] 20:01:57 INFO - GECKO(4664) | [Parent 3648, Main Thread] WARNING: Suboptimal indexes for the SQL statement 0x234b7d55380 (http://mzl.la/1FuID0j).: file /builds/worker/checkouts/gecko/storage/mozStoragePrivateHelpers.cpp:113
[task 2020-10-07T20:01:57.508Z] 20:01:57 INFO - Initializing stack-fixing for the first stack frame, this may take a while...
[task 2020-10-07T20:02:09.666Z] 20:02:09 INFO - GECKO(4664) | #01: mozilla::dom::indexedDB::BackgroundFactoryRequestChild::Recv__delete__(mozilla::dom::indexedDB::FactoryRequestResponse const&) [dom/indexedDB/ActorsChild.cpp:1433]
[task 2020-10-07T20:02:09.666Z] 20:02:09 INFO - GECKO(4664) | #02: mozilla::dom::indexedDB::PBackgroundIDBFactoryRequestChild::OnMessageReceived(IPC::Message const&) [s3:gecko-generated-sources:32d2c26adbc21345dcc4049445c3c40c29d2b6b83e49ac66df2a7044cc3e0030cfa2409f27247ff6ab417a2e07685f6ac2f8665a4b26e615e165b570b6952cde/ipc/ipdl/PBackgroundIDBFactoryRequestChild.cpp::124]
[task 2020-10-07T20:02:09.667Z] 20:02:09 INFO - GECKO(4664) | #03: mozilla::ipc::PBackgroundChild::OnMessageReceived(IPC::Message const&) [s3:gecko-generated-sources:00136e3caf26c4b7c1e02c7d7b5c085f78c5c635ada1f4b4ddbb1f6a4ef65fe4154430e47012562c0e0ce77c38fc9103b2b15d160520930d739c0ef793d6930f/ipc/ipdl/PBackgroundChild.cpp::6080]
[task 2020-10-07T20:02:09.667Z] 20:02:09 INFO - GECKO(4664) | #04: mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecycleProxy
, IPC::Message const&) [ipc/glue/MessageChannel.cpp:2151]
[task 2020-10-07T20:02:09.668Z] 20:02:09 INFO - GECKO(4664) | #05: mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&) [ipc/glue/MessageChannel.cpp:2074]
[task 2020-10-07T20:02:09.668Z] 20:02:09 INFO - GECKO(4664) | #06: mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::MessageChannel::MessageTask&) [ipc/glue/MessageChannel.cpp:1923]
[task 2020-10-07T20:02:09.668Z] 20:02:09 INFO - GECKO(4664) | #07: mozilla::ipc::MessageChannel::MessageTask::Run() [ipc/glue/MessageChannel.cpp:1955]
[task 2020-10-07T20:02:09.668Z] 20:02:09 INFO - GECKO(4664) | #08: mozilla::SchedulerGroup::Runnable::Run() [xpcom/threads/SchedulerGroup.cpp:146]
[task 2020-10-07T20:02:09.668Z] 20:02:09 INFO - GECKO(4664) | #09: mozilla::RunnableTask::Run() [xpcom/threads/TaskController.cpp:246]
[task 2020-10-07T20:02:09.668Z] 20:02:09 INFO - GECKO(4664) | #10: mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex &> const&) [xpcom/threads/TaskController.cpp:515]
[task 2020-10-07T20:02:09.669Z] 20:02:09 INFO - GECKO(4664) | #11: mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex &> const&) [xpcom/threads/TaskController.cpp:374]
[task 2020-10-07T20:02:09.669Z] 20:02:09 INFO - GECKO(4664) | #12: mozilla::TaskController::ProcessPendingMTTask(bool) [xpcom/threads/TaskController.cpp:171]
[task 2020-10-07T20:02:09.670Z] 20:02:09 INFO - GECKO(4664) | #13: mozilla::detail::RunnableFunction<`lambda at /builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp:85:7'>::Run() [xpcom/threads/nsThreadUtils.h:578]
[task 2020-10-07T20:02:09.670Z] 20:02:09 INFO - GECKO(4664) | #14: nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:1146]
[task 2020-10-07T20:02:09.670Z] 20:02:09 INFO - GECKO(4664) | #15: NS_ProcessNextEvent(nsIThread*, bool) [xpcom/threads/nsThreadUtils.cpp:513]
[task 2020-10-07T20:02:09.670Z] 20:02:09 INFO - GECKO(4664) | #16: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) [ipc/glue/MessagePump.cpp:87]
[task 2020-10-07T20:02:09.671Z] 20:02:09 INFO - GECKO(4664) | #17: MessageLoop::RunHandler() [ipc/chromium/src/base/message_loop.cc:328]
[task 2020-10-07T20:02:09.671Z] 20:02:09 INFO - GECKO(4664) | #18: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:310]
[task 2020-10-07T20:02:09.671Z] 20:02:09 INFO - GECKO(4664) | #19: nsBaseAppShell::Run() [widget/nsBaseAppShell.cpp:139]
[task 2020-10-07T20:02:09.671Z] 20:02:09 INFO - GECKO(4664) | #20: nsAppShell::Run() [widget/windows/nsAppShell.cpp:602]
[task 2020-10-07T20:02:09.672Z] 20:02:09 INFO - GECKO(4664) | #21: XRE_RunAppShell() [toolkit/xre/nsEmbedFunctions.cpp:913]
[task 2020-10-07T20:02:09.672Z] 20:02:09 INFO - GECKO(4664) | #22: mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) [ipc/glue/MessagePump.cpp:237]
[task 2020-10-07T20:02:09.672Z] 20:02:09 INFO - GECKO(4664) | #23: MessageLoop::RunHandler() [ipc/chromium/src/base/message_loop.cc:328]
[task 2020-10-07T20:02:09.672Z] 20:02:09 INFO - GECKO(4664) | #24: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:310]
[task 2020-10-07T20:02:09.672Z] 20:02:09 INFO - GECKO(4664) | #25: XRE_InitChildProcess(int, char**, XREChildData const*) [toolkit/xre/nsEmbedFunctions.cpp:748]
[task 2020-10-07T20:02:09.752Z] 20:02:09 INFO - GECKO(4664) | #26: NS_internal_main(int, char**, char**) [browser/app/nsBrowserApp.cpp:304]
[task 2020-10-07T20:02:09.752Z] 20:02:09 INFO - GECKO(4664) | #27: wmain(int, wchar_t**) [toolkit/xre/nsWindowsWMain.cpp:131]
[task 2020-10-07T20:02:09.752Z] 20:02:09 INFO - GECKO(4664) | #28: __scrt_common_main_seh() [/builds/worker/workspace/obj-build/browser/app/f:/dd/vctools/crt/vcstartup/src/startup/exe_common.inl:288]

Flags: needinfo?(sgiesecke)
Flags: needinfo?(sgiesecke)
Pushed by sgiesecke@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/77e5047d6eb2
Ensure that IndexedDB operations are cancelled when nsGlobalWindowInner::FreeInnerObjects is called. r=dom-workers-and-storage-reviewers,janv
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 83 Branch

Based on the minimal test case you provided (thanks a lot again for that!), we fixed the issue now. Could you try with a current Nightly if it also fixes the original issue you experienced?

Flags: needinfo?(zeroliu)

Thanks very much for the quick fix!

Unfortunately, I don't think there is an easy way for us to verify it within YouTube before the fix rolls out to production. We don't have a reliable way to reproduce the error in the application. Do you know which FF version will start to include this fix? I will keep monitoring the exceptions and keep you updated whether it still appears in the latest FF.

Flags: needinfo?(zeroliu)
Flags: needinfo?(sgiesecke)

Comment on attachment 9179225 [details]
Bug 1653276 - Ensure that IndexedDB operations are cancelled when nsGlobalWindowInner::FreeInnerObjects is called. r=#dom-workers-and-storage

Beta/Release Uplift Approval Request

  • User impact if declined: Users may be left in a situation with corrupted object stores when a reload occurred during a version change transaction. Depending on the site, this may or may not be an obvious problem, and resolving the situation requires manually deleting the database, which requires advanced user knowledge.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Medium
  • Why is the change risky/not risky? (and alternatives if risky): Theoretically, this might lead to follow-up issues due to the changed control flow. However, the changes only affect error situations, so the risk for normal operations is low.
  • String changes made/needed:

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: We want to avoid accumulating more broken profiles which require manual repair. See user impact.
  • User impact if declined: Users may be left in a situation with corrupted object stores when a reload occurred during a version change transaction. Depending on the site, this may or may not be an obvious problem, and resolving the situation requires manually deleting the database, which requires advanced user knowledge.
  • Fix Landed on Version: 83 (or 82 if uplift granted)
  • Risk to taking this patch: Medium
  • Why is the change risky/not risky? (and alternatives if risky): Theoretically, this might lead to follow-up issues due to the changed control flow. However, the changes only affect error situations, so the risk for normal operations is low.
  • String or UUID changes made by this patch:
Flags: needinfo?(sgiesecke)
Attachment #9179225 - Flags: approval-mozilla-esr78?
Attachment #9179225 - Flags: approval-mozilla-beta?

(In reply to Xiyuan Liu from comment #33)

Thanks very much for the quick fix!

Unfortunately, I don't think there is an easy way for us to verify it within YouTube before the fix rolls out to production. We don't have a reliable way to reproduce the error in the application. Do you know which FF version will start to include this fix? I will keep monitoring the exceptions and keep you updated whether it still appears in the latest FF.

At the latest, the fix will be contained in Firefox 83. However, I requested beta uplift now. If that is granted, it will be contained in Firefox 82.

Comment on attachment 9179225 [details]
Bug 1653276 - Ensure that IndexedDB operations are cancelled when nsGlobalWindowInner::FreeInnerObjects is called. r=#dom-workers-and-storage

we already built the 82 release candidate, this doesn't sound appropriate this late

Attachment #9179225 - Flags: approval-mozilla-beta? → approval-mozilla-beta-

Comment on attachment 9179225 [details]
Bug 1653276 - Ensure that IndexedDB operations are cancelled when nsGlobalWindowInner::FreeInnerObjects is called. r=#dom-workers-and-storage

approved for 78.5

Attachment #9179225 - Flags: approval-mozilla-esr78? → approval-mozilla-esr78+

This needs a bit of rebasing for ESR78.

Flags: needinfo?(sgiesecke)

(In reply to Ryan VanderMeulen [:RyanVM] from comment #38)

This needs a bit of rebasing for ESR78.

I rebased the patch for ESR78.

Flags: needinfo?(sgiesecke)

I believe I just hit this problem. It persistently prevented me from using the profiler until I used the devtools to remove the db.

I'm on Windows 10, using Nightly 84.0a1 (2020-10-26) (64-bit)

Just checked the repo snippet with the FF 89 today. The error can still be reproduced. The problem can also be discovered in our error logs in prod.

(In reply to Xiyuan Liu from comment #43)

Just checked the repo snippet with the FF 89 today. The error can still be reproduced. The problem can also be discovered in our error logs in prod.

Can you clarify if if the implication is that these are a) newly broken databases or b) broken databases which could be old or new and it's not clear. If it's the latter, it might make sense to prioritize bug 1669253 first.

Flags: needinfo?(zeroliu)

For context, we have an internal recovery system that re-creates user's database if a claimed object store does not exist in the objectStoreNames list. The check is run when we open a database connection, regardless of whether the database is new or old. We recently add a new log to verify whether this code path is still called and FF is still hitting it.

It's hard to track whether this happens to existing databases or new ones. I think it's more likely to be new ones since the reported repo steps interrupts upgrade methods, which only runs for new databases.

Flags: needinfo?(zeroliu)

Are you still seeing this happen in your logs?

Flags: needinfo?(zeroliu)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: