Crash in [@ mozilla::RemoteLazyInputStreamThread::Dispatch]
Categories
(Core :: DOM: File, defect, P2)
Tracking
()
People
(Reporter: sefeng, Assigned: jstutte)
References
Details
(Keywords: crash)
Crash Data
Attachments
(2 files)
Crash report: https://crash-stats.mozilla.org/report/index/a3338930-499c-4c52-97a3-7bc6e0220622
Reason: SIGSEGV / SEGV_MAPERR
Top 10 frames of crashing thread:
0 libxul.so mozilla::RemoteLazyInputStreamThread::Dispatch dom/file/ipc/RemoteLazyInputStreamThread.cpp:149
1 libxul.so mozilla::dom::LSObject::DoRequestSynchronously dom/localstorage/LSObject.cpp:867
2 libxul.so mozilla::dom::LSObject::EnsureDatabase dom/localstorage/LSObject.cpp:920
3 libxul.so mozilla::dom::LSObject::SetItem dom/localstorage/LSObject.cpp:614
4 libxul.so mozilla::dom::Storage_Binding::setItem dom/bindings/StorageBinding.cpp:233
5 libxul.so bool mozilla::dom::binding_detail::GenericMethod<mozilla::dom::binding_detail::NormalThisPolicy, mozilla::dom::binding_detail::ThrowExceptions> dom/bindings/BindingUtils.cpp:3272
6 libxul.so Interpret js/src/vm/Interpreter.cpp:3325
7 libxul.so js::Call js/src/vm/Interpreter.cpp:606
8 libxul.so js::fun_call js/src/vm/JSFunction.cpp:971
9 libxul.so Interpret js/src/vm/Interpreter.cpp:3325
Firefox 100 seems to be the starting point. All the crashes has 0x0000000000000000
as the address.
Comment 1•2 years ago
|
||
Note that the vast majority of crashes seem to come from a single user.
Comment 2•2 years ago
|
||
Lots of different users in more recent crashes (the spike was one user I think). Looks like mThread is null for some reason.
mThread should only be null if a) we haven't called ::Initialize() yet, or b) if we observe thread-shutdown. We take a mutex lock when doing this, and on shutdown set gShutdownHasStarted to true (this is a true global).
In Dispatch(), which is where the crashes are, we take the mutex and check if shutdown has started (and if so return an error). That's not happening, so shutdown must not have happened, so we must not have called Initialize().
However, the LS code gets the RemoteLazyInputStreamThread using ::GetOrCreate(), which always calls Initialize() before returning RemoteLazyInputStreamThread.
So, how can it be null? Note also that there are hits from other places, though most are from LSObject. (There are a few from Blob)
https://crash-stats.mozilla.org/report/index/764ca895-66ef-4082-8450-b680b0220719
https://crash-stats.mozilla.org/report/index/6373bcc3-174a-4e28-aa43-6e13f0220701
Ideas? Frequency has gone up since spring, even discounting the spike
Assignee | ||
Comment 3•2 years ago
•
|
||
As a side note: RemoteLazyInputStreamThread::GetOrCreate()
can fail and return nullptr
itself.
I see at least one place where we do not check the returned thread object before dispatch which might be a way to get to this situation?
I see the following places with this pattern:
- https://searchfox.org/mozilla-central/rev/2bbb0c0a90df20702df8c8011a8996536a83cb75/dom/file/ipc/RemoteLazyInputStream.cpp#162-163
- https://searchfox.org/mozilla-central/rev/2bbb0c0a90df20702df8c8011a8996536a83cb75/dom/file/ipc/RemoteLazyInputStream.cpp#767-768
- https://searchfox.org/mozilla-central/rev/2bbb0c0a90df20702df8c8011a8996536a83cb75/dom/file/ipc/RemoteLazyInputStream.cpp#1181-1182
- https://searchfox.org/mozilla-central/rev/2bbb0c0a90df20702df8c8011a8996536a83cb75/dom/file/ipc/RemoteLazyInputStream.cpp#1286-1287
Comment 4•2 years ago
|
||
If GetOrCreate returns nullptr, we won't be calling domFileThread->Dispatch() (and if we did it would crash there).
We do have a few cases which crash in BindChildActor (called from the Blob code IIRC); so the missing check may well be the cause of those crashes.
https://crash-stats.mozilla.org/signature/?proto_signature=~BindChild&signature=mozilla%3A%3ARemoteLazyInputStreamThread%3A%3ADispatch&date=%3E%3D2022-01-28T14%3A07%3A00.000Z&date=%3C2022-07-28T14%3A07%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_columns=startup_crash&_sort=-date&page=1
Comment 5•2 years ago
|
||
There are no crashes for the other GetOrCreate() missing result checks, though we should fix all of them
Comment 6•2 years ago
|
||
As jens noted, it is theoretically possible for us to fail to create a thread with NS_NewNamedThread
which would lead to this crash as GetOrCreate
would fail. It doesn't appear that we're during shutdown here, so I'm guessing it's not caused by gShutdownHasStarted
being set meaning that seems like the most likely answer.
In the case of https://crash-stats.mozilla.org/report/index/764ca895-66ef-4082-8450-b680b0220719 it appears that this could be a null thread which wasn't checked, and it's one of the cases :jstutte mentioned.
In the case of https://crash-stats.mozilla.org/report/index/6373bcc3-174a-4e28-aa43-6e13f0220701 it interestingly looks a bit different. thread
is clearly non-nullptr here because it's checked (https://searchfox.org/mozilla-central/rev/2bbb0c0a90df20702df8c8011a8996536a83cb75/dom/file/ipc/RemoteLazyInputStream.cpp#1082-1084,1086), but we're still crashing somewhere.
A bit of me wonders if this might be connected to the slightly janky implementation of GetOrCreate
which will if the thread doesn't exist, attempt to create & initialize it, but won't clear it if initialization fails meaning it will return the created but uninitilized thread next time, potentially crashing elsewhere (e.g. when actually accessing the mThread
member).
I would generally just have changed RemoteLazyInputStreamThread
to using a TaskQueue
rather than a thread if it wasn't for the fact that it's also used outside of RemoteLazyINputStream
due to its history as the "DOM File thread" (https://searchfox.org/mozilla-central/rev/2bbb0c0a90df20702df8c8011a8996536a83cb75/dom/localstorage/LSObject.cpp#1068, https://searchfox.org/mozilla-central/rev/2bbb0c0a90df20702df8c8011a8996536a83cb75/dom/localstorage/LocalStorageManager2.cpp#477, https://searchfox.org/mozilla-central/source/dom/file/TemporaryFileBlobImpl.cpp#71). Some of those look like they'd be simple enough to move to a normal background task (e.g. https://searchfox.org/mozilla-central/rev/2bbb0c0a90df20702df8c8011a8996536a83cb75/dom/file/TemporaryFileBlobImpl.cpp#69-79 could probably just be dispatched as a background task, though I'm less certain about the LocalStorageManager2
and LSObject
types as I'm not as familiar with them, but they seem to start PBackground
so probably can't use a task queue yet.
Updated•2 years ago
|
Comment 7•2 years ago
|
||
A bit of me wonders if this might be connected to the slightly janky implementation of GetOrCreate which will if the thread doesn't exist, attempt to create & initialize it, but won't clear it if initialization fails meaning it will return the created but uninitilized thread next time, potentially crashing elsewhere (e.g. when actually accessing the mThread member).
Aha... That would fit - but there are still issues with this theory. On Initialize() failure it returns nullptr but leaves gRemoteLazyThread set, which sets us up for this crash on any future call to GetOrCreate(). Good so far... but what causes Initialize() to fail? The only thing is if NS_NewNamedThread() fails. That will only happen if we can't create the thread, or if attempting to Dispatch() an initial event fails -- and Initialize() doesn't supply an initial event. We can fail to create if we're late in shutdown (mInitialized is false), or if PR_CreateThread() fails.
It seems unlikely that we could be that far in shutdown when this happens from the reports, and the reports don't imply any OOM condition that would cause CreateThread() to fail. So we're back to "how could Initialize() fail here?" I don't see that it would.
That said, GetOrCreate absolutely should clear gRemoteLazyThread on Initialize() failure.
Assignee | ||
Comment 8•2 years ago
|
||
I think this is a duplicate of bug 1788368.
Assignee | ||
Comment 9•2 years ago
|
||
(In reply to Randell Jesup [:jesup] (needinfo me) from comment #7)
A bit of me wonders if this might be connected to the slightly janky implementation of GetOrCreate which will if the thread doesn't exist, attempt to create & initialize it, but won't clear it if initialization fails meaning it will return the created but uninitilized thread next time, potentially crashing elsewhere (e.g. when actually accessing the mThread member).
That said, GetOrCreate absolutely should clear gRemoteLazyThread on Initialize() failure.
(In reply to Jens Stutte [:jstutte] from comment #8)
I think this is a duplicate of bug 1788368.
The fix there did not cover this specific aspect, we need an additional patch here.
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 10•2 years ago
|
||
Updated•2 years ago
|
Assignee | ||
Comment 11•2 years ago
|
||
Depends on D156395
Comment 12•2 years ago
|
||
Assignee | ||
Updated•2 years ago
|
Comment 13•2 years ago
|
||
bugherder |
Comment 14•2 years ago
|
||
Assignee | ||
Updated•2 years ago
|
Comment 15•2 years ago
|
||
bugherder |
Updated•2 years ago
|
Description
•