Thunderbird Crash in [@ mozilla::ipc::PrincipalInfoToPrincipal]
Categories
(Core :: DOM: Service Workers, defect, P2)
Tracking
()
People
(Reporter: wsmwk, Unassigned)
References
Details
(Keywords: crash, topcrash-thunderbird, Whiteboard: [tbird topcrash])
Crash Data
#5 ranked crash for beta.
Might have been a big upswing in the beta 65 era, but hard to say, because we don't have crash data from 2018 https://crash-stats.mozilla.com/signature/?product=Thunderbird&signature=mozilla%3A%3Aipc%3A%3APrincipalInfoToPrincipal&date=%3E%3D2019-01-08T13%3A23%3A00.000Z&date=%3C2019-07-08T13%3A23%3A00.000Z#summary example: bp-d426a576-f8f3-43a2-a98b-5ec220190208 build ID 20190123180341
This bug is for crash report bp-e4dd235a-703d-4109-bd80-02d3a0190708.
Top 10 frames of crashing thread:
0 xul.dll mozilla::ipc::PrincipalInfoToPrincipal ipc/glue/BackgroundUtils.cpp:98
1 xul.dll mozilla::dom::ClientInfo::GetPrincipal dom/clients/manager/ClientInfo.cpp:96
2 xul.dll nsGlobalWindowInner::EnsureClientSource dom/base/nsGlobalWindowInner.cpp:1721
3 xul.dll nsGlobalWindowInner::ExecutionReady dom/base/nsGlobalWindowInner.cpp:1802
4 xul.dll nsGlobalWindowOuter::SetNewDocument dom/base/nsGlobalWindowOuter.cpp:2224
5 xul.dll nsDocumentViewer::InitInternal layout/base/nsDocumentViewer.cpp:966
6 xul.dll nsDocumentViewer::Init layout/base/nsDocumentViewer.cpp:717
7 xul.dll nsDocShell::SetupNewViewer docshell/base/nsDocShell.cpp:8446
8 xul.dll nsDocShell::Embed docshell/base/nsDocShell.cpp:6355
9 xul.dll nsDocShell::CreateContentViewer docshell/base/nsDocShell.cpp:8261
bp-1239c33e-94c5-414b-967a-a2f390190707 another beta
Reporter | ||
Updated•5 years ago
|
Comment 1•5 years ago
|
||
Heaps of Firefox crashes in the reports. Why don't we ship this off to Firefox and ask people who work on DOM IPC like Nika?
Reporter | ||
Comment 2•5 years ago
|
||
Hopefully that will help. But note, the Firefox crashes don't an increase at the same time as the Thunderbird crashes. And the Firefox crash rate on a per user basis is nowhere near as high as Thunderbird.
Comment 3•5 years ago
|
||
The specific case here is a deserialization error when decoding a codebase principal. In effect what this code is doing is something like this (pseudocode):
// sending side
// true == aPrincipal->GetIsCodebasePrincipal();
nsCString spec = aPrincipal->GetURI()->GetSpec();
nsCString originNoSuffix = aPrincipal->GetOriginNoSuffix();
OriginAttributes attrs = aPrincipal->GetOriginAttributes();
// receiving side
nsCOMPtr<nsIPrincipal> principal =
BasePrincipal::CreateCodebasePrincipal(NS_NewURI(spec), attrs);
MOZ_ASSERT(principal->GetOriginNoSuffix() == originNoSuffix);
This assertion can fail if a couple of invariants aren't held, such as:
- Round-tripping a URI to its spec is lossy - if this is the case, a different URI may arrive at the other side, and thus the wrong origin is selected
- Getting the
originNoSuffix
value for a URI can vary over time.
In this code's case, the time when the PrincipalInfo is deserialized can be far in time from when it was serialized, meaning that it's possible that dependencies on ambient state could change things. So anything which depends on ambient mutable state could change here: https://searchfox.org/mozilla-central/rev/da3f3eaaacb6fb344fd21ac29ace2da0e33f12d3/caps/ContentPrincipal.cpp#101
I see a few places where we depend on local state there (beyond the more direct issues like protocols changing their ORIGIN_IS_FULL_SPEC
flag bit):
- If a protocol dynamically changes information (such as the
ORIGIN_IS_FULL_SPEC
flag), the origin can change. - If the
security.fileuri.strict_origin_policy
pref changes, it will change the principals of file URIs. - Loading blob principal information. This requires reading from the ambient blob URL state. If the blob has been revoked this method may change the principal which is returned from a content principal to a null principal, causing an assertion. https://searchfox.org/mozilla-central/rev/da3f3eaaacb6fb344fd21ac29ace2da0e33f12d3/caps/ContentPrincipal.cpp#182-186
Not sure which issue thunderbird is running into in particular, but ni? :baku to take a look at the blob URI situation.
Comment 4•5 years ago
|
||
Thanks, Nika. I didn't see that we MOZ_CRASH() on "Origin must be available when deserialized" here:
https://searchfox.org/mozilla-central/rev/da3f3eaaacb6fb344fd21ac29ace2da0e33f12d3/ipc/glue/BackgroundUtils.cpp#98
I guess this needs closer looking at by a backend guy.
Reporter | ||
Comment 5•5 years ago
|
||
Still #5 crash for Thunderbird
For Thunderbird this goes back to at least version 66 (but I say this without having compared stacks across versions)
bp-f26af196-4f89-4afd-8e23-de0dd0190318 win10 Crash Address 0x10376263
bp-fedbe70c-6d58-4187-974b-68d510190313 mac Crash Address 0x0
bp-463ca1a3-2d5a-4251-a893-155700190225 win10 Crash Address 0x52e85d93
More recent - version 69
bp-5af27bfd-58e3-4564-8b07-0616d0190822 win7 Crash Address 0x7fece1b810e
bp-43575dee-a6c7-4d46-9618-3a5030190807 Mac
Reporter | ||
Updated•5 years ago
|
Reporter | ||
Comment 6•5 years ago
|
||
Do we really expect bug 1578458 to help in the Thunderbird case?
Reporter | ||
Comment 7•4 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #6)
Do we really expect bug 1578458 to help in the Thunderbird case?
bug 1578458 is set at P5 (not very encouraging), plus bug 1578458 comment 13 "Looking at crash-stats, the crash is hit in the field from a reasonable variety of callers - indicating the calling code here is buggy in quite a few places."
Given that comment, are we not at fault?
Comment 8•4 years ago
|
||
Looking at e.g. bp-e5752ece-4367-438e-9b45-a88bf0200107 probably not c-c at fault. We have an m_channelListener and go from there. If something's wrong in c-c, I don't know what it is.
Reporter | ||
Updated•4 years ago
|
Comment 9•4 years ago
|
||
In the Firefox crashes, such as https://crash-stats.mozilla.org/report/index/2cb1e2b9-ce2a-41f9-8953-fbf030200309 , I see a HTTP cookie related signature, where the crash comes from
https://searchfox.org/mozilla-central/source/netwerk/cookie/CookieJarSettings.cpp#281
That doesn't seem like a blob URL concern, which is what the present needinfo is about. Maybe it's a different bug.
Either way, it's unclear to me why the bug is in this component. The Thunderbird crashes implicate code that was added for Service Workers and the Firefox crashes implicate cookie code. Sending over to Service Workers for now.
Comment 10•4 years ago
|
||
(In reply to Henri Sivonen (:hsivonen) from comment #9)
The Thunderbird crashes implicate code that was added for Service Workers.
I cannot see this, at least in the 4-5 crashes I clicked on.
And we have several bugs talking about PrincipalInfoToPrincipal crashes, all assigned to different components. Probably someone should just dig into it regardless of component ownership.
Comment 11•4 years ago
|
||
Maybe mozilla::ipc::PrincipalInfoToPrincipal should be added to the Socorro prefix list? However, I have no idea at all how the crash reporting works for Thunderbird. In the crash stats table above, Thunderbird isn't mentioned at all.
Comment 12•4 years ago
|
||
Following the link from "Signature" in the bug metadata, the resulting list is mostly Thunderbird.
Comment 13•4 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #10)
(In reply to Henri Sivonen (:hsivonen) from comment #9)
The Thunderbird crashes implicate code that was added for Service Workers.
I cannot see this, at least in the 4-5 crashes I clicked on.
I was referring to mozilla::dom::ClientInfo
as code added for Service Workers. E.g.:
https://crash-stats.mozilla.org/report/index/9b7eb840-e904-45ab-9758-d76450200309
Reporter | ||
Comment 14•4 years ago
|
||
(In reply to Simon Giesecke [:sg] [he/him] from comment #11)
Maybe mozilla::ipc::PrincipalInfoToPrincipal should be added to the Socorro prefix list?
If doing so presents a more accurate crash signature I would support doing this.
However, I have no idea at all how the crash reporting works for Thunderbird.
Works the same as Firefox
Comment 15•4 years ago
|
||
(In reply to Henri Sivonen (:hsivonen) from comment #13)
I was referring to
mozilla::dom::ClientInfo
as code added for Service Workers. E.g.:
https://crash-stats.mozilla.org/report/index/9b7eb840-e904-45ab-9758-d76450200309
I see. Digging a bit further I stumbled over:
#ifdef FUZZING
return nullptr;
#else
MOZ_CRASH("Origin must be available when deserialized");
#endif /* FUZZING */
introduced by bug 1598472. So if we can safely return nullptr in case of fuzzing, probably we should do so also normally? Or remove this and try to reduce a testcase with the fuzzer if we want to enforce this invariant ?
Updated•4 years ago
|
Comment 16•4 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #15)
introduced by bug 1598472. So if we can safely return nullptr in case of fuzzing, probably we should do so also normally?
I think it doesn't mean that returning nullptr
is safe--just a level of unsafety that fuzzing is willing to overlook in order to find out things. It could be bad to return nullptr
in the wild.
Comment 17•4 years ago
|
||
(In reply to Henri Sivonen (:hsivonen) from comment #16)
(In reply to Jens Stutte [:jstutte] from comment #15)
introduced by bug 1598472. So if we can safely return nullptr in case of fuzzing, probably we should do so also normally?
I think it doesn't mean that returning
nullptr
is safe--just a level of unsafety that fuzzing is willing to overlook in order to find out things. It could be bad to returnnullptr
in the wild.
We've hit this before in fuzzing, that is why the code is as it is, but recent discussions have concluded that this code should not MOZ_CRASH
at all (via IPC, because it causes instabilities like this one and essentially undos the process separation). So yes, we probably want to return nullptr
all the time, but we need to check that callers handle this properly.
Comment 18•4 years ago
|
||
(In reply to Henri Sivonen (:hsivonen) from comment #16)
this code should notMOZ_CRASH
at all (via IPC, because it causes instabilities like this one and essentially undos the process separation).
Perry, what do you think? I totally second that code called via IPC should never crash regardless of outer order invariants violated or not. Do you suspect side effects?
Comment 19•4 years ago
|
||
That makes sense to me if it's the case that de-serialization of the PrincipalInfo is failing. Not crashing to maintain process separation is a good point too. I don't suspect side effects as long as the callers check if PrincipalInfoToPrincipal fails, and from a quick spot check it looks like most but not all do check this.
Comment 20•4 years ago
|
||
I think the rationale for crashing was that a mismatch in origins coming from a content process suggests the content process has been compromised (see https://bugzilla.mozilla.org/show_bug.cgi?id=1340163#c37 and https://bugzilla.mozilla.org/show_bug.cgi?id=1340163#c55). In general, the strategy for such cases has been to terminate the potentially rogue content process, not the parent process. However, the code in question didn't/doesn't have easy access to the actor that provided the PrincipalInfo, so it just opted to crash the parent process.
Comment 21•4 years ago
|
||
I should also note that with bug 922464 fixed via bug 1536744, it's not clear we ever need to ship the origin separately in PrincipalInfo anymore since it should be able to be derived on any thread. (A motivation for PBackground APIs previously was that they couldn't parse URIs off the main thread and so needed to have the origin pre-computed, but this also needed to be enforced by fatal assertions for security reasons, as it would be very nice for an attacker to say the URI was evil.com but the origin was good.com.) It's mainly just a question of actually performing the cleanup at this point.
Reporter | ||
Comment 22•4 years ago
|
||
Would it surprise anyone if the signature is gone for Thunderbird after 77beta buildid 20200511021338, i.e. 77.0b2? Half the crashes via nsImapCacheStreamListener and half via nsMailboxProtocol::ReadMessageResponse
No crashes with 77.0b3, built 5/18/2020 - comm-beta range https://hg.mozilla.org/releases/comm-beta/pushloghtml?fromchange=THUNDERBIRD_77_0b2_RELEASE&tochange=THUNDERBIRD_77_0b3_RELEASE&full=1
Comment 23•4 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #22)
Would it surprise anyone if the signature is gone for Thunderbird after 77beta buildid 20200511021338, i.e. 77.0b2? Half the crashes via nsImapCacheStreamListener and half via nsMailboxProtocol::ReadMessageResponse
Not if those builds include the fix from bug 1635399 which removed all calls to MOZ_CRASH from PrincipalInfoToPrincipal in favor of propagating errors to the callers and for them to handle things.
Reporter | ||
Updated•3 years ago
|
Description
•