Crash in [@ mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | mozilla::dom::PContentParent::OnMessageReceived] from deserializing UntrustedModulesData
Categories
(External Software Affecting Firefox :: Telemetry, defect)
Tracking
(firefox-esr68 unaffected, firefox71 unaffected, firefox72+ fixed, firefox73+ fixed)
Tracking | Status | |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox71 | --- | unaffected |
firefox72 | + | fixed |
firefox73 | + | fixed |
People
(Reporter: pascalc, Assigned: aklotz)
References
(Regression)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(5 files, 1 obsolete file)
2.83 KB,
text/plain
|
chutten
:
data-review+
|
Details |
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta+
|
Details | Review |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
lizzard
:
approval-mozilla-beta+
|
Details | Review |
[Tracking Requested - why for this release]:
This bug is for crash report bp-a0d9cbfa-ea1d-463a-86b0-e0dfe0191213.
Top 10 frames of crashing thread:
0 xul.dll mozilla::ipc::FatalError ipc/glue/ProtocolUtils.cpp:165
1 xul.dll mozilla::ipc::IProtocol::HandleFatalError ipc/glue/ProtocolUtils.cpp:404
2 xul.dll mozilla::dom::PContentParent::OnMessageReceived ipc/ipdl/PContentParent.cpp:6312
3 xul.dll void mozilla::ipc::MessageChannel::DispatchMessage ipc/glue/MessageChannel.cpp:2131
4 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1256
5 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:486
6 xul.dll void mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:109
7 xul.dll void MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:308
8 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:290
9 xul.dll nsBaseAppShell::Run widget/nsBaseAppShell.cpp:137
This signature is spiking on nightly over the last 3 days.
Comment 1•1 year ago
|
||
I looked at about 10 of these crash reports and they all had an IPC fatal error message of "Error deserializing 'UntrustedModulesData?'".
Updated•1 year ago
|
Updated•1 year ago
|
Comment 2•1 year ago
|
||
I don't know this code at all but the ParamTraits look pretty accurate to me. There is a little mixture of nsAutoString and nsString with ParamTraits but I'm pretty sure that is legit. The only other part that I wouldn't feel 100% on is this:
https://searchfox.org/mozilla-central/rev/c61720a7d0c094d772059f9d6a7844eb7619f107/toolkit/xre/UntrustedModulesData.h#301
I don't get all of the conditions under which NS_NewLocalFile fails.
Assignee | ||
Updated•1 year ago
|
Assignee | ||
Comment 3•1 year ago
|
||
Updated•1 year ago
|
Assignee | ||
Updated•1 year ago
|
Assignee | ||
Comment 4•1 year ago
|
||
Assignee | ||
Comment 5•1 year ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a8332a69b883c9abac021deaf2868610884f278e
Comment 6•1 year ago
|
||
Comment on attachment 9116556 [details] Data Review Request DATA COLLECTION REVIEW RESPONSE: Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate? Yes. This collection is documented in its definitions file [CrashAnnotations.yaml](https://hg.mozilla.org/mozilla-central/file/tip/toolkit/crashreporter/CrashAnnotations.yaml). Is there a control mechanism that allows the user to turn the data collection on and off? Yes. This collection is part of crash reporting which is opt-in on the first and subsequent crashes, as well as being controllable through Firefox's Preferences. If the request is for permanent data collection, is there someone who will monitor the data over time? No. This collection will expire as soon as we get answers. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under? Category 1, Technical. Is the data collection request for default-on or default-off? Default off for everyone. Only ever collected on Nightly, opt-in. Does the instrumentation include the addition of any new identifiers? No. Is the data collection covered by the existing Firefox privacy notice? Yes. Does there need to be a check-in in the future to determine whether to renew the data? Yes. :aklotz is responsible for renewing or removing the collection once we get answers. --- Result: datareview+
Pushed by aklotz@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/44b14e9e661f Add temporary, Nightly-only crash annotations to UntrustedModulesData and dependencies to determine failure location; r=handyman
Comment 8•1 year ago
|
||
bugherder |
Assignee | ||
Comment 9•1 year ago
|
||
I've got some data coming back now. From the annotation that I've seen, it looks like there are events that reference modules that are not in the ModulesMap
hash table. I'm not sure how/why yet.
Assignee | ||
Comment 10•1 year ago
•
|
||
[Tracking Requested - why for this release]:
Potential for buffers to be filled with invalid data, leading to unexpected errors.
Assignee | ||
Comment 11•1 year ago
|
||
Untrusted modules is receiving bad data due to a copy/move of MemorySectionNameBuf
that does not update the UNICODE_STRING
's Buffer
field with a pointer to the receiving buffer's location.
I think this is the only cause, but I'll leave the instrumentation in a bit longer to confirm.
Assignee | ||
Comment 12•1 year ago
|
||
Assignee | ||
Comment 13•1 year ago
|
||
The affected code hasn't made it to beta yet. Revoking tracking request.
Assignee | ||
Comment 14•1 year ago
|
||
Actually, that's incorrect.
[Tracking Requested - why for this release]:
Potential for invalid buffer contents. Those buffers are not really used in 72, but we should fix this just in case. The regressing bug did land in 72, even though the subsequent work that revealed the crashes did not land until 73.
Comment 15•1 year ago
|
||
Pushed by aklotz@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e0f712f3c394 Add explicit copy constructor and copy assignment operators to nt::MemorySectionNameBuf; r=mhowell
Comment 16•1 year ago
|
||
bugherder |
Comment 17•1 year ago
|
||
Still seeing crashes in the 20191220095035 nightly which should have the patch from comment 16, e.g. bp-4db0d02f-7b3b-4197-a9f2-f9a470191220
Assignee | ||
Comment 18•1 year ago
|
||
Assignee | ||
Comment 19•1 year ago
|
||
I still do not yet have a good grasp of the root cause of this data integrity
issue, however I also think that having the browser process crash because of it
is far worse.
This also adds a check in our telemetry processing that the deserialized event
is valid.
We'll probably need to add more diagnostic information in the future, but I'd
like to get rid of the IPC errors in Nightly in time for holidays.
Depends on D58046
Assignee | ||
Updated•1 year ago
|
Assignee | ||
Comment 20•1 year ago
|
||
Comment on attachment 9117116 [details]
Bug 1603714: Add explicit copy constructor and copy assignment operators to nt::MemorySectionNameBuf; r=mhowell
Beta/Release Uplift Approval Request
- User impact if declined: Potential for bad memory to be included in various buffers. AFAICT nothing is actually done with those buffers in 72 now that it is in Beta, but I think we should land this just in case.
(Confirming that just this patch needs to be uplifted, not the others in this bug)
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Very simple, localized patch with good understanding of the issue.
- String changes made/needed: None
Comment 21•1 year ago
|
||
Pushed by aklotz@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/5d04eb35d98e Backed out changeset 44b14e9e661f as this information is no longer needed;
Comment 22•1 year ago
|
||
Pushed by aklotz@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c3a21a4b5a45 Do not treat third-party module data integrity error as IPC error; r=mhowell
Assignee | ||
Updated•1 year ago
|
Comment 23•1 year ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/5d04eb35d98e
https://hg.mozilla.org/mozilla-central/rev/c3a21a4b5a45
Comment 24•1 year ago
|
||
Comment on attachment 9117116 [details]
Bug 1603714: Add explicit copy constructor and copy assignment operators to nt::MemorySectionNameBuf; r=mhowell
approved for 72.0b11
Comment 25•1 year ago
|
||
bugherderuplift |
Comment 26•1 year ago
|
||
Backed out changeset 5fb1944b566d (Bug 1603714) for build bustages on 'nt::MemorySectionNameBuf'
https://hg.mozilla.org/releases/mozilla-beta/rev/3d44f6dd362188c92f331a868de6f4ca47784aa3
Failure log:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=282407071&repo=mozilla-beta&lineNumber=77636
Assignee | ||
Comment 27•1 year ago
|
||
Assignee | ||
Updated•1 year ago
|
Updated•1 year ago
|
Comment 28•1 year ago
|
||
Pushed by aklotz@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9017c76f2cb5 Follow-up: also add move constructor and move assignment operator to MemorySectionNameBuf; r=handyman
Comment 29•1 year ago
|
||
bugherder |
Aaron, can you request uplift for the new patch if you think it is safe for beta? We have a beta build tonight, and then the RC build coming up next Monday.
Assignee | ||
Comment 31•1 year ago
|
||
Comment on attachment 9117564 [details]
Bug 1603714: Follow-up: also add move constructor and move assignment operator to MemorySectionNameBuf; r=handyman
Beta/Release Uplift Approval Request
- User impact if declined:
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Simple, understandable patch that is very tightly scoped.
- String changes made/needed: None
Assignee | ||
Comment 32•1 year ago
|
||
Note that D57889 should land on Beta first, followed by D58142.
Comment on attachment 9117564 [details]
Bug 1603714: Follow-up: also add move constructor and move assignment operator to MemorySectionNameBuf; r=handyman
Fix for memory issue and possible crash. Let's take this for the RC build on Monday.
Note that D57889 should land on Beta first, followed by D58142.
Comment 34•1 year ago
|
||
bugherderuplift |
Updated•1 year ago
|
Updated•1 year ago
|
Description
•