Open Bug 1332054 Opened 8 years ago Updated 2 years ago

Don't treat constructor failure as fatal

Categories

(Core :: IPC, defect, P3)

defect

Tracking

()

People

(Reporter: billm, Unassigned)

References

Details

Attachments

(1 file)

This rule never made much sense to me. The rationale was that it's too difficult to deal with a failing constructor, so we should just crash in that case. However, we've worked around the rule so much that I don't think it serves any purpose anymore. We don't crash if we're in the parent process. We don't crash if we're talking to the GPU process. Not bug 1331685 is adding special hacks to avoid crashing for certain messages. Let's just not crash at all.
Attached patch patchSplinter Review
Attachment #8828072 - Flags: review?(dvander)
Comment on attachment 8828072 [details] [diff] [review] patch Review of attachment 8828072 [details] [diff] [review]: ----------------------------------------------------------------- Do we have to do any auditing for missing null checks? Or are we assuming we'd already be getting crashes?
Attachment #8828072 - Flags: review?(dvander) → review+
Also, there may be code that does not expect ActorDestroy() to be called synchronously during the Send*Constructor() call. For example, I am fairly certain this will explode: https://dxr.mozilla.org/mozilla-central/source/dom/cache/CacheStorage.cpp#516 Because the actor destroy calls: https://dxr.mozilla.org/mozilla-central/source/dom/cache/CacheStorage.cpp#554 I have a patch that fixes CacheStorage here: https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=1329693&attachment=8825086 But it does seem like some auditing is in order.
Assignee: nobody → wmccloskey
Status: NEW → ASSIGNED
Flags: needinfo?(wmccloskey)
This patch will only change behavior if we're already crashing. I can go through crash-stats, but I don't think we need to audit our code.
Flags: needinfo?(wmccloskey)
I'm sorry, but I don't think its reasonable to change something from infallible to fallible without investigating the impact on the tree. I did a quick survey here and I think there is enough to warrant investigating further before landing the patch. At a very minimum I think we should look at the potential UAFs that would be produced by this change. There is a lot of code that has the actor do SendP*Constructor(this) and then continues to operate on its own memory. These can get self-destructed during the constructor call. Ignores return value: https://dxr.mozilla.org/mozilla-central/source/accessible/base/DocManager.cpp#541 https://dxr.mozilla.org/mozilla-central/source/accessible/base/NotificationController.cpp#871 https://dxr.mozilla.org/mozilla-central/source/accessible/ipc/win/DocAccessibleChild.h#266 https://dxr.mozilla.org/mozilla-central/source/dom/indexedDB/ActorsChild.cpp#1109 https://dxr.mozilla.org/mozilla-central/source/dom/indexedDB/ActorsChild.cpp#1633 https://dxr.mozilla.org/mozilla-central/source/dom/ipc/ContentChild.cpp#600 https://dxr.mozilla.org/mozilla-central/source/dom/ipc/ContentChild.cpp#1007 https://dxr.mozilla.org/mozilla-central/source/dom/media/gmp/GMPChild.cpp#261 https://dxr.mozilla.org/mozilla-central/source/dom/network/TCPServerSocketChild.cpp#53 https://dxr.mozilla.org/mozilla-central/source/dom/presentation/ipc/PresentationIPCService.cpp#232 https://dxr.mozilla.org/mozilla-central/source/toolkit/components/printingui/ipc/nsPrintingProxy.cpp#66 Assumes success (might leak or leave content stuck): https://dxr.mozilla.org/mozilla-central/source/dom/cache/CacheChild.cpp#72 https://dxr.mozilla.org/mozilla-central/source/dom/filehandle/FileHandleBase.cpp#95 https://dxr.mozilla.org/mozilla-central/source/dom/indexedDB/IDBDatabase.cpp#710 https://dxr.mozilla.org/mozilla-central/source/dom/indexedDB/IDBDatabase.cpp#773 https://dxr.mozilla.org/mozilla-central/source/dom/indexedDB/IDBMutableFile.cpp#191 https://dxr.mozilla.org/mozilla-central/source/dom/indexedDB/IDBTransaction.cpp#302 https://dxr.mozilla.org/mozilla-central/source/dom/indexedDB/IDBTransaction.cpp#307 https://dxr.mozilla.org/mozilla-central/source/dom/indexedDB/IDBTransaction.cpp#328 https://dxr.mozilla.org/mozilla-central/source/dom/indexedDB/IDBTransaction.cpp#333 https://dxr.mozilla.org/mozilla-central/source/dom/ipc/TabParent.cpp#646 https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/ftp/FTPChannelChild.cpp#890 https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/HttpChannelChild.cpp#2766 https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/websocket/WebSocketEventService.cpp#377 https://dxr.mozilla.org/mozilla-central/source/security/manager/ssl/PSMContentListener.cpp#415 https://dxr.mozilla.org/mozilla-central/source/toolkit/components/printingui/ipc/nsPrintingProxy.cpp#112 https://dxr.mozilla.org/mozilla-central/source/toolkit/components/printingui/ipc/nsPrintingProxy.cpp#155 https://dxr.mozilla.org/mozilla-central/source/uriloader/exthandler/ContentHandlerService.cpp#28 https://dxr.mozilla.org/mozilla-central/source/widget/nsFilePickerProxy.cpp#42 Possibly unsafe: https://dxr.mozilla.org/mozilla-central/source/dom/base/nsContentPermissionHelper.cpp#397 https://dxr.mozilla.org/mozilla-central/source/dom/cache/AutoUtils.cpp#422 Maybe nullptr unsafe: https://dxr.mozilla.org/mozilla-central/source/dom/broadcastchannel/BroadcastChannel.cpp#463 https://dxr.mozilla.org/mozilla-central/source/dom/cache/CacheStorage.cpp#516 https://dxr.mozilla.org/mozilla-central/source/dom/flyweb/FlyWebPublishedServer.cpp#554 https://dxr.mozilla.org/mozilla-central/source/dom/ipc/ContentChild.cpp#814 https://dxr.mozilla.org/mozilla-central/source/dom/ipc/ContentParent.cpp#1006 https://dxr.mozilla.org/mozilla-central/source/dom/ipc/TabChild.cpp#2611 https://dxr.mozilla.org/mozilla-central/source/dom/media/systemservices/CamerasChild.cpp#97 https://dxr.mozilla.org/mozilla-central/source/dom/media/systemservices/MediaSystemResourceManager.cpp#106 https://dxr.mozilla.org/mozilla-central/source/dom/messagechannel/MessagePort.cpp#913 https://dxr.mozilla.org/mozilla-central/source/dom/plugins/ipc/PluginInstanceChild.cpp#3999 https://dxr.mozilla.org/mozilla-central/source/ipc/glue/IPCStreamUtils.cpp#67 https://dxr.mozilla.org/mozilla-central/source/ipc/glue/IPCStreamUtils.cpp#106 https://dxr.mozilla.org/mozilla-central/source/layout/base/nsRefreshDriver.cpp#868 https://dxr.mozilla.org/mozilla-central/source/netwerk/ipc/NeckoChild.cpp#66 https://dxr.mozilla.org/mozilla-central/source/uriloader/exthandler/nsExternalHelperAppService.cpp#732 Maybe UAF unsafe: https://dxr.mozilla.org/mozilla-central/source/dom/file/ipc/Blob.cpp#3600 https://dxr.mozilla.org/mozilla-central/source/dom/file/ipc/Blob.cpp#3629 https://dxr.mozilla.org/mozilla-central/source/dom/ipc/ContentChild.cpp#804 https://dxr.mozilla.org/mozilla-central/source/dom/media/webspeech/synth/nsSynthVoiceRegistry.cpp#151 https://dxr.mozilla.org/mozilla-central/source/dom/network/TCPSocketChild.cpp#103 https://dxr.mozilla.org/mozilla-central/source/dom/network/TCPSocketChild.cpp#116 https://dxr.mozilla.org/mozilla-central/source/dom/network/UDPSocketChild.cpp#182 https://dxr.mozilla.org/mozilla-central/source/dom/network/UDPSocketChild.cpp#184 https://dxr.mozilla.org/mozilla-central/source/dom/storage/StorageIPC.cpp#85 https://dxr.mozilla.org/mozilla-central/source/extensions/spellcheck/src/mozSpellChecker.cpp#66 https://dxr.mozilla.org/mozilla-central/source/ipc/glue/SendStreamChild.cpp#377 https://dxr.mozilla.org/mozilla-central/source/ipc/glue/SendStreamChild.cpp#414 https://dxr.mozilla.org/mozilla-central/source/netwerk/cookie/CookieServiceChild.cpp#55 https://dxr.mozilla.org/mozilla-central/source/netwerk/dns/DNSRequestChild.cpp#215 https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/ftp/FTPChannelChild.cpp#203 https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/websocket/WebSocketChannelChild.cpp#482 https://dxr.mozilla.org/mozilla-central/source/toolkit/components/printingui/ipc/PrintingParent.cpp#311 https://dxr.mozilla.org/mozilla-central/source/uriloader/prefetch/OfflineCacheUpdateChild.cpp#428 https://dxr.mozilla.org/mozilla-central/source/widget/nsColorPickerProxy.cpp#24 https://dxr.mozilla.org/mozilla-central/source/widget/nsDatePickerProxy.cpp#25 https://dxr.mozilla.org/mozilla-central/source/widget/nsScreenManagerProxy.cpp#32
Flags: needinfo?(wmccloskey)
We had to work around this quite a bit in the GPU process so I think it's the right way to go. It only affects child-side protocols, where if IPC dies the content process is doomed anyway. Interruptible protocols (like PGPU) had to override FatalError in a pretty gross way, and we made a conscious decision to check success everywhere.
That's fine, but we can't create new security vulnerabilities by exposing a lot of UAF paths as a side effect. Those need to be fixed before this can land.
I thought about this over the weekend and now I remember a conversation we had on the subject, David. I thought we decided it would make sense to enforce the following rules: 1. If you send any message after ActorDestroy, we'll assert. 2. If you send a message before ActorDestroy and the Send call fails, then we should return an error. 3. In particular, if you send a constructor message and it fails before ActorDestroy, we would not destroy the actor (that would happen during ActorDestroy). I kind of remember you found a problem in this approach, David. Do you remember?
Flags: needinfo?(wmccloskey) → needinfo?(dvander)
(In reply to Bill McCloskey (:billm) from comment #8) > I thought about this over the weekend and now I remember a conversation we > had on the subject, David. I thought we decided it would make sense to > enforce the following rules: > > 1. If you send any message after ActorDestroy, we'll assert. I think the reason for this rule was to avoid programming error. But for asynchronous stuff in gfx we don't actually know whether the channel has shut down, so we end up with CanSend() or IsOpen() checks everywhere. > 2. If you send a message before ActorDestroy and the Send call fails, then > we should return an error. This makes sense. > 3. In particular, if you send a constructor message and it fails before > ActorDestroy, we would not destroy the actor (that would happen during > ActorDestroy). > > I kind of remember you found a problem in this approach, David. Do you > remember? I can't think of anything directly related. In bug 1319271 we initially fired ActorDestroy slightly differently and it broke gfx code [1]. That code (PLayer and everything associated with it) got removed a few weeks ago. Maybe we were concerned about ActorDestroy firing in places other than the top of the event loop. (I think that can happen anyway, but only in response to Close?) [1] https://dxr.mozilla.org/mozilla-beta/rev/8e692dd4176cba81ce020b29ae8b352dc1db724a/gfx/layers/ipc/ShadowLayerChild.cpp#28
Flags: needinfo?(dvander)
Bug 1367150 made me think of this again today...
AFAICT this is bug is dormant, please update the priority if you disagree.
Flags: needinfo?(wmccloskey)
Priority: -- → P3
Yes, it's stalled. More work is needed.
Flags: needinfo?(wmccloskey)
Assignee: bill.mccloskey → nobody
Status: ASSIGNED → NEW
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: