Closed
Bug 808080
Opened 12 years ago
Closed 4 years ago
IPC parent aborts due to too large allocation for infallible array in PContent::Msg_AsyncMessage
Categories
(Core :: IPC, defect)
Tracking
()
RESOLVED
INCOMPLETE
People
(Reporter: posidron, Assigned: cyu)
References
(Blocks 1 open bug)
Details
(Keywords: crash)
Attachments
(3 files)
This crash occurred with an optimized/non-debug build of B2G. I was intentionally using a non debug build to pass the DCHECK() functions in pickle.
Additionally we see the following warnings:
[Parent 2173] WARNING: waitpid failed pid:2356 errno:10: file /Users/cdiehl/Code/Mozilla/b2g/gecko/ipc/chromium/src/base/process_util_posix.cc, line 260
[Parent 2173] WARNING: Failed to deliver SIGKILL to 2356!(3).: file /Users/cdiehl/Code/Mozilla/b2g/gecko/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc, line 118
[Parent 2173] WARNING: pipe error (99): Connection reset by peer: file /Users/cdiehl/Code/Mozilla/b2g/gecko/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 425
---
An interception occurred only then if XRE_GetProcessType() equaled GeckoProcessType_Content.
Let me know if you need further information.
Comment 1•11 years ago
|
||
Is there a test case for this?
Assignee | ||
Comment 2•11 years ago
|
||
Looks like this crash is the result of running the Faulty IPC fuzzer. We can take a look at the crash stack and see if we can make a test case for it. I'll take a look at this.
Assignee: nobody → cyu
Assignee | ||
Comment 3•11 years ago
|
||
Reproduced in gdb by manipulating the InfallibleTArray length read in Pickle::ReadUInt32(). This is a release build:
207 UpdateIter(iter, sizeof(*result));
(gdb) p *result
$4 = 0
(gdb) p *result = 4000000000
$5 = -294967296
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x41dec13a in mozalloc_abort (msg=<value optimized out>) at /home/cervantes/hg/mozilla-central/memory/mozalloc/mozalloc_abort.cpp:30
30 MOZ_CRASH();
(gdb) bt
#0 0x41dec13a in mozalloc_abort (msg=<value optimized out>) at /home/cervantes/hg/mozilla-central/memory/mozalloc/mozalloc_abort.cpp:30
#1 0x414c229e in nsTArrayInfallibleAllocator::SizeTooBig (this=0xbe80451c, newLen=3196074784) at ../../dist/include/nsTArray.h:213
#2 nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_CopyElements<mozilla::jsipc::CpowEntry> >::EnsureCapacity (this=0xbe80451c, newLen=3196074784) at ../../dist/include/nsTArray-inl.h:112
#3 nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_CopyElements<mozilla::jsipc::CpowEntry> >::InsertSlotsAt (this=0xbe80451c, newLen=3196074784) at ../../dist/include/nsTArray-inl.h:258
#4 nsTArray_Impl<mozilla::jsipc::CpowEntry, nsTArrayInfallibleAllocator>::InsertElementsAt (this=0xbe80451c, newLen=3196074784) at ../../dist/include/nsTArray.h:1418
#5 nsTArray_Impl<mozilla::jsipc::CpowEntry, nsTArrayInfallibleAllocator>::SetLength (this=0xbe80451c, newLen=3196074784) at ../../dist/include/nsTArray.h:1378
#6 0x414d76ea in mozilla::dom::PContentParent::Read (this=0x46a47c00, __v=0xbe80451c, __msg=0xbe80464c, __iter=0xbe8045c4) at /home/cervantes/git/unagi/B2G/objdir-gecko/ipc/ipdl/PContentParent.cpp:6565
#7 0x414da03c in mozilla::dom::PContentParent::OnMessageReceived (this=0x46a47c00, __msg=...) at /home/cervantes/git/unagi/B2G/objdir-gecko/ipc/ipdl/PContentParent.cpp:3195
#8 0x414a5676 in mozilla::ipc::MessageChannel::DispatchAsyncMessage (this=0x46a47c30, aMsg=...) at /home/cervantes/hg/mozilla-central/ipc/glue/MessageChannel.cpp:970
In the parent we should use fallible allocation for any array that we construct based on an IPC message. If the allocation fails then we should probably kill the child...
Assignee | ||
Comment 5•11 years ago
|
||
Agree. I am trying make the codegen to produce code like this:
auto PContentParent::Read(
InfallibleTArray<InputStreamParams>* __v,
const Message* __msg,
void** __iter) -> bool
{
FallibleTArray<InputStreamParams> a;
uint32_t length;
if ((!(Read((&(length)), __msg, __iter)))) {
FatalError("Error deserializing 'length' (uint32_t) of 'InputStreamParams[]'");
return false;
}
if (!(__v)->SetLength(length)) {
FatalError("Error setting length of 'InputStreamParams[]'");
return false;
}
for (uint32_t i = 0; (i) < (length); (++(i))) {
if ((!(Read((&(a[i])), __msg, __iter)))) {
FatalError("Error deserializing 'InputStreamParams[i]'");
return false;
}
}
(__v)->SwapElements(a);
return true;
}
You should use SetCapacity instead of SetLength!
Assignee | ||
Comment 7•11 years ago
|
||
(In reply to ben turner [:bent] (needinfo? encouraged) from comment #6)
> You should use SetCapacity instead of SetLength!
For not doing unless default initialization and later get assigned? SetCapacity doesn't sound right because it only allocates spaces for holding the elements, but the elements don't exist semantically. Accessing them triggers assertion failure in debug builds.
I have the codegen produce the code. The code should be equivalent except for the extra allocation failure check, but b2g doesn't start correctly. I am trying to figure out what goes wrong.
Yeah, you're right. I thought that our serialization code constructed elements one by one but I'm wrong!
Assignee | ||
Comment 9•11 years ago
|
||
I narrowed down the problem to PLayerTransactionParent.cpp and PLayerTransactionChild.cpp where "(__v)->SwapElements(a);" doesn't work. The main thread of the chrome process is blocked in an infinite loop.
If I change replace it with "(__v)->ReplaceElementsAt(0, (__v)->Length(), (a).Elements(), length)" only in these 2 files the problem disappears.
The pattern of allocating a FallibleTArray, setting its length, and then Read() into each element doesn't look incorrect as in ipc/glue/IPCMessageUtils.h. I am wondering if LayerTransaction{Parent|Child} perform some tricks with the array elements and make this pattern fail.
Assignee | ||
Comment 10•11 years ago
|
||
Attachment #826583 -
Flags: review?(bent.mozilla)
Assignee | ||
Comment 11•11 years ago
|
||
Try submission: https://tbpl.mozilla.org/?tree=Try&rev=edc17c4f836e
Comment on attachment 826583 [details] [diff] [review]
Handle allocation error in reading array parameters in protocol actors.
Review of attachment 826583 [details] [diff] [review]:
-----------------------------------------------------------------
Hm, the EditReply stuff seems unrelated to this fix, right? I wonder if we can't move those smarts into nsTArray actually.
Also, can you attach a copy of the generated code?
Assignee | ||
Comment 13•11 years ago
|
||
(In reply to ben turner [:bent] (needinfo? encouraged) from comment #12)
> Comment on attachment 826583 [details] [diff] [review]
> Handle allocation error in reading array parameters in protocol actors.
>
> Review of attachment 826583 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Hm, the EditReply stuff seems unrelated to this fix, right? I wonder if we
> can't move those smarts into nsTArray actually.
>
Actually it is related. Without the EditReply trait if we perform SwapElements with an AutoInfallibleTArray then we will get an incorrect result because it's unsafe to memmove or memcpy EditReply. We will enter an infinite loop later when we try to access mozilla::layers::OpContentBufferSwap::frontUpdatedRegion_. nsRegion maintains a circular linked list using the address of nsRegion::mRectListHead, and this makes EditReply unsafe to memcpy.
> Also, can you attach a copy of the generated code?
Assignee | ||
Comment 14•11 years ago
|
||
Generated code: PLayerTransactionChild.cpp
Assignee | ||
Comment 15•11 years ago
|
||
The reason I don't place the trait in nsTArray is that EditReply is IPC-specific code. Placing it in nsTArray makes it visible globally. Placing it in IPCMessageUtils.h make specific to IPC.
Updated•11 years ago
|
Attachment #826622 -
Attachment mime type: text/x-c++src → text/plain
Comment on attachment 826583 [details] [diff] [review]
Handle allocation error in reading array parameters in protocol actors.
Review of attachment 826583 [details] [diff] [review]:
-----------------------------------------------------------------
::: ipc/glue/IPCMessageUtils.h
@@ +90,5 @@
>
> +// mozilla::layers::EditReply needs to be copied with constructors because
> +// it may contain nsRegion, which is not memmove-safe.
> +template<>
> +struct nsTArray_CopyChooser<mozilla::layers::EditReply> {
Yeah, I really think this is the wrong place for this... It seems like what we want is for IPDL to generate this specialization automatically for all structs/unions *if* their types are non memmove-safe.
I'm pretty sure that this can be done with a combination of codegen and template magic like so:
template<>
struct nsTArray_CopyChooser<UnionType> :
mozilla::Conditional<mozilla::IsSame<nsTArray_CopyChooser<Type1>::Type,
nsTArray_CopyWithMemutils>::value &&
mozilla::IsSame<nsTArray_CopyChooser<Type2>::Type,
nsTArray_CopyWithMemutils>::value,
nsTArray_CopyWithMemutils,
nsTArray_CopyWithConstructors<UnionType>>
{ };
Can you give that a try? I'm reluctant to start special-casing things like this in IPCMessageUtils.
::: ipc/ipdl/ipdl/lower.py
@@ +4357,4 @@
> read.addstmts([
> + StmtDecl(Decl(Type('FallibleTArray',
> + T=_cxxBareType(eltipdltype, self.side)),
> + avar.name)),
Let's move this after we read the length.
Attachment #826583 -
Flags: review?(bent.mozilla) → review-
Assignee | ||
Comment 17•11 years ago
|
||
(In reply to ben turner [:bent] (needinfo? encouraged) from comment #16)
> Comment on attachment 826583 [details] [diff] [review]
> Handle allocation error in reading array parameters in protocol actors.
>
> Review of attachment 826583 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> ::: ipc/glue/IPCMessageUtils.h
> @@ +90,5 @@
> >
> > +// mozilla::layers::EditReply needs to be copied with constructors because
> > +// it may contain nsRegion, which is not memmove-safe.
> > +template<>
> > +struct nsTArray_CopyChooser<mozilla::layers::EditReply> {
>
> Yeah, I really think this is the wrong place for this... It seems like what
> we want is for IPDL to generate this specialization automatically for all
> structs/unions *if* their types are non memmove-safe.
>
> I'm pretty sure that this can be done with a combination of codegen and
> template magic like so:
>
> template<>
> struct nsTArray_CopyChooser<UnionType> :
> mozilla::Conditional<mozilla::IsSame<nsTArray_CopyChooser<Type1>::Type,
> nsTArray_CopyWithMemutils>::value &&
> mozilla::IsSame<nsTArray_CopyChooser<Type2>::Type,
> nsTArray_CopyWithMemutils>::value,
> nsTArray_CopyWithMemutils,
> nsTArray_CopyWithConstructors<UnionType>>
> { };
>
>
So we are going to generate things like this:
template<>
struct nsTArray_CopyChooser<EditReply> :
mozilla::Conditional<mozilla::IsSame<nsTArray_CopyChooser<OpContentBufferSwap>::Type,
nsTArray_CopyWithMemutils>::value &&
mozilla::IsSame<nsTArray_CopyChooser<OpTextureSwap>::Type,
nsTArray_CopyWithMemutils>::value &&
mozilla::IsSame<nsTArray_CopyChooser<ReplyTextureRemoved>::Type,
nsTArray_CopyWithMemutils>::value,
nsTArray_CopyWithMemutils,
nsTArray_CopyWithConstructors<UnionType>>
{ };
To know whether EditReply is memcpy-able the compiler has to instantiate
template<>
struct nsTArray_CopyChooser<OpContentBufferSwap>
Then we need to either descend down to the data members of OpContentBufferSwap, or stop here by providing a specialization outside of the code-gen'd file.
If we choose to descend to the member types, we still need to provide some specialization for the basic types. And this could slow down the compilation of IPC code.
Assignee | ||
Comment 18•11 years ago
|
||
And with the landing of bug 845874, we no longer need to provide a specialization to copy EditReply with constructors. Do we still need to implement the above in the code gen?
Yes, I was thinking that we should make IPDL generate these structs for all union/struct types. Any members that are not union/struct types will need to rely on hand-written specializations that should hopefully exist in the relevant modules (e.g. gfx for nsRegion). Any members that don't have custom specializations will just use the default memcpy automatically.
The compiler will only need to look at these templates if we end up using them in an array, so it shouldn't slow down most of our compile I don't think.
And if we don't do this now we're just going to have bugs later on. Seems better to fix while we understand the problem.
Assignee | ||
Comment 20•11 years ago
|
||
Generating the specialization is not too difficult. There is one problem with
> mozilla::IsSame<nsTArray_CopyChooser<StructMemberType1>::Type, ...
where StructMemberType must be qualified, but in ipdlh it can appear as not qualified.
For example, in PDNSRequestParams.ipdlh:
> using NetAddrArray from "mozilla/net/PDNSParams.h";
and NetAddrArray is:
> namespace mozilla {
> namespace net {
>
> // Need to define typedef in .h file--can't seem to in ipdl.h file?
> typedef nsTArray<NetAddr> NetAddrArray;
>
> } // namespace net
> } // namespace mozilla
We should use mozilla::net::NetAddrArray as StructMemberType1, where we can't easily know that NetAddrArray belongs in namespace mozilla::net without parsing the .h file referenced in the ipdlh file.
I guess what we can do at best is try to check if the using statement in ipdlh refers to a qualified name (checking the existence of "::" in the name). If not then typedef it again like:
> //-----------------------------------------------------------------------------
> // Declaration of the IPDL type |struct DNSRecord|
> //
> namespace mozilla {
> namespace net {
> class DNSRecord MOZ_FINAL
> {
> public:
> typedef NetAddrArray NetAddrArray_t;
So we can safely use mozilla::IsSame<nsTArray_CopyChooser<mozilla::net::DNSRecord::NetAddrArray_t>::Type.
Comment 21•4 years ago
|
||
Cervantes Yu, is this crash still an issue, or can this report be closed?
Flags: needinfo?(cervantes.yu)
Comment 22•4 years ago
|
||
This might still be an issue, but I can't find anything in lower.py that looks like the code in the patch. If we want to make the parent resilient to the child sending absurdly large data, that'll probably require a whole new effort, so leaving this bug open isn't too useful.
Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(cervantes.yu)
Resolution: --- → INCOMPLETE
You need to log in
before you can comment on or make changes to this bug.
Description
•