Closed Bug 1373579 Opened 7 years ago Closed 7 years ago

Crash in memcpy | js::SCInput::readArray<T>

Categories

(WebExtensions :: General, defect, P1)

55 Branch
x86
Windows
defect

Tracking

(firefox-esr52 unaffected, firefox54 unaffected, firefox55+ fixed, firefox56 fixed)

RESOLVED FIXED
mozilla56
Tracking Status
firefox-esr52 --- unaffected
firefox54 --- unaffected
firefox55 + fixed
firefox56 --- fixed

People

(Reporter: philipp, Assigned: kmag)

References

Details

(Keywords: crash, regression, Whiteboard: [triaged])

Crash Data

Attachments

(3 files, 1 obsolete file)

This bug was filed from the Socorro interface and is 
report bp-efb9200f-eef8-4f64-940e-3665f0170616.
=============================================================
Crashing Thread (0)
Frame 	Module 	Signature 	Source
0 	vcruntime140.dll 	memcpy 	f:\dd\vctools\crt\vcruntime\src\string\i386\memcpy.asm:600
1 	xul.dll 	js::SCInput::readArray<unsigned char>(unsigned char*, unsigned int) 	js/src/vm/StructuredClone.cpp:846
2 	xul.dll 	mozilla::dom::StructuredCloneBlob::ReadStructuredCloneInternal(JSContext*, JSStructuredCloneReader*, mozilla::dom::StructuredCloneHolder*) 	dom/base/StructuredCloneBlob.cpp:137
3 	xul.dll 	mozilla::dom::StructuredCloneBlob::ReadStructuredClone(JSContext*, JSStructuredCloneReader*, mozilla::dom::StructuredCloneHolder*) 	dom/base/StructuredCloneBlob.cpp:109
4 	xul.dll 	mozilla::dom::StructuredCloneHolder::CustomReadHandler(JSContext*, JSStructuredCloneReader*, unsigned int, unsigned int) 	dom/base/StructuredCloneHolder.cpp:991
5 	xul.dll 	mozilla::dom::`anonymous namespace'::StructuredCloneCallbacksRead 	dom/base/StructuredCloneHolder.cpp:64
6 	xul.dll 	JSStructuredCloneReader::startRead(JS::MutableHandle<JS::Value>) 	js/src/vm/StructuredClone.cpp:2234
7 	xul.dll 	JSStructuredCloneReader::read(JS::MutableHandle<JS::Value>) 	js/src/vm/StructuredClone.cpp:2537
8 	xul.dll 	ReadStructuredClone(JSContext*, JSStructuredCloneData&, JS::StructuredCloneScope, JS::MutableHandle<JS::Value>, JSStructuredCloneCallbacks const*, void*) 	js/src/vm/StructuredClone.cpp:626
9 	xul.dll 	JS::GCVector<JSObject*, 8, js::TempAllocPolicy>::trace(JSTracer*) 	obj-firefox/dist/include/js/GCVector.h:134
10 	xul.dll 	mozilla::dom::ipc::StructuredCloneData::Read(JSContext*, JS::MutableHandle<JS::Value>, mozilla::ErrorResult&) 	dom/ipc/StructuredCloneData.cpp:103
11 	xul.dll 	nsFrameMessageManager::ReceiveMessage(nsISupports*, nsIFrameLoader*, bool, nsAString const&, bool, mozilla::dom::ipc::StructuredCloneData*, mozilla::jsipc::CpowHolder*, nsIPrincipal*, nsTArray<mozilla::dom::ipc::StructuredCloneData>*) 	dom/base/nsFrameMessageManager.cpp:986
12 	xul.dll 	nsFrameMessageManager::ReceiveMessage(nsISupports*, nsIFrameLoader*, bool, nsAString const&, bool, mozilla::dom::ipc::StructuredCloneData*, mozilla::jsipc::CpowHolder*, nsIPrincipal*, nsTArray<mozilla::dom::ipc::StructuredCloneData>*) 	dom/base/nsFrameMessageManager.cpp:1121
13 	xul.dll 	nsSameProcessAsyncMessageBase::ReceiveMessage(nsISupports*, nsIFrameLoader*, nsFrameMessageManager*) 	dom/base/nsFrameMessageManager.cpp:2073
14 	xul.dll 	nsAsyncMessageToSameProcessParent::HandleMessage() 	dom/base/nsFrameMessageManager.cpp:1887
15 	xul.dll 	mozilla::dom::SameProcessMessageQueue::Runnable::Run() 	dom/base/SameProcessMessageQueue.cpp:73
16 	xul.dll 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp:1428
17 	xul.dll 	mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:96
18 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc:311
19 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc:291
20 	xul.dll 	nsBaseAppShell::Run() 	widget/nsBaseAppShell.cpp:156
21 	xul.dll 	nsAppShell::Run() 	widget/windows/nsAppShell.cpp:271
22 	xul.dll 	nsAppStartup::Run() 	toolkit/components/startup/nsAppStartup.cpp:283
23 	xul.dll 	XREMain::XRE_mainRun() 	toolkit/xre/nsAppRunner.cpp:4573
24 	xul.dll 	XREMain::XRE_main(int, char** const, mozilla::BootstrapConfig const&) 	toolkit/xre/nsAppRunner.cpp:4757
25 	xul.dll 	XRE_main(int, char** const, mozilla::BootstrapConfig const&) 	toolkit/xre/nsAppRunner.cpp:4852
26 	xul.dll 	mozilla::BootstrapImpl::XRE_main(int, char** const, mozilla::BootstrapConfig const&) 	toolkit/xre/Bootstrap.cpp:45
27 	firefox.exe 	do_main 	browser/app/nsBrowserApp.cpp:237
28 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:115
29 	firefox.exe 	__scrt_common_main_seh 	f:/dd/vctools/crt/vcstartup/src/startup/exe_common.inl:253
30 	kernel32.dll 	BaseThreadInitThunk 	
31 	ntdll.dll 	__RtlUserThreadStart 	
32 	ntdll.dll 	_RtlUserThreadStart

these crashes started showing up on 32bit versions of firefox on windows in a higher frequency since 55.0a1 build 20170608030205. so based on this timing, bug 1356546 might be the regressing patch.
in early stability data from the 55.0b/devedition rollout, with 1.37 % of all browser crashes this signature is the #7 top crash there.
Hi Andy, could you please find someone to take a look at this, and suggest how we'd like to process this for Release 55? Thank you!
Flags: needinfo?(amckay)
Kris, this might be related to your changes?
Flags: needinfo?(amckay) → needinfo?(kmaglione+bmo)
At this point, my best guess is that JSStructuredCloneBuffer is failing to allocate an initial segment with the requested capacity, but ignoring the result:

http://searchfox.org/mozilla-central/rev/20d16dadd336e0c6b97e3e19dc4ff907744b5734/mfbt/BufferList.h#85

The fact that every crash I've looked at is a 32bit system that's low on virtual memory makes that seem all the more likely.

It's a bit unfortunate that that constructor appears to be infallible, and doesn't document the possibility of failure, but is apparently fallible nonetheless...
Assignee: nobody → kmaglione+bmo
Flags: needinfo?(kmaglione+bmo)
Priority: -- → P1
Whiteboard: [triaged]
I think I would prefer a different approach here. Could we instead require that everyone who wants to use aInitialSize != 0 when they're not using an infallible allocator call an Init method that returns a MOZ_MUST_USE bool? Hopefully there aren't too many callers like that.

In the constructor, you could do:
  MOZ_ASSERT(aInitializeSize == 0 || IsSame<AllocPolicy, InfallibleAllocPolicy>::value);
Then you could add an Init method that would assert that mSegments.empty() and then call AllocateSegment.

I'm also a little worried about a two other things:

- I happened to notice this line, which I must have missed before:
http://searchfox.org/mozilla-central/rev/7cc377ce3f0a569c5c9b362d589705cd4fecc0ac/js/public/StructuredClone.h#257
I think that's what allows us to do |JSStructuredCloneData data(length, length, 4096);|. However, that won't initialize callbacks_, closure_, or ownTransferrables_. I wonder if that's a problem.

- We could make OOMs less likely in StructuredCloneBlob.cpp if we didn't allocate the memory contiguously:
http://searchfox.org/mozilla-central/rev/7cc377ce3f0a569c5c9b362d589705cd4fecc0ac/dom/base/StructuredCloneBlob.cpp#137
To do that, instead of using an Init() method as I described above, you could add a more generic method that adds |size| bytes to the buffer without initializing it. Then you could repeatedly call this in 4K segments and JS_ReadBytes into each of those segments.
(In reply to Bill McCloskey (:billm) from comment #7)
> I think I would prefer a different approach here. Could we instead require
> that everyone who wants to use aInitialSize != 0 when they're not using an
> infallible allocator call an Init method that returns a MOZ_MUST_USE bool?
> Hopefully there aren't too many callers like that.

Yeah, I thought about that after I posted the patches, and I think it would be my preferred approach too. Based on the try run I did after I added the assertions, there are only two callers that use a fallible allocator with that constructor.

> I'm also a little worried about a two other things:
> 
> - I happened to notice this line, which I must have missed before:
> http://searchfox.org/mozilla-central/rev/
> 7cc377ce3f0a569c5c9b362d589705cd4fecc0ac/js/public/StructuredClone.h#257
> I think that's what allows us to do |JSStructuredCloneData data(length,
> length, 4096);|. However, that won't initialize callbacks_, closure_, or
> ownTransferrables_. I wonder if that's a problem.

Yeah, I was wondering about that too, but it didn't seem to be causing any problems, so I assumed I was probably missing something. But it seems like it would be safer to always initialize them regardless.

> - We could make OOMs less likely in StructuredCloneBlob.cpp if we didn't
> allocate the memory contiguously:
> http://searchfox.org/mozilla-central/rev/
> 7cc377ce3f0a569c5c9b362d589705cd4fecc0ac/dom/base/StructuredCloneBlob.cpp#137
> To do that, instead of using an Init() method as I described above, you
> could add a more generic method that adds |size| bytes to the buffer without
> initializing it. Then you could repeatedly call this in 4K segments and
> JS_ReadBytes into each of those segments.

Yeah, I did consider that, but I wondered how much difference it would make. If we're already hitting an OOM at this point, I suspect that no matter what we do, we're likely to hit it again before we finish deserializing the structured clone blob, or the next time we try to deserialize a large IPC message (since the IPC code follows the same patter, but with an infallible allocator).

But I guess it couldn't hurt. We'd at least be less likely to fail due to heap fragmentation.
(In reply to Kris Maglione [:kmag] (busy; behind on reviews) from comment #8)
> Yeah, I did consider that, but I wondered how much difference it would make.
> If we're already hitting an OOM at this point, I suspect that no matter what
> we do, we're likely to hit it again before we finish deserializing the
> structured clone blob, or the next time we try to deserialize a large IPC
> message (since the IPC code follows the same patter, but with an infallible
> allocator).
> 
> But I guess it couldn't hurt. We'd at least be less likely to fail due to
> heap fragmentation.

We try pretty hard to allocate to avoid allocating large contiguous blocks, especially in the IPC code. Most OOM failures I've seen are due to the inability to allocate a big contiguous block rather than a failure to allocate a small block. And when we remove the big contiguous allocations, crash rates do seem to go down rather than shift elsewhere.
Attachment #8878713 - Attachment is obsolete: true
Attachment #8878713 - Flags: review?(wmccloskey)
tracking for 55, this is a top crash in early 55 beta
Comment on attachment 8879339 [details]
Bug 1373579: Part 1 - Initialize all members when using an inherited constructor.

https://reviewboard.mozilla.org/r/150632/#review155836
Attachment #8879339 - Flags: review?(wmccloskey) → review+
Comment on attachment 8878712 [details]
Bug 1373579: Part 2 - Deserialize StructuredCloneBlob in segments rather than a single large allocation.

https://reviewboard.mozilla.org/r/150018/#review155838

This looks nice. Thanks.

::: dom/base/StructuredCloneBlob.cpp:137
(Diff revision 2)
>    if (blobCount) {
>      BlobImpls().AppendElements(&aHolder->BlobImpls()[blobOffset], blobCount);
>    }
>  
> -  JSStructuredCloneData data(length, length, 4096);
> -  if (!JS_ReadBytes(aReader, data.Start(), length)) {
> +  JSStructuredCloneData data;
> +  size_t size = 0;

|size| should be scoped inside the loop.

::: mfbt/BufferList.h:267
(Diff revision 2)
>    // Copies aSize bytes from aData into the BufferList. The storage for these
>    // bytes may be split across multiple buffers. Size() is increased by aSize.
>    inline bool WriteBytes(const char* aData, size_t aSize);
>  
> +  // Allocates a buffer of at most |aMaxBytes| bytes and, if successful, returns
> +  // that buffer, and places its size in |aSize|.

Please say, "If unsuccessful, returns null and leaves aSize undefined."

::: mfbt/BufferList.h:351
(Diff revision 2)
>  {
>    MOZ_RELEASE_ASSERT(mOwning);
>    MOZ_RELEASE_ASSERT(mStandardCapacity);
>  
>    size_t copied = 0;
> -  size_t remaining = aSize;
> +  size_t toCopy = 0;

toCopy should be scoped inside the loop.
Attachment #8878712 - Flags: review?(wmccloskey) → review+
Comment on attachment 8878714 [details]
Bug 1373579: Part 3 - Require fallible Init method rather than infallible constructor when using fallible allocator.

https://reviewboard.mozilla.org/r/150022/#review155842

::: mfbt/BufferList.h:70
(Diff revision 2)
>    // aInitialCapacity is allocated automatically. This data will be contiguous
> -  // an can be accessed via |Start()|. Subsequent buffers will be allocated with
> +  // and can be accessed via |Start()| if, and only if, an infallible alloc
> +  // policy is used. If a fallible alloc policy is used, the fallible |Init()|

This comment makes it sound like you can pass aInitialSize != 0 with a fallible allocator, but you might not get a contiguous buffer. I think it would be clearer to say that aInitialSize must be 0 when using a fallible allocator.

::: mfbt/BufferList.h:129
(Diff revision 2)
>  
> +  // Initializes the BufferList with a segment of the given size and capacity.
> +  // May only be called once, before any segments have been allocated.
> +  bool Init(size_t aInitialSize, size_t aInitialCapacity)
> +  {
> +    MOZ_ASSERT(mSegments.empty());

Maybe also assert that aInitialCapacity != 0? Might be nice to have an aCapacity != 0 assertion in AllocateSegment.

::: mfbt/BufferList.h:278
(Diff revision 2)
> -  char* Start() { return mSegments[0].mData; }
> +  char* Start()
> +  {
> +    MOZ_RELEASE_ASSERT(!mSegments.empty());
> +    return mSegments[0].mData;
> +  }
>    const char* Start() const { return mSegments[0].mData; }

Shouldn't we do the same thing for this version?
Attachment #8878714 - Flags: review?(wmccloskey) → review+
Comment on attachment 8878712 [details]
Bug 1373579: Part 2 - Deserialize StructuredCloneBlob in segments rather than a single large allocation.

https://reviewboard.mozilla.org/r/150018/#review155838

> |size| should be scoped inside the loop.

I initially did have it scoped inside the loop, but moved it outside to avoid initializing it for each iteration. But I guess that's not really worth worrying about.
Comment on attachment 8878714 [details]
Bug 1373579: Part 3 - Require fallible Init method rather than infallible constructor when using fallible allocator.

https://reviewboard.mozilla.org/r/150022/#review155842

> Shouldn't we do the same thing for this version?

Maybe. In previous version of the patch, I initially added the `Size()` check assertion to the const version, too, but if someone is calling the const version of `Start()` without having written data to the non-const version first, that seems like a different, and bigger, problem.
Comment on attachment 8878712 [details]
Bug 1373579: Part 2 - Deserialize StructuredCloneBlob in segments rather than a single large allocation.

https://reviewboard.mozilla.org/r/150018/#review155838

> I initially did have it scoped inside the loop, but moved it outside to avoid initializing it for each iteration. But I guess that's not really worth worrying about.

Why does it need to be initialized at all?
Comment on attachment 8878712 [details]
Bug 1373579: Part 2 - Deserialize StructuredCloneBlob in segments rather than a single large allocation.

https://reviewboard.mozilla.org/r/150018/#review155838

> Why does it need to be initialized at all?

We get "possibly uninitialized" warnings if it's not. Or, at least, I assumed we would, but given that the method is inlined, we may not. I'll check.
https://hg.mozilla.org/integration/mozilla-inbound/rev/3b5573bf558d580cc8efa698536974938b86f9c2
Bug 1373579: Part 1 - Initialize all members when using an inherited constructor. r=billm

https://hg.mozilla.org/integration/mozilla-inbound/rev/2b334bcc95a7867dd753ea651db06486fd449a7b
Bug 1373579: Part 2 - Deserialize StructuredCloneBlob in segments rather than a single large allocation. r=billm

https://hg.mozilla.org/integration/mozilla-inbound/rev/97eebd54e4e44ccbd2a8b927d64c5c81cdfbd1a2
Bug 1373579: Part 3 - Require fallible Init method rather than infallible constructor when using fallible allocator. r=billm
Comment on attachment 8878712 [details]
Bug 1373579: Part 2 - Deserialize StructuredCloneBlob in segments rather than a single large allocation.

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1356546
[User impact if declined]: This issue causes crashes and potential memory corruption when memory allocation fails.
[Is this code covered by automated tests?]: Yes, but not this error condition specifically, since it only occurs on OOM.
[Has the fix been verified in Nightly?]: No.
[Needs manual test from QE? If yes, steps to reproduce]: No. This only occurs in OOM, and would be impractical for QA to reproduce.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: Low-risk.
[Why is the change risky/not risky?]: The changes are relatively minimal, and the affected code is exercised by broad and extensive tests.
[String changes made/needed]: None.
Attachment #8878712 - Flags: approval-mozilla-beta?
Comment on attachment 8878712 [details]
Bug 1373579: Part 2 - Deserialize StructuredCloneBlob in segments rather than a single large allocation.

avoid triggering an oom condition, regression in 55, beta55+
Attachment #8878712 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Based on comment 22, this does not manual coverage. 
Updating the qe‑verify flag.
Flags: qe-verify-
Product: Toolkit → WebExtensions
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: