Bug 1263292: Windows e10s jemalloc4 startup permacrash since bug 1235633
Status: RESOLVED FIXED (opened 9 years ago, closed 9 years ago)
Category: Core :: Memory Allocator (defect, P3)
Target milestone: mozilla48
Reporter: RyanVM
Assigned: billm
Keywords: crash
Attachments (1 file)
patch (1.20 KB): jld: review+, RyanVM: feedback+
This is with upstream jemalloc tip (includes the fix for bug 1261226). I bisected this down to the patch from bug 1235633 as the cause.
https://hg.mozilla.org/integration/mozilla-inbound/rev/dd3e03fcb06b
I've confirmed that this only reproduces with jemalloc4 enabled and that both win32 and win64 builds are affected. The exact top frame of the stack varies from run to run, but it's always in arena dalloc functions.
Crash stack:
mozglue.dll!arena_run_dalloc(arena_s * arena, arena_run_s * run, bool dirty, bool cleaned, bool decommitted) Line 1907
mozglue.dll!arena_dalloc_large_locked_impl(arena_s * arena, arena_chunk_s * chunk, void * ptr, bool junked) Line 2815
mozglue.dll!je_arena_dalloc_large(tsd_s * tsd, arena_s * arena, arena_chunk_s * chunk, void * ptr) Line 2831
mozglue.dll!je_isqalloc(tsd_s * tsd, void * ptr, unsigned __int64 size, tcache_s * tcache) Line 1100
mozglue.dll!je_arena_ralloc(tsd_s * tsd, arena_s * arena, void * ptr, unsigned __int64 oldsize, unsigned __int64 size, unsigned __int64 alignment, bool zero, tcache_s * tcache) Line 3130
mozglue.dll!je_realloc(void * ptr, unsigned __int64 size) Line 1887
mozglue.dll!realloc_impl(void * ptr, unsigned __int64 size) Line 191
xul.dll!Buffer::try_realloc(unsigned __int64 newlength) Line 54
xul.dll!Buffer::assign(const char * bytes, unsigned __int64 length) Line 87
xul.dll!IPC::Channel::ChannelImpl::ProcessIncomingMessages(base::MessagePumpForIO::IOContext * context, unsigned long bytes_read) Line 430
xul.dll!IPC::Channel::ChannelImpl::OnIOCompleted(base::MessagePumpForIO::IOContext * context, unsigned long bytes_transfered, unsigned long error) Line 515
xul.dll!base::MessagePumpForIO::WaitForIOCompletion(unsigned long timeout, base::MessagePumpForIO::IOHandler * filter) Line 495
xul.dll!base::MessagePumpForIO::DoRunLoop() Line 439
xul.dll!base::MessagePumpWin::RunWithDispatcher(base::MessagePump::Delegate * delegate, base::MessagePumpWin::Dispatcher * dispatcher) Line 56
xul.dll!MessageLoop::RunHandler() Line 224
xul.dll!MessageLoop::Run() Line 204
xul.dll!base::Thread::ThreadMain() Line 177
xul.dll!ThreadEntry(void * arg) Line 256
Comment 2 • 9 years ago (Reporter)
Add MOZ_JEMALLOC4=1 to your mozconfig and apply https://people.mozilla.org/~rvandermeulen/jemalloc so you don't crash in xpcshell.exe during packaging. Beyond that, just launching via |./mach run| should crash on startup.
Flags: needinfo?(ryanvm)
Comment 3 • 9 years ago
Does it happen with mozjemalloc?
Does it happen if you build with a non-updated jemalloc4 with https://hg.mozilla.org/mozilla-central/rev/0a14d675236e reverted?
What about a non-updated jemalloc4 with a cherry-pick of https://github.com/jemalloc/jemalloc/commit/4a8abbb400afe695f145a487380c04a946500bc6 ?
If you still have the instructions to get an allocation log, can you get one?
(In reply to Bill McCloskey (:billm) from comment #1)
> How do I reproduce this?
Set UPSTREAM_COMMIT to dev in memory/jemalloc/upstream.info, run memory/jemalloc/update.sh, then build with MOZ_JEMALLOC4=1 set.
Comment 4 • 9 years ago (Reporter)
From a debug build, but not entirely sure if it's relevant or not:
Assertion failure: mRawPtr != 0 (You can't dereference a NULL RefPtr with operator->().), at objdir-fx-64-debug\dist\include\mozilla/RefPtr.h:297
#01: D3DVsyncSource::D3DVsyncDisplay::VBlankLoop (gfx\thebes\gfxwindowsplatform.cpp:2783)
#02: RunnableMethod<D3DVsyncSource::D3DVsyncDisplay,void (__cdecl D3DVsyncSource::D3DVsyncDisplay::*)(void) __ptr64,mozilla::Tuple<> >::Run (ipc\chromium\src\base\task.h:290)
#03: MessageLoop::RunTask (ipc\chromium\src\base\message_loop.cc:350)
#04: MessageLoop::DeferOrRunPendingTask (ipc\chromium\src\base\message_loop.cc:360)
#05: MessageLoop::DoWork (ipc\chromium\src\base\message_loop.cc:444)
#06: base::MessagePumpDefault::Run (ipc\chromium\src\base\message_pump_default.cc:35)
#07: MessageLoop::RunHandler (ipc\chromium\src\base\message_loop.cc:224)
#08: MessageLoop::Run (ipc\chromium\src\base\message_loop.cc:204)
#09: base::Thread::ThreadMain (ipc\chromium\src\base\thread.cc:177)
#10: `anonymous namespace'::ThreadFunc (ipc\chromium\src\base\platform_thread_win.cc:27)
#11: BaseThreadInitThunk[C:\Windows\system32\KERNEL32.DLL +0x18102]
#12: RtlUserThreadStart[C:\Windows\SYSTEM32\ntdll.dll +0x5c5b4]
Comment 5 • 9 years ago (Assignee)
Does the DEBUG crash happen without my patch?
Flags: needinfo?(ryanvm)
Comment 6 • 9 years ago (Reporter)
(In reply to Mike Hommey [:glandium] from comment #3)
> Does it happen with mozjemalloc?
No. I originally confirmed it was a jemalloc4 issue by removing all jemalloc-related entries from my .mozconfig. I also left just |--enable-jemalloc| set without MOZ_JEMALLOC4=1 and couldn't reproduce.
> Does it happen if you build with a non-updated jemalloc4 with
> https://hg.mozilla.org/mozilla-central/rev/0a14d675236e reverted?
Crash
> What about a non-updated jemalloc4 with a cherry-pick of
> https://github.com/jemalloc/jemalloc/commit/
> 4a8abbb400afe695f145a487380c04a946500bc6 ?
Crash
> If you still have the instructions to get an allocation log, can you get one?
I'm not sure how to get that from the content process. I set the env vars, but the log only covered the parent process.
(In reply to Bill McCloskey (:billm) from comment #5)
> Does the DEBUG crash happen without my patch?
It does not.
Flags: needinfo?(ryanvm)
Comment 7 • 9 years ago (Reporter)
BTW, these crashes are reproducible on Try as well.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d0996368bb81&group_state=expanded&filter-searchStr=e10s
Fun story, Marionette finished green even though the logs clearly show it also crashing. I've filed a bug for that little doozy too :).
Updated • 9 years ago
tracking-e10s: --- → +
Priority: -- → P3
Comment 8 • 9 years ago (Assignee)
In try_realloc I forgot to consider the case where newlength is 0. In that case we'll get null back from realloc and our buffer gets freed. We need to make sure that we set mReserved correctly in this case or else we'll crash.
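The attached patch is not reproduced in this thread, so the following is only a minimal sketch of the idea described above, not the actual fix. `SketchBuffer` and its members are hypothetical stand-ins that borrow the names `try_realloc`, `assign`, and `mReserved` from the discussion. As an assumption of this sketch, the zero-length case is handled with an explicit `free` instead of `realloc(ptr, 0)`, since the result of a zero-size `realloc` is not portable across C library implementations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the IPC Buffer class named in the crash stack.
// Member names follow the bug discussion; the bodies are assumptions.
class SketchBuffer {
 public:
  SketchBuffer() = default;
  ~SketchBuffer() { std::free(mBuffer); }
  SketchBuffer(const SketchBuffer&) = delete;
  SketchBuffer& operator=(const SketchBuffer&) = delete;

  // Resize the backing store to exactly newlength bytes.
  // Returns false only when a non-zero allocation fails.
  bool try_realloc(std::size_t newlength) {
    if (newlength == 0) {
      // Zero-length request: release the storage and keep the bookkeeping
      // in sync, so nothing later trusts a freed pointer or a stale size.
      std::free(mBuffer);
      mBuffer = nullptr;
      mReserved = 0;
      mSize = 0;
      return true;
    }
    char* buffer = static_cast<char*>(std::realloc(mBuffer, newlength));
    if (!buffer) {
      return false;  // The old block is still valid; state is unchanged.
    }
    mBuffer = buffer;
    mReserved = newlength;
    if (mSize > mReserved) {
      mSize = mReserved;  // Shrinking truncates the logical contents.
    }
    return true;
  }

  // Replace the contents with a copy of bytes[0..length).
  bool assign(const char* bytes, std::size_t length) {
    if (!try_realloc(length)) {
      return false;
    }
    if (length != 0) {
      std::memcpy(mBuffer, bytes, length);
    }
    mSize = length;
    assert(mSize <= mReserved);
    return true;
  }

  std::size_t size() const { return mSize; }
  std::size_t reserved() const { return mReserved; }
  const char* data() const { return mBuffer; }

 private:
  char* mBuffer = nullptr;
  std::size_t mSize = 0;
  std::size_t mReserved = 0;
};
```

In this sketch, assigning a zero-length message after a non-empty one leaves the buffer with a null pointer and a reserved size of 0, so a later `assign` starts from a consistent state rather than a dangling pointer.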
Comment 9 • 9 years ago
Comment on attachment 8741196 (patch)
Review of attachment 8741196:
-----------------------------------------------------------------
Sorry for missing that the first time.
Attachment #8741196 - Flags: review?(jld) → review+
Comment 10 • 9 years ago (Reporter)
Comment on attachment 8741196 (patch)
Works great locally and on Try. Thanks!
Attachment #8741196 - Flags: feedback+
Comment 11 • 9 years ago
Comment 12 • 9 years ago
bugherder
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48