Closed
Bug 1263292
Opened 8 years ago
Closed 8 years ago
Windows e10s jemalloc4 startup permacrash since bug 1235633
Categories
(Core :: Memory Allocator, defect, P3)
Tracking
()
RESOLVED
FIXED
mozilla48
People
(Reporter: RyanVM, Assigned: billm)
References
Details
(Keywords: crash)
Attachments
(1 file)
patch (1.20 KB): jld: review+, RyanVM: feedback+
This is with upstream jemalloc tip (includes the fix for bug 1261226). I bisected this down to the patch from bug 1235633 as the cause.
https://hg.mozilla.org/integration/mozilla-inbound/rev/dd3e03fcb06b

I've confirmed that this only reproduces with jemalloc4 enabled and that both win32 and win64 builds are affected. The exact top frame of the stack varies from run to run, but it's always in arena dalloc functions.

Crash stack:
mozglue.dll!arena_run_dalloc(arena_s * arena, arena_run_s * run, bool dirty, bool cleaned, bool decommitted) Line 1907
mozglue.dll!arena_dalloc_large_locked_impl(arena_s * arena, arena_chunk_s * chunk, void * ptr, bool junked) Line 2815
mozglue.dll!je_arena_dalloc_large(tsd_s * tsd, arena_s * arena, arena_chunk_s * chunk, void * ptr) Line 2831
mozglue.dll!je_isqalloc(tsd_s * tsd, void * ptr, unsigned __int64 size, tcache_s * tcache) Line 1100
mozglue.dll!je_arena_ralloc(tsd_s * tsd, arena_s * arena, void * ptr, unsigned __int64 oldsize, unsigned __int64 size, unsigned __int64 alignment, bool zero, tcache_s * tcache) Line 3130
mozglue.dll!je_realloc(void * ptr, unsigned __int64 size) Line 1887
mozglue.dll!realloc_impl(void * ptr, unsigned __int64 size) Line 191
xul.dll!Buffer::try_realloc(unsigned __int64 newlength) Line 54
xul.dll!Buffer::assign(const char * bytes, unsigned __int64 length) Line 87
xul.dll!IPC::Channel::ChannelImpl::ProcessIncomingMessages(base::MessagePumpForIO::IOContext * context, unsigned long bytes_read) Line 430
xul.dll!IPC::Channel::ChannelImpl::OnIOCompleted(base::MessagePumpForIO::IOContext * context, unsigned long bytes_transfered, unsigned long error) Line 515
xul.dll!base::MessagePumpForIO::WaitForIOCompletion(unsigned long timeout, base::MessagePumpForIO::IOHandler * filter) Line 495
xul.dll!base::MessagePumpForIO::DoRunLoop() Line 439
xul.dll!base::MessagePumpWin::RunWithDispatcher(base::MessagePump::Delegate * delegate, base::MessagePumpWin::Dispatcher * dispatcher) Line 56
xul.dll!MessageLoop::RunHandler() Line 224
xul.dll!MessageLoop::Run() Line 204
xul.dll!base::Thread::ThreadMain() Line 177
xul.dll!ThreadEntry(void * arg) Line 256
Reporter
Comment 2 • 8 years ago
Add MOZ_JEMALLOC4=1 to your mozconfig and apply https://people.mozilla.org/~rvandermeulen/jemalloc so you don't crash in xpcshell.exe during packaging. Beyond that, just launching via |./mach run| should crash on startup.
Flags: needinfo?(ryanvm)
Comment 3 • 8 years ago
Does it happen with mozjemalloc?
Does it happen if you build with a non-updated jemalloc4 with https://hg.mozilla.org/mozilla-central/rev/0a14d675236e reverted?
What about a non-updated jemalloc4 with a cherry-pick of https://github.com/jemalloc/jemalloc/commit/4a8abbb400afe695f145a487380c04a946500bc6 ?
If you still have the instructions to get an allocation log, can you get one?

(In reply to Bill McCloskey (:billm) from comment #1)
> How do I reproduce this?

Set UPSTREAM_COMMIT to dev in memory/jemalloc/upstream.info, run memory/jemalloc/update.sh, then build with MOZ_JEMALLOC4=1 set.
Reporter
Comment 4 • 8 years ago
From a debug build, but not entirely sure if it's relevant or not:

Assertion failure: mRawPtr != 0 (You can't dereference a NULL RefPtr with operator->().), at objdir-fx-64-debug\dist\include\mozilla/RefPtr.h:297
#01: D3DVsyncSource::D3DVsyncDisplay::VBlankLoop (gfx\thebes\gfxwindowsplatform.cpp:2783)
#02: RunnableMethod<D3DVsyncSource::D3DVsyncDisplay,void (__cdecl D3DVsyncSource::D3DVsyncDisplay::*)(void) __ptr64,mozilla::Tuple<> >::Run (ipc\chromium\src\base\task.h:290)
#03: MessageLoop::RunTask (ipc\chromium\src\base\message_loop.cc:350)
#04: MessageLoop::DeferOrRunPendingTask (ipc\chromium\src\base\message_loop.cc:360)
#05: MessageLoop::DoWork (ipc\chromium\src\base\message_loop.cc:444)
#06: base::MessagePumpDefault::Run (ipc\chromium\src\base\message_pump_default.cc:35)
#07: MessageLoop::RunHandler (ipc\chromium\src\base\message_loop.cc:224)
#08: MessageLoop::Run (ipc\chromium\src\base\message_loop.cc:204)
#09: base::Thread::ThreadMain (ipc\chromium\src\base\thread.cc:177)
#10: `anonymous namespace'::ThreadFunc (ipc\chromium\src\base\platform_thread_win.cc:27)
#11: BaseThreadInitThunk[C:\Windows\system32\KERNEL32.DLL +0x18102]
#12: RtlUserThreadStart[C:\Windows\SYSTEM32\ntdll.dll +0x5c5b4]
Assignee
Comment 5 • 8 years ago
Does the DEBUG crash happen without my patch?
Flags: needinfo?(ryanvm)
Reporter
Comment 6 • 8 years ago
(In reply to Mike Hommey [:glandium] from comment #3)
> Does it happen with mozjemalloc?

No. I originally confirmed it was a jemalloc4 issue by removing all jemalloc-related entries from my .mozconfig. I also left just |--enable-jemalloc| set without MOZ_JEMALLOC4=1 and couldn't reproduce.

> Does it happen if you build with a non-updated jemalloc4 with
> https://hg.mozilla.org/mozilla-central/rev/0a14d675236e reverted?

Crash

> What about a non-updated jemalloc4 with a cherry-pick of
> https://github.com/jemalloc/jemalloc/commit/4a8abbb400afe695f145a487380c04a946500bc6 ?

Crash

> If you still have the instructions to get an allocation log, can you get one?

I'm not sure how to get that from the content process. I set the env vars, but the log only covered the parent process.

(In reply to Bill McCloskey (:billm) from comment #5)
> Does the DEBUG crash happen without my patch?

It does not.
Flags: needinfo?(ryanvm)
Reporter
Comment 7 • 8 years ago
BTW, these crashes are reproducible on Try as well.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d0996368bb81&group_state=expanded&filter-searchStr=e10s

Fun story, Marionette finished green even though the logs clearly show it also crashing. I've filed a bug for that little doozy too :).
Updated • 8 years ago
tracking-e10s: --- → +
Priority: -- → P3
Assignee
Comment 8 • 8 years ago
In try_realloc I forgot to consider the case where newlength is 0. In that case we'll get null back from realloc and our buffer gets freed. We need to make sure that we set mReserved correctly in this case or else we'll crash.
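The case described in this comment can be sketched roughly as follows. This is an illustration only, not the attached patch: the class name SketchBuffer, the members mBuffer and mSize, and the accessors are hypothetical, and only try_realloc, assign, newlength, and mReserved are names taken from this bug. The sketch assumes the safest handling is to treat a zero-length request as an explicit free, so the result of realloc(ptr, 0) never has to be interpreted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the IPC Buffer class; not the Mozilla source.
class SketchBuffer {
 public:
  SketchBuffer() = default;
  SketchBuffer(const SketchBuffer&) = delete;
  SketchBuffer& operator=(const SketchBuffer&) = delete;
  ~SketchBuffer() { std::free(mBuffer); }

  // Resize the allocation to exactly newlength bytes.
  void try_realloc(std::size_t newlength) {
    if (newlength == 0) {
      // Zero-length request: release the block ourselves and reset the
      // bookkeeping, instead of relying on what realloc(ptr, 0) returns.
      std::free(mBuffer);
      mBuffer = nullptr;
      mReserved = 0;
      mSize = 0;
    } else {
      char* p = static_cast<char*>(std::realloc(mBuffer, newlength));
      if (p != nullptr) {
        mBuffer = p;
        mReserved = newlength;
      }
      // On failure realloc leaves the old block valid, so keep the old state.
    }
    // Invariant: the pointer is null exactly when nothing is reserved.
    assert((mBuffer == nullptr) == (mReserved == 0));
  }

  // Replace the contents with a copy of bytes[0..length).
  // Assumes bytes does not point into this buffer.
  bool assign(const char* bytes, std::size_t length) {
    try_realloc(length);
    if (length > mReserved) {
      return false;  // allocation failed; contents left unchanged
    }
    if (length != 0) {
      std::memcpy(mBuffer, bytes, length);
    }
    mSize = length;
    return true;
  }

  const char* data() const { return mBuffer; }
  std::size_t size() const { return mSize; }
  std::size_t reserved() const { return mReserved; }

 private:
  char* mBuffer = nullptr;
  std::size_t mSize = 0;
  std::size_t mReserved = 0;
};
```

With this shape, assigning an empty message and then a non-empty one leaves mReserved at 0 in between, so the later realloc starts from a null pointer rather than from a block that was already freed.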
Comment 9 • 8 years ago
Comment on attachment 8741196 [details] [diff] [review]
patch

Review of attachment 8741196 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for missing that the first time.
Attachment #8741196 - Flags: review?(jld) → review+
Reporter
Comment 10 • 8 years ago
Comment on attachment 8741196 [details] [diff] [review]
patch

Works great locally and on Try. Thanks!
Attachment #8741196 - Flags: feedback+
Comment 12 • 8 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/d1c487cc4ef2
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48