Closed Bug 1592392 Opened 5 years ago Closed 2 years ago

Temporary intermittent browser hang in js::GCParallelTask::join() on Windows 10

Categories

(Core :: JavaScript: GC, defect, P2)

70 Branch
defect

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: wesley.wiser, Unassigned, NeedInfo)

Details

(Keywords: hang)

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:71.0) Gecko/20100101 Firefox/71.0

Steps to reproduce:

I have two coworkers who experience intermittent 3 to 10 second hangs in Firefox across various sites.

Symptoms include:

  • clicking a link in a web page at which point the browser becomes unresponsive for several seconds and then the page loads
  • typing in a textbox/textarea when a freeze occurs. After several seconds, the browser "catches up" and the proper keystrokes are registered.

Both users are using Firefox 70 on Windows 10 (latest version).

Actual results:

I sat with one of the users and captured a trace using the Firefox Profiler browser extension. The user performed various actions in our internal web app until they clicked a link and the browser hung for about 10 seconds. We captured a trace which I've attached.

The trace shows 8.3 seconds spent in the following (truncated) stack:

NtWaitForAlertByThreadId
RtlSleepConditionVariableSRW
SleepConditionVariableSRW
mozilla::detail::ConditionVariableImpl::wait_for(mozilla::detail::MutexImpl &,mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const &)
js::GCParallelTask::join()
void js::gc::GCRuntime::incrementalSlice(class js::SliceBudget & const, const class mozilla::Maybe<JSGCInvocationKind> & const, JS::GCReason, class js::gc::AutoGCSession & const)
js::gc::GCRuntime::IncrementalResult js::gc::GCRuntime::gcCycle(bool, class js::SliceBudget, const class mozilla::Maybe<JSGCInvocationKind> & const, JS::GCReason)
js::GCRuntime::collect
js::gc::GCRuntime::collect(bool,js::SliceBudget,mozilla::Maybe<JSGCInvocationKind> const &,JS::GCReason)
js::gc::GCRuntime::gcSlice(JS::GCReason,__int64)
js::gc::GCRuntime::gcIfNeededAtAllocation(JSContext *)
class js::AccessorShape * js::Allocate<js::AccessorShape,js::CanGC>(struct JSContext *)
js::Shape::new_(JSContext *,JS::Handle<js::StackShape>,unsigned int)
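
For readers unfamiliar with this kind of stack: the top frames show the calling thread parked on a condition variable inside join(), which means it is waiting for work running on another thread to finish. The block below is a minimal sketch of that pattern using the C++ standard library. ParallelTaskSketch and blockedTimeFor are hypothetical names made up for illustration; this is not SpiderMonkey's GCParallelTask, and it uses std::condition_variable::wait rather than the timed wait_for seen in the trace.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a task that runs on a helper thread.
// It only illustrates why join() can block the caller; it is not
// the real js::GCParallelTask.
class ParallelTaskSketch {
 public:
  ~ParallelTaskSketch() {
    // Avoid std::terminate if the owner forgot to join.
    if (worker_.joinable()) worker_.join();
  }

  // Run `work` on a helper thread and signal completion when it returns.
  void start(std::function<void()> work) {
    worker_ = std::thread([this, work] {
      work();
      {
        std::lock_guard<std::mutex> guard(lock_);
        finished_ = true;
      }
      done_.notify_all();
    });
  }

  // Block the calling thread until the helper has finished.
  // Returns how long the caller spent blocked.
  std::chrono::milliseconds join() {
    assert(worker_.joinable() && "join() called before start()");
    const auto begin = std::chrono::steady_clock::now();
    {
      std::unique_lock<std::mutex> guard(lock_);
      // The caller sleeps here for as long as the helper is still working.
      done_.wait(guard, [this] { return finished_; });
    }
    worker_.join();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - begin);
  }

 private:
  std::mutex lock_;
  std::condition_variable done_;
  bool finished_ = false;
  std::thread worker_;
};

// Simulate a helper that needs `helperWork` to finish and report how long
// the joining thread was stalled as a result.
inline std::chrono::milliseconds blockedTimeFor(
    std::chrono::milliseconds helperWork) {
  ParallelTaskSketch task;
  task.start([helperWork] { std::this_thread::sleep_for(helperWork); });
  return task.join();
}
```

In this sketch the time spent in join() tracks the time the helper needs, so a long wait in join() points at whatever the helper thread is doing (or waiting on) rather than at the joining thread itself.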

Component: Untriaged → JavaScript: GC
Keywords: hang
Product: Firefox → Core

The priority flag is not set for this bug.
:jonco, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jcoppeard)

Thanks for the bug report!

It's hard to investigate this without concrete steps to reproduce. Does this happen for any particular URLs? Is there any way I can trigger it?

I looked at the telemetry I could find for this version and it didn't indicate that this was a general problem.

Flags: needinfo?(jcoppeard) → needinfo?(wesley.wiser)
Priority: -- → P2

It's also possible that this is something that was fixed by a recent change (bug 1592537). Can you try a nightly build and see if the problem goes away?

Unfortunately I'm not hitting this issue myself, even though I'm running the same version of Firefox on similar hardware on Win 10. As I understand it, this issue happens frequently, tens of times per hour, but randomly. When I asked a coworker to show me the issue, he was able to reproduce it within 5 minutes. However, even after showing me the hang, if we tried exactly the same steps again, it didn't cause the hang every time.

It also doesn't seem to be limited to any particular web pages. My coworker mentioned that this happens to various pages across many different sites.

I will ask and see if I can get a repro from a publicly available website.

(In reply to Wesley Wiser from comment #4)
Interesting! Unfortunately bugs like this are often hard to reproduce reliably.

Can you ask your coworker if he can capture a profile that includes all threads? It would be great to know what the JS helper threads are doing. He can do this by setting the "Add custom threads by name" field to just a single comma ",".

Yes, I can do that. It sounds like he may have some time later today to try the latest Nightly as well so I'll report back with that info also.

Looking in my email history, I see that I have a separate profile from him, from another instance he was able to capture. It seems to have a slightly different stack ending in

NtSignalAndWaitForSingleObject
SignalObjectAndWait
0x7ffd67da12e3
0x7ffd67d950d9
0x7ffd67d8fa98
VirtualAlloc
chunk_alloc(unsigned __int64,unsigned __int64,bool,bool *)
arena_t::PallocHuge(unsigned __int64,unsigned __int64,bool)
moz_arena_malloc
js::LifoAlloc::getOrCreateChunk(unsigned __int64)
js::TypeHashSet::InsertTry<JS::PropertyKey,js::ObjectGroup::Property,js::ObjectGroup::Property>(js::LifoAlloc &,js::ObjectGroup::Property * * &,unsigned int &,JS::PropertyKey)
js::ObjectGroup::sweep(js::AutoSweepObjectGroup const &)
js::gc::IncrementalProgress js::gc::GCRuntime::sweepTypeInformation(class JSFreeOp *, class js::SliceBudget & const)
js::gc::IncrementalProgress sweepaction::SweepActionSequence::run(struct js::gc::SweepAction::Args & const)
js::gc::IncrementalProgress sweepaction::SweepActionForEach<js::gc::SweepGroupZonesIter,JSRuntime *>::run(struct js::gc::SweepAction::Args & const)
js::gc::IncrementalProgress sweepaction::SweepActionSequence::run(struct js::gc::SweepAction::Args & const)
js::gc::IncrementalProgress sweepaction::SweepActionForEach<js::gc::SweepGroupsIter,JSRuntime *>::run(struct js::gc::SweepAction::Args & const)
js::gc::IncrementalProgress js::gc::GCRuntime::performSweepActions(class js::SliceBudget & const)
void js::gc::GCRuntime::incrementalSlice(class js::SliceBudget & const, const class mozilla::Maybe<JSGCInvocationKind> & const, JS::GCReason, class js::gc::AutoGCSession & const)
js::gc::GCRuntime::IncrementalResult js::gc::GCRuntime::gcCycle(bool, class js::SliceBudget, const class mozilla::Maybe<JSGCInvocationKind> & const, JS::GCReason)
js::GCRuntime::collect
js::gc::GCRuntime::collect(bool,js::SliceBudget,mozilla::Maybe<JSGCInvocationKind> const &,JS::GCReason)
js::gc::GCRuntime::gcSlice(JS::GCReason,__int64)
nsJSContext::GarbageCollectNow(JS::GCReason,nsJSContext::IsIncremental,nsJSContext::IsShrinking,__int64)
nsJSContext::GarbageCollectNow CC_WAITING
static bool InterSliceGCRunnerFired(class mozilla::TimeStamp, void *)
bool std::_Func_impl_no_alloc<`lambda at z:/task_1571244707/build/src/dom/base/nsJSEnvironment.cpp:1776:7',bool,mozilla::TimeStamp>::_Do_call(class mozilla::TimeStamp *)
nsresult mozilla::IdleTaskRunner::Run()
nsThread::ProcessNextEvent(bool,bool *)
NS_ProcessNextEvent(nsIThread *,bool)
mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate *)
MessageLoop::RunHandler()
MessageLoop::Run()
nsBaseAppShell::Run()
nsAppShell::Run()
XRE_RunAppShell()
MessageLoop::RunHandler()
MessageLoop::Run()
XRE_InitChildProcess(int,char * * const,XREChildData const *)
XRE_InitChildProcess
static int content_process_main(class mozilla::Bootstrap *, int, char * *)
static int NS_internal_main(int, char * *, char * *)
wmain
static int __scrt_common_main_seh()
BaseThreadInitThunk
RtlUserThreadStart
(root)

I've attached that file (Firefox 2019-10-30 13.19 profile.json.gz) in case that's helpful.

(In reply to Wesley Wiser from comment #6)
Thanks. That one's hanging inside VirtualAlloc, which is strange...

Status: UNCONFIRMED → RESOLVED
Closed: 2 years ago
Resolution: --- → INCOMPLETE