Closed Bug 1360361 Opened 7 years ago Closed 7 years ago

TelemetrySession.jsm handling "wake_notification" hangs/jangs the browser

Categories

(Core :: XPCOM, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 1369734

People

(Reporter: ehsan.akhgari, Assigned: cyu)

References

Details

(Whiteboard: [bhr])

Attachments

(1 file)

See <https://people-mozilla.org/~mlayzell/bhr/20170405/15.html#8> for example.  There are other similar hangs on that page also.  We recently dealt with a similar issue in bug 1357742.
Whiteboard: [bhr]
Assignee: nobody → cyu
Attachment #8880779 - Flags: review?(nfroyd)
Comment on attachment 8880779 [details] [diff] [review]
Don't report pseudo hangs when the system resumes from sleep.

Review of attachment 8880779 [details] [diff] [review]:
-----------------------------------------------------------------

We talked about this in person, and decided that with the current patch, we could run into a situation where we could be hanging across a sleep/wake cycle or we could be hanging after we woke up, but we'd discard that hang as not a "real" hang.  Since we also get notifications for when the computer is going to sleep, we could compensate for that, though, and Cervantes is going to try and rework the patch to take sleep times into account as well.
Attachment #8880779 - Flags: review?(nfroyd) → review-
After taking a closer look at the internals of BackgroundHangManager, I think the reported stacks that contains nsWindow::PostSleepWakeNotification() are real hangs or janks.

BackgroundHangManager accumulates its mIntervalNow only when the BgHangMonitor thread has the chance to run. Lack of activity (like system sleep) will make it stop the updates of mIntervalNow, which effectively stops the internal clock of BackgrounHangMonitor and all monitored threads. Take the following timeline for example:

Task starts --- HangMonitor tick (1) --- System Sleep --- (long sleep) --- System Resume ---  HangMonitor tick (2) --- Task ends

BackgroundHangManager::mIntervalNow doesn't accumulate from tick 1 to tick 2. System sleep or CPU scheduling that results in pseudo hangs are not reported.

I tried to reproduce this reported hang and caught this in a VS debugger. The main thread stack when BackgroundHangManager observed the start of a hang (hangTime = 148 ms when this stack is captured):

 	`anonymous namespace'::internal_GetHistogramMapEntry Line 435	C++
 	`anonymous namespace'::internal_GetHistogramEnumId Line 446	C++
>	TelemetryHistogram::CreateHistogramSnapshots Line 2258	C++
 	`anonymous namespace'::TelemetryImpl::SnapshotSubsessionHistograms Line 851	C++
 	_NS_InvokeByIndex Line 57	Unknown
 	XPCWrappedNative::CallMethod Line 1282	C++
 	XPC_WN_CallMethod Line 966	C++
 	js::InternalCallOrConstruct Line 470	C++
 	InternalCall Line 515	C++
 	Interpret Line 3060	C++
 	js::RunScript Line 410	C++
 	js::InternalCallOrConstruct Line 491	C++
 	InternalCall Line 515	C++
 	Interpret Line 3060	C++
 	js::RunScript Line 410	C++
 	js::InternalCallOrConstruct Line 491	C++
 	InternalCall Line 515	C++
 	Interpret Line 3060	C++
 	js::RunScript Line 410	C++
 	js::InternalCallOrConstruct Line 491	C++
 	InternalCall Line 515	C++
 	Interpret Line 3060	C++
 	js::RunScript Line 410	C++
 	js::InternalCallOrConstruct Line 491	C++
 	InternalCall Line 515	C++
 	Interpret Line 3060	C++
 	js::RunScript Line 410	C++
 	js::InternalCallOrConstruct Line 491	C++
 	InternalCall Line 515	C++
 	Interpret Line 3060	C++
 	js::RunScript Line 410	C++
 	js::InternalCallOrConstruct Line 491	C++
 	InternalCall Line 515	C++
 	Interpret Line 3060	C++
 	js::RunScript Line 410	C++
 	js::InternalCallOrConstruct Line 491	C++
 	InternalCall Line 515	C++
 	Interpret Line 3060	C++
 	js::RunScript Line 410	C++
 	js::InternalCallOrConstruct Line 491	C++
 	InternalCall Line 515	C++
 	js::Call Line 534	C++
 	JS_CallFunctionValue Line 2889	C++
 	nsXPCWrappedJSClass::CallMethod Line 1214	C++
 	nsXPCWrappedJS::CallMethod Line 615	C++
 	PrepareAndDispatch Line 85	C++
 	SharedStub Line 113	C++
 	nsObserverList::NotifyObservers Line 112	C++
 	nsObserverService::NotifyObservers Line 299	C++
 	nsWindow::PostSleepWakeNotification Line 6385	C++
 	nsWindow::ProcessMessage Line 5396	C++
 	nsWindow::WindowProcInternal Line 4930	C++
 	CallWindowProcCrashProtected Line 35	C++
 	nsWindow::WindowProc Line 4882	C++
 	[External Code]	
 	[Frames below may be incorrect and/or missing, no symbols loaded for msctf.dll]	
 	mozilla::widget::WinUtils::PeekMessageW Line 714	C++
 	nsAppShell::ProcessNextNativeEvent Line 361	C++
 	nsBaseAppShell::DoProcessNextNativeEvent Line 139	C++
 	nsBaseAppShell::OnProcessNextEvent Line 289	C++
 	nsThread::ProcessNextEvent Line 1363	C++
 	NS_ProcessNextEvent Line 489	C++
 	mozilla::ipc::MessagePump::Run Line 125	C++
 	MessageLoop::RunHandler Line 314	C++
 	MessageLoop::Run Line 294	C++
 	nsBaseAppShell::Run Line 158	C++
 	nsAppShell::Run Line 273	C++
 	nsAppStartup::Run Line 288	C++
 	XREMain::XRE_mainRun Line 4589	C++
 	XREMain::XRE_main Line 4772	C++
 	XRE_main Line 4867	C++
 	mozilla::BootstrapImpl::XRE_main Line 45	C++
 	do_main Line 238	C++
 	NS_internal_main Line 312	C++
 	wmain Line 118	C++
 	[External Code]	

This shows that handling of "wake_notification" in TelemetrySession.jsm may take too long and gets reported as a hang.

It looks like other "wake_notification" observers either dispatch a task to the main thread or don't take a lot of time to run. It's likely that the reported stacks are mainly from TelemetrySession.jsm.
Summary: BackgroundHangReporter reports long running hangs when the computer resumes from sleep (hangs with nsWindow::PostSleepWakeNotification(bool) on the stack) → TelemetrySession.jsm handling "wake_notification" hangs/jangs the browser
Possibly a dupe of bug 1369734 then?
Yes, what I observed is aborted _saveAbortedSessionPing(). Mark as duplicate.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: