Closed
Bug 1270938
Opened 9 years ago
Closed 9 years ago
Hang while running APZCBasicTester.OverScroll_Bug1152051a (infinite loop in BlockingResourceBase::Release())
Categories
(Core :: Panning and Zooming, defect)
Core
Panning and Zooming
Tracking
()
RESOLVED
FIXED
mozilla49
| Tracking | Status | |
|---|---|---|
| firefox49 | --- | fixed |
People
(Reporter: botond, Assigned: erahm)
Details
(Whiteboard: [gfx-noted])
Attachments
(3 files, 1 obsolete file)
bd339 reported (in bug 1170062 comment 7) an issue where running APZ gtests resulted in a hang while running a particular test, APZCBasicTester.OverScroll_Bug1152051a
I can't reproduce the hang, but bd339 sent me a partial backtrace:
[14:20] <bd339> #0 0x00007f05d0debee2 in mozilla::BlockingResourceBase::Release (this=0x7f05c8ae71e8) at /xpcom/glue/BlockingResourceBase.cpp:339
[14:20] <bd339> #1 0x00007f05d0dec27d in mozilla::ReentrantMonitor::Exit (this=0x7f05c8ae71e8) at /xpcom/glue/BlockingResourceBase.cpp:444
[14:20] <bd339> #2 0x00007f05d0843793 in mozilla::ReentrantMonitorAutoEnter::~ReentrantMonitorAutoEnter (this=0x7ffeed596c00, __in_chrg=<optimized out>)
[14:20] <bd339> at /obj-x86_64-pc-linux-gnu/dist/include/mozilla/ReentrantMonitor.h:185
[14:20] <bd339> #3 0x00007f05d2287e47 in mozilla::layers::StateChangeNotificationBlocker::StateChangeNotificationBlocker (this=0x7ffeed596c70, aApzc=0x7f05c8ae7000)
[14:20] <bd339> at /gfx/layers/apz/src/AsyncPanZoomController.cpp:418
Not sure what's going on here... infinite loop in BlockingResourceBase::Release() maybe?
Attached is a list of the other threads that are running.
| Reporter | ||
Comment 1•9 years ago
|
||
(In reply to Botond Ballo [:botond] from comment #0)
> Not sure what's going on here... infinite loop in
> BlockingResourceBase::Release() maybe?
Yeah - bd339 said one CPU core is maxed out during the hang. So it's not a hang, but an infinite loop, seemingly here:
https://dxr.mozilla.org/mozilla-central/rev/e5a10bc7dac4ee2453d8319165c1f6578203eac7/xpcom/glue/BlockingResourceBase.cpp#364
| Reporter | ||
Comment 2•9 years ago
|
||
Nathan, have you seen this sort of infinite loop in BlockingResourceBase::Release() before? Do you have any suggestions for how to further diagnose / debug one?
Flags: needinfo?(nfroyd)
Summary: Hang while running APZCBasicTester.OverScroll_Bug1152051a → Hang while running APZCBasicTester.OverScroll_Bug1152051a (infinite loop in BlockingResourceBase::Release())
Comment 3•9 years ago
|
||
I haven't seen anything like this, though Eric has done some work with the deadlock detector, maybe he's seen something like this?
The warning for that case is a little worrisome:
NS_WARNING("Resource acquired at calling context\n");
NS_WARNING(" [stack trace unavailable]\n");
NS_WARNING("\nis being released in non-LIFO order; why?");
I'd be curious to see a fuller stack trace, especially if there happen to be traces from other threads involved.
Flags: needinfo?(nfroyd) → needinfo?(erahm)
| Reporter | ||
Comment 4•9 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #3)
> I'd be curious to see a fuller stack trace, especially if there happen to be
> traces from other threads involved.
bd339, could you obtain and post stack traces for all threads? You can get them with a single command, "thread apply all bt".
Flags: needinfo?(bd339)
| Assignee | ||
Comment 5•9 years ago
|
||
In theory to get stacks you have to enable them at build time: https://dxr.mozilla.org/mozilla-central/rev/e5a10bc7dac4ee2453d8319165c1f6578203eac7/xpcom/glue/BlockingResourceBase.h#20-21, but it looks like we missed adding them back to this spot in bug 1050445.
I'm guessing re-entrant + async stuff is wreaking havoc. It predates me, but I'm probably the most up to date on it so I'll take a look at the Release-out-of-order logic.
| Assignee | ||
Comment 7•9 years ago
|
||
Through code inspection I'm not seeing how we could get into an infinite loop
there. This at least re-enables stack printing in the monitor case.
Attachment #8755567 -
Flags: review?(nfroyd)
| Assignee | ||
Updated•9 years ago
|
Assignee: nobody → erahm
Status: NEW → ASSIGNED
| Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(erahm)
Comment 8•9 years ago
|
||
Comment on attachment 8755567 [details] [diff] [review]
Add stack traces for blocking resource base warnings
Review of attachment 8755567 [details] [diff] [review]:
-----------------------------------------------------------------
r=me, but I'd like to double-check the addition of the second Print call below.
::: xpcom/glue/BlockingResourceBase.cpp
@@ +418,5 @@
> NS_WARNING(
> + "Re-entering ReentrantMonitor after acquiring other resources.");
> +
> + nsCString tmp;
> + Print(tmp);
I don't think this one is necessary, as CheckAcquire() below should Print() this via the PrintCycle call?
Attachment #8755567 -
Flags: review?(nfroyd) → review+
| Assignee | ||
Comment 9•9 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #8)
> Comment on attachment 8755567 [details] [diff] [review]
> Add stack traces for blocking resource base warnings
>
> Review of attachment 8755567 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> r=me, but I'd like to double-check the addition of the second Print call
> below.
>
> ::: xpcom/glue/BlockingResourceBase.cpp
> @@ +418,5 @@
> > NS_WARNING(
> > + "Re-entering ReentrantMonitor after acquiring other resources.");
> > +
> > + nsCString tmp;
> > + Print(tmp);
>
> I don't think this one is necessary, as CheckAcquire() below should Print()
> this via the PrintCycle call?
Yeah I'll remove the Print call.
| Assignee | ||
Comment 10•9 years ago
|
||
| Assignee | ||
Updated•9 years ago
|
Attachment #8755567 -
Attachment is obsolete: true
| Assignee | ||
Updated•9 years ago
|
Attachment #8755606 -
Flags: review+
| Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed
Comment 11•9 years ago
|
||
Keywords: checkin-needed
Comment 12•9 years ago
|
||
| bugherder | ||
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
You need to log in
before you can comment on or make changes to this bug.
Description
•