Closed Bug 1244897 Opened 4 years ago Closed 3 years ago

Intermittent test_restyles.html | Assertion count 1 is greater than expected range 0-0 assertions.

Categories

(Core :: DOM: Animation, defect, P3)

45 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla53
Tracking Status
firefox50 --- wontfix
firefox51 --- wontfix
firefox52 --- wontfix
firefox53 --- fixed

People

(Reporter: KWierso, Assigned: hiro)

References

(Depends on 1 open bug)

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell fixed])

Attachments

(1 file)

https://treeherder.mozilla.org/logviewer.html#?job_id=20821974&repo=mozilla-inbound


 05:26:14     INFO -  Deadlock may happen for some other execution
 05:26:14     INFO -  [3608] ###!!! ASSERTION: Potential deadlock detected:
 05:26:14     INFO -  Cyclical dependency starts at
 05:26:14     INFO -  Mutex : ObservedDocShellMutex
 05:26:14     INFO -  Next dependency:
 05:26:14     INFO -  Mutex : StaticMutex (currently acquired)
 05:26:14     INFO -  Cycle completed at
 05:26:14     INFO -  Mutex : ObservedDocShellMutex
 05:26:14     INFO -  Deadlock may happen for some other execution
 05:26:14     INFO -  : 'Error', file /builds/slave/m-in-lx-d-00000000000000000000/build/src/xpcom/glue/BlockingResourceBase.cpp, line 307
 05:26:48     INFO -  #01: mozilla::BlockingResourceBase::CheckAcquire() [xpcom/glue/BlockingResourceBase.cpp:307]
 05:26:48     INFO -  #02: mozilla::OffTheBooksMutex::Lock() [xpcom/glue/BlockingResourceBase.cpp:382]
 05:26:48     INFO -  #03: mozilla::BaseAutoLock<mozilla::Mutex>::BaseAutoLock(mozilla::Mutex&, mozilla::detail::GuardObjectNotifier&&) [xpcom/glue/Mutex.h:164]
 05:26:48     INFO -  #04: mozilla::ObservedDocShell::ClearMarkers() [docshell/base/timeline/ObservedDocShell.cpp:48]
 05:26:48     INFO -  #05: mozilla::TimelineConsumers::RemoveConsumer(nsDocShell*) [docshell/base/timeline/TimelineConsumers.cpp:152]
 05:26:48     INFO -  #06: nsDocShell::SetRecordProfileTimelineMarkers(bool) [docshell/base/nsDocShell.cpp:2865]
 05:26:48     INFO -  #07: NS_InvokeByIndex
 05:26:48     INFO -  #08: CallMethodHelper::Call() [js/xpconnect/src/XPCWrappedNative.cpp:1416]
 05:26:48     INFO -  #09: XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) [js/xpconnect/src/XPCWrappedNative.cpp:1383]
 05:26:48     INFO -  #10: XPC_WN_GetterSetter(JSContext*, unsigned int, JS::Value*) [js/xpconnect/src/XPCWrappedNativeJSOps.cpp:1144]
 05:26:48     INFO -  #11: js::CallJSNative(JSContext*, bool (*)(JSContext*, unsigned int, JS::Value*), JS::CallArgs const&) [js/src/jscntxtinlines.h:236]
 05:26:48     INFO -  #12: js::Invoke(JSContext*, JS::CallArgs const&, js::MaybeConstruct) [js/src/vm/Interpreter.cpp:475]
 05:26:48     INFO -  #13: js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value const*, JS::MutableHandle<JS::Value>) [js/src/vm/Interpreter.cpp:527]
 05:26:48     INFO -  #14: js::InvokeSetter(JSContext*, JS::Value const&, JS::Value, JS::Handle<JS::Value>) [js/public/RootingAPI.h:719]
 05:26:48     INFO -  #15: js::NativeSetProperty(JSContext*, JS::Handle<js::NativeObject*>, JS::Handle<jsid>, JS::Handle<JS::Value>, JS::Handle<JS::Value>, js::QualifiedBool, JS::ObjectOpResult&) [js/src/vm/NativeObject.cpp:2289]
 05:26:48     INFO -  #16: js::SetProperty(JSContext*, JS::Handle<JSObject*>, JS::Handle<jsid>, JS::Handle<JS::Value>, JS::Handle<JS::Value>, JS::ObjectOpResult&) [js/src/vm/NativeObject.h:1489]
 05:26:48     INFO -  #17: Interpret [js/src/vm/Interpreter.cpp:287]
 05:26:48     INFO -  #18: js::RunScript(JSContext*, js::RunState&) [js/src/vm/Interpreter.cpp:425]
 05:26:48     INFO -  #19: js::Invoke(JSContext*, JS::CallArgs const&, js::MaybeConstruct) [js/src/vm/Interpreter.cpp:493]
 05:26:48     INFO -  #20: js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value const*, JS::MutableHandle<JS::Value>) [js/src/vm/Interpreter.cpp:527]
 05:26:48     INFO -  #21: JS::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, JS::HandleValueArray const&, JS::MutableHandle<JS::Value>) [js/src/jsapi.cpp:2883]
 05:26:48     INFO -  #22: mozilla::dom::AnyCallback::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, JS::MutableHandle<JS::Value>, mozilla::ErrorResult&) [obj-firefox/dom/bindings/PromiseBinding.cpp:78]
 05:26:48     INFO -  #23: mozilla::dom::AnyCallback::Call(JS::Handle<JS::Value>, JS::MutableHandle<JS::Value>, mozilla::ErrorResult&, char const*, mozilla::dom::CallbackObject::ExceptionHandling, JSCompartment*) [obj-firefox/dist/include/mozilla/dom/PromiseBinding.h:149]
 05:26:48     INFO -  #24: mozilla::dom::WrapperPromiseCallback::Call(JSContext*, JS::Handle<JS::Value>) [dom/promise/PromiseCallback.cpp:341]
 05:26:48     INFO -  #25: mozilla::dom::PromiseReactionJob::Run() [mfbt/Maybe.h:372]
 05:26:48     INFO -  #26: mozilla::dom::Promise::PerformMicroTaskCheckpoint() [xpcom/glue/nsDebug.h:40]
 05:26:48     INFO -  #27: mozilla::CycleCollectedJSRuntime::AfterProcessTask(unsigned int) [xpcom/base/CycleCollectedJSRuntime.cpp:1248]
 05:26:48     INFO -  #28: XPCJSRuntime::AfterProcessTask(unsigned int) [js/xpconnect/src/XPCJSRuntime.cpp:3661]
 05:26:48     INFO -  #29: nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:1010]
 05:26:48     INFO -  #30: NS_ProcessNextEvent(nsIThread*, bool) [xpcom/glue/nsThreadUtils.cpp:297]
 05:26:48     INFO -  #31: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) [ipc/glue/MessagePump.cpp:127]
 05:26:48     INFO -  #32: MessageLoop::RunInternal() [ipc/chromium/src/base/message_loop.cc:234]
 05:26:48     INFO -  #33: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:520]
 05:26:48     INFO -  #34: nsBaseAppShell::Run() [widget/nsBaseAppShell.cpp:158]
 05:26:48     INFO -  #35: nsAppStartup::Run() [toolkit/components/startup/nsAppStartup.cpp:282]
 05:26:48     INFO -  #36: XREMain::XRE_mainRun() [toolkit/xre/nsAppRunner.cpp:4279]
 05:26:48     INFO -  #37: XREMain::XRE_main(int, char**, nsXREAppData const*) [toolkit/xre/nsAppRunner.cpp:4375]
 05:26:48     INFO -  #38: XRE_main [toolkit/xre/nsAppRunner.cpp:4478]
 05:26:48     INFO -  #39: do_main [browser/app/nsBrowserApp.cpp:212]
 05:26:48     INFO -  #40: main [browser/app/nsBrowserApp.cpp:354]
Hiro, can you look into this?
Flags: needinfo?(hiikezoe)
This is very infrequently, but has surely happened.
https://treeherder.mozilla.org/logviewer.html#?job_id=17150819&repo=try

It seems that the assertion comes from profile timerline markers.  I am leaving ni? to me for now.
I can't find any suspicions to cause the dead lock among ObservedDocShellMutex for 
ObservedDocShell and TimelineConsumers::sMutex.

One thing I am little bit concerned is that nsDocShell::PopProfileTimelineMarkers calls ObservedDocShell::PopMarkers without acquiring TimelineConsumers::sMutex lock.  Should we call ObservedDocShell::PopMarkers through TimelineConsumers class to acquire the lock?

Victor, what do you think?
Flags: needinfo?(hiikezoe) → needinfo?(vporof)
(In reply to Hiroyuki Ikezoe (:hiro) from comment #4)
> 
> Victor, what do you think?

I don't see any particular reason why we could get races in between AddConsumer/RemoveConsumer and PopProfileTimelineMarkers (since that's pretty much the only way these paths could interfere with each other).

That being said, I also don't see any downside to doing the pop through TimelineConsumers and locking the mutex in there first. Let's try it out.
Flags: needinfo?(vporof)
Comment on attachment 8725010 [details]
MozReview Request: Bug 1244897 - Use TimerlineConsumers in PopProfileTimelineMarkers to ensure that all methods acquire TimelineConsumers Mutex. r?vporof

https://reviewboard.mozilla.org/r/37255/#review33899

::: docshell/base/timeline/ObservedDocShell.cpp:48
(Diff revision 1)
> +  MOZ_ASSERT(NS_IsMainThread());

Good catch.
Attachment #8725010 - Flags: review?(vporof) → review+
Comment on attachment 8725010 [details]
MozReview Request: Bug 1244897 - Use TimerlineConsumers in PopProfileTimelineMarkers to ensure that all methods acquire TimelineConsumers Mutex. r?vporof

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/37255/diff/1-2/
Forward declarations were wrong.  That can be compiled on Linux...
Keywords: leave-open
See Also: → 1253516
This assertion has been fixed.

Now we've been seeing real dead lock in bug 1253516.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Reopening.  The lock has been introduced in this bug was wrong.  See bug 1253516.


https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=24027118
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
See Also: → 1278588
Depending on bug 1278588.  As far as I can tell there is nothing we can do in dom/animation code.
Depends on: 1278588
Depending on bug 1324966, it might fix recent this failure.
Depends on: 1324966
I did a try with an idea (getting nsStyleContext for base style without style flush) that I think it makes this failure less frequent.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=805ce2ed004d71e21a8714cba8ef5676be86d71a&selectedJob=66922723

Here is a comparison result.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=8f314eae97bd9ae42666fc134e979dd81ee519a3&filter-platform=Linux%20debug&filter-job_type_symbol=c1

I think the idea has some effect to reduce failure ratio.

Here is another try with another idea (getting nsStyleContext for base style from nsIFrame).  This try was just pushed now, so I am not sure what we will see, but this is more preferable idea than the first one to me.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=457b762cb3cb020e139d5fad86fba6ea50304f6b
The latter idea is apparently wrong. There is a case that nsIFrame does not have StyleContext yet at the time.
Now bug 1330190 has been landed, which fixes recursive restyles what I've seen suspecting the cause of recent failure of this orange.  It's suboptimal of bug 1324966, but eliminates the recursive calls I think.
No longer depends on: 1324966
OK, no oranges after landing bug 1330190 so far.  Though I don't know yet the reason before bug 1305325 (this bug introduced the reason of recent failures),  I'd like to close this bug once and open a new bug if new oranges re-appear.
Joel, what do you think?
Flags: needinfo?(jmaher)
thanks :hiro for fixing the bug and following up here!
Status: REOPENED → RESOLVED
Closed: 4 years ago3 years ago
Flags: needinfo?(jmaher)
Resolution: --- → FIXED
thank you!
Assignee: nobody → hikezoe
Target Milestone: --- → mozilla53
Keywords: leave-open
Whiteboard: [stockwell fixed]
You need to log in before you can comment on or make changes to this bug.