Intermittent GECKO(8531) | Assertion failure: mOwningEventTarget->IsOnCurrentThread(), at /builds/worker/workspace/build/src/xpcom/threads/LazyIdleThread.cpp:461
Categories
(Toolkit :: Safe Browsing, defect, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox71 | --- | unaffected |
firefox72 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: dimi)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: assertion, intermittent-failure, regression, Whiteboard: [stockwell disabled])
Crash Data
Attachments
(2 files)
Filed by: malexandru [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=272776557&repo=autoland
Full log: https://queue.taskcluster.net/v1/task/BCQhmO5FQGuHQY2D5U7NdQ/runs/0/artifacts/public/logs/live_backing.log
[task 2019-10-24T15:26:19.758Z] 15:26:19 INFO - GECKO(8531) | [Parent 8531, Main Thread] WARNING: Cannot set transparency mode on non-popup windows.: file /builds/worker/workspace/build/src/widget/gtk/nsWindow.cpp, line 4552
[task 2019-10-24T15:26:19.802Z] 15:26:19 INFO - GECKO(8531) | [Parent 8531, Main Thread] WARNING: Cannot set transparency mode on non-popup windows.: file /builds/worker/workspace/build/src/widget/gtk/nsWindow.cpp, line 4552
[task 2019-10-24T15:26:19.838Z] 15:26:19 INFO - GECKO(8531) | --DOCSHELL 0x7fb635712800 == 4 [pid = 8701] [id = {d296c83b-8be6-482d-815c-b2334b714e96}] [url = http://mochi.test:8888/tests/toolkit/components/url-classifier/tests/mochitest/gethashFrame.html]
[task 2019-10-24T15:26:19.839Z] 15:26:19 INFO - GECKO(8531) | --DOCSHELL 0x7fb635507800 == 3 [pid = 8701] [id = {fdefdb66-4709-4500-b2e4-f80445876441}] [url = http://mochi.test:8888/tests/toolkit/components/url-classifier/tests/mochitest/gethashFrame.html]
[task 2019-10-24T15:26:19.840Z] 15:26:19 INFO - GECKO(8531) | --DOMWINDOW == 11 (0x7fb6364ddc00) [pid = 8701] [serial = 21] [outer = (nil)] [url = http://mochi.test:8888/tests/toolkit/components/url-classifier/tests/mochitest/gethashFrame.html]
[task 2019-10-24T15:26:19.841Z] 15:26:19 INFO - GECKO(8531) | --DOMWINDOW == 10 (0x7fb636305000) [pid = 8701] [serial = 19] [outer = (nil)] [url = http://mochi.test:8888/tests/toolkit/components/url-classifier/tests/mochitest/gethashFrame.html]
[task 2019-10-24T15:26:19.869Z] 15:26:19 INFO - [8510, Main Thread] WARNING: 'result.isErr()', file /builds/worker/workspace/build/src/startupcache/StartupCache.cpp, line 173
[task 2019-10-24T15:26:19.904Z] 15:26:19 INFO - [8510, Main Thread] WARNING: 'result.isErr()', file /builds/worker/workspace/build/src/startupcache/StartupCache.cpp, line 173
[task 2019-10-24T15:26:19.926Z] 15:26:19 INFO - GECKO(8531) | --DOMWINDOW == 9 (0x7fb634fc93e0) [pid = 8701] [serial = 22] [outer = (nil)] [url = http://mochi.test:8888/tests/toolkit/components/url-classifier/tests/mochitest/gethashFrame.html]
[task 2019-10-24T15:26:19.946Z] 15:26:19 INFO - GECKO(8531) | [Parent 8531, Main Thread] WARNING: Cannot set transparency mode on non-popup windows.: file /builds/worker/workspace/build/src/widget/gtk/nsWindow.cpp, line 4552
[task 2019-10-24T15:26:19.994Z] 15:26:19 INFO - GECKO(8531) | [Parent 8531, Main Thread] WARNING: Cannot set transparency mode on non-popup windows.: file /builds/worker/workspace/build/src/widget/gtk/nsWindow.cpp, line 4552
[task 2019-10-24T15:26:20.099Z] 15:26:20 INFO - GECKO(8531) | [Child 8701, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004002: file /builds/worker/workspace/build/src/uriloader/base/nsDocLoader.cpp, line 963
[task 2019-10-24T15:26:20.101Z] 15:26:20 INFO - GECKO(8531) | [Child 8701, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004002: file /builds/worker/workspace/build/src/uriloader/base/nsDocLoader.cpp, line 978
[task 2019-10-24T15:26:20.128Z] 15:26:20 INFO - GECKO(8531) | [Parent 8531, Main Thread] WARNING: Cannot set transparency mode on non-popup windows.: file /builds/worker/workspace/build/src/widget/gtk/nsWindow.cpp, line 4552
[task 2019-10-24T15:26:20.144Z] 15:26:20 INFO - [8510, Main Thread] WARNING: 'result.isErr()', file /builds/worker/workspace/build/src/startupcache/StartupCache.cpp, line 173
[task 2019-10-24T15:26:20.194Z] 15:26:20 INFO - GECKO(8531) | Assertion failure: mOwningEventTarget->IsOnCurrentThread(), at /builds/worker/workspace/build/src/xpcom/threads/LazyIdleThread.cpp:461
[task 2019-10-24T15:26:40.442Z] 15:26:40 INFO - GECKO(8531) | #01: mozilla::safebrowsing::Classifier::~Classifier() [toolkit/components/url-classifier/Classifier.cpp:154]
[task 2019-10-24T15:26:40.442Z] 15:26:40 INFO -
[task 2019-10-24T15:26:40.442Z] 15:26:40 INFO - GECKO(8531) | #02: mozilla::safebrowsing::Classifier::Release() [toolkit/components/url-classifier/Classifier.h:28]
[task 2019-10-24T15:26:40.443Z] 15:26:40 INFO -
[task 2019-10-24T15:26:40.443Z] 15:26:40 INFO - GECKO(8531) | #03: mozilla::detail::RunnableFunction<mozilla::safebrowsing::Classifier::AsyncApplyUpdates(nsTArray<RefPtr<mozilla::safebrowsing::TableUpdate> > const&, std::function<void (nsresult)> const&)::$_2>::~RunnableFunction() [xpcom/threads/nsThreadUtils.h:555]
[task 2019-10-24T15:26:40.443Z] 15:26:40 INFO -
[task 2019-10-24T15:26:40.443Z] 15:26:40 INFO - GECKO(8531) | #04: mozilla::Runnable::Release() [xpcom/threads/nsThreadUtils.cpp:54]
[task 2019-10-24T15:26:40.443Z] 15:26:40 INFO -
[task 2019-10-24T15:26:40.443Z] 15:26:40 INFO - GECKO(8531) | #05: nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:1274]
[task 2019-10-24T15:26:40.443Z] 15:26:40 INFO -
[task 2019-10-24T15:26:40.443Z] 15:26:40 INFO - GECKO(8531) | #06: NS_ProcessNextEvent(nsIThread*, bool) [xpcom/threads/nsThreadUtils.cpp:486]
[task 2019-10-24T15:26:40.443Z] 15:26:40 INFO -
[task 2019-10-24T15:26:40.444Z] 15:26:40 INFO - GECKO(8531) | #07: mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) [ipc/glue/MessagePump.cpp:304]
[task 2019-10-24T15:26:40.444Z] 15:26:40 INFO -
[task 2019-10-24T15:26:40.444Z] 15:26:40 INFO - GECKO(8531) | #08: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:291]
[task 2019-10-24T15:26:40.444Z] 15:26:40 INFO -
[task 2019-10-24T15:26:40.445Z] 15:26:40 INFO - GECKO(8531) | #09: nsThread::ThreadFunc(void*) [xpcom/threads/nsThread.cpp:460]
[task 2019-10-24T15:26:40.445Z] 15:26:40 INFO -
[task 2019-10-24T15:26:40.590Z] 15:26:40 INFO - GECKO(8531) | #10: _pt_root [nsprpub/pr/src/pthreads/ptthread.c:204]
[task 2019-10-24T15:26:40.590Z] 15:26:40 INFO -
[task 2019-10-24T15:26:40.591Z] 15:26:40 INFO - GECKO(8531) | #11: libpthread.so.0 + 0x76ba
[task 2019-10-24T15:26:40.591Z] 15:26:40 INFO -
[task 2019-10-24T15:26:40.591Z] 15:26:40 INFO - GECKO(8531) | #12: libc.so.6 + 0x10741d
[task 2019-10-24T15:26:40.591Z] 15:26:40 INFO -
[task 2019-10-24T15:26:40.592Z] 15:26:40 INFO - GECKO(8531) | #13: ??? (???:???)
[task 2019-10-24T15:26:40.592Z] 15:26:40 INFO - GECKO(8531) | ExceptionHandler::GenerateDump cloned child 8872
[task 2019-10-24T15:26:40.592Z] 15:26:40 INFO - GECKO(8531) | ExceptionHandler::SendContinueSignalToChild sent continue signal to child
[task 2019-10-24T15:26:40.593Z] 15:26:40 INFO - GECKO(8531) | ExceptionHandler::WaitForContinueSignal waiting for continue signal...
[task 2019-10-24T15:26:40.593Z] 15:26:40 INFO - GECKO(8531) | MEMORY STAT | vsize 2516MB | residentFast 140MB | heapAllocated 14MB
[task 2019-10-24T15:26:40.594Z] 15:26:40 INFO - Buffered messages logged at 15:26:15
[task 2019-10-24T15:26:40.594Z] 15:26:40 INFO - TEST-PASS | toolkit/components/url-classifier/tests/mochitest/test_bug1254766.html | Should not load bad javascript
[task 2019-10-24T15:26:40.594Z] 15:26:40 INFO - TEST-PASS | toolkit/components/url-classifier/tests/mochitest/test_bug1254766.html | Should load clean css
[task 2019-10-24T15:26:40.595Z] 15:26:40 INFO - TEST-PASS | toolkit/components/url-classifier/tests/mochitest/test_bug1254766.html | Should not load bad css
[task 2019-10-24T15:26:40.595Z] 15:26:40 INFO - TEST-PASS | toolkit/components/url-classifier/tests/mochitest/test_bug1254766.html | Should not import bad css
[task 2019-10-24T15:26:40.595Z] 15:26:40 INFO - TEST-PASS | toolkit/components/url-classifier/tests/mochitest/test_bug1254766.html | Gethash request is triggered.
[task 2019-10-24T15:26:40.596Z] 15:26:40 INFO - Buffered messages logged at 15:26:16
Comment 1•6 years ago
|
||
Dimi, could this be caused by your changes in Bug 1553855?
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 3•6 years ago
|
||
(In reply to Alexandru Michis [:malexandru] from comment #1)
Dimi, could this be caused by your changes in Bug 1553855?
Yes, I think so
Updated•6 years ago
|
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 7•6 years ago
|
||
FWIW, I'm able to pretty consistently reproduce this locally on linux running with the changeset fd4898326aa235d87567e94a683338bfd5b80675 and running the test in --verify
mode.
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 10•6 years ago
|
||
Brendan: Are you actively working or this or should I consider disabling it?
Comment hidden (Intermittent Failures Robot) |
Comment 13•6 years ago
|
||
Comment 14•6 years ago
|
||
Dimi, could you please take a look over the failures here? It's now on the disable recommended list with 179 total failures in the last 2 weeks. https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2019-10-06&endday=2019-11-05&tree=trunk&bug=1591112
This is asserting frequently on linux debug on gtest UrlClassifierTableUpdateV4.EmptyUpdate and on mochitest test_bug1254766.html and test_cachemiss.html. For the last two I made a disable patch. Edwin, could you please have a go at the gtest disable patch? Thank you.
Assignee | ||
Comment 15•6 years ago
|
||
yes, I should look into this earlier, sorry, will check this now.
Assignee | ||
Updated•6 years ago
|
Comment 16•6 years ago
|
||
Thanks Dimi.
Happy to scrap the disable patch.
Updated•6 years ago
|
Updated•6 years ago
|
Comment hidden (Intermittent Failures Robot) |
Updated•6 years ago
|
Updated•6 years ago
|
Comment hidden (Intermittent Failures Robot) |
Comment 20•6 years ago
|
||
Updated•6 years ago
|
Comment 21•6 years ago
|
||
Dimi, I didn't notice comment 16. please enable the test in your fix, thanks.
Comment 22•6 years ago
|
||
bugherder |
Comment 23•6 years ago
|
||
I ran into this on my try push, and had a quick look thinking it was due to my patches.
Is it possible that we get context switched right after dispatching the inner lambda in AsyncApplyUpdates?
I think that could mean the copy of 'self' owned by the outer lambda (on the update thread) might actually get Release() called after the copy owned by the inner (on the caller thread).
If that's the case, and this ended up being the last ref, then we'd try to shut down the Classifier and the update thread while still on the update thread.
You could change the lambda captures to capture as [self = std::move(self)] to pass the single reference around, rather than copying a new ref into each lambda.
Assignee | ||
Comment 24•6 years ago
•
|
||
(In reply to Matt Woodrow (:mattwoodrow) from comment #23)
I ran into this on my try push, and had a quick look thinking it was due to my patches.
Is it possible that we get context switched right after dispatching the inner lambda in AsyncApplyUpdates?
I think that could mean the copy of 'self' owned by the outer lambda (on the update thread) might actually get Release() called after the copy owned by the inner (on the caller thread).
If that's the case, and this ended up being the last ref, then we'd try to shut down the Classifier and the update thread while still on the update thread.
You could change the lambda captures to capture as [self = std::move(self)] to pass the single reference around, rather than copying a new ref into each lambda.
Hi Matt, thanks for the information.
Yes, that is exactly one of the issues I found. The path I wrote in my local is to set 'self' to null right after creating second lambda. Your approach is definitely better in terms of coding style.
Another issue I found is that in the inner AsyncApplyUpdate (executed in the caller thread), its 'self' is not always released by the caller thread, sometimes it is released by the update thread, so I have to explicitly release 'self' in the inner lambda to ensure 'self' is released in the caller thread.
Comment 26•6 years ago
•
|
||
(In reply to Dimi Lee [:dimi][:dlee] from comment #24)
Hi Matt, thanks for the information.
Yes, that is exactly one of the issues I found. The path I wrote in my local is to set 'self' to null right after creating second lambda. Your approach is definitely better in terms of coding style.
Another issue I found is that in the inner AsyncApplyUpdate (executed in the caller thread), its 'self' is not always released by the caller thread, sometimes it is released by the update thread, so I have to explicitly release 'self' in the inner lambda to ensure 'self' is released in the caller thread.
That might be a similar bug, where fgRunnable has a ref owned by the outer lambda, released on the update thread. I think it's possible the inner lambda to have executed at that point, so it'd be deleting the runnable (and the actual lambda) from the wrong thread.
Comment 27•6 years ago
|
||
You could add another std::move, from the inner lambda's arguments, to a local within the inner. That should guarantee the transfer of ownership happens at the right time.
Comment 28•6 years ago
|
||
I think we should downgrade NS_INLINE_DECL_THREADSAFE_REFCOUNTING(Classifier) to normal non-threadsafe refcounting (since we have to release the last ref on the same thread anyway, and we're not actually threadafe).
That'd make all AddRef/Release calls from the update thread assert (which would catch this sort of bug earlier), but passing the ref around using std::move() should still be fine.
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 30•6 years ago
|
||
In AsyncApplyUpdate, the Safe Browsing update thread holds a reference
to 'Classifier'. In some scenarios(See Bug 1591112), the update
thread may be the last one holding the reference; hence the update
thread releases the 'Classifier' when the task is ended.
Classifier has to be created and destroyed in the same thread
because of the constrain of LazyIdelThread, in the current
implementation, it should be released by the worker thread.
This patch transfers the ownership of the reference of 'Classifier 'from the
update thread to the worker thread before its task is finished to make
sure we release 'Classifier' in the right thread.
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 32•6 years ago
|
||
(In reply to Matt Woodrow (:mattwoodrow) from comment #28)
I think we should downgrade NS_INLINE_DECL_THREADSAFE_REFCOUNTING(Classifier) to normal non-threadsafe refcounting (since we have to release the last ref on the same thread anyway, and we're not actually threadafe).
That'd make all AddRef/Release calls from the update thread assert (which would catch this sort of bug earlier), but passing the ref around using std::move() should still be fine.
Hi Matt,
Thank you for all the help here :)
I'll file a follow-up bug to remove NS_INLINE_DECL_THREADSAFE_REFCOUNTING from Classifier.
Comment 33•6 years ago
|
||
Comment 34•6 years ago
|
||
bugherder |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 36•6 years ago
|
||
Looks good
Updated•6 years ago
|
Updated•4 years ago
|
Description
•