Closed Bug 1606157 Opened 3 months ago Closed 3 months ago

Assertion failure: Request::mDisconnected, at src/obj-firefox/dist/include/mozilla/MozPromise.h:439

Categories

(Core :: DOM: Service Workers, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla74
Tracking Status
firefox-esr68 --- unaffected
firefox72 --- wontfix
firefox73 --- wontfix
firefox74 --- fixed

People

(Reporter: tsmith, Assigned: perry)

References

(Blocks 3 open bugs)

Details

(Keywords: assertion, regression, testcase)

Crash Data

Attachments

(2 files, 2 obsolete files)

Log generated with m-c 20191227-2ecfe5397f49

Assertion failure: Request::mDisconnected, at src/obj-firefox/dist/include/mozilla/MozPromise.h:439

0|0|libxul.so|mozilla::MozPromise<bool, nsresult, true>::ThenValueBase::AssertIsDead()|hg:hg.mozilla.org/mozilla-central:xpcom/threads/MozPromise.h:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|439|0x29
0|1|libxul.so|mozilla::MozPromise<bool, nsresult, true>::AssertIsDead()|hg:hg.mozilla.org/mozilla-central:xpcom/threads/MozPromise.h:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|976|0x10
0|2|libxul.so|mozilla::MozPromise<bool, nsresult, true>::~MozPromise()|hg:hg.mozilla.org/mozilla-central:xpcom/threads/MozPromise.h:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|1017|0x9
0|3|libxul.so|mozilla::MozPromise<bool, nsresult, true>::Private::~Private()|hg:hg.mozilla.org/mozilla-central:xpcom/threads/MozPromise.h:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|250|0xe
0|4|libxul.so|mozilla::MozPromiseRefcountable::Release()|hg:hg.mozilla.org/mozilla-central:xpcom/threads/MozPromise.h:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|145|0x3b
0|5|libxul.so|mozilla::dom::UnregisterCallback::Release()|hg:hg.mozilla.org/mozilla-central:dom/serviceworkers/ServiceWorkerUnregisterCallback.cpp:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|12|0x50
0|6|libxul.so|mozilla::dom::(anonymous namespace)::UnregisterJobCallback::Release()|hg:hg.mozilla.org/mozilla-central:dom/serviceworkers/ServiceWorkerManager.cpp:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|1275|0x4e
0|7|libxul.so|nsTArray_Impl<RefPtr<mozilla::dom::ServiceWorkerJob::Callback>, nsTArrayInfallibleAllocator>::ClearAndRetainStorage()|hg:hg.mozilla.org/mozilla-central:xpcom/ds/nsTArray.h:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|1333|0x21
0|8|libxul.so|nsTArray_Impl<RefPtr<mozilla::dom::ServiceWorkerJob::Callback>, nsTArrayInfallibleAllocator>::~nsTArray_Impl()|hg:hg.mozilla.org/mozilla-central:xpcom/ds/nsTArray.h:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|914|0x8
0|9|libxul.so|mozilla::dom::ServiceWorkerJob::~ServiceWorkerJob()|hg:hg.mozilla.org/mozilla-central:dom/serviceworkers/ServiceWorkerJob.cpp:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|119|0x9
0|10|libxul.so|mozilla::dom::ServiceWorkerUnregisterJob::~ServiceWorkerUnregisterJob()|hg:hg.mozilla.org/mozilla-central:dom/serviceworkers/ServiceWorkerUnregisterJob.cpp:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|57|0xe
0|11|libxul.so|mozilla::dom::ServiceWorkerJob::Release()|hg:hg.mozilla.org/mozilla-central:dom/serviceworkers/ServiceWorkerJob.h:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|115|0x44
0|12|libxul.so|nsTArray_Impl<RefPtr<mozilla::dom::ServiceWorkerJob>, nsTArrayInfallibleAllocator>::RemoveElementsAtUnsafe(unsigned long, unsigned long)|hg:hg.mozilla.org/mozilla-central:xpcom/ds/nsTArray.h:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|2309|0x2e
0|13|libxul.so|nsTArray_Impl<RefPtr<mozilla::dom::ServiceWorkerJob>, nsTArrayInfallibleAllocator>::RemoveElementsAt(unsigned long, unsigned long)|hg:hg.mozilla.org/mozilla-central:xpcom/ds/nsTArray.h:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|2303|0xe
0|14|libxul.so|mozilla::dom::ServiceWorkerManager::MaybeStartShutdown()|hg:hg.mozilla.org/mozilla-central:dom/serviceworkers/ServiceWorkerManager.cpp:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|496|0xd
0|15|libxul.so|mozilla::dom::ServiceWorkerManager::Observe(nsISupports*, char const*, char16_t const*)|hg:hg.mozilla.org/mozilla-central:dom/serviceworkers/ServiceWorkerManager.cpp:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|2846|0x8
0|16|libxul.so|non-virtual thunk to mozilla::dom::ServiceWorkerManager::Observe(nsISupports*, char const*, char16_t const*)|hg:hg.mozilla.org/mozilla-central:dom/serviceworkers/ServiceWorkerManager.cpp:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|0|0xd
0|17|libxul.so|nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*)|hg:hg.mozilla.org/mozilla-central:xpcom/ds/nsObserverList.cpp:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|66|0x1b
0|18|libxul.so|nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*)|hg:hg.mozilla.org/mozilla-central:xpcom/ds/nsObserverService.cpp:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|292|0x19
0|19|libxul.so|nsXREDirProvider::DoShutdown()|hg:hg.mozilla.org/mozilla-central:toolkit/xre/nsXREDirProvider.cpp:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|1029|0x1d
0|20|libxul.so|ScopedXPCOMStartup::~ScopedXPCOMStartup()|hg:hg.mozilla.org/mozilla-central:toolkit/xre/nsAppRunner.cpp:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|1220|0xc
0|21|libxul.so|mozilla::UniquePtr<ScopedXPCOMStartup, mozilla::DefaultDelete<ScopedXPCOMStartup> >::reset(ScopedXPCOMStartup*)|hg:hg.mozilla.org/mozilla-central:mfbt/UniquePtr.h:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|323|0x8
0|22|libxul.so|XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&)|hg:hg.mozilla.org/mozilla-central:toolkit/xre/nsAppRunner.cpp:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|4764|0xa
0|23|libxul.so|XRE_main(int, char**, mozilla::BootstrapConfig const&)|hg:hg.mozilla.org/mozilla-central:toolkit/xre/nsAppRunner.cpp:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|4818|0x10
0|24|firefox-bin|main|hg:hg.mozilla.org/mozilla-central:browser/app/nsBrowserApp.cpp:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|217|0x1d
0|25|libc-2.27.so||||0x21b97
0|26|firefox-bin|<name omitted>|hg:hg.mozilla.org/mozilla-central:mfbt/UniquePtr.h:2ecfe5397f49abba9992bae43b1fc4fc847fa6bb|274|0x17

We are working on reducing a test case and it will be attached once it is available.

A Pernosco session is available here: https://pernos.co/debug/gUmIKdl0dhwTNGoxada5VQ/index.html

Flags: needinfo?(perry)
Priority: -- → P2
Attached file testcase.zip

To reproduce the issue with the attached testcase:

  • Unpack the zip
  • run a local webserver from where the test was unpacked. python -m SimpleHTTPServer
  • launch browser with prefs.js contained in the zip file
  • visit http://localhost:8000/testcase.html
Assignee: nobody → perry
Flags: needinfo?(perry)

When reproducing do I just let it run or do I need to close the browser after a while? I see shutdown stuff in the backtrace.

Flags: needinfo?(twsmith)

(In reply to Perry Jiang [:perry] from comment #3)

When reproducing do I just let it run or do I need to close the browser after a while? I see shutdown stuff in the backtrace.

The test case should close the browser (make sure you use the provided prefs.js file) or crash. It may take a few attempts but it works. Did the Pernosco session capture what was needed?

Flags: needinfo?(twsmith)

Yeah, I think I have a good idea of what's going on based off pernosco but wanted to verify locally. I'll try running the test case a few more times.

I'd be happy to test a patch if you like.

Awesome, could you check with the patch I just put up?

Flags: needinfo?(twsmith)

I can still reproduce the issue with the patch applied to m-c 5fd4cfacc90d. More bad news I was also unable to reproduce the issue with the attached testcase using that revision without the patch applied, I had to use a much larger unreduced testcase.

Flags: needinfo?(twsmith)

Unassigning myself until reproducible (pernosco trace has expired too).

Assignee: perry → nobody

Hey Perry, the Pernosco session has been restored.

Flags: needinfo?(perry)

Here is another Pernosco session. This is a trace of the failure with the patch applied to m-c 20200103-5fd4cfacc90d: https://pernos.co/debug/sYJeheHndYv8AHZFJTNIdg/index.html

Assignee: nobody → perry
Flags: needinfo?(perry)
Attachment #9118625 - Attachment is obsolete: true

Could you try to test with the new patch I uploaded? (Still can't reproduce)

Flags: needinfo?(twsmith)

Triggered Assertion failure: mPromise, at dom/serviceworkers/ServiceWorkerManager.cpp:563 with latest patch applied to m-c b90fb1397c13.

Here is the Pernosco session: https://pernos.co/debug/S0XerBgJjF1EwRllgQbLJg/index.html

Flags: needinfo?(twsmith)
Attachment #9119208 - Attachment description: Bug 1606157 - ServiceWorkerJob::Callback should handle its job being canceled → Bug 1606157 - ServiceWorkerJob notify callbacks its being discarded

Maybe third time's the charm? Treeherder doesn't show anything going wrong with the latest WIP patch: https://treeherder.mozilla.org/#/jobs?repo=try&selectedJob=284073934&revision=3ca1569a9fe43229117c27fb44361843cb79a86d

Forgot to ni?

Flags: needinfo?(twsmith)

With the latest patch I see: Assertion failure: !Failed(), at dist/include/mozilla/ErrorResult.h:484

The Pernosco session is available here: https://pernos.co/debug/YpSrfKksvpohgOrEfybTfQ/index.html

Flags: needinfo?(twsmith)

Unfortunately there is some little misalignment between the stack trace and the source files in the pernosco session, in particular at ServiceWorkerJob::InvokeResultCallbacks () at /ServiceWorkerJob.cpp:157, which contains the most interesting pieces.

The assertion happens during the destruction of an ErrorResult somewhere within ServiceWorkerJob::InvokeResultCallbacks () (and not in any called-from-there function, it seems). But looking at the current code in searchfox the local rv seems to be safe, as rv.SuppressException(); is being called before it leaves the scope boundary. And I see no way how this destructor could have been called earlier on that local object, unless the compiler optimized out rv.SuppressException(), which I cannot believe but did not check. Or maybe yes, I can believe, as this code might be inlined and sets a value that is never used inside the loop except inside the destructor, which might not be relevant for the compiler for this kind of optimization?

Anyway, :tsmith, is there a chance to get an aligned pernosco trace? Thank you!

Flags: needinfo?(twsmith)
See Also: → 1588984

It looks like pernosco pulls the source from Mozilla's mercurial repository, so while the line numbers might be accurate with the patch applied, the source shown by pernosco won't be able to include the patch...leading to some misalignment.

Then :tsmith probably cannot do anything about it, I assume?

Flags: needinfo?(twsmith)

Right, unfortunately not AFAIK.

Attached file Bug 1606157 - modified testcase.html (obsolete) —

I'm able to pretty consistently reproduce using my modified testcase.html. I have to manually close the browser a few seconds after testcase.html loads though because window.close reports "Scripts may not close windows that were not opened by script."

Attachment #9119208 - Attachment description: Bug 1606157 - ServiceWorkerJob notify callbacks its being discarded → Bug 1606157 - ServiceWorkerJob notify callbacks its being discarded r?asuth

(In reply to Perry Jiang [:perry] from comment #20)

It looks like pernosco pulls the source from Mozilla's mercurial repository, so while the line numbers might be accurate with the patch applied, the source shown by pernosco won't be able to include the patch...leading to some misalignment.

pernosco-submit should be uploading the local source as long as the source-tree or its ancestor is included in the list of <source-dir> arguments. I know there is some logic in there that tries to optimize things. So it seems like the situation is one of:

  1. pernosco-submit wasn't run with the source tree listed so things fell back to the hg revision
  2. there's a bug in pernosco-submit and/or pernosco that it didn't include the local modifications.
  3. there's a different bug in pernosco that it got confused despite having the right source

We should probably figure out which it was, because it's a huge problem if there's a line alignment problem and pernosco should probably at least be displaying warnings or errors if it can't line up the source correctly.

:tsmith, needinfo-ing you for when you're back around.

Flags: needinfo?(twsmith)

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #25)

  1. pernosco-submit wasn't run with the source tree listed so things fell back to the hg revision

Right, I forgot to add the src path when submitting, it was my fault. I can resubmit but it looks like Perry has a new patch.

Perry would you like me to test the new patch? Also setting the pref dom.allow_scripts_to_close_windows=true will allow window.close() to work as expected.

Flags: needinfo?(twsmith) → needinfo?(perry)

That would be great, thanks! I'll wait to land then.

Flags: needinfo?(perry)

Alright! With the patch applied to m-c 12bf9dc0978e I no longer see any of the related assertions.

Pushed by pjiang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/32d6e8a34f39
ServiceWorkerJob notify callbacks its being discarded r=asuth
Crash Signature: [@ mozilla::MozPromise<T>::ThenValueBase::AssertIsDead]
Duplicate of this bug: 1588984
Status: NEW → RESOLVED
Closed: 3 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla74

(In reply to Jens Stutte [:jstutte] from comment #19)

Or maybe yes, I can believe, as this code might be inlined and sets a value that is never used inside the loop except inside the destructor, which might not be relevant for the compiler for this kind of optimization?

Just for the sake of my own piece of mind and completeness: nothing is optimized out as I see from the diassembly (I was able to look at only this morning, as the build took too long yesterday on my home laptop). So I want to believe that the Pernosco session was too confused to understand what was going on (not excluding my own limitations to be the cause).

Is there a user impact here which justifies uplift consideration or can this fix ride Fx74 to release?

Flags: needinfo?(perry)

This bug appears to have fixed https://bugzilla.mozilla.org/show_bug.cgi?id=1605556#c6 - just in case that does justify uplift..

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression

(In reply to Ryan VanderMeulen [:RyanVM] from comment #33)

Is there a user impact here which justifies uplift consideration or can this fix ride Fx74 to release?

I would say https://bugzilla.mozilla.org/show_bug.cgi?id=1605556#c6 is justifiable user impact.

Flags: needinfo?(perry)

Assuming that's a sane-sounding result from this patch, please nominate this for Beta approval when you get a chance. Looks like it grafts cleanly as-landed.

Flags: needinfo?(perry)

Based on the recent comments in bug 1605556, doesn't sound like we need this on Beta after all.

Flags: needinfo?(perry)
Attachment #9120200 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.