Crash in [@ EnqueuePromiseReactionJob] spiking from zh-cn locales since 2019-05-15
Categories
(Core :: JavaScript Engine, defect, P1)
Tracking
()
People
(Reporter: philipp, Assigned: arai)
References
Details
(Keywords: crash)
Crash Data
[Tracking Requested - why for this release]:
This bug is for crash report bp-14f8a344-f0c7-42b9-9a5a-f51fd0190516.
Top 10 frames of crashing thread:
0 xul.dll static bool EnqueuePromiseReactionJob js/src/builtin/Promise.cpp:1069
1 xul.dll static bool ResolvePromise js/src/builtin/Promise.cpp:1125
2 xul.dll static bool FulfillMaybeWrappedPromise js/src/builtin/Promise.cpp:1162
3 xul.dll static bool ResolvePromiseInternal js/src/builtin/Promise.cpp:862
4 xul.dll static bool ResolveOrRejectPromise js/src/jsapi.cpp:4038
5 xul.dll mozilla::dom::Promise::MaybeResolve dom/promise/Promise.cpp:298
6 xul.dll static void mozilla::dom::Promise::MaybeSomething<RefPtr<mozilla::dom::cache::Cache>&> dom/promise/Promise.h:251
7 xul.dll mozilla::dom::cache::CacheOpChild::Recv__delete__ dom/cache/CacheOpChild.cpp:158
8 xul.dll mozilla::dom::cache::PCacheOpChild::OnMessageReceived ipc/ipdl/PCacheOpChild.cpp:106
9 xul.dll void mozilla::ipc::MessageChannel::DispatchMessage ipc/glue/MessageChannel.cpp:2087
this content crash signature is spiking up since 24 hours ago from users of zh-cn builds, so far there are about 700 submitted reports:
urls in reports commonly show sites on the https://blog.csdn.net domain, so probably some change on the site there or from embedded ad scripts is starting to trigger this now
Updated•4 years ago
|
Comment 1•4 years ago
|
||
Jason, any chance you can look at this?
Comment 2•4 years ago
|
||
OK, this is probably bug 1447327. Lots of good info in that bug, including STR. We can fix this.
Updated•4 years ago
|
Comment 3•4 years ago
|
||
Not tracking for 67, we only had a handful of crashes with this signature during the beta cycle and only 4 since we shipped 67 to release.
Updated•4 years ago
|
Hi there,
Some users reported that they encountered a crash when using a script to access bilibili.com. I traced it back to this bug.
A reproducible method is: use tampermonkey to install the script greasyfork.org/scripts/10357, and then visit bilibili.com, there is a high probability that the page will crash, while using other browsers to perform the same operation will not.
This is the crash report related to it:
bp-a5196111-f09d-4548-919d-b54670210320
bp-ee903d31-401d-4e5b-bd5c-a07c80210328
bp-7d449510-2242-4574-8109-b733e0210328
bp-6d38b51e-5478-4646-8c72-f5bb40210328
BTW I don't have access to see comments in crash report. I was wondering if there is any mention of url in the comments.
Comment 5•2 years ago
|
||
[Tracking Requested - why for this release]: There is a sharp increase of these crashes on March 11, almost all crashes are from zh-CN builds.
Comment 6•2 years ago
•
|
||
Probably the same as https://github.com/mozilla-mobile/fenix/issues/18155
EDIT: Or not, this one seems to be showing only Desktop crashes.
Comment 7•2 years ago
|
||
(In reply to yxu from comment #4)
BTW I don't have access to see comments in crash report. I was wondering if there is any mention of url in the comments.
https://www.bilibili.com/ is indeed showing up a lot in the URL field of submitted crash reports. Steven, can you please help us find an owner for this big crash spike on release?
Comment 8•2 years ago
|
||
Tooru, would you be the right person to investigate this issue?
So far, most of the crashes are null pointer reads at this location.
This is a bit strange as this line is supposed to be guarded by the previous condition, unless the location are badly reported, in which case this could be within the nonCCWGlobal
function, which would suggest that a realm got GC-ed?
Updated•2 years ago
|
Assignee | ||
Comment 9•2 years ago
|
||
(In reply to yxu from comment #4)
Hi there,
Some users reported that they encountered a crash when using a script to access bilibili.com. I traced it back to this bug.A reproducible method is: use tampermonkey to install the script greasyfork.org/scripts/10357, and then visit bilibili.com, there is a high probability that the page will crash, while using other browsers to perform the same operation will not.
This is the crash report related to it:
bp-a5196111-f09d-4548-919d-b54670210320
bp-ee903d31-401d-4e5b-bd5c-a07c80210328
bp-7d449510-2242-4574-8109-b733e0210328
bp-6d38b51e-5478-4646-8c72-f5bb40210328BTW I don't have access to see comments in crash report. I was wondering if there is any mention of url in the comments.
Thank you for the info.
Unfortunately I cannot reproduce the crash with the following steps:
- Install Firefox 87.0 zh-CN (https://ftp.mozilla.org/pub/firefox/releases/87.0/win64/zh-CN/Firefox%20Setup%2087.0.exe), on Windows 10 Pro 20H2
- Run Firefox 87.0 zh-CN, with a new clean profile
- Install Tampermonkey https://addons.mozilla.org/en-US/firefox/addon/tampermonkey/
- Install the the script greasyfork.org/scripts/10357
- Open https://www.bilibili.com/ in 10 tabs
- Click the up/down arrows added by scripts 4 times, for each tab
Does the crash happen for you with the above steps?
If it happens, how frequent is it?
If it doesn't, do you have any idea what the difference between your steps and the above?
Comment 10•2 years ago
|
||
Interesting, I found that this crash only occurs after logging in bilibili account. Enabling the script on the bilibili.com homepage before logging in don't cause a crash, but it happens immediately after that. And the frequency is 100%. Tooru, do you have a bilibili account login for testing?
The crashes reported by me and other users is also under Windows, version 87.0. Judging from the increased time of feedback and crashes, it is reasonable to suspect that it happened from 86.0.1.
In addition, I found https://greasyfork.org/zh-CN/scripts/372673 will also cause the same type of crash.
Assignee | ||
Comment 11•2 years ago
|
||
Thank you again!
Indeed, reproduced the crash after logging into a test account I just created.
will investigate the crash.
Assignee | ||
Comment 12•2 years ago
|
||
The crash happens because reaction->getAndClearIncumbentGlobalObject()
returns a CCW with mHasSecurityPolicy == true
and mFlags == 0x01
, that I suppose CrossCompartmentSecurityWrapper
.
(not XrayWrapper
because it doesn't have WrapperFactory::IS_XRAY_WRAPPER_FLAG
),
Rooted<GlobalObject*> global(cx);
if (JSObject* objectFromIncumbentGlobal =
reaction->getAndClearIncumbentGlobalObject()) {
objectFromIncumbentGlobal = CheckedUnwrapStatic(objectFromIncumbentGlobal);
JS_FRIEND_API JSObject* js::CheckedUnwrapStatic(JSObject* obj) {
while (true) {
JSObject* wrapper = obj;
obj = UnwrapOneCheckedStatic(obj);
if (!obj || obj == wrapper) {
return obj;
}
}
}
JS_FRIEND_API JSObject* js::UnwrapOneCheckedStatic(JSObject* obj) {
...
const Wrapper* handler = Wrapper::wrapperHandler(obj);
return handler->hasSecurityPolicy() ? nullptr : Wrapper::wrappedObject(obj);
}
That results in returning nullptr
from js::UnwrapOneCheckedStatic
, and js::CheckedUnwrapStatic
.
Assignee | ||
Comment 13•2 years ago
|
||
there seems to be a kind of race between "associating multiple globals with document.domain
" and "a userscript's execution",
and that behaves badly around setting/getting incumbent global, in promise handling.
the "multiple globals" are top-level page and inline frame, hosted in different subdomain.
the inline frame's script tries to call a function in the top-level page's script, and receives a Promise
object, and then calls then
on it.
js::GetObjectFromIncumbentGlobal
somehow creates a security wrapper in some case, and that results in failure case in CheckedUnwrapStatic
above.
to my understanding, in such case the access to the top-level page's data should fail (and so the promise handling in then
call won't happen), but it somehow succeeds.
in bilibili page, the security wrapper is created almost 100% of the time, but with simplified testcase, it's created only 10% or less of the time.
and in failure case, looks like the userscript's execution takes longer, and also it fails to access top-level page's property.
(the userscript accesses window.top.addEventListener
, and fails)
Assignee | ||
Comment 14•2 years ago
|
||
the issue is either:
- (a) the incumbent global object shouldn't be a security wrapper, but unexpectedly it becomes so
- (b) the incumbent global object can be a security wrapper, and the code around that should handle the case
if (a), we should look into the code that causes the situation and somehow fix that (not yet sure the details).
if (b), we should throw either when getting incumbent global object before storing it into reaction (GetObjectFromIncumbentGlobal
), or when getting incumbent global object from reaction (EnqueuePromiseReactionJob
, or more later)
Updated•2 years ago
|
Assignee | ||
Updated•2 years ago
|
Comment 15•2 years ago
|
||
So far, the fix from bug 1702863 looks like it's working well on Nightly (no crashes reported since it landed). We'll keep an eye on this after 88.0rc1 ships and close it out if that trend continues.
Comment 16•2 years ago
|
||
No signs of this crash on 88.0rc1 either. Closing this out.
Description
•