Open Bug 1646258 Opened 5 months ago Updated 4 months ago

Crash in [@ IPCError-browser | ShutDownKill | mozilla::dom::PromiseDebugging::AddConsumedRejection]

Categories

(Core :: DOM: Core & HTML, defect)

Unspecified
Windows 10
defect


People

(Reporter: RyanVM, Unassigned)

Details

(Keywords: crash)

Crash Data

This bug is for crash report bp-25eeacae-d72f-4bb9-80b5-de51c0200607.

Top 8 frames of crashing thread:

0 xul.dll static mozilla::dom::PromiseDebugging::AddConsumedRejection dom/promise/PromiseDebugging.cpp
1 xul.dll static mozilla::CycleCollectedJSContext::PromiseRejectionTrackerCallback xpcom/base/CycleCollectedJSContext.cpp:332
2 xul.dll JSRuntime::removeUnhandledRejectedPromise js/src/vm/Runtime.cpp:646
3 xul.dll PerformPromiseThenWithReaction js/src/builtin/Promise.cpp:5343
4 xul.dll PerformPromiseThen js/src/builtin/Promise.cpp:5271
5 xul.dll OriginalPromiseThenBuiltin js/src/builtin/Promise.cpp:4396
6 xul.dll Promise_then_noRetVal js/src/builtin/Promise.cpp:5232
7  @0x341f2364040 
Crash Signature: [@ IPCError-browser | ShutDownKill | mozilla::dom::PromiseDebugging::AddConsumedRejection] → [@ IPCError-browser | ShutDownKill | mozilla::dom::PromiseDebugging::AddConsumedRejection] [@ IPCError-browser | ShutDownKill | mozilla::CycleCollectedJSContext::PromiseRejectionTrackerCallback]
Component: JavaScript Engine → DOM: Core & HTML

The reports started appearing with Nightly 79. Changes were made to the relevant JS Promise code in Bug 1642683. Do you see what may be causing the crash, :anba or :arai? Thank you.

Flags: needinfo?(arai.unmht)
Flags: needinfo?(andrebargull)

Bug 1642683 only changed the order in which the "resolve" property is accessed for Promise.{all, allSettled, any, race}(), which shouldn't lead to any observable differences except when user code calls these methods with a custom Promise subclass. Furthermore, the stack trace shows that Promise.prototype.then is called from JS code (via Promise_then_noRetVal) and definitely not from C++ code. This also rules out bug 1642683, because Promise.{all, allSettled, any, race}() only call Promise.prototype.then from C++ code (and only on the slow path, where we don't call the inline implementation of Promise.prototype.then).

Flags: needinfo?(andrebargull)

From the crash report, there are too many uncaught rejections to handle.

In the macOS reports, uncaughtRejections.length() [1] (%rcx in the raw data) ranges from 40948 to 792423.
I think handling them takes too long while shutting down, and the parent process kills the content process.

I wonder if the list isn't cleared [2] frequently enough, or if that many rejections happen in a short time frame.

[1] https://searchfox.org/mozilla-central/rev/82c04b9cad5b98bdf682bd477f2b1e3071b004ad/dom/promise/PromiseDebugging.cpp#229-236
[2] https://searchfox.org/mozilla-central/rev/82c04b9cad5b98bdf682bd477f2b1e3071b004ad/dom/promise/PromiseDebugging.cpp#280

If that many rejections really happen and also get consumed in a short time frame (maybe especially while shutting down?),
we could add a special path for that case.

Currently, we iterate over the registered uncaught rejections for each consumption:

https://searchfox.org/mozilla-central/rev/82c04b9cad5b98bdf682bd477f2b1e3071b004ad/dom/promise/PromiseDebugging.cpp#229-236

void PromiseDebugging::AddConsumedRejection(JS::HandleObject aPromise) {
  // If the promise is in our list of uncaught rejections, we haven't yet
  // reported it as unhandled. In that case, just remove it from the list
  // and don't add it to the list of consumed rejections.
  auto& uncaughtRejections =
      CycleCollectedJSContext::Get()->mUncaughtRejections;
  for (size_t i = 0; i < uncaughtRejections.length(); i++) {
    if (uncaughtRejections[i] == aPromise) {
      // To avoid large amounts of memmoves, we don't shrink the vector here.
      // Instead, we filter out nullptrs when iterating over the vector later.
      uncaughtRejections[i].set(nullptr);
      return;
    }
  }
  // ... (rest of the function snipped)
}

So, if N rejections happen and get consumed, we end up doing O(N^2) work overall.

a) if too many rejections happen without getting consumed, we could add a hash map or something similar for that case, to speed up consumption
b) if rejections are consumed immediately and there can be many nullptrs in the uncaught-rejections vector, we could shrink the vector periodically

Or maybe we could immediately flush the rejections if the uncaughtRejections vector gets too long?

So far, I can reproduce the ShutDownKill if a webpage keeps rejecting too many promises and handling the rejections afterward while the browser is shutting down.
Thus, one possibility is that there's a crafted website, or a similar DoS attack.

If this is caused by a malicious website, or just a buggy one, a possible solution for this specific case is
to stop tracking rejections/consumptions while shutting down.
But even then, a website can still prevent shutdown, and the ShutDownKill will just happen somewhere else.

Flags: needinfo?(arai.unmht)

What's the policy, or best practice, for the case where a website runs a kind of infinite loop and that causes a ShutDownKill?

Flags: needinfo?(htsai)

(In reply to Tooru Fujisawa [:arai] from comment #7)

> What's the policy, or best practice, for the case where a website runs a kind of infinite loop and that causes a ShutDownKill?

I am not aware of anything particular. Andrew, do you have ideas?

Flags: needinfo?(htsai) → needinfo?(continuation)

I'm not aware of anything we do to deal with that. It does feel like we should have some intermediate state between letting a website do whatever it wants and hard killing the process, where we stop the website so we can shut down, but I don't know how that would work exactly.

Flags: needinfo?(continuation)
Severity: -- → S3