Closed Bug 1575204 Opened 3 months ago Closed 3 months ago

Crash in [@ mozilla::dom::WorkerRunnable::Run]

Categories

(Core :: DOM: Service Workers, defect, critical)

Unspecified
Windows 10
defect
Not set
critical

Tracking

RESOLVED FIXED
mozilla70
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- unaffected
firefox69 --- unaffected
firefox70 --- fixed

People

(Reporter: marcia, Assigned: bzbarsky)

Details

(Keywords: crash, regression)

Attachments

(1 file)

This bug is for crash report bp-91756ea8-5e49-4fbb-8610-13f570190820.

Seen while looking at the nightly crash reports: https://bit.ly/2Z4pGmp. Crashes started in build 20190815193505. Possible regression range based on build ID: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a6ba020c9f7cd1abecd8eb2287020468ec1da6e8&tochange=b283a7ef186c216d765631f6cb1260a3fa2ee42c

Possibly caused by bug 1573589? Setting needinfo on bzbarsky.

Top 10 frames of crashing thread:

0 xul.dll mozilla::dom::WorkerRunnable::Run dom/workers/WorkerRunnable.cpp:306
1 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1225
2 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:486
3 xul.dll mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:303
4 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:308
5 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:290
6 xul.dll nsThread::ThreadFunc xpcom/threads/nsThread.cpp:458
7 nss3.dll static void _PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:397
8 nss3.dll unsigned int pr_root nsprpub/pr/src/md/windows/w95thred.c:137
9 ucrtbase.dll thread_start<unsigned int , 1> 

Flags: needinfo?(bzbarsky)

This is still crashing a lot in the Windows builds of 08-22.

All the crashes I looked at so far seem to be near-null crashes (specifically at address 0x8) at https://hg.mozilla.org/mozilla-central/annotate/fce0b326cd318bf435d4e4c54ae331618059e073/dom/workers/WorkerRunnable.cpp#l306

The code around there looks like this:

  Maybe<mozilla::dom::AutoJSAPI> maybeJSAPI;
  Maybe<mozilla::dom::AutoEntryScript> aes;
  JSContext* cx;
  AutoJSAPI* jsapi;
  if (globalObject) {
    aes.emplace(globalObject, "Worker runnable", isMainThread);
    jsapi = aes.ptr();
    cx = aes->cx();
  } else {
    maybeJSAPI.emplace();
    maybeJSAPI->Init();  // <-- CRASH IS CLAIMED TO BE HERE
    jsapi = maybeJSAPI.ptr();
    cx = jsapi->cx();
  }

It's possible that globalObject is null in more cases after bug 1573589. But that Init() call still shouldn't crash. Looking into what might be going on here.

I wonder whether we used to take this early return in code that I removed:

    JSContext* cx = GetCurrentWorkerThreadJSContext();
    if (NS_WARN_IF(!cx)) {
      return NS_ERROR_FAILURE;
    }

and now we press on and land in the code from comment 2. Then possibly either CycleCollectedJSContext::Get() returns null or its Context() getter returns null. That seems like the most likely source of the null-deref here.
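To illustrate why a null object pointer would show up as a near-null crash (such as the address 0x8 reported earlier) rather than a crash at exactly 0, here is a minimal sketch. The struct below is a hypothetical stand-in and makes no claim about the real CycleCollectedJSContext layout; it only assumes that the field being read is not the first member.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in types; NOT the real Gecko classes or their layout.
struct FakeJSContext {};

struct FakeCycleCollectedContext {
  void* mFirstField;        // occupies offset 0
  FakeJSContext* mContext;  // lands at offset sizeof(void*)
};

// The address a load of `mContext` would touch if the object pointer were
// null: simply the field's offset within the object.
constexpr std::size_t FaultAddressForNullObject() {
  return offsetof(FakeCycleCollectedContext, mContext);
}

// Sanity check on the assumption that the field is not the first member.
inline void CheckFieldIsNotFirst() {
  assert(FaultAddressForNullObject() != 0);
}
```

On a typical 64-bit build the offset is the size of one pointer, which would match a fault at a small address like the one in the crash reports. A null JSContext returned by a valid object could fault in a similar way further down the call chain, so this sketch does not tell the two cases apart.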

We're getting crashes because either there's no CycleCollectedJSContext or it
has a null JSContext. Hard to tell which, and whether this is happening
because our runnable comes really early in thread setup or really late in
thread teardown. In either case, this is restoring the null-check that used to
be there in this code.
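The shape of the restored check can be sketched as follows. This is a simplified model, not the landed patch: FakeWorkerThread, RunResult, and RunRunnable are hypothetical names introduced for illustration, and the real code returns NS_ERROR_FAILURE rather than an enum value.

```cpp
#include <cassert>

// Hypothetical stand-ins; these are NOT the real Gecko types or signatures.
struct FakeJSContext {};

enum class RunResult { Ok, Failure };

// Models a worker thread whose JS context may be absent during early
// setup or late teardown.
struct FakeWorkerThread {
  FakeJSContext* context = nullptr;
};

// Bail out before any code that assumes a live context gets a chance to run.
inline RunResult RunRunnable(const FakeWorkerThread& thread, int& timesRun) {
  FakeJSContext* cx = thread.context;
  if (!cx) {
    return RunResult::Failure;  // analogous to returning NS_ERROR_FAILURE
  }
  ++timesRun;  // stands in for the real runnable body
  return RunResult::Ok;
}
```

With this guard, a runnable dispatched to a thread that has no context yet (or no longer has one) fails cleanly instead of dereferencing null. The sketch does not distinguish between a missing context object and a null JSContext inside it, matching the uncertainty described above.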

Pushed by bzbarsky@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ff88b03d2dd9
Fix crash when trying to run worker runnables on a not-ready-for-it worker thread.  r=baku
Flags: needinfo?(bzbarsky)
Status: NEW → RESOLVED
Closed: 3 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla70
Assignee: nobody → bzbarsky