Bug 1663512 Comment 1 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

We are clearly triggering this diagnostic assert: https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/dom/workers/remoteworkers/RemoteWorkerManager.cpp#157-158
and so [E10SUtils.getRemoteTypeForWorkerPrincipal](https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/toolkit/modules/E10SUtils.jsm#644-650) did throw an exception.

On non-nightly channel the diagnostic assert wouldn't be triggered and so it wouldn't crash, but the remote worker being launched would be aborted and so it would still be an issue on non nightly channels.

It would be interesting to know which error condition we are triggering in those calls to `E10SUtils.getRemoteTypeForWorkerPrincipal`, e.g. to double-check if any of our assumptions in that method isn't actually met, but unfortunately we are not currently including in the diagnostic assert message enough details to determine that.

The following are some of the scenarios that would make `E10SUtils.getRemoteTypeForWorkerPrincipal` to throw for sure:

- [the worker principal was an expanded principal](https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/toolkit/modules/E10SUtils.jsm#651-656)
- [a service worker with a non content principal](https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/toolkit/modules/E10SUtils.jsm#658-664)
- [the call to E10SUtils.getRemoteTypeForURIObject to be raising an exception](https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/toolkit/modules/E10SUtils.jsm#710-719)
- [a shared worker with a non content principal for which we were not able to select a remote type](https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/toolkit/modules/E10SUtils.jsm#722-725)

I think that some reasonable next steps to handle this could be:

- adding some additional details to the diagnostic assert to help us to better assess what scenario is making us to trigger this diagnostic assertion (e.g. a subset of the details related to the call parameters, like the principal type and the URI scheme, the worker type and the preferred remote type may be enough).

- bring back the old `RemoteWorkerManager::GetRemoteType` and switch between the old and new logic based on a `mirror: once` static pref, and keep the new logic enabled on Nightly only for now, until we have this figured out and we are confident that we can let the new logic to ride the train (especially given that, based on the telemetry environment related to the crash reports I did look into, the diagnostic assert is being triggered with fission disabled).

I'm needinfo-ing Andrew to let him know about this issue, and to get his opinion about the issue and the proposed next steps described above.
We are clearly triggering this diagnostic assert: https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/dom/workers/remoteworkers/RemoteWorkerManager.cpp#157-158
and so [E10SUtils.getRemoteTypeForWorkerPrincipal](https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/toolkit/modules/E10SUtils.jsm#644-650) did throw an exception.

On non-nightly channel the diagnostic assert wouldn't be triggered and so it wouldn't crash, but the remote worker being launched would be aborted and so it would still be an issue on non nightly channels.

It would be interesting to know which error condition we are triggering in those calls to `E10SUtils.getRemoteTypeForWorkerPrincipal`, e.g. to double-check if any of our assumptions in that method isn't actually met, but unfortunately we are not currently including in the diagnostic assert message enough details to determine that.

The following are some of the scenarios that would make `E10SUtils.getRemoteTypeForWorkerPrincipal` to throw for sure:

- [the worker principal was an expanded principal](https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/toolkit/modules/E10SUtils.jsm#651-656)
- [a service worker with a non content principal](https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/toolkit/modules/E10SUtils.jsm#658-664)
- [the call to E10SUtils.getRemoteTypeForURIObject to be raising an exception](https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/toolkit/modules/E10SUtils.jsm#710-719), inside this method we do explicitly throw an error if we are selecting a remote type for a worker principal and we met one of this unexpected scenario:
  - [the worker principal URI did have an "ext+" scheme](https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/toolkit/modules/E10SUtils.jsm#517-525)
  - [the worker principal URI was a nested URI](https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/toolkit/modules/E10SUtils.jsm#539-547)
  - [the call to validateWebRemoteType to be raising an exception](https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/toolkit/modules/E10SUtils.jsm#565-572), inside this method we do explicitly throw an error if we are selecting a remote type for a worker principal and we met one of this unexpected scenario:
    - [fission is enabled and hasPotentiallyWebHandledScheme did return true for the worker principal uri](https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/toolkit/modules/E10SUtils.jsm#188-196) 
- [a shared worker with a non content principal for which we were not able to select a remote type](https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/toolkit/modules/E10SUtils.jsm#722-725)

I think that some reasonable next steps to handle this could be:

- adding some additional details to the diagnostic assert to help us to better assess what scenario is making us to trigger this diagnostic assertion (e.g. a subset of the details related to the call parameters, like the principal type and the URI scheme, the worker type and the preferred remote type may be enough).

- bring back the old `RemoteWorkerManager::GetRemoteType` and switch between the old and new logic based on a `mirror: once` static pref, and keep the new logic enabled on Nightly only for now, until we have this figured out and we are confident that we can let the new logic to ride the train (especially given that, based on the telemetry environment related to the crash reports I did look into, the diagnostic assert is being triggered with fission disabled).

I'm needinfo-ing Andrew to let him know about this issue, and to get his opinion about the issue and the proposed next steps described above.

Back to Bug 1663512 Comment 1