framescripts are not backwards loaded in message order in non-e10s

RESOLVED INCOMPLETE

Status

()

defect
P3
normal
RESOLVED INCOMPLETE
4 years ago
a month ago

People

(Reporter: noitidart, Unassigned)

Tracking

41 Branch
Points:
---

Firefox Tracking Flags

(e10s+)

Details

(Whiteboard: triaged)

(Reporter)

Description

4 years ago
User Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0
Build ID: 20150831172306

Steps to reproduce:


0) in non e10s (im using beta channel 41b6 but this has been an issue for awhile now, now i know its not just me)
1) open 1 tab, about:blank
2) install addon: https://github.com/Noitidart/modtools which loads frame scripts to all tabs, the framescript checks the location and logs it, it also attaches a DOMContentLoaded listener, so if you reload that tab you will see "about:blank" log again to console
3) now install the addon again. the sutdown of the old addon broadcasts async message telling framescripts to unregister. HOWEVER this message does not go out yet, the shutdown completes, and the startup of hte new version of hte addon comes in and  you will see the load frame scripts of the updated version will happen. after the loading of the new version is complete, you will then see the unregister message come in, and all framescripts are unregistered. proof of this, is reload tab, you will no longer see "about:blank" getting logged



Expected results:

The yellow block here https://developer.mozilla.org/en-US/Firefox/Multiprocess_Firefox/Message_Manager/Frame_script_loading_and_lifetime#Unloading_frame_scripts

says:



Note: you might think that there is a race condition here due to the asynchronous nature of the message passing:

    your add-on is disabled for an upgrade
    your add-on broadcasts "disable" to your frame scripts
    your add-on is upgraded, and the new code loads new frame scripts
    the new frame scripts receive the "disable" message, and stop working

In fact, the message manager guarantees that loadFrameScript and broadcastAsyncMessage are guaranteed to affect frame scripts in the order that they are called, so in this case "disable" will be received and consumed before the new frame scripts are loaded.



They are wrong, at least for non-e10s, and i need backwards compat for esr. but it should be right.
Flags: needinfo?(wmccloskey)
Yes, this is an unfinished part of bug 1131375. I tried to fix it in attachment 8563169 [details] [diff] [review] but that didn't work out.

The basic problem is that frame scripts are loaded immediately in non-e10s. They should be loaded off an event like other message manager messages so that they're run in order.

Olli, is there any chance you have time to look at this? I think the main problem was with window.open, where we need to make sure that frame scripts *are* loaded synchronously before we return to content.
Component: Untriaged → DOM
Flags: needinfo?(wmccloskey) → needinfo?(bugs)
Product: Firefox → Core
Summary: framescripts are not backwards compatbile with non-e10s → framescripts are not backwards loaded in message order in non-e10s
(Reporter)

Comment 2

4 years ago
Oh man thanks Bill for the confimration. I was banging my head thinking it was me. I'll make a temporary note on MDN until this bug clears up to save others the headache.
(In reply to Bill McCloskey (:billm) from comment #1)
> Yes, this is an unfinished part of bug 1131375. I tried to fix it in
> attachment 8563169 [details] [diff] [review] but that didn't work out.
> 
> The basic problem is that frame scripts are loaded immediately in non-e10s.
> They should be loaded off an event like other message manager messages so
> that they're run in order.
That would be super regression risky. The idea is that after you've loaded scripts on one side you can assume the other side has them executed. 

> 
> Olli, is there any chance you have time to look at this? 
Well, it is not clear we want to change this.

>I think the main
> problem was with window.open, where we need to make sure that frame scripts
> *are* loaded synchronously before we return to content.
...as this clearly indicates.
Flags: needinfo?(bugs)
(Reporter)

Comment 4

4 years ago
I delayed my startup by 1sec as a quick solution, its working pretty good.

Comment 5

3 years ago
Why status of this bug is UNCONFIRMED?
In my extension, parent process script crashed after update. You think it is NORMAL? FF 48.

Comment 6

3 years ago
There is a workaround?
(Reporter)

Comment 7

3 years ago
CoolCmd, on startup(In reply to CoolCmd from comment #6)
> There is a workaround?

On startup of your addon, delay it by 1second using a timer such as - https://gist.github.com/Noitidart/7943f34cffc602a17e3e

Comment 8

3 years ago
noitidart, your workaround is bad. for example, while delaying, FF fails to open pages with custom protocol (registered by process script).

my workaround is following. it looks crazy, thanks to Mozilla for funny bug.

**** bootstrap.js ****

function startup(data, reason)
{
    Services.ppmm.loadProcessScript('process.js', true);
    Services.ppmm.broadcastAsyncMessage('coolcmd:process-initialize');
}

**** process.js ****

if (Services.appinfo.processType === Services.appinfo.PROCESS_TYPE_DEFAULT)
{
    addMessageListener('coolcmd:process-initialize', Initialize);
}
else
{
    Initialize();
}
function Initialize()
{
    removeMessageListener('coolcmd:process-initialize', Initialize);
    // ...
    addMessageListener('coolcmd:process-uninitialize', Uninitialize);
}

**** FF 48 e10s events on extension update ****

bootstrap.js shutdown
bootstrap.js uninstall
bootstrap.js load
bootstrap.js install
bootstrap.js startup
process.js parent load
process.js parent uninitialize
process.js child uninitialize
process.js parent initialize
process.js child load
process.js child initialize
(Reporter)

Comment 9

3 years ago
(In reply to CoolCmd from comment #8)
> noitidart, your workaround is bad. for example, while delaying, FF fails to
> open pages with custom protocol (registered by process script).
> 
> my workaround is following. it looks crazy, thanks to Mozilla for funny bug.
> 
> **** bootstrap.js ****
> 
> function startup(data, reason)
> {
>     Services.ppmm.loadProcessScript('process.js', true);
>     Services.ppmm.broadcastAsyncMessage('coolcmd:process-initialize');
> }
> 
> **** process.js ****
> 
> if (Services.appinfo.processType === Services.appinfo.PROCESS_TYPE_DEFAULT)
> {
>     addMessageListener('coolcmd:process-initialize', Initialize);
> }
> else
> {
>     Initialize();
> }
> function Initialize()
> {
>     removeMessageListener('coolcmd:process-initialize', Initialize);
>     // ...
>     addMessageListener('coolcmd:process-uninitialize', Uninitialize);
> }
> 
> **** FF 48 e10s events on extension update ****
> 
> bootstrap.js shutdown
> bootstrap.js uninstall
> bootstrap.js load
> bootstrap.js install
> bootstrap.js startup
> process.js parent load
> process.js parent uninitialize
> process.js child uninitialize
> process.js parent initialize
> process.js child load
> process.js child initialize

What I do is, on framescript injection, after registering the custom prototocl, when it finds a badly loaded custom protocol page, it will reload it.

Comment 10

3 years ago
Hi Olli, based on your last comment 3 - is this something that can be worked on.  the work around isn't ideal.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(bugs)
Priority: -- → P2
Whiteboard: triaged
This is super regression risky to change for non-e10s, since
var b = document.createElement("browser");
browser.messageManager.loadFrameScript("data:,addEventListener('foo', function(e) { dump(e); }, true, true);", true);
parent.appendChild(b);
b.contentWindow.dispatchEvent(new b.contentWindow.Event("foo"););

is supposed to work. We would need to block all .contentWindow and such access from chrome side.
billm, has there been any ideas for that? It would make the setup in non-e10s more compatible to e10s, and would ease fixing issues like this one.
Flags: needinfo?(bugs) → needinfo?(wmccloskey)
(In reply to Olli Pettay [:smaug] from comment #11)
> This is super regression risky to change for non-e10s, since
> var b = document.createElement("browser");
> browser.messageManager.loadFrameScript("data:,addEventListener('foo',
> function(e) { dump(e); }, true, true);", true);
> parent.appendChild(b);
> b.contentWindow.dispatchEvent(new b.contentWindow.Event("foo"););
> 
> is supposed to work.

"Supposed to work" seems too strong to me. It happens to work right now, and people might be relying on it. But people rely on all sorts of weird things. However, I don't have a sense of how much would break here. My experience is that loading a frame script into an individual browser element (rather than a window or the global message manager) is almost always going to break anyway due to things like remoteness switching.

Anyway, I feel like if we were going to fix this, we should have done it a long time ago. Most people who want to use frame scripts have probably worked around it somehow. Hopefully everyone else will use WebExtensions or the SDK.

> We would need to block all .contentWindow and such
> access from chrome side.
> billm, has there been any ideas for that? It would make the setup in
> non-e10s more compatible to e10s, and would ease fixing issues like this one.

That seems hard.
Flags: needinfo?(wmccloskey)

Comment 13

3 years ago
:noitidart how about the following workaround ?

Get a unique id during bootstrap (timestamp/random id), broadcast this id after framescript load, and during shutdown, broadcast message with this id. Inside the framescript, only unload if the id matches. 
This is considering that all messages are received in sequence. So when shutdown message reaches the newly loaded script, they're only loaded and not yet received of their unique-id. Only the old scripts with matching ids will be unloaded.

Roughly like below.

Bootstrap
=========

const unique_id = Date.now();

function startup() {
   ..
   Services.mm.loadFrameScript('chrome://myaddon/content/framescript.js', true);
   Services.mm.broadcastAsyncMessage('my-addon:unique-id', {id : unique_id});
   ..
}

function shutdown() {
   ..
   Services.mm.broadcastAsyncMessage('ssleuth@github:shutdown', { id: unique_id });
   ..
}

Frame script
============
    unique_id = null;
..
    onUnload(msg) {
         if (unique_id === msg.data.id) 
                shutdown();
    }
    onUniqueId(msg) {
         unique_id = msg.data.id;
    }

..
    addMessageListener("my-addon:unique-id", onUniqueId);
    addMessageListener("my-addon:unload", onUnload);
..

If there is a way to send data while loading a script we could send the id right away, and avoid adding a listener for id.  Like : chrome://myaddon/content/framescript.js?unique-id (if we could get the framescript URI from within the frame script so as to parse the id ?)
(Reporter)

Comment 14

3 years ago
:sibi.anthony that looks like it would work. It's complicated. I found that working around it with complicated methods like that is not worth it. As we should be designing for an e10s future.

The most fool proof and easiest method I found was on startup of the addon just do a 1 second wait and it resolves all issues. Easy mindless solution. :)

The added side affect of 1sec wait is that it helps with Firefox startup times if your addon does some heavy startup procedures (which it shouldn't, but if it does the 1s wait helps UX).
Flags: needinfo?(sibi.antony)

Comment 15

3 years ago
:noitidart Yes, the wait should work too, although I didn't want to put a delay in the add-on code.

In my case, appending a unique id to the framescript was recommended (to avoid any cache clearing issues), so just wanted to leverage on that.
Flags: needinfo?(sibi.antony)
Priority: P2 → P3
So is the plan to just not fix this and it will just go away when we switch completely to e10s?

(Note we will still have system bootstrap extensions after 57, so they will continue to run into this).
With Firefox 57 only WebExtensions are permitted and are, by default, e10s compatible.
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → INCOMPLETE
Component: DOM → DOM: Core & HTML
Product: Core → Core
You need to log in before you can comment on or make changes to this bug.