Closed Bug 1738713 Opened 1 month ago Closed 22 days ago

browser.contentScripts.register breaks popup blocker on local files

Categories

(WebExtensions :: General, defect, P1)

Firefox 95
defect

Tracking

(firefox-esr91 wontfix, firefox93 wontfix, firefox94 wontfix, firefox95 wontfix, firefox96 verified)

VERIFIED FIXED
96 Branch
Tracking Status
firefox-esr91 --- wontfix
firefox93 --- wontfix
firefox94 --- wontfix
firefox95 --- wontfix
firefox96 --- verified

People

(Reporter: bugzilla, Assigned: rpl)

References

Details

(Whiteboard: [addons-jira])

Attachments

(6 files, 2 obsolete files)

Attached file registercontentscriptbug-1.0.zip (obsolete) —

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:93.0) Gecko/20100101 Firefox/93.0

Steps to reproduce:

  1. Install the minimal reproduction web extension attached
  2. enable the popup blocker
  3. open the attached test.html with the file-protocol
  4. click "OpenSites"
  5. close the opened tab and allow the other two tabs to be opened
  6. close the two tabs
  7. navigate to google.com in the tab with the local file
  8. navigate back
  9. click "OpenSites"

Actual results:

Only one tab is opened

Expected results:

All three tabs should open.

Attached file test.html (obsolete) —

Hello,

I reproduced the issue on the latest Nightly (96.0a1/20211102213508), Beta (95.0b2/20211102190739), Release (94.0/20211028161635) and Release 93.0 (93.0/20210927210923) under Windows 10 x64 and Ubuntu 16.04 LTS.

The issue occurs as mentioned by the reporter in the bug description.

Status: UNCONFIRMED → NEW
Ever confirmed: true

I haven't been able to reproduce the issue locally, I may be missing something in the STR (In particular I'm not sure if I got point 5 and 7 correctly in the STR I tried locally) or maybe some settings in about:config are different and prevents me to trigger the same side effect.

Would you mind to attach to this bug the details got from the "about:support" page?
In there any error logged in the Browser Console once you trigger the issue at step 9 of the comment 0 STR?

In the meantime, I did notice a detail that may be related to the issue (being able to reproduce the issue locally would allow me to confirm if it is related and how the side-effect described in comment 0 is getting triggered):

  • the content script code registered in the main.js script does have a typo (the string isn't closed, and the console.log function call is also missing the closing ) parenthesis), which is also confirmed by the following SyntaxError logged in the Browser Console (unfortunately not in the tab webconsole, see Bug 1410932):
SyntaxError: "" string literal contains an unescaped line break

(the test.html page page does also have a typo, the script tag has a =text="" property instead of text="", but the script is clearly still be part of the page and so I assume it may not make any actual difference for the STR).

Can the issue be triggered if you move the content script into the manifest? (with the same property and same typo in the js content script file set in the manifest content script)

Flags: needinfo?(bugzilla)

Corrected version of the web extension

Attached file test.html

Corrected version of the test file

Attached video 2021-11-05.mp4
Attachment #9248724 - Attachment is obsolete: true
Attachment #9248725 - Attachment is obsolete: true
Attached file about:support content
Sorry for the errors in the file - I corrected them and am still able to reproduce the problem. The real world problem is that: 
I created a video to show my steps - it was a fresh instance that was started with web-ext

The pasting of the about:support content messed up my comment... here it's in full glory.

Sorry for the errors in the file - I corrected them and am still able to reproduce the problem. The real world problem is that: https://github.com/kkapsner/CanvasBlocker/issues/573#issuecomment-884695400
In CanvasBlocker the problem disappears if I remove the dynamic content script: https://github.com/kkapsner/CanvasBlocker/blob/master/lib/main.js#L27
There are still plenty of other content scripts in the manifest with the same properties: https://github.com/kkapsner/CanvasBlocker/blob/master/manifest.json#L30

I created a video to show my steps - it was a fresh instance that was started with web-ext

Flags: needinfo?(bugzilla)

(In reply to kkapsner from comment #8)

The pasting of the about:support content messed up my comment... here it's in full glory.

Sorry for the errors in the file - I corrected them and am still able to reproduce the problem. The real world problem is that: https://github.com/kkapsner/CanvasBlocker/issues/573#issuecomment-884695400
In CanvasBlocker the problem disappears if I remove the dynamic content script: https://github.com/kkapsner/CanvasBlocker/blob/master/lib/main.js#L27
There are still plenty of other content scripts in the manifest with the same properties: https://github.com/kkapsner/CanvasBlocker/blob/master/manifest.json#L30

I created a video to show my steps - it was a fresh instance that was started with web-ext

No worries at all about the errors, I wasn't sure if it was part of the issue or of the reason why I wasn't able to reproduce, but it was worth to mention it in any case.

Thank you a lot for also recording a screencast of the STR, that's helpful and very very much appreciated!
I think that I see now that I wasn't using the fully correct STR.

I'm going to give it a try asap and I'm pretty sure I will be able to reproduce it and dig more into it this time (and so I'm also setting a needinfo assigned to me as a reminder).

Flags: needinfo?(lgreco)

I've been able to reproduce this locally, I need to dig a bit more into it to get a full picture but here is some initial findings:

  • the issue can be triggered using browser.contentScripts.register with a js code snippet (it doesn't matter if the code snippet does fail to parse or execute)
  • the issue is NOT triggered anymore if the exact same script is registered by a browser.contentScripts.register call using a js file instead (same script content, just moved into a file packaged with the extension)
    • and I'm pretty sure this is also the reason why a content script registered from a manifest does also not trigger this issue (as the reported did also mention)
  • the issue gets triggered after the tab where the test.html page was loaded does switch between processes (in the STR from comment 0 is going from a file child process to a isolated child process for google.com, but the same issue seems to be also triggered is the tab is switched back and forth from the file url and an url that is loaded in the parent process, e.g. about:support)
  • if there are two tabs opened with the test.html page and only one is navigated to a url that requires a process switch, then the issue is not triggered, which means that the issue doesn't happen if the file child process was still around

The main difference between the dynamically registered content scripts that are registered using a code snippet vs. a relative url to a file packaged with the extension is that the code snippet is loaded from a blob url and the one using a packaged file is loaded from a moz-extension url.

The fact that all the following conditions are required to triggers the bug:

  • a content script is loaded from a blob url
  • there has been a process switch for the tab that is triggering the popup blocker
  • navigating back to the test.html happens in a new process
    makes me think that this issue may be triggered by a race (related to trying to load from the blob url).

Will dig into it more (leaving my needinfo set on purpose to get back to this).

Flags: needinfo?(lgreco)
Flags: needinfo?(lgreco)

Thanks for the swift investigation.

I just took one more look into this and now I finally see what is going on and why:

  • In ContentParent::LaunchSubprocessResolve, we do:

  • When an extension has registered a content script by using a string of code, internally we use a blob url to load the content script from (which has the same principal of the extension that has registered it).

  • In PermissionManager::GetKeyForOrigin, we do early exit for any origin that doesn't have http / https or ftp as the origin scheme,
    which in turns means that the aKey parameter in the EnsurePermissionsByKey would be also set to "" for any TransmitPermissionForPrincipal calls with a principal that doesn't have one of those schemes.

  • And so when there is a blob url for which the permission key derived for its principal is going to be the empty string "", the call to EnsurePermissionsByKey(""_ns, ""_ns); that follows the TransmitPermissionsForPrincipal call for that blob url is going to early exit and no permissions to be sent to the child process <---- this is basically when the bug gets triggered

  • The blob url doesn't need to be a moz-extension url, any blob url that doesn't have an http or https origin would still trigger the same issue (e.g. I have been able to trigger the same issue without the test extension, by opening the webconsole on an about:support tab and manually by registering a blob url from there, which gets a null origin, but I think the same can also be triggered also from a file url and the tab affected isn't a file child process one anymore).

  • The issue isn't restricted to the popup permission, none of the permissions is being transmitted anymore.


[1] to be precise this has been the behavior starting at least from Bug 1472158 (I haven't checked how that looked before that change and if this was also happening before that change from another code path).

Assignee: nobody → lgreco
Status: NEW → ASSIGNED
Flags: needinfo?(lgreco)
Whiteboard: [addons-jira]

Hi Nika,
I'd like to get also your perspective on the underlying issue (described in detail in comment 12).
Follows some more notes about the test case and proposed fix included in a patch that I'm about to attach to this issue.


I'm about to attach a patch that includes:

  • an xpcshell test that consistently triggers the issue (without using any extension, as described in comment 12 any blob url for a principal origin without an http or https scheme is going to be able to trigger the exact some issue)
  • a small change as the current proposed fix, which at the moment is basically moving the call to EnsurePermissionsByKey(""_ns, ""_ns) to right before the loop over all blob urls

The xpcshell test confirms that transmitting the blob urls permissions after we call EnsurePermissionsByKey(""_ns, ""_ns) does prevent the issue, at a first glance it seems that calling EnsurePermissionsByKey(""_ns, ""_ns) slightly earlier than we do at the moment should trigger new issues / side effects.

I also pushed the patch to try here, I have to look more closely, but at the moment I didn't spotted any unexpected failures related to that change.

what do you think about this approach to fix the issue?

As a side note:
another approach I was briefly looking into is if we could make PermissionManager::GetKeyForOrigin to not return an empty string for the moz-extension principal, but as I also mention in comment 12 (and shown with the test case I'm about to attach) the issue isn't limited to the blob urls created by the extension principals and so it may not be an easy approach (and may still some changes for the bug to be still reproducible if for some blob principals we would still be returning an empty string as the permission key).

Flags: needinfo?(nika)

Clearing needinfo, Nika already replied on the phabricator revision attached, and we will continue to discuss about how the complete fix should look like on phabricator.

Flags: needinfo?(nika)
Severity: -- → S2
Priority: -- → P1
Pushed by luca.greco@alcacoop.it:
https://hg.mozilla.org/integration/autoland/rev/8f381f68600a
Fix permission not sent over ipc while non-http/https blob urls exists. r=nika
Status: ASSIGNED → RESOLVED
Closed: 22 days ago
Resolution: --- → FIXED
Target Milestone: --- → 96 Branch

Verified the fix on the latest Nightly (96.0a1/20211115215316) under Windows 10 x64 and Ubuntu 16.04 LTS.

After proceeding with the STR from the original description of the issue, clicking on “OpenSites” will open all three tabs, confirming the fix. For further details, see the attached video.

Status: RESOLVED → VERIFIED
Attached video 2021-11-16_08h34_27.mp4

Looks good to me in the Nightly build.

The patch landed in nightly and beta is affected.
:rpl, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(lgreco)

Based on what I've seen, I suspect that the issue started with Bug 1472158, which landed in Firefox 74.

The change should be low risk enough for an uplift to beta (and the patch does also cover it with a test case), but unless we get some specific reasons why we may want to get this to release sooner, personally I think we may just leave it to ride the train.

Flags: needinfo?(lgreco)

The scenario is not very common and very specific so I would say version 96 is fine.

You need to log in before you can comment on or make changes to this bug.