Port |Bug 1182569 - Use asyncOpen2() for docshell loads (nsJSProtocolHandler and nsURILoader)| to TB

RESOLVED FIXED in Thunderbird 53.0

Status

Thunderbird
General
RESOLVED FIXED
5 months ago
4 months ago

People

(Reporter: Jorg K (GMT+2), Assigned: Jorg K (GMT+2))

Tracking

52 Branch
Thunderbird 53.0
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(6 attachments, 1 obsolete attachment)

(Assignee)

Description

5 months ago
Mozmill debug runs are busted
https://treeherder.mozilla.org/#/jobs?repo=comm-central&revision=7f8fe6bf3004d4968569227b049ae4a1e31affde

Assertion failure: false (need one securityflag from nsILoadInfo to perform security checks), at c:/builds/moz2_slave/tb-c-cen-w32-d-000000000000000/build/mozilla/dom/security/nsContentSecurityManager.cpp:31
TEST-UNEXPECTED-FAIL | Disconnect Error: Application unexpectedly closed

Looks like this was caused by bug 1182569.

Now a MOZ_ASSERT at nsContentSecurityManager.cpp:31 is triggered:

01:37:32     INFO -  Assertion failure: false (need one securityflag from nsILoadInfo to perform security checks), at c:/builds/moz2_slave/tb-c-cen-w32-d-000000000000000/build/mozilla/dom/security/nsContentSecurityManager.cpp:31

Christoph, can you advise us here.

I'll get an exact stack-dump.
Flags: needinfo?(ckerschb)
(Assignee)

Comment 1

5 months ago
Looks like Xpcshell test are broken for the same reason:

TEST-UNEXPECTED-FAIL | mailnews/compose/test/unit/test_attachment.js | xpcshell return code: 1
PROCESS | 5464 | Assertion failure: false (need one securityflag from nsILoadInfo to perform security checks), at c:/builds/moz2_slave/tb-c-cen-w32-d-000000000000000/build/mozilla/dom/security/nsContentSecurityManager.cpp:31

I'd say a Daily build would be pretty much unusable.
(Assignee)

Comment 2

5 months ago
I think the problem is that we use SEC_NORMAL for pretty much everything
https://dxr.mozilla.org/comm-central/search?q=SEC_NORMAL&redirect=false
apart from chat and calendar which occasionally use SEC_REQUIRE_SAME_ORIGIN_DATA_INHERITS.

So what should we be using? Looks like it needs to be one of these:
  if (securityMode != nsILoadInfo::SEC_REQUIRE_SAME_ORIGIN_DATA_INHERITS &&
      securityMode != nsILoadInfo::SEC_REQUIRE_SAME_ORIGIN_DATA_IS_BLOCKED &&
      securityMode != nsILoadInfo::SEC_ALLOW_CROSS_ORIGIN_DATA_INHERITS &&
      securityMode != nsILoadInfo::SEC_ALLOW_CROSS_ORIGIN_DATA_IS_NULL &&
      securityMode != nsILoadInfo::SEC_REQUIRE_CORS_DATA_INHERITS) {
    MOZ_ASSERT(false, "need one securityflag from nsILoadInfo to perform security checks");
    return NS_ERROR_FAILURE;
  }

I'm not sure how origin checking will work for our Mailnews URIs (like mailbox and IMAP), for example:
mailbox:///C:/Users/jorgk/AppData/Roaming/Thunderbird/Profiles/testing.testing/Mail/Local%20Folders/Trash?number=1
imap://info%40hostalpancheta%2Ees@mail.hostalpancheta.es:993/fetch%3EUID%3E.INBOX%3E10
or more legible (unescaped):
imap://info@hostalpancheta.es@mail.hostalpancheta.es:993/fetch>UID>.INBOX>10
Wow, that's a lot of files you have to update [1]. Indeed you would need to go through that list and use one of the security flags from nsILoadInfo [2]. Are these all top level loads? I mean, what's the contentPolicyType in the loadinfo of those loads? If it's TYPE_DOCMENT and TYPE_SUBDOCUMENT (that's what I assume) then you probably can base your security flags on docshell's security flags [3].

[1] https://dxr.mozilla.org/comm-central/search?q=SEC_NORMAL&redirect=false
[2] https://dxr.mozilla.org/mozilla-central/source/netwerk/base/nsILoadInfo.idl#53
[3] https://dxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.cpp#10866
Flags: needinfo?(ckerschb)
(Assignee)

Comment 4

5 months ago
nsDocShell.cpp#10867 uses nsILoadInfo::SEC_ALLOW_CROSS_ORIGIN_DATA_IS_NULL and that's what I would have used to start with.

I'm just uncertain of which call sites to fix first. Inspecting all the call sites can take a lot of time. Maybe not all of them trigger asyncOpen2(), so it's hard to tell which ones need changing.
(Assignee)

Comment 5

5 months ago
Most are TYPE_OTHER or TYPE_IMAGE:
https://dxr.mozilla.org/comm-central/search?q=nsIContentPolicy.TYPE_&redirect=false
https://dxr.mozilla.org/comm-central/search?q=nsIContentPolicy%3A%3ATYPE_&redirect=false

There's only one TYPE_DOCUMENT and one TYPE_CSP_REPORT, both in nsMsgContentPolicy.cpp which is part of the where some of the failures stem from.

Christoph, apart from "wow" and pointing to nsDocShell.cpp#10866, do you have further advice what to do with TYPE_CSP_REPORT, TYPE_OTHER or TYPE_IMAGE.

I'm going to change the call site that uses TYPE_DOCUMENT to SEC_ALLOW_CROSS_ORIGIN_DATA_IS_NULL.

I really have very little idea, and fixing all this by try and error will take ages. So any help is appreciated.

Can you base the decision on the principal? - all the calls I've inspected use the system principal.
Flags: needinfo?(ckerschb)
(Assignee)

Comment 6

5 months ago
TYPE_OTHER:
https://dxr.mozilla.org/comm-central/search?q=regexp%3AnsIContentPolicy.*TYPE_OTHER&redirect=false

TYPE_IMAGE:
https://dxr.mozilla.org/comm-central/search?q=regexp%3AnsIContentPolicy.*TYPE_IMAGE&redirect=false

Anything else:
https://dxr.mozilla.org/comm-central/search?q=regexp%3AnsIContentPolicy.*TYPE_%5B%5EIO%5D&redirect=false
Only one TYPE_DOCUMENT and one TYPE_CSP_REPORT, both in nsMsgContentPolicy.cpp, as mentioned.
(Assignee)

Comment 7

5 months ago
Sorry, forget TYPE_DOCUMENT and TYPE_CSP_REPORT, they appear in case statements.
(Assignee)

Comment 8

5 months ago
What about when using a null principal? I think a lot of trouble comes from here:
https://dxr.mozilla.org/comm-central/rev/6f666ad0456ce4e56c4ab24de882320fa28059ba/mailnews/compose/src/nsURLFetcher.cpp#328
(In reply to Jorg K (GMT+1) from comment #5)
> I'm going to change the call site that uses TYPE_DOCUMENT to
> SEC_ALLOW_CROSS_ORIGIN_DATA_IS_NULL.

If you have only that one callsite that uses TYPE_DOCUMENT then that should already fix your problem because we have only converted TYPE_DOCUMENT and TYPE_SUBDOCUMENT loads.
 
> I really have very little idea, and fixing all this by try and error will
> take ages. So any help is appreciated.
> 
> Can you base the decision on the principal? - all the calls I've inspected
> use the system principal.

If they really use the systemPrincipal then you can just replace SEC_NORMAL with SEC_ALLOW_CROSS_ORIGIN_DATA_IS_NULL, which is basically the default for systemPrincipal loads.
Flags: needinfo?(ckerschb)
(Assignee)

Comment 10

5 months ago
I will change SEC_NORMAL to SEC_ALLOW_CROSS_ORIGIN_DATA_IS_NULL where the system principal is used.

What about for a null principle, here:
https://dxr.mozilla.org/comm-central/rev/6f666ad0456ce4e56c4ab24de882320fa28059ba/mailnews/compose/src/nsURLFetcher.cpp#328

With these two as a first step, we'll eliminate most SEC_NORMAL and then we can inspect the remaining cases individually.

Sorry about bugging you with questions.
Flags: needinfo?(ckerschb)
(Assignee)

Comment 11

5 months ago
I've inspected all the call sites in C++ code and apart from one that uses system principal, they all use null principal. So what's the correct replacement there?
(Assignee)

Comment 12

5 months ago
Created attachment 8824089 [details] [diff] [review]
Part 1: Replace SEC_NORMAL where system principal is used
Assignee: nobody → jorgk
Status: NEW → ASSIGNED
(Assignee)

Comment 13

5 months ago
Landed Part 1:
https://hg.mozilla.org/comm-central/rev/c9742e200e3f10a385f0fd06ac2acf256543d438
(Assignee)

Comment 14

5 months ago
Nine call sites left with SEC_NORMAL in C++ code using null principle:

.\addrbook\src\nsAbContentHandler.cpp:
.\addrbook\src\nsAddbookProtocolHandler.cpp:
.\addrbook\src\nsAddbookProtocolHandler.cpp:
.\base\src\nsMessenger.cpp:
.\compose\src\nsMsgAttachmentHandler.cpp:
.\compose\src\nsMsgComposeService.cpp:
.\compose\src\nsSmtpService.cpp:
.\compose\src\nsURLFetcher.cpp:
.\imap\src\nsImapProtocol.cpp:

So I'm close ;-)
(Assignee)

Comment 15

5 months ago
This needs fixing in suite/ since SEC_NORMAL is used there, too.
Flags: needinfo?(philip.chee)
(Assignee)

Comment 16

5 months ago
Created attachment 8824098 [details] [diff] [review]
Part 2: Replace SEC_NORMAL where null principal is used

I'm unsure what the replacement should be, for now I've used SEC_ALLOW_CROSS_ORIGIN_DATA_IS_NULL as well. That might be wrong, but this patch identifies the call sites. The patch can easily be edited to match ;-)
Attachment #8824098 - Flags: feedback?(mkmelin+mozilla)
Attachment #8824098 - Flags: feedback?(ckerschb)
(Assignee)

Comment 17

5 months ago
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=c5a1e030e1c472070e5d90e64c21af8111db0a24
(Assignee)

Comment 18

5 months ago
Comment on attachment 8824098 [details] [diff] [review]
Part 2: Replace SEC_NORMAL where null principal is used

I had a very long conversation with Boris (bz) on IRC who tried to explain things to me.

The summary is that the nine call sites which use a null principle will need to be inspected and most likely the system principal will need to be used.

We inspected nsURLFetcher::FireURLRequest and confirmed this for that call site.

So I'll do a try run with all nine changed.
Flags: needinfo?(ckerschb)
Attachment #8824098 - Flags: feedback?(mkmelin+mozilla)
Attachment #8824098 - Flags: feedback?(ckerschb)
(Assignee)

Comment 19

5 months ago
Created attachment 8824213 [details] [diff] [review]
Part 2: Replace SEC_NORMAL where null principal is used

Second part to remove SEC_NORMAL.
Attachment #8824098 - Attachment is obsolete: true
(Assignee)

Comment 20

5 months ago
Created attachment 8824216 [details] [diff] [review]
Part 3: Use system principal in FireURLRequest().

I'm taking the liberty to put r=bz here since I've discussed this with Boris on IRC.

This fixes various (not all) Xpcshell failures. Mozmill will still fail, but we'll see how far we get.
Attachment #8824216 - Flags: review+
(Assignee)

Comment 21

5 months ago
https://hg.mozilla.org/comm-central/rev/c9742e200e3f10a385f0fd06ac2acf256543d438
https://hg.mozilla.org/comm-central/rev/efaa986897703cad3db5558fd266ba769be08541
https://hg.mozilla.org/comm-central/rev/1d934dcb80bcf60a56355e5b981de7bf3b25a902

Sorry, I'm taking a stepwise approach here. Part 1 and 2 remove the deprecated SEC_NORMAL and part 3 fixes FireURLRequest() which will fix all but one Xpcshell test.

There's more work here, so anyone who knows better, please chime in. Perhaps Kent wants to say something.
Flags: needinfo?(rkent)
(Assignee)

Updated

5 months ago
Keywords: leave-open
(Assignee)

Comment 22

5 months ago
Created attachment 8824244 [details] [diff] [review]
NOT LANDED: Part 4: Use system principal everywhere

Security experts, step forward. This is based on Boris' suspicion on IRC:

but yes, you need to examine what the principal should be
And chances are they should all be system, not null

I've run the test which I think still fail after part 3 manually
mozmake SOLO_TEST=content-policy/test-js-content-policy.js mozmill-one
mozilla/mach xpcshell-test mailnews/base/test/unit/test_accountMigration.js
and they still fail.

So this patch might not buy us anything. However, as Boris explained on IRC, only the system principal skips the CheckLoadURI. So we might want to switch to system anyway.

Here's a try run:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=c4b3eb84c87e70f9d3ea63c01655d1fd34d9c96c

I have to call it a day for now.

Oh, Boris suggested to use blame to find out who and why those null principals were added. Bingo!!
https://hg.mozilla.org/comm-central/rev/4823bbd4a723

Christoph added them himself in bug 1071497. That bug makes for interesting reading:

Bug 1071497 comment #6:
Can you tell me what the point of the nsIPrincipal* parameter is?

Bug 1071497 comment #7:
In case where we are not completely sure what the requestingPrincipal is, we currently use the nullPrincipal to be over-conservative and deal with finding the accurate requestingPrincipal at a later time.

Bug 1071497 comment #8:
A lot of this stuff doesn't make sense here because we're not dealing with web pages.

Bug 1071497 comment #11:
Any update on this? We're busted for almost a week now.

And the winner is, bug 1071497 comment #13:
Well, I guess things seem to work with the null principal for now...
(Now being 2014-09-30).

So more than two years onwards, we still don't know.
Attachment #8824244 - Flags: review?(rkent)
Attachment #8824244 - Flags: review?(mkmelin+mozilla)
(Assignee)

Comment 23

5 months ago
Also worth reading:
https://hg.mozilla.org/comm-central/rev/710fbcea1b16 - Bug 1099587.
There nsMsgQuote::QuoteMessage() was changed and the system principle was used.
And it turns out that Kent is the expert ;-)
(Assignee)

Comment 24

5 months ago
So with part 1, 2 and 3 landed, we're down to two test failures:
X: TEST-UNEXPECTED-FAIL | mailnews/base/test/unit/test_accountMigration.js
Z: TEST-UNEXPECTED-FAIL | /builds/slave/test/build/tests/mozmill/content-policy/test-js-content-policy.js
and no crashed, so that's already something.

As can be seen on try
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=c4b3eb84c87e70f9d3ea63c01655d1fd34d9c96c
part 4 makes no difference.

However, Boris was questioning how we access for instance the address book using a null principal. We suspect that that doesn't get tested, so a failure will go unnoticed.

Running TB without part 4 manually, I can still see addresses in the address book.
(Assignee)

Comment 25

5 months ago
Dear Boris and Christoph, with SEC_NORMAL changed to SEC_ALLOW_CROSS_ORIGIN_DATA_IS_NULL everywhere (since we used the system or null principal) and one call site changed from using null to system, we are left with two test failures.

One is in test-js-content-policy.js which is the test that failed due to bug 1308889 and was fixed in bug 1316256 by always running the content policy checks for TYPE_DOCUMENT:
https://hg.mozilla.org/mozilla-central/rev/7ba8dff4a242

The failure we saw is that the content of a <noscript> tag did not show in the message pane although we display mail with JS turned off.

Now we have the exact same problem. Our test failure is:
SUMMARY-UNEXPECTED-FAIL | test-js-content-policy.js | test-js-content-policy.js::test_jsContentPolicy
EXCEPTION: noscript display should be 'inline'; display=none

and in e-mail with
  <noscript>
  hello, this content is noscript!
  </noscript>
this content doesn't show. Sadly, we rely on this working, since otherwise out news reader doesn't work and displays a whole lot of HTML source. So once again, bug 1318859 has also regressed.

Debugging shows this:

=== nsMsgContentPolicy::ShouldLoad called mailbox:///D:/Desktop/Mozilla%20sort/noscript.eml?type=application/x-message-display&number=0
=== nsMsgContentPolicy::SetDisableItemsOnMailNewsUrlDocshells called mailbox:///D:/Desktop/Mozilla%20sort/noscript.eml?type=application/x-message-display&number=0
=== nsMsgContentPolicy::SetDisableItemsOnMailNewsUrlDocshells return 1

Here's the code that produces the debug:
  // If there's no docshell to get to, there's nowhere for the JavaScript to
  // run, so we're already safe and don't need to disable anything.
  if (!aRequestingContext) {
    printf("=== nsMsgContentPolicy::SetDisableItemsOnMailNewsUrlDocshells return 1\n");
    return NS_OK;
  }

So looks like nsMsgContentPolicy::ShouldLoad() and hence nsMsgContentPolicy::SetDisableItemsOnMailNewsUrlDocshells() now gets called with aRequestingContext==nullptr.

Is this expected?

If I skip this test, I'll exit/crash later on
nsCOMPtr<nsIFrameLoaderOwner> flOwner = do_QueryInterface(aRequestingContext, &rv);
rv = flOwner->GetFrameLoaderXPCOM(getter_AddRefs(frameLoader));

Due to the early exit from the function docShell->SetAllowJavascript(false); is never called.

Your help or hints would be much appreciated.
Blocks: 1182569
Flags: needinfo?(ckerschb)
Flags: needinfo?(bzbarsky)
(Assignee)

Comment 26

5 months ago
Oh, I was meant to add: Perhaps time to address bug 1322617 and bug 1322618.
(Assignee)

Comment 27

5 months ago
Created attachment 8824350 [details]
1328847-stack.txt

Here is the stack, perhaps that helps.

The requesting context is null (nullptr) all the way down from DoContentSecurityChecks() where it is determined.

contentPolicyType==nsIContentPolicy::TYPE_DOCUMENT, so that context would have come from requestingContext = aLoadInfo->LoadingNode();
Verified with breakpoint.

So something has changed so that the context comes back as null now.
(Assignee)

Comment 28

5 months ago
Created attachment 8824352 [details]
1328847-stack2.txt

Oops, this gets called multiple times, here's the call that comes out of mailnews/

Updated

5 months ago
Blocks: 1329186

Comment 29

5 months ago
> nsImapProtocol.cpp
> -                         nsILoadInfo::SEC_NORMAL,
> +                         nsILoadInfo::SEC_ALLOW_CROSS_ORIGIN_DATA_IS_NULL,
I think this should be:
SEC_REQUIRE_SAME_ORIGIN_DATA_IS_BLOCKED
Plus add a comment like:
// The following channel is never openend, so it does not matter what
// securityFlags we pass; let's follow the principle of least privilege.

Probably the same for:
mailnews/imap/test/unit/test_imapContentLength.js
mailnews/local/test/unit/test_mailboxContentLength.js
mailnews/news/test/unit/test_nntpContentLength.js
Flags: needinfo?(philip.chee)
(Assignee)

Comment 30

5 months ago
Thanks for your input. Given the comment "it does not matter what securityFlags we pass", and the fact that SEC_ALLOW_CROSS_ORIGIN_DATA_IS_NULL (new) is more or less the equivalent of SEC_NORMAL (old), that's a purely academic discussion, particularly in the light of two perma-red test failures that we need to solve here.

I'm interested in those and the question whether we should switch from using a null principal to using a system principal as suggested in Part 4 (which doesn't solve any test failures).
(Assignee)

Comment 31

5 months ago
Looking at the Xpcshell test failure:
mozilla/mach xpcshell-test mailnews/base/test/unit/test_accountMigration.js

  // Migration should now have added permissions for the address that had them
  // and not for the one that didn't have them.
  do_check_true(Services.prefs.getIntPref("mail.ab_remote_content.migrated") > 0);
  do_check_eq(Services.perms.testPermission(uriAllowed, "image"),
              Services.perms.ALLOW_ACTION);
  do_check_eq(Services.perms.testPermission(uriAllowed2, "image"),
              Services.perms.ALLOW_ACTION);
  do_check_eq(Services.perms.testPermission(uriDisallowed, "image"),
              Services.perms.UNKNOWN_ACTION); <=== this fails.

This test fails as indicated, if I s/UNKNOWN_ACTION/ALLOW_ACTION/ is passes. So something has become more permissive here. Now we have to understand the test to see what's going on :-(
> The requesting context is null (nullptr) all the way down from DoContentSecurityChecks() where it is determined.

Uh.... That's because of bug 1329288, which I just filed.  That needs to get fixed, pronto.
Depends on: 1329288
Flags: needinfo?(bzbarsky)
(Assignee)

Comment 33

5 months ago
Thanks, Boris, for taking this burden off our shoulders. So our Mozmill failure will just silently be fixed by bug 1329288.

We'll use this bug for the Xpcshell failure then, which I still have to investigate, and for any other patches we want to land in mailnews/.

Talking of which, the failing test was written by the three B's Banner, BenB and Bienvenu, so God help us to understand it, or perhaps Magnus, who's also landed some patches on the file.

I've talked to Magnus in the meantime and his stance was not to change the calls from null principal to system principal to stay conservative.
Flags: needinfo?(ckerschb)
> We'll use this bug for the Xpcshell failure then

It's worth logging what calls to migrateAddress over at http://searchfox.org/comm-central/source/mailnews/base/util/mailnewsMigrator.js#133 happen during this test in the passing and failing cases.  At least it's a place to start.

Updated

5 months ago
Attachment #8824089 - Flags: review+

Updated

5 months ago
Attachment #8824213 - Flags: review+

Comment 35

5 months ago
(In reply to Jorg K (GMT+1) from comment #33)
> I've talked to Magnus in the meantime and his stance was not to change the
> calls from null principal to system principal to stay conservative.

What are the security implications of null vs system principal?

Comment 36

5 months ago
Null principal has only minimal privileges, system principal can do everything. So to avoid unexpected privileges for code you didn't intend to have them it's better to use as little privileges as possible, so long as the code can still do what it was designed to do.
(Assignee)

Comment 37

5 months ago
That's a very good question, please refer to my comment #22 where back in 2014 they already didn't know.

As I said, Boris chatted to me for a long time on IRC ...
[2017-01-05 20:33:43] <bz> jorgk: So the "null principal" vs "system principal" is really not that important a distinction
[2017-01-05 20:33:57] <bz> Sure, anything going through FireURLRequest would be affected by the changes
[2017-01-05 20:34:37] <bz> So afaict, the status of this stuff in tbird is that some of it was already broken but no one noticed
[2017-01-05 20:34:44] <bz> because it wasn't tested
[2017-01-05 20:34:57] <bz> "broken" in that it passed SEC_NORMAL through an AsyncOpen2 codepath
[2017-01-05 20:35:13] <jorgk> sure. But some tests exercise FireURLRequest 
[2017-01-05 20:35:14] <bz> Anyway, bkelly is right
[2017-01-05 20:35:40] <bz> someone who knows what these callsites are meant to do should decide what their security policy is
[2017-01-05 20:35:41] <jorgk> bkelly is right in saying that we need to choose?
[2017-01-05 20:35:58] <bz> The nullptr vs system principal distinction is mostly a red herring
[2017-01-05 20:36:14] <jorgk> well, franky, no one knows about these call sites.
[2017-01-05 20:36:22] <bz> I mean, when used with asyncOpen2 it affects what CheckLoadURI will do
[2017-01-05 20:36:32] <bz> And hence whether the load is allowed at all
[2017-01-05 20:36:42] <bz> But it doesn't affect the security _policy_ for the load, necessarily.
[2017-01-05 20:36:50] <bz> jorgk: well, then we have a problem
[2017-01-05 20:37:01] <bz> jorgk: someone needs to sit down and examine them and make some decisions.

Here he explains the idea of asyncOpen2:
[2017-01-05 20:56:56] <bz> The old necko/gecko setup was that necko consumers would create a channel and ask necko to open it
[2017-01-05 20:57:00] <bz> and necko would do that and get data.
[2017-01-05 20:57:30] <bz> If consumers wanted some sort of security policy (e.g. checks for whether that URI was allowed to be loaded by whoever was doing the load), it was on them to make those checks before asking necko to do the load.
[2017-01-05 20:57:48] <bz> So we had CheckLoadURI checks, content policy checks, etc scattered all through our code
[2017-01-05 20:57:53] <bz> At every spot where we called into necko
[2017-01-05 20:58:06] <bz> What Christoph has been doing is moving those security checks into necko
[2017-01-05 20:58:50] <bz> Specifically, if you use asyncOpen2, you get the new behavior, where necko will do security checks for you
[2017-01-05 20:59:08] <bz> It will base these security checks on the nsILoadInfo
[2017-01-05 20:59:20] <bz> Which has information like the "triggering principal" (aka "who is doing the load")
[2017-01-05 20:59:35] <bz> And on the channel URL (aka "who is being loaded")
[2017-01-05 20:59:58] <bz> Now it turns out that different callsites in Gecko want slightly different security policies applied.
[2017-01-05 21:00:04] <bz> That's what the SEC_* flags are about

My feeling is that as more loads are moved onto asyncOpen2() we will get more failures when using a null principal.

Just keep in mind: 37 call sites were already using a system principal, nine were using a null principal, and I had to change one of those in Part 3 to make a bunch of tests (n-1 of the ones that failed) work again.

However, there was some confusion about asyncOpen2 already being used:
[2017-01-05 20:27:22] <bz> A simple example
[2017-01-05 20:27:26] <bz> nsAbContentHandler::HandleContent
[2017-01-05 20:27:31] <bz> This is using NS_NewStreamLoader
[2017-01-05 20:27:47] <bz> which has been using asyncOpen2 all along for a while now
[2017-01-05 20:27:59] <bz> And hence presumably has been calling ValidateSecurityFlags
[2017-01-05 20:28:04] <bz> but was passing SEC_NORMAL
[2017-01-05 20:28:08] <bz> So why was this not being a problem?
[2017-01-05 20:29:11] * bz is quite confused by that, actually
2017-01-05 20:32:33] <bz> Is it possible that we simply don't exercise nsAbContentHandler::HandleContent in a debug build in our tests?
[2017-01-05 20:32:51] <jorgk> bz: yes, that's one on the ones which uses a null principal with SEC_NORMAL, so I'm in search of  a replacement.
[2017-01-05 20:33:07] <jorgk> bz: not everything is covered by tests.
[2017-01-05 20:33:15] <bz> alright
[2017-01-05 20:33:23] <bz> (and no manual testing in debug builds either, eh?)

So maybe I should make sure I exercise nsAbContentHandler::HandleContent in a debug build and see that it works so we exercise at least one call site with a null principle manually.

Comment 38

5 months ago
(In reply to Magnus Melin from comment #36)
> Null principal has only minimal privileges, system principal can do
> everything.

Right, that's http://searchfox.org/mozilla-central/source/caps/nsIPrincipal.idl#84. My concern was how it related to the interpretation of the content policy flags.

(In reply to Jorg K (GMT+1) from comment #37)
> That's a very good question, please refer to my comment #22 where back in
> 2014 they already didn't know.
> 
> As I said, Boris chatted to me for a long time on IRC ...

Thanks, that's very helpful.

> [2017-01-05 20:37:01] <bz> jorgk: someone needs to sit down and examine them
> and make some decisions.

Given comment 22, I suspect this is what has kept being postponed (for TB) for years ;)

> My feeling is that as more loads are moved onto asyncOpen2() we will get
> more failures when using a null principal.

The ones which are caught by tests are the ones where we are lucky... the other ones are the real problem.

> However, there was some confusion about asyncOpen2 already being used: [...]

Possibly more examples of c-c usefully expanding the test coverage of m-c ;)
(Assignee)

Comment 39

5 months ago
(In reply to Jorg K (GMT+1) from comment #37)
> So maybe I should make sure I exercise nsAbContentHandler::HandleContent in
> a debug build and see that it works so we exercise at least one call site
> with a null principle manually.

Before getting side-tracked to another bug, I was looking into this one here.

In the debugger I saw that FireURLRequest() now calls AsyncOpen2() when before it called AsyncOpen(). The change is here:
https://hg.mozilla.org/mozilla-central/rev/a5bac3694b6f#l3.12

So Part 3 was really needed. Part 1 and 2 can be justified by wanting to replace the deprecated SEC_NORMAL that would always trigger a MOZ_ASSERT() here:
https://dxr.mozilla.org/mozilla-central/rev/0d823cf54df53e0cea75a74adebace956bd333d8/dom/security/nsContentSecurityManager.cpp#26
if calls go down this path, and more and more will in the future.

It is not clear whether any of the changes in Part 1 and 2 actually had any immediate benefit, but I think this clean-up is good to have. There were crashes on that MOZ_ASSERT(), but I don't know whether they would have been fixed by Part 3 alone.

Coming back to Boris' doubts about nsAbContentHandler::HandleContent(). Sadly, as much as I tried, I couldn't get that to run. So after playing around with address book cards (and finding another bug), I still don't know how to trigger that function.

Comment 40

5 months ago
Re Part 3: Looks like fireURLRequest is only used to download attachments, http://searchfox.org/comm-central/source/mailnews/compose/src/nsMsgAttachmentHandler.cpp#790
(Assignee)

Updated

5 months ago
Depends on: 1329494
(Assignee)

Comment 41

5 months ago
Boris, thanks for your comment.

(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #34)
> > We'll use this bug for the Xpcshell failure then
> It's worth logging what calls to migrateAddress over at
> http://searchfox.org/comm-central/source/mailnews/base/util/mailnewsMigrator.
> js#133 happen during this test in the passing and failing cases.  At least
> it's a place to start.
I've had a look at the problem. We use the permissions manager to store where remote images should displayed. The user can have a whitelist of URL from which to accept images.

Sadly we also have a feature where remote images can be whitelisted if the e-mail comes from a certain sender. Since e-mail addresses don't have a proper origin, had to change our approach in bug 1193200, we're now misusing the system to store e-mail addresses as chrome URIs. Take a look at the failing test mailnews/base/test/unit/test_accountMigration.js (or take my word for it). You'll see:
  let uriDisallowed = Services.io.newURI(
    "chrome://messenger/content/?email=no@test.invalid", null, null);
BTW, I was in favour of dropping to while-listing support for e-mail addresses (see long discussion in bug 1193200), but Magnus decided to hack the system with those chrome URIs.

This has coming back to bite us, since e-mail whitelisting is broken.

If I list my friend henk@henk.com.au in TB 52, I see chrome://messenger/content/messenger.xul in the "Execeptions - Remote Content" now.

We'll discuss it further in bug 1329494. I'm not sure how the this hangs together with the AsyncOpen2() transition.

Comment 42

4 months ago
(In reply to Jorg K (GMT+1) from comment #15)
> This needs fixing in suite/ since SEC_NORMAL is used there, too
Thanks for the heads up. Working on it.

Comment 43

4 months ago
I don't think we should "Use system principal everywhere" without first understanding where we are relying on necko security to check loads, rather than doing it ourselves which is what system principle implies. I certainly don't have that understanding at the moment.

It would be good to come up with a few cases where we don't want a URL to load, and try to understand how we are accomplishing that. I can think of a few examples:

1) Javascript in emails should not load.

2) Any links clicked from an email (or iframes) should not inherit system principle from that email and thus be able to bypass further security checks.

3) mailnews protocols in html on a content tab should not be accessible for opening by javascript run by that content tab.

I'm sure there are many others we could think of. I do not have a direct understanding of these security issues, so I do not consider myself the expert. Yes I approved the patch on Bug 1099587 but that was another case of, as jcranmer put it, "It would be really nice to not have such massive permabustage" so perhaps I gave into the pressure to stop our bustage.

Along the same lines, I'm not too comfortable with approaches like comment 20 "I'm taking the liberty to put r=bz here since I've discussed this with Boris on IRC." unless Boris actually looked at the patch and approved the change on IRC, which I suspect is not what happened. I realize you are trying to keep things moving, and I am terribly slow and everyone else is really busy, but I am not convinced the cutting corners on the review process where security is concerned is the right approach here.

So Jorg could you please look at the issues I suggested above, and help us understand how we are currently expecting the security to work.

Yes we remain mostly busted. Yet I have stated many times that our current method of holding all landings hostage to the latest m-c breakage is not the right approach. This is going to become particularly acute as we try to land critical patches for TB 52, only to be thwarted by post-52 bustages in m-c. We have this problem each major cycle, and the ad-hoc ways we try to deal with it are quite risky.
For the record, here is my take on the change to nsURLFetcher::FireURLRequest.

Before the asyncOpen2 changes, that codepath did not do any security checks at all, because all of those lived outside the URILoader (content policy checks in docshell, and CheckLoadURI checks in the docshell's callers).  Now with asyncOpen2 both checks are done under the asyncOpen2 call, and hence the OpenURI call that code makes.  With a nullprincipal, the CheckLoadURI check _will_ fail for a bunch of the cases nsURLFetcher::FireURLRequest is used for, as far as I can tell.

Using a system principal there will bypass the CheckLoadURI and content policy checks in asyncOpen2, thus restoring the old behavior of this code in terms of those checks.  Using one of the SEC_ALLOW_CROSS_ORIGIN_* options for the loadinfo flags will also bypass the "enforce same-origin" checks that asyncOpen2 does.  Again, this corresponds to the old behavior.  Whether that behavior was correct in all cases or not, I cannot say given the limited study I have put into this, but it _is_ the old behavior.

Now there is the worry about what principal will get inherited.  In practice this does not matter if the security policy used is SEC_ALLOW_CROSS_ORIGIN_DATA_IS_NULL, because inheritance only happens in the following cases:

1) SEC_FORCE_INHERIT_PRINCIPAL is included in the security policy.
2) SEC_ABOUT_BLANK_INHERITS is included in the security policy and about:blank is being loaded.
3) One of the *_DATA_INHERITS options is used in the security policy
   and data: or javascript: is being loaded.

Since SEC_ALLOW_CROSS_ORIGIN_DATA_IS_NULL doesn't include any of those flags, you won't be getting any inheritance happening anyway.

I'm happy to help out with deciding what to do in other cases if people describe to me what the resulting channels are used for.  It might make sense to spin up multiple bugs for the various cases or groups of cases so that the discussions don't get too confusing.  In particular, I might be willing to just try to analyze some cases myself, but I'm not going to do it for all of the cases in that big patch...
(Assignee)

Comment 45

4 months ago
(In reply to Kent James (:rkent) from comment #43)
> Along the same lines, I'm not too comfortable with approaches like comment
> 20 "I'm taking the liberty to put r=bz here since I've discussed this with
> Boris on IRC." unless Boris actually looked at the patch and approved the
> change on IRC, which I suspect is not what happened.
The change from null principal to system principal was discussed with Boris, and the code to achieve this was trivial, in fact it was copied from nsMsgQuote::QuoteMessage() where the system principal was already used. So I think this patch is well and truly covered by the "four eye principle", which is also confirmed by Boris in the preceding comment.

I'm going to close this bug since I think we're done here.

The remaining failing tests are covered by bug 1329494 (Xpcshell test) and bug 1329288 (Mozmill test).
Status: ASSIGNED → RESOLVED
Last Resolved: 4 months ago
Flags: needinfo?(rkent)
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 53.0
(Assignee)

Comment 46

4 months ago
Comment on attachment 8824244 [details] [diff] [review]
NOT LANDED: Part 4: Use system principal everywhere

The discussion about the switch from null principle to system principle will be continued in bug 1329920.

So Part 4 did not land here. Parts 1, 2 and 3 were landed as follows (see comment #21):
https://hg.mozilla.org/comm-central/rev/c9742e200e3f10a385f0fd06ac2acf256543d438
https://hg.mozilla.org/comm-central/rev/efaa986897703cad3db5558fd266ba769be08541
https://hg.mozilla.org/comm-central/rev/1d934dcb80bcf60a56355e5b981de7bf3b25a902
Attachment #8824244 - Attachment description: Part 4: Use system principal everywhere → NOT LANDED: Part 4: Use system principal everywhere
Attachment #8824244 - Flags: review?(rkent)
Attachment #8824244 - Flags: review?(mkmelin+mozilla)
(Assignee)

Updated

4 months ago
No longer depends on: 1329494
You need to log in before you can comment on or make changes to this bug.