Open Bug 1220687 Opened 6 years ago Updated 4 months ago
Implement HTML5's concept of 'HTTPS state' for Window objects
The Secure Contexts spec depends on being able to get the 'HTTPS state' of 'relevant settings object's. Currently we don't store or otherwise provide a means to obtain that state. I think for now I will just aim to implement this for Window. Support for Workers will need to be added later.
To help reviewers, here's a summary of the relevant sections of the relevant specs with summaries of the behavior they proscribe: Overall summary: the 'fetch' spec defines when request's responses have the HTTPS state set, the browser.html and workers.html pages of the HTML5 spec define when the HTTPS state for Window and Worker is set from such a response, and the other pages of the HTML5 spec linked below define how/when the HTTPS state is set at other times. https://fetch.spec.whatwg.org/ https://fetch.spec.whatwg.org/#basic-fetch defines when HTTPS state is set for requests based on various schemes. For about:blank, blob:, data: and, at our discretion, file:, set HTTPS state to request's client's HTTPS state if request's client is non-null For http: / https: it links to 'HTTP fetch' section which links to the 'HTTP-network-or-cache fetch' section, which links to the 'HTTP-network fetch' section, which sets HTTPS state. "If response was retrieved over HTTPS, set its HTTPS state" Propagate the entry settings object's HTTPS state onto Response objects created by the ctor. https://html.spec.whatwg.org/multipage/browsers.html "When a browsing context is navigated to a new resource...[and] a Document is created...Create a new Window object...Set the Window object's HTTPS state to the HTTPS state of the resource used to generate the document" https://html.spec.whatwg.org/multipage/workers.html Summary: when a worker for a script with URL url is run, set worker global scope's HTTPS state to response's HTTPS state. "When a user agent is to run a worker for a script with URL url, an environment settings object settings object, and a URL referrer it must run the following steps:...9. Set worker global scope's HTTPS state to response's HTTPS state." https://html.spec.whatwg.org/multipage/webappapis.html Summary: Two arg document.open sets HTTPS state to the HTTPS state of the Window object of the responsible document specified by the entry settings object. "An environment settings object specifies algorithms for obtaining the following:...HTTPS state" "Create an environment settings object whose algorithms are defined as follows:...The HTTPS state - Return the HTTPS state of the Window object." "When called with two arguments (or fewer) [ type [, replace ] ], the document.open() method must act as follows:...16. Set the new Window object's HTTPS state to the HTTPS state of the Window object of the responsible document specified by the entry settings object." https://html.spec.whatwg.org/multipage/dom.html Summary: propagate HTTPS state to new Window on document reload when the document's content came purely from document.open/.write "When the user agent is to perform an overridden reload, given a source browsing context, it must act as follows:...3. Let HTTPS state be the HTTPS state of the browsing context's active document's Window. 4. Navigate the browsing context to a new response whose body is source and HTTPS state is HTTPS state, with replacement enabled and exceptions enabled." https://html.spec.whatwg.org/multipage/embedded-content.html Summary: <iframe srcdoc>'s subdoc's Window gets the same HTTPS state as HTMLIFrameElement::GetOwnerDoc's HTTPS state. "When the user agent is to process the iframe attributes, it must run the first appropriate steps from the following list: - If the srcdoc attribute is specified - Navigate the element's child browsing context to a new response whose url list consists of about:srcdoc, header list consists of `Content-Type`/`text/html`, body is the value of the attribute, and HTTPS state is the HTTPS state of the iframe element's node document's Window."
Note that the spec is slightly busted here because the "request client" is not the thing you want to inherit HTTPS state from. See https://github.com/w3c/webappsec-secure-contexts/issues/5#issuecomment-150242175 for one example I know of, but there may be others too; I didn't do a careful audit of all fetch invocations.
Given that, I'm wondering whether it's even possible to sanely propagate HTTPS state for document.open. Say I have three documents A, B and C in three different browsing contexts. Code in A calls a function in B to document.open C and document.write some string. Let's say A is delivered over HTTP and B is delivered over HTTPS. If the string is passed from A to B, the content written into C wasn't secured. If the function in B uses a string that also comes from B then it is. Actually you could argue that B is compromised anyway since it gave access to itself to an HTTP delivered page. Regardless, it seems to me that basing the HTTPS state on any given settings objects is broken. It seems that we'd really want all settings objects all the way back to the entry settings object to have a "modern" HTTPS state. At that point there doesn't seem much point in the HTTPS state concept though, since we're getting closer to the definition of secure context.
> Let's say A is delivered over HTTP and B is delivered over HTTPS. How did A manage to call a function in B? Now of course A could postMessage a string to B to document.write... but we're really not trying to add data tainting to the platform here. > It seems that we'd really want all settings objects all the way back to the entry > settings object to have a "modern" HTTPS state. That would not help with the postMessage case, right? Especially if B stores the string in a variable and then on click writes it out or something. We really don't want to add data tainting. ;)
(In reply to Boris Zbarsky [:bz] from comment #4) > How did A manage to call a function in B? CORS can be used to allow it, no? And CORS doesn't affect HTTPS state as far as I can tell.
How could cors be used to do that?
From previous conversations with Mike about Netflix's subversion of Chrome's Web Crypto implementation I had mistakenly come away with the impression that it was CORS that enabled it. It seems that I was wrong about that and CORS does not provide functionality for one script to directly access another that is in a different, cross-origin window. Looking at: https://developer.mozilla.org/en-US/docs/Web/HTTP/Access_control_CORS it seems that it only relates to network requests, and only allows requests for: * Invocations of the XMLHttpRequest API in a cross-site manner * Web Fonts * WebGL textures. * Images drawn to a canvas using drawImage. It's kinda hard to figure that out from the specs without specifically examining the CORS behavior they detail at length or having more familiarity with them than I have. I hadn't picked up on that during my recent forays into this world, but I'm glad I have now!
> and only allows requests for: That list is missing various stuff like videos drawn to canvas, stylesheets (for CSSOM access), and scripts (for unmuted exceptions).
nsIChannel already has 'securityInfo' to indicate whether the contents of the channel were delivered via TLS. As it happens, we're already propagating the securityInfo onto the nsWyciwygChannel created when document.open() rewrites a document. (See bug 1221103.) We do not currently propagate the securityInfo object to: * nsDataChannel when a data: URI is used to create a sub-document * nsInputStreamChannel when iframe's 'srcdoc' attribute is used to create a sub-document On the surface of it it seems sane to do so, but it's not clear to me whether that might have any adverse affects. Do you have any insight into that, Boris?
No, because the whole issue is that this needs to propagate independently of the principal/origin. So for example <iframe srcdoc sandbox> doesn't inherit the principal (it gets a nullprincipal) but needs to inherit the HTTPS flag. Otherwise yes, we'd just hang it off the origin and be done with it.
Daniel, note also that this bug is about 'HTTPS state' rather than 'secure context's (the former roughly being "the content of this window was delivered securely", the latter roughly being "the content of this window and all of the windows that can communicate with it were delivered securely").
You're right, I was mixing the two concepts. securityInfo sounds fine for "HTTPS state".
Putting this up to enable discussion regarding the approach. This is really rough. Ignoring document.open for now, this seems to be the place that we create the new channel for all the other cases, and therefore seems like the preferred place to propagate HTTPS state for those cases. As noted in one of the source comments, it looks like propagating securityInfo is actually not going to work since in some cases (for example, a user loading a data: URI directly) we won't have that info. So it looks like we need a separate flag on nsIChannel. Not sure if that needs to be readonly though to protect it from change from some consumers. I also don't know yet how it would be best to get the document that the 'owner' nsIPrincipal came from in order to get the channel we want to propagate HTTPS state from. We don't seem to have access to that document here. Stack when loading URI is these cases: nsDocShell::DoURILoad nsDocShell::InternalLoad nsDocShell::LoadURI Stack when *re*loading URI in these cases: nsDocShell::DoURILoad nsDocShell::InternalLoad nsDocShell::LoadHistoryEntry nsDocShell::Reload
Whether it's security info or new state, it seems like we need to appropriately propagate HTTPS state info onto channels created from non-direct-network requests. Here's an example for data: URIs. I'm working on doing this for other scenarios like blob: etc. but it would be great to get some feedback is case you don't like this, bz.
Comment on attachment 8727901 [details] [diff] [review] patch - propagate security info onto data: URI document's channels Err, that's totally the wrong patch.
Comment on attachment 8727910 [details] [diff] [review] patch - propagate security info onto data: URI document's channels It would be really nice to have this work more automatically with our existing setup.... Would this work? 1) Store the security info (or whatever boolean flag we need from it, more likely) in the principal. This could be set up by GetChannelURIPrincipal for channels that have actual security info. 2) When inheriting a principal, there is nothing else to be done. I believe in all cases when inheriting a principal we want to inherit the secure context state too. Is that correct? If that's not the case, then obviously storing this in the principal is not a good idea. 3) When GetChannelResultPrincipal would create a nullprincipal (for the sandboxing case), we would copy over the secure context flag. Probably in exactly the situations in which we'd inherit if it were not for the fact that we're doing sandboxing... Would that do the right thing? If so, it would have the benefit of not needing to play channel impl whack-a-mole.
(In reply to Boris Zbarsky [:bz] from comment #19) > Would this work? > > 1) Store the security info (or whatever boolean flag we need from it, more > likely) in the principal. This could be set up by GetChannelURIPrincipal > for channels that have actual security info. I hadn't considered this since it seemed like an abuse of nsIPrincipal, and since it really seemed like channels for non-network-request documents should really be getting a security info propagated onto them when appropriate anyway. However, if you're happy enough to do it that way then it seems likely to be easier to implement and more robust. I need to take a closer look at the details of how principals are created and propagated before I can be sure. > 2) When inheriting a principal, there is nothing else to be done. I > believe in all cases when inheriting a principal we want to inherit the > secure context state too. Is that correct? I believe that's correct, yes. > 3) When GetChannelResultPrincipal would create a nullprincipal (for the > sandboxing case), we would copy over the secure context flag. Yup. > Would that do the right thing? If so, it would have the benefit of not > needing to play channel impl whack-a-mole. I'm about two thirds of the way through patching the various scenarios we'd need security-info-on-channel propagation (should have put up this patch for feedback earlier), but your idea would probably involve a lot less code. I'll work on that and see how it compares. Thanks!
I thought the origin attributes might have been the right place too, when I was making comment 19, but I decided it's not and hence didn't say to put it there. The origin attributes are an invariant of the docshell, basically (e.g. they're inherited into sandboxed iframes!), which is not the semantics we want here.
bholley, can I get your input on the following issue with the last patch? As pointed out on IRC by bz, the nsPrincipal::Read/Write changes are problematic because they change the serialization, and that will break session restore across different versions of gecko. Storing httpsState in the attributes would prevent that from being the case, but we only want to inherit https state in nsScriptSecurityManager::GetChannelURIPrincipal, and in nsScriptSecurityManager::GetChannelResultPrincipal in the cases where we would inherit a principal if sandboxing was ignored (see the patch). I.e. whenever we inherit a principal into a sub-document or where we would inherit it if it wasn't for sandboxing. Any thoughts on storing httpsState in the attributes, or on other solutions?
In terms of inheriting the attributes, we have various routines that determine when an OriginAttribute is and isn't inherited: https://hg.mozilla.org/mozilla-central/file/4657041c6f77/caps/BasePrincipal.h#l84 It is true that OriginAttributes handle the serialization case, but they have a conceptual meaning that I don't want to undermine here. In particular, each attribute is intended to be an orthogonal dimension in origin-space. That is to say, all the origin attributes must match in order for two origins to be considered equal. In general, is that the case for https state? Do we expect separate buckets of storage, cookies, permissions, etc between principals with different https state? If we don't, that's a good indicator that this is not the right abstraction. Probably good to get sicking's opinion too.
I agree with Bholley that we don't want to undermine the semantic meaning of OriginAttributes. We shouldn't use that as a storage mechanism for other things. My main concern is OriginAttributes becoming something that it was not designed to do. It's already something that Gecko developers are having a hard time wrapping their heads around since it's an entirely new concept. Currently principals are quite often not just thought of as an URI, but also implemented as a URI (see the large number of places where we're calling createCodebasePrincipal). I think making OriginAttributes not just something that is a orthogonal to the URL makes it harder to understand. And also risks additional members getting piled on. More concretely though, one problem that you'll have is that all existing data stored by a https website will have to be migrated. Data like cookies, localStroage, indexedDB and Cache API all store their data key'ed on both scheme+domain+port *and* on OriginAttributes. Right now the OriginAttributes part is generally empty. If you make all https websites store its data with OriginAttributes.mIsSecureContext=true that means that all existing data has to be migrated, or it will appear deleted to the user. The second problem is one of making sure that the properties are up-to-date. I.e. we'd have to make sure that everywhere where we create a PrincipalOriginAttribute we know what the URL of the page is, and make sure that we use the right OriginAttributes.mIsSecureContext value. This will definitely be a non-trivial task.
Bobby, Jonas, thanks for your input. On digging into the serialization issue further I don't think we would break session restore so long as we *append* to the stream written by nsPrincipal::Write. It seems that the serialization of the principal ends up being stored, by itself, in the variable 'owner_b64' in sessionstore.js. Since only the principal is stored in that variable we shouldn't break anything by tacking on an extra value in nsPrincipal::Write. I'd also note that bug 911547 - which tacked on a CPS serialization to nsPrincipal - didn't seem to result in any bugs being reported against it.
(In reply to Jonathan Watt [:jwatt] from comment #27) > Bobby, Jonas, thanks for your input. > > On digging into the serialization issue further I don't think we would break > session restore so long as we *append* to the stream written by > nsPrincipal::Write. It seems that the serialization of the principal ends up > being stored, by itself, in the variable 'owner_b64' in sessionstore.js. > Since only the principal is stored in that variable we shouldn't break > anything by tacking on an extra value in nsPrincipal::Write. > > I'd also note that bug 911547 - which tacked on a CPS serialization to > nsPrincipal - didn't seem to result in any bugs being reported against it. I am kinda skeptical here - generally you need to rev the CID when you change the serialization format, since otherwise consumers that serialize the principal along with other things in a stream will get confused. Is your argument that nobody else (either in-tree or in addons) serializes nsIPrincipal?
As long as we make the Read function able to read the old serialized format, is there a reason to rev the CID?
(In reply to Jonas Sicking (:sicking) from comment #29) > As long as we make the Read function able to read the old serialized format, How do we do that, exactly? > is there a reason to rev the CID? New code reading old data, and old code reading new data. We discussed the former, but the latter is also a concern.
I don't think we care about old code reading new data. Downgrading breaks in a number of cases and the best we can hold for is that loading the sessionrestore data from disk will fail and we'll not do session restore. Similarly IndexedDB, cookies, localStorage etc will fail to load when downgrading on a pretty regular basis. Regarding how to write the Read function, I don't have concrete ideas. I guess one way would be to actually rev the CID, but also write a deserializer for the old CID.
(In reply to Jonas Sicking (:sicking) from comment #31) > I don't think we care about old code reading new data. Downgrading breaks in > a number of cases and the best we can hold for is that loading the > sessionrestore data from disk will fail and we'll not do session restore. Sure, but it's not clear to me that we're actually guaranteed to fail here. We are if the only callers of nsIPrincipal::Read are top-level callers with a standalone buffer. But if anybody's reading a recursively-serialized data structure out of some larger stream, the extra bits will throw everything off and cause UB. Hence my question in comment 28. > Regarding how to write the Read function, I don't have concrete ideas. If we're only considering the standalone case describe above _and_ the serializer machinery will reliably throw when reading more bits than are in the stream, I think we can make it work. Otherwise probably not. > I guess one way would be to actually rev the CID, but also write a > deserializer for the old CID. The machinery isn't really set up for this, but I guess we could do it with a custom factory function for the old CID which generates a new nsPrincipal with a bit set saying "use the old format"... But do we really care? We've revved the CID of nsPrincipal several times this year. We were doing it for every new OriginAttribute until we decided in bug 1163254#c40 that it wasn't necessary.
I don't know if we care.
(In reply to Bobby Holley (busy) from comment #28) > Is your argument that nobody else (either in-tree or in addons) serializes > nsIPrincipal? There's stuff like IPC that serializes nsIPrincipal, but after looking for in-tree instances of serialization of nsIPrincipal into a larger stream the only instance that I can find is nsXULPrototypeDocument::Read. I'm not sure how significant that is. I also took a look at addons that contain the string "nodePrincipal" (500 or so mxr hits) and the string "read(". Determining the later part was a manual process so I only went through about 20% of the "nodePrincipal" hits, but I didn't find any addons in that set doing serialization of the principal. That's not to say there aren't any, but it seems like if it happens it's probably rare. There being no apparent fallout from when we changed the serialization of nsPrincipal last time in https://hg.mozilla.org/releases/mozilla-aurora/rev/947bebfbe778 would also suggest that's the case.
(In reply to Bobby Holley (busy) from comment #30) > (In reply to Jonas Sicking (:sicking) from comment #29) > > As long as we make the Read function able to read the old serialized format, > > How do we do that, exactly? As long as the principal isn't part of a larger stream it's actually quite easy. See the nsPrincipal::Read code in this patch. That should handle both old and new serializations. The issue of larger streams is still an issue though.
Attachment #8728229 - Attachment is obsolete: true
If we land this, I'd also propose appending another 8-bit value for serialization of a versioning number to allow us to futureproof nsPrincipal against any further changes.
Note that we write principals during structured cloning, but presumably that's not going to persistent storage.
I think this is close to being the final patch _if_ this approach is acceptable. bholley, can you review the principal changes?
Can you answer the question of whether/why we really care now in a way that we didn't before? I'm not really excited about adding all this goop, especially when we've just bumped the CID plenty of times before.
Maybe I misread the above. I thought it sounded like you (and bz on IRC) were concerned that adding to the serialization of principals would break things. I was trying to cater to that concern by making sure that upgrading/downgrading works at least for session restore and other consumers that de-serialize principals that are not part of a greater serialization. If you and bz both agree that we can just append to principle serialization without worrying about that I'd defer to your judgement on that. I'm also also happy to rev the CID, but you mentioned at the end of comment 32 that that wasn't necessary.
(In reply to Jonathan Watt [:jwatt] from comment #40) > Maybe I misread the above. I thought it sounded like you (and bz on IRC) > were concerned that adding to the serialization of principals would break > things. It is a concern, it's just not clear to me that we care enough to try as hard as this patch is trying to avoid it, especially when we haven't in the past. bz? I was trying to cater to that concern by making sure that > upgrading/downgrading works at least for session restore and other consumers > that de-serialize principals that are not part of a greater serialization. > If you and bz both agree that we can just append to principle serialization > without worrying about that I'd defer to your judgement on that. I'm also > also happy to rev the CID That is the standard way to do this. It will cause deserialization to reliably fail. > , but you mentioned at the end of comment 32 that > that wasn't necessary. Not necessary in the case of adding new OriginAttributes (since OriginAttributes handles its own deserialization in a backwards-compatible way), but definitely necessary when writing new items to the object stream.
NI dcamp per discussion in the hallway just now.
(In reply to Bobby Holley (busy) from comment #42) > NI dcamp per discussion in the hallway just now. I talked about this for a while with bz, we may go about this a different way that doesn't require changing the serialization of nsPrincipal. Still interested in hearing how much fx-team cares about this, but dropping the NI for now.
FWIW as far as I can tell it's only top-level documents that are about:blank and data: URIs that go through the nsIPrincipal deserialization code paths. So as far as Session Restore is concerned I think breakage would be minimal if we went the route of changing the CID.
Which hints about the answer to the question on IRC as to why "sessionstore needs to do binary-principal serialization?" If a script opens a popup where the resulting document should inherit the opener's nsIPrincipal (a data: URI, say), the nsIPrincipal that the popup's document gets would be lost if session restore was to restore that popup and its document without regard to the fact it should inherit its principal. Maybe we could provide Session Restore with better API here so that it can somehow say "these documents in the session history of window X have these other documents in window Y as their creator documents", then docshell could do the right things, inheriting principals, setting window.opener (which currently gets broken across session restores), etc. Except that the opener documents may not exist at the time a session is closed and therefore won't be restored.
bholley: regarding your comment "my beef with the patch in the bug is that it's a bunch of hacky code that's specific to making version X upgrade to version X+1, which isn't something I want to see in security-critical code": If it's valid for consumers to persist nsISerializable serialized content then it seems buggy to me if implementations can't reliably be deserialized from a larger stream. I guess for implementations like principals that have no structure in their serialization that would imply adding 8 bits to the serialization to version it, and probably supporting versions back to the last supported ESR, code perhaps looking something like: ...set up variables with default values... ...get stuff from the stream that's been there since before versioning... uint8_t streamVersion; aStream->Read8(&streamVersion); if (streamVersion < 2) goto done_parsing; ...get v2 stuff... if (streamVersion < 3) goto done_parsing; ...etc... done_parsing: ...do stuff with the variables... If you're saying that you don't want to do that and some nsISerializables may only be able to deserialize serializations made by the current implementation then it seems like we should add a mayBePersistedAcrossVersions() method to nsISerializable, and implementations that return false should be required to rev their CID whenever the serialization changes.
> it's just not clear to me that we care enough to try as hard as this patch is trying > to avoid it, especially when we haven't in the past. bz? For the record, I think we should be caring more than we have in the past... But maybe that's just because I find session restore failing extremely annoying. Anyway, it sounds like we're not going to put this in nsIPrincipal after all...
Going back to the approach of passing HTTPS state onto nsIChannel objects and then using that to decide the HTTPS state of data: documents etc. seem to consist of at least four parts. 1. Change various API (both XPIDL and internal C++) to allow for requests' clients to be passed through to the code that creates nsIChannel's for documents that end up in a Window, or are used in a Worker. 2. Identify the callers in the relevant call trees that need to pass through the request's client and make them do so. 3. Provide logic for nsIChannel sub-classes to know whether they should inherit HTTPS state from the client. 4. Provide API to expose HTTPS state from nsIChannel.
For comment 48 step 4 (provide API to expose HTTPS state from nsIChannel) I don't have a complete patch just yet, but I feel like I should attach this code to give an idea of just how much code is involved in this otherwise simple addition.
If we were to propagate the securityInfo object instead of adding a new HTTPS state flag then none of this would be necessary. I'm not sure what people would think of that though, and in some cases we don't have a securityInfo object to propagate (it's an nsISupports though, so we could add a new nsIHttpsState interface and pass and object of that type in those cases).
In spec terms HTTPS state is a property of a channel, and the HTTPS state of a document is the state of the channel used to deliver the document's data. My current intent is therefore to provide API to get the HTTPS state of an nsIChannel, and provide similar API on nsIDocument to get the HTTPS state of the document's channel. I hadn't planed to provide API for nsPIDOMWindowInner. I intend to test all the things you mention indirectly via Window.isSecureContext (bug 1162772), but maybe adding a [ChromeOnly] method to Document would be useful for direct testing too? I think that would probably pretty much just duplicate the isSecureContext tests though which will already require the HTTPS state propagation to be correct.
Attachment #8683108 - Attachment is obsolete: true
Attachment #8727910 - Attachment is obsolete: true
Attachment #8733664 - Flags: feedback?(bzbarsky) → feedback+
This is still incomplete, mainly because it needs CacheEntry and all the code under that (SQL backed storage and all the other types) to be able to store HTTPS state. I started work on that in a separate patch, but that looks like maybe another 70 KB or so of code. Combined with the unwritten code to propagate request "client" to channels we're probably looking more than 200 KB, probably much more. Given that, I've gone with what I hope is an acceptable hack over in the part 2 patch on bug 1162772. Let's see how that goes.
Attachment #8733666 - Attachment is obsolete: true
De-assigning from myself since it seems unlikely I'll be working on this any time soon.
Assignee: jwatt → nobody
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.