Open Bug 1507773 Opened 2 years ago Updated 2 years ago

Downloads db/list should store triggering principals instead of using system principal for everything

Categories

(Toolkit :: Downloads API, enhancement, P3)

enhancement

Tracking

()

Tracking Status
firefox65 --- affected

People

(Reporter: Gijs, Unassigned)

References

Details

Attachments

(1 file)

We have infrastructure for serializing and deserializing principals, and the downloads list should use that (like session restore).
One thing to be aware of is that there is a difference between a serialization format designed to work between processes that run the same code version, and a robust serialization format that takes into account backwards and forwards compatibility. I wasn't aware that session restore used principal serialization, and that leads me to think that we're in the latter case, but we should move with caution and ensure that anything we do here gets a thorough security review.

We need an analysis of which information is carried along with the principal. For example, maybe there is information about the container in which a download was started? As I understand it, container identifiers are integers and their definitions might be reconfigured arbitrarily.

If we serialize principals, we should also have regression tests similar to those we already have for the JSON file format, where we have a serialized principal as an immutable string, and we ensure that restoring and resuming such a download in the current version has the right security properties.
Priority: -- → P3
(In reply to :Paolo Amadini from comment #1)
> One thing to be aware of is that there is a difference between a
> serialization format designed to work between processes that run the same
> code version, and a robust serialization format that takes into account
> backwards and forwards compatibility. I wasn't aware that session restore
> used principal serialization, and that leads me to think that we're in the
> latter case,

Deserialization will fail if the principal cannot be constructed in the current version. This can bite you if you don't have a (good) fallback, cf. bug 1500114. That's fine, we can deal with that.

> but we should move with caution and ensure that anything we do
> here gets a thorough security review.

It doesn't need a security review - it's currently just using system principal, so literally anything we do here is an improvement from a security perspective.

> We need an analysis of which information is carried along with the
> principal.

This is public knowledge, we don't need an 'analysis'. The short answer is "everything that's necessary" is in the principal.

> For example, maybe there is information about the container in
> which a download was started?

yes, the usercontextid is part of the principal.

>  As I understand it, container identifiers are
> integers and their definitions might be reconfigured arbitrarily.

Yes, but I don't see why this would be a problem. You download things in container X, we keep downloading them in container X.

If it *is* a problem, the container implementation provides events that we could use to scrub downloads if a container is removed (which is the only way in which such a "reconfiguration" could happen).
(In reply to :Gijs (he/him) from comment #2)
> Deserialization will fail if the principal cannot be constructed in the
> current version. This can bite you if you don't have a (good) fallback, cf.
> bug 1500114. That's fine, we can deal with that.
> 
> It doesn't need a security review - it's currently just using system
> principal, so literally anything we do here is an improvement from a
> security perspective.

Given the failure mode you described above for which principals from previous versions can't be deserialized at all, then this may be the case, otherwise this isn't necessarily automatic, so we can't just dismiss it as not needing extra scrutiny. For example, if the order of some fields was swapped, and we ended up deserializing information in the wrong field, then after a Firefox update we might, just as a theoretical example, send cookies as if we were a different container or origin.

> >  As I understand it, container identifiers are
> > integers and their definitions might be reconfigured arbitrarily.
> 
> Yes, but I don't see why this would be a problem. You download things in
> container X, we keep downloading them in container X.
> 
> If it *is* a problem, the container implementation provides events that we
> could use to scrub downloads if a container is removed (which is the only
> way in which such a "reconfiguration" could happen).

Yes, this is exactly what I mean by doing an analysis of the principal's content, and not treating is as a black box. This is the kind of question we should ask ourselves. Maybe someone implements an add-on that creates ephemeral container identifiers, and suddenly resuming a download might happen in the wrong container if we don't clean up.

This is also why I'd feel more comfortable if anything we end up doing here gets an additional design review from someone on the platform security team, who may be more familiar with the expectations of principal serialization, and may also need to know for which cases the front-end team is using this feature, so future extensions are designed accordingly.
(In reply to :Paolo Amadini from comment #3)
> (In reply to :Gijs (he/him) from comment #2)
> > Deserialization will fail if the principal cannot be constructed in the
> > current version. This can bite you if you don't have a (good) fallback, cf.
> > bug 1500114. That's fine, we can deal with that.
> > 
> > It doesn't need a security review - it's currently just using system
> > principal, so literally anything we do here is an improvement from a
> > security perspective.
> 
> Given the failure mode you described above for which principals from
> previous versions can't be deserialized at all, then this may be the case,
> otherwise this isn't necessarily automatic, so we can't just dismiss it as
> not needing extra scrutiny. For example, if the order of some fields was
> swapped, and we ended up deserializing information in the wrong field, then
> after a Firefox update we might, just as a theoretical example, send cookies
> as if we were a different container or origin.

A different container, yes, theoretically possible if the serialization changed. A different origin, no - there's only one stored origin in the principal (plus maybe document.domain stuff). And so far serialization changes haven't changed the principal, the issue has been about deserializing URIs, and our internal implementations of URIs changing...

Anyway - this already happens today, because we don't keep the triggering principal. We re-download in the wrong container, we bypass otherwise correct security checks, etc. That's why I filed this bug... It's a bit weird when someone says "we should have a fire extinguisher" to say "what about the fires that the fire extinguisher won't put out".

I'm not saying let's not be careful about how to implement this, I'm saying pretty much anything will be an improvement over the status quo.
See Also: → 1505640
Comment on attachment 9026107 [details]
Bug 1507773 - interim solution for removing system principal from download manager

Hey Paolo, do you have any initial feedback to this solution before I carry on down this path futher?

Thanks!
Attachment #9026107 - Flags: review?(paolo.mozmail)
Comment on attachment 9026107 [details]
Bug 1507773 - interim solution for removing system principal from download manager

I think this can be split in two bugs, one where we add the triggering principal to the "source" object so we can better replay the network request when retrying a download in the same session, and one where we serialize the principal across sessions.

For the latter bug, I still have the objections in comment 1. It seems this same problem could be already hurting Session Restore as well, and after looking at the code doing the serialization, this looks opaque and not designed with backwards compatibility in mind, but rather with efficiency. It seems we even include internal Class IDs, which are XPCOM-specific:

https://searchfox.org/mozilla-central/rev/b03a62c3c82316e733a3b09622c1cb7e59f64cc3/xpcom/io/nsBinaryStream.cpp#327-341

We may also serialize a CSP, which I'm not sure we need at all to replay a single request, and it could be a large amount of data to store in the "downloads.json" file, slowing it down:

https://searchfox.org/mozilla-central/rev/b03a62c3c82316e733a3b09622c1cb7e59f64cc3/caps/ContentPrincipal.cpp#622-652

Moreover, if someone touches the serialization code and wants to remove or move something around, this may break deserialization, and in the future it _might_ introduce a security flaw since the format apparently relies on field order. We don't store the field name, and in a theoretical scenario, if an object had a field for "mOriginWeCareAbout" and "mOriginWeDontCareAboutSetByPage", they could be swapped as part of a refactoring. That would easily be missed in code review, since the code lives in a different part of the tree, and the reviewers don't expect the requirements in comment 1 to apply. Also, do we have regression tests for the deserialization of data from past releases at all, that we commit not to break?

As I mentioned earlier on the bug, I'd like some additional scrutiny on the architecture here not only from us in the front-end team, but also from people who've worked on internals for more time than us. Boris, do you have any thoughts on how to approach this, or can you redirect to someone who could help?

I'm not saying we should rule out this solution completely, but I don't understand the long-term implications enough to be confident about it. It's notorious that "interim" solutions may live for decades unless there's a specific architectural plan to address them. On the final bug, I'd like someone from platform to sign off on the solution.

These should have been valid questions before this bug was filed as well, and I've not looked at past bugs, so this may have been discussed before. I may have missed something in my quick reading of the code, so feel free to correct anything I got factually wrong, and adjust the conclusions accordingly.
Flags: needinfo?(bzbarsky)
Attachment #9026107 - Flags: review?(paolo.mozmail) → review-
> and one where we serialize the principal across sessions.

So if we were to split this into two bugs, when we deserialize from the store we would have to force systemprincipal there right?

> We may also serialize a CSP, which I'm not sure we need at all to replay a single request,

Bug 965637 is actively being worked on to remove the CSP from the principal.

> in the future it _might_ introduce a security flaw since the format apparently relies on field order

I think you raise some valid points around improving serialization however what we have currently is using system for this code which is concerning.

:ckerschb I think we should discuss the individual points that Paolo raises at some point soon.
Flags: needinfo?(ckerschb)
Paolo, I think you understand the serialization situation with principals correctly.  It's horrible as a persistent format.  The only reason sessionstore gets away with it, as I recall, is that principals serialized in sessionstore are not really used in all that many cases; they mostly matter for things like data: and javascript:.  And even then, basic loading will work fine; what will fail if the principals fail to deserialize is some cross-window access.

For what it's worth, reviewers of the principal serialization code _should_ keep in mind that changing it can be problematic.  But in practice they may not have better options.

We should really have a better, more back-and-forth-compatible, story for serializing principals.  I'm not sure whether that should block making changes here, depending on what the behavior would be on serialization failure.

I also agree that you could end up with serialization success that just ends up with the "wrong" principal.  Again, I can't say whether that would be worse than the current "just use system" behavior.
Flags: needinfo?(bzbarsky)
(In reply to Jonathan Kingston [:jkt] from comment #8)
> I think you raise some valid points around improving serialization however
> what we have currently is using system for this code which is concerning.

I agree that we should come up with a plan. Falling back to using the SystemPrincipal as the triggeringPrincipal ultimately translates to insecurity by default which we try to eliminate in our codebase. We have had various efforts on removing that behavior within our codebase and I think we should clean it up for sessions as well.

Now, FWIW, I filed Bug 1508939 and marked this bug to be dependent on it. Even though we should change the behavior we should also not act to fast. Having an end user being stuck because principals can't be deserialized anymore is a problem we don't want to get into, because that will get messy fast.

I added Bug 1508939 as project by itself to our content security roadmap and we will look into it. Once Bug 1508939 is fixed we can then focus on fixing the exact problem within this bug.

Thanks Paolo for summarizing your concerns in such detail!
Flags: needinfo?(ckerschb)
Duplicate of this bug: 1505640
You need to log in before you can comment on or make changes to this bug.