Figure out how to deal with serialized/deserialized principals that have become invalid

RESOLVED FIXED in Firefox 61

Status

()

defect
RESOLVED FIXED
Last year
3 months ago

People

(Reporter: Gijs, Assigned: Gijs)

Tracking

unspecified
mozilla62
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox61 fixed, firefox62 fixed)

Details

Attachments

(1 attachment)

TL;DR:

1. on current release (and up-to-date nightly), about:home is a nested URI, where the outer is about:home and the inner is moz-safe-about:home .
2. bug 1438367 tried to make it so about:home no longer has MAKE_LINKABLE. The side effect of this is that about:home becomes a "normal" URI, ie just "about:home".
3. session store (and maybe other code) stores serialized principals as b64 strings

This is bad when the browser updates, and the canonical URI interpretation of "about:home" changes, because we check originNoSuffix when deserializing principals, and if we've serialized a principal we create the principal from the uri *string*, but the b64 deserialization in the previous build contains all the "old" nested/non-nested information. Result being, one bit of code has a principal with a nested nsIURI with moz-safe-about:home, the other bit has something else, and we MOZ_CRASH. In the parent. Which means startup crashes (or crashes when you click "restore session" page that comes up if the startup crashed) if any tab was a result of a link click off about:home (aka about:newtab aka activity stream). Specifically, this fails in deserialization (PrincipalInfoToPrincipal, in ipc/glue/BackgroundUtils.cpp) because the originNoSuffix doesn't match.

Root cause is bug 1228118 - we should really stop having about: URIs be nested at all. But not sure how easily fixable that is at the moment.

Talking to bkelly, we're looking at making PrincipalToPrincipalInfo (so the *serialization* step) do a sanity check and then fail instead. This would mean that the load fails in the child, because the child will fail to create a serialized copy of the principal to send to the parent. Boris, does that sound plausible? Do you think we need to pursue something else (as well)?

Twist of the knife when I was prototyping a patch is that the nsIURI object for the principal is stored with it in the serialized form in session restore, so we have to reparse the URI. Doing that whenever we serialize a URI to send to the parent feels expensive...
Flags: needinfo?(bzbarsky)
Marking 61 as affected because at least b2-4 went out with the new about:home code already, so people who clicked an about:home link on those beta builds and restore the session will just crash when they get the backout as part of b5 (ships tomorrow) or newer betas.
Does the code in BackgroundUtils.cpp fix this for sessionstore too? I'm currently investigating reports of pages that fail to start loading once they're restored due to (an) invalid principal(s).
Reports of this go as far back as Fx 58.
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to Mike de Boer [:mikedeboer] from comment #3)
> Does the code in BackgroundUtils.cpp fix this for sessionstore too?

No, it makes things fail earlier - but they still fail. Hence the fix in ContentRestore.jsm.

> I'm
> currently investigating reports of pages that fail to start loading once
> they're restored due to (an) invalid principal(s).
> Reports of this go as far back as Fx 58.

Do you have links to the other bug(s) about this?
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(mdeboer)
Comment on attachment 8975637 [details]
Bug 1461407 - make about:home unlinkable again and improve behavior of serialized principals across changes to URLs,

https://reviewboard.mozilla.org/r/243886/#review249914

::: browser/components/sessionstore/ContentRestore.jsm:235
(Diff revision 1)
> +        if (triggeringPrincipal && triggeringPrincipal.isCodebasePrincipal &&
> +            triggeringPrincipal.URI && triggeringPrincipal.URI.schemeIs("about")) {

Thinking about this in the cold light of morning, I expect this does the wrong thing for about:blank principals. :-(

::: browser/components/sessionstore/ContentRestore.jsm:235
(Diff revision 1)
> +        if (triggeringPrincipal && triggeringPrincipal.isCodebasePrincipal &&
> +            triggeringPrincipal.URI && triggeringPrincipal.URI.schemeIs("about")) {

Thinking about this in the cold light of morning, I expect this does the wrong thing for about:blank principals. :-(
Bug 1448975 is the one I was investigating yesterday.
Flags: needinfo?(mdeboer)
Comment on attachment 8975637 [details]
Bug 1461407 - make about:home unlinkable again and improve behavior of serialized principals across changes to URLs,

https://reviewboard.mozilla.org/r/243886/#review249918

I'm basically rubberstamping this, because the change is quite harmless for any other flow.

I am wondering though; is there a generic way to detect that a principal we serialized ages ago has become invalid? And if so have a recovery path, instead of breaking all requests?

::: browser/components/about/AboutRedirector.cpp:87
(Diff revision 1)
>      nsIAboutModule::HIDE_FROM_ABOUTABOUT },
>    { "welcomeback", "chrome://browser/content/aboutWelcomeBack.xhtml",
>      nsIAboutModule::ALLOW_SCRIPT |
>      nsIAboutModule::HIDE_FROM_ABOUTABOUT },
>    // Actual activity stream URL for home and newtab are set in channel creation
> -  { "home", "about:blank", ACTIVITY_STREAM_FLAGS | nsIAboutModule::MAKE_LINKABLE }, // Bug 1438367 to try removing MAKE_LINKABLE again
> +  { "home", "about:blank", ACTIVITY_STREAM_FLAGS }, // Bug 1438367 to try removing MAKE_LINKABLE again

Since you're removing MAKE_LINKABLE here, what do you want to do about this comment and bug 1438367?

::: browser/components/sessionstore/ContentRestore.jsm:238
(Diff revision 1)
>  
> +        // Deal with changes in about: URI principals.
> +        if (triggeringPrincipal && triggeringPrincipal.isCodebasePrincipal &&
> +            triggeringPrincipal.URI && triggeringPrincipal.URI.schemeIs("about")) {
> +          let uri = Services.io.newURI(triggeringPrincipal.URI.spec);
> +          let oa = triggeringPrincipal.originAttributes;

nit: I don't think you need a temporary variable for this.
Attachment #8975637 - Flags: review?(mdeboer) → review+
Comment on attachment 8975637 [details]
Bug 1461407 - make about:home unlinkable again and improve behavior of serialized principals across changes to URLs,

https://reviewboard.mozilla.org/r/243886/#review250114

Overall check seems ok, but I think the error handling needs to be fixed in BackgroundUtils.  Also, it might be worth adding a comment pointing to the previous regression we are trying to protect against.

::: browser/components/about/AboutRedirector.cpp:87
(Diff revision 1)
>      nsIAboutModule::HIDE_FROM_ABOUTABOUT },
>    { "welcomeback", "chrome://browser/content/aboutWelcomeBack.xhtml",
>      nsIAboutModule::ALLOW_SCRIPT |
>      nsIAboutModule::HIDE_FROM_ABOUTABOUT },
>    // Actual activity stream URL for home and newtab are set in channel creation
> -  { "home", "about:blank", ACTIVITY_STREAM_FLAGS | nsIAboutModule::MAKE_LINKABLE }, // Bug 1438367 to try removing MAKE_LINKABLE again
> +  { "home", "about:blank", ACTIVITY_STREAM_FLAGS }, // Bug 1438367 to try removing MAKE_LINKABLE again

Bug where this was put back in implied there was a lot more that had to happen to remove this.  Are we sure this is safe?  Did you do this just to test?

::: browser/components/sessionstore/ContentRestore.jsm:234
(Diff revision 1)
>                         Utils.makeInputStream(loadArguments.postData) : null;
>          let triggeringPrincipal = loadArguments.triggeringPrincipal
>                                    ? Utils.deserializePrincipal(loadArguments.triggeringPrincipal)
>                                    : null;
>  
> +        // Deal with changes in about: URI principals.

I don't think I understand the session store issues enough to comment on this code.

::: ipc/glue/BackgroundUtils.cpp:241
(Diff revision 1)
>      return rv;
>    }
>  
>    *aPrincipalInfo = ContentPrincipalInfo(aPrincipal->OriginAttributesRef(),
>                                           originNoSuffix, spec);
> +  if (aPrincipalInfo) {

Uh, I hope this is not nullptr since we just de-referenced it on the line above.

::: ipc/glue/BackgroundUtils.cpp:250
(Diff revision 1)
> +    nsCOMPtr<nsIURI> originalURI;
> +    rv = NS_NewURI(getter_AddRefs(originalURI), spec);
> +    nsCOMPtr<nsIPrincipal> principalFromURI =
> +      BasePrincipal::CreateCodebasePrincipal(originalURI, aPrincipal->OriginAttributesRef());
> +    if (NS_WARN_IF(!principalFromURI)) {
> +      aPrincipalInfo = nullptr;

I don't think `aPrincipalInfo = nullptr` does what you want.  That clears the pointer passed in, but not any of the contents we wrote above with `*aPrincipalInfo = ...`.

It would probably be better to check the generated `originNoSuffix` before assigning anything to `*aPrincipalInfo`.

::: ipc/glue/BackgroundUtils.cpp:258
(Diff revision 1)
> +
> +    // Origin must match what the_new_principal.getOrigin returns.
> +    nsAutoCString originNoSuffix;
> +    rv = principalFromURI->GetOriginNoSuffix(originNoSuffix);
> +    if (NS_WARN_IF(NS_FAILED(rv)) ||
> +        !((ContentPrincipalInfo*)aPrincipalInfo)->originNoSuffix().Equals(originNoSuffix)) {

You don't want to NS_WARN_IF() around the origin equality check?
Attachment #8975637 - Flags: review?(bkelly) → review-
(In reply to Mike de Boer [:mikedeboer] from comment #7)
> Comment on attachment 8975637 [details]
> Bug 1461407 - make about:home unlinkable again and improve behavior of
> serialized principals across changes to URLs,
> 
> https://reviewboard.mozilla.org/r/243886/#review249918
> 
> I'm basically rubberstamping this, because the change is quite harmless for
> any other flow.
> 
> I am wondering though; is there a generic way to detect that a principal we
> serialized ages ago has become invalid?

I don't see an obvious way of doing so beyond re-creating every codebase principal sessionstore sees (using the same uri spec and OA) and comparing its originNoSuffix with the one that got deserialized. `deserializeObject` doesn't seem to care. I guess we could make it return null, though I'm not sure where the principal-specific deserialization code lives. Maybe ContentPrincipal::Read, and the same code for the about URI if that's where we need to check this?

In fact, more specifically, I wonder if we could make the URI ::Read code for both nested and non-nested about: URIs always read in the "currently-valid" version of the URI, even if the serialized version was nested/non-nested. AFAICT that would mean that the URI would always be valid, and as a result the principal would be too (because the linked principal reading code would get the "right" origin).

:bkelly / :bz, how does that sound?

> And if so have a recovery path,
> instead of breaking all requests?

At the moment, we detect this in core code at the point where the principal reaches the parent, and we MOZ_CRASH.

With this patch, we detect it in the child and the loadURI call throws -- but sessionstore / the remote browser code doesn't handle this at all, and you essentially get the symptoms described in bug 1448975. I guess we could try to fix that in a separate bug, given that it's not technically a regression/issue specific to here, and a fix here would at least alleviate the crashing, and users have a trivial workaround to the infinite loading spinner by just hitting 'enter'/'return' in the URL bar.

I guess the ideal case (for other missing/broken principals) might be loading a neterror page, but I dunno how hard it would be to do that here - probably pretty hard.

It *might* also be an option to downgrade to a null principal as the triggering if the load fails or the principal is missing, though right now we also can't assume that the load failing (ie throwing) is due to issues with the principal - there can be other reasons as well (e.g. trying to load a now-invalid URI). So we'd have to detect the broken/missing principal thing ourselves, ahead of time. Also, I don't know that it's always safe to downgrade to a null principal as the triggering principal. I *think* so, but I'm not sure.
Flags: needinfo?(bkelly)
(In reply to :Gijs (he/him) from comment #9)
> In fact, more specifically, I wonder if we could make the URI ::Read code
> for both nested and non-nested about: URIs always read in the
> "currently-valid" version of the URI, even if the serialized version was
> nested/non-nested. AFAICT that would mean that the URI would always be
> valid, and as a result the principal would be too (because the linked
> principal reading code would get the "right" origin).
> 
> :bkelly / :bz, how does that sound?

Oops, omitted the link after editing the comment - https://dxr.mozilla.org/mozilla-central/rev/a382f8feaba41f1cb1cee718f8815cd672c10f3c/caps/ContentPrincipal.cpp#413 is the codebase principal stuff I was looking at.
AIUI the session store corruption is limited to a few days on nightly and beta channels.  Rather than put in a complex recovery mechanism for this specific case I guess I would be inclined to simply ship the error-on-load before.  Perhaps session store should handle failure to load and purge the bad the entry.  That would in theory catch other potential corruption in the future.

If this had made it to release channel I think it would be a different story.

Just my 2 cents.
Flags: needinfo?(bkelly)
(In reply to Ben Kelly [:bkelly] from comment #11)
> AIUI the session store corruption is limited to a few days on nightly and
> beta channels.

Well, we still want to fix bug 1438367, which will re-introduce the issue (and will then eventually cause it to happen on release). So I'm trying to figure out what's the best way of avoiding that happening.

>  Rather than put in a complex recovery mechanism for this
> specific case I guess I would be inclined to simply ship the error-on-load
> before.  Perhaps session store should handle failure to load and purge the
> bad the entry.  That would in theory catch other potential corruption in the
> future.

Just purging the entry would be dataloss, right?
Comment on attachment 8975637 [details]
Bug 1461407 - make about:home unlinkable again and improve behavior of serialized principals across changes to URLs,

https://reviewboard.mozilla.org/r/243886/#review250424

So here's a question.  Would it make more sense to just change ContentPrincipal::Read to do the "if the URI is about: then recreate the URI from the spec" bit?  Basically, move to a "less binary" serialization format for principals, effectively, while retaining backwards compat.

That seems like it would not require the sessionstore code at all.  Would it also allow us to get rid of the PrincipalToPrincipalInfo changes?

::: browser/components/sessionstore/ContentRestore.jsm:234
(Diff revision 1)
>                         Utils.makeInputStream(loadArguments.postData) : null;
>          let triggeringPrincipal = loadArguments.triggeringPrincipal
>                                    ? Utils.deserializePrincipal(loadArguments.triggeringPrincipal)
>                                    : null;
>  
> +        // Deal with changes in about: URI principals.

This needs a much clearer comment about what this code is trying to do and why.

::: ipc/glue/BackgroundUtils.cpp:242
(Diff revision 1)
>    }
>  
>    *aPrincipalInfo = ContentPrincipalInfo(aPrincipal->OriginAttributesRef(),
>                                           originNoSuffix, spec);
> +  if (aPrincipalInfo) {
> +    // Check we're actually returning a sane principal. Need to recreate the URI

Do we need to worry about performance implications here?
Attachment #8975637 - Flags: review?(bzbarsky)
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #13)
> Comment on attachment 8975637 [details]
> Bug 1461407 - make about:home unlinkable again and improve behavior of
> serialized principals across changes to URLs,
> 
> https://reviewboard.mozilla.org/r/243886/#review250424
> 
> So here's a question.  Would it make more sense to just change
> ContentPrincipal::Read to do the "if the URI is about: then recreate the URI
> from the spec" bit?  Basically, move to a "less binary" serialization format
> for principals, effectively, while retaining backwards compat.

I think so. I've attempted to implement that in this new version of the patch.


> That seems like it would not require the sessionstore code at all.

Correct.

>  Would it
> also allow us to get rid of the PrincipalToPrincipalInfo changes?

I think so, yeah. Certainly the attached patch wfm.

(In reply to Ben Kelly [:bkelly] from comment #8)
> ::: browser/components/about/AboutRedirector.cpp:87
> (Diff revision 1)
> >      nsIAboutModule::HIDE_FROM_ABOUTABOUT },
> >    { "welcomeback", "chrome://browser/content/aboutWelcomeBack.xhtml",
> >      nsIAboutModule::ALLOW_SCRIPT |
> >      nsIAboutModule::HIDE_FROM_ABOUTABOUT },
> >    // Actual activity stream URL for home and newtab are set in channel creation
> > -  { "home", "about:blank", ACTIVITY_STREAM_FLAGS | nsIAboutModule::MAKE_LINKABLE }, // Bug 1438367 to try removing MAKE_LINKABLE again
> > +  { "home", "about:blank", ACTIVITY_STREAM_FLAGS }, // Bug 1438367 to try removing MAKE_LINKABLE again
> 
> Bug where this was put back in implied there was a lot more that had to
> happen to remove this.  Are we sure this is safe?  Did you do this just to
> test?

I've asked Mardak to review this now to make sure this is OK, but basically, removing the MAKE_LINKABLE flag fixed a security issue, so when we put it back we also put in another bit of code to avoid re-introducing the security issue. I'm pretty OK leaving said other code in place - belt + suspenders and all that. See also bug 1446065. So the TL;DR is that I think this is fine to land as-is, and if we do want to do further cleanup we can do it in bug 1438367, but I'm less bothered about that. I did remove the comment because it doesn't make much sense to have a comment saying "try removing this thing" when the thing is not, you know, there. :-)
Flags: needinfo?(bzbarsky)
Comment on attachment 8975637 [details]
Bug 1461407 - make about:home unlinkable again and improve behavior of serialized principals across changes to URLs,

https://reviewboard.mozilla.org/r/243886/#review250584

Lovely.  This makes a lot more sense to me and keeps all the ick in one place, which is already kinda icky.
Attachment #8975637 - Flags: review?(bzbarsky) → review+
Comment on attachment 8975637 [details]
Bug 1461407 - make about:home unlinkable again and improve behavior of serialized principals across changes to URLs,

https://reviewboard.mozilla.org/r/243886/#review250586

I added the "try removing again" comment as just plain removing the flag wasn't enough. Looks like this is enough! :)
Attachment #8975637 - Flags: review?(edilee) → review+
Pushed by gijskruitbosch@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/7cef5b202339
make about:home unlinkable again and improve behavior of serialized principals across changes to URLs, r=bz,Mardak,mikedeboer
Ryan, how do I check crash rates on nightly once this change lands, so I can request beta uplift once those verify this works as expected?

Mike: let me know if you need help (review, ideas, whatever) with the other triggering principal woes in bug 1448975 and/or others. The fix here will only help for about: pages and my impression was that in bug 1448975 you're also running into other issues.
Flags: needinfo?(ryanvm)
Well, it also looks like Ryan found bug 1345433 to regress sessionstore in some way, so I'm looking at activity there for the time being. Thanks!
(In reply to :Gijs (he/him) from comment #19)
> Ryan, how do I check crash rates on nightly once this change lands, so I can
> request beta uplift once those verify this works as expected?

I've actually been holding off on verifying bug 1459655 to make sure the rate drops off as expected (and so far, all indications are that it has). Assuming the signature stays the same, we can monitor the rates on trunk to see what happens?
Flags: needinfo?(ryanvm)
https://hg.mozilla.org/mozilla-central/rev/7cef5b202339
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
Blocks: 1438367
(In reply to :Gijs (he/him) from comment #19)
> Ryan, how do I check crash rates on nightly once this change lands, so I can
> request beta uplift once those verify this works as expected?

To follow-up, I've only seen one mozilla::ipc::PrincipalInfoToPrincipal crash on Nightly since this landed, and the stack below it looks different:
https://crash-stats.mozilla.com/report/index/045fe1d1-1504-41e1-a538-10f3b0180521

So I guess that's a good sign? That said, bug 1459655 is also tapering off on Beta61 at this point as expected. I would think the issue won't affect release at all since there's no net change between 60 and 61 at this point. Uless we still want bug 1438367 to ship in 61, are we good to just leave things as-is at this point?
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #23)
> (In reply to :Gijs (he/him) from comment #19)
> > Ryan, how do I check crash rates on nightly once this change lands, so I can
> > request beta uplift once those verify this works as expected?
> 
> To follow-up, I've only seen one mozilla::ipc::PrincipalInfoToPrincipal
> crash on Nightly since this landed, and the stack below it looks different:
> https://crash-stats.mozilla.com/report/index/045fe1d1-1504-41e1-a538-
> 10f3b0180521
> 
> So I guess that's a good sign? That said, bug 1459655 is also tapering off
> on Beta61 at this point as expected. I would think the issue won't affect
> release at all since there's no net change between 60 and 61 at this point.
> Uless we still want bug 1438367 to ship in 61, are we good to just leave
> things as-is at this point?

I think I would slightly prefer bug 1438367 to ship in 61 because it's good defense-in-depth work, and I think the patch in this bug is pretty safe generally, but I wouldn't throw a tantrum over it being in 61 instead. I'd delegate the final decision to Mardak. :-)
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(edilee)
(In reply to :Gijs (he/him) from comment #24)
> (In reply to Ryan VanderMeulen [:RyanVM] from comment #23)
> > (In reply to :Gijs (he/him) from comment #19)
> > > Ryan, how do I check crash rates on nightly once this change lands, so I can
> > > request beta uplift once those verify this works as expected?
> > 
> > To follow-up, I've only seen one mozilla::ipc::PrincipalInfoToPrincipal
> > crash on Nightly since this landed, and the stack below it looks different:
> > https://crash-stats.mozilla.com/report/index/045fe1d1-1504-41e1-a538-
> > 10f3b0180521
> > 
> > So I guess that's a good sign? That said, bug 1459655 is also tapering off
> > on Beta61 at this point as expected. I would think the issue won't affect
> > release at all since there's no net change between 60 and 61 at this point.
> > Uless we still want bug 1438367 to ship in 61, are we good to just leave
> > things as-is at this point?
> 
> I think I would slightly prefer bug 1438367 to ship in 61 because it's good
> defense-in-depth work, and I think the patch in this bug is pretty safe
> generally, but I wouldn't throw a tantrum over it being in 61 instead.

... in *62* instead. Yes, it's afternoon here...
Activity Stream about:home with MAKE_LINKABLE has been shipping since Firefox 57, and 61 has some extra logic to blank the page if it was linked. So from that perspective, I think it would be fine to not uplift the fix here to 61.

However, if there's any users who have different (not-LINKABLE) about:home being restored by session restore from early 61 beta, it would be unfortunate for them to still be crashing. Bug 1459655 was uplifted to 61b5, and there are still crashes:

https://crash-stats.mozilla.com/signature/?version=61.0b7&version=61.0b6&version=61.0b5&signature=mozilla%3A%3Aipc%3A%3APrincipalInfoToPrincipal

So for that reason, it might be worthwhile to uplift this to try to make those go away.
Flags: needinfo?(edilee)
Comment on attachment 8975637 [details]
Bug 1461407 - make about:home unlinkable again and improve behavior of serialized principals across changes to URLs,

Reuesting uplift per comment 26.

Approval Request Comment
[Feature/Bug causing the regression]: bug 1438367 and then bug 1459655.
[User impact if declined]: crashes
[Is this code covered by automated tests?]: no. :-(
[Has the fix been verified in Nightly?]: yes, crashes are gone on Nightly.
[Needs manual test from QE? If yes, steps to reproduce]: I don't think so.
[List of other uplifts needed for the feature/fix]: n/a
[Is the change risky?]: not really
[Why is the change risky/not risky?]: it's a relatively small change that's been verified in nightly, has baked for a bit now, and the code it changes is generally straightforward enough, so the scope for problems is pretty small
[String changes made/needed]: nope
Attachment #8975637 - Flags: approval-mozilla-beta?
Comment on attachment 8975637 [details]
Bug 1461407 - make about:home unlinkable again and improve behavior of serialized principals across changes to URLs,

Allows us to better secure our about:home page and will reduce the crashes experienced by beta users upgrading from older builds. Approved for 61.0b8.
Attachment #8975637 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
So far no early crash reports for 61.0b8. Although if someone is crashing near startup because of a restored about:home, seems unlikely that it'll be able to update automatically to 61.0b8… ?

https://crash-stats.mozilla.com/signature/?version=61.0b8&signature=mozilla%3A%3Aipc%3A%3APrincipalInfoToPrincipal
See Also: → 1553629
You need to log in before you can comment on or make changes to this bug.