Closed Bug 1305237 Opened 7 years ago Closed 6 years ago

Pass ancestors URLs information to webRequest.onBeforeRequest listeners

Categories

(WebExtensions :: Request Handling, defect, P3)

defect

Tracking

(firefox58 fixed)

RESOLVED FIXED
mozilla58
Tracking Status
firefox58 --- fixed
webextensions +

People

(Reporter: ma1, Assigned: mixedpuppy)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: dev-doc-complete, Whiteboard: triaged)

Attachments

(2 files, 6 obsolete files)

Both Adblock Plus and NoScript (and presumably similar extension) have use cases for knowing the whole ancestors chain, up to the top frame, for both document and sub resources loads.

This is fairly simple to obtain in a nsIContentPolicy (which we still use, but only for non-HTTP loads) by directly walking up the frames hierarchy; on the other hand, AFAIK, is currently impossible in HTTP observers living in the parent process, which is how onBeforeRequest is implemented in the HTTP case.

Therefore I believe we preliminarily need to augment nsILoadInfo by adding another array attribute "ancestors", similar to redirectChainIncludingInternalRedirects, but filling it with the loading element's ancestors principals, starting with parent frame if any: if this is the top frame, the array would be empty, otherwise the first element would be the same as nsILoadInfo.loadingPrincipal, the second element would be the parent document's principal and so on, up to the top frame.

Once nsILoadInfo.ancestors is in place, we can use it in onBeforeRequest to fill an analogous "ancestors" array attribute in the event's data object, with the URLs (as strings)  of the collected principals.

Kris, Bill, what do you think?
Blocks: 1166496
OS: Unspecified → All
Hardware: Unspecified → All
Would that also be include the owner frames for loads from workers?
(In reply to The 8472 from comment #1)
> Would that also be include the owner frames for loads from workers?

I suspect that would be significantly more difficult, unless we cache this information at worker creation time.
Asking someone surely more knowledgeable than me. Andrea?
Flags: needinfo?(amarchesini)
Whiteboard: triaged
Assignee: nobody → evilpies
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
I find it difficult to come up with a good definition of what should actually be included in "frameAncestors".
Let's say we have some setup like this "top level document > iframe 1 > iframe 2 > image.png"
top level document => [] ancestors
iframe 1 => [] or [top level] ? "top level document" would already be the originUrl in this case
iframe 2 => [iframe 1, top level] or [top level], again "iframe1" would be the originUrl
image.png => [iframe 1, top level] or [iframe 2, iframe 1, top level]
these seem the most logical choices to me:

> top level document => [] ancestors
> iframe 1 => [top level]
> frame 2 => [iframe 1, top level]
> image.png => [iframe 2, iframe 1, top level]


But there is one further complication, the triggering principal. When an image is loaded by CSS, shouldn't we get the following?

image.png => [stylesheet, iframe2, iframe1, top level]
Attached patch WIP frameAncestors (obsolete) — Splinter Review
This sort of mixes two different ones, sorry. ContentPolicy vs LoadInfo is different here.

LoadInfo behaves like this
top level => []
iframe 1 => []
iframe 2 => [top level]
image.png => [iframe 1, iframe 2]

The content policy implementation is a bit different.
I think anything that requires post-processing by the addon to *add* information would end up being stateful. E.g. resolving tab or frame IDs to URIs would require keeping state about those relationships. URIs are ambiguous, frame IDs get reused if the frames get navigated (backwards navigation!). So it would be easy to miss some edge-cases.

Snapshotting as much information as possible during the correct time in the child process should yield the most accurate picture. Stripping information out of the array would be a lot easier than trying to fill in the blanks.
I just illustrated a way to get the tab during a request in bug 1330600 comment 9, could something along that line be used to work your way up through the ancestors?
(In reply to Shane Caraveo (:mixedpuppy) from comment #7)
> I just illustrated a way to get the tab during a request in bug 1330600
> comment 9, could something along that line be used to work your way up
> through the ancestors?

a) that uses async APIs, which means it's potentially subject to racy behavior
b) it only works for top level frames, i.e. the tab itself. there's no API to query information on frames
webNavigation.getAllFrames wouldn't work?  Arguably a weird place for that api, it would feel more correct if it were tabs.Tab.getAllFrames.  In fact it seems like that provides whats necessary for an ancestor chain.
and I'm really not sure about the racy behavior, all the apis are async.  if you're using blocking onBeforeRequest, the request is suspended so subrequests wouldn't happen.  What would be happening to cause racy conditions?
Oh, right, my mistake. That leaves the races.

> What would be happening to cause racy conditions?

Perhaps javascript in the child process navigating pages or frames? Especially back navigation, which won't trigger a webrequest. And with out of process addons other addons might be acting concurrently on other requests (suspending one request won't necessarily delay processing of others AIUI) and trigger additional actions. So anything could happen in the meantime.
(In reply to The 8472 from comment #11)
> Oh, right, my mistake. That leaves the races.
> 
> > What would be happening to cause racy conditions?
> 
> Perhaps javascript in the child process navigating pages or frames?

I don't see how using webrequest with a pre-filled ancestor chain would change that.

> Especially back navigation, which won't trigger a webrequest. 

Both the OP and the WIP attached to this bug only address webrequest, so I'm not clear how that makes a difference.

> And with out
> of process addons other addons might be acting concurrently on other
> requests (suspending one request won't necessarily delay processing of
> others AIUI) and trigger additional actions. So anything could happen in the
> meantime.

Suspending a request will prevent the resource being loaded from loading other resources.  ie. the html page in a frame being suspended, the scripts/etc. it loads would not be available at that time.

Even with the changes in the WIP, multiple addons may be processing the request anyway and there is no promised order to that, so I don't see how doing this low level vs. using getAllFrames makes any difference there.

I am playing devils advocate here, and am not really against this change if it adds real value, but I'm trying to understand what it introduces that cannot be done already.
> Suspending a request will prevent the resource being loaded from loading other resources.

Child resources. But what about the ancestors? If the ancestor changes in the meantime, e.g. by frame navigation then getAllFrames might return frame URLs that are inconsistent with the ancestor chain that actually triggered the request.
My use case, and the reason why I filed this bug, depends on the information about ancestors of the document which triggered the load being an accurate snapshot, freezing the moment when the load has been attempted.
When I look at the request info asynchronously I don't care if something has changed in the meanwhile, provided that my request is suspended, because I need to decide about that snapshot info.
Example (and an actual requirement of the Tor Browser): I need to know whether all the parents of the document being loaded had an HTTPS URI *when the load started*.

Therefore, taking the ancestor chain snapshot in the child process at the moment the load is triggered is the right thing to do even if the processing happens asynchronously, later, on a suspended channel. Looking at the ancestors chain "live" at processing time (i.e. generally after the fact) using getAllFrames() is not OK at all, as suggested also in comment 13.
Ok, that is a much clearer requirement, makes sense.  From the looks of this, you only need the urls, not a frameId or anything.  Knowing which frame a request was made in is not necessary?  I haven't looked deeply at the WIP, but I wonder if the frameId would be easy to just add in case it was useful.
(In reply to Shane Caraveo (:mixedpuppy) from comment #15)
> Ok, that is a much clearer requirement, makes sense.  From the looks of
> this, you only need the urls, not a frameId or anything.  Knowing which
> frame a request was made in is not necessary?  I haven't looked deeply at
> the WIP, but I wonder if the frameId would be easy to just add in case it
> was useful.

The frameIds are useful (required in some real use cases in NoScript / Tor Browser) to associate and preserve some state to the frame structure, e.g. in order to make permission decisions based on previous decisions for the ancestors (like "Cascade permissions" or "Restrict subframes if parent or top frame has been restricted") or other data from the requests which loaded the ancestors (e.g. HTTP headers like CSP).
frameIDs are a good idea, I didn't even think about adding those. Would be unfortunate if we had to add yet another array to LoadInfo for this though.
If I wanted to gather additional information in the child process and add it to the requestinfo, would that be possible through webext experiments?
(In reply to Tom Schuster [:evilpie] from comment #17)
> frameIDs are a good idea, I didn't even think about adding those. Would be
> unfortunate if we had to add yet another array to LoadInfo for this though.

Why would it need to be another array?  Just a new interface and an array of that interface.  In fact, it could probably just match the objects from webNavigation.getAllFrames.

[{tabId, processId, frameId, parentFrameId, url},...]
(In reply to The 8472 from comment #18)
> If I wanted to gather additional information in the child process and add it
> to the requestinfo, would that be possible through webext experiments?

If you mean the details objects that webrequest listeners receive, I don't think so, but an interesting thought (at least in respect to webRequest).  If not, what is requestinfo?
Yes, I meant the details.
Would it be possible to include information about html5 sandboxing restriction then? I think for privacy addons it might make sense to differentiate between sandboxed and non-sandboxed loads since the former can't use any persistent storage.
webextensions: --- → ?
For frameIDs, we would probably just pass the outerWindowID around and lookup the right frame id in the webextension code. This should be easily doable.
Tom, we'd like to target this for 55.  Let me know if you'd like me to take over.
Flags: needinfo?(evilpies)
No I am still working on this, we are easily going to make 55.
Flags: needinfo?(evilpies)
I changed frameAncestors to an array of objects, with the keys url and windowId, that should allow us to extend this in the future as required.

I still need to finish writing some tests, especially around the "WebRequest content policy" and make sure the behavior matches with the loadInfo implementation.
Attachment #8828906 - Attachment is obsolete: true
This changes frameAncestors to behave like described in Comment 4. I find this should be the least surprising behavior for users.

Now the loadInfo based code should match the code for content policy.
Attachment #8849072 - Attachment is obsolete: true
webextensions: ? → +
Assignee: evilpies → mixedpuppy
Giorgio, did you get a chance to look at the patches?
Flags: needinfo?(g.maone)
Tom, do you have any notes or comments about the last state of the patches?
Flags: needinfo?(evilpies)
Sorry, I am on vacation and can't access my work for about two weeks. I had some updated patches and started working on some tests, but it's basically not that different from what I is in my last patch.
Flags: needinfo?(evilpies)
Tom, are those in a state where they could be attached here?  I took the bug, but if you're planning to finish it in the next couple weeks you can take it back.
Flags: needinfo?(evilpies)
As I mentioned earlier, having effective sandbox properties and origin attributes per ancestor frame would be useful too.
I can attach the patches next week, not sure what the state is anymore. The patches aren't actually that large though.
Flags: needinfo?(evilpies)
Not sure if we are okay with adding this additional information to all LoadInfos?
Attachment #8849070 - Attachment is obsolete: true
Attachment #8887049 - Flags: feedback?(mixedpuppy)
I just rebased those patches, but nothing substantial changed. The test I wrote doesn't pass anymore, I think because we always give the main_frame an id of 0 now. I am not totally convinced that the ContentPolicyManager matches the HttpObserverManager.
Attachment #8849082 - Attachment is obsolete: true
Attachment #8887051 - Flags: feedback?(mixedpuppy)
Attachment #8887051 - Attachment is patch: true
Comment on attachment 8887049 [details] [diff] [review]
LoadInfo changes to include all ancestors principals and window IDs

This is a patch I cannot really evaluate, see if we can find someone who can.
Attachment #8887049 - Flags: feedback?(mixedpuppy)
Attachment #8887049 - Flags: feedback?(honzab.moz)
Attachment #8887049 - Flags: feedback?(bzbarsky)
Comment on attachment 8887049 [details] [diff] [review]
LoadInfo changes to include all ancestors principals and window IDs

Review of attachment 8887049 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not a loadinfo peer, but personally would not be that much against this.
Attachment #8887049 - Flags: feedback?(honzab.moz)
Comment on attachment 8887049 [details] [diff] [review]
LoadInfo changes to include all ancestors principals and window IDs

This seems ok as a general idea.  Some caveats:

1)  If we ever decide to do out of process iframes, this would be somewhat of an impediment to doing so.  That said, if we end up implementing something like ancestorOrigins then we would have at least some form of origin information.  But window ids might end up meaningless in the oop iframe case, because they're per-process.

2)  It's not clear to me what the expected thing is here in sandboxing without allow-same-origin scenarios.  If adblockers expect to see the URL of the thing being sandboxed, you're not going to get that from the principal.

3)  The API documentation being added doesn't make much sense for non-document loads.  It needs to be fixed.

4)  property-getter APIs that return a new object each time are a bit of an antipattern.  Methods are probably better, if we don't want to cache the object.

I didn't check the details of how we're getting the things to put in the arrays; it's not worth spending time on until at least items 2 and 3 above are addressed.
Attachment #8887049 - Flags: feedback?(bzbarsky) → feedback+
Comment on attachment 8887051 [details] [diff] [review]
Expose frameAncestors to webextensions

This all seems fine to me.
Attachment #8887051 - Flags: feedback?(mixedpuppy) → feedback+
(In reply to Boris Zbarsky [:bz] from comment #39)
> Comment on attachment 8887049 [details] [diff] [review]

> 1)  If we ever decide to do out of process iframes, this would be somewhat
> of an impediment to doing so.  That said, if we end up implementing
> something like ancestorOrigins then we would have at least some form of
> origin information.  But window ids might end up meaningless in the oop
> iframe case, because they're per-process.

How much should we worry about this being an impediment?  If we went OOP for iframes in the next year (or perhaps even two) we should certainly figure out something better.
Flags: needinfo?(bzbarsky)
I really doubt we can do OOP iframes in the next year.  I can't speak to the next two.

One simple option in terms of the API surface is to ship over the process id, not just the window id, and then the recipient can figure things out as needed.
Flags: needinfo?(bzbarsky)
(In reply to Limited Availabilty till Oct | Tom Schuster [:evilpie] from comment #36)
> Created attachment 8887051 [details] [diff] [review]
> Expose frameAncestors to webextensions
> 
> I just rebased those patches, but nothing substantial changed. The test I
> wrote doesn't pass anymore, I think because we always give the main_frame an
> id of 0 now. I am not totally convinced that the ContentPolicyManager
> matches the HttpObserverManager.

Can you attach your test?

If you've got cycles and want to help in any of this let me know.
Flags: needinfo?(evilpies)
(In reply to Boris Zbarsky [:bz] (vacation Aug 14-27) from comment #39)

> 2)  It's not clear to me what the expected thing is here in sandboxing
> without allow-same-origin scenarios.  If adblockers expect to see the URL of
> the thing being sandboxed, you're not going to get that from the principal.

Currently, there is no way to see a request inside a sandboxed iframe with webrequests, though I intend to change that.  

With a little hackery in the webrequest module, this is what I see for a sandboxed request without allow-same-origin, and it provides what I would have expected.  

It doesn't provide the url for the iframe itself (this is an image loaded in the iframe), which may prove to be an issue, but I'm not overly concerned with it (unless there is a very simple fix).

Did you expect something else?

{
  "requestId": "12",
  "url": "http://mochi.test:8888/.../file_image_good.png#sandboxed",
  "originUrl": "moz-nullprincipal:{03fd5623-648a-7748-a5c0-0ab591bcd52d}",
  "documentUrl": "moz-nullprincipal:{03fd5623-648a-7748-a5c0-0ab591bcd52d}",
  "method": "GET",
  "tabId": 1,
  "type": "image",
  "timeStamp": 1502313009296,
  "frameId": 2147483664,
  "parentFrameId": 2147483662,
  "frameAncestors": [{
      "url": "http://mochi.test:8888/.../file_simple_xhr_frame2.html",
      "frameId": 2147483662
    }, {
      "url": "http://mochi.test:8888/.../file_simple_xhr_frame.html",
      "frameId": 2147483660
    }, {
      "url": "http://mochi.test:8888/.../file_simple_xhr.html?nocache=0.7630002042184172",
      "frameId": 0
    }
  ]
}
Flags: needinfo?(bzbarsky)
See Also: → 1388889
Attachment #8887049 - Attachment is obsolete: true
Attachment #8887051 - Attachment is obsolete: true
Comment on attachment 8895588 [details]
Bug 1305237 LoadInfo changes to include all ancestors principals and window IDs,

This should address points 3 & 4, see comment 44 for point 2.
Attachment #8895588 - Flags: feedback?(bzbarsky)
Comment on attachment 8895589 [details]
Bug 1305237 Expose frameAncestors to webextensions,

Can you look at the tests?
Attachment #8895589 - Flags: feedback?(evilpies)
Comment on attachment 8895589 [details]
Bug 1305237 Expose frameAncestors to webextensions,

https://reviewboard.mozilla.org/r/166796/#review172186

Looks good! I added this tiny test: https://pastebin.mozilla.org/9029323
Flags: needinfo?(evilpies)
Attachment #8895589 - Flags: feedback?(evilpies) → feedback+
> It doesn't provide the url for the iframe itself (this is an image loaded in the iframe),
> which may prove to be an issue

Right, that is the question.

> unless there is a very simple fix

The only fix I can think of would be to go from a request context to a window or document and get the url...

> Did you expect something else?

No.  The question is what extensions using these APIs will expect.

In particular, using the principal for "documentURL" seems a bit odd to me.  Maybe as a followup we _should_ do the "go to a document/window" thing.
Flags: needinfo?(bzbarsky)
Comment on attachment 8895588 [details]
Bug 1305237 LoadInfo changes to include all ancestors principals and window IDs,

https://reviewboard.mozilla.org/r/166794/#review172466

::: netwerk/base/LoadInfo.cpp:112
(Diff revision 1)
> +      do {
> +        if (!parent || prevParent == parent) {
> +          break;
> +        }
> +
> +        nsCOMPtr<nsIScriptObjectPrincipal> scriptObjPrin = do_QueryInterface(parent);

This worries me some.

In particular, if we can ever end up in this code for a document which has a navigated-away-from ancestor, then the principals will be wrong.  Ideally we'd walk the parent/embedding document chain instead, because that's the right thing that actually shows the provenance of our document.  This keeps coming up; we should really add a nice accessor for it on nsIDocument or nsPIDOMWindowInner or something...

::: netwerk/base/LoadInfo.cpp:118
(Diff revision 1)
> +        if (!scriptObjPrin) {
> +          break;
> +        }
> +
> +        mAncestors.AppendElement(scriptObjPrin->GetPrincipal());
> +        mAncestorsOuterWindowIDs.AppendElement(parent->WindowID());

Again, not clear whether outer ids or inner ids are more useful here.  What are the expected use cases for the ids?

And again, we should think about what the API exposed to consumers is; in a world where the ids are non-unique because we have multiple processes...

::: netwerk/base/LoadInfo.cpp:349
(Diff revision 1)
>                     bool aInitialSecurityCheckDone,
>                     bool aIsThirdPartyContext,
>                     const OriginAttributes& aOriginAttributes,
>                     RedirectHistoryArray& aRedirectChainIncludingInternalRedirects,
>                     RedirectHistoryArray& aRedirectChain,
> +                   nsTArray<nsCOMPtr<nsIPrincipal>>& aAncestors,

This seems like a footgun: a caller can pass in an array and it will get nuked on them, and that's pretty unexpected.

This should probably take an rvalue reference.  At least then the caller will need to explicitly Move() if they want to pass stuff here and it will be clear that it's getting moved.

::: netwerk/base/LoadInfo.cpp:394
(Diff revision 1)
>  
>    mRedirectChainIncludingInternalRedirects.SwapElements(
>      aRedirectChainIncludingInternalRedirects);
>  
>    mRedirectChain.SwapElements(aRedirectChain);
> +  mAncestors.SwapElements(aAncestors);

And you wouldn't need this; the rvalue-reference ctor that nsTArray has does the right thing.  You'd just mAncestors(Move(aAncestors)) in the initializer list.
Attachment #8895588 - Flags: feedback?(bzbarsky) → feedback+
silence.
Flags: needinfo?(g.maone)
Flags: needinfo?(amarchesini)
Comment on attachment 8895588 [details]
Bug 1305237 LoadInfo changes to include all ancestors principals and window IDs,

https://reviewboard.mozilla.org/r/166794/#review172466

> Again, not clear whether outer ids or inner ids are more useful here.  What are the expected use cases for the ids?
> 
> And again, we should think about what the API exposed to consumers is; in a world where the ids are non-unique because we have multiple processes...

We use outer ids in webrequest, and these may need to be matched there.
Comment on attachment 8895588 [details]
Bug 1305237 LoadInfo changes to include all ancestors principals and window IDs,

https://reviewboard.mozilla.org/r/166794/#review172466

> This worries me some.
> 
> In particular, if we can ever end up in this code for a document which has a navigated-away-from ancestor, then the principals will be wrong.  Ideally we'd walk the parent/embedding document chain instead, because that's the right thing that actually shows the provenance of our document.  This keeps coming up; we should really add a nice accessor for it on nsIDocument or nsPIDOMWindowInner or something...

I've looked at this a bit and am not entirely certain what to do here, could use some further guidance.

> This seems like a footgun: a caller can pass in an array and it will get nuked on them, and that's pretty unexpected.
> 
> This should probably take an rvalue reference.  At least then the caller will need to explicitly Move() if they want to pass stuff here and it will be clear that it's getting moved.

I think I have these fixed.
Blocks: 1388889
See Also: 1388889
No longer blocks: 1388889
forgot to ni? for comment 56
Flags: needinfo?(bzbarsky)
> I've looked at this a bit and am not entirely certain what to do here, could use some further guidance.

I think the right thing to do is to walk the GetParentDocument() chain.  The important part is stopping at the same place where we'd stop walking the "get parent window" chain...  

Unfortunately, the logic in GetScriptableParent is a tad convoluted.  That's why I was talking about adding an API that walks the parent document chain but stops at the same place.  We have some existing stuff like nsIDocument::GetTopLevelContentDocument and the checks in nsDocument::IsThirdParty that does something similar, but no actual way to iterate all the documents we're nested through.

If you'd prefer, I can just write this part up myself; just let me know.
Flags: needinfo?(bzbarsky)
(In reply to Boris Zbarsky [:bz] (still digging out from vacation mail) from comment #58)
> > I've looked at this a bit and am not entirely certain what to do here, could use some further guidance.

> If you'd prefer, I can just write this part up myself; just let me know.

That would be super awesome.
Flags: needinfo?(bzbarsky)
Depends on: 1400501
I filed bug 1400501 on exposing a list of ancestor principals on documents.

Sorry for the horrible lag on this.  :(

For the outer window ids, walking up the docshell parent chain is fine, of course.
Flags: needinfo?(bzbarsky)
Comment on attachment 8895588 [details]
Bug 1305237 LoadInfo changes to include all ancestors principals and window IDs,

I've updated the patch on top of bug 1400501, would like some feedback, some of my cpp probably need improvement.  Also uncertain about the need to copy all this through loadinfo, perhaps there is a better way?
Attachment #8895588 - Flags: feedback?(bzbarsky)
Comment on attachment 8895589 [details]
Bug 1305237 Expose frameAncestors to webextensions,

Updating the channel wrapper to pass ancestor info through.  Would like a first round of feedback on that.
Attachment #8895589 - Flags: feedback?(kmaglione+bmo)
Comment on attachment 8895589 [details]
Bug 1305237 Expose frameAncestors to webextensions,

https://reviewboard.mozilla.org/r/166796/#review186870

::: toolkit/components/extensions/webrequest/ChannelWrapper.cpp:482
(Diff revision 3)
>  }
>  
> +void
> +ChannelWrapper::GetAncestorInfo(nsTArray<MozAncestorInfo>& aAncestorInfo) const
> +{
> +  if (nsCOMPtr<nsILoadInfo> loadInfo = GetLoadInfo()) {

We wind up returning an empty array if there's no load info. We should probably return null instead.

::: toolkit/components/extensions/webrequest/ChannelWrapper.cpp:486
(Diff revision 3)
> +      Unused << principal->GetURI(getter_AddRefs(uri));
> +      Unused << uri->GetSpec(ancestor.mUrl);

These are both fallible (though admittely only in the case of OOM). And the principal URI can be null, though in this case only in the case of a system principal. We should throw rather than crashing or returning a null principal in those cases.

::: toolkit/components/extensions/webrequest/ChannelWrapper.cpp:488
(Diff revision 3)
> +    for (const nsCOMPtr<nsIPrincipal>& principal : loadInfo->Ancestors()) {
> +      MozAncestorInfo ancestor;
> +      nsCOMPtr<nsIURI> uri;
> +      Unused << principal->GetURI(getter_AddRefs(uri));
> +      Unused << uri->GetSpec(ancestor.mUrl);
> +      aAncestorInfo.AppendElement(Move(ancestor));

Nit: You should probably call `aAncestorInfo.SetCapacity(length, fallible)` before the loop, and just throw if it fails.

::: toolkit/components/extensions/webrequest/ChannelWrapper.cpp:491
(Diff revision 3)
> +      Unused << principal->GetURI(getter_AddRefs(uri));
> +      Unused << uri->GetSpec(ancestor.mUrl);
> +      aAncestorInfo.AppendElement(Move(ancestor));
> +      uri.forget();
> +    }
> +    const nsTArray<uint64_t>& ancestorsOuterWindowIDs = loadInfo->AncestorsOuterWindowIDs();

Can you assert that both arrays are the same length here? It would be nice to have something like Zip() in MFBT to iterate over both arrays and return tuples for matched elements... but probably I digress.

::: toolkit/components/extensions/webrequest/ChannelWrapper.cpp:492
(Diff revision 3)
> +      Unused << uri->GetSpec(ancestor.mUrl);
> +      aAncestorInfo.AppendElement(Move(ancestor));
> +      uri.forget();
> +    }
> +    const nsTArray<uint64_t>& ancestorsOuterWindowIDs = loadInfo->AncestorsOuterWindowIDs();
> +    for (uint32_t i = 0; i < aAncestorInfo.Length() && i < ancestorsOuterWindowIDs.Length(); ++i) {

You can use a range-based loop here, too. Just make sure both arrays are the same length first.

::: toolkit/modules/addons/WebRequest.jsm:741
(Diff revision 3)
>  
>        serialize: serializeRequestData,
>      };
>  
> +    let {ancestorInfo} = channel;
> +    let frameAncestors = ancestorInfo.length > 0 ? ancestorInfo : [];

The ternary is a no-op.

::: toolkit/modules/addons/WebRequest.jsm:743
(Diff revision 3)
>      };
>  
> +    let {ancestorInfo} = channel;
> +    let frameAncestors = ancestorInfo.length > 0 ? ancestorInfo : [];
> +    if (data.type === "sub_frame") {
> +      frameAncestors.unshift({

This is going to give the wrong results when we have no load info, and imply a top-level frame.

::: toolkit/modules/addons/WebRequest.jsm:752
(Diff revision 3)
> +    }
> +    if (frameAncestors.length > 0) {
> +      // In a request in some subframe, we're not going to know the original
> +      // windowId of the top level request, but it is always the last entry
> +      // in the chain.  Just set it to zero here.
> +      frameAncestors[frameAncestors.length - 1].frameId = 0;

This will give a doubly wrong answer when we have no load info, and even more strongly imply that this is a top-level frame.

::: toolkit/modules/addons/WebRequestContent.js:170
(Diff revision 3)
> +          frame = frame.parent;
> +        } while (frame);

A `for` loop would work just as well here, and be easier to follow.

Also, `frame` will always be true at this point.
Comment on attachment 8895588 [details]
Bug 1305237 LoadInfo changes to include all ancestors principals and window IDs,

https://reviewboard.mozilla.org/r/166794/#review186874

::: ipc/glue/BackgroundUtils.cpp:375
(Diff revision 3)
>      NS_ENSURE_SUCCESS(rv, rv);
>    }
>  
> +  nsTArray<PrincipalInfo> ancestors;
> +  for (const nsCOMPtr<nsIPrincipal>& principal : aLoadInfo->Ancestors()) {
> +    rv = PrincipalToPrincipalInfo(principal, ancestors.AppendElement());

You should call `.SetCapacity` on `ancestors` ahead of time to avoid unnecessary reallocations.

::: ipc/glue/BackgroundUtils.cpp:478
(Diff revision 3)
>        RHEntryInfoToRHEntry(entryInfo);
>      NS_ENSURE_SUCCESS(rv, rv);
>      redirectChain.AppendElement(redirectHistoryEntry.forget());
>    }
>  
> +  nsTArray<nsCOMPtr<nsIPrincipal>> ancestors;

`SetCapacity()` here, too.

::: netwerk/base/LoadInfo.h:118
(Diff revision 3)
>             bool aInitialSecurityCheckDone,
>             bool aIsThirdPartyRequest,
>             const OriginAttributes& aOriginAttributes,
>             RedirectHistoryArray& aRedirectChainIncludingInternalRedirects,
>             RedirectHistoryArray& aRedirectChain,
> +           nsTArray<nsCOMPtr<nsIPrincipal>>&& aAncestors,

Kind of weird to have this as the only moved argument... but I suppose it may be worth it for the ref counting overhead.

::: netwerk/base/LoadInfo.cpp:33
(Diff revision 3)
>  
>  namespace mozilla {
>  namespace net {
>  
> -static uint64_t
> -FindTopOuterWindowID(nsPIDOMWindowOuter* aOuter)
> +static void
> +GatherTopOuterWindowIDs(nsPIDOMWindowOuter* aOuter, nsTArray<uint64_t>& outerWindowIDs, uint64_t& topOuterWindowID)

Nit: s/Top/Ancestor/

::: netwerk/base/LoadInfo.cpp:319
(Diff revision 3)
>    , mIsThirdPartyContext(rhs.mIsThirdPartyContext)
>    , mOriginAttributes(rhs.mOriginAttributes)
>    , mRedirectChainIncludingInternalRedirects(
>        rhs.mRedirectChainIncludingInternalRedirects)
>    , mRedirectChain(rhs.mRedirectChain)
> +  , mAncestors(Move(rhs.mAncestors))

This is a copy constructor. This should be a copy, not a move.

::: netwerk/base/LoadInfo.cpp:383
(Diff revision 3)
>    , mFrameOuterWindowID(aFrameOuterWindowID)
>    , mEnforceSecurity(aEnforceSecurity)
>    , mInitialSecurityCheckDone(aInitialSecurityCheckDone)
>    , mIsThirdPartyContext(aIsThirdPartyContext)
>    , mOriginAttributes(aOriginAttributes)
> +  , mAncestors(aAncestors)

`Move(aAncestors)`

::: netwerk/base/nsILoadInfo.idl:671
(Diff revision 3)
> +  [noscript, notxpcom, nostdcall, binaryname(AncestorsOuterWindowIDs)]
> +  Uint64ArrayRef binaryAncestorsOuterWindowIDs();

No need for the `binary` in either of these, since there isn't a scriptable version with the same name.
Comment on attachment 8895588 [details]
Bug 1305237 LoadInfo changes to include all ancestors principals and window IDs,

https://reviewboard.mozilla.org/r/166794/#review187132

::: ipc/glue/BackgroundUtils.cpp:373
(Diff revision 3)
>      RedirectHistoryEntryInfo* entry = redirectChain.AppendElement();
>      rv = RHEntryToRHEntryInfo(redirectEntry, entry);
>      NS_ENSURE_SUCCESS(rv, rv);
>    }
>  
> +  nsTArray<PrincipalInfo> ancestors;

ancestorPrincipals, perhaps?

::: ipc/glue/BackgroundUtils.cpp:374
(Diff revision 3)
>      rv = RHEntryToRHEntryInfo(redirectEntry, entry);
>      NS_ENSURE_SUCCESS(rv, rv);
>    }
>  
> +  nsTArray<PrincipalInfo> ancestors;
> +  for (const nsCOMPtr<nsIPrincipal>& principal : aLoadInfo->Ancestors()) {

Could `const auto& principal` here, I guess.  Not a big deal either way.

::: ipc/glue/BackgroundUtils.cpp:478
(Diff revision 3)
>        RHEntryInfoToRHEntry(entryInfo);
>      NS_ENSURE_SUCCESS(rv, rv);
>      redirectChain.AppendElement(redirectHistoryEntry.forget());
>    }
>  
> +  nsTArray<nsCOMPtr<nsIPrincipal>> ancestors;

ancestorPrincipals

::: netwerk/base/LoadInfo.h:118
(Diff revision 3)
>             bool aInitialSecurityCheckDone,
>             bool aIsThirdPartyRequest,
>             const OriginAttributes& aOriginAttributes,
>             RedirectHistoryArray& aRedirectChainIncludingInternalRedirects,
>             RedirectHistoryArray& aRedirectChain,
> +           nsTArray<nsCOMPtr<nsIPrincipal>>&& aAncestors,

aAncestorPrincipals

::: netwerk/base/LoadInfo.h:173
(Diff revision 3)
>    bool                             mInitialSecurityCheckDone;
>    bool                             mIsThirdPartyContext;
>    OriginAttributes                 mOriginAttributes;
>    RedirectHistoryArray             mRedirectChainIncludingInternalRedirects;
>    RedirectHistoryArray             mRedirectChain;
> +  nsTArray<nsCOMPtr<nsIPrincipal>> mAncestors;

mAncestorPrincipals.

::: netwerk/base/LoadInfo.h:174
(Diff revision 3)
>    bool                             mIsThirdPartyContext;
>    OriginAttributes                 mOriginAttributes;
>    RedirectHistoryArray             mRedirectChainIncludingInternalRedirects;
>    RedirectHistoryArray             mRedirectChain;
> +  nsTArray<nsCOMPtr<nsIPrincipal>> mAncestors;
> +  nsTArray<uint64_t>               mAncestorsOuterWindowIDs;

Should probably be consistent about using mAncestorsOuterWindowIDs/mAncestorsPrincipals or mAncestorOuterWindowIDs/mAncestorPrincipals.  Similar in all the APIs involved.

Personally, I somewhat prefer mAncestorOuterWindowIDs/mAncestorPrincipals, but can go either way as long as it's consistent.

::: netwerk/base/LoadInfo.cpp:33
(Diff revision 3)
>  
>  namespace mozilla {
>  namespace net {
>  
> -static uint64_t
> -FindTopOuterWindowID(nsPIDOMWindowOuter* aOuter)
> +static void
> +GatherTopOuterWindowIDs(nsPIDOMWindowOuter* aOuter, nsTArray<uint64_t>& outerWindowIDs, uint64_t& topOuterWindowID)

Please be consistent about whether your args are prefixed with 'a'.

And I don't see a good reason to stop returning the top window id and turn it into an out param.  You can just have a return value for that, and an out param for the array.

::: netwerk/base/LoadInfo.cpp:122
(Diff revision 3)
>        mParentOuterWindowID = parent ? parent->WindowID() : mOuterWindowID;
> -      mTopOuterWindowID = FindTopOuterWindowID(contextOuter);
> +      GatherTopOuterWindowIDs(contextOuter, mAncestorsOuterWindowIDs, mTopOuterWindowID);
>      }
>  
>      mInnerWindowID = aLoadingContext->OwnerDoc()->InnerWindowID();
> +    mAncestors = Move(aLoadingContext->OwnerDoc()->AncestorPrincipals());

This shouldn't be a Move, since you're not changing the AncestorPrincipals() on the document.

::: netwerk/base/LoadInfo.cpp:282
(Diff revision 3)
>  
>    // get the docshell from the outerwindow, and then get the originattributes
>    nsCOMPtr<nsIDocShell> docShell = aOuterWindow->GetDocShell();
>    MOZ_ASSERT(docShell);
>    mOriginAttributes = nsDocShell::Cast(docShell)->GetOriginAttributes();
> +  mAncestors = Move(nsDocShell::Cast(docShell)->AncestorPrincipals());

Again, not a Move.

::: netwerk/base/LoadInfo.cpp:927
(Diff revision 3)
>  {
>    return mRedirectChain;
>  }
>  
> +const nsTArray<nsCOMPtr<nsIPrincipal>>&
> +LoadInfo::Ancestors()

This needs a better name, like AncestorPrincipals() or something.  Again, make it consistent with the window ID getter.

::: netwerk/base/nsILoadInfo.idl:27
(Diff revision 3)
>  [ref] native nsIRedirectHistoryEntryArray(const nsTArray<nsCOMPtr<nsIRedirectHistoryEntry>>);
>  native OriginAttributes(mozilla::OriginAttributes);
>  [ref] native const_OriginAttributesRef(const mozilla::OriginAttributes);
>  [ref] native StringArrayRef(const nsTArray<nsCString>);
> +[ref] native Uint64ArrayRef(const nsTArray<uint64_t>);
> +[ref] native const_nsIPrincipalArray(const nsTArray<nsCOMPtr<nsIPrincipal>>);

Probably better to call this type PrincipalArrayRef.

::: netwerk/base/nsILoadInfo.idl:648
(Diff revision 3)
>     */
>    [noscript, notxpcom, nostdcall, binaryname(RedirectChain)]
>    nsIRedirectHistoryEntryArray binaryRedirectChain();
>  
>    /**
> +   * An array of nsIPrincipals which stores the all the parent frames not

s/the all the/all the/

Also, this does not store the _frames_.  It stores the principals of the ancestor documents.

::: netwerk/base/nsILoadInfo.idl:652
(Diff revision 3)
>    /**
> +   * An array of nsIPrincipals which stores the all the parent frames not
> +   * including the frame loading this request.  Closest ancestor is at index
> +   * zero, top level ancestor is last index.
> +   *
> +   * The ancestors[0] entry for an iframe will be the iframes parent frame.

s/the iframes/the iframe's/

And the entry will be the principal of the parent document, of course.

::: netwerk/base/nsILoadInfo.idl:653
(Diff revision 3)
> +   * An array of nsIPrincipals which stores the all the parent frames not
> +   * including the frame loading this request.  Closest ancestor is at index
> +   * zero, top level ancestor is last index.
> +   *
> +   * The ancestors[0] entry for an iframe will be the iframes parent frame.
> +   * The ancestors[0] entry for an image loaded in an iframe will be the

Again, principal of the iframe's parent document.

::: netwerk/base/nsILoadInfo.idl:659
(Diff revision 3)
> +   * iframe's parent frame.
> +   *
> +   * Please note that this array has the same lifetime as the
> +   * loadInfo object - use with caution!
> +   */
> +  [noscript, notxpcom, nostdcall, binaryname(Ancestors)]

Why do you need the binaryName game here?  Just give it the name you want.

::: netwerk/base/nsILoadInfo.idl:666
(Diff revision 3)
> +
> +
> +  /**
> +   * An array of outer window IDs, which stores IDs for all the parent frames not
> +   * including the frame loading this request.  These IDs match the
> +   * corresponding index in ancestors above.

s/ancestors/ancestorPrincipals/ (or ancestorsPrincipals), right?

::: netwerk/ipc/NeckoChannelParams.ipdlh:62
(Diff revision 3)
>    bool                        initialSecurityCheckDone;
>    bool                        isInThirdPartyContext;
>    OriginAttributes            originAttributes;
>    RedirectHistoryEntryInfo[]  redirectChainIncludingInternalRedirects;
>    RedirectHistoryEntryInfo[]  redirectChain;
> +  PrincipalInfo[]             ancestors;

ancestorPrincipals or ancestorsPrincipals.

And this needs to be documented.

::: netwerk/ipc/NeckoChannelParams.ipdlh:63
(Diff revision 3)
>    bool                        isInThirdPartyContext;
>    OriginAttributes            originAttributes;
>    RedirectHistoryEntryInfo[]  redirectChainIncludingInternalRedirects;
>    RedirectHistoryEntryInfo[]  redirectChain;
> +  PrincipalInfo[]             ancestors;
> +  uint64_t[]                  ancestorsOuterWindowIDs;

Again, needs to be documented.
Attachment #8895588 - Flags: feedback?(bzbarsky) → feedback+
Attachment #8895589 - Flags: feedback?(kmaglione+bmo)
(In reply to Kris Maglione [:kmag] (long backlog; ping on IRC if you're blocked) from comment #65)
> ::: toolkit/modules/addons/WebRequest.jsm:743
> (Diff revision 3)
> >      };
> >  
> > +    let {ancestorInfo} = channel;
> > +    let frameAncestors = ancestorInfo.length > 0 ? ancestorInfo : [];
> > +    if (data.type === "sub_frame") {
> > +      frameAncestors.unshift({
> 
> This is going to give the wrong results when we have no load info, and imply
> a top-level frame.

It should be impossible to have no loadinfo and type sub_frame.  ChannelWrapper::Type returns MozContentPolicyType::Other in that situation.  However I'm changing this (and below) to be set in ChannelWrapper.

> ::: toolkit/modules/addons/WebRequest.jsm:752
> (Diff revision 3)
> > +    }
> > +    if (frameAncestors.length > 0) {
> > +      // In a request in some subframe, we're not going to know the original
> > +      // windowId of the top level request, but it is always the last entry
> > +      // in the chain.  Just set it to zero here.
> > +      frameAncestors[frameAncestors.length - 1].frameId = 0;
> 
> This will give a doubly wrong answer when we have no load info, and even
> more strongly imply that this is a top-level frame.

The last ancestor is always the top-level frame.
Comment on attachment 8895588 [details]
Bug 1305237 LoadInfo changes to include all ancestors principals and window IDs,

https://reviewboard.mozilla.org/r/166794/#review187384

r=me on the code, but there are some comment nits.

::: netwerk/base/nsILoadInfo.idl:648
(Diff revision 4)
>     */
>    [noscript, notxpcom, nostdcall, binaryname(RedirectChain)]
>    nsIRedirectHistoryEntryArray binaryRedirectChain();
>  
>    /**
> +   * An array of nsIPrincipals which stores the all the parent frames not

s/the all the/the/

::: netwerk/base/nsILoadInfo.idl:648
(Diff revision 4)
>     */
>    [noscript, notxpcom, nostdcall, binaryname(RedirectChain)]
>    nsIRedirectHistoryEntryArray binaryRedirectChain();
>  
>    /**
> +   * An array of nsIPrincipals which stores the all the parent frames not

This is storing principals, not frames....

::: netwerk/base/nsILoadInfo.idl:652
(Diff revision 4)
>    /**
> +   * An array of nsIPrincipals which stores the all the parent frames not
> +   * including the frame loading this request.  Closest ancestor is at index
> +   * zero, top level ancestor is last index.
> +   *
> +   * The ancestors[0] entry for an iframe will be the principal of the iframe's parent.

"for an iframe load", right?

And perhaps "the principal of the iframe element's owner document"?

::: netwerk/base/nsILoadInfo.idl:653
(Diff revision 4)
> +   * An array of nsIPrincipals which stores the all the parent frames not
> +   * including the frame loading this request.  Closest ancestor is at index
> +   * zero, top level ancestor is last index.
> +   *
> +   * The ancestors[0] entry for an iframe will be the principal of the iframe's parent.
> +   * The ancestors[0] entry for an image loaded in an iframe will be the

Again, principal of the iframe element's owner document?

::: netwerk/base/nsILoadInfo.idl:655
(Diff revision 4)
> +   * zero, top level ancestor is last index.
> +   *
> +   * The ancestors[0] entry for an iframe will be the principal of the iframe's parent.
> +   * The ancestors[0] entry for an image loaded in an iframe will be the
> +   * principal of the iframe's parent.
> +   *

This comment might benefit from a pointer to the documentation for nsIDocument::AncestorPrincipals.

::: netwerk/base/nsILoadInfo.idl:665
(Diff revision 4)
> +  PrincipalArrayRef AncestorPrincipals();
> +
> +
> +  /**
> +   * An array of outer window IDs, which stores IDs for all the parent frames not
> +   * including the frame loading this request.  These IDs match the

The concept of "frame loading this request" is a bit vague.  Maybe:

    which stores IDs for all the ancestors of the window the request is associated with
    
or something?

::: netwerk/base/nsILoadInfo.idl:666
(Diff revision 4)
> +
> +
> +  /**
> +   * An array of outer window IDs, which stores IDs for all the parent frames not
> +   * including the frame loading this request.  These IDs match the
> +   * corresponding index in ancestorPrincipals above.

I don't really like this phrasing (makes it sound like the ids are indices into ancestorPrincipals), but I'm not sure how to improve it.  Maybe:

    These IDs are ordered the same way as ancestorPrincipals above
    
if that's what we're getting at?
Attachment #8895588 - Flags: review?(bzbarsky) → review+
Comment on attachment 8895589 [details]
Bug 1305237 Expose frameAncestors to webextensions,

https://reviewboard.mozilla.org/r/166796/#review187358

Submitting these comments now since the patch got updated in the middle of my review...

::: dom/webidl/ChannelWrapper.webidl:139
(Diff revision 4)
>  
> +  /**
> +   * Returns the an array of objects that combine the url and frameId from the
> +   * ancestorPrincipals and ancestorOuterWindowIDs on loadInfo.
> +   * The immediate parent is the first entry, the last entry is always the top level
> +   * frame.  frameAncestors will be null for top level frames and requests within

This should probably only be null for requests that don't happen in a frame, or don't have LoadInfo. For top-level frames, it should be an empty array.

::: dom/webidl/ChannelWrapper.webidl:142
(Diff revision 4)
> +   * ancestorPrincipals and ancestorOuterWindowIDs on loadInfo.
> +   * The immediate parent is the first entry, the last entry is always the top level
> +   * frame.  frameAncestors will be null for top level frames and requests within
> +   * the top level frame (ie. where the originUrl is the top level frame).
> +   */
> +  [Cached, GetterThrows, Pure]

Nit: [Frozen]

::: dom/webidl/ChannelWrapper.webidl:171
(Diff revision 4)
> +  required ByteString url;
> +  required unsigned long long frameId;

Nit: Please add doc comments.
Blocks: 1402096
Comment on attachment 8895589 [details]
Bug 1305237 Expose frameAncestors to webextensions,

https://reviewboard.mozilla.org/r/166796/#review187668

Close. Mostly just minor issues at this point.

::: toolkit/components/extensions/test/mochitest/test_ext_webrequest_frameId.html:95
(Diff revision 5)
> +    depth: 2,
> +    origin: "file_simple_xhr_frame2.html",
> +    parent: "file_simple_xhr_frame.html",
>    },
> -  // This is loaded in a sandbox iframe.
> +  // Last frame tests content policy frame ancestors.
> +  "plain,webRequestTest": { // filename logic in checkDetails makes this weird.

Maybe fix the filename logic, then? :)

::: toolkit/components/extensions/test/mochitest/test_ext_webrequest_frameId.html:134
(Diff revision 5)
>    let url = new URL(details.url);
>    let filename = url.pathname.split("/").pop();
>    let expect = expected[filename];
>    is(expect.type, details.type, `${details.type} type matches`);
> +  if (details.parentFrameId == -1) {
> +    is(details.frameAncestors, undefined, "no ancestors for main_frame requests");

Per above, this should probably be an empty array except for requests that don't have LoadInfo or aren't tied to a frame at all.

::: toolkit/components/extensions/webrequest/ChannelWrapper.cpp:491
(Diff revision 5)
> +
> +  nsTArray<dom::MozFrameAncestorInfo> frameAncestors;
> +  const nsTArray<nsCOMPtr<nsIPrincipal>>& ancestorPrincipals = loadInfo->AncestorPrincipals();
> +  const nsTArray<uint64_t>& ancestorOuterWindowIDs = loadInfo->AncestorOuterWindowIDs();
> +  uint32_t size = ancestorPrincipals.Length();
> +  MOZ_ASSERT(size == ancestorOuterWindowIDs.Length());

Please make this a diagnostic assert, and also throw if lengths are unequal for release branches.

::: toolkit/components/extensions/webrequest/ChannelWrapper.cpp:495
(Diff revision 5)
> +  uint32_t size = ancestorPrincipals.Length();
> +  MOZ_ASSERT(size == ancestorOuterWindowIDs.Length());
> +
> +  bool subFrame = loadInfo->GetExternalContentPolicyType() == nsIContentPolicy::TYPE_SUBDOCUMENT;
> +  if (!frameAncestors.SetCapacity(subFrame ? size : size + 1, mozilla::fallible)) {
> +    aRv.Throw(NS_ERROR_UNEXPECTED);

Nit: `NS_ERROR_OUT_OF_MEMORY`

::: toolkit/components/extensions/webrequest/ChannelWrapper.cpp:504
(Diff revision 5)
> +  // The immediate parent is always the first element in the ancestor arrays, however
> +  // SUBDOCUMENTs do not have their immediate parent included, so we inject it here.
> +  // This will force wrapper.parentWindowId == wrapper.frameAncestors[0].frameId to
> +  // always be true.  All ather requests already match this way.
> +  if (subFrame) {
> +    MozFrameAncestorInfo ancestor;

Nit: `auto ancestor = frameAncestors.AppendElement();`

::: toolkit/components/extensions/webrequest/ChannelWrapper.cpp:512
(Diff revision 5)
> +    frameAncestors.AppendElement(Move(ancestor));
> +  }
> +
> +  nsresult rv;
> +  for (uint32_t i = 0; i < size; ++i) {
> +    MozFrameAncestorInfo ancestor;

Nit: `auto ancestor = frameAncestors.AppendElement();`

::: toolkit/components/extensions/webrequest/ChannelWrapper.cpp:514
(Diff revision 5)
> +    if (NS_FAILED(ancestorPrincipals[i]->GetURI(getter_AddRefs(uri))) || !uri) {
> +      aRv.Throw(NS_ERROR_UNEXPECTED);

We should probably just add a private `GetFrameAncestors` method that returns a nsresult rather than taking an `ErrorResult`. Then we can use `MOZ_TRY` here and elsewhere, and have the wrapper do `aRv.Throw(rv)` if it fails.

::: toolkit/components/extensions/webrequest/ChannelWrapper.cpp:519
(Diff revision 5)
> +    if (NS_FAILED(ancestorPrincipals[i]->GetURI(getter_AddRefs(uri))) || !uri) {
> +      aRv.Throw(NS_ERROR_UNEXPECTED);
> +      return;
> +    }
> +    rv = uri->GetSpec(ancestor.mUrl);
> +    uri.forget();

This is a leak.

::: toolkit/components/extensions/webrequest/ChannelWrapper.cpp:527
(Diff revision 5)
> +  if (frameAncestors.Length() > 0) {
> +    aFrameAncestors.SetValue() = frameAncestors;
> +  }

I think we want to set this whenever we have LoadInfo and the request comes from a frame.

Also, this should be `aFrameAncestors.SetValue(Move(frameAncestors));`

::: toolkit/components/extensions/webrequest/ChannelWrapper.cpp:655
(Diff revision 5)
>      rv = NS_GetFinalChannelURI(chan, getter_AddRefs(uri));
>    }
>    if (NS_FAILED(rv)) {
>      aRv.Throw(rv);
>    }
> -  return uri.forget();;
> +  return uri.forget();

Thanks :)
Attachment #8895589 - Flags: review?(kmaglione+bmo)
Comment on attachment 8895589 [details]
Bug 1305237 Expose frameAncestors to webextensions,

https://reviewboard.mozilla.org/r/166796/#review187736

::: dom/webidl/ChannelWrapper.webidl:143
(Diff revision 6)
> +   * The immediate parent is the first entry, the last entry is always the top level
> +   * frame.  frameAncestors will be null for requests that do not happen in a
> +   * frame.  It will be an empty list for top level frames and requests within
> +   * the top level frame (ie. where the originUrl is the top level frame).
> +   */
> +  [Cached, GetterThrows, Pure, Frozen]

Nit: Please keep attributes sorted.

::: toolkit/components/extensions/webrequest/ChannelWrapper.cpp:488
(Diff revision 6)
> +  nsTArray<dom::MozFrameAncestorInfo> frameAncestors;
> +  nsresult rv = GetFrameAncestors(loadInfo, frameAncestors);

Nit: No need for the temporary. Just `GetFrameAncestors(loadInfo, aFrameAncestors.SetValue())`

::: toolkit/components/extensions/webrequest/ChannelWrapper.cpp:493
(Diff revision 6)
> +  } else {
> +    aFrameAncestors.SetValue(Move(frameAncestors));
> +  }

Nit: No need for the else.

::: toolkit/components/extensions/webrequest/ChannelWrapper.cpp:510
(Diff revision 6)
> +  if (size != ancestorOuterWindowIDs.Length()) {
> +    return NS_ERROR_UNEXPECTED;
> +  }
> +
> +  bool subFrame = aLoadInfo->GetExternalContentPolicyType() == nsIContentPolicy::TYPE_SUBDOCUMENT;
> +  if (!aFrameAncestors.SetCapacity(subFrame ? size : size + 1, mozilla::fallible)) {

Nit: No need for `mozilla::`
Attachment #8895589 - Flags: review?(kmaglione+bmo) → review+
Comment on attachment 8895589 [details]
Bug 1305237 Expose frameAncestors to webextensions,

https://reviewboard.mozilla.org/r/166796/#review188438

r=m on the webidl bits with the comment isssues below fixed.

::: dom/webidl/ChannelWrapper.webidl:136
(Diff revision 9)
>  
>    [Cached, Pure]
>    readonly attribute nsISupports? browserElement;
>  
> +  /**
> +   * Returns the an array of objects that combine the url and frameId from the

s/the an/an/

::: dom/webidl/ChannelWrapper.webidl:139
(Diff revision 9)
>  
> +  /**
> +   * Returns the an array of objects that combine the url and frameId from the
> +   * ancestorPrincipals and ancestorOuterWindowIDs on loadInfo.
> +   * The immediate parent is the first entry, the last entry is always the top level
> +   * frame.  frameAncestors will be null for requests that do not happen in a

Actually, it will be null for requests without a loadInfo, which is completely unrelated to anything to do with "frames" as far as I can tell.  Furthermore, the long-term plan is to not have a concept of requests without a loadinfo.

Given that, I think this API should just return empty list for no loadinfo and never return null.

::: dom/webidl/ChannelWrapper.webidl:140
(Diff revision 9)
> +  /**
> +   * Returns the an array of objects that combine the url and frameId from the
> +   * ancestorPrincipals and ancestorOuterWindowIDs on loadInfo.
> +   * The immediate parent is the first entry, the last entry is always the top level
> +   * frame.  frameAncestors will be null for requests that do not happen in a
> +   * frame.  It will be an empty list for top level frames and requests within

The use of "frame" here is pretty confusing, because the term is so overloaded.  How about:

  It will be an empty list for toplevel window loads and non-subdocument resource loads within a toplevel window.  For the latter, originURL will provide information on what window is doing the load.
  
or something?

::: dom/webidl/ChannelWrapper.webidl:175
(Diff revision 9)
> +
> +/**
> + * MozFrameAncestorInfo combines loadInfo::AncestorPrincipals with
> + * loadInfo::AncestorOuterWindowIDs for easier access in the WebRequest API.
> + *
> + * url is the principal URI for the parent of the loading frame.

Given that we're likely to want to get rid of the ill-defined concept of "principal URI" (which has no equivalent in web specs), I'm not sure whether we should advertise this as such here.

That said, I'm not sure what the right thing to call this URI is.  Conceptually, the algorithm at https://w3c.github.io/webappsec-referrer-policy/#determine-requests-referrer might be what we want, but it's kinda bunk-ish for about:blank iframes and whatnot.  Maybe just say that this is a URL representing the entity responsible for loading the previous document in the chain or something.  Not sure.
Attachment #8895589 - Flags: review+
(In reply to Boris Zbarsky [:bz] (still digging out from vacation mail) from comment #82)
> ::: dom/webidl/ChannelWrapper.webidl:139
> (Diff revision 9)
> > +  /**
> > +   * Returns the an array of objects that combine the url and frameId from the
> > +   * ancestorPrincipals and ancestorOuterWindowIDs on loadInfo.
> > +   * The immediate parent is the first entry, the last entry is always the top level
> > +   * frame.  frameAncestors will be null for requests that do not happen in a
> 
> Actually, it will be null for requests without a loadInfo, which is
> completely unrelated to anything to do with "frames" as far as I can tell.
> Furthermore, the long-term plan is to not have a concept of requests without
> a loadinfo.
> 
> Given that, I think this API should just return empty list for no loadinfo
> and never return null.

I asked for this to be nullable because returning an array of ancestor frames
seemed misleading for requests that don't belong to a document. I don't think we
currently have any requests that belong to DOM windows but don't have load info,
so no load info should imply that the request doesn't belong to a window.

But I agree that this comment is misleading. I'd intended that we'd also return
null when the load info did not have a window ID.

Does that behavior sound OK to you?
Flags: needinfo?(bzbarsky)
That seems reasonable, yes, along with better documentation.
Flags: needinfo?(bzbarsky)
Comment on attachment 8895589 [details]
Bug 1305237 Expose frameAncestors to webextensions,

https://reviewboard.mozilla.org/r/166796/#review188950

::: toolkit/modules/tests/browser/browser_WebRequest_ancestors.js:5
(Diff revision 10)
> +"use strict";
> +
> +var { interfaces: Ci, classes: Cc, utils: Cu, results: Cr } = Components;
> +
> +const XMLHttpRequest = Components.Constructor("@mozilla.org/xmlextras/xmlhttprequest;1", "nsIXMLHttpRequest");

Nit: `Cu.importGlobalProperties(["XMLHttpRequest"]);`

::: toolkit/modules/tests/browser/browser_WebRequest_ancestors.js:9
(Diff revision 10)
> +function getDeferred() {
> +  let deferred = {};
> +
> +  deferred.promise = new Promise((resolve, reject) => {
> +    deferred.resolve = resolve;
> +    deferred.reject = reject;
> +  });
> +
> +  return deferred;
> +}

Nit: May as well use PromiseUtils for this.
Pushed by mixedpuppy@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/fd1d81b93380
LoadInfo changes to include all ancestors principals and window IDs, r=bz
https://hg.mozilla.org/integration/autoland/rev/d0d30a90efa1
Expose frameAncestors to webextensions, r=bz,kmag
https://hg.mozilla.org/mozilla-central/rev/fd1d81b93380
https://hg.mozilla.org/mozilla-central/rev/d0d30a90efa1
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Depends on: 1403932
Backed out for causing bug 1403932.
https://hg.mozilla.org/mozilla-central/rev/307a7a34013060a6a1e87dfbb911f058d0781a2e
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla58 → ---
Comment on attachment 8895588 [details]
Bug 1305237 LoadInfo changes to include all ancestors principals and window IDs,

This version changes the gathering of the outerWindowIDs to match that of the principals.  This ensures the two are always in sync and the same length.  The item I am unsure of is retrieving the windowID in nsFrameLoader, whether I can depend on the window always being there at this point.
Attachment #8895588 - Flags: review+ → review?(bzbarsky)
The backout conflicted with the merge of bug 1402944. I think I resolved the conflicts, but it wouldn't hurt to make sure I got it right.
(In reply to Wes Kocher (:KWierso) from comment #97)
> The backout conflicted with the merge of bug 1402944. I think I resolved the
> conflicts, but it wouldn't hurt to make sure I got it right.

Looks like it mostly came out OK. Thanks. The MozFrameAncestorInfo dictionary wound up left in ChannelWrapper.webidl[1], but it won't cause any harm until this re-lands.

[1]: https://hg.mozilla.org/mozilla-central/annotate/e6c32278f32c/dom/webidl/ChannelWrapper.webidl#l377
Comment on attachment 8895588 [details]
Bug 1305237 LoadInfo changes to include all ancestors principals and window IDs,

https://reviewboard.mozilla.org/r/166794/#review189990

::: netwerk/base/LoadInfo.cpp:121
(Diff revision 8)
> +    mAncestorPrincipals = aLoadingContext->OwnerDoc()->AncestorPrincipals();
> +    mAncestorOuterWindowIDs = aLoadingContext->OwnerDoc()->AncestorOuterWindowIDs();

We should probably also assert that these arrays are the same length when we create the LoadInfo if we're going to keep the assertion in ChannelWrapper, since that increases the odds of it triggering a content process crash rather than a parent process crash when there's a mismatch.

And I do think we should keep that assertion, since a mismatch means we're delivering (potentially dangerously) inaccurate request information there, and just throwing could mean we (potentially dangerously) fail to notify request observers.
Yeah, the assert in loadInfo is good, it shows up as a tab crash rather than a firefox crash.
Attachment #8895588 - Flags: review?(bzbarsky)
Comment on attachment 8895588 [details]
Bug 1305237 LoadInfo changes to include all ancestors principals and window IDs,

https://reviewboard.mozilla.org/r/166794/#review190018

::: dom/base/nsFrameLoader.cpp:2726
(Diff revision 9)
>      nsDocShell::Cast(mDocShell)->SetAncestorPrincipals(Move(ancestorPrincipals));
> +
> +    // Repeat for outer window IDs.
> +    nsTArray<uint64_t> ancestorOuterWindowIDs;
> +    ancestorOuterWindowIDs = doc->AncestorOuterWindowIDs();
> +    ancestorOuterWindowIDs.InsertElementAt(0, doc->GetWindow()->WindowID());

This should be safe, yes.  You can't get elements in a parent document like this without that document having a window.
Attachment #8895588 - Flags: review?(bzbarsky) → review+
(In reply to Boris Zbarsky [:bz] (still digging out from vacation mail) from comment #103)
> Comment on attachment 8895588 [details]

> > +    ancestorOuterWindowIDs.InsertElementAt(0, doc->GetWindow()->WindowID());
> 
> This should be safe, yes.  You can't get elements in a parent document like
> this without that document having a window.

Seems that may be a problem:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=72d48e3d5c41&selectedJob=133990989

Try in comment 104 is to see if that is the case.
Ah, I see.  That's cloning a document for printing.  When that happens, the clone gets a docshell, but not a window.

The other thing that's true of those documents is that they do not do any resource loads.  So you could probably just skip this whole block if doc->IsStaticDocument() and that should help with the tests.
Comment on attachment 8895588 [details]
Bug 1305237 LoadInfo changes to include all ancestors principals and window IDs,

https://reviewboard.mozilla.org/r/166794/#review190206

::: dom/base/nsFrameLoader.cpp:2716
(Diff revisions 9 - 10)
>  
>    nsDocShell::Cast(mDocShell)->SetOriginAttributes(attrs);
>  
>    if (!mDocShell->GetIsMozBrowser() &&
> -      parentType == mDocShell->ItemType()) {
> +      parentType == mDocShell->ItemType() &&
> +      !doc->IsStaticDocument()) {

Please add a comment explaining why this non-obvious check.
Pushed by mixedpuppy@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/e05bab140564
LoadInfo changes to include all ancestors principals and window IDs, r=bz
https://hg.mozilla.org/integration/autoland/rev/163a2b0bb0a0
Expose frameAncestors to webextensions, r=bz,kmag
The failures seem intermittent, so you might have to retrigger a few times to catch it on try.
(In reply to Wes Kocher (:KWierso) from comment #114)
> The failures seem intermittent, so you might have to retrigger a few times
> to catch it on try.

It's the favicon slipping through before the test is finished.  I'll reland with a fix to prevent that.
Flags: needinfo?(mixedpuppy)
Pushed by mixedpuppy@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/5fe72402746f
LoadInfo changes to include all ancestors principals and window IDs, r=bz
https://hg.mozilla.org/integration/autoland/rev/96b5d596cc27
Expose frameAncestors to webextensions, r=bz,kmag
Backed out again for those crashtest failrues which are not fixed in the re-landed versions of the patches:
https://hg.mozilla.org/integration/autoland/rev/f713d2b2eacaaf832fdca31b92699bad55e3e8c3
Flags: needinfo?(mixedpuppy)
Specifically, docshell/base/crashtests/403574-1.xhtml and dom/svg/crashtests/1282985-1.svg would hang on a linux opt build.  Changing back to checking the window rather than IsStaticDocument avoided the hang.  Pushing a new review and will run try from there.

bz, any thoughts on this?
Flags: needinfo?(mixedpuppy) → needinfo?(bzbarsky)
Depends on: 1405133
Can you reproduce the failure locally?  Presumably it's crashing and the "hang" is because out e10s crashtest harness is horrible...

I tried reproducing locally in a debug Linux build with https://hg.mozilla.org/integration/autoland/rev/f713d2b2eacaaf832fdca31b92699bad55e3e8c3 reverted, but haven't managed to yet.

If you can reproduce, it's worth seeing what's actually going on in terms of why you're setting up a docshell with no window.
Flags: needinfo?(bzbarsky)
I also can't reproduce with an opt Linux build, fwiw.
I got the hang on ubuntu 16.04 with an opt build.  I had to use --repeat 5 to make that happen, just running the test once always passed.
OK.  I did manage to reproduce running --repeat 5 several times with an opt build.

It looks like the sequence of events is this:

1)  We start unloading the document (in this case this is dom/svg/crashtests/1282985-1.svg I was testing).
2)  We create the new content viewer and call Close() on the previous one.  This effectively sets mWindow
    to null, but we're still showing the old document (which still has a presshell, etc) until we
    start layout on the new one.
3)  While we wait for that to happen, a refresh tick fires and we do frame construction and layout.
4)  Frame construction for the <svg:use> clones the <use> target, which includes an <iframe> descendant.
    The anonymous clone of the <iframe> starts a load of the src.

We could get here with script inserting a normal <iframe> into the partially-unloaded document at the right moment.

Given that, maybe we should just have nsFrameLoader::MaybeCreateDocShell bail out early with an error if the document has no window in the non-static-document case, just like it does if mOwnerContent->IsInComposedDoc() is false in that case.  That would ensure that we don't do the load at all in this situation, which seems reasonable.
I made that modification, and I am still able to produce the crash.  If I check the window exists where we're creating the ancestors I don't crash.  To test I made the following change in MaybeCreateDocShell:

  if (!(doc->IsStaticDocument() || doc->GetWindow() || mOwnerContent->IsInComposedDoc())) {
    return NS_ERROR_UNEXPECTED;
  }
Flags: needinfo?(bzbarsky)
That's the wrong condition.  Your condition will return in _fewer_ cases than before, because the existing condition is equivalent to:

  if (!doc->IsStaticDocument() && !mOwnerContent->IsInComposedDoc())

while your modified condition is:

  if (!doc->IsStaticDocument() && !doc->GetWindow() && !mOwnerContent->IsInComposedDoc())) {

You want to either add a new if like so:

  if (!(doc->IsStaticDocument() || doc->GetWindow())) {

or rewrite the existing one like so:

  if (!doc->IsStaticDocument() && (!doc->GetWindow() || !mOwnerContent->IsInComposedDoc())) {

the idea being that for non-staticdocument cases we should not be doing loads if either our owner content is not in the document or the document is partially torn down.
Flags: needinfo?(bzbarsky)
And to be clear, I think I prefer the "rewrite the existing one like so" approach, with some comments.  Probably worth asking someone other than me (e.g. mystor?) to review that piece.
Comment on attachment 8895588 [details]
Bug 1305237 LoadInfo changes to include all ancestors principals and window IDs,

mystor: r? primarily for the change in MaybeCreateDocShell.
Attachment #8895588 - Flags: review?(nika)
Comment on attachment 8895588 [details]
Bug 1305237 LoadInfo changes to include all ancestors principals and window IDs,

https://reviewboard.mozilla.org/r/166794/#review193164

Seems fine to me :-) - r=me.

::: dom/base/nsFrameLoader.cpp:2723
(Diff revision 13)
> +  // the document is cloned with a docshell that has no window.  We check
> +  // that the window exists to ensure we don't try to gather ancestors for
> +  // those cases.
>    if (!mDocShell->GetIsMozBrowser() &&
> -      parentType == mDocShell->ItemType()) {
> +      parentType == mDocShell->ItemType() &&
> +      !doc->IsStaticDocument()) {

Can we double-check that doc->GetWindow() is non-null here? There are a lot of lines between this and the previous check that we had a window.
Attachment #8895588 - Flags: review?(nika) → review+
Pushed by mixedpuppy@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/ebd2ef6abc6a
LoadInfo changes to include all ancestors principals and window IDs, r=bz,mystor
https://hg.mozilla.org/integration/autoland/rev/cbe84780624c
Expose frameAncestors to webextensions, r=bz,kmag
https://hg.mozilla.org/mozilla-central/rev/ebd2ef6abc6a
https://hg.mozilla.org/mozilla-central/rev/cbe84780624c
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
I've added a bit on this: https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/webRequest/onBeforeRequest

Please let me know if this covers it.
Flags: needinfo?(mixedpuppy)
lgtm
Flags: needinfo?(mixedpuppy)
Is manual testing required on this bug? If Yes, please provide some STR and the proper webextension(if required), if No set the “qe-verify-“ flag.
Flags: needinfo?(mixedpuppy)
Flags: needinfo?(mixedpuppy) → qe-verify-
Depends on: 1420639
But it seems that frameAncestors is also available in onHeadersReceived [1]?

[1] https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/webRequest/onHeadersReceived
Flags: needinfo?(wbamberg)
Flags: needinfo?(mixedpuppy)
> it seems that frameAncestors is also available in onHeadersReceived [1]?

From what I can tell it's available in all events. Shane, can you confirm that this is what you would expect, and I'll update the docs?
Flags: needinfo?(wbamberg)
They should be available to all events.
Flags: needinfo?(mixedpuppy)
Product: Toolkit → WebExtensions
See Also: → 1602005
You need to log in before you can comment on or make changes to this bug.