Closed Bug 1058470 Opened 5 years ago Closed 5 years ago

Update blob URL and origin related code

Categories

(Core :: DOM: Core & HTML, defect)

x86
macOS
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla34

People

(Reporter: annevk, Assigned: baku)

References

(Blocks 1 open bug)

Details

(Keywords: dev-doc-needed)

Attachments

(1 file, 5 obsolete files)

Per http://dev.w3.org/2006/webapi/FileAPI/#unicodeBlobURL blob URLs are now supposed to contain their origin.

http://url.spec.whatwg.org/#origin defines how to then extract it.

We might want to remove the API introduced in bug 1046240 after this is fixed.
Attached patch origin.patch (obsolete) — Splinter Review
I have these questions:

1. at the moment we use the principal from the global object. In the spec it's written that we should use the principal of the incumbent settings object. Are they the same in this case?

2. This patch fails to get the origin from some principal. about:blank is one of the pages where the origin is 'null'. Which kind of check should I do in this case?
Attachment #8478946 - Flags: feedback?(bzbarsky)
Comment on attachment 8478946 [details] [diff] [review]
origin.patch

Well, my first question is why the spec is using the incumbent settings object instead of the function's realm....

But if we assume that we do want the incumbent settings object, then it is NOT the same thing as the global's origin.  The incumbent settings object would correspond to the global returned by mozilla::dom::GetIncumbentGlobal.

> Which kind of check should I do in this case?

The spec needs to cover this case, no?

>+  if (scheme == BLOBURI_SCHEME) {

This does not make me terribly happy.  Why can we not check for nsIURIWithPrincipal and then get the principal from it?  Or would that give a different principal?  At the very least this needs documentation explaining why we're doing things that are generally not ok in Gecko code (scheme checks) and slow string manipulation.

>+GetOrigin(nsIPrincipal* aPrincipal, nsACString& aOrigin)

Why is the nsContentUtils method for this not OK here?  _Really_ needs documentation.

>+  if (aPrincipal && !nsContentUtils::IsSystemOrExpandedPrincipal(aPrincipal)) 

Why not expanded principals?  That seems a bit fishy....
Attachment #8478946 - Flags: feedback?(bzbarsky) → feedback+
Attached patch origin.patch (obsolete) — Splinter Review
This patch splits GenerateURIString from AddDataEntry. This seems fine for me because AddDataEntry is used by URL API and nsDOMFile only.

Btw I'll file a bug to remove nsDOMFile::GetInternalUrl because it seems not used at all.
Attachment #8478946 - Attachment is obsolete: true
Attachment #8479806 - Flags: review?(bzbarsky)
Comment on attachment 8479806 [details] [diff] [review]
origin.patch

So why _do_ we want the incumbent global?  I don't think we do.  I think using the Realm of the function makes the most sense here, and we should get the spec fixed accordingly.

That would let us just use "principal", I believe.  Then you wouldn't need the separate GenerateURIString, right?

> URL::GetOrigin(nsString& aOrigin, ErrorResult& aRv) const

Why is this using GetUTFNonNullOrigin?  As far as I can tell, per spec this should return "null" as needed...  I guess that's a preexisting issue, but since we're adding the new GetUTFNonNullOrigin I'd like to understand what's going on here.
Attachment #8479806 - Flags: review?(bzbarsky)
Attached patch origin.patch (obsolete) — Splinter Review
I'll speak with annevk about filing the bug to the spec.
Attachment #8479806 - Attachment is obsolete: true
Attachment #8479975 - Flags: review?(bzbarsky)
Which "bug to the spec" are you talking about here?  Is that about "null" or about the incumbent global bit?  Because the "null" stuff is still present in this patch....
Flags: needinfo?(amarchesini)
(In reply to Boris Zbarsky [:bz] from comment #6)
> Which "bug to the spec" are you talking about here?  Is that about "null" or
> about the incumbent global bit?  Because the "null" stuff is still present
> in this patch....

The incumbent: I think that is the main one. But tomorrow I'll file both or at least I'll speak with annevk about them.
Flags: needinfo?(amarchesini)
I'd really like to get the null thing sorted out before we add this new not-spec-following API.
Attached patch origin.patch (obsolete) — Splinter Review
Attachment #8479975 - Attachment is obsolete: true
Attachment #8479975 - Flags: review?(bzbarsky)
Attachment #8481389 - Flags: review?(bzbarsky)
Comment on attachment 8481389 [details] [diff] [review]
origin.patch

>+    aUri += NS_ConvertUTF16toUTF8(origin);

  AppendUTF16toUTF8(origin, aURI);

>+    aUri += NS_LITERAL_CSTRING("/");

  aURI.Append('/');

Also while we're here:

>   aUri += aScheme;

Should that not just assign to aURI instead of appending?

>   aUri += NS_LITERAL_CSTRING(":");

  aURI.Append(':');

r=me with those nits.
Attachment #8481389 - Flags: review?(bzbarsky) → review+
Keywords: dev-doc-needed
Attached patch origin.patch (obsolete) — Splinter Review
https://tbpl.mozilla.org/?tree=Try&rev=aa26e09dbd9d

If green we can land this patch.
Attachment #8481389 - Attachment is obsolete: true
Actually it breaks a devtools test:

https://tbpl.mozilla.org/?tree=Try&rev=713c06bc9673
Attached patch origin.patchSplinter Review
Attachment #8481408 - Attachment is obsolete: true
Keywords: checkin-needed
Arun, can you make sure the File API specification is fixed per comment 4 and on? Seems it's getting the origin from the wrong object.
Flags: needinfo?(arun)
https://hg.mozilla.org/mozilla-central/rev/dc3942a35b5c
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
(In reply to Boris Zbarsky [:bz] from comment #4)
> Comment on attachment 8479806 [details] [diff] [review]
> origin.patch
> 
> So why _do_ we want the incumbent global?  I don't think we do.  I think
> using the Realm of the function makes the most sense here, and we should get
> the spec fixed accordingly.
> 
> That would let us just use "principal", I believe.  Then you wouldn't need
> the separate GenerateURIString, right?
> 
> > URL::GetOrigin(nsString& aOrigin, ErrorResult& aRv) const
> 
> Why is this using GetUTFNonNullOrigin?  As far as I can tell, per spec this
> should return "null" as needed...  I guess that's a preexisting issue, but
> since we're adding the new GetUTFNonNullOrigin I'd like to understand what's
> going on here.


Happy to make spec. changes, but I'm not clear on why yet. There appear to be two things:

1. The origin of the Blob URL, and
2. What to do for non-HTTP(S) origins, many of which are user-agent proprietary (about:blank here also, although not as proprietary as chrome: or even file:).

For 2. above, that is, the case of non-HTTP(S) origins, well, isn't defined, really: the spec has a note to this effect. And 1. does what Chrome does; we're simply copying their habit of tagging the origin in the Blob URL, and making it a valid part of the scheme data definition.

bzbarsky, can you tell me why using Realm makes more sense than what the spec. currently specifies? Also, can you give me an example where the effective script origin specified by the incumbent settings object at the time the method that created it -- either URL.createObjectURL or URL.createFor -- was called, *differs* from origin determined using GetFunctionRealm of the function?
Flags: needinfo?(arun)
> For 2. above, that is, the case of non-HTTP(S) origins, well, isn't defined, really

It should be, for at least about:blank and data:.  But that's not the part I'm worrying about here.

> 1. The origin of the Blob URL, and

This is the real issue.

> And 1. does what Chrome does; we're simply copying their habit of tagging the origin 

The question is _which_origin_.

> can you tell me why using Realm makes more sense than what the spec. currently specifies?

Because it better matches what ES6 does, imo.

This matters when you have code in window A do windowB.URL.createObjectURL(string).  Should the result have the origin of window A or window B?  The spec right now has it be the _effective_ (i.e. taking document.domain into account; is that really what UAs do?) origin of A.  The Realm would give you the origin of B.

In any case, http://web.mit.edu/bzbarsky/www/testcases/uri/origin-of-blob-urls.html shows that Chrome is most definitely not using the effective script origin, but just the origin.  That's also what we did in this bug, because imo using effective script origin here makes little sense... unless you load the blob: in an iframe of course, in which case it won't be same-origin with you per spec (but will be in Gecko, afaict, because we actually have the blob alias the origin and effective script origin of the document that created it and propagate that to the document loaded from the blob).

If you drop the use of effective script origins, then I suspect the difference between incumbent settings object origin and function Realm origin is not relevant for the web platform.  In fact, there has been a suggestion that the incumbent global go away altogether in favor of the function Realm's global: see https://www.w3.org/Bugs/Public/show_bug.cgi?id=26603

> can you give me an example where the effective script origin specified by the incumbent
> settings object at the time the method that created it -- either URL.createObjectURL or
> URL.createFor -- was called, *differs* from origin determined using GetFunctionRealm of
> the function

Since it's using the _effective_ script origin, any time document.domain got set.
(In reply to Boris Zbarsky [:bz] from comment #18)
> > For 2. above, that is, the case of non-HTTP(S) origins, well, isn't defined, really
> 
> It should be, for at least about:blank and data:.  


I'm not sure that data: URL origin is a well-agreed enough concept on which to try and specify what a blob URL coined from within it might have as origin. We could say something like blob:[opaqueToken]/UUID but I'm not sure. And about:blank also, but I'm not sure blob URLs coined from within are realistic. I suppose effective script origin and incumbent settings object gave us that out of the box.
 
> > 1. The origin of the Blob URL, and
> 
> This is the real issue.


OK, this is a good thing to get right. I now agree that _effective_ script origin probably isn't right. But I'm not sure what is a stable concept that can define blob URL origin. 

It sounds like what's correct is that for HTTP and HTTPS URLs, the origin of a blob URL is the origin of the function's Realm when URL.createObjectURL or URL.createFor are called. Will that hold water?
> I'm not sure that data: URL origin is a well-agreed enough concept

I'm talking about data: URIS that the user just types in the url bar.  For those everyone agrees they have a nonce origin which serializes to "null".

> And about:blank also, but I'm not sure blob URLs coined from within are realistic.

They totally are.  User opens new tab, loads about:blank, opens console, starts experimenting.  This is a very common use case, especially (sadly) in a classroom setting when people are being taught about web technologies.

> I suppose effective script origin and incumbent settings object gave us that out of the
> box.

To be clear, this is well-defined in the spec right now.  You get blob:null/UUID for both the data: and about:blank cases if the origin is a nonce origin.  That's why I said it wasn't an issue.

> But I'm not sure what is a stable concept that can define blob URL origin.

The situation here is moderately sucky, imo.  http://www.whatwg.org/specs/web-apps/current-work/multipage/browsers.html#origin doesn't define the concept of origin of a global, for reasons that escape me, but of course this is the thing that you really want here: the origin of the global of the current Realm.  

Defining the origin of a Window is easy; we'd need to define it for worker globals too.

This is basically something you should work out with hixie.  :(
OK. I
(In reply to Arun K. Ranganathan from comment #21)
> OK. I

(Submitted too early. I'm Cc'ing Hixie. I also think a relevant conversation is https://www.w3.org/Bugs/Public/show_bug.cgi?id=26603 since we what we need is a good underlying model for the origin of the global of the current Realm; currently, ES's Realm and HTML's origin are not connected).
Andrea, is there a reason that you only changed URL.origin and not all the other URLUtils .origin implementations?

That seems a bit odd.  Should't we have put that code inside GetUTFOrigin or whatnot?
Flags: needinfo?(amarchesini)
Blocks: 1101584
Flags: needinfo?(amarchesini)
Blocks: 1101598
Depends on: 1160892
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.