Closed Bug 1454190 Opened 7 years ago Closed 7 years ago

Need a way to make requests without Origin header for cross-origin resources

Categories

(WebExtensions :: Untriaged, defect)

x86
Windows 10
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: dw-dev, Unassigned)

Details

Attachments

(7 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:59.0) Gecko/20100101 Firefox/59.0
Build ID: 20180310025718

Steps to reproduce:

I am the developer of the Save Page WE add-on. In order to save a page and all of its resources, Save Page WE has to load the page's resources again.

Usually, Save Page WE loads the resources by making GET requests using XMLHttpRequest in the background script. This works for all resources that do not require a Referer header.

If a request fails, Save Page WE assumes that this is because a Referer header was required, and makes another GET request using content.XMLHttpRequest in the content script running in the page. This works for same-origin resources, but NOT for cross-origin resources.

My testing was done with Firefox Nightly 61.0a1, Save Page WE 9.5 and this page: https://www.pixiv.net/

I have compared the request and response headers for one of the failing resources, when the browser does the resource load (e.g. when the page is loaded) and when Save Page WE does the resource load (see results below). The only significant difference seems to be that when Save Page WE does the resource load, an Origin header is sent in the request. When the browser does the resource load, this header is not sent.

My understanding is that content.XMLHttpRequest is meant to do a request in exactly the same way as the browser does. So this appears to be a bug in content.XMLHttpRequest - or am I missing something?

Browser Resource Load
---------------------

Request Headers

Host: i.pximg.net
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://www.pixiv.net/
Connection: keep-alive

Response Headers

HTTP/2.0 200 OK
server: nginx
date: Fri, 13 Apr 2018 17:58:53 GMT
content-type: image/jpeg
content-length: 728297
last-modified: Tue, 07 Apr 2015 01:31:23 GMT
expires: Sat, 13 Apr 2019 06:19:29 GMT
cache-control: max-age=31536000
x-content-type-options: nosniff
age: 41443
via: http/1.1 f001 (second)
accept-ranges: bytes
X-Firefox-Spdy: h2

Save Page WE Resource Load
--------------------------

Request Headers

Host: i.pximg.net
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://www.pixiv.net/
Origin: https://www.pixiv.net
Connection: keep-alive

Response Headers

HTTP/2.0 200 OK
server: nginx
date: Fri, 13 Apr 2018 18:02:05 GMT
content-type: image/jpeg
content-length: 323329
last-modified: Wed, 11 Mar 2015 06:07:15 GMT
expires: Sat, 13 Apr 2019 12:59:16 GMT
cache-control: max-age=31536000
x-content-type-options: nosniff
age: 18048
via: http/1.1 f005 (second)
accept-ranges: bytes
X-Firefox-Spdy: h2

Actual results:

The GET request using content.XMLHttpRequest does not complete normally, and the onload() event handler is not fired. Instead the onerror() event handler is fired, but the status shown in the Developer Tools Network panel is '200' (OK). Sometimes the resource is partially loaded.

Expected results:

The GET request using content.XMLHttpRequest should complete normally, and the onload() event handler should be fired.
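The header comparison in this report can be reproduced mechanically. The following is a minimal sketch, not part of Save Page WE: `parseHeaders` and `diffHeaders` are hypothetical helper names, and the two dumps are abbreviated copies of the request headers listed above.

```javascript
// Parse a "Name: value" header dump (one header per line) into a Map keyed
// by lower-cased header name.
function parseHeaders(dump) {
  const headers = new Map();
  for (const line of dump.split('\n')) {
    const idx = line.indexOf(':');
    if (idx <= 0) continue; // skip blank or malformed lines
    const name = line.slice(0, idx).trim().toLowerCase();
    headers.set(name, line.slice(idx + 1).trim());
  }
  return headers;
}

// Return the sorted names of headers that are missing on one side or whose
// values differ between the two sides.
function diffHeaders(a, b) {
  const names = new Set([...a.keys(), ...b.keys()]);
  return [...names].filter((name) => a.get(name) !== b.get(name)).sort();
}

// Abbreviated request headers from the browser's own load of the image.
const browserLoad = parseHeaders(`Host: i.pximg.net
Accept: */*
Referer: https://www.pixiv.net/
Connection: keep-alive`);

// Abbreviated request headers from the content.XMLHttpRequest load.
const addonLoad = parseHeaders(`Host: i.pximg.net
Accept: */*
Referer: https://www.pixiv.net/
Origin: https://www.pixiv.net
Connection: keep-alive`);

console.log(diffHeaders(browserLoad, addonLoad)); // [ 'origin' ]
```

Pasting the full dumps from the Network panel into the two template strings should work the same way, since only the first colon on each line is treated as the separator.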
CORRECTION

My testing was done with Save Page WE 9.5.1 (attached), which is a slightly modified version of Save Page WE 9.5.

NEW INFORMATION

When Save Page WE saves the test page (https://www.pixiv.net/), there are four images that fail to load (status 403) when initiated from the background script using XMLHttpRequest.

When Save Page WE attempts to load these images from the content script using content.XMLHttpRequest, the two loads of small images (~5KB) succeed, but the two loads of large images (~1MB) do not complete. However, all four loads call the onerror() event handler, and in the Developer Tools Network panel the status is 200.
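The two-stage loading strategy described in this report (background XMLHttpRequest first, content.XMLHttpRequest second) can be sketched as below. This is an illustration only and not the add-on's actual code: `loadWithFallback` and `xhrLoader` are hypothetical names, and in a real extension the two loaders run in different scripts connected by messaging rather than as plain functions in one scope.

```javascript
// Try each loader in order until one succeeds. Each loader has the
// hypothetical signature (url, onSuccess, onFailure).
function loadWithFallback(url, loaders, done) {
  const errors = [];
  function attempt(index) {
    if (index >= loaders.length) {
      done({ ok: false, errors }); // every loader failed
      return;
    }
    loaders[index](
      url,
      (data) => done({ ok: true, data, loaderIndex: index }),
      (reason) => {
        errors.push(reason);
        attempt(index + 1); // fall through to the next loader
      }
    );
  }
  attempt(0);
}

// Wrap an XMLHttpRequest-compatible constructor as a loader. A non-200
// status and the onerror handler are both reported as failures.
function xhrLoader(XhrCtor) {
  return (url, onSuccess, onFailure) => {
    const xhr = new XhrCtor();
    xhr.open('GET', url, true);
    xhr.responseType = 'arraybuffer';
    xhr.onload = () => {
      if (xhr.status === 200) onSuccess(xhr.response);
      else onFailure('status ' + xhr.status);
    };
    xhr.onerror = () => onFailure('network error');
    xhr.send();
  };
}
```

In Firefox the first loader would be built from `XMLHttpRequest` and the second from `content.XMLHttpRequest`. The behaviour reported here corresponds to the first loader failing with "status 403" and the second failing through `onerror`.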
Attached file Save Page WE 9.5.1.zip
Hello,

I tested this issue using the latest version of Nightly 61.0a1 (2018-04-22), and I used the "about:addons" version of "Save Page WE 9.5.1" on https://www.pixiv.net/. When I try to save the page I get the error: "4 out of 17 resources could not be loaded and will not be saved".

Is there a way for me to see how content.XMLHttpRequest runs and that the onload() handler is not fired? Can you provide more steps on how to check the events in the Network panel?

Thanks
Flags: needinfo?(dw-dev)
The error "4 out of 17 resources could not be loaded and will not be saved" happens because the downloads of 4 .jpg images fail.

I have redone my testing so I can provide more information. These are the steps that I used for my testing:

TEST SAVE PAGE WE RESOURCE LOADS

1. Start Firefox Nightly 61.0a1.
2. Install Save Page WE 9.5.2 (new attachment).
3. Load https://www.pixiv.net/ into the selected tab.
4. On main menu, select Tools > Web Developer > Browser Toolbox, and select the Network panel.
5. On main menu, select History > Clear Recent History, select Everything and tick all checkboxes, and click on Clear Now.
6. Right-click on Save Page WE toolbar button and select Save Standard Items.

The test results are attached as three screenshots (A,B,C).

Screenshot A: The first attempts to load the 4 images from the background script fail with status 403. The second attempts to load the 4 images from the content script appear to succeed with status 200, but actually fail and call the XMLHttpRequest onerror handler.

Screenshot B: This shows the Request and Response headers for the second attempt to load one of the images (26339586_p0_master1200.jpg) from the content script. Note, the Request includes both Referer and Origin headers.

Screenshot C: This shows the console log messages. There are messages from the content script when the XMLHttpRequest GETs are initiated, and when the onerror handler is called. The relevant source code lines in the content script (content.js) are lines 1121 to 1186. There are also "Cross-Origin Request Blocked" messages, which do not appear when Firefox loads these images during a normal page load.

TEST FIREFOX RESOURCE LOADS

1. Start Firefox Nightly 61.0a1.
2. Load https://www.pixiv.net/ into the selected tab.
3. On main menu, select Tools > Web Developer > Browser Toolbox, and select the Network panel.
4. On main menu, select History > Clear Recent History, select Everything and tick all checkboxes, and click on Clear Now.
5. Click on the Page Reload icon on the toolbar to reload the page.

The test results are attached as two screenshots (D,E).

Screenshot D: The loads of the 4 images (e.g. 57080648_p0_master1200.jpg) all succeed with status 200.

Screenshot E: This shows the Request and Response headers for the load of one of the images (57080648_p0_master1200.jpg). Note, the Request includes a Referer header, BUT NOT an Origin header.

QUESTIONS

1. Why is there an Origin header in the Request sent by Save Page WE, but not in the Request sent by a Firefox page load?
2. Why do the "Cross-Origin Request Blocked" messages appear in the console when the resource is requested by Save Page WE, but not when the resource is requested by a Firefox page load?

I hope this fully explains the issue.
Flags: needinfo?(dw-dev) → needinfo?(rares.doghi)
Attached file Save Page WE 9.5.2.zip
Attached image Screenshot A.png
Attached image Screenshot B.png
Attached image Screenshot C.png
Attached image Screenshot D.png
Attached image Screenshot E.png
Hello again,

Thanks a lot for the extra steps and attachments; the issue became a lot clearer. I managed to reproduce it on the latest version of Nightly (61.0a1 (2018-04-26)) on Windows 10 using the steps above. I did notice the extra "Origin" header when I use Save Page WE 9.5.2, but not when I simply reload the page.

I will set the Component for this one to Networking. If somebody thinks the component should be something else, please change it to the correct Component.
Status: UNCONFIRMED → NEW
Component: Untriaged → Networking
Ever confirmed: true
Flags: needinfo?(rares.doghi)
OS: Unspecified → Windows 10
Product: Firefox → Core
Hardware: Unspecified → x86
Version: 61 Branch → Trunk
Component: Networking → DOM
XHR is same origin only by default. https://xhr.spec.whatwg.org/#security-considerations I don't know if there are some ways for addons to load cross origin data.
Component: DOM → General
Product: Core → Firefox
Moved to Firefox, since from implementation point of view XHR works as expected, per spec, but perhaps there are APIs for addons which I'm not familiar with. (and couldn't find any good bugzilla component for addons)
Component: General → WebExtensions: Untriaged
Product: Firefox → Toolkit
Summary: content.XMLHttpRequest does not work correctly for cross-origin resources → Need a way to make requests without Origin header for cross-origin resources
The new Summary is exactly right.

In the "XHR and Fetch" section of the "Add-ons | Content Scripts" documentation (https://developer.mozilla.org/en-US/Add-ons/WebExtensions/Content_scripts), it says that:

> From version 58 onwards extensions that need to perform requests that behave as if they were sent by the content itself
> can use content.XMLHttpRequest and content.fetch() instead.

At present, content.XMLHttpRequest does NOT perform (cross-origin) requests that behave as if they were sent by the content itself. Including an Origin header in requests for cross-origin resources is a bug in content.XMLHttpRequest.
There's something here that isn't clear to me. In the original post the comparison is between a browser document loading resources (images via CSS or image tags in HTML) and xhr/fetch. Based on the document load, you're saying that xhr should behave the same. I think that is an incorrect conclusion. Comment 14 suggests the documentation is claiming that this should be so, but again, it is an incorrect conclusion. As smaug pointed out, xhr is behaving correctly.

If you can provide a simple test case (i.e. not your add-on) that directly compares a cross-origin xhr in a web page that loads as a document vs. the use of content.xhr in a content script attached to that page, and show that they differ, then I would consider that a bug. Otherwise, this is a request for something that goes against our security model (which is what I believe it is), so I am closing as invalid.
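A test case of the kind requested here could be as small as the sketch below. It is an assumption about what such a test might look like, not something posted in this bug: `probeXhr` is a hypothetical helper and `IMAGE_URL` is a placeholder for a cross-origin resource.

```javascript
// Issue a GET with the given XMLHttpRequest-compatible constructor and report
// which handler fired, together with the status the XHR object exposes.
function probeXhr(XhrCtor, url, report) {
  const xhr = new XhrCtor();
  xhr.open('GET', url, true);
  xhr.onload = () => report({ event: 'load', status: xhr.status });
  xhr.onerror = () => report({ event: 'error', status: xhr.status });
  xhr.send();
}

// In a script running in the page itself:
//   probeXhr(XMLHttpRequest, IMAGE_URL, console.log);
// In a content script attached to the same page (Firefox only):
//   probeXhr(content.XMLHttpRequest, IMAGE_URL, console.log);
```

If both calls report the same event for the same URL, content.XMLHttpRequest matches the page's own xhr, which is the behaviour the documentation describes.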
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INVALID
This is a good point. I was reading the documentation as referring to browser document load resource requests, but I now understand that it is referring to xhr requests made in the document content (presumably in scripts).

So, my understanding now is that:

- in the background script, an xhr request will not have Origin or Referer headers.
- in a content script, an xhr request will not have Origin or Referer headers.
- in a content script, a content.xhr request will have both Origin and Referer headers.

Is there any way to issue an xhr request with a Referer header, but without an Origin header?
Flags: needinfo?(mixedpuppy)
I forgot to mention that I have also tested Save Page WE 9.5.2 with Chrome. As far as I know, Chrome does not have a content.xhr method, so the xhr method was used instead.

The 4 resources that could not be loaded by the content script using Firefox were successfully loaded by the content script using Chrome.

The tests showed that using Chrome:

- in the background script, an xhr request does not have Origin or Referer headers.
- in a content script, an xhr request does not have an Origin header, BUT DOES have a Referer header.

When sending an xhr request from a content script, why does Firefox not add a Referer header, when Chrome does? This seems to be an inconsistency in the implementation of the APIs.
You could set the referrer with the WebRequest API, just like in Chrome. Alternatively the request origin could be whitelisted by setting the `Access-Control-Allow-Origin: <origin>` and `Access-Control-Allow-Credentials: true` headers during the response, for cross-origin requests initiated from the content.x APIs. In both cases be absolutely sure to set the headers for your own requests only, to avoid security issues.
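The second suggestion, adding CORS headers to the response, might look roughly like the sketch below. `allowOrigin` is a hypothetical helper, and the URL pattern and origin are taken from the test page in this bug as examples. The sketch filters only by URL and request type, so it does NOT restrict the change to the add-on's own requests; that restriction is the part the warning above is about and would need an extra marker of some kind.

```javascript
// Return a copy of the response headers with the two CORS headers set for
// `origin`, replacing any values the server sent for them.
function allowOrigin(responseHeaders, origin) {
  const replaced = new Set([
    'access-control-allow-origin',
    'access-control-allow-credentials',
  ]);
  const headers = responseHeaders.filter(
    (h) => !replaced.has(h.name.toLowerCase())
  );
  headers.push({ name: 'Access-Control-Allow-Origin', value: origin });
  headers.push({ name: 'Access-Control-Allow-Credentials', value: 'true' });
  return headers;
}

// Registration only makes sense in an extension background script that has
// the webRequest and webRequestBlocking permissions plus host permissions.
if (typeof browser !== 'undefined' && browser.webRequest) {
  browser.webRequest.onHeadersReceived.addListener(
    (details) => ({
      responseHeaders: allowOrigin(
        details.responseHeaders,
        'https://www.pixiv.net' // example origin from this bug
      ),
    }),
    { urls: ['https://i.pximg.net/*'], types: ['xmlhttprequest'] },
    ['blocking', 'responseHeaders']
  );
}
```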
Thanks, that is a really helpful suggestion. I use the WebRequest API in another of my add-ons (Print edit WE). How can I distinguish between requests sent by my add-on and requests sent by the browser during document load (or sent by the page's content)?
Flags: needinfo?(mozilla)
The method I'm currently using is to generate a token in the content script, message the background script with the token, the url and the desired referrer, start listening for requests in the background, then make the request from the content script by setting a custom header with the token. This header will be detected and removed by your background script, and the referrer added.

This function is called in the background when the message is sent from the content script. It takes care of a single request:

    function setRequestReferrer(url, referrer, token) {
      const requestCallback = function(details) {
        const headers = details.requestHeaders;
        for (const header of headers) {
          if (
            header.name.toLowerCase() === 'x-token' &&
            header.value === token
          ) {
            // Swap the marker header for the desired Referer, then stop listening.
            headers.splice(headers.indexOf(header), 1);
            headers.push({ name: 'Referer', value: referrer });
            removeCallbacks();
            return {requestHeaders: headers};
          }
        }
      };

      const removeCallbacks = function() {
        window.clearTimeout(timeoutId);
        browser.webRequest.onBeforeSendHeaders.removeListener(requestCallback);
      };
      const timeoutId = window.setTimeout(removeCallbacks, 10000); // 10 seconds

      browser.webRequest.onBeforeSendHeaders.addListener(
        requestCallback,
        { urls: [url], types: ['xmlhttprequest'] },
        ['blocking', 'requestHeaders']
      );
    }

It's a bit cumbersome for such a simple task. You should also use a single listener with a shared token for your entire batch of urls. Let me know if you find a simpler solution.
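The "single listener with a shared token" variant mentioned above could be sketched as follows. This is one possible reading of that suggestion, not the commenter's code: `makeBatchCallback` and the `referrerByUrl` map are hypothetical, and the `x-token` header name is carried over from the function above.

```javascript
// Build one onBeforeSendHeaders callback for a whole batch of URLs.
// `referrerByUrl` is a Map from request URL to the Referer value to send.
function makeBatchCallback(token, referrerByUrl) {
  return function (details) {
    const referrer = referrerByUrl.get(details.url);
    if (referrer === undefined) return {}; // URL is not part of the batch

    const hasToken = details.requestHeaders.some(
      (h) => h.name.toLowerCase() === 'x-token' && h.value === token
    );
    if (!hasToken) return {}; // not one of our requests: leave it untouched

    // Drop the marker header and any existing Referer, then add ours.
    const headers = details.requestHeaders.filter((h) => {
      const name = h.name.toLowerCase();
      return name !== 'x-token' && name !== 'referer';
    });
    headers.push({ name: 'Referer', value: referrer });
    return { requestHeaders: headers };
  };
}

// Registration in the background script would follow the same pattern as
// the single-request function, with one listener for the whole batch:
//   browser.webRequest.onBeforeSendHeaders.addListener(
//     makeBatchCallback(token, referrerByUrl),
//     { urls: [...referrerByUrl.keys()], types: ['xmlhttprequest'] },
//     ['blocking', 'requestHeaders']
//   );
```

Unlike the single-request function, this callback does not remove itself, so the caller would need to call `removeListener` once the batch has finished or timed out.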
Flags: needinfo?(mozilla)
dw-dev: "content.*" is specific to Firefox; it was added as a way to use the content's xhr/etc. Please post a new bug specifically about the difference in the default (not content) xhr between Chrome and Firefox in the content script. Let's see what input we get from someone who might know xhr better.
Flags: needinfo?(mixedpuppy)
Shane,

As requested, I have raised a new Bug 1458183 for this discrepancy in setting a Referer header between Firefox and Chrome.
Product: Toolkit → WebExtensions