Closed Bug 1330600 Opened 7 years ago Closed 7 years ago

webRequest needs to contain additional information about the cause of the download

Categories

(WebExtensions :: Request Handling, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: fdsc, Unassigned)

References

Details

(Whiteboard: [webRequest] investigate)

Attachments

(1 file)

2.55 KB, application/x-xpinstall
Details
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:50.0) Gecko/20100101 Firefox/50.0
Build ID: 20161208153507

Steps to reproduce:

1. Suppose there is an AJAX page that sends requests from time to time, even without user action.

2. The user presses the home button or enters a different address into the address bar.

3. Firefox now issues two different groups of requests: one from the old page and one from the new page (the navigation triggered in step 2). This lasts for a few seconds, while the new page has not loaded yet and the old page is still displayed.


Actual results:

1. Inside a webRequest handler, it seems impossible to tell what kind of request invoked it: whether the request replaces the contents of the tab entirely or is a subrequest.
2. Requests from the old page are impossible to distinguish from requests from the new page.



Expected results:

1. I think that requests that replace the contents of the tab as a whole should be marked with a separate number, and this number should be passed to all webRequest handlers for that tab.
2. A flag is needed to indicate that the request was sent to replace the contents of the tab entirely.
See Also: → 1305237
Flags: needinfo?(mixedpuppy)
I'm not sure what else is needed beyond what the existing API provides: requestId, timeStamp, tabId and frameId. Otherwise there is some missing information I need to understand this bug. If that is the case, could you provide a working code example? If bug 1305237 will solve your use case, then let's DUP this to that bug.
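For reference, a minimal sketch of reading those four fields in a listener. The helper name summarizeRequest is made up for illustration; only the field names and the onBeforeRequest event come from the webRequest API, and the wiring assumes an extension with the "webRequest" permission.

```javascript
// Hypothetical helper: keep only the identifying fields webRequest already provides.
function summarizeRequest(details) {
  const { requestId, timeStamp, tabId, frameId, url } = details;
  return { requestId, timeStamp, tabId, frameId, url };
}

// Wiring: only runs inside an extension, where the `browser` global exists.
if (typeof browser !== "undefined" && browser.webRequest) {
  browser.webRequest.onBeforeRequest.addListener(
    (details) => console.log(summarizeRequest(details)),
    { urls: ["<all_urls>"] }
  );
}
```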
Flags: needinfo?(mixedpuppy) → needinfo?(fdsc)
> could you provide a working code example

How?


> If bug 1305237 will solve your use case
No, by itself it does not solve the problem. But it seems they are trying to solve a similar problem.


Roughly, the idea is this.
1. An entireTabReloadFlag indicates that the request reloads the contents of the entire tab.
2. Each request must carry an additional tabRequestId.
if (entireTabReloadFlag)
then
  tabRequestId = requestId
  or
  tabRequestId = null (for service requests)
else
  tabRequestId must equal the requestId of the request for which entireTabReloadFlag was set



Usage might look something like this:

HUAC.onBeforeRequest = function(response)
{
    // Collect information about all requests.
    HUAC.responses[response.requestId] = {id: response.requestId, mainId: response.tabRequestId, url: response.url};

    // In this branch response.frameId == 0 and response.tabRequestId == response.requestId
    if (response.entireTabReloadFlag)
    {
        HUAC.mainRequests[response.requestId] = {id: response.requestId, url: response.url, isMain: true, requests: [], tabId: response.tabId};

        // Prohibit the request if this URL must not be loaded in a tab.
        if (HUAC.isDeniedTabUrl(response.url, response.tabId))
            return {cancel: true};
    }
    else
    {
        // Collect information about all involved requests in the HUAC.mainRequests entry.
        HUAC.mainRequests[response.tabRequestId].requests.push(response.requestId);

        // Prohibit the request if this URL must not be loaded in a tab showing the tab URL.
        // !!! Here. To make a decision, I need the URL of the tab, not the origin URL.
        if (HUAC.isDeniedUrl(response.url, HUAC.mainRequests[response.tabRequestId].url, response.tabId))
            return {cancel: true};
    }
}
Flags: needinfo?(fdsc) → needinfo?(mixedpuppy)
See Also: 1305237
If I'm understanding you correctly, you should be able to accomplish this without changing the api.

browser.tabs.get(request.tabId).then(tab => {
  ignoreRequest = tab.url != request.originUrl
});

You'd probably also have to take into consideration the request.frameId in case it was a subframe.

Or possibly you could somehow use browser.tabs.onUpdated.addListener(listener) to watch for the url changing.

https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/tabs/onUpdated
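A rough sketch of the tabs.onUpdated idea: keep a map of the last URL reported for each tab so that a request listener can look it up without an asynchronous call. The names tabUrls, recordTabUpdate, forgetTab and knownTabUrl are invented for this example. Note that changeInfo.url is only delivered when the extension has the "tabs" permission (or matching host permissions), and that the map lags behind a navigation that has not yet changed the tab's URL.

```javascript
// Hypothetical cache: tabId -> last URL reported by tabs.onUpdated.
const tabUrls = new Map();

// tabs.onUpdated listeners receive (tabId, changeInfo, tab).
function recordTabUpdate(tabId, changeInfo) {
  // changeInfo.url is only present when the tab's URL has changed.
  if (changeInfo.url !== undefined) {
    tabUrls.set(tabId, changeInfo.url);
  }
}

// tabs.onRemoved listeners receive (tabId, removeInfo).
function forgetTab(tabId) {
  tabUrls.delete(tabId);
}

// Synchronous lookup, usable from a blocking webRequest listener.
function knownTabUrl(tabId) {
  return tabUrls.get(tabId);
}

// Wiring: only runs inside an extension, where the `browser` global exists.
if (typeof browser !== "undefined" && browser.tabs) {
  browser.tabs.onUpdated.addListener(recordTabUpdate);
  browser.tabs.onRemoved.addListener(forgetTab);
}
```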
Flags: needinfo?(mixedpuppy)
No. This is absolutely not what I need.

> browser.tabs.get

This call is asynchronous. To deny the request, I need the information synchronously.

Besides, as I already wrote above, requests should be tied to the tab content that is being loaded. It can happen that a tab is already loading a new document while the old document is still sending requests.

In addition, the URL of the tab may change through window.history.

> https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/tabs/onUpdated

This provides no link to the request, and it probably fires too late.
Flags: needinfo?(mixedpuppy)
I will explain what I want to achieve. My add-on, https://addons.mozilla.org/firefox/addon/http-useragent-cleaner/ allows the user to block, for example, frames.

For example, you can work with Google Mail as usual and, at the same time, if another website loads Google in a frame, you can reset the cookies in that frame or prevent the whole frame from loading.

To do this, I need to know synchronously which tab is loading the request. I need the tab URL, not the origin, and it should be attached to every processed request.
Your request still makes no sense.  

You want a duplication of details.id (the request id) unless it is a service request (and I'm not clear what you mean by that). You can tell what kind of request it is by looking at details.type (you probably only want main_frame and sub_frame types), so this duplication is not necessary.  You can get the same thing with:

["main_frame", "sub_frame"].includes(details.type)

details.originUrl is the URL that initiated the request; it is not an "origin".

types: https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/webRequest/ResourceType
originUrl: https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/webRequest/onBeforeRequest
Flags: needinfo?(mixedpuppy)
From https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/webRequest/onBeforeRequest

> originUrl
>    ... Note that this may not be the same as the URL of the page into which the requested resource will be loaded. For example, if a document triggers a load in a different window through the target attribute of a link, or a CSS document includes an image using the url() functional notation, then this will be the URL of the original document or of the CSS document, respectively

I'm afraid that is not the URL of the first request in the tab. In a window with many frames, this will be the URL of one of the frames.


> You want a duplication of details.id

In the example above, as you can see, details.id has a different role and is not the same as what I'm asking for.

You are confusing data from different requests. I do need a request id, but it is the request id of another request (the first request in the tab).
Flags: needinfo?(mixedpuppy)
Whiteboard: [webRequest] investigate
Ok, based on what I can figure out from your comments, I think the code below is what you are trying to do.  If it does not handle what you need, we need to get really clear about what you are trying to do, not what you think the API should have.


// You may want to preload this map by using tabs.getAllInWindow() for
// each window and loading the map.
let mainRequests = new Map();

function isTopLevelFrame({frameId, parentFrameId}) {
  return frameId == 0 && parentFrameId == -1;
}

// tabs.onRemoved passes the tab id directly, followed by a removeInfo object.
browser.tabs.onRemoved.addListener((tabId, removeInfo) => {
  console.log("tabs.onRemoved", tabId, removeInfo);
  if (mainRequests.has(tabId)) {
    mainRequests.delete(tabId);
  }
});

// Watch for navigation events, this happens prior to the request.
browser.webNavigation.onBeforeNavigate.addListener((details) => {
  console.log("webNavigation.onBeforeNavigate", details);
  if (!isTopLevelFrame(details)) {
    return;
  }
  // save tab data because this is a top level navigation event.
  let requestData = {
    tabUrl: details.url,
    tabId: details.tabId,
    // webNavigation.onCommitted would provide transitionType, but it happens
    // AFTER webRequest.onBeforeRequest.  However if we've tracked the tab
    // before, we know it is a top level navigation event, could be a reload,
    // or the user typed a new url into the urlbar, etc.
    tabReloadFlag: mainRequests.has(details.tabId),
    requests: [],
  };
  mainRequests.set(details.tabId, requestData);
});

// Stop a request.
browser.webRequest.onBeforeRequest.addListener((details) => {
  if (!mainRequests.has(details.tabId)) {
    // was not a navigation request
    return;
  }
  console.log("webRequest.onBeforeRequest", details);

  let requestData = mainRequests.get(details.tabId);

  // Collect information about all involved queries in HUAC.mainRequests cell.
  requestData.requests.push(details); // potentially lots of data, limit what you store

  if (details.type === "main_frame" && requestData.tabReloadFlag) {
    // Prohibit the request, if such URL cannot be loaded in a tab.
    //if (HUAC.isDeniedTabUrl(details.url, details.tabId)) {
    //    return {cancel: true};
    //}
  } else {
    //if (HUAC.isDeniedUrl(details.url, requestData.tabUrl, details.tabId)) {
    //    return {cancel: true};
    //}
  }
}, {urls: ["<all_urls>"]} , ["blocking"]);
Flags: needinfo?(mixedpuppy) → needinfo?(fdsc)
This may be much simpler depending on what you need:

browser.webRequest.onBeforeRequest.addListener((details) => {
  return new Promise(resolve => {
    browser.tabs.get(details.tabId).then(tab => {
      if (details.type === "main_frame" && details.url === tab.url) {
        // Prohibit the request, if such URL cannot be loaded in a tab.
        if (HUAC.isDeniedTabUrl(details.url, details.tabId)) {
            resolve({cancel: true});
        }
      } else {
        if (HUAC.isDeniedUrl(details.url, tab.url, details.tabId)) {
            resolve({cancel: true});
        }
      }
      // Not blocked: settle the promise so the request can continue.
      // (Calling resolve() again after a cancel has no effect.)
      resolve({});
    })
  });
}, {urls: ["<all_urls>"]} , ["blocking"]);
Remember that resolve() must also be called when you don't block the request, as the final resolve({}) above does.
@fdsc, did my latest example provide a path forward?
I still haven't tried it.

As I understand it, the examples do not remove the problem of two different documents loading simultaneously in one tab (AJAX in the old document and the load of the new document).

I also need to check how it works with window.history.
All of this takes time, which is why I haven't tried it yet.

I'll write when I have tried it.
Attached file dev.xpi
I tried what you said.

Unfortunately, your solution is not suitable.
Imagine the scenario:
1. 'yandex.ru' loaded
2. 'mail.ru' url click to load
3. 'yandex.ru' document generated ajax request

It is impossible to tell whether request #3 belongs to mail.ru or to yandex.ru.

Similarly, if the request for 'mail.ru' was blocked by an extension (or canceled by the user), it is also, generally speaking, impossible to tell.


That is, your solution works only until the tab reloads. I'm afraid there may be other difficulties.
Flags: needinfo?(fdsc) → needinfo?(mixedpuppy)
(In reply to fdsc from comment #13)
> Created attachment 8834110 [details]
> dev.xpi

> Unfortunately, your solution is not suitable.
> Imagine the scenario:
> 1. 'yandex.ru' loaded
> 2. 'mail.ru' url click to load
> 3. 'yandex.ru' document generated ajax request
> 
> It is impossible to tell whether request #3 belongs to mail.ru or to
> yandex.ru.
> 

Bug 1311815 adds documentUrl to the details object, which will give you the url that the request was loading into.  In #3 above, documentUrl would be for yandex.ru.  That in combination with the tabId as I illustrated earlier should give you the ability to know the request was for the old page.  Likewise, using webNavigation.onBeforeNavigate you could track that the tab is about to change, and later requests (those with a request id you have already seen) may be invalid.
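A minimal sketch of that combination, assuming details.documentUrl is available as described above. The names pendingNavigation, noteNavigation and classifyRequest are invented for this example, and the comparison is deliberately simplistic: it assumes the navigation is not redirected and does not handle a reload of the same URL.

```javascript
// Hypothetical state: tabId -> URL the tab is about to load.
const pendingNavigation = new Map();

// Feed this from webNavigation.onBeforeNavigate.
function noteNavigation(details) {
  // Only top-level navigations replace the whole tab.
  if (details.frameId === 0) {
    pendingNavigation.set(details.tabId, details.url);
  }
}

// Feed this from webRequest.onBeforeRequest.
function classifyRequest(details) {
  if (details.type === "main_frame") return "navigation";
  if (details.frameId !== 0) return "subframe";
  const target = pendingNavigation.get(details.tabId);
  if (target === undefined || details.documentUrl === undefined) return "unknown";
  // Assumption: no redirect, so the new document's URL equals the navigation URL.
  return details.documentUrl === target ? "new-page" : "old-page";
}

// Wiring: only runs inside an extension, where the `browser` global exists.
if (typeof browser !== "undefined" && browser.webNavigation && browser.webRequest) {
  browser.webNavigation.onBeforeNavigate.addListener(noteNavigation);
  browser.webRequest.onBeforeRequest.addListener(
    (details) => console.log(classifyRequest(details), details.url),
    { urls: ["<all_urls>"] }
  );
}
```

With the scenario quoted above, once the navigation to mail.ru has been noted, an AJAX request whose documentUrl is still the yandex.ru page would be classified as "old-page".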
Flags: needinfo?(mixedpuppy)
Well, I'll wait until the fix for bug 1311815 is released and then test the changes. Perhaps it will work.
Needinfo to let you know that the patch has landed in Nightly, in case you want to test whether it meets your need.
Flags: needinfo?(fdsc)
I tested the documentUrl feature.

Disadvantages:
1. Requests with the "data:" scheme have no documentUrl.
2. I need something slightly different: ideally, the documentUrl of the top-most document in the tab. But I can get it from the data of other requests.
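One possible way to derive the top-most document URL from other requests, as a sketch only. It relies on documentUrl being the URL of the document into which a resource is loaded, so any subresource requested by the top-level frame (frameId 0) reports the top document. The names topDocumentUrls and topDocumentUrlFor are invented for this example, and the lookup returns undefined until such a request has been seen for the tab.

```javascript
// Hypothetical cache: tabId -> documentUrl last seen on a top-frame subresource request.
const topDocumentUrls = new Map();

// Call for every request, e.g. from webRequest.onBeforeRequest.
function topDocumentUrlFor(details) {
  // A subresource requested by the top-level frame reports the top document.
  // main_frame requests are skipped: they have no documentUrl of their own.
  if (details.frameId === 0 && details.type !== "main_frame" && details.documentUrl) {
    topDocumentUrls.set(details.tabId, details.documentUrl);
  }
  return topDocumentUrls.get(details.tabId);
}
```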


Maybe I should file a new bug about "data:" and close this one?
Flags: needinfo?(fdsc)
Yes, open a bug specific to data uris.
Status: UNCONFIRMED → RESOLVED
Closed: 7 years ago
Resolution: --- → INVALID
See Also: → 1344575
Product: Toolkit → WebExtensions