Open Bug 1742589 Opened 4 years ago Updated 4 months ago

Use the sandbox realm for "log.entryAdded" events when entries aren't coming from the default window

Categories

(Remote Protocol :: WebDriver BiDi, task, P2)

task
Points:
8

Tracking

(Not tracked)

People

(Reporter: jdescottes, Unassigned, NeedInfo)

References

(Blocks 1 open bug)

Details

(Whiteboard: [webdriver:backlog])

Attachments

(1 file)

Follow up to Bug 1694143, we should emit the realm in the payload for the log.entryAdded event.

Whiteboard: [webdriver:triage] → [bidi-m3-mvp]
Whiteboard: [bidi-m3-mvp]
Priority: -- → P3
Whiteboard: [bidi-m3-mvp]
Points: --- → 8
No longer blocks: 1724669

Realm support is not something we are going to implement soon. But there is bug 1767260 which will add support for the source field including the browsing context specifier.

Depends on: 1767260
Whiteboard: [bidi-m3-mvp] → [webdriver:backlog]

With bug 1767260 we also got the realm set to the inner window id. But once log entries can be sent out from a sandbox we most likely have to do some updates here.

Would be good to further discuss and better align with the work for script evaluation.

Whiteboard: [webdriver:backlog] → [webdriver:backlog][webdriver:triage]

We should wait until we have proper support for sandbox usage for script evaluation before continuing with this bug.

Depends on: 1770480
Whiteboard: [webdriver:backlog][webdriver:triage] → [webdriver:backlog]

All dependencies seem to have been fixed. To get the realm id added we will have to use the following:

  • For the console listener: the innerWindowID of the message
  • For the console API: the innerID (basically the innerWindowID) of the message can be used

With that ID we would have to retrieve the related Realm from the available realms in the script module.

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #4)

All dependencies seem to have been fixed. To get the realm id added we will have to use the following:

  • For the console listener: the innerWindowID of the message
  • For the console API: the innerID (basically the innerWindowID) of the message can be used

With that ID we would have to retrieve the related Realm from the available realms in the script module.

We already somehow implemented this in Bug 1767260. It's not exactly done how you suggested, but the end result is the same. The log module adds the innerWindowID to all messages here https://searchfox.org/mozilla-central/rev/abcee8d2c97a5c8a1fbeaf84607ea427be72497a/remote/webdriver-bidi/modules/windowglobal/log.sys.mjs#48.

And the events created for a given module are filtered on innerWindowID for console listener and on innerID for console api listener.
Although we are not going through a realm object, nor are we using the script module. But the id should already be correct.

(In reply to Julian Descottes [:jdescottes] from comment #5)

We already somehow implemented this in Bug 1767260. It's not exactly done how you suggested, but the end result is the same. The log module adds the innerWindowID to all messages here https://searchfox.org/mozilla-central/rev/abcee8d2c97a5c8a1fbeaf84607ea427be72497a/remote/webdriver-bidi/modules/windowglobal/log.sys.mjs#48.

Ah, right. So the windowglobal log module should call into the script module to actually retrieve the correct realm for the inner id. What we have right now was just a workaround because at the time of implementation we haven't had realm support yet.

In the scope of bug 1731589, we started using the default realm for serialization of arguments and in the payload for the log.entryAdded event. But we should actually use the realm from which an event came from to cover the case, when the event is emitted in a sandbox.
We would need, most likely, to use sourceId from the event to find the right realm. One technical constrain could also be that BiDi might not know about the realm which was created to emit the event.

Whiteboard: [webdriver:backlog] → [webdriver:backlog][webdriver:triage]

As decided in today's meeting we should have a focus on this particular bug for the current or next milestone.

Priority: P3 → P2
Summary: Support realms in log.entryAdded → Use the sandbox realm for "log.entryAdded" events when entries aren't coming from the default window
Whiteboard: [webdriver:backlog][webdriver:triage] → [webdriver:backlog]

During the meeting we also mentioned that having the correct realm will be crucial for workers which is partly why we bumped the priority.

But since we don't implement workers at the moment, it will be hard to work on that part. Which leaves only the "messages coming from sandboxes" scenario. And most likely this will need a different technical solution than for worker messages. Worker messages should be captured in the worker thread, so we should easily have a way to identify them when we create the event. If we make this bug only about sandbox window realms, does this impact the priority?

James was advocating for P2 so lets get his feedback.

Flags: needinfo?(james)

My opinion is that continuing to send events with wrong information is bad for the ecosystem as people learn that they can't rely on getting consistent realm ids to identify the source of script things and so have to spend a bunch of time on workarounds. So my argument is that we should probably invest the time in fixing this for sandboxes upfront, allowing clients to depend on it, and then making the introduction of workers easier.

Flags: needinfo?(james)

We dropped on this topic for quite a bit but with the work we did over the past half year it seems to be easier to accomplish now.

Nowadays we have the list of realms available via the window global message manager instance. Given that each realm should be tight to a specific inner window, I assume that we should let the console (API) listeners retrieve such a window via the inner window id, which is available for both APIs, and let it forward via their internal event's payload. Then in the WebDriver BiDi's content log module we can use that window reference to filter for the related Realm to be returned via the log.entryAdded event.

I would propose that we get this fixed before clients might to heavily rely on the default realm that we currently return. Also I don't think that this will be 8 points anymore but more 3 (or max. 5) points.

Whiteboard: [webdriver:backlog] → [webdriver:backlog][webdriver:triage]

Actually as what Sasha already mentioned in comment 7 the innerWindowId is not enough here and we need a unique id that identifies the actual sandbox the event is coming from. The souceId that is emitted eg as part of the nsIScriptError might actually be such a candidate, but it's not clear for us how to actually correlate it to a created sandbox via Cu.Sandbox().

Olli, would sourceId the right choice here and if yes, how could we retrieve it from a sandbox instance? Thanks.

Flags: needinfo?(smaug)
Whiteboard: [webdriver:backlog][webdriver:triage] → [webdriver:backlog]

I don't know the answer. It depends on how does something somewhere map sourceId to something else.
I'm not sure how strict/consistent the callers of InitSourceId are.

Flags: needinfo?(smaug)

(In reply to Olli Pettay [:smaug][bugs@pettay.fi] from comment #14)

I don't know the answer. It depends on how does something somewhere map sourceId to something else.
I'm not sure how strict/consistent the callers of InitSourceId are.

Most importantly for now would be those callers that use console API calls and raise script errors based on web content. So I assume that all these are part of the DOM code somewhere? The main question would still be if sourceId is what we are looking for or if there is something else that should actually be used to identify a sandbox / realm.

Flags: needinfo?(smaug)

I still don't know the answer. I don't know what you'd do with sourceId, and sourceId is just some random integer.
Given bug 1447244, jimb might remember better what that integer represents.

Flags: needinfo?(smaug)

(In reply to Olli Pettay [:smaug][bugs@pettay.fi] from comment #16)

I still don't know the answer. I don't know what you'd do with sourceId, and sourceId is just some random integer.
Given bug 1447244, jimb might remember better what that integer represents.

Sorry that I completely missed your reply. I have taken a look at bug 1447244 and probably found the answer in bug 1447244 comment 28:

Yeah, the tag is a unique ID for a script source. I'll rename it to scriptSourceId; I called it a tag instead to distinguish it from the script source IDs used in devtools code, but there is little overlap between the places where the two kinds of IDs are used so I don't think it would cause much confusion.

Nevertheless I would like to ask Jimb if he maybe knows how to identify if a console API call comes from a specific sandbox, or if this would be a new feature.

Flags: needinfo?(jimb)

:whimboo I'm sorry - I really have no idea about this.

Flags: needinfo?(jimb)

Tooru, maybe you can help us with this topic (starting with comment 13)? Thanks!

Flags: needinfo?(arai.unmht)

I'm not sure I understand the question and the context.
If there's a testcase that demonstrates the entire issue, that will be helpful.


Here's my current rough understanding and questions.

There's a code like the following (which can be executed in xpcshell):

const systemPrincipal = Cc["@mozilla.org/systemprincipal;1"].createInstance(
  Ci.nsIPrincipal
);
const sb = Cu.Sandbox(systemPrincipal);

sb.console = console;

Cu.evalInSandbox(`
console.log("hey");
`, sb);

And this bug is talking about the "log.entryAdded" message emitted by LogModule.#onConsoleAPIMessage, and the goal here is to associate the event to the sb variable above? or to associate the event to the script source text? or some other things?
Also, is the question also about how to associate, or only about how to get the sandbox or the script source?


Anyway, the related parts would be the following:

Preparation:

When console API is called on main thread:

  1. a script calls console.log
  2. SpiderMonkey calls mozilla::dom::ConsoleInstance_Binding::log
  3. mozilla::dom::ConsoleInstance_Binding::log calls mozilla::dom::ConsoleInstance::Log
  4. mozilla::dom::ConsoleInstance::Log calls mozilla::dom::Console::MethodInternal
  5. mozilla::dom::Console::MethodInternal calls mozilla::dom::MainThreadConsoleData::ProcessCallData
  6. mozilla::dom::MainThreadConsoleData::ProcessCallData calls nsIConsoleAPIStorage::recordEvent
  7. ConsoleAPIStorageService.recordEvent calls the callbacks, which is ConsoleAPIListener.#onConsoleAPIMessage
  8. ConsoleAPIListener.#onConsoleAPIMessage emits the "message" event on the listeners, which is LogModule.#onConsoleAPIMessage
  9. LogModule.#onConsoleAPIMessage emits the "log.entryAdded" event

When console API is called on worker, the mozilla::dom::Console::MethodInternal part is slightly different:

  1. mozilla::dom::Console::MethodInternal dispatches mozilla::dom::ConsoleCallDataWorkerRunnable
  2. mozilla::dom::ConsoleCallDataWorkerRunnable::RunConsole calls mozilla::dom::ConsoleRunnable::ProcessCallData on main thread

When console API is called on worklet, similar different as the worker case:

  1. mozilla::dom::Console::MethodInternal dispatches mozilla::dom::ConsoleCallDataWorkletRunnable
  2. mozilla::dom::ConsoleCallDataWorkletRunnable::Run calls mozilla::dom::ConsoleRunnable::ProcessCallData on main thread

Then, inside the mozilla::dom::Console::MethodInternal, and somewhere in the same callstack, the scripted caller's global can be retrieved with JS::GetScriptedCallerGlobal.

I'm not sure if there's a way to retrieve the sandbox object from the JS global object. But if the goal is to retrieve some property of sandbox, there maybe a way, but still depends on what the goal is.

Also, some of the caller's data, such as the origin attributes are stored into the mozilla::dom::ConsoleCallData, in mozilla::dom::Console::MethodInternal, but I'm not sure which information is necessary in this bug's context.

Flags: needinfo?(arai.unmht)

Thanks for the detailed answer Arai.

I will try to rephrase the goal.

WebDriver BiDi allows clients to evaluate scripts in sandboxes. To do so, when they use script.evaluate or script.callFunction, as "target" they can provide a "sandbox" parameter (string) to those commands, on top of a "context" argument (browsing context). If "sandbox" is provided, WebDriver BiDi will create a "sandbox realm" in this Browsing Context. And clients will then receive a "realm id" corresponding to the realm which was used to evaluate the script.

Take the following scenario with BiDi.

Client subscribes to log events:

{ "method": "session.subscribe", "params": { "events": [ "log" ] }, "id": 3 }

Then evaluates a script in a sandbox realm, which triggers console logs. For instance:

{
  "method":"script.evaluate",
  "params":{
    { 
      target: { 
        context: "",
        sandbox: "foo"
      }, 
      awaitPromise: false,
      expression: "console.log('test')"
    }
  },
  "id": 4
}

The result will be something like:

{ "type": "success", "id": 4, "result": { "realm": "b22868df-9cd3-4eb0-8e2a-50f91df4e99e", "type": "success", "result": { "type": "undefined" } } }

A realm was created, with the id b22868df-9cd3-4eb0-8e2a-50f91df4e99e.
In parallel, the log event we emitted was:

{
  "type": "event",
  "method": "log.entryAdded",
  "params": {
    "type": "console",
    "method": "log",
    "source": {
      "realm": "1613b717-67ea-4061-a1b0-ac07ed41908b",
      "context": "e35dd50e-6232-4714-963f-03499ebe6dd3"
    },
    "args": [
      {
        "type": "string",
        "value": "test"
      }
    ],
    "level": "info",
    "text": "test",
    "timestamp": 1744016906321
  }
}

The realm id for the event was set to 1613b717-67ea-4061-a1b0-ac07ed41908b which is the id for the "default" (non-sandboxed) realm of the Browsing Context. Which is wrong because we should make it clear that the log came from the sandbox "foo" with realm id "b22868df-9cd3-4eb0-8e2a-50f91df4e99e".

Internally, we keep a map of all the realms we created so far in our WindowGlobalMessageHandler class (we have one instance of this per window global and per session, they follow the same lifecycle as JSWindowActors). The "realms" objects themselves are WindowRealm instances, which are wrappers around either a window global, or a sandbox global object.

Based on this, I think that if we have access to the global object which created a log event, then we should be able to find the corresponding WebDriver BiDi realm in this map (provided that the Realm was created by this WebDriver BiDi session, but let's focus on this case for now).

Arai: does that clarify the issue a bit?

I'm not sure if there's a way to retrieve the sandbox object from the JS global object.

Here by "sandbox object" you mean the result from Cu.Sandbox right?
Note that to execute our code here, we use executeInGlobal on a "globalObjectReference" created from dbg.makeGlobalObjectReference called on the sandbox object (here). Would this "globalObjectReference" match the "JS global object" here ?

Flags: needinfo?(arai.unmht)

(In reply to Julian Descottes [:jdescottes] from comment #21)

Arai: does that clarify the issue a bit?

Yes, thank you, this answers most of the questions.

Just one remaining question.
The logic in WindowGlobalMessageHandler.#getRealmFromSandboxName sounds like the the "sandbox" parameter can point existing sandbox created by previous calls.
Is it correct?
(If the sandbox is used only by one script source, then the script source's ID could also be used, but otherwise the script source is not what we want)

Then, in any case, using the global object as a key of reverse-map or something sounds reasonable, as long as the global object is reachable from the code.
For the main-thread-only case, it should be possible, but if the logging happens in workers and worklets, this might need some more trick (but this would also depends on how the WebDriver code handles the sandboxes for workers)

I'm not sure if there's a way to retrieve the sandbox object from the JS global object.

Here by "sandbox object" you mean the result from Cu.Sandbox right?

Oh, just realized there's no extra "sandbox object", and it just returns the global object.

Note that to execute our code here, we use executeInGlobal on a "globalObjectReference" created from dbg.makeGlobalObjectReference called on the sandbox object (here). Would this "globalObjectReference" match the "JS global object" here ?

The global object retrievable with JS::GetScriptedCallerGlobal is the raw unwrapped global object of the sandbox itself, which is not a debugger object referece.
So, the value would match to WindowRealm.#globalObject property.

Then, the caller's global is retrievable from the following frames:

For the main thread's case, the arguments data is directly passed to mozilla::dom::MainThreadConsoleData::ProcessCallData, so the caller's global can also be added there.

For the worker/worklet's case, it uses structured clone (mozilla::dom::ConsoleRunnable::ProcessCallData), so it's not possible to directly pass the global object.

Then, currently the ConsoleEvent does not contain the information about the caller global.
So, if the direct reference to the global object is necessary, that needs to be added to the ConsoleEvent, and needs to be populated in Console::MethodInternal or somewhere.
If it can be some numeric/string identifier shared between the sandbox creation and the log events, then something like the mozilla::dom::ConsoleEvent::mAddonId would also work.

Anyway, this will need modification to the console API implementation and the related structs.
If we're to design it based on the upcoming usage in worker/worklet cases, this might need identifier-based things.
If only the main-thread case is necessary for now, just adding the global object to the message would work.

Flags: needinfo?(arai.unmht)

This at least works for the main thread case.
(worker/worklet cases aren't supported).

ConsoleAPIListener.#onConsoleAPIMessage should be able to see the caller global by message.maybeCallerGlobal.

Julian, it looks like we missed track on this bug. Would you mind having a look if the proposed patch from Arai actually works for us?

Flags: needinfo?(jdescottes)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: