Closed Bug 1899291 Opened 1 month ago Closed 1 month ago

Improve channelId generation code to avoid clashes between tabs with identical pID

Categories

(Core :: Networking: HTTP, defect, P2)

defect
Points:
1

Tracking

()

RESOLVED FIXED
128 Branch
Tracking Status
firefox128 --- fixed

People

(Reporter: jdescottes, Assigned: jdescottes)

References

Details

(Whiteboard: [webdriver:m11][necko-triaged][necko-priority-next])

Attachments

(1 file)

Spotted while working on Bug 1880803.

(In reply to Valentin Gosu [:valentin] (he/him) from Bug 1880803 comment #9)

(In reply to Julian Descottes [:jdescottes] from comment #7)

The two requests are 6 seconds apart, but their channel is using the same channelId "21492016349187". This is something we have only seen on windows so far.

As far as I know we shouldn't be reusing channelIDs unless a redirect is involved, or a service worker.
That doesn't seem to be the case here.
As Julian mentioned, the tests are performed in different tabs, which means it's possible that the channelID gets generated, process finishes and gets killed, and another process comes along and gets assigned the same ProcessID leading to potentially the same channelIDs coming from the new process.

The channelID causing the issue in the tests is 0x138C00000003 - so this theory seems likely.
If this turns out to be the case we can update the channelID generating code to include some other factor - like a counter of spawned processes?

After adding more logs, it seems that the two requests which share the same channelId have been issued by two different tabs which have been using the same pID (pID 5512 in https://treeherder.mozilla.org/logviewer?job_id=459929256&repo=try&lineNumber=144618). The test where this occurs opens a tab, intercepts a fetch and then closes the tab. So it's possible that the OS will reuse a previous pID for the new tab.

It would be great to improve the channelId generator so that we are guaranteed to have different ids, even if the pID is identical.

Blocks: 1899321

Valentin, are we needing any info to start acting on this?

Severity: -- → N/A
Flags: needinfo?(valentin.gosu)
Priority: -- → P2
Whiteboard: [necko-triaged][necko-priority-review]

I think so.
The problem is that mProcessId may be recycled.
We could switch to using the uniqueProcessID instead.

https://searchfox.org/mozilla-central/rev/f60bb10a5fe6936f9e9f9e8a90d52c18a0ffd818/xpcom/system/nsIXULRuntime.idl#240-243

/**
 * A globally unique and non-recycled ID of the caller's process.
 */
readonly attribute uint64_t uniqueProcessID;

The main problem is that we need to make the uniqueProcessID (64 bits) and mNextChannelId (32 bits) fit into 53 bits
But I think it's fine to just use the first uniqueProcessID instead of mProcessId, as the uniqueProcessID starts from 1. If we ever go through more than 2^22 content processes, we have bigger problems 😆

https://searchfox.org/mozilla-central/rev/893f350260faac2ee6bf2b14c627d55eb2babfb0/netwerk/protocol/http/nsHttpHandler.cpp#2497-2503

channelId =
    // channelId is sometimes passed to JavaScript code (e.g. devtools),
    // and since on Linux PID_MAX_LIMIT is 2^22 we cannot
    // shift PID more than 31 bits left. Otherwise resulting values
    // will be exceed safe JavaScript integer range.
    ((static_cast<uint64_t>(mProcessId) << 31) & 0xFFFFFFFF80000000LL) |
    mNextChannelId++;

So instead of mProcessId we need to have mUniqueProcessId, and we need to change the comments to reflect the new changes.

Severity: N/A → S4
Points: --- → 1
Flags: needinfo?(valentin.gosu)
Whiteboard: [necko-triaged][necko-priority-review] → [necko-triaged][necko-priority-next]
Assignee: nobody → jdescottes
Status: NEW → ASSIGNED

I can also confirm this fixes the channelId clashes I saw with webdriver tests: https://treeherder.mozilla.org/jobs?repo=try&revision=de9eaff88164e69fa6a095d7f2694318761b5709

Pushed by jdescottes@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e50b9d6cb1b6
Generate channelIds based on the uniqueProcessId r=valentin,necko-reviewers
Status: ASSIGNED → RESOLVED
Closed: 1 month ago
Resolution: --- → FIXED
Target Milestone: --- → 128 Branch
Whiteboard: [necko-triaged][necko-priority-next] → [webdriver:m11][necko-triaged][necko-priority-next]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: