Closed Bug 1501108 Opened 2 years ago Closed 1 year ago

Add GV API for cross-origin attributes (to support container tabs)

Categories

(GeckoView :: General, enhancement, P1)

Unspecified
Android
enhancement

Tracking

(firefox-esr60 wontfix, firefox-esr68 wontfix, firefox65 wontfix, firefox66 wontfix, firefox67 wontfix, firefox68 wontfix, firefox69 wontfix, firefox70 fixed)

RESOLVED FIXED
mozilla70
Tracking Status
firefox-esr60 --- wontfix
firefox-esr68 --- wontfix
firefox65 --- wontfix
firefox66 --- wontfix
firefox67 --- wontfix
firefox68 --- wontfix
firefox69 --- wontfix
firefox70 --- fixed

People

(Reporter: cpeterson, Assigned: esawin)

References

()

Details

(Whiteboard: [geckoview:fenix:p3])

Attachments

(7 files, 1 obsolete file)

https://support.mozilla.org/en-US/kb/containers

Container tabs are like normal tabs except that the sites you visit will have access to a separate slice of the browser's storage. This means your site preferences, logged in sessions, and advertising tracking data won't carry over to the new container.
This would go in GeckoSessionSettings.
Keywords: good-first-bug
[geckoview:fenix] because Fenix will want this feature.
Whiteboard: [geckoview:fenix]
Product: Firefox for Android → GeckoView

[geckoview:fenix:p3] because this feature is not a Fenix MVP release blocker, but it might be interesting for Fenix's PWA story.

Whiteboard: [geckoview:fenix] → [geckoview:fenix:p3]

[geckoview:fenix:p2] because Fenix Product Management is talking about using containers to sandbox PWAs.

Priority: P3 → P2
Whiteboard: [geckoview:fenix:p3] → [geckoview:fenix:p2]
Assignee: nobody → esawin
Keywords: good-first-bug

I think there are a few ways we could do this:

  1. Use the userContextId attribute as desktop does, and allow apps to specify an integer value, e.g. GeckoSessionSettings.setContainerId(42).

  2. Same as 1, but restrict to the well-known values we have for containers on desktop. We'd probably want to expose these as constants, so you would use it as GeckoSessionSettings.setContainerId(GeckoSessionSettings.CONTAINER_WORK).

  3. Choose a new hardcoded key (geckoContainerId or similar), and allow the app to set arbitrary values of any type (or just int/string/float), e.g. GeckoSessionSettings.setOriginAttribute("my attribute")

  4. Same as 3 but don't use a hardcoded key, e.g. GeckoSessionSettings.setOriginAttribute("foobar", "my attribute")

  5. Same as 3 but with a Map of (potentially) several keys/values.

If we ever want to support sync of container ids with desktop, we can't do 3. Baku, any opinion on these from the Gecko side of things? Obviously 5 is the most flexible, but I worry about us breaking Gecko with an unexpectedly high number of keys. I'm inclined to believe 4 may be the best compromise.

Flags: needinfo?(amarchesini)
Summary: Add GV API for Container tabs → Add GV API for cross-origin attributes
Priority: P2 → P1
Whiteboard: [geckoview:fenix:p2] → [geckoview:fenix:p1]

I've implemented option 3 restricted to the type int.

The API provides a simple interface that is difficult to misuse (performance) at app level while maintaining enough similarity with the userContextId mechanics at Gecko level without directly conflicting with it.
Options 4 and 5 may still be realized through mapping mechanics at app level.

I hope the patch clarifies the intention of API and will move the discussion forward.

Attachment #9042554 - Attachment description: Bug 1501108 - [1.0] Add origin attribute session setting. → Bug 1501108 - [1.1] Add GeckoView Container ID support.

What is the intended use of geckoContainerId? Is it for Containers or Progessive Web Apps? We really have to think about the future where container ids may be synced across devices and design this in a way that is compatible with that.

Also, when you say "allow the app to set arbitrary values of any type", what kind of app do you have in mind? A mobile application that builds on top of Gecko View? Or an extension? Or both?

baku, please weigh in here and review.

Flags: needinfo?(tanvi)

(In reply to Tanvi Vyas[:tanvi] from comment #8)

What is the intended use of geckoContainerId? Is it for Containers or Progessive Web Apps? We really have to think about the future where container ids may be synced across devices and design this in a way that is compatible with that.

We want to provide a GeckoView API for cookie jar separation mechanics. It should provide enough flexibility to allow the implementation of containers, isolated PWAs and more.
Syncing ids across devices is an app-level mechanic and would be difficult to account for at the GeckoView level.

Also, when you say "allow the app to set arbitrary values of any type", what kind of app do you have in mind? A mobile application that builds on top of Gecko View? Or an extension? Or both?

We currently target Fenix and other internal apps, but third-party apps may find different use cases for this API.
This is not an API for extensions.

Flags: needinfo?(tanvi)
Attachment #9042554 - Attachment description: Bug 1501108 - [1.1] Add GeckoView Container ID support. → Bug 1501108 - [1.2] Add GeckoView Session Context ID support.

Depends on D19182

Summary: Add GV API for cross-origin attributes → Add GV API for cross-origin attributes (to support container tabs)

Fenix plans to use containers in M4, so GV needs to provide the API in M3.

Whiteboard: [geckoview:fenix:p1] → [geckoview:fenix:m3]

Baku, I've replied to your questions on the patch, do you need any other information for the review?

I've added you as a reviewer to the test patch, maybe you also have an idea why replacing the HTML page with evaluateJS gives us a security error (for the same JS code).

Attachment #9044311 - Attachment description: Bug 1501108 - [2.0] Add Session Context ID test. → Bug 1501108 - [2.2] Add Session Context ID test.
Depends on: 1537882

Lower priority from M3 to P2 "nice-to-have" because Fenix has deferred the Container feature to post-MVP.

Whiteboard: [geckoview:fenix:m3] → [geckoview:fenix:p2]
Attachment #9051404 - Attachment description: Bug 1501108 - [3.0] Add a GeckoRuntime API to delete session context data. → Bug 1501108 - [3.1] Add a GeckoRuntime API to delete session context data.

I've added a GV workaround for bug 1537882 in patch 3.1 so that we don't necessarily need to block on that issue.

Attachment #9051404 - Attachment description: Bug 1501108 - [3.1] Add a GeckoRuntime API to delete session context data. → Bug 1501108 - [3.2] Add a StorageController API to delete session context data.
Blocks: 1539429
Attachment #9051404 - Attachment description: Bug 1501108 - [3.2] Add a StorageController API to delete session context data. → Bug 1501108 - [3.3] Add a StorageController API to delete session context data.
Attachment #9051404 - Attachment description: Bug 1501108 - [3.3] Add a StorageController API to delete session context data. → Bug 1501108 - [3.4] Add a StorageController API to delete session context data.
Pushed by esawin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cad2b29b79d2
[1.2] Add GeckoView Session Context ID support. r=snorp,baku,mayhemer
https://hg.mozilla.org/integration/autoland/rev/de36c9fb8c65
[2.2] Add Session Context ID test. r=snorp,baku
https://hg.mozilla.org/integration/autoland/rev/1936dde5f34c
[3.4] Add a StorageController API to delete session context data. r=baku,snorp,geckoview-reviewers
Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/da28ce3af312
Removed extra spaces in order to fix eslint failure.
Attached file Bug 1501108 - Fix lint error. (obsolete) —

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception&revision=1936dde5f34c98c7eaca6d53d9826e3b40f1134c&selectedJob=240465304

Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=240465304&repo=autoland&lineNumber=1775

Backout link: https://hg.mozilla.org/integration/autoland/rev/095d253f97be7112a6f0ffa0b9d892964f5264d7

[task 2019-04-15T21:47:38.562Z] 21:47:38 INFO - TEST-PASS | browser/base/content/test/caps/browser_principalSerialization_version1.js | Ensure value of firstPartyDomain is -
[task 2019-04-15T21:47:38.565Z] 21:47:38 INFO - Buffered messages finished
[task 2019-04-15T21:47:38.566Z] 21:47:38 INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/caps/browser_principalSerialization_version1.js | Ensure value of geckoViewSessionContextId is undefined - Got , expected undefined
[task 2019-04-15T21:47:38.567Z] 21:47:38 INFO - Stack trace:
[task 2019-04-15T21:47:38.568Z] 21:47:38 INFO - chrome://mochikit/content/browser-test.js:test_is:1325
[task 2019-04-15T21:47:38.569Z] 21:47:38 INFO - chrome://mochitests/content/browser/browser/base/content/test/caps/browser_principalSerialization_version1.js:test_realHistoryCheck:202
[task 2019-04-15T21:47:38.570Z] 21:47:38 INFO - chrome://mochikit/content/browser-test.js:Tester_execTest/<:1116
[task 2019-04-15T21:47:38.571Z] 21:47:38 INFO - chrome://mochikit/content/browser-test.js:Tester_execTest:1144
[task 2019-04-15T21:47:38.572Z] 21:47:38 INFO - chrome://mochikit/content/browser-test.js:nextTest/<:1005
[task 2019-04-15T21:47:38.573Z] 21:47:38 INFO - chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:SimpleTest.waitForFocus/waitForFocusInner/focusedOrLoaded/<:803
[task 2019-04-15T21:47:38.575Z] 21:47:38 INFO - TEST-PASS | browser/base/content/test/caps/browser_principalSerialization_version1.js | Ensure value of inIsolatedMozBrowser is false -
[task 2019-04-15T21:47:38.576Z] 21:47:38 INFO - TEST-PASS | browser/base/content/test/caps/browser_principalSerialization_version1.js | Ensure value of privateBrowsingId is 0 -
[task 2019-04-15T21:47:38.577Z] 21:47:38 INFO - TEST-PASS | browser/base/content/test/caps/browser_principalSerialization_version1.js | Ensure value of userContextId is 0 -
[task 2019-04-15T21:47:38.579Z] 21:47:38 INFO - TEST-PASS | browser/base/content/test/caps/browser_principalSerialization_version1.js | Should have not have a URI for system -
[task 2019-04-15T21:47:38.582Z] 21:47:38 INFO - TEST-PASS | browser/base/content/test/caps/browser_principalSerialization_version1.js | should have CSP -
[task 2019-04-15T21:47:38.583Z] 21:47:38 INFO - TEST-PASS | browser/base/content/test/caps/browser_principalSerialization_version1.js | Ensure value of appId is 0 -
[task 2019-04-15T21:47:38.584Z] 21:47:38 INFO - TEST-PASS | browser/base/content/test/caps/browser_principalSerialization_version1.js | Ensure value of firstPartyDomain is -
[task 2019-04-15T21:47:38.585Z] 21:47:38 INFO - Not taking screenshot here: see the one that was previously logged

Flags: needinfo?(esawin)
Flags: needinfo?(esawin)
Attachment #9058419 - Attachment is obsolete: true
Attachment #9042554 - Attachment description: Bug 1501108 - [1.2] Add GeckoView Session Context ID support. → Bug 1501108 - [1.3] Add GeckoView Session Context ID support.
Attachment #9044311 - Attachment description: Bug 1501108 - [2.2] Add Session Context ID test. → Bug 1501108 - [2.3] Add Session Context ID test.
Pushed by esawin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/23a77e063257
[1.3] Add GeckoView Session Context ID support. r=snorp,baku,mayhemer
https://hg.mozilla.org/integration/autoland/rev/335cec0aacd8
[2.3] Add Session Context ID test. r=snorp,baku
https://hg.mozilla.org/integration/autoland/rev/99ba286125d1
[3.4] Add a StorageController API to delete session context data. r=baku,snorp,geckoview-reviewers
https://hg.mozilla.org/integration/autoland/rev/8f2d511ad49f
[4.0] Add empty origin attribute values to fix desktop tests. r=baku
Regressions: 1515057

Backed out for causing very frequent leaks (2000 last week on sheriffed trees) in macOS debug wpt tests - see bug 1515057:

https://hg.mozilla.org/integration/autoland/rev/3eb938b576a99000bff84b6ec314f938959e8e88

Status: RESOLVED → REOPENED
Flags: needinfo?(esawin)
Resolution: FIXED → ---
Target Milestone: mozilla68 → ---
Backout by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/autoland/rev/2e58f3159dfb
Backed out 4 changesets for causing very frequent leaks in macOS debug wpt tests: Set API key provided by api-lint tasks. a=backout

:esawin could I ask why we are using a String here instead of an int like userContextId?

  • If we intend to use it as a domain storage it probably should be named as such.
  • If we intend to have it used for userContextId and a PWA domain it perhaps should be two fields?
  • If it's intended to be an 'app' key then perhaps have it named geckoViewAppKey?

Thanks!

If you have no idea what is going wrong here, you can set XPCOM_MEM_BLOAT_LOG=1 and XPCOM_MEM_LOG_CLASSES=nsStringBuffer. This will make the browser print out what the contents of the nsStringBuffer actually is. Other than that, I don't have many ideas. It feels like a sort of thing where we're spinning up some kind of networking runnable after shutdown has started, so we don't end up running it and just end up leaking it.

(In reply to Jonathan Kingston [:jkt] from comment #26)

:esawin could I ask why we are using a String here instead of an int like userContextId?

This has been thoroughly discussed in this bug and during the patch review here: https://phabricator.services.mozilla.com/D19182.

  • If we intend to use it as a domain storage it probably should be named as such.
  • If we intend to have it used for userContextId and a PWA domain it perhaps should be two fields?
  • If it's intended to be an 'app' key then perhaps have it named geckoViewAppKey?

We don't restrict the nature of the mapping, this is up to the app developer.
This is a generic API exposing per-session cookie-jar separation mechanics.
It's not intended to be an app key, it's a context ID for a given GeckoSession, hence the name.

Flags: needinfo?(esawin)

(In reply to Andrew McCreight [:mccr8] from comment #27)

If you have no idea what is going wrong here, you can set XPCOM_MEM_BLOAT_LOG=1 and XPCOM_MEM_LOG_CLASSES=nsStringBuffer. This will make the browser print out what the contents of the nsStringBuffer actually is. Other than that, I don't have many ideas. It feels like a sort of thing where we're spinning up some kind of networking runnable after shutdown has started, so we don't end up running it and just end up leaking it.

Will this work in automation? The leak is only occurring in our automation tests on MacOS and I don't have a Mac.

(In reply to Eugen Sawin [:esawin] from comment #29)

Will this work in automation? The leak is only occurring in our automation tests on MacOS and I don't have a Mac.

Yes, it should. It looks like you can set environment values for WPTs in here:
https://searchfox.org/mozilla-central/rev/b59a99943de4dd314bae4e44ab43ce7687ccbbec/testing/web-platform/tests/tools/wptrunner/wptrunner/browsers/firefox.py#294

You'll want to leave the existing XPCOM_MEM_BLOAT_LOG part alone, and add in something like
env["XPCOM_MEM_LOG_CLASSES"] = "nsStringBuffer"

It will slow down the browser a lot, but hopefully not enough to interfere with reproducing the test.

Depends on: 1553454
No longer depends on: 1553454

As suspected, extending OriginAttributesDictionary and/or OriginAttributesPatternDictionary is what triggers the leaks: https://treeherder.mozilla.org/#/jobs?repo=try&revision=33b2100cf4ed953c63b6b033af35b78a2d857ca3&selectedJob=249827342

I assume that is not expected. Who would be best to own this issue (I can file a new bug)?

(In reply to Andrew McCreight [:mccr8] from comment #30)

(In reply to Eugen Sawin [:esawin] from comment #29)

Will this work in automation? The leak is only occurring in our automation tests on MacOS and I don't have a Mac.

Yes, it should. It looks like you can set environment values for WPTs in here:
https://searchfox.org/mozilla-central/rev/b59a99943de4dd314bae4e44ab43ce7687ccbbec/testing/web-platform/tests/tools/wptrunner/wptrunner/browsers/firefox.py#294

You'll want to leave the existing XPCOM_MEM_BLOAT_LOG part alone, and add in something like
env["XPCOM_MEM_LOG_CLASSES"] = "nsStringBuffer"

It will slow down the browser a lot, but hopefully not enough to interfere with reproducing the test.

Thanks for the instructions! However, the leak report file isn't exposed as a log artifact on treeherder, redirecting the file path to MOZ_UPLOAD_DIR doesn't work because env isn't set when processing firefox.py and redirecting it to STDOUT returns an error.

Flags: needinfo?(continuation)

(In reply to Eugen Sawin [:esawin] from comment #31)

I assume that is not expected. Who would be best to own this issue (I can file a new bug)?

That's very peculiar.

Boris, any ideas why adding a DOMString field to OriginAttributesDictionary and OriginAttributesPatternDictionary would cause us to start leaking nsPipe, nsPipeInputStream, nsSocketTransport and nsSocketTransportService (as well as some other misc. stuff)?

Flags: needinfo?(continuation) → needinfo?(bzbarsky)

In general, leaks of services like this tend to be caused by a sequence like this:
a) Shutdown starts.
b) We shut down the service.
c) Some runnable or JS runs and uses the service, causing it to be recreated. Or whatever drains some queue of work to be done for the service isn't happening any more.

The failures I looked at were in WPT directories that were running almost no tests. One possibility is that the additional field causes some kind of error during startup, and because the test is so short the error handling doesn't happen until after shutdown has begun, so it becomes the (c).

Yeah, I have no idea what's going on there... No one should be looking at those fields. IPC won't propagate them properly, but no one should be looking at them, so that should be OK....

Have you tried extending only one of the two dictionaries and seeing whether that lets you narrow down what might be involved?

Flags: needinfo?(bzbarsky)

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #34)

Have you tried extending only one of the two dictionaries and seeing whether that lets you narrow down what might be involved?

https://treeherder.mozilla.org/#/jobs?repo=try&revision=5370ebcf98e159d33560f1ec45e64795847d67b6

https://treeherder.mozilla.org/#/jobs?repo=try&revision=077b33a35cd8bb0eca37f6dad26fe96dd233f299

The leak is triggered by extending OriginAttributesDictionary, see previous comment for links.

Flags: needinfo?(amarchesini)

OK, I tried seeing whether the issue is PopulateFromSuffixIterator failing on the unknown bit, but that doesn't seem to be it... I really don't know what's going on here. :(

Attachment #9042554 - Attachment description: Bug 1501108 - [1.3] Add GeckoView Session Context ID support. → Bug 1501108 - [1.5] Add GeckoView Session Context ID support.
Attachment #9044311 - Attachment description: Bug 1501108 - [2.3] Add Session Context ID test. → Bug 1501108 - [2.4] Add Session Context ID test.
Attachment #9051404 - Attachment description: Bug 1501108 - [3.4] Add a StorageController API to delete session context data. → Bug 1501108 - [3.5] Extend StorageController API to delete session context data.
Depends on: 1558944
Attachment #9042554 - Attachment description: Bug 1501108 - [1.5] Add GeckoView Session Context ID support. → Bug 1501108 - [1.6] Add GeckoView Session Context ID support.

Eugen, is this the only GV change we need to support Facebook/social containers in Fenix? btw, the Fenix team now calls that feature "Social Tracking Protection".

Flags: needinfo?(esawin)
Whiteboard: [geckoview:fenix:p2] → [geckoview:fenix:m7]

(In reply to Chris Peterson [:cpeterson] from comment #38)

Eugen, is this the only GV change we need to support Facebook/social containers in Fenix? btw, the Fenix team now calls that feature "Social Tracking Protection".

This bug introduces GeckoSession contextual identity support, which should allow for container tabs, including social containers, feature implementations.

Flags: needinfo?(esawin)

Note that Facebook Container and Social Tracking Protection are not necessarily the same thing.

(In reply to Tanvi Vyas[:tanvi] from comment #40)

Note that Facebook Container and Social Tracking Protection are not necessarily the same thing.

Thanks. Someone on the Fenix team just confirmed that they were talking about social block lists, not Facebook Container.

In that case, I'm moving this bug out of the [geckoview:fenix:m7] milestone.

Whiteboard: [geckoview:fenix:m7] → [geckoview:fenix:p3]

Eugen, apologies for jumping in so late on this bug.

Please note that Gecko makes some assumptions about how various fields of OriginAttributes are stringified and decoded back when deserializing. Exposing a generic API to GV consumers to set this field without doing proper checks on it is extremely dangerous.

To give a concrete example, if a GV consumer sets sessionContextID to something like "foo^bar", that will corrupt storage databases that Gecko writes down to disk. That is because we use "^" in order to separate origin attributes fields when serializing the struct to a string, and it seems like this patch isn't taking this into account. The impact of this is generally undefined and up to each storage backend (e.g. cookies, IndexedDB, local storage, etc), but to the consumer of storage backends this will most likely appear like site data loss, or worse.

I believe part 1 of this patch must be rewritten in a way that treats the geckoviewSessionContextID as untrusted input and performs all of the necessary sensitization on it.

(In reply to :Ehsan Akhgari from comment #42)

Eugen, apologies for jumping in so late on this bug.

Please note that Gecko makes some assumptions about how various fields of OriginAttributes are stringified and decoded back when deserializing. Exposing a generic API to GV consumers to set this field without doing proper checks on it is extremely dangerous.

To give a concrete example, if a GV consumer sets sessionContextID to something like "foo^bar", that will corrupt storage databases that Gecko writes down to disk. That is because we use "^" in order to separate origin attributes fields when serializing the struct to a string, and it seems like this patch isn't taking this into account. The impact of this is generally undefined and up to each storage backend (e.g. cookies, IndexedDB, local storage, etc), but to the consumer of storage backends this will most likely appear like site data loss, or worse.

I believe part 1 of this patch must be rewritten in a way that treats the geckoviewSessionContextID as untrusted input and performs all of the necessary sensitization on it.

Thanks a lot for the input!
The string is already sanitized against QuotaManager::kReplaceChars in patch 1, if that isn't sufficient, can you point me to a method that would do the appropriate replacements?

What we need to do is document that auto-transformation of the string to prevent any unexpected app-level ID collisions.

(In reply to Eugen Sawin [:esawin] from comment #43)

(In reply to :Ehsan Akhgari from comment #42)

Eugen, apologies for jumping in so late on this bug.

Please note that Gecko makes some assumptions about how various fields of OriginAttributes are stringified and decoded back when deserializing. Exposing a generic API to GV consumers to set this field without doing proper checks on it is extremely dangerous.

To give a concrete example, if a GV consumer sets sessionContextID to something like "foo^bar", that will corrupt storage databases that Gecko writes down to disk. That is because we use "^" in order to separate origin attributes fields when serializing the struct to a string, and it seems like this patch isn't taking this into account. The impact of this is generally undefined and up to each storage backend (e.g. cookies, IndexedDB, local storage, etc), but to the consumer of storage backends this will most likely appear like site data loss, or worse.

I believe part 1 of this patch must be rewritten in a way that treats the geckoviewSessionContextID as untrusted input and performs all of the necessary sensitization on it.

Thanks a lot for the input!
The string is already sanitized against QuotaManager::kReplaceChars in patch 1, if that isn't sufficient

Yeah, I specifically made this comment because that isn't sufficient (it doesn't include "^" nor "=", see https://searchfox.org/mozilla-central/rev/15be167a5b436b57fef944b84eef061d24c1af8c/dom/quota/ActorsParent.cpp#134).

can you point me to a method that would do the appropriate replacements?

I don't think we have an existing method that can do the sanitization... That's because this is the first origin attribute to be exposed to embedders of Gecko for arbitrary purposes.

What we need to do is document that auto-transformation of the string to prevent any unexpected app-level ID collisions.

Right, but first and foremost we need to decide how to do the sanitization. For example if the input context ID is "foo^bar", what should we put in the origin attribute that a) doesn't include the "^" character and b) when deserializing can allow us to translate it back to "^" in a loss-less manner. The simplest idea I could think of is something like applying a simple encoding like base64, except that it can generate slashes and equal signs, both of which are undesirable here, but perhaps they can be dealt with as a special case? Or perhaps employing another similar encoding could be the way to go.

(In reply to :Ehsan Akhgari from comment #44)

Right, but first and foremost we need to decide how to do the sanitization. For example if the input context ID is "foo^bar", what should we put in the origin attribute that a) doesn't include the "^" character and b) when deserializing can allow us to translate it back to "^" in a loss-less manner. The simplest idea I could think of is something like applying a simple encoding like base64, except that it can generate slashes and equal signs, both of which are undesirable here, but perhaps they can be dealt with as a special case? Or perhaps employing another similar encoding could be the way to go.

I don't think we need to have a symmetric mapping method. I'm actually inclined to restrict the accepted characters in the string at the API level to avoid complications in Gecko and unexpected ID collisions.

Alternatively, (still considering we don't need to reverse the mapping) we could just use some scheme a la gv-ctx-SHA1(<user-string>) at the input level in GV. I think I would prefer this solution since it's the simplest way to ensure a safe string and avoid collisions without restricting the user string options.

(In reply to Eugen Sawin [:esawin] from comment #45)

(In reply to :Ehsan Akhgari from comment #44)

Right, but first and foremost we need to decide how to do the sanitization. For example if the input context ID is "foo^bar", what should we put in the origin attribute that a) doesn't include the "^" character and b) when deserializing can allow us to translate it back to "^" in a loss-less manner. The simplest idea I could think of is something like applying a simple encoding like base64, except that it can generate slashes and equal signs, both of which are undesirable here, but perhaps they can be dealt with as a special case? Or perhaps employing another similar encoding could be the way to go.

I don't think we need to have a symmetric mapping method. I'm actually inclined to restrict the accepted characters in the string at the API level to avoid complications in Gecko and unexpected ID collisions.

That also sounds good! But we should probably have at the very least release mode assertions in the OriginAttributes.cpp code that accepts values for this new attribute from the caller to ensure that the higher level code is doing the right thing...

Alternatively, (still considering we don't need to reverse the mapping) we could just use some scheme a la gv-ctx-SHA1(<user-string>) at the input level in GV. I think I would prefer this solution since it's the simplest way to ensure a safe string and avoid collisions without restricting the user string options.

That won't work, since we need to support two way serialization and deserialization, and going from the SHA1 to the user string is hard. ;-)

(In reply to :Ehsan Akhgari from comment #46)

Alternatively, (still considering we don't need to reverse the mapping) we could just use some scheme a la gv-ctx-SHA1(<user-string>) at the input level in GV. I think I would prefer this solution since it's the simplest way to ensure a safe string and avoid collisions without restricting the user string options.

That won't work, since we need to support two way serialization and deserialization, and going from the SHA1 to the user string is hard. ;-)

If we translate the user string to the SHA1 string at the GV level, before passing it down to Gecko, Gecko would only ever see the SHA1 version.
Since the GV API is just a push down controller, we don't need to reverse the Gecko-handled string to its original user string.

(In reply to Eugen Sawin [:esawin] from comment #47)

(In reply to :Ehsan Akhgari from comment #46)

Alternatively, (still considering we don't need to reverse the mapping) we could just use some scheme a la gv-ctx-SHA1(<user-string>) at the input level in GV. I think I would prefer this solution since it's the simplest way to ensure a safe string and avoid collisions without restricting the user string options.

That won't work, since we need to support two way serialization and deserialization, and going from the SHA1 to the user string is hard. ;-)

If we translate the user string to the SHA1 string at the GV level, before passing it down to Gecko, Gecko would only ever see the SHA1 version.
Since the GV API is just a push down controller, we don't need to reverse the Gecko-handled string to its original user string.

Now I understand. Yes, that will definitely work, and it's probably a bit more reliable than trying to escape the input string for various bad characters.

Since we don't have any security constraints, instead of using SHA1, we can just use the (signed) hex representation of the string.
Meaning the only non-alphanumeric character would be -.

See https://phabricator.services.mozilla.com/D38188 for how it could be implemented.

Attachment #9078404 - Attachment description: Bug 1501108 - [5.0] Ensure that the context ID string is safe for Gecko processing. → Bug 1501108 - [5.1] Ensure that the context ID string is safe for Gecko processing.
Attachment #9042554 - Attachment description: Bug 1501108 - [1.6] Add GeckoView Session Context ID support. → Bug 1501108 - [1.7] Add GeckoView Session Context ID support.
Attachment #9051404 - Attachment description: Bug 1501108 - [3.5] Extend StorageController API to delete session context data. → Bug 1501108 - [3.6] Extend StorageController API to delete session context data.

Depends on D38188

Pushed by esawin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5e80f5aa944f
[1.7] Add GeckoView Session Context ID support. r=snorp,baku,mayhemer
https://hg.mozilla.org/integration/autoland/rev/04acd74f249f
[2.4] Add Session Context ID test. r=snorp,baku
https://hg.mozilla.org/integration/autoland/rev/57cddd8ac95f
[3.6] Extend StorageController API to delete session context data. r=baku,snorp,geckoview-reviewers
https://hg.mozilla.org/integration/autoland/rev/b79b46d9edcb
[4.0] Add empty origin attribute values to fix desktop tests. r=baku
https://hg.mozilla.org/integration/autoland/rev/ff40ab3da64f
[5.1] Ensure that the context ID string is safe for Gecko processing. r=Ehsan,geckoview-reviewers,agi
https://hg.mozilla.org/integration/autoland/rev/318615873368
[6.0] Update changelog.
Pushed by esawin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/48797c5119a4
[7.0] Add empty origin attribute values to fix yet another desktop test.
Regressions: 1556632
See Also: → 1643688
You need to log in before you can comment on or make changes to this bug.