Closed Bug 1790666 Opened 2 years ago Closed 1 year ago

Unable to use Google account to login to dall-e OpenAI image generator

Categories

(Core :: Storage: localStorage & sessionStorage, defect, P2)

Firefox 91
All
Unspecified
defect

Tracking

()

RESOLVED FIXED
119 Branch
Webcompat Priority P2
Tracking Status
firefox119 --- fixed

People

(Reporter: ksenia, Assigned: nika)

References

()

Details

Attachments

(2 files)

I'm not sure whether this is the right component, so feel free to change it. This issue was originally reported in https://github.com/webcompat/web-bugs/issues/109058 where users are unable to login to dall-e OpenAI account using google auth (the site is using Auth0).

This problem is reproducible on Firefox Android and in Firefox 91.13.0esr, however in newer desktop versions it appears to have been fixed by Fission (I run mozregression with --find-fix option and it points to https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=383986e2f5cb835a55c47bbd6e15d81035ca0784&tochange=d6e8528f0a936df88369c71f4e390e31a4d621a1).

This is a bit tricky to reproduce, as they put newly signed up accounts to a waitlist and a non waitlisted account is required to observe the issue. I've got on a waitlist and it took about a week to get an invitation to sign in. I could provide my credentials, if required.

To reproduce:

  1. Visit https://labs.openai.com/auth/login and sign in with your google account (if you already have an invite) or sign up (that would require waiting for the invitation) by pressing "Continue with Google" in Firefox on Android or latest desktop esr.
  2. After signing in, observe the page:

Expected:
Able to proceed to image generation

Actual:
"Authentication expired" message is displayed

I did a bit of investigation in https://github.com/webcompat/web-bugs/issues/109058#issuecomment-1244570448, and looks like the problem is that they put some data to the sessionStorage in the beginning of auth flow (while on https://labs.openai.com/auth/login) to retrieve it later on, after sign in with google is performed. It redirects to https://auth0.openai.com/authorize?client_id=..., and later to https://auth0.openai.com/u/login/identifier?state=... where I click on "Continue with Google" and later redirected back to https://labs.openai.com/auth/callback?code=.. again , where it tries to retrieve that same sessionStorage data, but it turns out to be empty.

I'll move it to Storage because it sounds related to sessionStorage.

Component: DOM: Content Processes → Storage: localStorage & sessionStorage
Webcompat Priority: --- → ?
Webcompat Priority: ? → P2
Has STR: --- → yes
Severity: -- → S2
Priority: -- → P2
Flags: needinfo?(jvarga)

(In reply to Ksenia Berezina [:ksenia] from comment #1)

I did a bit of investigation in https://github.com/webcompat/web-bugs/issues/109058#issuecomment-1244570448, and looks like the problem is that they put some data to the sessionStorage in the beginning of auth flow (while on https://labs.openai.com/auth/login) to retrieve it later on, after sign in with google is performed. It redirects to https://auth0.openai.com/authorize?client_id=..., and later to https://auth0.openai.com/u/login/identifier?state=... where I click on "Continue with Google" and later redirected back to https://labs.openai.com/auth/callback?code=.. again , where it tries to retrieve that same sessionStorage data, but it turns out to be empty.

I assume that we have slightly differing implementation specific interpretations of what a session actually is and when it ends. Reading through MDN sessionStorage one might understand that:

  • a session's lifetime is bound to the document (which seems to be bound to the lifetime of the Window/tab and survives navigation inside the same window/tab)
  • each origin (including sub-domains) has its own sessionStore

I'd interpret from this and a short peek on the spec, that we would expect the sessionStorage to survive the redirects as long as the Window/tab initially used remains open. While reasonable in this specific use case, it might be arguable if this is really what we always want, in particular if we are just navigating away without redirect and whilst the window/tab remained open much later and totally independently we click again on a link that brings us to the original origin - would we really want the sessionStorage to contain the old data?

In any case, apparently we have a third, implementation specific limitation, that is the lifetime of the content process that has been used when setting the storage item. It seems that once the process goes away (or is re-used for a different origin?) we lose our sessionStorage. IIUC, fission does "site-isolation", not origin (host-name) isolation, which would mean that during all the redirections we end up inside the same content process (which is never closed) and thus keep the sessionStorage. I think that in the non-fission case instead something makes us switch process while re-directing, resulting in a shiny new (and empty) sessionStorage.

We probably need to be more explicit in our implementation about the lifetime of sessionStorage to not be subject to such implicit side-effects.

Olli, am I getting this right and do you have more thoughts on the "session lifetime" we should align our implementation with?

Flags: needinfo?(jvarga) → needinfo?(smaug)

FWIW, the spec for session is very vague:
https://html.spec.whatwg.org/multipage/history.html#environment-browsing-session

We don't switch processes in non-Fission case if one isn't using coop/coep.
Does the site use those and if so, does
browser.tabs.remote.useCrossOriginEmbedderPolicy = false
browser.tabs.remote.useCrossOriginOpenerPolicy = false
make any difference?

Flags: needinfo?(smaug) → needinfo?(kberezina)

Thanks for looking into this. I've tried to reproduce the issue again, but it appears to be working on both esr and Firefox mobile.
I'll try to change these prefs and debug it, if I come across a similar report, but I think we can close this for now.

Status: NEW → RESOLVED
Closed: 2 years ago
Flags: needinfo?(kberezina)
Resolution: --- → WORKSFORME

(In reply to Olli Pettay [:smaug][bugs@pettay.fi] from comment #4)

FWIW, the spec for session is very vague:
https://html.spec.whatwg.org/multipage/history.html#environment-browsing-session

We don't switch processes in non-Fission case if one isn't using coop/coep.
Does the site use those and if so, does
browser.tabs.remote.useCrossOriginEmbedderPolicy = false
browser.tabs.remote.useCrossOriginOpenerPolicy = false
make any difference?

I came across this issue with https://beta.character.ai this time, on mobile. When attempting to sign in/ sign up with Google auth, the sign in is not performed. Similarly, they save some data in the sessionStorage at the beginning of auth flow, but after redirect to a callback url (https://beta.character.ai/?code=...&state=.... ) and retrieval there is no item with such key.

When setting browser.tabs.remote.useCrossOriginEmbedderPolicy and browser.tabs.remote.useCrossOriginOpenerPolicy to false, it starts working. They have cross-origin-opener-policy set to same-origin.

Hi Olli, wonder if you could take a look at this, please?

Status: RESOLVED → REOPENED
Flags: needinfo?(smaug)
Resolution: WORKSFORME → ---

(I'm pretty swamped atm)

From SessionStorageManager:

// sessionStorage is a data store that's unique to each tab (i.e. top-level
// browsing context) and origin. Before the advent of Fission all the data
// for a given tab could be stored in a single content process; now each
// site-specific process stores only its portion of the data. As content
// processes terminate, their sessionStorage data needs to be saved in the
// parent process, in case the same origin appears again in the tab (e.g.
// by navigating an iframe element). Therefore SessionStorageManager
// objects exist in both the parent and content processes.

So IIUC we have mechanisms to synchronize sessionStorage between content processes that were introduced with Fission but seem to not check for Fission, still for some reason they seem to fail here? Jan, do you remember something specific?

Flags: needinfo?(smaug) → needinfo?(jvarga)
Assignee: nobody → jjalkanen

Desktop:

  • As already concluded, the issue does not seem to be reproducible with labs.openai.com
  • It is also not reproducible with Google's oauth2 and a local client
  • It is also not reproducible with a local setup where the site uses the same origin as the local oauth2 server
  • The site does not store anything in the SessionStore

Android:

  • The issue is reproducible and the login bar does not turn into activity bar with chat, profile etc. after the first login sequence is completed
  • As far as I can tell, SessionStore is used to store the tab state but not any site data
  • After the first login, pressing the login button does not however initiate a new authentication sequence
  • When both browser.tabs.remote.use* settings under about:config are set to false, login bar changes to the activity bar after the login sequence
  • With Chrome, the activity bar gets loaded after the login without any configuration
  • By comparing the logs, after the login is complete, it seems like something fails silently before the Chat Manager component from the activity bar (blue tile with c+ on it) is initialized, and we're actually logged in but just can't successfully change the GUI
  • Chat manager seems to be based on axios, which appears to use WebSockets through socket.io
  • There are reports where the cause for this kind of problem was found from the CORS headers e.g. axios-post-works-in-chrome-but-not-in-firefox

Because storage does not seem to be the root cause, and WebSocket tests don't run on Android as per 1566168, I will assign this for now to the Networking:Websockets component for better analysis.

Assignee: jjalkanen → nobody
Component: Storage: localStorage & sessionStorage → Networking: WebSockets
Flags: needinfo?(jvarga)

For what it's worth, the last successful request after the login seems to be GET ... manifest.json, and in the case when the browser.tabs.remote.use* settings in about:config are set to false, the requests after that have cross-origin-opener-policy set to true but do not have a defined cross-origin-embedder-policy.

I can reproduce this on android on beta.character.ai. Once I set browser.tabs.remote.useCrossOriginOpenerPolicy to false, the bug goes away, but interestingly, I wasn't able to make the bug go away by changing the callsites be a hardcoded false. I'm assuming there's something I missed, but I'll keep looking into it.

Whiteboard: [necko-triaged][necko-priority-queue]

I finally figured out I was editing the wrong code.
It turns out that setting useCrossOriginOpenerPolicy to false does the trick, as that directly the value of DocumentLoadListener::HasCrossOriginOpenerPolicyMismatch()

https://searchfox.org/mozilla-central/rev/c56cc57d547a5febcc34160e581f9d58c0326aa7/netwerk/ipc/DocumentLoadListener.cpp#1863-1867

auto optionsResult = IsolationOptionsForNavigation(
    browsingContext->Top(), switchToNewTab ? nullptr : parentWindow.get(),
    GetChannelCreationURI(), mChannel, currentRemoteType,
-   HasCrossOriginOpenerPolicyMismatch(), switchToNewTab, mLoadStateLoadType,
+   false, switchToNewTab, mLoadStateLoadType,
    mDocumentChannelId, mRemoteTypeOverride);

In the code above, if I pass false instead of HasCrossOriginOpenerPolicyMismatch() then I can log into beta.character.ai without a problem.
Nika, I'm not exactly sure how the process switch code is supposed to work on Android where fission.autostart is disabled. What should DocumentLoadListener do in that case?

Flags: needinfo?(nika)

Given that the COOP header is set here, a BCG switch (which may have a corresponding process switch) will happen both with Fission enabled and with Fission disabled. After the BCG switch, the SessionStorage should be preserved into the new BrowsingContext as-per the discussion from bug 1656768.

The main difference here is actually whether or not SHIP is enabled, rather than whether Fission is enabled (though Fission enables SHIP so they're quite connected). When SHIP is enabled, we ensure that the SessionStorage carries over to the new BrowsingContext by calling BackgroundSessionStorageManager::PropagateManager from ReplacedBy. This is not called in when SHIP is disabled, so the SessionStorage isn't immediately copied over.

Instead, when SHIP is disabled, both the session history and session storage are copied into the new process using the session store. On desktop this happens by the session storage being collected by prepareToChangeRemoteness, and then restored from afterChangeRemoteness. Because this uses sessionstore though, there is a limit on the maximum size of sessionstore data which can be restored in this way, which is fairly small to keep the session data small (https://searchfox.org/mozilla-central/rev/c56cc57d547a5febcc34160e581f9d58c0326aa7/modules/libpref/init/StaticPrefList.yaml#1568-1573). It wouldn't surprise me if the data being stored in this situation is exceeding the 2k byte size limit, so is being discarded on desktop with Fission disabled.

It appears that the situation on Android is even worse than on desktop - the session state collection for the process switch appears to not collect or store session storage data at all, only collecting session history, so there's no mechanism for the session storage to be persisted (https://searchfox.org/mozilla-central/rev/c56cc57d547a5febcc34160e581f9d58c0326aa7/mobile/android/chrome/geckoview/geckoview.js#183). I don't think there's even a mechanism to collect it to be restored when the session is restored at all (https://searchfox.org/mozilla-central/rev/c56cc57d547a5febcc34160e581f9d58c0326aa7/mobile/android/actors/GeckoViewContentChild.sys.mjs#74-103).

Given that we don't support non-SHIP on desktop anymore, and that Android will never try to restore session storage using sessionrestore from afterChangeRemoteness, the easiest fix here might be to remove the SessionHistoryInParent() check around the call to PropagateManager (https://searchfox.org/mozilla-central/rev/c56cc57d547a5febcc34160e581f9d58c0326aa7/docshell/base/CanonicalBrowsingContext.cpp#384-386). I think this would mean that the session store state would be propagated as it is with Fission enabled across BCG swaps, even with SHIP enabled.

Flags: needinfo?(nika)

With non-SHIP Desktop we would propagate the session storage across
process switches using session restore, so the internal propagation was
diabled. However, android's session restore has no handling for session
storage, meaning that all session storage is discarded.

This changes the logic to always use the internal propagation, even when
SHIP is disabled.

Assignee: nobody → nika
Component: Networking: WebSockets → Storage: localStorage & sessionStorage
Whiteboard: [necko-triaged][necko-priority-queue]
Pushed by nlayzell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b775cfba47d1
Propagate session storage across COOP switches without SHIP, r=smaug,asuth
Status: REOPENED → RESOLVED
Closed: 2 years ago1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 119 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: