Open Bug 1749962 Opened 4 years ago Updated 13 days ago

Losing Treeherder authentication every few hours

Categories

(Core :: Networking: Cookies, defect, P2)

People

(Reporter: gerard-majax, Unassigned, NeedInfo)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [necko-triaged][necko-monitor])

Attachments

(2 files)

This has been happening for several days, across several Nightly builds on Linux/Ubuntu 21.10. The current build is 20220111093827.

STR:

  1. Push to try
  2. Open treeherder link on try, authenticate
  3. Leave the tab open while waiting for completion
  4. Come back a few hours later

Expected:
I'm still logged in

Actual:
I'm not logged in on Treeherder anymore.

Devtools console shows:

Could not renew login: 
Object { error: "login_required", errorDescription: "Login required", state: "xxx" }
AuthService.js:73:14
    _renewAuth AuthService.js:73

I need to clarify a few things. How do you know you're not logged in anymore after leaving the tab open for a few hours (besides looking at the console)? Do you try to perform a specific action and do you get an error message? Please try to provide as many details as possible.

Flags: needinfo?(lissyx+mozillians)

(In reply to Sarah Clements [:sclements] from comment #1)

I need to clarify a few things. How do you know you're not logged in anymore after leaving the tab open for a few hours (besides looking at the console)? Do you try to perform a specific action and do you get an error message? Please try to provide as many details as possible.

Because:

  • I have "Login / Register" in the top right corner
  • Trying to retrigger tasks fails with the usual tc-level error showing I'm not authenticated

As I said and described, I'm not doing anything at all. Push to try, wait a few hours doing something else, come back to the tab to check, and kaboom, creds lost.

Flags: needinfo?(lissyx+mozillians)

Might be similar to another bug 1646809. I'll try to look at this Monday but it might not be an easy fix.

Alexandre, has anything changed recently with regards to cookies being blocked or anything like that? Can you try opening Treeherder in private tab or window and see if the issue persists?

Also, next time this happens, please open the network tab and see if a call to api/auth/login/ was made and what the response is.

(In reply to Sarah Clements [:sclements] from comment #4)

Alexandre, has anything changed recently with regards to cookies being blocked or anything like that? Can you try opening Treeherder in private tab or window and see if the issue persists?

No, nothing has changed to the best of my knowledge.

(In reply to Sarah Clements [:sclements] from comment #5)

Also, next time this happens, please open the network tab and see if a call to api/auth/login/ was made and what the response is.

There was no api/auth/login call being made when:

Could not renew login:
Object { error: "login_required", errorDescription: "Login required", state: "" }
error: "login_required"
errorDescription: "Login required"
state: ""
<prototype>: Object { … }
AuthService.js:73:14
    _renewAuth AuthService.js:73

(In reply to Sarah Clements [:sclements] from comment #4)

Alexandre, has anything changed recently with regards to cookies being blocked or anything like that? Can you try opening Treeherder in private tab or window and see if the issue persists?

Just reproduced in a private window. In both non-private and private, I have just noticed one non error message:

Cookie "com.auth0.auth.STATE" has its "SameSite" policy set to "Lax" because its "SameSite" attribute is not set and "SameSite=Lax" is the default value for this attribute.

That is, the cookie com.auth0.auth.STATE (with STATE being the state value in the error already mentioned) has a SameSite policy of Lax because its SameSite value is undefined and the default is SameSite=Lax.

Aha! It might be related to the SameSite cookie issue, which was just fixed in a newly published release, so I'll update the auth0 library we're using and do some local testing.

Assignee: nobody → sclements

Chris, can you help with this? I need to test a patch that requires a login.

Flags: needinfo?(cvalaas)

Sure, what do you need?

Flags: needinfo?(cvalaas)

(In reply to chris valaas [:cvalaas] from comment #11)

Sure, what do you need?

Oops, I needinfo'd you on the wrong bug :) It's bug 1751163 I need help with.

Depends on: 1751163
No longer depends on: 1751163

I have a PR that I need to test that I think will address the SameSite cookie issue, but I think this is a longstanding auth/login issue that might not even be specific to Treeherder.

See Also: → 1643117, 1646809

(In reply to Sarah Clements [:sclements] from comment #13)

I have a PR that I need to test that I think will address the SameSite cookie issue, but I think this is a longstanding auth/login issue that might not even be specific to Treeherder.

Any news ?

Flags: needinfo?(sclements)

Sorry, I've moved to another team and this fell off the radar due to other priorities. I have my PR on prototype... can you also log in and see if the SameSite cookie issue happens on that deployment as well? https://prototype.treeherder.nonprod.cloudops.mozgcp.net/

Flags: needinfo?(sclements)

(In reply to Sarah Clements [:sclements] from comment #15)

Sorry, I've moved to another team and this fell off the radar due to other priorities. I have my PR on prototype... can you also log in and see if the SameSite cookie issue happens on that deployment as well? https://prototype.treeherder.nonprod.cloudops.mozgcp.net/

I'm unsure about the SameSite issue, but I still get the same "login required" error in the console on that deployment.

Yeah, I think that's part of the issues flagged in the other bugs (added in the "See Also" section). And I still see the SameSite cookie issue... there are still some ongoing issues with the fix on the auth0 end, even with v9.19.0, which I also tested: https://github.com/auth0/auth0.js/issues/1220#issuecomment-1024275244.

But I'll have to unassign myself from this since I'm on the frontend team now.

Assignee: sclements → nobody
Summary: Loosing Treeherder authentication every few hours → Losing Treeherder authentication every few hours

I cannot say why, but for the past few days the issue seems to have gone away. Is this a fix on the Treeherder side or in its dependencies? Is this a Firefox-side fix?

Since Sarah is not working on this anymore, Joel, maybe you know whether there was a recent change on the Treeherder side that could correlate?

Flags: needinfo?(jmaher)

I would be surprised if this is treeherder side. We have done very little to change things. Any package upgrades in the last week or so have been related to development of treeherder. Maybe this is a browser thing?

Flags: needinfo?(jmaher)

This started happening again a few days ago. I logged in this morning around 0700; it's 0910 and I had to log in again.

Seeing this on a Treeherder tab where I left the debug console open:

09:28:56,596
Because some cookies misuse the "SameSite" attribute, it will not work as expected. 3
09:28:56,981 Cookie "csrftoken" with the "SameSite" attribute value "Lax" or "Strict" was omitted because of a cross-site redirect. ProxyChannelFilter.jsm:425:18
09:28:56,981 Cookie "sessionid" with the "SameSite" attribute value "Lax" or "Strict" was omitted because of a cross-site redirect. ProxyChannelFilter.jsm:425:18
09:28:57,311
Could not renew login: 
Object { error: "login_required", errorDescription: "Login required", state: "aau5tw0xjkolPhk6ayhZ4_Q0usZjZz9E" }
AuthService.js:73:14
    _renewAuth AuthService.js:73

Still happening.

It was suggested that I remove all storage related to treeherder.mozilla.org in the DevTools Storage tab, and since then I have not been able to reproduce the issue.

Expired authentication cookies are not purged fast enough in Firefox (bug 691973) and become too large in total to be sent completely.

Workaround:

  1. Have a treeherder tab open.
  2. Open the DevTools' Storage tab (Shift + F9).
  3. On the left, expand "Cookies".
  4. Right-click the http://treeherder.mozilla.org item.
  5. Choose Delete All.

Depends on: 691973
Duplicate of this bug: 1646809

Update: The expired cookies don't get sent - might hit a storage limit.

No more instance so far, after a full week.

Same, I cannot reproduce.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME

This is happening again. So there is something fishy. Not sure if treeherder is pushing storage too far or what.

I've started to once again experience this too.

Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---

Are you both seeing too large cookies?

(In reply to Marco Castelluccio [:marco] from comment #31)

Are you both seeing too large cookies?

How can I verify that?

You can follow the steps from comment 24 (stopping at step 3), the size is one of the columns. You could copy and paste the data here (filtering out the "value" column).

Having paid closer attention, what I'm seeing is not exactly the issue described in comment 0. Rather, it's when I open a new Treeherder tab that I find myself logged out, even if just logged in a few hours ago in a tab that I've since closed.

Cleaned up my cookies three days ago, and repro'd several times now.

Multiple (5) cookies named com.auth0.auth with size 152, one csrftoken of size 73 and one sessionid of size 41.
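For reference, the sizes above can be added up to see how far they are from any plausible limit. This is only a rough sketch: the sizes are the ones reported in this comment, and the 4096-byte figure is the per-cookie minimum that RFC 6265 asks user agents to support, used here as an assumed point of comparison, not as Firefox's actual limit.

```python
# Sizes (in bytes) as reported from the DevTools Storage tab in this comment.
reported = {
    "com.auth0.auth": [152] * 5,  # five cookies sharing this name prefix
    "csrftoken": [73],
    "sessionid": [41],
}

# Assumption for comparison only: RFC 6265 per-cookie minimum support size.
PER_COOKIE_REFERENCE_LIMIT = 4096


def summarize(cookies):
    """Return (total bytes across all cookies, size of the largest cookie)."""
    total = sum(sum(sizes) for sizes in cookies.values())
    largest = max(max(sizes) for sizes in cookies.values())
    return total, largest


total, largest = summarize(reported)
print(f"total={total} bytes, largest={largest} bytes")
print("largest cookie under reference limit:", largest < PER_COOKIE_REFERENCE_LIMIT)
```

With these numbers the whole set is under 1 KB, which is why oversized cookies look unlikely for this particular snapshot.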

(In reply to Botond Ballo [:botond] from comment #34)

Having paid closer attention, what I'm seeing is not exactly the issue described in comment 0. Rather, it's when I open a new Treeherder tab that I find myself logged out, even if just logged in a few hours ago in a tab that I've since closed.

Update on this: I also lose my logged-in state if I refresh a Treeherder tab that's been open for a few hours.

I did check Cookies in the Storage tab in devtools prior to refreshing, and there was only one cookie with length 73, so it doesn't seem to be an issue of cookies getting too large.

This is still happening. Every 30 mins, dev console on an open treeherder tab shows a failure to renew login.

I've cleaned up cookies and local storage for both Treeherder and Taskcluster, as well as logging out and back in on both, with no luck.

Did you also try the steps to delete cookies from comment 24?

Are you using Treeherder in the default container or another container?

Are you able to reproduce in a clean Firefox profile?

(In reply to Marco Castelluccio [:marco] from comment #38)

Did you also try the steps to delete cookies from comment 24?

yes ...

Are you using Treeherder in the default container or another container?

non default, "Professional" where I host all mozilla-related stuff

Are you able to reproduce in a clean Firefox profile?

Running several profiles in parallel is going to be really painful. Can't we just add more debugging to Treeherder to localize what is failing? We already have some errors popping up in the console ...

(In reply to Alexandre LISSY :gerard-majax from comment #39)

Are you using Treeherder in the default container or another container?

non default, "Professional" where I host all mozilla-related stuff

This might be linked to why you're seeing this. I have seen similar issues on Taskcluster with a non-default container (bug 1456161). Botond, are you also using a non-default container?

Are you able to reproduce in a clean Firefox profile?

Running several profiles in parallel is going to be really painful. Can't we just add more debugging to Treeherder to localize what is failing? We already have some errors popping up in the console ...

Yes, but unfortunately we have no Auth0 experts on the team at the moment, so it won't be super quick. It's very likely this is a broader bug than just Treeherder, BTW.

(In reply to Marco Castelluccio [:marco] from comment #40)

Botond, are you also using a non-default container?

Yes, Treeherder is configured to always open in my "Work" container.

OK, I think we have a pattern here. This is likely a container bug.

How can we track / debug this? Who knows about containers and cookies interactions ?

Component: Treeherder → DOM: Security
Product: Tree Management → Core
Version: --- → unspecified

laxByDefault is turned off in Nightly (and elsewhere) so that's not the cause. I don't know why containers would make any difference unless your auth0 site was pinned to a different container, but you'd notice that because it would open up new tabs when it navigated there and back.

Multiple (5) cookies named com.auth0.auth with size 152, one csrftoken of size 73 and one sessionid of size 41.

you shouldn't be able to have multiple cookies of the same name in the same container. Does DevTools segregate cookies by origin attributes or does it show all cookies in all containers? The latter wouldn't be very useful

Flags: needinfo?(mcastelluccio)

Redirecting needinfo, as I'm not able to reproduce this bug.

Flags: needinfo?(mcastelluccio) → needinfo?(lissyx+mozillians)

If the question is about how cookies are displayed in DevTools, you need to find someone knowledgeable about that.

Flags: needinfo?(lissyx+mozillians)

I see Lax in the SameSite column ?

(In reply to Daniel Veditz [:dveditz] from comment #44)

I don't know why containers would make any difference unless your auth0 site was pinned to a different container, but you'd notice that because it would open up new tabs when it navigated there and back.

Hmm, I do get a new tab opening whenever I log into Treeherder (occasionally more than one, but only one remains open). It's in the same container though.

Treeherder and Firefox-CI are both set up to open by default in the same container, so there should be no problem there. Unless there's some third party I don't know about?

When you login, you go through auth0.

(In reply to Marco Castelluccio [:marco] from comment #50)

When you login, you go through auth0.

Right. As far as I can tell, it goes through the container as well.

Weird. I had a new error showing in the console, "login_required: multi-factor authentication required". I guess it's an Auth0 expiration (my Google Calendar tab, which lives in the same container, expired at the same time). I logged out of Firefox-CI as well as Treeherder and re-authenticated everywhere (except Auth0, since I had re-authenticated there for Google a few minutes before), and since that moment yesterday afternoon I have not lost auth on Treeherder.

So far there's been no identified cookie or container bug, and could well be auth0 doing the wrong thing. laxByDefault cookies may have been implicated in earlier messages, but those are no longer enabled by default.

If you think it's cookies, please turn on logging and capture the traffic from when it's working to when it starts failing.

Component: DOM: Security → Treeherder
Product: Core → Tree Management
Version: unspecified → ---

Dan, since this is only happening with containers enabled and never happening with containers disabled, shouldn't we consider it a containers bug?

This started again for me a few days ago.

Unfortunately there is nothing we can do in Treeherder, it is a Core bug.
Dan, I'm going to move this back to Core::DOM: Security since it's only reproducible with containers enabled and I assume that's the right component for containers bugs. If there's a better component for Containers bug, feel free to move it again.

Component: Treeherder → DOM: Security
Product: Tree Management → Core
Version: --- → unspecified

I am reproducing this without containers.
I am using 3 systems: one at the Paris office, two at home.
Tracking protection is set to Strict.

re-upping:

If you think it's cookies, please turn on logging and capture the traffic from when it's working to when it starts failing.

Flags: needinfo?(sledru)
Flags: needinfo?(mcastelluccio)

For the people using Containers: are all the sites involved in the same container? Or are some "pinned" to a different container?

Does Auth0 have a feature where it might de-auth some cookie from the same IP address if there's an auth in a different container? off-hand I'd say no 1) that doesn't make sense and 2) I commonly work with multiple auth0/container sessions without problems

(In reply to Daniel Veditz [:dveditz] from comment #58)

re-upping:

If you think it's cookies, please turn on logging and capture the traffic from when it's working to when it starts failing.

Redirecting needinfo to :gerard-majax and :botond, since I can't reproduce this myself.

Flags: needinfo?(mcastelluccio)
Flags: needinfo?(lissyx+mozillians)
Flags: needinfo?(botond)

(In reply to Daniel Veditz [:dveditz] from comment #59)

For the people using Containers: are all the sites involved in the same container? Or are some "pinned" to a different container?

I had pinned auth0 and others to the "Professional" container, whereas GitHub was on "Personal"; with the move to enforce SSO on GitHub, I now have to authenticate in both containers. I have not noticed an immediate problem due to this.

Does Auth0 have a feature where it might de-auth some cookie from the same IP address if there's an auth in a different container? off-hand I'd say no 1) that doesn't make sense and 2) I commonly work with multiple auth0/container sessions without problems

(In reply to Daniel Veditz [:dveditz] from comment #53)

So far there's been no identified cookie or container bug, and could well be auth0 doing the wrong thing. laxByDefault cookies may have been implicated in earlier messages, but those are no longer enabled by default.

If you think it's cookies, please turn on logging and capture the traffic from when it's working to when it starts failing.

Given that we are talking about behavior potentially spanning days or weeks, is it really an actionable item?

That being said, I'll be away for a long period soon, so I don't think I can manage that kind of tracking before mid-September at best.

Flags: needinfo?(lissyx+mozillians)

With a new profile and strict tracking protection enabled, the login persisted on Treeherder after 2+ hours.

Do all affected people have at least 2 mozilla.com SSO accounts?

Sylvestre does not use containers, but question for the others: The issue does not reproduce here with the following hosts assigned to the container used for Treeherder, and they are set to always open in such a container:

  • firefox-ci-tc.services.mozilla.com
  • login.taskcluster.net
  • tools.taskcluster.net (likely not necessary for you)
  • treeherder.mozilla.org

sso.mozilla.com is not assigned to any container but inherits the container environment of the opener.

Does anybody notice differences in their setup?

I don't know how to get an up-to-date list of what is supposed to open in which container, but I have at least forced firefox-ci-tc.services.mozilla.com as well as treeherder.mozilla.org into the "Professional" container.

OK, it clearly still reproduces: I logged out of Treeherder and Taskcluster right when I posted the previous comment, then connected again:

07:55:54,029 Could not renew login: 
Object { error: "login_required", errorDescription: "Login required", state: "u7IHhiY_BB27b8fyutKNiwMQrhB0HONP" }
AuthService.js:73:14
    _renewAuth AuthService.js:73

I've started logging using Cookies preset, and after 5 minutes it's already produced ~400MB of logs. Is this going to be actionable ? How am I supposed to share it with you ?

Flags: needinfo?(dveditz)

1.8GB of logs; re-auth'd at 0804 and the first DevTools console error was reported at 0819.

Error popping up circa 0819 in the console, and this in the logs:

2023-07-27 06:19:15.489146 UTC - [Parent 3113884: Main Thread]: W/cookie ===== COOKIE NOT SENT =====
2023-07-27 06:19:15.489149 UTC - [Parent 3113884: Main Thread]: W/cookie request URL: https://treeherder.mozilla.org/assets/runtime.508cf28a.js
2023-07-27 06:19:15.489154 UTC - [Parent 3113884: Main Thread]: W/cookie current time: Thu Jul 27 06:19:15 2023 GMT
2023-07-27 06:19:15.489156 UTC - [Parent 3113884: Main Thread]: W/cookie rejected because cookies are disabled
2023-07-27 06:19:15.489158 UTC - [Parent 3113884: Main Thread]: W/cookie
2023-07-27 06:19:15.491551 UTC - [Parent 3113884: Main Thread]: W/cookie ===== COOKIE NOT SENT =====
2023-07-27 06:19:15.491554 UTC - [Parent 3113884: Main Thread]: W/cookie request URL: https://treeherder.mozilla.org/assets/397.ca13cc41.js
2023-07-27 06:19:15.491558 UTC - [Parent 3113884: Main Thread]: W/cookie current time: Thu Jul 27 06:19:15 2023 GMT
2023-07-27 06:19:15.491560 UTC - [Parent 3113884: Main Thread]: W/cookie rejected because cookies are disabled
2023-07-27 06:19:15.491569 UTC - [Parent 3113884: Main Thread]: W/cookie
2023-07-27 06:19:15.502885 UTC - [Parent 3113884: Main Thread]: W/cookie ===== COOKIE NOT SENT =====
2023-07-27 06:19:15.502888 UTC - [Parent 3113884: Main Thread]: W/cookie request URL: https://treeherder.mozilla.org/assets/index.0b2df09f.js
2023-07-27 06:19:15.502892 UTC - [Parent 3113884: Main Thread]: W/cookie current time: Thu Jul 27 06:19:15 2023 GMT
2023-07-27 06:19:15.502895 UTC - [Parent 3113884: Main Thread]: W/cookie rejected because cookies are disabled
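Since the full log is far too large to read by hand, blocks like the ones above can be summarized mechanically. The following is a minimal sketch, assuming the line layout seen in this excerpt (a `]: W/cookie` prefix, a `=====` header line, then `request URL:`, `current time:` and a free-text reason); `parse_cookie_log` is a hypothetical helper name, not an existing tool.

```python
import re
from collections import Counter

# Captures whatever follows the "<level>/cookie" module tag on a log line.
MESSAGE = re.compile(r"\]: \w/cookie ?(.*)$")

SAMPLE = """\
2023-07-27 06:19:15.489146 UTC - [Parent 3113884: Main Thread]: W/cookie ===== COOKIE NOT SENT =====
2023-07-27 06:19:15.489149 UTC - [Parent 3113884: Main Thread]: W/cookie request URL: https://treeherder.mozilla.org/assets/runtime.508cf28a.js
2023-07-27 06:19:15.489154 UTC - [Parent 3113884: Main Thread]: W/cookie current time: Thu Jul 27 06:19:15 2023 GMT
2023-07-27 06:19:15.489156 UTC - [Parent 3113884: Main Thread]: W/cookie rejected because cookies are disabled
2023-07-27 06:19:15.489158 UTC - [Parent 3113884: Main Thread]: W/cookie
2023-07-27 06:19:15.491551 UTC - [Parent 3113884: Main Thread]: W/cookie ===== COOKIE NOT SENT =====
2023-07-27 06:19:15.491554 UTC - [Parent 3113884: Main Thread]: W/cookie request URL: https://treeherder.mozilla.org/assets/397.ca13cc41.js
2023-07-27 06:19:15.491558 UTC - [Parent 3113884: Main Thread]: W/cookie current time: Thu Jul 27 06:19:15 2023 GMT
2023-07-27 06:19:15.491560 UTC - [Parent 3113884: Main Thread]: W/cookie rejected because cookies are disabled
2023-07-27 06:19:15.491569 UTC - [Parent 3113884: Main Thread]: W/cookie
""".splitlines()


def parse_cookie_log(lines):
    """Group cookie log lines into one record per '=====' header."""
    records, current = [], None
    for line in lines:
        match = MESSAGE.search(line)
        if not match:
            continue  # not a cookie-module line
        message = match.group(1).strip()
        if message.startswith("====="):
            current = {"event": message.strip("= "), "url": None, "reason": None}
            records.append(current)
        elif current is None or not message:
            continue  # blank separator line, or text before any header
        elif message.startswith("request URL: "):
            current["url"] = message[len("request URL: "):]
        elif message.startswith("current time: "):
            continue  # timestamp is already in the line prefix
        else:
            current["reason"] = message
    return records


records = parse_cookie_log(SAMPLE)
reasons = Counter(record["reason"] for record in records)
for reason, count in reasons.items():
    print(count, reason)
```

Running the same function over the whole file (reading it line by line) would give a count per rejection reason and the list of affected URLs, which is much easier to share than gigabytes of raw log.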

I should have shared it with you.

Flags: needinfo?(dveditz)

... since it's only reproducible with containers enabled and I assume that's the right component for containers bugs. If there's a better component for Containers bug, feel free to move it again.

That appears not to be true, but I can leave it here while we dig deeper. If it's basic Cookie stuff then "Networking: Cookies" is the right component, but the "rejected because cookies are disabled" errors in comment 67 are unexpected. Clearly cookies are not (globally) disabled or you wouldn't have been able to authenticate in the first place. Is there some anti-tracking heuristic that's kicking in? I could believe that there might be problems with auto-granting Storage Access to 3rd parties, but I thought Total Cookie Protection isolated (double-keyed) 3rd party cookies, not disabled them. Also, the grants in the link above are for an overly-generous 30 days, not just hours.

Paul: any ideas here?

Flags: needinfo?(pbz)

I just checked on my machine and I'm observing the sessionid cookie being set with an Expires 2 hours in the future by https://treeherder.mozilla.org/api/auth/login. This would explain being logged out after a few hours of inactivity, right?

Flags: needinfo?(pbz)

(In reply to :gerard-majax [PTO 01/08/2023-10/09/2023] from comment #67)

Error popping circa 0819 in the console, and this in the logs:

2023-07-27 06:19:15.489146 UTC - [Parent 3113884: Main Thread]: W/cookie ===== COOKIE NOT SENT =====
2023-07-27 06:19:15.489149 UTC - [Parent 3113884: Main Thread]: W/cookie request URL: https://treeherder.mozilla.org/assets/runtime.508cf28a.js
2023-07-27 06:19:15.489154 UTC - [Parent 3113884: Main Thread]: W/cookie current time: Thu Jul 27 06:19:15 2023 GMT
2023-07-27 06:19:15.489156 UTC - [Parent 3113884: Main Thread]: W/cookie rejected because cookies are disabled
2023-07-27 06:19:15.489158 UTC - [Parent 3113884: Main Thread]: W/cookie
2023-07-27 06:19:15.491551 UTC - [Parent 3113884: Main Thread]: W/cookie ===== COOKIE NOT SENT =====
2023-07-27 06:19:15.491554 UTC - [Parent 3113884: Main Thread]: W/cookie request URL: https://treeherder.mozilla.org/assets/397.ca13cc41.js
2023-07-27 06:19:15.491558 UTC - [Parent 3113884: Main Thread]: W/cookie current time: Thu Jul 27 06:19:15 2023 GMT
2023-07-27 06:19:15.491560 UTC - [Parent 3113884: Main Thread]: W/cookie rejected because cookies are disabled
2023-07-27 06:19:15.491569 UTC - [Parent 3113884: Main Thread]: W/cookie
2023-07-27 06:19:15.502885 UTC - [Parent 3113884: Main Thread]: W/cookie ===== COOKIE NOT SENT =====
2023-07-27 06:19:15.502888 UTC - [Parent 3113884: Main Thread]: W/cookie request URL: https://treeherder.mozilla.org/assets/index.0b2df09f.js
2023-07-27 06:19:15.502892 UTC - [Parent 3113884: Main Thread]: W/cookie current time: Thu Jul 27 06:19:15 2023 GMT
2023-07-27 06:19:15.502895 UTC - [Parent 3113884: Main Thread]: W/cookie rejected because cookies are disabled

These messages seem odd indeed given that your client seems to accept cookies prior to that? Do you perhaps have some special cookie permissions set? You can check under about:preferences -> Cookies and Site Data -> Manage Exceptions
Also, what is your cookie behavior set to? The pref is network.cookie.cookieBehavior.

Flags: needinfo?(lissyx+mozillians)

(In reply to Paul Zühlcke [:pbz] from comment #71)

These messages seem odd indeed given that your client seems to accept cookies prior to that? Do you perhaps have some special cookie permissions set? You can check under about:preferences -> Cookies and Site Data -> Manage Exceptions

None.

Also, what is your cookie behavior set to? The pref is network.cookie.cookieBehavior.

5, default value

Flags: needinfo?(lissyx+mozillians)

FWIW the messages come from here: https://searchfox.org/mozilla-central/rev/4044c34031d035fadb588143297ba5421419d44b/netwerk/cookie/CookieService.cpp#1762, which means there is a BEHAVIOR_REJECT present somewhere. This value is usually controlled by per-origin cookie permissions or the global cookie behavior, neither of which seems to apply in your case...
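To make the mismatch explicit, here is a small lookup sketch for the network.cookie.cookieBehavior values. The names and numbers are my recollection of the nsICookieService constants and should be treated as an assumption to verify against the IDL; `describe` is a hypothetical helper.

```python
# Assumed mapping of network.cookie.cookieBehavior values to
# nsICookieService constant names (verify against nsICookieService.idl).
COOKIE_BEHAVIOR = {
    0: "BEHAVIOR_ACCEPT",
    1: "BEHAVIOR_REJECT_FOREIGN",
    2: "BEHAVIOR_REJECT",
    3: "BEHAVIOR_LIMIT_FOREIGN",
    4: "BEHAVIOR_REJECT_TRACKER",
    5: "BEHAVIOR_REJECT_TRACKER_AND_PARTITION_FOREIGN",
}


def describe(value):
    """Return the constant name for a pref value, or 'unknown'."""
    return COOKIE_BEHAVIOR.get(value, "unknown")


reported_pref = 5  # value given by the reporter in this bug
print(describe(reported_pref))
print("is BEHAVIOR_REJECT:", describe(reported_pref) == "BEHAVIOR_REJECT")
```

If the mapping is right, the reported default of 5 is not the reject-all value, so the "cookies are disabled" rejection would have to come from somewhere other than the global pref.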

(In reply to Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout) from comment #62)

Do all affected people have at least 2 mozilla.com SSO accounts?

Just one in my case.

Sylvestre does not use containers, but question for the others: The issue does not reproduce here with the following hosts assigned to the container used for Treeherder, and they are set to always open in such a container:

  • firefox-ci-tc.services.mozilla.com
  • login.taskcluster.net
  • tools.taskcluster.net (likely not necessary for you)
  • treeherder.mozilla.org

sso.mozilla.com is not assigned to any container but inherits the container environment of the opener.

Does anybody notice differences in their setup?

I had sso.mozilla.com and treeherder.mozilla.org assigned to my Work container, but not the others. I assigned firefox-ci-tc.services.mozilla.com and taskcluster.net [note: the subdomains login.taskcluster.net and tools.taskcluster.net do not resolve for me, so I just assigned the primary domain] to my Work container as well, but that didn't seem to have any impact on the behaviour.

Flags: needinfo?(botond)

Same, I have only one SSO account (AFAIK).
It seems that Alexandre tried the cookie logging, so I guess I don't need to do it :)

Flags: needinfo?(sledru)

I logged in this morning (~10:30), and now (13:21) the session is already lost. I've already shared extensive debugging sessions; I am unsure what more I can do ...

Component: DOM: Security → Networking: Cookies

This is still happening for me

Hi :gerard-majax or :dveditz,

I should have shared it with you.

IIUC, some logs were uploaded or sent somewhere. Can someone DM them (or their location since they are likely large) to me or necko@mozilla.com?

Thanks

Flags: needinfo?(lissyx+mozillians)
Flags: needinfo?(dveditz)

(In reply to Ed Guloien [:edgul] from comment #78)

Hi :gerard-majax or :dveditz,

I should have shared it with you.

IIUC, some logs were uploaded or sent somewhere. Can someone DM them (or their location since they are likely large) to me or necko@mozilla.com?

Thanks

Unfortunately, I have not kept the key I used to sign the logs and I only kept the signed version locally. Maybe Daniel has kept it ?

Flags: needinfo?(lissyx+mozillians)

Things were fine for a few weeks, and now it looks like it is failing again?

What are the oldest values for treeherder.mozilla.org cookies in the Last accessed and Expires / Max-Age columns? Does this indicate that a purge of cookies for this host happened, which freed up space that has filled up again and now reached its limit?

(In reply to Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout) from comment #81)

What are the oldest values for treeherder.mozilla.org cookies in the Last accessed and Expires / Max-Age columns? Does this indicate that a purge of cookies for this host happened, which freed up space that has filled up again and now reached its limit?

given that even after re-logging earlier and waiting a bit I see again:

Could not renew login:
Object { error: "login_required", errorDescription: "Login required", state: "xxx" }

I doubt it's the cause

There are many cookies under treeherder.mozilla.org with an expiration date already in the past; even sessionid has Expires / Max-Age: "Mon, 23 Oct 2023 16:53:13 GMT", my time being 1752 (CEST).
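Because the cookie timestamp is in GMT and the wall-clock time is in CEST, it may help to compare the two in UTC. This is a sketch under two assumptions that are not stated in the comment: that it was written on 23 Oct 2023, and that "1752 (CEST)" means 17:52 at UTC+2.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Expiry exactly as shown in the Storage tab (HTTP-date, GMT).
expires = parsedate_to_datetime("Mon, 23 Oct 2023 16:53:13 GMT")

# Assumptions: the comment date is 2023-10-23 and CEST is UTC+2.
cest = timezone(timedelta(hours=2))
local_now = datetime(2023, 10, 23, 17, 52, tzinfo=cest)

# Aware datetimes can be subtracted directly; convert to UTC for clarity.
remaining = expires - local_now.astimezone(timezone.utc)
print("time left before sessionid expires:", remaining)
print("already expired:", remaining < timedelta(0))
```

Under those assumptions 17:52 CEST is 15:52 UTC, so this particular sessionid value would still have about an hour left; the other cookies would need the same check individually.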

Again ... I'm OK to collect more data, but I'd like that not to be a waste of my time. Can someone seriously have a look at that once I collect cookies data? And how should I collect it?

It's really, really painful on a day-to-day basis to lose the session every 20-30 minutes.

Of course, when I start logging, no repro ...

Thanks for trying again. I will take a look at the newest logs (or any logs for that matter) if I can get a copy.

(In reply to :gerard-majax from comment #84)

Of course, when I start logging, no repro ...

Worse: I did repro, but I missed it and have since removed the log :(

Ok, so I have a small log with the following STR:

  • force-reload treeherder tab
  • console contains:

Could not renew login:
Object { error: "login_required", errorDescription: "Login required", state: "..." }
error: "login_required"
errorDescription: "Login required"
state: "..."

Is it enough for you, or do you also need the full login sequence ?

Flags: needinfo?(edgul)

Note that I am easily able to reproduce this.

  1. Login on treeherder
  2. Set system time to a few hours later - later than cookie expiry time.
  3. Refresh page.

The treeherder tab will then not be logged in, but if I click on the login button, the page just logs in without further action.
I get the exact same behaviour in Chrome. Is this the issue everyone else is seeing? If so I'd say it's a treeherder bug, not caused by Firefox.

Treeherder uses Auth0 for authentication and renews every 15 minutes. During debugging, gerard-majax had mentioned he leaves Treeherder open but still gets logged out. This behavior is not observed by code sheriffs or me.

For the current logged in session, there are 2 cookies for treeherder.mozilla.org which have not expired: csrftoken and sessionid

If it's really a treeherder bug, can we assign it to the right person? And what data can I share to help?

I'm not sure what the expected behaviour of the interaction between treeherder and auth0 is, so it would be helpful to get some server-side expertise to review this. But in the provided log, which the reporter says covers the period from login to login failure, I note the following events. It's not exactly clear when the failure happened, but my guess would be somewhere between the 15 and 22 min marks.

Time 0 (min from login):

  • set-cookie from treeherder csrftoken expiry of 1 year via expires AND max-age
  • set-cookie from treeherder sessionid expiry of 2 hours via expires AND max-age

Time 15 (min from login):

  • we accept 2 new cookies with names like _com.auth0.auth. expiring in 30 min from now

Time 22 (min from login):

  • we are still using the same csrftoken & sessionid when we send cookies to treeherder, we also see the _com.auth0.auth cookies go out with the original cookies in the Cookie header.
  • We note the last treeherder request goes out around this time
  • Followed shortly with the last treeherder response (success) that contains no set-cookie header
  • ...
  • Something triggers cache clearing (not sure what is removed exactly, nor if relevant). Though reporter says there was no user initiated cache clearing.

Time >22:

  • no further treeherder requests go out, so no cookie setting or sending relevant to treeherder
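The timeline above can be replayed with a small Python sketch to see which cookies should still be valid at a given minute. The lifetimes are the ones listed in the timeline; the `Cookie` class, the `valid_names` helper and the 365-day stand-in for "1 year" are assumptions made for illustration, and nothing here is read from the actual log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cookie:
    name: str
    set_at_min: float    # minutes after login when the cookie was set
    lifetime_min: float  # lifetime in minutes, counted from set_at_min

    def valid_at(self, minute: float) -> bool:
        return self.set_at_min <= minute < self.set_at_min + self.lifetime_min


# Lifetimes taken from the timeline; 365 days stands in for "1 year".
COOKIES = [
    Cookie("csrftoken", 0, 365 * 24 * 60),
    Cookie("sessionid", 0, 2 * 60),
    Cookie("_com.auth0.auth.*", 15, 30),
]


def valid_names(minute: float) -> list[str]:
    """Names of the cookies that have been set and have not yet expired."""
    return [c.name for c in COOKIES if c.valid_at(minute)]


for minute in (0, 22, 46, 121):
    print(minute, valid_names(minute))
```

By this arithmetic all three cookies are still valid at the 22 min mark, so expiry alone would not explain a failure in the 15-22 min window.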

The relevant interactions (bugzilla, auth0, treeherder) seem to all be in the same container, so it doesn't really look like a problem with containerizing as others have mentioned above. Though, I am wondering if cookie partitioning may affect this. :gerard-majax, What mode of ETP are you using? And I'm not seeing any discussion of other browsers above. Can you try this in another browser and see if this replicates?

:aryx,
Since auth0 is supposed to renew every 15 min, should we expect the initial csrftoken and sessionid cookies to be updated at the 15 min mark, or does auth0 handle renewal through the additional cookies somehow?

Flags: needinfo?(lissyx+mozillians)
Flags: needinfo?(edgul)
Flags: needinfo?(aryx.bugmail)

What mode of ETP are you using?

I didn't even know what ETP stands for. It is Enhanced Tracking Protection, and according to about:preferences it is set to Standard.

And I'm not seeing any discussion of other browsers above. Can you try this in another browser and see if this replicates?

It's already eating a lot of my time, and there are already reports here of people with similar setup to mine that don't reproduce, so I am not really sure it is worth testing in another browser ...

Flags: needinfo?(lissyx+mozillians)
Severity: -- → S3
Priority: -- → P2
Whiteboard: [necko-triaged][necko-priority-queue]

I'm really not sure what is happening; I re-authenticated yesterday evening, and this morning the session is not lost.

Kershaw is also not seeing this bug anymore.

Whiteboard: [necko-triaged][necko-priority-queue] → [necko-triaged][necko-monitor]
Status: REOPENED → NEW

It was fine for some time and today I've already lost auth 5 times ?

I wonder if this is happening close to the version bump day?

No idea, but since my last message it has happened almost every day, multiple times per day.

It's back, and now I think I even suffer from it on the Firefox CI TaskCluster instance ... ?

Flags: needinfo?(dveditz)

Is this still happening? Any further info?

Flags: needinfo?(lissyx+mozillians)

It still happens for me.

(In reply to Randell Jesup [:jesup] (needinfo me) from comment #99)

Is this still happening? Any further info?

Yes.

Flags: needinfo?(lissyx+mozillians)

Status update:
It's reproducible for me with Treeherder's staging instance but not production in the work Firefox profile. In a profile set up for testing and which idled more than two hours, it was the opposite. Investigation will continue.

It's still happening, and much more often: judging by when Login/Register reappears, maybe every 30 min to 1 h.

This is still happening for me ...

Likewise.

Attached image lots_of_cookies.png

My theory is that something is causing the cookies to be evicted - most likely by creating lots of other cookies, and hitting the cookie limit.
As someone reported on matrix, they had a few hundred authentication cookies on treeherder.
If you see this issue again, consider having a look, and maybe clearing some cookies.
Note that other websites that create A LOT OF COOKIES may cause the treeherder cookie to be evicted.
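To check this theory against an exported cookie list, a minimal Python sketch follows. It assumes a per-host cap of 180 (what I believe is the default of the `network.cookie.maxPerHost` pref; please verify in about:config, and note the cap may be applied per base domain rather than per exact host). The function name, the 80% warning ratio and the sample data are all hypothetical.

```python
from collections import Counter

# Assumption: default value of the network.cookie.maxPerHost pref.
# Check about:config for the value in your own profile.
ASSUMED_MAX_PER_HOST = 180


def hosts_near_limit(cookie_hosts, limit=ASSUMED_MAX_PER_HOST, ratio=0.8):
    """Return {host: cookie_count} for hosts at or above ratio * limit."""
    counts = Counter(cookie_hosts)
    threshold = limit * ratio
    return {host: n for host, n in counts.items() if n >= threshold}


# Hypothetical input: one host entry per stored cookie.
sample = ["treeherder.mozilla.org"] * 150 + ["example.org"] * 3
print(hosts_near_limit(sample))
```

A host that shows up here is close enough to the assumed cap that newly created cookies could push older ones out.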

(In reply to Valentin Gosu [:valentin] (he/him) from comment #107)

Created attachment 9495009 [details]
lots_of_cookies.png

My theory is that something is causing the cookies to be evicted - most likely by creating lots of other cookies, and hitting the cookie limit.
As someone reported on matrix, they had a few hundred authentication cookies on treeherder.
If you see this issue again, consider having a look, and maybe clearing some cookies.
Note that other websites that create A LOT OF COOKIES may cause the treeherder cookie to be evicted.

I think it's unrelated; I don't have many cookies, just a handful, and I still experience the issue. I still see the error mentioned above in the devtools console:

Could not renew login: 
Object { error: "login_required", errorDescription: "Login required", state: "xxx-yy" }

I'm using AI to help investigate (Claude Code)

Authentication Implementation Analysis

Overview

Treeherder uses Auth0 for authentication with JWT (JSON Web Tokens). The implementation involves both client-side and server-side components that work together to authenticate users and maintain their sessions.

Key Components

Client-Side (JavaScript)

  • Located in ui/shared/auth/AuthService.js and ui/helpers/auth.js
  • Handles token renewal and session management
  • Calculates token expiration times
  • Manages the renewal of tokens before they expire

Server-Side (Python/Django)

  • Located in treeherder/auth/backends.py
  • Implements the AuthBackend class for Django authentication
  • Validates tokens received from Auth0
  • Sets session expiration based on token expiration times

Authentication Flow

  1. User logs in via Auth0
  2. Auth0 provides access token and ID token
  3. Tokens are stored in the browser
  4. Tokens are sent to the server for validation
  5. Server validates tokens and creates a session
  6. Session expiration is set based on token expiration
  7. Client-side code attempts to renew tokens before expiration

Token Expiration

  • Both access tokens and ID tokens have expiration times
  • The session expiration is set to the earlier of the two token expirations
  • Default token lifetime is typically 24 hours for access tokens
  • ID tokens may have a shorter lifetime (possibly 1 hour)

Session Management

  • Django uses cache-based session storage (configured in settings.py)
  • Session expiration is explicitly set during authentication
  • X_FRAME_OPTIONS is set to "SAMEORIGIN" to allow token renewal in an iframe

Potential Causes for 1-Hour Authentication Expiration

After analyzing the Treeherder authentication implementation, we've identified several potential causes for the 1-hour expiration issue:

1. Auth0 Default Token Lifetime Settings

  • Auth0 may be configured with a default 1-hour expiration for ID tokens
  • While access tokens typically last 24 hours, ID tokens often have shorter lifetimes
  • The session expiration is set to the earlier of the two token expirations

2. ID Token Expiring Before Access Token

  • The code in treeherder/auth/backends.py sets the session expiration to the minimum of:
    • Access token expiration time
    • ID token expiration time
  • If ID tokens are configured to expire after 1 hour, this would cause the entire session to expire
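The "minimum of the two expirations" rule described above can be illustrated with a short Python sketch. This is not the code from treeherder/auth/backends.py; the function name and the 24-hour and 1-hour lifetimes are hypothetical values chosen to match the scenario in this analysis.

```python
from datetime import datetime, timedelta, timezone


def session_expiry(access_token_exp: datetime, id_token_exp: datetime) -> datetime:
    """The session ends when the first of the two tokens expires."""
    return min(access_token_exp, id_token_exp)


login = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)  # arbitrary example time
access_exp = login + timedelta(hours=24)  # hypothetical access token lifetime
id_exp = login + timedelta(hours=1)       # hypothetical ID token lifetime
print(session_expiry(access_exp, id_exp) - login)
```

With these assumed lifetimes the session would last one hour regardless of how long the access token remains valid.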

3. Token Renewal Failures

  • The client-side code attempts to renew tokens before they expire
  • If token renewal is failing silently, users would be logged out when tokens expire
  • This could be due to:
    • Network issues
    • Auth0 configuration issues
    • Bugs in the renewal logic
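How a silently failing renewal turns into a logout can be sketched as a small simulation in Python. It assumes the 15-minute renewal interval mentioned earlier in this bug; the `renew` callback, the token lifetime and the function name are hypothetical, and this does not reproduce the logic in AuthService.js.

```python
from typing import Callable, Optional

RENEW_INTERVAL_MIN = 15  # renewal interval mentioned earlier in this bug


def minutes_until_logout(
    renew: Callable[[int], Optional[str]],
    token_lifetime_min: int,
    horizon_min: int,
) -> Optional[int]:
    """Simulate periodic renewal; return the minute the session lapses, or None.

    `renew(minute)` is a hypothetical callback that returns None on success
    or an error string such as "login_required" on failure.
    """
    expires_at = token_lifetime_min
    for minute in range(RENEW_INTERVAL_MIN, horizon_min + 1, RENEW_INTERVAL_MIN):
        if minute >= expires_at:
            return expires_at  # token ran out before any renewal succeeded
        if renew(minute) is None:
            expires_at = minute + token_lifetime_min
    return expires_at if expires_at <= horizon_min else None


# Every renewal fails: the session lapses when the first token expires.
print(minutes_until_logout(lambda m: "login_required", 30, 120))
```

The sketch shows that a single failed renewal is harmless as long as a later one succeeds before expiry; the user only gets logged out once failures persist past the token lifetime.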

4. Auth0 Tenant Configuration

  • The Auth0 tenant for Mozilla may have specific settings that affect token lifetimes
  • Custom rules or hooks in Auth0 might be modifying the default token behavior
  • Organization policies might enforce shorter session durations

5. Mozilla's OpenID Connect Guidelines

  • Mozilla may have organization-wide security policies that limit token lifetimes
  • These policies could be enforced at the Auth0 tenant level
  • Standard security practices often recommend shorter token lifetimes for sensitive applications

6. Session Configuration Issues

  • Django's session configuration might be overriding the token-based expiration
  • Cache-based session storage might have its own timeout settings
  • Middleware or other components might be affecting session duration