Closed Bug 659569 Opened 13 years ago Closed 13 years ago

Unable to connect to Ubiquiti Network radio equipment w/o getting stuck in redirect loop.

Tracking

()

Status:

VERIFIED FIXED

Tracking Flags:

Tracking

Status

firefox5

affected

firefox6

fixed

firefox7

---

fixed

People

(Reporter: jst, Assigned: bjarne)

References

Details

(Keywords: regression, Whiteboard: [qa-])

Attachments

(6 files)

Log from before this broke, login page loads fine. 13 years ago Johnny Stenback (:jst) 483.60 KB, text/plain		Details
Log from after this broke, login page does not loads, redirect loop. 13 years ago Johnny Stenback (:jst) 242.55 KB, text/plain		Details
Proposed solution 13 years ago Bjarne (:bjarne) 3.47 KB, patch	bzbarsky : review+ jst : approval-mozilla-aurora+ christian : approval-mozilla-beta-	Details \| Diff \| Splinter Review
Log from before this broke, login succeeds 13 years ago Johnny Stenback (:jst) 827.83 KB, text/plain		Details
Log from current trunk with Bjarne's patch, login fails. 13 years ago Johnny Stenback (:jst) 2.52 MB, text/plain		Details
Hack to detect and fix pattern as described in comment #17 13 years ago Bjarne (:bjarne) 8.14 KB, patch	bzbarsky : review+ mcmanus : review+ briansmith : feedback-	Details \| Diff \| Splinter Review

Johnny Stenback (:jst)

Reporter

Description

•

13 years ago

This is so far something I've only ever seen when trying to log into the admin interface on a Ubiquiti wifi radio and but so far it seems like something that can happen elsewhere as well. The problem is that Firefox gets stuck in a redirect loop and ends up displaying the error for that. Hitting try again gets past that and all the way to the login screen, but logging in is unsuccessful. If I at this point clear the cache, I'm able to log in and things work fine for days until I get kicked out and need to log in again.

If I test with a clean profile and disable both the disk cache and the memory cache before navigating to the radio interface then I don't see the above redirect problem.

This is a regression since 4, so I think we should track this for 5 and 6 as this could be a more widespread problem than what I'm seeing.

Unfortunately I so far can only reproduce this issue with one of these radios locally, but I'm bisecting right now to see what changeset caused this, and maybe that'll shed some light on what might be going on here. I can attach logs etc too, if it's not clear from the changeset itself.

Johnny Stenback (:jst)

Reporter

Comment 1

•

13 years ago

This regressed with revision http://hg.mozilla.org/mozilla-central/rev/6fe1e7832dc8. Cc:ing Bjarne.

Johnny Stenback (:jst)

Reporter

Comment 2

•

13 years ago

IOW, the fix for bug 561276 caused this.

Depends on: 561276

Johnny Stenback (:jst)

Reporter

Comment 3

•

13 years ago

Attached file Log from before this broke, login page loads fine. — Details

This is a URILoader:5,LoadGroup:5,DocLoader:5,nsHttp:5,nsSocketTransport:5 log from before this broke (changeset b490b2316729, with s/PRUInt32/PRUint32/ in nsHttpTransaction.cpp to make it build).

Johnny Stenback (:jst)

Reporter

Comment 4

•

13 years ago

Attached file Log from after this broke, login page does not loads, redirect loop. — Details

Johnny Stenback (:jst)

Reporter

Comment 5

•

13 years ago

... and in those logs the failing load is from 192.168.1.1.

Bjarne (:bjarne)

Assignee

Comment 6

•

13 years ago

Attached patch Proposed solution — Details — Splinter Review

This redirection uses a pattern which causes nsHttpChannel::CheckCache() to clear |mRedirectedCachekeys| prematurely, hence we lose the redirect-chain and the solution doesn't work anymore. Actually a quite stupid mistake in the original fix, but that's life once in a while...

I was able to reproduce by modifying test for bug #561276 to use the same pattern. Fix is to *not* clear |mRedirectedCachekeys| at the wrong time - see patch.

Passes local testing. Pushed to tryserver.

Assignee: nobody → bjarne

Status: NEW → ASSIGNED

Attachment #535036 - Flags: review?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Updated

•

13 years ago

Blocks: 561276

No longer depends on: 561276

Boris Zbarsky [:bzbarsky]

Comment 7

•

13 years ago

Comment on attachment 535036 [details] [diff] [review]
Proposed solution

r=me

Attachment #535036 - Flags: review?(bzbarsky) → review+

Johnny Stenback (:jst)

Reporter

Comment 8

•

13 years ago

Thanks for the quick fix Bjarne! Seems to work wonderfully for the initial page that ends up stuck in the redirect loop, but the need to clear the cache before being able to log in remains. I'll capture logs for that as well.

Johnny Stenback (:jst)

Reporter

Comment 9

•

13 years ago

Attached file Log from before this broke, login succeeds — Details

This is a log with Bjarne's patch applied of the same working build where the login page loads fine, but the login itself fails. The behavior here is that the login page loads, I enter the credentials and hit enter and I'm presented with the same login page again instead of actually getting logged in etc.

The radio name in this question is ap-bud instead of an IP address (sorry, had to switch radios here).

Johnny Stenback (:jst)

Reporter

Comment 10

•

13 years ago

Er, sorry, I confused myself there... The previous log is of a working login, where the login succeeds. The upcoming log is of a session where the login itself does not work.

Johnny Stenback (:jst)

Reporter

Comment 11

•

13 years ago

Attached file Log from current trunk with Bjarne's patch, login fails. — Details

Here the login page loads fine (no redirect loop), but logging in brings me back to the login page rather than logging in. If I clear the cache, I can successfully log in.

Asa Dotzler [:asa]

Comment 12

•

13 years ago

can you guys get this into central so we can get this tested? will consider for 5 and 6.

tracking-firefox5: ? → +

Johnny Stenback (:jst)

Reporter

Comment 13

•

13 years ago

Asa, this is only partially fixed by the attached patch. We could probably land that separately (and I'll go ahead and do that), but we need more work here.

Bjarne, if it's not clear from the logs what's going on here I could bring in one of these radios and hook it up to the network in the office so you can test it yourself. Let me know if that would help.

Johnny Stenback (:jst)

Reporter

Comment 14

•

13 years ago

Proposed solution pushed as:

http://hg.mozilla.org/mozilla-central/rev/f584467626df

But there's more to do here, so leaving this bug open.

Bjarne (:bjarne)

Assignee

Comment 15

•

13 years ago

(In reply to comment #13)
> Bjarne, if it's not clear from the logs what's going on here I could bring
> in one of these radios and hook it up to the network in the office so you
> can test it yourself. Let me know if that would help.

That would be infinitely useful, please...  :)

Johnny Stenback (:jst)

Reporter

Comment 16

•

13 years ago

Ok, will do. I'll send you the address and login info etc once it's set up (hopefully tomorrow).

Bjarne (:bjarne)

Assignee

Comment 17

•

13 years ago

Thanks for the test-setup. Makes life a lot simpler...  :)

The login-issue can be described as follows: From the login-page we POST the credentials. The response is a 302 pointing to "/" and necko translates this into a "GET /" request (see bug #598304). We have already loaded "/" and the result is cached, and it happens to be a 301 pointing to the login-page. Necko follows the redirect to the login-page and we're back to the beginning. The redirect-chain doesn't help here because the response to the login-page is a 200.

This worked previously because necko always validated cached redirects if a cookie was set in the request, hence it didn't use the cached 301-response. The real problem is that the 301-response is cached and it could easily be resolved on the server by specifying no-cache or must-validate.

(Note btw that changing the request-method on a 302 (or 301 for that matter) is dead wrong (see RFC2616, notes at the end of sections 10.3.2 and 10.3.3) but according to bug #598304 it might be necessary to avoid breaking the web.)

But how do we fix this? The original code used the presence of a cookie in the request to detect the situation, but bug #561276 decided that this was too coarse-grained. IMO this situation may occur only for replacement-channels where the request-method has changed. I.e. we could carry fwd the request-method of an originating channel to the replacement-channel and compare.

Boris Zbarsky [:bzbarsky]

Comment 18

•

13 years ago

So the server should be sending a "Vary: Cookie" in this case and is not, right?

Bjarne (:bjarne)

Assignee

Comment 19

•

13 years ago

Not really. The cookie is also sent in the earlier, cached request. I assume the server applies a design-pattern which uses the cookie to identify a session but maintains state internally in the session, returning 301's to different pages depending on state. The real, uhh, confusing bit is why the server uses a 301 ("moved permanently") for this. By just replacing 301 with 302 the problem would be resolved since 302s are not cached unless explicitly specified...  :(

See e.g. attachment #535107 [details]: The first request for "/" results in a 301 to "cookiechecker". The request for "cookiechecker" results in a 301 back to "/". Then, the request for "/" is found in the cache but CheckCache() decides to validate it (probably because it's captured without the redirect-chain-patch). The subsequent request for "/" sets the cookie, but now the server responds with a 301 to "login". The request for "login" also sets the cookie and results in a 200 and at this point we have a cached 301-response for "/" with the cookie set. After this we see a number of req/res for resources probably displayed on the login-page, before we get to the POST with credentials. Server responds with a 302 to "/" and observe that CheckCache() decides to validate the cached 301-entry for "/" again. This validation saves us because we get a 200 when requesting "/" with cookie set again. Normally we should have re-used the cached 301 which would send us back to "login". This is what happens in attachment #535110 [details].

The question is whether we should hack around this or not. I have a patch which does as suggested in last para of comment #17, but it essentially breaks RFC2616 (i.e. in this case makes the 301 non-cacheable) in order to allow servers to use a design-pattern and misuse 301 like described above. IMO we should go back to whoever writes this stuff and tell them to use 302 instead in these cases.

Johnny Stenback (:jst)

Reporter

Comment 20

•

13 years ago

Fwiw, I tested IE and Chrome with one of these radios and neither browser has a problem logging into them. So I think we need to do *something* here, but I don't know enough about this to say what.

Bjarne (:bjarne)

Assignee

Comment 21

•

13 years ago

(In reply to comment #19)
> Not really. The cookie is also sent in the earlier, cached request.

To clarify: The cookie-values in this case are the same so "Vary: Cookie" wouldn't help, unfortunately.

(In reply to comment #20)
> Fwiw, I tested IE and Chrome with one of these radios and neither browser
> has a problem logging into them. So I think we need to do *something* here,
> but I don't know enough about this to say what.

Would be interesting to know if IE/Opera receives the same content we do... Otoh I perfectly agree that we must do something - IMO we have 3 options:

1) apply the hack I'll attach in a little while (deliberately allowing such behaviour from servers)

2) back out the fix for bug #561276 (losing the advantage of caching 3xx's which are used correctly)

3) use the power of our current market-share and communicate with whoever writes this software and make them change it

I've encountered the dilemma 1 vs 3 in a number of other bugs and I'd expect to see it in also in future issues, hence it might be worth to take this discussion on a principal level in order to get clear guidelines (or maybe there are some already that I'm unaware of..?)

I don't know how widespread this software is (or whether it is maintained and upgraded) so I'll leave the decision of what to do to others.

Bjarne (:bjarne)

Assignee

Comment 22

•

13 years ago

Attached patch Hack to detect and fix pattern as described in comment #17 — Details — Splinter Review

I don't like this solution but it allows me to log into the wireless and navigate it. If we want to follow this path, and if considered necessary, I could probably put together a test.

Attachment #535589 - Flags: review?(bzbarsky)

Bjarne (:bjarne)

Assignee

Comment 23

•

13 years ago

Comment on attachment 535589 [details] [diff] [review]
Hack to detect and fix pattern as described in comment #17

Btw, it passes local testing and has been sent to tryserver.

Boris Zbarsky [:bzbarsky]

Comment 24

•

13 years ago

Comment on attachment 535589 [details] [diff] [review]
Hack to detect and fix pattern as described in comment #17

Looks ok to me from a code point of view.  I think Patrick or Jason are more qualified to review the HTTP interaction "should we do this?" end of things.

Attachment #535589 - Flags: review?(mcmanus)

Attachment #535589 - Flags: review?(jduell.mcbugs)

Attachment #535589 - Flags: review?(bzbarsky)

Attachment #535589 - Flags: review+

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 25

•

13 years ago

Comment on attachment 535589 [details] [diff] [review]
Hack to detect and fix pattern as described in comment #17

I would not be surprised if other browsers are invalidating the target of the redirect from the POST, since the spec says [1]:

"Some HTTP methods MUST cause a cache to invalidate an entity. This is either the entity referred to by the Request-URI, or by the Location or Content-Location headers (if present). These methods are: PUT, DELETE, POST. In order to prevent denial of service attacks, an invalidation based on the URI in a Location or Content-Location header MUST only be performed if the host part is the same as in the Request-URI."

I think we should do precisely what the spec says: If the host in the Location response header field of a POST (or PUT or DELETE) is the same as the host in the request URI, then doom the entries mentioned in the Location header field (and, similarly for Content-Location) when processing the POST (or PUT or DELETE).

[1] http://tools.ietf.org/html/draft-ietf-httpbis-p6-cache-14#section-2.5

Attachment #535589 - Flags: feedback-

Patrick McManus [:mcmanus]

Comment 26

•

13 years ago

Comment on attachment 535589 [details] [diff] [review]
Hack to detect and fix pattern as described in comment #17

I don't have a problem with what the patch does, so I'm happy to r+ it. We more or less want to bring the not-from-cache semantics of the POST through to the rewritten request.

If I understand Brian's suggestion it boils down to invalidating any cached resource (probably from a GET) that has POST/PUT/etc performed on it even for future GETs.. that has a certain logical attraction to it, but I would want to ponder how that interaction impacts the cache for deployed websites and imo we  should study that in a different bug.

Attachment #535589 - Flags: review?(mcmanus) → review+

Bjarne (:bjarne)

Assignee

Comment 27

•

13 years ago

(In reply to comment #25)
> I would not be surprised if other browsers are invalidating the target of
> the redirect from the POST

A fresh pair of eyes is often fruitful - this would be a far better solution (if it applies, which I believe it will)!

I remember reviewing a patch doing something like this some time ago but I cannot find it in any of my searches... Iirc the patch was ok but should be extended with more details so I believe I cleared the r-request. I'll try to dig it up somehow...

Bjarne (:bjarne)

Assignee

Comment 28

•

13 years ago

Yeah - this is probably also covered by RFC2616 section 13.10 as stated in bug #618835, comment #5. Wonder why I didn't recall this one - good catch Brian. :)

I'll verify that the approach from #618835 fixes the remaining issue of this bug.

Bjarne (:bjarne)

Assignee

Comment 29

•

13 years ago

WIP patch from bug #618835 fixes this problem. Since comment #14 pushed the fix for the original problem described in this bug, I resolve the remaining issue as a dup of bug #618835. Feel free to correct me if this is wrong procedure.

Status: ASSIGNED → RESOLVED

Closed: 13 years ago

Resolution: --- → DUPLICATE

Johnny Stenback (:jst)

Reporter

Comment 30

•

13 years ago

Updating title to match what was fixed here. Thanks again Bjarne for digging in here! We'll continue tracking the remaining issue in bug 618835. And I'm going to mark this fixed instead, not that it makes much difference...

Resolution: DUPLICATE → FIXED

Summary: Unable to log into Ubiquiti Network radio equipment w/o clearing the cache. → Unable to connect to Ubiquiti Network radio equipment w/o getting stuck in redirect loop.

Johnny Stenback (:jst)

Reporter

Updated

•

13 years ago

tracking-firefox6: ? → +

Johnny Stenback (:jst)

Reporter

Comment 31

•

13 years ago

Comment on attachment 535036 [details] [diff] [review]
Proposed solution

This needs to land on 5 beta and 6 aurora.

Attachment #535036 - Flags: approval-mozilla-beta+

Attachment #535036 - Flags: approval-mozilla-aurora+

LegNeato

Comment 32

•

13 years ago

Even though this is a Firefox 5 regression, we are going to remove tracking and ship without it because:

1. We don't think the issue is widespread
2. The workaround is pretty simple (just clicking the try again button)
3. The changes are in http code and we are late in the beta cycle

Clearing version-specific tracking flags.

status-firefox5: --- → affected

status-firefox6: --- → affected

tracking-firefox5: + → -

tracking-firefox6: + → -

LegNeato

Updated

•

13 years ago

Attachment #535036 - Flags: approval-mozilla-beta+ → approval-mozilla-beta-

Boris Zbarsky [:bzbarsky]

Comment 33

•

13 years ago

Bjarne, do you need to request aurora review to land this for Fx6?

Bjarne (:bjarne)

Assignee

Comment 34

•

13 years ago

This is not really a prerequisite for bug #618835 but it is related. Please observe that attachment #535036 [details] [diff] [review] was already approved for aurora (afaics) so I guess I can just request check-in?

Johnny Stenback (:jst)

Reporter

Comment 35

•

13 years ago

http://hg.mozilla.org/releases/mozilla-aurora/rev/4d83f16deb7c

status-firefox6: affected → fixed

status-firefox7: --- → fixed

AndreiD[QA]

Comment 36

•

13 years ago

Is bug 666021 referring to the same issue as the one that was fixed here? Thanks.

Bjarne (:bjarne)

Assignee

Comment 37

•

13 years ago

(In reply to comment #36)
> Is bug 666021 referring to the same issue as the one that was fixed here?

Most likely. Could you try the latest nightly build...?

John Hawkinson

Comment 39

•

13 years ago

(In reply to Christian Legnitto [:LegNeato] from comment #32)
> 2. The workaround is pretty simple (just clicking the try again button)

I'm not sure it's worth pointing out at this late date, but the Try Again button is not sufficient to resolve this issue. It gets past the redirect warning, but then one gets stuck on an infinite loop on the device's login page. That is, the login page is presented, user enters a username/password and clicks the submit button (Login), and the login page is again presented, ad-infinitum.

My solution has been to use another browser to manage this device, which is kind of annoying, especially because most of the time when I want to manage it it is because there is a network problem in-progress...

Jason Duell

Updated

•

13 years ago

Attachment #535589 - Flags: review?(jduell.mcbugs)

u279076

Comment 41

•

13 years ago

qa-: QA is unable to verify this fix, Johnny can you please verify the fix?

Whiteboard: [qa-]

John Hawkinson

Comment 42

•

13 years ago

(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #41)
> qa-: QA is unable to verify this fix, Johnny can you please verify the fix?

Sorry for the delay, I wasn't sure if this was targetted at mortals. Under WinXP, Aurora 9.02a2 seems to work just fine connecting to a Ubiquity device that FF6 could not talk to (running Firmware Version: 	XM.v5.2.1). The Aurora particulars:
 Build identifier: Mozilla/5.0 (Windows NT 5.1; rv:9.0a2) Gecko/20110930 Firefox/9.0a2
 Built from http://hg.mozilla.org/releases/mozilla-aurora/rev/79eab065f0d0

So looks good!!

John Hawkinson

Comment 43

•

13 years ago

> Aurora 9.02a2 seems to work just fine connecting to a Ubiquity device
> that FF6 could not talk to 

I suppose I need to amend that. I'm afraid I lost track of what version I saw the failure under. It was probably FF4, which appears to have been upgraded away from this system. It appears that FF7.0.1 also works just fine to connect to this device.

Johnny Stenback (:jst)

Reporter

Comment 44

•

13 years ago

Yup, this problem is now a thing of the past. Verified.

Status: RESOLVED → VERIFIED

You need to log in before you can comment on or make changes to this bug.