PAC: make failover behave according to specification

RESOLVED FIXED in mozilla1.8alpha2

Status

()

P2
normal
RESOLVED FIXED
15 years ago
13 years ago

People

(Reporter: ddixon, Assigned: darin.moz)

Tracking

Trunk
mozilla1.8alpha2
Points:
---
Dependency tree / graph
Bug Flags:
blocking-aviary1.0 -

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 4 obsolete attachments)

(Reporter)

Description

15 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007 Firebird/0.7
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007 Firebird/0.7

According to the auto-proxy configuration documentation, the browser is supposed
to attempt to use the "primary" proxy server, and if not reachable, use the
secondary and not retry for 30 minutes, and if still a problem at the 30 minute
mark, add another 30 minutes, etc.

Firebird, in fact, on every (almost) http get, attempts to establish a TCP
session with the primary proxy server for 20 seconds, and then connects to the
backup proxy server.  On the next go around, it tries the primary again.


Reproducible: Always

Steps to Reproduce:
1. Create an auto-proxy script returning the following a bogus IP and port as
primary with real as secondary:
return "PROXY 1.1.1.1:8080; PROXY 121.12.14.34:80"
2. Run an analyzer capturing on the two IPs listed in step 1
3. Try to get a web page from the internet

Actual Results:  
A tremendous amount of time is spent by the browser trying to contact the bogus
proxy server.  Some pages will time out.

Expected Results:  
Follow the 30 minute rule that is listed in the documentation at:
http://wp.netscape.com/eng/mozilla/2.0/relnotes/demo/proxy-live.html


I tried this same test with IE 6 and Mozilla 1.5.
With Mozilla 1.5 it also acts the same as Firebird except that the requests to
the bogus proxy server happen faster and then it fails over (from 5-8 seconds).
 It continues to try the primary proxy server on every new http request

With IE, the browser trys for 20 seconds on the first attempt and then does not
retry to down server for 30 minutes.  I did not wait the hour to see if it
follows true to form.

I did not try any version of netscape.
-> Networking
Assignee: general → darin
Component: Browser-General → Networking
QA Contact: general → benc
Summary: Browser does not follow the rules for failover proxy configuration → [PAC] Browser does not follow the rules for failover proxy configuration
(Assignee)

Comment 2

15 years ago
known problem.. i'll fix it for 1.6 beta.
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Target Milestone: --- → mozilla1.6beta
(Assignee)

Updated

15 years ago
OS: Windows XP → All
Hardware: PC → All

Comment 3

15 years ago
this could improve pac proxy failover's performance.

Comment 4

15 years ago
bug 207633 also has the same issue, 
but it has more:
  
"
If all proxies are down, and there was no DIRECT option specified, the Navigator
will ask if proxies should be temporarily ignored, and direct connections
attempted. The Navigator will
ask if proxies should be retried after 20 minutes has passed (then the next time
"
40 minutes from the previous question, always adding 20 minutes).

Blocks: 207633

Comment 5

15 years ago
Created attachment 134904 [details] [diff] [review]
patch 0.1

only do pac failover

darin, can you take a look?
(Assignee)

Comment 6

15 years ago
Comment on attachment 134904 [details] [diff] [review]
patch 0.1

adding to my review queue...
Attachment #134904 - Flags: review?(darin)

Comment 7

15 years ago
Created attachment 136085 [details] [diff] [review]
patch 0.2

more completed patch
Attachment #134904 - Attachment is obsolete: true

Updated

15 years ago
Attachment #136085 - Flags: review?(darin)

Updated

15 years ago
Attachment #134904 - Flags: review?(darin)

Comment 8

15 years ago
Created attachment 137910 [details] [diff] [review]
patch 0.3

add prompt.
Attachment #136085 - Attachment is obsolete: true

Updated

15 years ago
Attachment #136085 - Flags: review?(darin)

Updated

15 years ago
Attachment #137910 - Flags: review?(darin)
(Assignee)

Updated

15 years ago
Target Milestone: mozilla1.6beta → mozilla1.7alpha
(Assignee)

Updated

15 years ago
Priority: -- → P2
Target Milestone: mozilla1.7alpha → mozilla1.7beta
(Assignee)

Comment 9

15 years ago
sorry for delaying in reviewing your patch jerry.  i'm afraid it isn't going to
make mozilla 1.7.  however, i will see to it that it makes it into 1.8.  my
apologies for not getting to your patch in a timely manner.
Target Milestone: mozilla1.7beta → mozilla1.8alpha

Updated

15 years ago
Summary: [PAC] Browser does not follow the rules for failover proxy configuration → PAC: make failover behave according to specification
(Assignee)

Updated

15 years ago
Target Milestone: mozilla1.8alpha1 → mozilla1.8alpha2
(Assignee)

Updated

15 years ago
Blocks: 246060
(Assignee)

Comment 10

15 years ago
Created attachment 151972 [details] [diff] [review]
v1 patch

Ok, I reviewed the Jerry's 0.3 and had quite a few changes I wanted to make, so
I just revised the patch directly.  Here's the revised patch.  I tried to make
the behavior of Moz match that of IE6 as much as possible.  In other words, I
do not bother with failover to DIRECT, so there is no prompting code in this
patch.

I'm content to leave it up to site admins to control whether or not failover to
DIRECT occurs.	Users can always manually enable DIRECT connections if they
need to do so.

I also do not implement Navigator's doubling algorithm.  I just use a pref
controlled timeout value each time a proxy fails.  If all proxies fail, then I
make ExamineForProxy return all proxies.  That way, if there is a major network
failure the user will be able to press reload to retry and not be stuck with
either 1) Moz stuck thinking it can only do direct or 2) Moz stuck thinking it
cannot do anything.  This behavior seems to be consistent with IE6 (minus the
doubling issue... I haven't checked how IE behaves after 30 minutes expire).
Attachment #137910 - Attachment is obsolete: true
(Assignee)

Updated

15 years ago
Attachment #137910 - Flags: review?(darin) → review-
(Assignee)

Updated

15 years ago
Attachment #151972 - Flags: review?(cbiesinger)
Comment on attachment 151972 [details] [diff] [review]
v1 patch

grrr. retyping from memory, after closing the wrong window.

nsIProtocolProxyService.idl
+     * This method returns a proxy to be used for loading the given URL.

does it return null or throw an exception if no proxy is to be used? please
document :)

+     * This method may be called to construct a nsIProxyInfo instance from
+     * the given parameters.

is there a need for this to be a public function?
NS_NewProxyInfo certainly has no caller
(http://lxr.mozilla.org/mozilla/search?string=NS_NewProxyI) and nsIProxyInfo
_is_ supposedly meant to be an opaque cookie. (hmm, dwitte would like it ;) )

if you are going to keep it:
maybe list the valid aType parameters
it would make sense to make aPort an "unsigned short", since that's the valid
range for ports

+    void configureFromPAC(in AUTF8String aURI);

does this load synchronously or asynchronously?
maybe the parameter should be nsIURI?

+     * instead.  As a side-effect, this method may affect future return values
+     * from examineForProxy.

...as well as from getFailoverProxy.

Index: netwerk/base/src/nsProtocolProxyService.cpp
+    char	 *mHost; // owning reference

maybe this should be an nsCString, saves a string copy now that we have string
sharing

+    // Add timeout to interval
+    dsec += mFailedProxyTimeout;

this seems to be the value when the proxy can be tried again... maybe add a
comment, saying that

+    mFailedProxies.Put(key, dsec);

this can fail (OOM)... but I guess that's ok-ish, it'd just mean that this
proxy is not really disabled...

ExaminePACForProxy:
+    do {

well... ok... this "loop" isn't, really. maybe its content should be in its own
function, that calls itself in the
+	 else if (!first && pruneDisabledProxies) {
case. that'd be more clear, at least...

	url. loading it now, in the current thread results in a
	browser crash */

this comment is now misleading/wrong, since it is now explicit that the call
happens on the current thread, the point is just that it's async. please change
its wording.

+    // Verify that |aProxy| is one of our nsProxyInfo objects.
+    nsProxyInfo *pi = nsnull;
+    aProxy->QueryInterface(kProxyInfoID, (void **) &pi);
+    if (!pi)
+	 return NS_ERROR_INVALID_ARG;
+    NS_RELEASE(aProxy);

this release of an argument seems to be really a release of |pi|, so that you
don't have to do it in the error cases. OK, but maybe you should use
nsRefPtr<nsProxyInfo for pi, then you could also use getter_AddRefs, I think

 nsProtocolProxyService::LoadFilters(const char *filters)
 {
+    mFailedProxies.Clear();

I don't think that a change of the "no proxy for" pref should clear the list of
failed proxies. that pref does not cause them to magically be up again :)

Index: netwerk/base/src/nsProtocolProxyService.h
+typedef nsDataHashtable<nsCStringHashKey,PRUint32> nsFailedProxyTable;

looks like a space is missing after the comma

+    PRTime			  mSessionStart;

maybe it would be better to not use seconds since session start, but instead
convert that pref to usec when it's read. it'd do the conversion only once, not
everytime you call SecondsSinceSessionStart...
Attachment #151972 - Flags: review?(cbiesinger) → review+
ah, one more thing... shouldn't SOCKS failover as well?
(Assignee)

Comment 13

15 years ago
(In reply to comment #12)
> ah, one more thing... shouldn't SOCKS failover as well?

It does, but only for HTTP(S) at the moment.  If you mean, shouldn't the Socket
Transport implement SOCKS failover, then... yes, I think I agree that it should.
 I'd rather take care of that problem in a separate bug.
(Assignee)

Comment 14

15 years ago
> +     * This method may be called to construct a nsIProxyInfo instance from
> +     * the given parameters.
> 
> is there a need for this to be a public function?
> NS_NewProxyInfo certainly has no caller
> (http://lxr.mozilla.org/mozilla/search?string=NS_NewProxyI) and nsIProxyInfo
> _is_ supposedly meant to be an opaque cookie. (hmm, dwitte would like it ;) )

Agreed, and I considered removing it until I realized that folks might want to
be able to create a SOCKS connection and would then need to have a way of
providing the nsISocketTransportService with a nsIProxyInfo instance.  In
Mozilla, we never need this functionality because we always start with a URI, so
ExamineForProxy is sufficient.


> if you are going to keep it:
> maybe list the valid aType parameters

done


> it would make sense to make aPort an "unsigned short", since that's the valid
> range for ports

Nope, Necko defines ports as 32-bit signed integers.  Keep in mind that it might
be possible to use (or abuse) a port number to mean something other than what
TCP defines it to mean.

> +    void configureFromPAC(in AUTF8String aURI);
> 
> does this load synchronously or asynchronously?

asynchronously.


> maybe the parameter should be nsIURI?

well, I considered that, but then I'd have to go change the callers.  I'd rather
not do that.  in actual fact, i'm not sure that it makes any sense to expose
this method.  since we observe the PAC file pref, it should be enough for folks
to just modify that pref directly.


> +    PRTime			  mSessionStart;
> 
> maybe it would be better to not use seconds since session start, but instead
> convert that pref to usec when it's read. it'd do the conversion only once, not
> everytime you call SecondsSinceSessionStart...

I decided against this because I did not want to store 64-bit integers in the
hash table.  I don't think SecondsSinceSessionStart will have a measurable perf
cost.


I applied all of your other suggestions and added a bit more comments.
(Assignee)

Comment 15

15 years ago
Created attachment 152146 [details] [diff] [review]
v1.1 patch

revised patch.	this also includes a bug fix.  i found that if i toggled my
squid proxy off and then back on again, that mozilla would still think that
squid is disabled.  that was caused by the fact that we never notify the PPS
when a proxy works, we only inform it when a proxy fails.  my solution is to
enable any proxy that i return from ExamineForProxy.  it seems to work well.
Attachment #151972 - Attachment is obsolete: true
(Assignee)

Updated

15 years ago
Attachment #152146 - Flags: superreview?(bryner)
(Assignee)

Updated

15 years ago
Attachment #152146 - Flags: superreview?(bryner)
(Assignee)

Comment 16

15 years ago
fixed-on-trunk
Status: ASSIGNED → RESOLVED
Last Resolved: 15 years ago
Resolution: --- → FIXED
(Assignee)

Comment 17

15 years ago
I also rev'd the IID for nsIProxyInfo ;-)
(Assignee)

Comment 19

14 years ago
*** Bug 207633 has been marked as a duplicate of this bug. ***

Comment 20

14 years ago
*** Bug 261765 has been marked as a duplicate of this bug. ***

Comment 21

14 years ago
Reporter, please set this fix to be blocking-aviary1.0, because it is core
functionality that must work to use Firefox and Thunderbird in corporate
environments.

Must the bug report be reopened as well?
> Must the bug report be reopened as well?

do NOT reopen the bug. also, you can set blocking-aviary? yourself.

Updated

14 years ago
Flags: blocking-aviary1.0+

Comment 23

14 years ago
Please do not set the blocking-aviary1.0+ flag unless you are a member of the
Aviary or drivers@mozilla.org teams. Thanks.
Flags: blocking-aviary1.0+ → blocking-aviary1.0?

Comment 24

14 years ago
If we consider this for aviary and 1.7, we should also look at the patch for bug
253190 which was a crash that seems to have been caused here. 

Comment 25

14 years ago
It would be really useful if we can find out whether or not any of our current
or prospective enterprise customers need this. 

Updated

14 years ago
Flags: blocking-aviary1.0? → blocking-aviary1.0-
*** Bug 261764 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.