Closed Bug 84798 Opened 23 years ago Closed 21 years ago

PAC: Failover (multiple return values for FindProxyforURL)

Categories

(Core :: Networking, enhancement, P2)

enhancement

Tracking

()

VERIFIED FIXED
mozilla1.4beta

People

(Reporter: benc, Assigned: darin.moz)

References

Details

(Keywords: embed, helpwanted, Whiteboard: [adt2 rtm], pac)

Attachments

(1 file, 13 obsolete files)

54.82 KB, patch
dougt
: review+
Details | Diff | Splinter Review
Using:

http://www.packetgram.com/pktg/proxy/pac/pacfiles/failover.pac.

Console shows that PAC resolves to two servers, one available, one not, but 
then it doesn't work.
Attached patch failvoer pac (obsolete) — Splinter Review
The first returned proxy server is not available, so presumably it should 
failover.

I need to check the javascript syntax, because it looks weird, but someone told 
me that was correct.
The proxy APIs only allow for one server, and so the pac stuff just picks the
first, as I understand it.
-->jpm
Assignee: neeti → jpm
Failover is specified in the PAC definition, and worked in Comm4.
+ lots of keywords.
Priority: -- → P1
There also is no UI for complete failure, which should ask if you go to direct
connection:
"
If all proxies are down, and there was no DIRECT option specified, the Navigator
will ask if proxies should be temporarily ignored, and direct connections
attempted. The Navigator will
ask if proxies should be retried after 20 minutes has passed (then the next time
40 minutes from the previous question, always adding 20 minutes).
"
http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/proxy-live.html 
Severity: normal → critical
Keywords: nsenterprise
*** Bug 85680 has been marked as a duplicate of this bug. ***
*** Bug 86874 has been marked as a duplicate of this bug. ***
Blocks: 85656, 86856
Blocks: 79893
The failover return which works in other browsers but not in mozilla/netscape
6.1pr1:

        return "PROXY full.its.deakin.edu.au:3128; PROXY
mini.its.deakin.edu.au:3128";
Blocks: 87272
No longer blocks: 87272
QA Contact: benc → pacqa
Serge, can you take a look at these PAC issues? Thanks - Jussi-Pekka
Assignee: jpm → serge
Attached patch patch (obsolete) — Splinter Review
The patch was co-authored with _basic@yahoo.com.

Parse the LocalFindProxyForURL output and check dns resolvability of each in
turn.  If nothing satisfiable is found, treat as a direct connection.
Keywords: patch, review
Umm, hold the phone on that one.  It appears I may be able to make a very minor
change and get the fix for bug 79057 much easier than I had expected to.
Keywords: patch, review
Attached patch patch v2, with fix for 79057 (obsolete) — Splinter Review
Take 2.  A week ago I could have sworn getting a wrapper around dns.resolve() to
work inside the PAC sandbox was very tough, but apparently I was wrong.  A one
line addition allows a fix for bug 79057 as well.

While I was at it I tweaked isInNet so it doesn't behave badly if dnsResolve
can't get an ip address for the hostname.

cc'ing gagan because he seems to have created the pac code, and mstoltz because
we were talking about security issues.
As usual, co-authored by basic.
Keywords: patch, review
Blocks: 79057
What about if the DNS resolves fine, but the proxy elected by PAC is down?
You're right -- this is only a partial solution.  Getting it to fall back
properly when it can't connect to its first-choice proxy would require chasing
this all over the netwerk code, and the scope of it seems beyond my present
knowledge.  As bbaetz said: "The proxy APIs only allow for one server, and so
the pac stuff just picks the first, as I understand it."  This is still the
case, but now the PAC stuff picks the first that is not obviously bogus.

Changing Platform/OS

the patch tingley attach is more of a workaround than a fix, but it is much
better than the current behavior. A real fix would involve rewriting the proxy
api, which I doubt will happen before bug 89928 is fixed.
OS: Windows NT → All
Hardware: PC → All
Still hoping for an r/sr for this.  The odds of making it before the 0.9.3
freeze are zero and dropping, but since I'll be away from July 30 - August 10,
if possible I'd like to get some sort of feedback on this this week.  In truth,
I'm more eager to get bug 79057 fixed than this one (as it's more broken, is a
more complete fix, and has more votes to boot); I'd be happy to split this patch
in two if it will speed things up any.
You realise that this isn't actually fallover - full fallover means that you
need to try the second proxy if the connection is refused, not just if the host
is not existing. (And, arguably, for some proxy errors, I guess) That would
require api changes, though.

A dnsresolve call on each call is expensive, but we cache it, so it really isn't
that important. I just don't think that it solves the problem.

r=bbaetz on the 79057 part. Is FindProxyForURL reentrant? if so, won't the
dnsresolve call serialise everything? If not, r=bbaetz on that part too.
As I've said before, this is completely right -- I didn't fully realize that
this bug was *not* filed simply against the particular shortcoming of the PAC
code until the patch was already posted.

I'd love to fix the real failover problem, and maybe I'll try, but it's a more
ambitious effort and I need to spend more time with the code and probably
inspire a bit more trust with the network crew first.  I have half a mind at
this point to postpone the non-79057 bit but whatever, it does a little bit of
good at least.

I'll look into the reentrancy issue.
*sigh*

Upon further consideration and discussion I'm icing this patch until I or
someone else produces something monstrous to fix the underlying issue.  In the
short term it adds little (as it is much for likely for the proxy to be
unreachable rather than unresolvable), and in the long run it is of dubious
value and would quite possibly just be rewritten.

I'm splitting off the bug 79057 stuff and posting it there.  Sorry for all the
noise.
Keywords: patch, review
No longer blocks: 79057
working on this
Blocks: 91630
Attached patch the first patch proposal (obsolete) — Splinter Review
Woah.  This appears to also fix bug 91630 and a number bugs on feedback in error
cases.  It also includes the alert() code from bug 86846.

I don't know enough about the thread architecture yet to comment on the event
stuff just by looking at it.  I'll try the patch out to get more feedback.  In
the mean time, a couple things:

* Is there something better than an array for keeping track of the servers
(gNodeArray)?  This code is begging for some sort of hashing.

* What's up with this:

+            if (pac == '' || pac.search("<TITLE>Not Found</TITLE>") != -1) {
+                //httpd can return <TITLE>Not Found</TITLE><H1>Not Found</H1>

Are the status/errorMsg params not enough?

* Rather than doing this to catch the error,

+            if (pac.search("FindProxyForURL") == -1) {

why not put a try block around the assignment
             LocalFindProxyForURL=ProxySandBox.FindProxyForURL;

iirc, this will currently throw an exception if FindProxyForURL does not exist
or is not well-formed.)

and nits (not counting obvious debugging code, etc)...
- the regexp pattern "/\s* \s*/" should be "/\s+/"
- the if(!canDoFailover) block contains the typo "rerurn"
- the PACF_TYPE_SOCKS{,4,5} constants should be "socks{,4,5}" rather than
  "sock{,4,5}" (iirc -- nsProtocolProxyService.cpp appears to return "socks" in
  a similar situation)
- s/Cer/Cr/("CereateAndPostMainThreadEvent")
- var PACF_* should be const PACF_*

cool.  i'm very eager to try this out.
This feature is pretty important, so lets ask any heavy hitters we need for
help, like brendan.

Also, is there any reason why this patch is addressing several bugs?
tingley, thanks for the comments,
I would like to notice the patch is *FIRST DRAFT PROPOSAL I posted mostly for
backup reason.

>* What's up with this:
>+            if (pac == '' || pac.search("<TITLE>Not Found</TITLE>") != -1) {
>+                //httpd can return <TITLE>Not Found</TITLE><H1>Not Found</H1>
as I said in comment client can get valid content, for example 
"the requested object does not exist on this server." from httpd and no
status/errorMsg is set. 
 
>why not put a try block around the assignment
> LocalFindProxyForURL=ProxySandBox.FindProxyForURL;
what is the reason to create new sandbox object from invalid source? 

No longer blocks: 86856
Status: NEW → ASSIGNED
fix for spring release.  marking nsenterprise-.
Attached patch the patch #2 (obsolete) — Splinter Review
the patch#2 is actually workable patch, 
tingley@sundell.net could you try to take a look on this, please?
some comments:

1- please add descriptive comments to each IDL method.

2- please put the nsHttpChannel code in a member function.
about the design:

is there any chance that the code for PAC failover can be handled at the socket
transport level completely instead of at the protocol level?  you may be able to
modify nsIProxyInfo to convey any necessary info down to the socket transport. 
why wouldn't such an approach be viable?
1&2 will be done.
I like the idea to handle  pac failover at socket transport level, but I'm not 
sure if it's possible to handle all JS dialogs on non ui thread? Any examples?
PSM launches dialogs from the socket thread... it probably involves
synchronously proxying the dialog invocation over the ui thread.
*** Bug 105419 has been marked as a duplicate of this bug. ***
>is there any chance that the code for PAC failover can be handled at the socket
>transport level completely instead of at the protocol level?
intuitively it feels like the socket transport level is an appropriate place to 
do so, and this is true for statically build pac failover list, but pac spec 
allows to change that list dynamically.
Here is one example how: 
function FindProxyForURL(url, host) {
 if (timerange(0, 0, 0, 0, 0, 30)) {//between midnight and 30 sec past midnight.
   return "PROXY proxy_0; PROXY proxy_1";
 }
 else if (timerange(0, 0, 0, 0, 1,0)) {//between midnight and 1 min past
   return "PROXY proxy_2; PROXY proxy_3";
 }                    
 else if (timerange(0, 0, 0, 0, 2,0)) { //between midnight and 2 min past
   return "PROXY proxy_4; PROXY proxy_5";
 }
...
for the same arguments FindProxyForURL() returns different proxies depends on 
the current time, it means from socket transport level we should be able to make 
js call FindProxyForURL(url, host) to get new proxy info, but we know nothing 
about what url we are processing at this point.
So it looks to me the protocol level is the right palace to handle pac failover. 
Darin, what do you think, am I wrong?
right, plus you don't want to have to deal with the fact that the socket
transport is running on a background thread... sync proxying to the UI thread in
order to call JS code should be avoided.
darin, serge: The proxyinfo object passed down to the socket transport stuff was
designed to be extensible. All you have to do is add a next attribute to it, and
extend the internal PAC interfaces to return a list of values, and then you have
all the data you need in a linked list - no calls to js needed. I don't know how
to do an out array param in idl, though.
bbaetz: but serge was saying that he wanted to query the PAC file each time...
am i missing something?
AIUI, serge was saying that he needs to query PAC each time.we make a
connection. I'm claiming that whilst pac needs to be queried, it doesn't have to
be the socket transport thread which does the querying.

Note that the http proxy code would need to be tweaked, especially the "should
we reuse this connection" logic.

Then theres also the SOCKS fallback to http logic (or the other way arround),
where we run into the same problems wrt people QIing for an http channel if
we're doing ftp.
bbaetz: timerange() function from pac spec allows to build proxy list 
dynamically (se my example above),
so we have to call  FindProxyForURL() to get the current proxy list...
PAC is called (when enabled) every time a proxyinfo object is created. This
happens in NewChannelFromURI (usually). The proxy info is passed into the
sockettransport service. These proxyinfo structures are not cached by anyone.

Or are you saying that if we start connection at 29 seconds past midnight, and
fail at 31 seconds past midnight, we need call pac again to see if anything has
changed?
That is exactly what I'm trying to say,
well, of course in real life this is very rare situation,
but why we cannot assume that same ISP wouldn't do some sort of balance loading 
or shuts down their set of proxy servers and configure user's pac to handle 
failover depends on particular time?
Then tough. We get a list at the time we call PAC, and we need to try every
entry in that line. Thats why there are multiple entries - we should not call
PAC again; its just too confusing. I doubt ns4 did that, and I really don't see
the point. Its a _lot_ of additional complexity for 0 gain.

Think about it - if all proxy servers are down, would you just keep calling pac
forever?

I do agree that it will need code in at least the http protocol. Doing socks<->
http is going to be impossible though, w/o a wrapper channel which we've all
agreed (in other bugs) won't work.
I don't think the time-based features was ever intended to create this type of
tight control...

If someone tried to connect right before midnight, then failed over, and talked
to an offline proxy, it would not be a big deal. They would fail to connect.
They would say some choice words, hit reload, and then re-PAC, and get the new
(post-midnight) directives and connect.

A possilbe feature to correct this would be possible in PAC2, which would be
allow a reload() function that could be time based.
Attached patch serge's patch v2 (obsolete) — Splinter Review
Attachment #50336 - Attachment is obsolete: true
darin, bbaetz: could you take a look on attachment id 56789, thanks.
+        } else if (pac == '' || pac.search("<TITLE>Not Found</TITLE>") != -1) {

Trying to sniff failure codes out of raw HTML is a bad idea.  Letting the js
parser validate the integrity/correctness of the PAC file (by calling
evalInSandbox()) is more heavyweight, but who cares?  It's an operation we only
have to do once (and if the PAC is valid, one we would have to do anyway), and
it will catch all sorts of things that string searches won't.
just try comment out
        }/* else if (pac == '' || pac.search("<TITLE>Not Found</TITLE>") != -1) 
{
...
        } else  if (pac.search("FindProxyForURL") == -1) { //check if there is 
function FindProxyForURL()
...
        }
        */
and for unknown pac, but valid http://host after evalInSandbox() you will get 
something like:
"Error: syntax error
Source File: .....
Line: 211 Source Code:
<TITLE>Not Found</TITLE><H1>Not Found</H1> The requested object does not exist 
on this server. The link you followed is either outdated, inaccurate, or the 
server has been instructed not to let you have it."
Do you think this is an apropriate message for the end user, rather than 
"ProxyPACNotFound"?
So you need to check if its http and the http response code starts with a 2.
(Yes, doing it that wasy sucks). For other protocols, just assume success if you
get data back.

You should print a useful error message if the js engine fails to parse the
script, as well.

Don't do pac.search() - let the js engine parse the file, and then check if
(FindProxyForURL in ProxySandBox). I tihnk that that will work.

I haven't had time to look at this patch in detail.
You are right, the new patch is following.
Attachment #56789 - Attachment is obsolete: true
Attached patch patch v3 (obsolete) — Splinter Review
Attachment #37708 - Attachment is obsolete: true
Attachment #41779 - Attachment is obsolete: true
Attachment #42177 - Attachment is obsolete: true
Attachment #46697 - Attachment is obsolete: true
Attachment #50182 - Attachment is obsolete: true
Attachment #50186 - Attachment is obsolete: true
No longer blocks: 91630
*** Bug 121732 has been marked as a duplicate of this bug. ***
Is there anything holding up the commit of the patch?  Is there anything that
I can do to speed up the commit?
This is a very nice to have feature. We need to get this fixed ASAP.
+helpwanted, mozilla1.0, nsbeta1.
serge, is there a reason you didn't nominate this for any of the post 6.2
milestones? Like mozilla 0.9.8, mozilla 0.9.9?
lack of review, and some internal reorg is keeping this bug untouched fro a long 
time.
reassign this to gagan for properly assignment in necko team.
Assignee: serge → gagan
Status: ASSIGNED → NEW
I believe the new owner for this code is in trudelle's team. Letting him make
the call on that. 
Assignee: gagan → trudelle
->morse
Assignee: trudelle → morse
nsbeta1- per Nav triage team.
Keywords: nsbeta1nsbeta1-
Target Milestone: --- → mozilla1.1alpha
+nsbeta again.

If you minus, please provide reason for nsbeta-. 

This is probably the most serious PAC bug.

This is a specified feature, and there is no workaround. If we don't support
this, we have to still test for handling multiple directives correctly, and a
couple other special conditions.

Historically, really large enterprise customers have selected PAC over manual
configs because it offers more reliable connectivity.
Keywords: nsbeta1-nsbeta1
QA Contact: pacqa → benc
This is only a problem if the primary server is not present, it works in the
overwhelmingly typical case.  Also, it is only an issue among enterprise
customers who use PAC, and we are not aware of any that consider this a
high-priority problem for MachV (if you know any, please cite them here).  It is
getting very late in the release cycle to be considering changes that have low
impact (severity X number of users), and a correct fix for this seems risky too.
 Finally, we have shipped a few times with this defect already.  Do you have
strong reasons why fixing this is critical for MachV?
This is a "chicken and egg" bug (some people think ALL bugs are this way, but
this one really is...)

The really big IT shops really like this feature. Even if PAC failover is needed
rarely, and is very slow (our Communicator failover was kind of sluggish), it
works. Every call from a down server costs money, esp. if your help desk is
outsourced.

Most really big IT shops will not consider using Netscape 6 unless it has
features like this, so they have not been complaining because they have not be
using it.

If the marketing is to sell to these big shops, I'm telling you now you want
this to work. When I supported PAC and Proxy server, I talked to NationsBank,
CSFB, Citicorp, Chase, JP Morgan, etc. (the names have changed, but you know
what I mean).

I don't see a lot of contributors that work for these organizations, so I've
either proven my point or completely missed the boat.

If we ship without this, expect this to be on the short list of things big
customers demand.

nsbeta1-, per Nav triage team. If they were demanding it, we'd plus.  cc
jhooker, will consider plus if he has strong demand from enterprise.
Keywords: nsbeta1nsbeta1-
I agree that this is a valuable bug to fix. Perhaps all that is needed is
someone verifying that Serge's patch is still valid and does work. Any volunteers?
Blocks: 142498
*** Bug 85656 has been marked as a duplicate of this bug. ***
cc'ing jhooker on this.  Jeff, please let us know whether or not this is 
considered essential for the enterprise folks.
PM chime in - this is considered essential for the enterprise deployments using PAC.
The patch v3 has bit-rotted -- specifically the "-80,52 +140,111" part of the 
patch for nsProxyAutoConfig.js.  Attaching a revised patch for just that one 
file.  Serge, can you check and see if the revised patch is still doing what 
your original patch did?

Also renominating for nsbeta1 based on jhooker's input (comment 69).
Status: NEW → ASSIGNED
The patch v3 does not compile.  Specifically, mReloadPACURLCount is undefined.

Serge, can you post a new patch?
OK, reviving serge's patch is going to take much more work than I originally 
thought.  By his estimates, it would take about a week of work.  Therefore I 
don't see any way that this one can make it for Netscape 7.0.

That's unfortuante since the enterprise group has now indicated that this is 
essential. 
I just talked with Jeff, and we agreed that this doesn't look like a good bet
for 7.0 at this point.  This will be important for enterprise customers who try
to deploy, but currently there are none, and it is doubtful that a large patch
with little/no immediate benefit will be allowed onto the branch the last week
before zarro boogs anyway.  We do expect this to be on the short list of fixes
that big enterprise customers will demand, and that will no doubt be a major
focus of the point release.  If anyone discovers how this can be fixed with less
work/risk, or finds a large customer who would deploy 7.0 except for this bug,
please renominate.
Target Milestone: mozilla1.1alpha → mozilla1.2alpha
Attaching much shorter and much safter patch.  It doesn't do everything that 
serge's patch did, such as putting up dialogs.  But it does do the failover, and 
that's what this bug report is all about.  And this patch has a better chance of 
being approved at this late date.

A significant difference between this patch and serge's is that the way I keep 
track of which retry I am up to is by adding a ProxyRetryCount field to the 
ProxyInformation.  Also I've added an mProxyRetryCount member to nsHttpChannel 
for the same reason.  This way I don't have to keep a node structure in 
nsProxyAutoconfig.js to maintain this state.  That was a substantial 
simplification.

Most of the changes in this patch are simply to account for this extra 
ProxyRetryCount member.  The actual changes to handle the failover itself are 
relatively small.

Renominating for nsbeta1 in the hopes that we can still get this in for Mach V.
Keywords: nsbeta1-nsbeta1
Attachment #57049 - Attachment is obsolete: true
Attachment #87136 - Attachment is obsolete: true
Comment on attachment 87761 [details] [diff] [review]
simpler patch that does just failover and nothing more

r=caillon (from morse's computer) with a few changes that I had him make. 
darin or someone in necko should probably sr= this.
Attachment #87761 - Flags: review+
Attachment #87761 - Attachment is obsolete: true
Comment on attachment 87765 [details] [diff] [review]
same patch but with minor changes per caillon's review

r=caillon, carried forward from previous patch
Attachment #87765 - Flags: review+
Comment on attachment 87765 [details] [diff] [review]
same patch but with minor changes per caillon's review

This code hooks into http. What about fallover involving socks proxies or even
DIRECT, which is often the last thing on the list? How will/can that be
handled?

Fallover from http->direct/socks with an ftp url, using this code, will end up
really confusing things, because the http protocol handler can't deal with
those, unless there is an http proxy involved.

Also, you can't have ProxyForURL requery the PAC js file, because the PAC file
could switch over based on the time of the day, or teh destination IP (think
roundrobin DNS) The entire string needs to be stored in the proxy info as an
array, then your requery code can just iterate through that.

I realise that we want a quick fix, and that those issues won't affect the vast
majority of installations, so long as you file a new bug, the only thing which
needs to be fixed is not doing fallback from an HTTP proxy to SOCKS or none if
the url type is not http/https AND you test fallback from http to none with,
say, and ftp url.

Do file a bug on the rest, though.
Attachment #87765 - Flags: needs-work+
Yes, you are correct that we are looking for a quick fix here.  Can you modify 
the existing patch (or post a new one) that also does the HTTP SOCKS failover 
and tests for non http/https.  Without someone posting a better patch quickly, 
we either go with this one now or we get nothing.

jhooker, what are your feelings about the functionality in the existing patch?  
Would this be better than nothing for the existing enterprise customers?
> This code hooks into http. What about fallover involving socks proxies or
> even DIRECT, which is often the last thing on the list? How will/can that
> be handled?

DIRECT is handled by the current patch.
I'm not really set up to do this ATM, or even to test (uni exams/assignments)

Does the patch handle an ftp url, where the pac file returnes:

PROXY some-unknown-host:1234, DIRECT

? If it does, then thats ok, but I see nsHttpChannel::ProxyPACFailover
explicitly calling NewProxiedChannel on nsHttpHandler::get(), and that won't
work for ftp. You should either a) be calling nsIIOService::getProtocolHandler,
(special casing http), then trying to QI to nsIProxiedProtocolHandler, or b) add
a newChannelWithProxyInfo method to the io service, and call that (factoring the
current NewChannelFromURI stuff out)

b) is cleanest and simplest, it already handles the special cases for you. You'd
change the nsIIOService code like this (uncompiled/untested):

NS_IMETHODIMP
nsIOService::NewChannelFromURI(nsIURI *aURI, nsIChannel **result)
{
    nsresult rv;
    NS_ENSURE_ARG_POINTER(aURI);
    NS_TIMELINE_MARK_URI("nsIOService::NewChannelFromURI(%s)", aURI);

    nsCOMPtr<nsIProxyInfo> pi;
    rv = mProxyService->ExamineForProxy(aURI, getter_AddRefs(pi));
    if (NS_FAILED(rv)) return rv;

    return NewProxiedChannelFromURI(aURI,pi,result);
}

NS_IMETHODIMP
nsIOService::NewProxiedChannelFromURI(nsIURI* aURI, nsIProxyInfo* pi, nsIChannel
**result) {
    nsCAutoString scheme;
    rv = aURI->GetScheme(scheme);
    if (NS_FAILED(rv)) return rv;

    nsCOMPtr<nsIProtocolHandler> handler;

    if (pi && !strcmp(pi->Type(),"http")) {
        // we are going to proxy this channel using an http proxy
        rv = GetProtocolHandler("http", getter_AddRefs(handler));
        if (NS_FAILED(rv)) return rv;
    } else {
        rv = GetProtocolHandler(scheme.get(), getter_AddRefs(handler));
        if (NS_FAILED(rv)) return rv;
    }

    nsCOMPtr<nsIProxiedProtocolHandler> pph = do_QueryInterface(handler);

    if (pph)
        rv = pph->NewProxiedChannel(aURI, pi, result);
    else
        rv = handler->NewChannel(aURI, result);

    return rv;
}

Which, apart from getting the scheme after the proxy info, rather than before,
is literally just adding a closing brace and a method declaration.

However, I know darin wanted to move this stuff out of the ioservice a bit, into
the protocolproxyservice, and even had a patch to do so. Not sure if this would
meet with his approval - he may want NewProxiedChannelFromURI there, but thast
just cut and paste in any event.

darin?
Comment on attachment 87765 [details] [diff] [review]
same patch but with minor changes per caillon's review

>Index: netwerk/protocol/http/src/nsHttpChannel.cpp

>+    // PAC failover stuff (mProxyRetryCount of -1 means no more proxies to try)
>+    if ((status == NS_ERROR_UNKNOWN_HOST || status == NS_ERROR_CONNECTION_REFUSED ||
>+            status == NS_ERROR_NET_TIMEOUT || status == NS_ERROR_NET_RESET) &&
>+            (mProxyRetryCount != -1) && mConnectionInfo->UsingHttpProxy()) {
>+        if (NS_SUCCEEDED(ProxyPACFailover(++mProxyRetryCount))) {
>+            // pac created new channel so don't propogate error status
>+            // to current channel's listener
>+            status = NS_OK;
>+        }
>+    }

code like this actually needs to live in nsHttpChannel::OnStartRequest.  you
need
to check the status of the transaction in OnStartRequest and suppress sending
mListener->OnStartRequest until you have made a successful proxy connection. 
as
is your patch results in multiple OnStart/OnStop request notifications, and
that
will cause all sorts of chaos in docshell land.

i have some ideas about how to solve this bug, and i think we need to have a
design meeting to come up with the best solution.  hacking HTTP like this will
work for HTTP proxies, but it doesn't help other proxy consumers.  i think we
should spend some time and explore other solutions.  we still have time...
let's
not rush in a partial solution, ok?

morse: can we meet this monday to discuss this bug?
PM chime in - per steve's comment (Attaching much shorter and much safter patch.
 It doesn't do everything that serge's patch did, such as putting up dialogs. 
But it does do the failover, and that's what this bug report is all about.  And
this patch has a better chance of being approved at this late date.) it sounds
as if this patch would be of value to the customer base. If a particular feature
does not work as expected it makes it difficult to get past the evaluation stage
to be even considered for deployment. Again, the summary (does do the failover)
sounds like a valuable resolution.
Adding nsbeta1+ and [adt2 rtm] per nav triage team.
Keywords: nsbeta1nsbeta1+
Whiteboard: [adt2 rtm]
Blocks: 152663
reassigning to darin who said he might have time to do some work on it.
Assignee: morse → darin
Status: ASSIGNED → NEW
actually, i'm going on vacation tomorrow and won't be able to look at this until
mid-july.  how critical for RTM is this?
*** Bug 154575 has been marked as a duplicate of this bug. ***
->morse (per darin's comments)
Assignee: darin → morse
reassigning to darin at his suggestion.
Assignee: morse → darin
Whiteboard: [adt2 rtm] → [adt2 rtm], pac
future (until we find a real PAC owner or until i find time)
Target Milestone: mozilla1.2alpha → Future
*** Bug 167491 has been marked as a duplicate of this bug. ***
batch: adding topembed per Gecko2 document
http://rocknroll.mcom.com/users/marek/publish/Gecko/Gecko2Tasks.html
Keywords: topembed
*** Bug 169128 has been marked as a duplicate of this bug. ***
removed jhooker from cc: list
Changing from topembed, to embed, as this is not currently blocking a major
embedding customer.
Keywords: topembedembed
Keywords: mozilla1.0
Target Milestone: Future → ---
Blocks: 125875
-> moz 1.3 beta
Severity: critical → major
Status: NEW → ASSIGNED
Priority: P1 → P3
Target Milestone: --- → mozilla1.3beta
Coming here from bug #79893 :

In a related incident to comment #13:

Our proxy.pac file is loaded as:
http://internal.server/path/proxy.pac (pretty generic)

function FindProxyForURL(url, host)
{
    if (dnsDomainIs(host,"internal")) {
        return "DIRECT";
    } else

     if (isInNet(myIpAddress(),"192.168.0.0","255.255.255.0")) {
        return "PROXY proxy1.internal:3128; PROXY proxy2.internal:3128; DIRECT";
     } else
       return "DIRECT";
}

In this file I respect the specs with host:port.

Mozilla 1.2b behaviour:
always uses DIRECT.

*** Bug 190068 has been marked as a duplicate of this bug. ***
*** Bug 190053 has been marked as a duplicate of this bug. ***
flagged for 1.3b.
Need to know if this is planned for 1.3b so I can start working on testcases.

Deferral to 1.3f is fine from a testing and feature perspective, but waiting for
1.4 is probably too much, because it would miss a lot of 1.3 branch based products.
Flags: blocking1.3b?
Summary: PAC: Failover does not work. → PAC: Failover does not work
no plan to work on this for mozilla 1.3.  maybe during the 1.4 cycle.
Target Milestone: mozilla1.3beta → mozilla1.4alpha
Flags: blocking1.3b? → blocking1.3b-
I agree strongly with Comment #102: "Deferral to 1.3f is fine from a testing and
feature perspective, but waiting for 1.4 is probably too much, because it would
miss a lot of 1.3 branch based products."

Please maintain the 1.3f target.

*** Bug 192604 has been marked as a duplicate of this bug. ***
*** Bug 190086 has been marked as a duplicate of this bug. ***
might still happen for 1.4 ...
Target Milestone: mozilla1.4alpha → mozilla1.4beta
Same here.

I can't speak for the enterprise corporation I work for, other than I'm a
Unix/Network sysadmin for about 3000 hosts (500 users) in a division of said
corporation.

Corporate (central) IT has asked us to use/implement the official proxy.pac
autoconfiguration script since they use it seamlessly on all of their windows/IE
boxes to handle failover between our different sites - and this is their method
  to handle seamless proxy/firewall upgrades for end customers.

Example autoconfiguration script:

http://internal.proxypac.com/proxy.pac

function FindProxyForURL(url, host)
{

// blah blah blah - filtered out but here's the sticker
proxyPath = "PROXY MIDWESTPROXY:74; PROXY WESTCOASTPROXY:74";

return proxyPath;
}

So when upgrades to MIDWESTPROXY happen all windows/IE users bounce
automatically to WESTCOASTPROXY; all unix/mozilla users that I support have to
reset their proxy to MANUAL and choose WESTCOASTPROXY.

I'd really like to see this happen...

Thanks!
Blocks: 200456
Severity: major → enhancement
Priority: P3 → P2
Thanks for the great work! The flurry of 1.4 related activity over the past few
weeks is impressive. 

I see 80918 (Proxy: "No Proxy for:" does not support ip address wildcards) was
checked in yesterday. Can this bug (PAC failover) also be prioritized for 1.4b? 

Either way, given the number of other Mozilla 1.4 fixes and enhancements, we
will likely deploy it (or a Mozilla 1.4 based Netscape release) corporately.
Attached patch v2 patch (obsolete) — Splinter Review
this patch is a work in progress.  i've taken a simpler route i think.	instead
of querying the PAC component multiple times, i query it once and build up a
list of nsIProxyInfo objects.  each has an owning reference to the next.  this
meant moving the PAC string parsing into nsProtocolProxyService.

the HTTP code picks up errors in OnStartRequest and tries to failover to the
next nsIProxyInfo.  if there isn't a next nsIProxyInfo, then i try a direct
connection (note: this will happen even in the case of manual proxy
configuration, which is i think a nice feature).  the HTTP code creates a new
channel and does a "load replace" ... this patch just copies code from the 3xx
redirect code.	the two code paths should overlap to avoid duplicate code.

finally, it might be good to remember failures somewhere... probably
nsProtocolProxyService, but that'll require a method call from the HTTP code. 
i might add a "MarkFailed" method on nsIProxyInfo.  the implementation of that
interface could then add something to a global hash table to remember the
failure.  we could expire these remembered failures after some interval (maybe
10 seconds or 1 minute or something progressive).

anyhow, with a little bit of cleanup, this patch should be good for 1.4 beta.
Attachment #87765 - Attachment is obsolete: true
Attached patch v2.1 patchSplinter Review
revised patch.	this one is ready for reviews.	changes include:

1- no longer automatically failover to direct connections.  only do this if
explicitly given in the PAC file.  a number of reasons why this is good: a)
compatible with expectations, b) DIRECT could appear between PROXY decls, c)
failover to DIRECT may not be desirable, d) messes up proxy error reporting..
we end up reporting the last error, which would be errors to directly connect. 
enough reasons to not go there right now IMO.  later on we could add a pref to
always failover to DIRECT.

2- don't bother remembering failures.  i think it would be best to get the
basic functionality in place first, and then go back and add this part later.

3- merged ProcessRedirection and ProxyFailover code.  created new helper method
on nsHttpChannel called SetupReplacementChannel.
Attachment #120339 - Attachment is obsolete: true
Comment on attachment 120546 [details] [diff] [review]
v2.1 patch

bbaetz: can you please review this patch.  i think you'll find it fairly
straightforward.. just please check out my previous comments outlining the
changes.  thx!
Attachment #120546 - Flags: review?(bbaetz)
Comment on attachment 120546 [details] [diff] [review]
v2.1 patch

Yeah, don't fall back to DIRECT. Lots of organisations block direct access, and
those which don't will list it at the end.

This looks fine, but I'm really not in a position to test now, and won't be for
the next couple of weeks at least.
bbaetz: hrm.. i have tested this fairly well.. if you have suggested tests for
this bug, could you please list them here.  i can then run your tests =)
Attachment #120546 - Flags: superreview?(alecf)
Attachment #120546 - Flags: review?(dougt)
Attachment #120546 - Flags: review?(bbaetz)
Comment on attachment 120546 [details] [diff] [review]
v2.1 patch

this all looks fine to me. I don't know if I'm really much of an expert here
but it looks like the meat of this is SetupReplacementChannel and 
ProxyFailover, both of which seem fine. My only question would be if  we
already had some kind of code for replacing a channel that we could leverage

oh, one nit:

+    if (httpChannel) {
+	 if (preserveMethod) {
+	     if (mUploadStream) {

...
+		 }
+	     }
+	     // must happen after setting upload stream since SetUploadStream
+	     // may change the request method.
+	    
httpChannel->SetRequestMethod(nsDependentCString(mRequestHead.Method()));
+	 }

can we combine some of that if() logic, to reduce indentation here?

sr=alecf with that.
Attachment #120546 - Flags: superreview?(alecf) → superreview+
alecf: yeah, i think actually i can get rid of maybe two levels of indentation.
 thx!
Comment on attachment 120546 [details] [diff] [review]
v2.1 patch

nit: f you going to add your name AND gagan's name to the
nsIProxyAutoConfig.idl file, you should also add the rest of the contributors.

nit: indent your case's:
+    switch (len) {
+    case 5:

looks fine to me.
Attachment #120546 - Flags: superreview?(alecf)
Attachment #120546 - Flags: superreview+
Attachment #120546 - Flags: review?(dougt)
Attachment #120546 - Flags: review+
fixed-on-trunk :-)
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
FYI: to resolve this bug, I'm going to do a two-PROXY run.

I'll eventually add tests for greater than two, as well as multiple directive
testing. 
Attachment #120546 - Flags: superreview?(alecf) → superreview+
This check-in have added a warning on Brad TBox:

+netwerk/base/src/nsProtocolProxyService.cpp:425
+ `const char*hostEnd' might be used uninitialized in this function
Aleksey: thanks!  (fortunately, not a real bug.)
*** Bug 204598 has been marked as a duplicate of this bug. ***
*** Bug 205798 has been marked as a duplicate of this bug. ***
Bug does not seem to be fixed. My Mozilla 1.4b refuses to load proxy.pac file 
from remote http server. Reload button does not do it either. Actually it seems 
that Reload button does nothing. Not to say that if I enter incorrect URL and 
click Reload button it just does not warn me about. When I downgrade to Mozilla 
1.3 everything works fine excluding the failover. 

Anybody else with similar problem ? Could it be that I upgraded from 1.3 to 
1.4b ? May be clean install is needed ?
Hristo: yes, a fresh install is required.  bug 205514 tracks the problem you're
experiencing.
VERIFIED: all plats, Mozilla 1.4b

I'm updating the testcases and working through the releated bugs as well.

I've filed bug 207633 for the remaining issues about failover behavior (failover
is not stateful, it doesn't remember that a server was unavailable, so it has to
have a connection error and then failover every time PAC is called).

This bug has gone through two or three design cycles, and has enough drift about
the expected behavior to confuse most people.

If you have a problem post 1.4b, please look for duplicates and file a new bug
if needed.
Status: RESOLVED → VERIFIED
QA Contact: benc → pacqa
Summary: PAC: Failover does not work → PAC: Failover (multiple return values for FindProxyforURL)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: