Closed Bug 238654 Opened 20 years ago Closed 20 years ago

[FIXr]Content-location possible regression in 1.7b: may 'fail' on many servers

Categories

(Core :: DOM: HTML Parser, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla1.7final

People

(Reporter: wolruf, Assigned: bzbarsky)

References

()

Details

(Keywords: fixed1.7, regression)

Attachments

(2 files)

After fix for bug 109553 was committed, one major server failed because of HTTP
content-location badly configured: bug 231072.

After 1.7b was released, it seems like this rate is increasing: bug 238156, bug
238314, and now a bank site in bug 238626.

The worrying part is that 10% of the IIS servers seem to be wrongly configured
on the Internet (not to mention the Intranet servers), according to:
http://www.securityspace.com/s_survey/data/man.200402/firewalled_cloc.html

[...]
Number of IIS Servers that revealed an IP in the Content-Location tag different
from the IP accepting the connection    75,938    9.40%
[...]

1.7 will to be the first non-beta release of Mozilla (Firefox 0.8 doesn't have
it) to ship with this new feature and I'm afraid it could fail on many servers.

Possible solutions:
 - a pref to disable it ? by default ?
 - not taking care of content-location when served by IIS and Oracle9 (the
servers we've seen so far) ?

Mozilla has done something similar with HTTP pipelining in the past:
http://lxr.mozilla.org/seamonkey/source/netwerk/protocol/http/src/nsHttpConnection.cpp#236

(originally posted on n.p.m.netlib on march 22nd)
it must be possible to implement the feature differently.
If an image is missing, the whole pageload doesn't stall for 10 minutes before
it loads with a "broken image". It would be a disaster to the user experience.

So why allow other broken tags cause pauses like that? Isn't it possible to
detect and deal with the missing path, throw it aside and and let the browser
fall back to "life before bug 109553 was fixed"?
QA Contact: ian
Flags: blocking1.7?
> If an image is missing, the whole pageload doesn't stall for 10 minutes

No, but if a script is missing it does.....  and has to to support
document.write.  :(

What exact failures are we seeing in those evang bugs?  Bogus IPs?  Those should
be pretty fast (DNS lookups giving "no such host" are fast), but could result in
bad rendering.  A bigger problem is random ports specified on which the server
doesn't respond and so we have to wait out a TCP timeout....

If someone can collect a list of the actual hostnames and content-location
headers for the affected sites, that would be great.

I do think we should decide this one way or the other (fix, wontfix, back out
bug 109553, whatever) for 1.7; I'd rather not ship several slightly different
versions of content-location support.
> What exact failures are we seeing in those evang bugs?  Bogus IPs?

Trying to summarize the issues in the content-location header from the various
bugs reported so far:

bug 238156: bogus port (3003), bad rendering (images don't load),
bug 238314: bogus IP (192.168.130.34), can't load site after home page,
bug 238626: bogus port (8080), bad rendering, doesn't finish loading,
bug 231072: bogus port (3002), bad rendering (CSS+images),
 
> If someone can collect a list of the actual hostnames and content-location
> headers for the affected sites, that would be great.

using http://web-sniffer.net/

bug 238156:
http://www.gv.com.sg/Booking/servlet/BkShowDetailsServlet?gvAction=buyTickets
Content-Location: http://www.gv.com.sg:3003/Booking/BkSelectShowTime.jsp

bug 238314:
http://www.petro-points.com/
Content-Location: http://192.168.130.34/ns-pp-prod/index.html

bug 238626:
https://ibank.amsouth.com/
Content-Location: http://ibank.amsouth.com:8080/home.html

bug 231072:
http://pub.tv2.no/nettavisen/spray/slideshow/article195382.ece
Content-Location: http://pub.tv2.no:3002/dyn-nettavisen/spray/slideshow/article.jsp
OK.  I would be perfectly happy to do any of the following:

1)  Ignore content-location from those two server types
2)  As #1, but only if the content-location is the same host but a different port
3)  Anything else reasonable that's suggested.

Darin?  What do you think?
To repeat the steps above with the Server: lines and a few more sites
that I happen to know get broken because of Content-Location headers:

bug 238156:
http://www.gv.com.sg/Booking/servlet/BkShowDetailsServlet?gvAction=buyTickets
Content-Location: http://www.gv.com.sg:3003/Booking/BkSelectShowTime.jsp
Server: Oracle9iAS/9.0.2 Oracle HTTP Server Oracle9iAS-Web-Cache/9.0.3

bug 238314:
http://www.petro-points.com/
Content-Location: http://192.168.130.34/ns-pp-prod/index.html
Server: Microsoft-IIS/5.0
X-powered-by: ASP.NET

bug 238626:
https://ibank.amsouth.com/
Content-Location: http://ibank.amsouth.com:8080/home.html
Server: Microsoft-IIS/5.0

bug 231072:
http://pub.tv2.no/nettavisen/spray/slideshow/article195382.ece
Content-Location: http://pub.tv2.no:3002/dyn-nettavisen/spray/slideshow/article.jsp
Server: Oracle9iAS/9.0.2 Oracle HTTP Server Oracle9iAS-Web-Cache/9.0.3.1.0
(TH;max-age=214364716+0;age=66945)

bug 237372:
https://e.pkobp.pl/ssl/
Content-Location: http://e.pkobp.pl:81/ssl/Default.htm
Server: Microsoft-IIS/4.0

http://www.stj.gov.br/webstj/
Content-Location: http://www.stj.gov.br/webstjDefault.htm
Server: Microsoft-IIS/5.0

http://www.hihostels.com/openHome.sma
Content-Location: http://www.hihostels.com/pages/layouts/main.jsp
Server: Oracle9iAS (1.0.2.2.1) Containers for J2EE

http://bookwire.com/bookwire/
Content-Location: http://bookwire.com:3170/bookwire/bookwire.html
Server: Microsoft-IIS/4.0

http://www.deutschebahn.sagenet.co.uk/
Content-Location:
http://www.deutschebahn.sagenet.co.uk/d/e/u/deutschebahn_sagenet_co_uk/Default.htm
Server: Microsoft-IIS/5.0

https://www.motortax.ie/mtoapp/logout.do
(click the "Exit" button at the top right of http://www.motortax.ie/)
Content-Location: https://www.motortax.ie/pages/
Server: Oracle9iAS/9.0.2 Oracle HTTP Server

http://www.personal-system.com/
Content-Location:
http://www.personal-system.com/publicSite12/NMORETTI001/Default.htm
Server: Microsoft-IIS/5.0


So if we want to sniff for servers, it looks like we have to search for any of:
 - an exact match for "Microsoft-IIS/4.0"
 - an exact match for "Microsoft-IIS/5.0"
 - a leading substring match for "Oracle9iAS (1.0.2.2.1)"
 - a leading substring match for "Oracle9iAS/9.0.2"

I can't see any heuristics that would work out if the Content-Location is
reliable or not.
Well, all the content-locations fall into the following categories:

1)  Private IP address range
2)  Different port
3)  Ends in Default.htm

But yes, just killing off servers (prefix match on "Microsoft-IIS/" and
"Oracle9iAS", I would guess) would work....

If we can get some consensus here, I'll whip up a patch.  Darin?
(In reply to comment #6)
> 
> But yes, just killing off servers (prefix match on "Microsoft-IIS/" and
> "Oracle9iAS", I would guess) would work....

I think we should make sure we include version numbers; we don't want to 
prematurely pigeon hole future products into this particular quirks mode.
I agree.  IIS/6.0 might not have this bug.
according to http://support.microsoft.com/default.aspx?scid=kb;en-us;218180 and
http://support.microsoft.com/default.aspx?scid=kb;EN-US;244998
IIS/4.0 & 5.0 are affected, 6.0 seems to be ok, eventhough it's optional and can
be configured by an admin (may be a few cases, which can be treated by Tech
Evangelism then ?):
http://support.microsoft.com/default.aspx?scid=kb;en-us;245099
Those articles just talk about the IP screwup and a possible port screwup.  They
only cover some of the problem cases we are seeing; in particular, they don't
cover the cases when a bogus port is stuck in the content-location, nor do they
cover the case of totally bogus content-location (a la the
http://www.personal-system.com/ example and others in comment 5).

I'm fine with including specific versions only in the blacklist, but we should
include IIS 6.0.

Note also that those articles give me the impression that MS is treating this
header as meaning something different than what we think it means, which does
not give me much hope for it changing in IIS any time soon.
I hope it is not intended to ship 1.7 with this bug.
As a matter of fact, I was testing a patch as you wrote that.  Patience, please.  ;)
Attached patch Proposed patchSplinter Review
I felt the pref justified, so people can easily deal with new bogus servers
popping up without recompiling.

I also still feel that we should just blacklist IIS altogether, because, as I
said, it looks to me from the MS documentation that they disagree with us on
what this header actually means.
Assignee: parser → bzbarsky
Status: NEW → ASSIGNED
Attachment #145378 - Flags: superreview?(darin)
Attachment #145378 - Flags: review?(bugmail)
Priority: -- → P1
Summary: Content-location possible regression in 1.7b: may 'fail' on many servers → [FIX]Content-location possible regression in 1.7b: may 'fail' on many servers
Target Milestone: --- → mozilla1.7final
Comment on attachment 145378 [details] [diff] [review]
Proposed patch

looks good, except i'm not too happy with the preference name.	IMO
"network.http." should be for necko's http implementation only.
Attachment #145378 - Flags: superreview?(darin) → superreview+
Hmm... In a way, this is affecting our "HTTP" support, which is why I put it in
this namespace.

But I'm open to suggestions on pref names, if someone feels something else would
be better.
No performance issue with checking the pref branch on every hit or anything?
Not really.  This would happen once per pageload, basically.  That's absolutely
negligible.
And it would only happen on pages that have a content-location header.
> Hmm... In a way, this is affecting our "HTTP" support, which is why I put it in
> this namespace.

sure, it is affecting our HTTP support, but then i think there's something to be
said for having pref namespaces correspond to code modules.  however, it's not
like we've been very consistent at all :-/

note: we only honor this preference for some consumers of HTTP.  suppose some
other consumer wants to inspect the content-location header.  because we don't
put this in necko, we don't affect all consumers.  hence, it seems strange to
mention this as a necko preference.  it's only changing the behavior of gecko's
layout engine.
OK, I buy that.  jst, would this pref be OK in the "dom." namespace?  Or should
it be in the "browser." namespace?
(In reply to comment #17)
> Not really.  This would happen once per pageload, basically.  That's absolutely
> negligible.

Cool, just checking.
Comment on attachment 145378 [details] [diff] [review]
Proposed patch

Wow, this is so sad. Do we have any idea what IE does?

Anyhow, r=me
Attachment #145378 - Flags: review?(bugmail) → review+
Comment on attachment 145378 [details] [diff] [review]
Proposed patch

Actually, do you think we'll be adding more HeaderCallback functions? Otherwise
i'd just add a |if (ContentLocationOK(...))| in ProcessHeaderData. No biggie
though
Sicking, IE simply doesn't support this HTTP header.

I'd rather keep the table-driven style of this code so that new headers are easy
to add.  And yes, I do suspect that we may need more callbacks like that if we
add any more headers not supported by IE....
Comment on attachment 145378 [details] [diff] [review]
Proposed patch

Could this be approved for 1.7?  This disables support for the content-location
header when we encounter a server known to have a high probability of sending a
bogus content-location.
Attachment #145378 - Flags: approval1.7?
Summary: [FIX]Content-location possible regression in 1.7b: may 'fail' on many servers → [FIXr]Content-location possible regression in 1.7b: may 'fail' on many servers
Comment on attachment 145378 [details] [diff] [review]
Proposed patch

a=chofmann for 1.7
Attachment #145378 - Flags: approval1.7? → approval1.7+
Fix checked in, with the pref renamed to
"browser.content-location.bogus-servers", but backed out due to the typical
Windows bustage.

Anyone have any idea why the following code:

344  static const HeaderData headers[] = {
345    { "link", nsnull },

Gives the errors:

c:/builds/tinderbox/firefox/WINNT_5.0_Clobber/mozilla/content/base/src/nsContentSink.cpp(345)
: error C2440: 'initializing' : cannot convert from 'char [5]' to 'const struct
HeaderData'
c:/builds/tinderbox/firefox/WINNT_5.0_Clobber/mozilla/content/base/src/nsContentSink.cpp(345)
: error C2440: 'initializing' : cannot convert from 'const int' to 'const struct
HeaderData'
c:/builds/tinderbox/firefox/WINNT_5.0_Clobber/mozilla/content/base/src/nsContentSink.cpp(345)
: fatal error C1903: unable to recover from previous error(s); stopping compilation

?
hmmm, maybe the char needs to be a "const char*"?
Maybe.  I seem to recall that breaking the Windows compiler (which is why I did
not do it), but maybe.  Could someone with a Windows env try it?
Checked in, no bustage this time.  Fixed for 1.7.  I guess a bunch of those
evang bugs can be resolved now....
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Blocks: 239798
See Bug 239845, there the server is a "Oracle-Application-Server-10g/9.0.4.0.0
Oracle-HTTP-Server" with bogus ports. Should this server name be added in the pref?
If that's a common server, I suppose we could add it to the list.  Otherwise we
should evangelize them.  Or we could add "Oracle" to the list in general, I guess.  
*** Bug 240758 has been marked as a duplicate of this bug. ***
See Bug 241094: Site is broken even with current 1.8a builds like 2004041809 Win
or 20040419 Linux. Is the patch only in the 1.7 branch?
the site in Bug 241094 uses the following: "Oracle9iAS (9.0.2.2)", which is not
checked for in the patch ("Oracle9iAS/9.0.2,Oracle9iAS (1.0.2.2.1)").
Seems Mozilla should check even more browsers...
Yep.  So everyone in favor of blacklisting that oracle server altogether raise
your hand?  They change the server line for every single minor revision, so
trying to deal is just a whack-a-mole exercise.
Fine by me; it's not a major server family anyway.
shouldn`t we try to contact them ?
Sure.  So I propose we convert bug 241094 to evang on the server and I'll change
the pref to just blacklist that server family if darin is OK with it.
(In reply to comment #39)
> Sure.  So I propose we convert bug 241094 to evang on the server and I'll change
> the pref to just blacklist that server family if darin is OK with it.

you don't think trying to evang someone, about something that works perfectly
fine on all browsers (once this server is blacklisted), is somewhat unlikely to
work?
We tell them that their bug limits their customers' ability to use a very useful
HTTP feature.  <shrug>.  And maybe threaten them with a "list of shame" like thing.
Depends on: 241094, 241320
Depends on: 241402
To summarize the impact this has on Oracle:

Mozilla 1.7 will ignore a Content-Location header sent by the following Oracle
application servers (identified by Server header):

  Oracle9iAS/9.0.2
  Oracle9iAS (1.0.2.2.1)

This change makes Mozilla 1.7 behave like older versions of Mozilla, Netscape,
and IE 6 -- w.r.t. its handling of the Content-Location header -- when talking
to an instance of one of these servers.

The point to emphasize is that installations of these servers that worked with
Mozilla 1.6 will be unaffected by the changes made to Mozilla 1.7 to support the
Content-Location header.

Mozilla 1.7 will honor any Content-Location header sent by an Oracle server,
whose Server header does not match the list given above.

Also, the patch in bug 241320 proposes ignoring a Content-Location header sent
by any server with a Server header starting with "Oracle9iAS".  At the moment,
that patch has not been checked into the 1.7 branch.

Chofmann is in touch with folks at Oracle regarding this issue.  He is working
to find the right folks within Oracle to possibly suggest a better Server header
"filter" string (or even a better solution for Oracle9iAS).  He said he'd be
commenting in this bug when he has more info.
Depends on: 239845
Depends on: 234778
Note that issues with two more servers have come up:

  Server: HDS/1.0 (GZ/44)
  Server: Orion/1.5.2
Depends on: 241652
One has to wonder why so many servers send this header with the wrong value?
Anyway, might it be better to use a whitelist or compleatly disable this feature
for 1.7?
> One has to wonder why so many servers send this header with the wrong value?

IIS just does (and MS knows about it).  For the rest, they do it because IE
doesn't support it, of course.

If we disable this feature, we may as well back it out.  It's not going to be
better for 1.8.

I'm fine with a whitelist, but Darin seems opposed...
I think a whitelist is conceptually the wrong way to go. I don't want to have to 
make sure I claim to be Apache just because I want to use Content-Location.

And I do think we should support this; much like Link: headers, it is something 
that I would actually use.

Currently we have four server families that require blacklisting. Our DOCTYPE 
blacklist is in the high double figures, so I don't think this is a big issue.
while I agree that a whitelist is the WrongThing it sucks that so many random
servers send a broken value. The current DOCTYPE list took us a lot longer to
arrive at then we've had between enabling Content-Location and creating a stable
branch. This is why I think disabling (or whitelisting) could be a good thing to
do for 1.7. We would then have a few months to iron out a blacklist for 1.8.

Hixie does bring up a good point though, a whitelist might end up making servers
pretend to be Apache just so they can support Content-Location. So my vote goes
towards disabling this compleatly for 1.7 and then reenable and finalize the
blacklist for 1.8 and forward.
If this is really that big a problem then yeah, let's back it out on the 1.7 
branch and get it right in 1.8.
Depends on: 241065
1. I think I agree that diabling completely for 1.7 is the path that we should
take, based on 1.7 as long lived branch, desire for max backward compatibilty
with past behavior in 1.7, and lowest possible number of percieved regressions,
etc...

2. I also think that if it's not to much work it would be good to enable an
on/off pref *and* whitelist/blacklist prefs to allow folks that are interested
in continuing to work with this, and test out the feature easily with any
distributed build.   That would also make choices for which selections go in the
mozilla deafult builds, and other distributions easy to make, and allow it to be
easily turnned on again for 1.8.

can someone drum up a patch for 1, so we can get this into the next RC?
> can someone drum up a patch for 1, so we can get this into the next RC?

It's just a straightforward CVS backout.  I don't have time to turn the relevant
cvs up commands into a patch, but if someone else does I'll review it. 
Branch-only backout, mind you.
note that tinderbox/bonsai will generate the cvs commands to do the backout...
Depends on: 241981
Attached patch BackoutSplinter Review
This should do it.
Comment on attachment 147337 [details] [diff] [review]
Backout

In light of bug 241981 I propose to check this in on both trunk and branch and
just let the matter be.
Attachment #147337 - Flags: superreview?(darin)
Attachment #147337 - Flags: review?(darin)
Attachment #147337 - Flags: approval1.7?
Comment on attachment 147337 [details] [diff] [review]
Backout

r+sr=darin for backout on both the trunk and 1.7 branch.
Attachment #147337 - Flags: superreview?(darin)
Attachment #147337 - Flags: superreview+
Attachment #147337 - Flags: review?(darin)
Attachment #147337 - Flags: review+
Comment on attachment 147337 [details] [diff] [review]
Backout

a=chofmann for 1.7
Comment on attachment 147337 [details] [diff] [review]
Backout

Setting flag per chofmann's comment.
Attachment #147337 - Flags: approval1.7? → approval1.7+
Keywords: fixed1.7
Comment on attachment 147337 [details] [diff] [review]
Backout

Backout checked in, trunk and branch.
You need to log in before you can comment on or make changes to this bug.