Closed Bug 335909 Opened 15 years ago Closed 14 years ago

Sporadic success viewing Amazon / Wikipedia site

Categories

(Core :: Networking: HTTP, defect)

x86
Windows XP
defect
Not set
normal

RESOLVED FIXED
mozilla1.8.1beta2

People

(Reporter: starakan, Assigned: darin.moz)

Details

(Keywords: fixed1.8.1, regression)

Attachments

(5 files, 2 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.2) Gecko/20060308 Firefox/1.5.0.2
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.2) Gecko/20060308 Firefox/1.5.0.2

When I try to access Amazon.com, I have sporadic success. Much of the time, I receive a blank page. Usually, if I refresh the page anywhere from 5 to 10 times, the page appears. But when I try to access another part of Amazon, the same thing happens. Repeated refreshes sometimes cause the page to appear, but usually it becomes so frustrating that I switch over to IE to use Amazon.

Reproducible: Always

Steps to Reproduce:
1. Go to http://www.amazon.com
2. If blank page appears, refresh screen
3. Continue refreshing screen to see if page appears (it usually does not)

Actual Results:  
Frustration.

Expected Results:  
Success.
WFM using Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.2) Gecko/20060308 Firefox/1.5.0.2
Have you tried emptying your cache?
Does it work in Firefox's safe-mode or with a new profile?
http://kb.mozillazine.org/Safe_Mode_(Firefox)
http://kb.mozillazine.org/Profile_Folder
I see this issue as well on my XP Pro system. It only seems to happen at Amazon and it ONLY seems to happen on this one particular machine out of the three running XP Pro that I regularly use (two at home, one at work).
confirming, because putterman tells me he sees this too.

putterman tells me:

"Either ever since I upgraded to 1.5 (I'm on 1.5.0.4 ) or I got a new computer (a Dell XP 400 running XP Home) I have been unable to use amazon.com in Firefox.  The majority of the time, I get blank pages whenever I go to Amazon or attempt to navigate through the site.  Refreshing multiple times can often make the page show up." 
Status: UNCONFIRMED → NEW
Ever confirmed: true
gavin / adam, have you heard of this sort of issue?

I think I've seen it with wikipedia, too.
Can't say that I have... but I'm mostly using Linux trunk these days.
> I think I've seen it with wikipedia, too.

I'm seeing it (or what I think is it) on wikipedia right now with my debug trunk build.
I'm not sure if this is related to the amazon issue, but here's what I'm seeing with wikipedia

I searched for "sith wiki" on google and found http://en.wikipedia.org/wiki/Sith and then I clicked on that link.

(yes yes, I'm a nerd, you caught me.)

when I first tried to load that page, it failed.

My guess is that we cached it somehow, because on reload (note, not shift reload) here's what I'd get back from the server:

HTTP/1.0 304 Not Modified
Date: Mon, 10 Jul 2006 21:29:42 GMT
Server: Apache
ETag: W/"enwiki:pcache:idhash:68184-0!1!0!0!!en!2--20060710194242"
Cache-Control: private, s-maxage=0, max-age=0, must-revalidate
Vary: Accept-Encoding, Cookie
Vary: Accept-Encoding
X-Cache: MISS from srv6.wikimedia.org
X-Cache-Lookup: MISS from srv6.wikimedia.org:80
Connection: keep-alive

Would a response like that cause the same "blank" page to show up on reload (note, regular reload, not shift reload)?  

a shift reload "fixes" the problem for me.

d'oh!  I should have checked about:cache?device=disk for "Key: http://en.wikipedia.org/wiki/Sith" to see if I had an empty entry before doing the shift reload!

I'm going to enable PR logging for HTTP so that next time this happens, I can see what's going on.

darin, any hunches on what might be going on?
Unfortunately, the 304 response doesn't tell us very much about the copy in the browser's cache.  The next time you see this, it might be a good idea to try "View->Source" or check out the entry in about:cache to see if it is an empty file.  Page Info might also be interesting.
I did do a view source and page info, and it was a completely empty document.

darin tells me that it means the original cached document was empty

I'm now running with:

NSPR_LOG_MODULES=nsHttp:5
NSPR_LOG_FILE=C:\home\mozilla-nspr.log

so if it happens again I'll have more info.

I'm going to morph the summary here to cover wikipedia until I figure out the two are not related.
Summary: Sporadic success viewing Amazon site → Sporadic success viewing Amazon / Wikipedia site
I started the trunk and tried again (googling for "wiki cheese" and then clicking on the http://en.wikipedia.org/wiki/Cheese link).

see the end of the log; look for "Cheese" to see the http protocol log info.
I think I see what's going on here.  This is triggered by a prefetch that is aborted because the prefetched document is not cacheable.  For some reason, aborting the prefetch leaves a malformed cache entry that is empty.  When the user visits the page for real, the browser attempts to use the empty cache entry.  It validates it against the server, and the server says that the entry is valid.  As a result, we end up serving the empty file to the user.  Looks like a pure HTTP bug to me.
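The sequence described above can be sketched as a toy model. This is a minimal Python illustration under stated assumptions, not the actual Necko code: `ToyServer`, `ToyBrowser`, and `CacheEntry` are hypothetical names, the cache is a plain dict, and validation is reduced to an ETag comparison.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    """Hypothetical cache record: a validator plus whatever body was stored."""
    etag: str
    body: bytes = b""


class ToyServer:
    """Stand-in origin server that serves a single resource."""

    def __init__(self, etag: str, body: bytes) -> None:
        self.etag = etag
        self.body = body

    def get(self, if_none_match=None):
        # A matching validator yields 304 with no body, like the response
        # quoted earlier in this bug.
        if if_none_match == self.etag:
            return 304, self.etag, b""
        return 200, self.etag, self.body


class ToyBrowser:
    def __init__(self, server: ToyServer) -> None:
        self.server = server
        self.cache = {}

    def prefetch(self, url: str) -> None:
        _status, etag, _body = self.server.get()
        # The entry is created once the headers are known...
        self.cache[url] = CacheEntry(etag=etag)
        # ...and the prefetch is aborted before any body is stored.
        # The flaw being modelled: the empty entry is left behind.

    def load(self, url: str) -> bytes:
        entry = self.cache.get(url)
        if entry is None:
            _status, etag, body = self.server.get()
            self.cache[url] = CacheEntry(etag=etag, body=body)
            return body
        # Revalidate the cached copy against the server.
        status, etag, body = self.server.get(if_none_match=entry.etag)
        if status == 304:
            # The server only confirms the validator, not the stored bytes.
            return entry.body
        self.cache[url] = CacheEntry(etag=etag, body=body)
        return body
```

In this model, calling `prefetch(url)` followed by `load(url)` returns an empty body even though the server holds a full page, while `load(url)` on a fresh `ToyBrowser` returns the page.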
Assignee: nobody → darin
Component: General → Networking: HTTP
Product: Firefox → Core
QA Contact: general → networking.http
Target Milestone: --- → mozilla1.8.1beta2
Version: unspecified → Trunk
This was probably caused by my patch for bug 330397.
Blocks: 330397
Status: NEW → ASSIGNED
Keywords: regression
Yup, this is a regression from bug 330397 (and possibly bug 189570 for the 1.8 branch).  The problem is that we are storing partial cache entries in the cache that lack a 'content-length' header.  As a result, the code that re-uses cache entries thinks that the cache entries are complete since there is no information to suggest otherwise.
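To make the reasoning above concrete, here is a small Python sketch of a completeness check that relies only on a `content-length` header. It is an illustration, not the real cache code: the function names, the lowercase header keys, and the `write_finished` flag are all assumptions made for the example.

```python
from typing import Mapping, Optional


def declared_length(headers: Mapping[str, str]) -> Optional[int]:
    """Return the declared Content-Length, or None if absent or malformed."""
    value = headers.get("content-length")
    if value is None:
        return None
    try:
        length = int(value)
    except ValueError:
        return None
    return length if length >= 0 else None


def looks_complete_naive(headers: Mapping[str, str], stored_size: int) -> bool:
    """Header-only check: an entry counts as partial only when a declared
    length exists and exceeds what is stored."""
    expected = declared_length(headers)
    if expected is None:
        # Nothing suggests the entry is short, so it is treated as complete.
        return True
    return stored_size >= expected


def is_reusable(headers: Mapping[str, str], stored_size: int,
                write_finished: bool) -> bool:
    """Hypothetical stricter check: also require an explicit record that
    the original write ran to completion."""
    return write_finished and looks_complete_naive(headers, stored_size)
```

With no `content-length` header and zero stored bytes, `looks_complete_naive` still reports the entry as complete, which is the situation described in this comment; `is_reusable` rejects it unless the write was recorded as finished.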
Attached patch v1 patch (obsolete)
This patch solves the problem for the "wiki cheese" google search testcase that seth produced.  It is a 1.8 branch specific patch since the corresponding code on the trunk changed considerably.  I'm going to post a separate trunk patch shortly.
Attachment #228755 - Flags: superreview?(bzbarsky)
Attachment #228755 - Flags: review?(cbiesinger)
(In reply to comment #15)
> Created an attachment (id=228755)

Is the issue that ProcessNormal (and hence, InstallCacheListener) never gets called in the case here, which would otherwise set mOpenedCacheForWriting?
scott writes:

"I opened the browser to the mozilla home page, then went to Amazon.  I got a blank page. I hit reload 3 or 4 times and still got a blank page."
> Is the issue that ProcessNormal (and hence, InstallCacheListener) never gets
> called in the case here, which would otherwise set mOpenedCacheForWriting?

Yes.  I believe that OnStartRequest is returning an error, which causes us never to reach InstallCacheListener.  Another choice would be to fix this bug by catching that error there.
Comment on attachment 228755
v1 patch

OK... are you intentionally using == for these comparisons instead of & like in ProcessNormal?
Yes, "==" is key because I'm selecting the case where these cache entries are newly created.  OK... I think you've convinced me that more documentation is needed in the patch.  Revised patch coming up...
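The difference between the two comparisons can be shown with a short Python sketch. The patch itself is not reproduced in this thread, so this is only an illustration: the constant values are assumed to mirror the `nsICache` `ACCESS_*` flags, and the helper names are hypothetical. It also assumes that a write-only grant is what a newly created entry receives.

```python
# Assumed to mirror the nsICache access flags (bit flags).
ACCESS_NONE = 0
ACCESS_READ = 1
ACCESS_WRITE = 2
ACCESS_READ_WRITE = ACCESS_READ | ACCESS_WRITE


def can_write(access: int) -> bool:
    """Bitmask test: true for any grant that includes write access."""
    return bool(access & ACCESS_WRITE)


def is_write_only(access: int) -> bool:
    """Equality test: true only when write access is the whole grant,
    which under the assumption above means a newly created entry."""
    return access == ACCESS_WRITE
```

For `ACCESS_READ_WRITE` (an existing entry opened for update), `can_write` is true but `is_write_only` is false, so the equality form singles out the newly created case.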
Attachment #228755 - Attachment is obsolete: true
Attachment #228755 - Flags: superreview?(bzbarsky)
Attachment #228755 - Flags: review?(cbiesinger)
*** Bug 344920 has been marked as a duplicate of this bug. ***
Flags: blocking1.8.1+
At least something must have happened between 1.9a1_2006022804 and 1.9a1_2006022812 that triggered the bug suddenly.
I see the white page (the Google link to the English Wikipedia page) already in 1.9a1_2006022812.
related to bug 342119?
Attached patch v2 patch
Simpler and more direct patch.  This patch was created against the MOZILLA_1_8_BRANCH.  The affected code is much different on the trunk, so a different patch is going to be necessary.
Attachment #229756 - Flags: review?(cbiesinger)
Attachment #229756 - Flags: review?(cbiesinger) → review+
Attachment #229756 - Flags: approval1.8.1?
Comment on attachment 229756
v2 patch

a=dbaron on behalf of drivers.  Please check in to MOZILLA_1_8_BRANCH and mark fixed1.8.1 once you have.

However, I'm also a little concerned that there's no trunk patch -- please make sure that the issue doesn't get lost so that we don't regress in the next release.
Attachment #229756 - Flags: approval1.8.1? → approval1.8.1+
fixed1.8.1

holding bug open for a trunk patch.
Keywords: fixed1.8.1
This was filed against 1.5.0.2 and is very evil. Should it go to the 1.8.0 branch?
Flags: blocking1.8.0.6?
*** Bug 342119 has been marked as a duplicate of this bug. ***
Please ignore comment 28 -- sorry.
darin, putterman downloaded "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1b1) Gecko/20060722 BonEcho/2.0b1" which should have your fix.

unfortunately, he is still able to reproduce the amazon problem.

he writes:

"This didn't solve it for me.  I've attached nspr logs for a run where I went to the home page followed by amazon.com.  Amazon.com is blank."

I've just attached his nspr log.

should I spin this issue off to a new bug?  (note, his issue matches the original issue reported in this bug.)
I think this patch has also cleared up the issues I reported in bug 322851 - I cannot reproduce it in build 2006080305. However, I would rather those with more knowledge than I of how prefetching works give it a quick read to confirm the changes would have resolved those issues before closing it.
 
*** Bug 347233 has been marked as a duplicate of this bug. ***
Attached patch trunk patch (obsolete)
Trunk patch.  This patch solves the "wiki cheese" google search scenario.
Attachment #232042 - Flags: review?(cbiesinger)
*** Bug 339459 has been marked as a duplicate of this bug. ***
Comment on attachment 232042
trunk patch

OK, as discussed, this is wrong because ResponseIsComplete can be true even though not all content has actually been written to the cache. In addition, it looks like this would doom entities that are sent without chunked encoding and without a Content-Length header.
Attachment #232042 - Flags: review?(cbiesinger) → review-
Attached patch trunk patch (v2)
Revised patch per discussion with biesi.  It's best not to use nsHttpTransaction::ResponseIsComplete because that field is not set to true when the connection is closed by the server (in the case of a response that lacks a Content-Length header and is not chunked), and ResponseIsComplete may report true before all of the data has been written to the cache.  The cache is only populated when the stream listener reads data from the channel.
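The point that the cache is filled only as the listener reads data can be illustrated with a toy Python sketch. This is not the trunk patch: `ToyCacheEntry`, `deliver`, and the `cancel_after` parameter are hypothetical, and dooming the entry on cancellation is just one possible policy, assumed here for illustration.

```python
from typing import Iterable, Optional


class ToyCacheEntry:
    """Hypothetical cache entry that only holds what was teed into it."""

    def __init__(self) -> None:
        self.data = bytearray()
        self.doomed = False

    def doom(self) -> None:
        # Discard the partial contents so they can never be reused.
        self.data.clear()
        self.doomed = True


def deliver(chunks: Iterable[bytes], entry: ToyCacheEntry,
            cancel_after: Optional[int] = None) -> bool:
    """Hand chunks to a consumer, copying each delivered chunk to the cache.

    `chunks` stands for data the network layer has already received in full.
    If the consumer cancels after `cancel_after` chunks, the remaining data
    never reaches the cache, so the entry is doomed. Returns True only when
    every chunk was delivered and cached.
    """
    for index, chunk in enumerate(chunks):
        if cancel_after is not None and index >= cancel_after:
            entry.doom()
            return False
        entry.data += chunk
    return True
```

Here the full response exists at the network level in both cases, yet with `cancel_after=1` only the first chunk would have reached the cache. The decision about the entry is therefore tied to what the listener actually consumed rather than to a network-level completion flag.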
Attachment #232042 - Attachment is obsolete: true
Attachment #232051 - Flags: review?(cbiesinger)
Attachment #232051 - Flags: review?(cbiesinger) → review+
Blocks: 333275
the "wiki cheese" problem doesn't seem to be what putterman is seeing, so I've spun off his specific problem to bug #347685
fixed-on-trunk
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
*** Bug 333275 has been marked as a duplicate of this bug. ***
Comment on attachment 229756
v2 patch

approved for 1.8.0 branch, a=dveditz for drivers
Attachment #229756 - Flags: approval1.8.0.7+
Flags: blocking1.8.0.7? → blocking1.8.0.7+
*** Bug 348451 has been marked as a duplicate of this bug. ***
Comment on attachment 229756
v2 patch

This is not needed for Firefox 1.5.0.x.  The mOpenedCacheForWriting business doesn't exist there, and that's what caused this bug.
Attachment #229756 - Flags: approval1.8.0.7+
Thanks for clarifying, Darin.
Flags: blocking1.8.0.7+ → blocking1.8.0.7-
Flags: blocking1.9a1?
Depends on: 454411