Favicon fails to load on some sites

RESOLVED FIXED in mozilla2.0b4

Status

()

Core
Networking
RESOLVED FIXED
8 years ago
8 years ago

People

(Reporter: Matheus Kerschbaum, Assigned: mayhemer)

Tracking

({regression})

Trunk
mozilla2.0b4
All
Windows 7
regression
Points:
---

Firefox Tracking Flags

(blocking2.0 final+)

Details

(URL)

Attachments

(4 attachments)

(Reporter)

Description

8 years ago
User-Agent:       Mozilla/5.0 (Windows NT 6.1; rv:2.0b4pre) Gecko/20100806 Minefield/4.0b4pre
Build Identifier: 

Likely a regression from bug 546606. Tested with safe-mode/new profile and got same results.

Reproducible: Always

Steps to Reproduce:
1.Load http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-win32/
Actual Results:  
No favicon is shown.

Expected Results:  
Favicon should have been shown.
(Reporter)

Updated

8 years ago
Blocks: 546606
Version: unspecified → Trunk
I see this also, confirming and setting to NEW
Status: UNCONFIRMED → NEW
Ever confirmed: true
blocking2.0: --- → ?
Keywords: regression
xp sp3, win32, latest nightly (localized), I don't experience that.
Favicon is being shown after maybe a 0.3 sec delay after throbbler got finished loading.
even in the example link
Was it broken in the 8-5 nightly too? Or just in 8-6?
Keywords: regressionwindow-wanted

Comment 5

8 years ago
range is between 2010080404 and 2010080504 Nightly (Windows)
Keywords: regressionwindow-wanted

Updated

8 years ago
Assignee: nobody → bjarne
Works for me, similar behavior as in comment 2.

Mozilla/5.0 (X11; Linux x86_64; rv:2.0b4pre) Gecko/20100808 Minefield/4.0b4pre

Would've suspected Adblock Plus (which breaks redirects horribly since bug 546606 landed) but original reporter saw this in safe-mode/new profile so probably not it.

Comment 9

8 years ago
Sometimes the favicon loads correctly on the initial load, if I then refresh the page the favicon disappears and stays gone.

Comment 10

8 years ago
I can reproduce on Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b4pre)
Gecko/20100808 Minefield/4.0b4pre ID:20100808040523.

I get same regression range of comment #7,

Press F5 then Favicon disappears always as same as comment #9.

Comment 11

8 years ago
I'm not able to reproduce on Linux. Also, observe that failure-reports seems to come from Windows but the reporter of comment #8 (where it works) uses Linux.

Could anyone confirm that this issue is only seen on Windows?

Comment 12

8 years ago
(Alternatively, of-course, state that the issue can also be seen on Mac/Linux)
Can someone who can reproduce this produce an HTTP log and attach it to this bug?

Comment 14

8 years ago
Works for me, Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0b4pre) Gecko/20100809 Minefield/4.0b4pre.

Probably Windows only?

Comment 15

8 years ago
Created attachment 464076 [details]
HTTP log

I try to OPEN and RELOAD the url in clean profile.
Open URL --- Favicon appears
Reload ------Favicon disappers,
That log is ... odd.  It shows way more requests for favicon.ico than I would expect, and shows 3 requests with no responses ever appearing at the end.

Comment 17

8 years ago
Odd indeed...

First: the code changed for favicons as part of fix for bug #546606 is trivial and should not cause any problems. However, a possible cause for regressions is the time-slot introduced when making the redirect-notification async.

Looking at the log from comment #15 with line-numbers enabled, the first request to favicon comes at line 1712 handled by nsHttpChannel@641ab40. The corresponding 302-response is at line 1882 and it receives the OnRedirectVerifyCallback() at line 1958. Cache-entry is closed at line 2014 - so far, this looks reasonable. The redirected request is sent at line 1989 by nsHttpChannel@641aff0 and it looks like its response comes at line 2397 and the cache-entry is closed at line 2468.

However, another request to favicon is made at line 2042 by nsHttpChannel@641acd0 and its 302-response is at line 2217. Big question: Where did this request come from?

It looks like it originates from nsHttpChannel@641acd0 created at line 1730. It receives OnCacheEntryAvailable() at line 2016 (reasonable - see two sections above) but decides to validate the entry based on expiration-time at line 2032 and hence hits the net. Big question now becomes: Why is nsHttpChannel@641acd0 created?

It's getting too late for me to pursue this further - any additions and/or corrections to my analysis so far is appreciated.

Comment 18

8 years ago
in Comment #15
> I try to OPEN and RELOAD the url in clean profile.

So, is the second a request by the reloading?

Comment 19

8 years ago
By the way,
After I open and reload the url ( http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-win32/  ) , I can not open  http://ftp.mozilla.org/favicon.ico, it is loading forever.

However,when I never carried out reloading the url ,  I can open  http://ftp.mozilla.org/favicon.ico .
Given the unknown here we should probably sort this regression out for 2.0, or decide it's not important enough once we understand what's going on here.
blocking2.0: ? → final+

Comment 21

8 years ago
(In reply to comment #18)
> in Comment #15
> > I try to OPEN and RELOAD the url in clean profile.
> 
> So, is the second a request by the reloading?

I don't think so. Current thinking is that it's one request from the tab and another from the address-field. If anyone can verify or falsify this, it might be useful. However, the real issue is that the cache seems to lock up on the entry somehow - I'm digging through this now.

Comment 22

8 years ago
Continuing analysis from comment #17...  the reload starts at line 2473.

The second request for favicon handled by nsHttpChannel@641acd0 created at line 1730 decides to validate the entry based on expiration-time at line 2032. It hits the net at line 2042 and receives the 302-response in line 2217. The replacement-channel nsHttpChannel@641ab40 (re-using the address) is created at line 2271 and set up at line 2282, but I cannot find any trace of this channel later in the log, nor can I find any trace of nsHttpChannel@641acd0 later in the log which is really odd and may indicate that it's stuck waiting for a callback. (There is no logging when nsHttpChannel starts waiting for callbacks though, and we should probably add this to nsHttpChannel.)

Alice0775: Could I ask you to make a new log? I need you to set "nsRedirect:5" in addition to the settings you used to make the HTTP-log (email me directly if it is unclear how to do this).

Comment 23

8 years ago
Created attachment 464782 [details]
HTTP log

set NSPR_LOG_MODULES=nsHttp:5,nsSocketTransport:5,nsHostResolver:5,nsRedirect:5
set NSPR_LOG_FILE=D:\log.txt

And run Minefied, Open url and reload.

Comment 24

8 years ago
Wow...  that was quick. Thanks!

However, I seem to have done a mistake when setting up logging for redirects (same effect as bug #545995) so the desired output did not appear in the new log. Sorry about that! :( I have filed bug #586264 together with a quick fix which I hope to get approved and landed asap.

If you build minefield yourself, could I perhaps ask you to apply the patch for bug #586264, re-run, and submit a new log like the previous one? Otherwise, I'll get back to you when the nightly builds include this fix...

Comment 25

8 years ago
Created attachment 464834 [details]
HTTP log with patch Bug 586264

Comment 26

8 years ago
Appreciated! Your log shows clearly that a "sink" (instance of nsIChannelEventSink) does not reply/call back. I'll go on tracking it down...

(Btw, this is a general weakness of the current approach: A sink not calling back will hang lots of things. We should address this in a separate bug IMO.)

Comment 27

8 years ago
(In reply to comment #25)
> Created attachment 464834 [details]
> HTTP log with patch Bug 586264

Just for the record (my apologies if this is obvious): You do run minefield without any add-ons, right (ref bug #546606, comment 98)? And there are no stack-dumps or anything in addition to the log?

Comment 28

8 years ago
(In reply to comment #27)
> (In reply to comment #25)
> > Created attachment 464834 [details] [details]
> > HTTP log with patch Bug 586264
> 
> Just for the record (my apologies if this is obvious): You do run minefield
> without any add-ons, right (ref bug #546606, comment 98)? And there are no
> stack-dumps or anything in addition to the log?

I did not install any add-ons.
What is stack-dumps, log? where is it?

Comment 29

8 years ago
I'm really not able to reproduce this on 32bit Vista with nightly builds from Aug 5th, 6th nor 12th. I don't have access to a 64bit Windows machine at the moment - are there anyone who can check for differences between 32 and 64 bit versions with respect to this bug (just to cover that base)?

(In reply to comment #28)
> I did not install any add-ons.

Are you using a new profile (to ensure there's nothing left from earlier issues)? Could you try that if not?

> What is stack-dumps, log? where is it?

I was just curious whether there are any indications of Minefield crashing of otherwise malfunctioning. :) I guess not. Thanks a lot for your help so far in any case!
I asked again in the mozillazine forums about this and I've been told that it does not happen in XP, or Vista 32bit - this is looking more and more like a 64 bit issue - what's different about the icon cache or whatever between 32 and 64 ?
I can reproduce the problem on 32bit just fine using the URL testcase.  Hit the reload button, the favicon disappears.  I also don't see why using 64bit would make any difference.
(Assignee)

Comment 32

8 years ago
The log also comes from 32 bit system.

Dennis (or actually anyone CC'ed): are you using a fresh profile?  If not, would you be willing to compress it and send it to me by mail + exact steps you are doing to reproduce, so I can test with it?  I'm unable to reproduce with a clean profile.

Comment 33

8 years ago
This happens with a fresh profile, I ran through these steps 5 times, deleting the profile between tries and reproduced it each time:

1) load http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-win32/
2) refresh the page and the favicon disappears

After step one http://ftp.mozilla.org/favicon.ico no longer loads, if I don't load the ftp url then I can access http://ftp.mozilla.org/favicon.ico without problems

Comment 34

8 years ago
(In reply to comment #31)
>  also don't see why using 64bit would
> make any difference.

Me neither, but it's worth to check. :)

Ok - so we have a regression which seems to happen on Win only. I cannot reproduce it on (32bit) Vista with nightlies, nor can Honza (and I guess he has also tested his own builds). It happens with clean profiles.

Can anyone who reproduce this build minefield themselves? And has access to a debugger? Could we arrange a time on irc and work together on this? Alice0775...?

Comment 35

8 years ago
(In reply to comment #34)
> Alice0775...?
Sorry, It is difficult for me.

BTW. Disable cache, then the problem is gone.
browser.cache.memory.enable;false
browser.cache.disk.enable;false
(Assignee)

Comment 36

8 years ago
I looked through 'HTTP log with patch Bug 586264'.  I cannot provide conclusion, just what I can see in the log:

There are 3 requests for http://ftp.mozilla.org/favicon.ico, as we know this redirects (302) to http://www.mozilla.org/favicon.ico.

Some addresses gets recycled, there are 4 nsHttpChannel addresses involved:
A = 3dc9880
B = 3dc9550
C = 3dc9220
D = 3dc93b8

- A is instantiated and have to load http://ftp.mozilla.org/favicon.ico
- B is instantiated and have to load http://ftp.mozilla.org/favicon.ico, waits for the cache entry currently held by A
- C is instantiated and have to load http://ftp.mozilla.org/favicon.ico, waits for the cache entry currently held by A
- A handles 302, redirects to http://www.mozilla.org/favicon.ico
- A setups D to load http://www.mozilla.org/favicon.ico (A redirects to D)
- A successfully calls all sinks, gets redirect veto result
- D asyncOpens
- A cancels and releases the cache entry
- B gets OnCacheEntryAvailable as the cache entry had been released
- C still waits, because now B holds the cache entry
- B bypasses the load from cache and loads again from the net
- A calls OnStopRequest and is being destroyed (released from memory), A is not a valid pointer since now
- B handles 302, redirects to http://www.mozilla.org/favicon.ico
- B setups NEW A (the redirect target channel has the same pointer value as A was having) to load http://www.mozilla.org/favicon.ico (B redirects to NEW A)
- B calls redirect sinks, but doesn't get answer from its callbacks sink !
- B stuck, holding the cache entry for http://ftp.mozilla.org/favicon.ico
- D loads the icon data from the net
- D is released from memory

I slightly believe there is some mess with the pointers.  This is pure guessing, but I can think of a code that uses channel pointers as keys for a hash table or something and later branches the code based on presence of the key in the hash table.  But this is only a guess.

Now I'm going to look at AsyncOnChannelRedirect implementations one by one and see where we can get stuck.

That's all I have.
(In reply to comment #32)
> The log also comes from 32 bit system.
> 
> Dennis (or actually anyone CC'ed): are you using a fresh profile?  If not,
> would you be willing to compress it and send it to me by mail + exact steps you
> are doing to reproduce, so I can test with it?  I'm unable to reproduce with a
> clean profile.

Honza, regular profile, simple STR, load the URL in the testcase, reload it.  I'm sending you an email now.
(Assignee)

Comment 38

8 years ago
I might find the culprit.

Alice0775 White: would you be willing to temper a bit one javascript file in
your Minefield installation directory?

If so, follow this instructions:
- close your failing Minefield instance, ensure it is down
- go to you c:\Program Files\<your failing Minefield installation dir>
- please double check it is the version that fails!  sorry, I rather emphasize
it
- go to modules/ directory
- find CertUtils.jsm file
- search for 'asyncOnChannelRedirect: function' in that file (should be line
146)
- then change the body from:

  asyncOnChannelRedirect: function(oldChannel, newChannel, flags, callback) {
    if (this.allowNonBuiltInCerts)
      return;

to

  asyncOnChannelRedirect: function(oldChannel, newChannel, flags, callback) {
    if (this.allowNonBuiltInCerts) {
      callback.onRedirectVerifyCallback(Components.results.NS_OK);
      return;
    }

This is the only place I found (so far) that is missing the call back where it
should be done.

Please let me know if it helped.  If you cannot find the file or modify it
manually, I'll create a TryServer build with that fix and provide a link.

Comment 39

8 years ago
(In reply to comment #38)
> I might find the culprit.
> 
> Alice0775 White: would you be willing to temper a bit one javascript file in
> your Minefield installation directory?
> 
> If so, follow this instructions:
> - close your failing Minefield instance, ensure it is down
> - go to you c:\Program Files\<your failing Minefield installation dir>
> - please double check it is the version that fails!  sorry, I rather emphasize
> it
> - go to modules/ directory
> - find CertUtils.jsm file
> - search for 'asyncOnChannelRedirect: function' in that file (should be line
> 146)
> - then change the body from:
> 
>   asyncOnChannelRedirect: function(oldChannel, newChannel, flags, callback) {
>     if (this.allowNonBuiltInCerts)
>       return;
> 
> to
> 
>   asyncOnChannelRedirect: function(oldChannel, newChannel, flags, callback) {
>     if (this.allowNonBuiltInCerts) {
>       callback.onRedirectVerifyCallback(Components.results.NS_OK);
>       return;
>     }
> 
> This is the only place I found (so far) that is missing the call back where it
> should be done.
> 
> Please let me know if it helped.  If you cannot find the file or modify it
> manually, I'll create a TryServer build with that fix and provide a link.
According to your instruction, I modified the jsm file.
And I confirm that the problem is fixed!.
(Assignee)

Comment 40

8 years ago
Created attachment 465995 [details] [diff] [review]
v1 [Check-in comment 43]

Curious why this is not invoked in debug builds and only on windows.
Assignee: bjarne → honzab.moz
Status: NEW → ASSIGNED
Attachment #465995 - Flags: review?(jst)
(In reply to comment #39)
> According to your instruction, I modified the jsm file.
> And I confirm that the problem is fixed!.

Confirmed also with Honza this fixed it for me too.  Just an FYI, for anyone following along, we had to also delete the local profile directories' XUL and XPC .mlf files before the .jsm change was noticed.
Comment on attachment 465995 [details] [diff] [review]
v1 [Check-in comment 43]

Good find guys! r=jst
Attachment #465995 - Flags: review?(jst) → review+

Comment 43

8 years ago
http://hg.mozilla.org/mozilla-central/rev/40f2bdf80c55
Status: ASSIGNED → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0b4

Comment 44

8 years ago
(In reply to comment #42)
> Good find guys! r=jst

Fantastic Honza!  :)  Stupidly enough, I must have screwed up this sink in the last round of changing the API. The sink is actually correct in the patch reviewed by biesi at https://bugzilla.mozilla.org/attachment.cgi?id=457284

I'm still pondering over a few questions though...

1) why is the BadCertHandler used with a plain http-url?
2) ... and why only on Windows?
(Assignee)

Comment 45

8 years ago
Comment on attachment 465995 [details] [diff] [review]
v1 [Check-in comment 43]

timeless, thanks.
Attachment #465995 - Attachment description: v1 → v1 [Check-in comment 43]
(In reply to comment #44)
> 1) why is the BadCertHandler used with a plain http-url?
> 2) ... and why only on Windows?

This is really confusing me. I'm pretty confident that the CertUtils.jsm code patched here isn't used for favicon loads at all. Is it possible that the broken nsIChannelEventSink implementation was somehow interfering with completely separate requests?
(Assignee)

Comment 47

8 years ago
(In reply to comment #46)
> (In reply to comment #44)
> > 1) why is the BadCertHandler used with a plain http-url?
> > 2) ... and why only on Windows?
> 
> This is really confusing me. I'm pretty confident that the CertUtils.jsm code
> patched here isn't used for favicon loads at all.

Apparently is, follow:

http://mxr.mozilla.org/mozilla-central/source/browser/components/wintaskbar/WindowsPreviewPerTab.jsm#108
http://mxr.mozilla.org/mozilla-central/source/browser/components/wintaskbar/WindowsPreviewPerTab.jsm#127
http://mxr.mozilla.org/mozilla-central/source/browser/components/wintaskbar/WindowsPreviewPerTab.jsm#533

Also explains why only on windows.

> Is it possible that the
> broken nsIChannelEventSink implementation was somehow interfering with
> completely separate requests?

Definitely not.  It is clear from the log, implementation and results of the fix.
(Assignee)

Comment 48

8 years ago
And the stack starts here, sorry I forgot to add it:
http://mxr.mozilla.org/mozilla-central/source/netwerk/base/src/NetUtil.jsm#186
(In reply to comment #47)
> Apparently is, follow:
> 
> http://mxr.mozilla.org/mozilla-central/source/browser/components/wintaskbar/WindowsPreviewPerTab.jsm#108

Oh, yeah, that file will have been affected for sure. But it only affects the favicon displayed on Windows taskbar tab previews, AIUI (not the one displayed in the location bar). The early comments on this bug make no mention of that, which is rather confusing.
You need to log in before you can comment on or make changes to this bug.