Closed Bug 463215 Opened 16 years ago Closed 16 years ago

Browser intermittently stalls/hangs for long periods resolving hostnames - Looking up <hostname> in status bar

Categories

(Core :: Networking, defect)

x86
Linux
defect
Not set
major

RESOLVED WORKSFORME

People

(Reporter: wgianopoulos, Unassigned)

References

Details

(Keywords: regression)

Starting with today's nightly,

Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.1b2pre) Gecko/20081105 Minefield/3.1b2pre ID:20081105030619

If your proxy settings are set to "Auto-detect proxy settings for this network" and the network requires no proxy, the browser is extremely slow to follow links, and the status bar shows that it is resolving the hostname for the whole delay.

I suspect this regression is caused either by the check-in for bug 235853 or for bug 453403.

I will add more info once I figure out which check-in is responsible.
Flags: blocking1.9.1?
It appears bug 235853 was backed out so that leaves bug 453403.  I will try to verify this via backout later today, when I have more time.
Blocks: 453403
A build for changeset f85ebe02a384 does not exhibit the problem, but a build from changeset 19b3caf108d1 does.  This verifies bug 453403 as the cause of this regression.
This no longer occurs in the current trunk as the check-in for bug 453403 has been backed out.
I'm seeing similar behavior in yesterday's nightly on Windows XP (Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b2pre) Gecko/20081105 Minefield/3.1b2pre). Loading pages seems to stall *a lot*, and frequently it will sit for quite a while looking up the hostname. The UI is responsive, but the network access feels a lot worse. Waiting on today's nightly to see if that patch is to blame.
Oh, except my proxy settings are set to "no proxy". Did you verify that that proxy setting was necessary to reproduce?
(In reply to comment #5)
> Oh, except my proxy settings are set to "no proxy". Did you verify that that
> proxy setting was necessary to reproduce?

It seemed rather random when it stalled at "Looking up <hostname>".  It would do it for a while and then not for a while.

I could sometimes run a while without it happening.  I changed my proxy setting to No proxy and it seemed better, but I am not sure I ran that way long enough to be certain the issue did not occur.

I then set it back to Auto-detect in order to isolate the regression.

Updating summary.
Summary: Browser hangs for long periods resolving hostnames if auto-detect proxy setting selected and no proxy is required → Browser intermittently stalls/hangs for long periods resolving hostnames - Looking up <hostname> in status bar
Definitely seems better with today's build.
I'm hoping one of the reporters can provide a better recipe.. I'm trying to reproduce, and I *think* I've seen it a couple of times, but nothing as dramatic as the reports. I'll keep hunting, of course - but if you've got a guide please post it. Thanks.
I have not been able to reproduce this issue at all under Windows.  Under Linux, it seems to happen if I just launch the browser and try to visit my webmail on my own server, my Yahoo mail, Google Reader and the mozillazine forums.  In most cases I cannot cycle even once between those four sites without ending up with a completely unresponsive browser.
Also, I now see that this does also happen with the proxy setting set to "No proxy", although it is easier to reproduce, at least for me, if it is set to "Auto-detect proxy settings for this network".  In my case, looking up the name wpad with either nslookup or dig returns an immediate NXDOMAIN response, so the issue is not that resolving wpad takes a long time and triggers the stall.
Patrick: if any PR logging output would help you, or if you'd like to make a try server build with extra debug logging that you think could help diagnose this, I'll gladly run such a build to get you more info.
ok, here is what I think is going on - the dns threads are divided into "any priority" and "high priority".. the pool grows dynamically up to some limits as needed, with high priority only threads being added after the first group is exhausted.

fwiw this isn't a straight partitioning - the "any" threads will run high priority requests before anything else in the queue, but the "high" threads will never run anything lower than high. The difference in practice is that a long running prefetch lookup (these run at a lower priority) can gum up an "any" thread, but not a high one (those only serve click activity). I suspect that a lot more prefetch lookups time out than their "active" counterparts. "click" activity shouldn't ever have to wait for prefetch activity.
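
A minimal sketch of that dequeue rule, for illustration only. This is not the actual resolver code: the names `Priority`, `LookupQueues` and `next_for` are hypothetical, and the real implementation has more priority levels and locking than this toy model.

```python
from collections import deque
from enum import IntEnum


class Priority(IntEnum):
    HIGH = 0  # any lookup that is not a prefetch
    LOW = 1   # speculative prefetch lookups


class LookupQueues:
    """Two FIFO queues, one per priority (hypothetical model)."""

    def __init__(self):
        self.queues = {Priority.HIGH: deque(), Priority.LOW: deque()}

    def submit(self, hostname, priority):
        self.queues[priority].append(hostname)

    def next_for(self, high_only):
        """Return the next hostname a worker thread may run, or None.

        Both kinds of thread take high priority work first. Only an
        "any" thread (high_only=False) falls back to low priority work.
        """
        if self.queues[Priority.HIGH]:
            return self.queues[Priority.HIGH].popleft()
        if not high_only and self.queues[Priority.LOW]:
            return self.queues[Priority.LOW].popleft()
        return None


q = LookupQueues()
q.submit("prefetch.example", Priority.LOW)
q.submit("clicked.example", Priority.HIGH)
print(q.next_for(high_only=False))  # clicked.example
print(q.next_for(high_only=True))   # None
print(q.next_for(high_only=False))  # prefetch.example
```

The second call shows why a "high" thread can never be tied up by a slow prefetch: it returns nothing rather than take the low priority entry.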

"click activity" is just shorthand for any dns lookup that isn't a prefetch.

I think the problem is that, due to a late change in the patch series, click activity is no longer submitted as high priority directly. Instead it starts as speculative and is upgraded to high a little later on. I see a couple problems now:

* That upgrade isn't quite right - it does make it onto the high queue, but it does not consider expanding the thread pool (and adding a high-only thread) as it would if it was just submitted as high originally. So if we're gummed up, we stay that way.

* It should really be submitted as high priority in the first place.

This is consistent with the reports that it freezes up, and then things are good for a while, and then it freezes up again..

I'll make those changes.
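
A toy model of the first problem, to make the failure mode concrete. Everything here is an assumption made for illustration: the class `DnsPoolModel`, its method names and the limits (two "any" threads, one high-only thread) are invented, and running lookups are assumed never to finish, which stands in for slow prefetch lookups.

```python
class DnsPoolModel:
    """Bookkeeping-only model of the pool; no real threads or DNS."""

    def __init__(self, max_any=2, max_high_only=1):
        self.max_any = max_any              # hypothetical limit
        self.max_high_only = max_high_only  # hypothetical limit
        self.any_threads = 0
        self.high_only_threads = 0
        self.running = []     # lookups picked up by a thread (never finish)
        self.high_queue = []  # waiting high priority lookups
        self.low_queue = []   # waiting low priority lookups

    def _try_spawn(self, is_high):
        """Grow the pool if limits allow; high-only threads come last."""
        if self.any_threads < self.max_any:
            self.any_threads += 1
            return True
        if is_high and self.high_only_threads < self.max_high_only:
            self.high_only_threads += 1
            return True
        return False

    def submit(self, host, is_high):
        if self._try_spawn(is_high):
            self.running.append(host)  # the new thread takes it at once
        elif is_high:
            self.high_queue.append(host)
        else:
            self.low_queue.append(host)

    def upgrade(self, host, consider_growth):
        """Move a waiting lookup from the low queue to the high queue."""
        self.low_queue.remove(host)
        if consider_growth and self._try_spawn(is_high=True):
            self.running.append(host)
        else:
            self.high_queue.append(host)


def run_scenario(consider_growth):
    pool = DnsPoolModel()
    pool.submit("slow-prefetch-1.example", is_high=False)
    pool.submit("slow-prefetch-2.example", is_high=False)
    # The click starts out speculative and is upgraded afterwards.
    pool.submit("clicked.example", is_high=False)
    pool.upgrade("clicked.example", consider_growth=consider_growth)
    return pool.high_only_threads, list(pool.high_queue)


print(run_scenario(consider_growth=False))  # (0, ['clicked.example'])
print(run_scenario(consider_growth=True))   # (1, [])
```

With the upgrade ignoring pool growth, the click sits on the high queue behind two stuck "any" threads and no high-only thread is ever created. Submitting the click with `is_high=True` from the start has the same effect in this model as the second run.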
I believe the version 9 patch attached to bug 453403 will resolve the problem. I have been browsing for the last hour with a minefield build with it included and have not seen a glitch.
I created a build including the version 10 patch and have been running it for an hour without seeing the issue.

I still occasionally see the Looking up <hostname> message, but it lasts only a couple of seconds and then the page displays and I can click on another link or bookmark without any issue.  With the previous patch, once I saw that message I could cancel that pageload and click on another bookmark and get exactly the same stall.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Flags: blocking1.9.1?
Resolution: FIXED → WORKSFORME