Closed Bug 7417 Opened 25 years ago Closed 25 years ago

[PP]Linux: Flakey startup, menus disappear, resize not working

Categories

(Core Graveyard :: Tracking, defect, P1)

x86
Linux
defect

Tracking

(Not tracked)

VERIFIED WORKSFORME

People

(Reporter: mcafee, Assigned: warrensomebody)

References

Details

(Whiteboard: [PDT+]have workaround for m7, this is randomly reproduceable and ugly without it,)

Attachments

(4 files)

Linux, gtk 1.3 (tip of both mozilla & gtk)

Start up apprunner, you need to resize the window
to get content to show up.  Also, possibly related,
the menubar has disappeared and I can't resize to
content larger than the original size (both horiz.
and vertical resizing look broken).  CC-ing some people
so we can narrow this down.
Target Milestone: M7
Here's what I see on startup:

  ftp://mocha/mcafee/bugs/7417/7417.gif

resizing the window wakes up the content,
but the menubar never shows up.  viewer does
not have this problem.
[akkana says in mozilla.builds:]

With a build pulled Tuesday morning around 10:15am, the first time I
ran apprunner everything worked great, but the second and third times,
I saw all the problems Chris describes.
I don't see this problem when I run apprunner by hand (setting LD_LIBRARY_PATH
and MOZILLA_FIVE_HOME to dist/bin), but I *do* see this behaviour when I run the
app using mozilla-apprunner.sh
QA Contact: leger → phillip
Updating QA Contact
This problem comes and goes.  Furthermore, it seems to come and go for different
people at the same time (Chris and I both saw it at the same time, then the
problem went away for both of us at the same time).  I'm guessing that maybe
it's related to the netlib-flaky-loading bug (which previously only showed up
with images), 3291.
I now notice that pulling my machine off the network
hangs the rendering process before it gets to the menus,
maybe we're hitting a network problem as Akkana suggests?
The resize event might be "trying again" which is working?
I also saw this bug after I updated and built my tree this morning. I am doing a
clobber build now to see if it is any better.

The flash panel does a synchronous load to grab the flash information. Maybe it
is hanging because that page is not loading properly? Waterson has another bug
on that.

Also, Paul MacQuiddy reported that the bookmarks are not loading properly on any
platform. That may also be holding things up. He said that even the "Manage
Bookmarks" dialog is not working.
It's baaaack.  I cannot track down what is causing
this, it appears to be slightly-random for me.
Timer bug?  Old pentium bug? ;-)
Scratch what I said earlier; rerunning enough times, I eventually reproduce the
naughty behaviour.
Severity: normal → blocker
Priority: P3 → P2
Summary: Linux: content needs resize event to trigger rendering → [Block] Linux: content needs resize event to trigger rendering
Raising priority, marking as "stability" blocker.
Summary: [Block] Linux: content needs resize event to trigger rendering → [PP]Linux: content needs resize event to trigger rendering
Looks like mail/news is having the same problem:

  http://bugzilla.mozilla.org/show_bug.cgi?id=7478

Not duping this until we know a little more about what's
going on.
Summary: [PP]Linux: content needs resize event to trigger rendering → [Blocker] [PP]Linux: content needs resize event to trigger rendering
Sorry, I consider this a blocker.
Resummarizing.  If you feel differently,
please add a comment and mark it so.
Priority: P2 → P1
Summary: [Blocker] [PP]Linux: content needs resize event to trigger rendering → [PP]Linux: content needs resize event to trigger rendering
And a P1 to fix please!
This doesn't seem to have anything to do with the sidebar. If I change the
sidebar to be about:blank, the menubar comes up, but the menus are still blank.
Also, the resizing is still messed up.
RE: Linux 6/1 build (1999-06-01-08 m7)
I have seen a problem similar to this:
1. Yesterday evening, I sent my mail test account qatest38 several mail messages
with an attached jpeg image. I could view the jpeg image on win_nt 4.0 and on
Mac using the 6/1 seamonkey build, but I could not view it with the Linux 6/1
seamonkey build.
2. Then I used the 4.6.1 build. I could view the image on solaris 2.51 but not
on Linux; all I saw was a broken image icon. I sent several messages with
the attached jpeg and still could not view it on my linux box.
3. However, this morning, using the same Linux build (6/1) and viewing the same
mail messages on the same Linux box, I could view the jpeg image in each of the
mail messages.  I thought it was a problem with my machine because Seth said he
had no problem.
*** Bug 7478 has been marked as a duplicate of this bug. ***
Assignee: don → chofmann
OK, this problem seems to appear and then disappear for everyone at the same
time.  Two hours ago it was happening consistently, and then an hour ago it
stopped happening consistently.  And now it's back.

I've been in and out of slamm's cube this afternoon with most of the xheads and
it's our best guess that since this starts and stops happening to everyone at
the same time, it is likely some weird network or timing problem.  But we don't
know how to even debug it.

I talked to Warren and he says that Rick Potts might be able to figure it out if
it's actually a netlib problem.

Other than that, I don't know what to do with this bug.  Chris, what should I
do?  Re-assign this to rpotts?
Apprunner runs fine this morning. I am running the same bits I used yesterday
when it was broken.

The menus show up, all the content loads, and even resize works properly.
As of 10:00 it is broken for me with today's builds :-(
Yep, it is broken again. I tried replacing the home page; here are my results:

homepage (set in navigator.xul)          results
-------------------------------          -------
http://www.mozilla.org/                  broken
http://www.uiuc.edu/                     works
http://www.yahoo.com/                    works
http://photo.net/                        works
http://www.cnn.com/                      works
bad url                                  works
http://cvs-mirror.mozilla.org/webtools/tinderbox/showbuilds.cgi?tree=SeaMonkey
                                         broken
not found url on mozilla.org             works
mozilla pages copied to my server        works
unwrapped mozilla.org page               works

So, it seems to have something to do with the way mozilla.org is serving up
wrapped pages.
Slamm, you have menus working when you change the start page?
Yes, menus work when I change the start page. Then, if I go to
http://www.mozilla.org/, the menubar stays, but the menus only come up as tiny
empty boxes.
Summary: [PP]Linux: content needs resize event to trigger rendering → [PP]Linux: Flakey startup, menus disappear, resize not working
Resummarizing.  Yes, changing the home page from
  http://www.mozilla.org/quality/smoketests
to
  http://www.apple.com
fixes everything.  I also tried other pages
at mozilla.org, no go, this seems to be failing
with mozilla.org pages.
Mozilla.org has long shown a tendency to trigger sporadic sometimes-not-loading
bugs in our code -- see 3291 for more history on that.
I didn't run into this for a couple of days, but I am again today. Same deal:
June07 Linux build started out fine, became broken after a restart. Can't really
use it for testing like this....although resizing does allow a message to draw,
the resizing has to be done for every message, and it still doesn't enable the
mail menus.
Sorry; if I change package/chrome/messenger/content/messagePane.xul to use
another home page, in analogous fashion, the workaround DOES work.
Assignee: chofmann → dp
dp said he would try to contribute some analysis to this...
Messenger has the same problem.

The messenger start page is http://www.mozilla.org/mailnews/prefs-info.html

(see mozilla/dist/bin/chrome/messenger/content/default/shareglue.js)

So on Linux messenger, we get the same bug:  no menus, resizing not working.

If I save prefs-info.html and 7 gifs used by the page to my disk, and change the
start page to
file:/u/sspitzer/chupadocs/7417/mailnews/prefs-info.html, it works, I get menus,
and #7417 doesn't happen to me.

If I use enterprise server 3.6 (on solaris) and set the start page to
http://chupa/sspitzer/7417/mailnews/prefs-info.html  (which is really
/u/sspitzer/chupadocs/7417/mailnews/prefs-info.html) it works.

wild off the wall guesses:

1) netlib problem on linux?

2) perhaps it is an apache / mozilla combination problem.  I've seen this with
slashdot and 4.x on linux.  sometimes, pages never seem to load.  if the page
never finished loading, would that prevent the menus from showing up?
I'm assuming by "wrapped" pages, you mean the addition of the sidebar and
titlebar.  Maybe it would be possible to experiment with those and find out if
it's one thing in particular that's causing this problem to show up.  Are those
added via SSI, a CGI script, or some other method?  Are there URLs that will
bring up the top bar and side bar separately?
adding sspitzer to the cc list.
Seth: I loaded the mozilla.org pages from my apache server just fine. Is there
any way to trace netlib to see exactly what gets loaded and when?

bryner: Pages get "wrapped" with the title and side menu when you check in
files to the document tree. When you load mozilla.org pages, you are loading
straight html.
I need to get more sleep.

http://www.mozilla.org is Netscape-Enterprise/3.6 running on IRIX.

ignore my apache comments.
bash-2.00# uname -a
SunOS gila.mozilla.org 5.6 Generic_105181-04 sun4u sparc SUNW,Ultra-2

Right about the Enterprise Server part, though.
(thanks leaf, I telneted to mozilla.org not www.mozilla.org)

http://chupa/sspitzer/7417/mailnews/prefs-info.html
is being served by Netscape-Enterprise/3.6 on
SunOS chupacabra 5.5.1 Generic_103640-24 sun4u sparc SUNW,Ultra-1

so if file:// works and local (fast) web pages work, I'd say it's time to debug
netlib and make sure we're getting the whole page from mozilla.org and the
connection is being closed.
Interestingly, this bug doesn't happen when the sidebar is collapsed (the one in
the apprunner browser window, not the fakie side table cell that's built into
mozilla pages).

Try it: when you see this problem in the browser, hit the flippy to collapse the
sidebar.  Quit apprunner (with the windowmanager 'x' since there are no menus).
Start apprunner again.  It comes up with the sidebar collapsed, and loads the
page perfectly well, and displays menus!  Now click on the flippy to bring the
sidebar back; the sidebar doesn't really come back, but it sets internal state
so that if you quit and restart, apprunner starts with the sidebar, and sure
enough, now the bug is back!
Here is how you get a netlib trace:

setenv NSPR_LOG_MODULE NETLIB:5
setenv NSPR_LOG_FILE netlib.log
./apprunner

netlib.log has the log from netlib. Can one of you who is seeing the bug do that
and add the log as an attachment?
I tried dp's suggested commands, since I'm seeing the bug right now, but it
didn't put a netlib.log into the current directory.  In case it was a path
problem, I tried
setenv NSPR_LOG_FILE /tmp/netlib.log
but that didn't help, it didn't write anything to /tmp/netlib.log either.
I also tried PR_LOG_MODULE and PR_LOG_FILE, still no file.  Any ideas?
Status: NEW → ASSIGNED
Typo: NSPR_LOG_MODULES not NSPR_LOG_MODULE (sorry)

setenv NSPR_LOG_MODULES NETLIB:5
setenv NSPR_LOG_FILE netlib.log
I tried the corrected Netlib logging instructions and all I got was a 0-length
netlib.log.  Does this require a debug build?
IMAP (via NSPR) logging in seamonkey used to require a debug build. bienvenu
fixed it...perhaps netlib logging would require a similar change.
dp- did you do the same actions on both runs?  I noticed that the log from the
non-working run is almost twice as long as for the working run.
Ah, never mind, I think it's just because of the 1024[804f108]: on every line.
Brian, you are right, the bad run does have a lot more stuff. Dp said that might
have to do with reaching the 4 connection limit:

More than 4 cached connections.  Deleteing one...

But actually, I see more of those messages in the good run.
It looks like you got the working run by turning off the sidebar.  Can anyone
seem to get a working run WITH the sidebar?
Note that to create the above diff I made the mozilla paths the same and removed
the beginning of line headers, so what's shown is the real differences.
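For anyone repeating this comparison, the normalization described above could be sketched roughly as follows. This is only an illustration, not the script that produced the attached diff: the function names, the example path, and the placeholder string are all made up here, and the header pattern is guessed from the "1024[804f108]:" prefix quoted earlier.

```javascript
// Hypothetical helpers for making two netlib logs comparable before diffing.
// The header format (decimal id, hex value in brackets, colon) is an
// assumption based on the "1024[804f108]:" prefix seen in this bug.

// Remove the per-line "1024[804f108]: " style header, if present.
function stripLogHeader(line) {
  return line.replace(/^\d+\[[0-9a-f]+\]:\s?/i, "");
}

// Strip headers and replace a run-specific path prefix with a fixed
// placeholder so that only real differences remain. Returns an array of lines.
function normalizeLog(text, pathPrefix, placeholder) {
  return text
    .split("\n")
    .map(stripLogHeader)
    .map(function (line) {
      return line.split(pathPrefix).join(placeholder);
    });
}
```

Running both logs through normalizeLog (with each run's own tree path as pathPrefix) and then diffing the results should leave only the lines that actually differ between the good and bad runs.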
*** Bug 3291 has been marked as a duplicate of this bug. ***
Status: Still can't reproduce the bug reliably. Theory is that this looks server
and time of day dependent. Netlib logs look ok.

Suggestion: wait until necko lands.
Whiteboard: no progress towards reproducability in m7 -> moving to m8
Target Milestone: M7 → M8
Blocks: 7232
I don't see how M7 can ship with this still happening.
Whiteboard: no progress towards reproducability in m7 -> moving to m8 → this is randomly reproduceable, we need a workaround for M7
Target Milestone: M8 → M7
Agreed, we can't ship M7 like this.
Moving back to M7, asking for a work-around.
Maybe a Linux-only work-around?
Assignee: dp → chofmann
Status: ASSIGNED → NEW
start page changes to http://www.browserwatch.com until we fix this.

I had to change mozilla/xpfe/browser/src/navigator.xul.

For M8, I'm going to write some javascript to get the value of the start up
page pref.

Something like:
  // Look up the preferences service, checking for failure at each step.
  var startuppage;
  var pref = Components.classes['component://netscape/preferences'];
  if (pref) {
    pref = pref.getService();
  }
  if (pref) {
    pref = pref.QueryInterface(Components.interfaces.nsIPref);
  }
  if (pref) {
    startuppage = pref.GetCharPref('startpage');
  }

you get the idea.
Target Milestone: M7 → M8
using work around for m7. moving to m8.
dmose wisely points out that:

"just curious to know if anyone has checked with browserwatch.com to make
sure that they're ok with this.  we've had problems in the past with
doing stuff that we thought was benign and causing people's sites to get
hammered, and they haven't been happy..."

can someone confirm that #7417 won't rear its ugly head if the start page is
http://www.netscape.com?
Could we possibly just make the start page about:blank or one of the built-in
samples?  That would also immensely speed startup for those of us who are
testing something other than the default page.  If the smoketests need a
complicated, slow startup page, they could specify -url (is there a way to do
this on mac?)
Target Milestone: M8 → M7
Back onto the M7 radar for URL confirmation.
when m8 opens up, I'll check in some changes to

mozilla/xpfe/browser/src/navigator.js
mozilla/modules/libpref/src/init/all.js
mozilla/xpfe/browser/src/navigator.xul

then, the start page will be determined by the int pref "browser.startup.page"

// 0 = blank, 1 = home (browser.startup.homepage), 2 = last,
// 3 = splash (browser.startup.splash)

depending on that pref, we use about:blank, the value of the char pref
"browser.startup.homepage", the last visited page, or the value of the char pref
"browser.startup.splash"

all of this is done through js!
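A rough sketch of the selection logic described above, to make the mapping concrete. This is not the code that will be checked in: the prefs argument is a plain object standing in for the real preferences service, the lastVisited argument is a made-up stand-in for however the last page gets remembered, and falling back to about:blank for unknown or missing values is an assumption.

```javascript
// Hypothetical sketch: pick the start page from the prefs named in this bug.
// "prefs" is a plain object keyed by pref name (an assumption for this
// example, not the nsIPref interface).
function resolveStartPage(prefs, lastVisited) {
  switch (prefs["browser.startup.page"]) {
    case 0: // blank
      return "about:blank";
    case 1: // home
      return prefs["browser.startup.homepage"] || "about:blank";
    case 2: // last visited page
      return lastVisited || "about:blank";
    case 3: // splash
      return prefs["browser.startup.splash"] || "about:blank";
    default: // unknown or unset pref value: assumed fallback
      return "about:blank";
  }
}
```

For example, with "browser.startup.page" set to 1 and "browser.startup.homepage" set to http://www.mozilla.org/, resolveStartPage returns the homepage URL; with the int pref set to 0 it returns about:blank regardless of the other prefs.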
I've sent mail to the mozillazine.org owner asking if it's ok to switch to his
page... it's a bit more relevant to mozilla users than www.netscape.com, and it
loads relatively quickly.
Assignee: chofmann → leaf
Whiteboard: this is randomly reproduceable, we need a workaround for M7 → have workaround for m7, this is randomly reproduceable and ugly without it,
Assignee: leaf → warren
Target Milestone: M7 → M8
back to m8 and over to warren
www.pathfinder.com, which eventually redirects to
http://cgi.pathfinder.com/time, exhibits this same behavior on Linux.
Moving all Apprunner bugs past and present to Other component temporarily whilst
don and I set correct component.  Apprunner component will be deleted/retired
shortly.
Status: NEW → ASSIGNED
Blocks: 9184
Target Milestone: M8 → M9
necko now lands in m9
*** Bug 9610 has been marked as a duplicate of this bug. ***
I have been noticing this too for a while...and i just wanted to add a few urls
that lead to the menu and resizing to be disabled.  If you go to any of these
sites, you will no longer be able to use the menu or properly resize the window:

www.pathfinder.com
www.warnerbros.com
www.usatoday.com
www.mozilla.org
so does www.bild.de. (don't ask me how i knew that).

i'll try to figure out what these sites have in common.
well there is a mix of versions but they are all
enterprise servers and they are all http 1.1

grok /u/chofmann % telnet www.mozilla.org 80
Trying 207.200.73.41...
Connected to gila.mozilla.org.
HTTP/1.1 400 Bad Request
Server: Netscape-Enterprise/3.6

grok /u/chofmann % telnet www.warnerbros.com 80
Trying 204.140.6.16...
Connected to www.warnerbros.com.
HTTP/1.1 400 Bad Request
Server: Netscape-Enterprise/3.5.1C

grok /u/chofmann % telnet www.usatoday.com 80
Trying 206.251.19.72...
Connected to www.usatoday.com.
HTTP/1.1 400 Bad Request
Server: Netscape-Enterprise/3.6


----------------
it's interesting to note that cnn is http 1.0 enterprise 2.0

grok /u/chofmann % telnet www.cnn.com 80
Trying 207.25.71.20...
Connected to cnn.com.
HTTP/1.0 400 Bad Request
Server: Netscape-Enterprise/2.01

and photo.net is ms iis
grok /u/chofmann % telnet photo.com 80
Trying 206.31.52.194...
Connected to photo.com.
HTTP/1.1 400 Bad Request
Server: Microsoft-IIS/4.0
maybe an enterprise server bug that was broken in 3.x but
fixed in an enterprise service pack? www.uiuc.edu is
on the working list and it shows 3.6 SP2

Trying 128.174.5.27...
Connected to spider.cso.uiuc.edu.
HTTP/1.1 400 Bad Request
Server: Netscape-Enterprise/3.6 SP2
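If more sites turn up, the comparison above could be done mechanically on saved telnet output. A minimal sketch, assuming the response text has been captured as a string; the function name and the returned field names are made up for this example.

```javascript
// Hypothetical helper: pull the HTTP version, status code and Server header
// out of captured response text like the telnet sessions pasted above.
function summarizeResponse(text) {
  var status = /^HTTP\/(\d+\.\d+)\s+(\d{3})/m.exec(text);
  var server = /^Server:\s*(.+)$/m.exec(text);
  return {
    httpVersion: status ? status[1] : null,
    statusCode: status ? Number(status[2]) : null,
    server: server ? server[1].trim() : null
  };
}
```

Feeding it the www.mozilla.org session above gives httpVersion "1.1" and server "Netscape-Enterprise/3.6", which makes it easy to tabulate the broken and working sites side by side.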
Has anyone figured out yet whether this still happens in necko?
As far as I can see this doesn't happen with Necko.  I changed the start page to
www.mozilla.org and I get working menus and working resizing.  But since the
problem comes and goes some other people should try to verify it.
Blocks: 11346
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → WORKSFORME
I think this has fixed itself with the NECKO landing,
marking worksforme.
Status: RESOLVED → VERIFIED
verified worksforme...these problems have disappeared after the necko landing.
Whiteboard: have workaround for m7, this is randomly reproduceable and ugly without it, → [PDT+]have workaround for m7, this is randomly reproduceable and ugly without it,
Putting on [PDT]+ radar.
No longer blocks: 11346
Product: Core → Core Graveyard