Closed
Bug 72027
Opened 24 years ago
Closed 24 years ago
Linux orange on tinderbox
Categories
(SeaMonkey :: General, defect, P1)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
mozilla0.8.1
People
(Reporter: mcafee, Assigned: mcafee)
Details
(Keywords: helpwanted, smoketest)
Linux is orange on tinderbox, attinasi & co. talking about
this on irc. This is a blocker, we need to fix this.
Comment 3•24 years ago (Assignee)
testing backout of rods' change to config/config.mk, clobbering
content & layout.
Comment 4•24 years ago
Locally, backing out rods' changes to config.mk and clobbering/rebuilding makes
the assertions at the end of the bloat tests go away. We are testing this on the
coffee tinderbox now.
Comment 5•24 years ago
Just for the record, there are two problems:
1. the alive test, which needs a clobber of at least layout and content to get
it to work. This is because of bug 72018 - the config.mk change didn't cause a
rebuild. Regardless of what happens with the bloat test, most of the ports will
need a clobber.
2. The bloat test, which looks like it's only on the debug tinderboxes - ie the
ones that rods' checkin affected. I don't see this problem, but I'm building
with --enable-optimize=-O2 (but NOT --disable-debug). Everyone on IRC who sees
this is running debug, and the optimised builds aren't affected.
Comment 6•24 years ago (Assignee)
other builds are randomly failing, and coffee test of config.mk just failed.
Some builds are working, I'm now pretty confused.
Comment 7•24 years ago
WFT man?!? Well, removing Rod's change doesn't seem to solve the problem (other
than the required clobber on linux that Bradley described).
Some curiosities:
* tinderbox-test-1 has NEVER gone orange, and it is Linux (-nondebug)
* after fixing the AliveTest problems by clobbering, we get intermittent timeout
failures and an occasional assert 'Assertion failure: 0 == rv, at ptsynch.c:168'
* we get several cycles of green when nothing has changed, then several orange,
usually with a timeout
Comment 8•24 years ago
I'm not sure what, if any, tests are running on tinderbox-test-1
cls should know... if it's running no tests it won't turn orange.
Comment 9•24 years ago
ok, I guess I should have looked at latest cycles
before making my comment..
mcafee turned on tests early this morning...
[mcafee@mocha.com - 03/14 01:59]
just turned on tests, bloat
numbers might be off.?
Comment 10•24 years ago
Has anyone outside netscape seen this? Is it a local networking issue of some
sort? That would match the intermittent behaviour, and it appearing and
disappearing without any code change.
The bloat test is exactly the same as ./mozilla -f bloaturls.txt isn't it? That
always works for me.
I've occasionally got a crash on shutdown similar to coffee's current problems.
It's very intermittent (about once every couple of weeks).
It seems to be clear now though, on the main page and ports (cement and muerte
need a clobber)
Comment 11•24 years ago
Bryner suggested I CC an NSPR rep - cc'ing larryh@netscape.com
We are getting periodic assertions in NSPR threads
'Assertion failure: 0 == rv, at ptsynch.c:168'
on Red Hat 6.x machines
Comment 12•24 years ago (Assignee)
tinderbox-test-1 is rh7.0. only builds with the sighup problem
are rh6.x + depend, I think.
Comment 13•24 years ago (Assignee)
adding darin, who noted the opposite, that rh7 was failing
but rh6.2 worked-for-him. tinderboxes break down as follows
coffee=rh6.2
shrike=rh6.0
harpoon=rh6.0
tinderbox-test-1=rh7.0
speedracer = Solaris 2.6
Comment 14•24 years ago
I'm on RH7 and am not seeing this. The last time I saw those assertions was on
my old machine running 6.2.
Comment 15•24 years ago
mcafee: turns out i was seeing a different problem (on my rh7 box) which was
simply solved by clobbering layout. I have, however, seen the ptsynch.c
assertion (at least) once on a rh6.2 box (it's dual processor, if this makes any
difference).
Comment 16•24 years ago
Could we perhaps modify ptsynch.c on one of the machines to actually give the
error code that pthread_mutex_lock is returning?
That might be helpful, rather than just knowing that it failed.
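A minimal sketch of the kind of instrumentation suggested here, assuming the failing check wraps a pthread_mutex_lock call as this comment implies. lock_and_report is a hypothetical helper, not NSPR code; the actual contents of ptsynch.c are not shown in this bug.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical helper (not NSPR code): lock the mutex and, if the call
 * fails, print the numeric return value so the log shows why it failed.
 * The caller can still assert on the returned value afterwards. */
static int lock_and_report(pthread_mutex_t *mutex, const char *file, int line)
{
    int rv;

    assert(mutex != NULL);
    rv = pthread_mutex_lock(mutex);
    if (rv != 0) {
        fprintf(stderr, "pthread_mutex_lock returned %d at %s:%d\n",
                rv, file, line);
    }
    return rv;
}
```

A call site would pass __FILE__ and __LINE__ so the message points at the same place the existing assertion reports.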
Comment 17•24 years ago (Assignee)
this is also happening on cement, IRIX 6.5, on the ports page.
Comment 18•24 years ago (Assignee)
mkaply: good idea, I just did that on lespaul build,
on the main tbox page. Currently hidden:
http://tinderbox.mozilla.org/showbuilds.cgi?tree=SeaMonkey&noignore=1#status
Comment 19•24 years ago (Assignee)
pthread_mutex_lock() is returning 22, recent lespaul log.
Comment 20•24 years ago
22 is EINVAL, which means the mutex has not been properly initialized (according
to the man page).
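For reading such numbers out of a log, a small sketch of how a return code could be turned into a symbolic name plus the system's description. describe_rv is a hypothetical helper, and the numeric values of errno constants are platform-defined, so the 22 seen on lespaul is only assumed to be that machine's EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "NAME: description" for a pthread return
 * code into buf. Only a few codes commonly returned by mutex calls are
 * named; anything else is labelled "unknown". */
static void describe_rv(int rv, char *buf, size_t len)
{
    const char *name;

    assert(buf != NULL && len > 0);
    switch (rv) {
    case 0:       name = "OK";      break;
    case EINVAL:  name = "EINVAL";  break;
    case EBUSY:   name = "EBUSY";   break;
    case EDEADLK: name = "EDEADLK"; break;
    case EPERM:   name = "EPERM";   break;
    default:      name = "unknown"; break;
    }
    /* strerror supplies the platform's own text for the code. */
    snprintf(buf, len, "%s: %s", name, strerror(rv));
}
```

Using the symbolic constants in the switch, rather than literal numbers, keeps the mapping correct on platforms that number errno values differently.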
Comment 21•24 years ago
Has anyone looked at the core file on the IRIX machine (cement)? This might
give us a clue about why it is crashing on shutdown (although not the assert
problem).
Comment 22•24 years ago
The last line of the full log on cement is "killing plugin host". Has this
something to do with plugins? Is there a plugin in the page or recent plugin
checkins? I didn't check anything in.
Comment 23•24 years ago
This may be a build problem. That it is intermittent is troubling; if it is
what we have seen before, it would be solid.
-lpthread should always go ahead of -lc (or -lg++) when linking applications.
Comment 24•24 years ago
I don't really know that much about this stuff, but maybe my lack of knowledge
can be helpful :)
Are only machines that do bloat statistics having this problem?
I find it interesting that shut down happens, then bloat statistics, then the
assert.
Can we turn off bloat statistics on one of the machines and see if that
affects things?
Comment 25•24 years ago (Assignee)
I have seen this happen on the alivetest, which aborted
the bloattest.
Comment 26•24 years ago
larryh--
I'm not sure if this is relevant, but it appears that when we landed NSPR
autoconf, we no longer link libnspr4 with -lc. Could that be causing a problem?
Comment 27•24 years ago
I don't see this anymore, has it been fixed, or have these tinderboxen been
really taken offline?
Comment 28•24 years ago
coffee is still orange with the assertion sometimes. (That machine still has
rods' config.mk changes backed out locally - that should probably be fixed and
then clobbered at some point)
Comment 29•24 years ago
downgrading from blocker to major after discussion about Mozilla 0.8.1 bugs in
Performance meeting. I am not at all sure what we can do about this for 0.8.1
Severity: blocker → major
Comment 30•24 years ago
invalid now? no more chronic orange, me observes...
Updated•24 years ago (Assignee)
Assignee: attinasi → mcafee
Status: ASSIGNED → NEW
Comment 31•24 years ago (Assignee)
coffee still had orange, I've taken it offline for other testing.
Please leave this open for a while, I'll assume ownership.
Comment 32•24 years ago (Assignee)
the orange has gone away, sigh. wfm.
no QA verification needed.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → WORKSFORME
Updated•20 years ago
Product: Browser → Seamonkey