Closed
Bug 620115
Opened 14 years ago
Closed 14 years ago
No SeaMonkey nightly builds generated any more due to sqlite/python threading issue.
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: rsx11m.pub, Assigned: dustin)
Attachments
(3 files, 1 obsolete file)
4.43 KB, patch (catlee: review+)
4.77 KB, patch
526 bytes, patch
The last regular nightly branch builds were 20101215 for Mac and Linux, and the last one for Windows was 20101216. This seems to be unrelated to the buildbot 0.8 upgrade time frame, but coincides with Thunderbird switching off their 3.0.x tinderboxes on December 15.
There was a push on mozilla-1.9.1, triggering the build/test boxes, but there are no corresponding 20101218 nightly builds and no indication that they were actually started.
Comment 1•14 years ago
It coincides with turning on the new nightly build code for SeaMonkey, which uses the same build ID and revision for nightlies of all platforms. I'm not sure yet why we don't build any nightlies at all right now with this, but it may be connected with the fact that we don't have succeeding normal builds on all platforms right now.
This caught my attention because there was no log for any nightly at all. If a build didn't succeed, I wouldn't expect a nightly, but I would still expect a build log. :-\
Comment 3•14 years ago
Hrm, the problem actually seems to lie deeper (we now seem to see it on trunk as well) and go into threading problems with SQLite:
2010-12-19 00:00:01-0800 [-] lastGoodRev: took 0.62 seconds to run; returned None
2010-12-19 00:00:01-0800 [-] Unhandled error in Deferred:
2010-12-19 00:00:01-0800 [-] Unhandled Error
Traceback (most recent call last):
File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/Twisted-9.0.0-py2.6-linux-i686.egg/twisted/enterprise/adbapi.py", line 429, in _runInteraction
result = interaction(trans, *args, **kw)
File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/buildbot-0.8.2_hg_161d0a80925c_default-py2.6.egg/buildbot/schedulers/timed.py", line 249, in _check_timer
self._maybe_start_build(t)
File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/buildbot-0.8.2_hg_161d0a80925c_default-py2.6.egg/buildbot/schedulers/timed.py", line 277, in _maybe_start_build
self.start_HEAD_build(t)
File "/tools/buildbotcustom/buildbotcustom/scheduler.py", line 65, in start_HEAD_build
d = defer.maybeDeferred(self.ssFunc, self, t)
--- <exception caught here> ---
File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/Twisted-9.0.0-py2.6-linux-i686.egg/twisted/internet/defer.py", line 102, in maybeDeferred
result = f(*args, **kw)
File "/tools/buildbotcustom/buildbotcustom/misc_scheduler.py", line 249, in ssFunc
rev = lastChangeset(db, branch)
File "/tools/buildbotcustom/buildbotcustom/misc_scheduler.py", line 111, in lastChangeset
for c in db.changeEventGenerator(branches=[branch]):
File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/buildbot-0.8.2_hg_161d0a80925c_default-py2.6.egg/buildbot/db/connector.py", line 364, in changeEventGenerator
rows = self.runQueryNow(q, tuple(args))
File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/buildbot-0.8.2_hg_161d0a80925c_default-py2.6.egg/buildbot/db/connector.py", line 182, in runQueryNow
return self.runInteractionNow(self._runQuery, *args, **kwargs)
File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/buildbot-0.8.2_hg_161d0a80925c_default-py2.6.egg/buildbot/db/connector.py", line 212, in runInteractionNow
return self._runInteractionNow(interaction, *args, **kwargs)
File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/buildbot-0.8.2_hg_161d0a80925c_default-py2.6.egg/buildbot/db/connector.py", line 235, in _runInteractionNow
c = conn.cursor()
File "/tools/buildbot-0.8.0/lib/python2.6/site-packages/buildbot-0.8.2_hg_161d0a80925c_default-py2.6.egg/buildbot/db/dbspec.py", line 131, in cursor
return RetryingCursor(self.dbapi, self.conn.cursor())
sqlite3.ProgrammingError: SQLite objects created in a thread can only be used in that same thread.The object was created in thread id -1208527168 and this is thread id -1212355696
Comment 4•14 years ago
I'm a bit at a loss here, as I have no clue how to fix this, nor how to switch us to a different DB than sqlite to perhaps work around it.
Chris or Ben, can you help me with either of those two sides here?
Updated•14 years ago
Severity: major → blocker
Summary: No SeaMonkey 2.0.12pre nightly builds since December 15/16 → No SeaMonkey nightly builds generated any more due to sqlite/python threading issue.
Version: SeaMonkey 2.0 Branch → unspecified
| Assignee | ||
Comment 6•14 years ago
It looks like something is going wrong in how the custom schedulers are written. There's a lot of trickiness there to make sure that appropriate transactions run in appropriate threads. In particular, start_HEAD_build is called in a thread with a transaction object, so using maybeDeferred there is a mistake. I think the error here is calling db.changeEventGenerator, which assumes it's run in the main thread and thus uses the main thread's sqlite connection.
We haven't seen this in RelEng production because MySQL does not have this one-object-per-thread limitation.
I'll see if I can work up a patch to the custom nightly stuff.
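For readers unfamiliar with the limitation, here is a minimal standalone sketch of the behaviour described above. It is not buildbot code: it uses only Python's standard sqlite3 and threading modules, and the function name use_conn_in_worker is made up for illustration. A connection created in the main thread is touched from a worker thread, which by default raises the same kind of ProgrammingError shown in the traceback in comment 3.

```python
import sqlite3
import threading

# Connection created in the main thread (default: check_same_thread=True).
conn = sqlite3.connect(":memory:")
errors = []

def use_conn_in_worker():
    # Hypothetical worker: touching the main thread's connection here
    # is what sqlite3 refuses to allow.
    try:
        conn.cursor()
    except sqlite3.ProgrammingError as exc:
        errors.append(exc)

worker = threading.Thread(target=use_conn_in_worker)
worker.start()
worker.join()

for exc in errors:
    print("worker thread got:", exc)
```

Passing check_same_thread=False to sqlite3.connect switches this check off, but the caller then has to serialize access to the connection itself, so it is not an automatic fix.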
Assignee: nobody → dustin
| Assignee | ||
Comment 7•14 years ago
This patch attempts to thread the 't' parameter (the transaction) through all of the methods that start_HEAD_build invokes. The sticking point was changeEventGenerator (which is going to die in 0.8.4, btw), so I just copied that function into buildbotcustom and modified it.
I don't have tuxedo on my system to be able to test this, and in general I'm too new to this stuff to have a quick way to stage changes like this, so I've flagged it for feedback rather than review. I can try (later) to stage it in the releng environment, but since things work in our environment *without* this patch, I'm not sure that would be conclusive. Let me know what you think.
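As a rough illustration of what "threading the 't' parameter through" means, the sketch below passes an already-open cursor down the call chain instead of letting a helper reach for a connection owned by another thread. This is an assumption-laden toy, not the attached patch: the table layout and the names last_changeset_in_transaction and ss_func are hypothetical, and a plain sqlite3 cursor stands in for buildbot's transaction object.

```python
import sqlite3

def last_changeset_in_transaction(t, branch):
    # Hypothetical helper: it only uses the cursor `t` it was handed,
    # so it stays on whatever connection the caller's thread owns.
    t.execute(
        "SELECT revision FROM changes WHERE branch = ? "
        "ORDER BY changeid DESC LIMIT 1",
        (branch,),
    )
    row = t.fetchone()
    return row[0] if row else None

def ss_func(t, branch):
    # Passes `t` along rather than looking up a global connection.
    return last_changeset_in_transaction(t, branch)

# Toy data; the schema is made up for this example.
conn = sqlite3.connect(":memory:")
t = conn.cursor()
t.execute(
    "CREATE TABLE changes ("
    "changeid INTEGER PRIMARY KEY, branch TEXT, revision TEXT)"
)
t.executemany(
    "INSERT INTO changes (branch, revision) VALUES (?, ?)",
    [("comm-central", "aaa111"), ("comm-1.9.1", "bbb222"),
     ("comm-central", "ccc333")],
)

latest = ss_func(t, "comm-central")
missing = ss_func(t, "no-such-branch")
```

The point of the design is that every function in the chain is explicit about which transaction it runs in, so none of them can accidentally fall back to the main thread's connection.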
Attachment #499042 -
Flags: feedback?
Comment 8•14 years ago
Dustin, thanks a lot for looking into this. The patch has been locally applied to the SeaMonkey buildmaster, checkconfig liked it, let's see what the next nightly cycle tells us.
Severity: blocker → major
Version: unspecified → SeaMonkey 2.0 Branch
Comment 9•14 years ago
Dustin, it looks like there's still a problem in there:
2010-12-22 00:00:53-0800 [-] lastGoodRev: took 0.04 seconds to run; returned None
2010-12-22 00:00:53-0800 [-] Unhandled Error
Traceback (most recent call last):
File "/tools/python-2.6.5/lib/python2.6/threading.py", line 504, in __bootstrap
self.__bootstrap_inner()
File "/tools/python-2.6.5/lib/python2.6/threading.py", line 532, in __bootstrap_inner
self.run()
File "/tools/python-2.6.5/lib/python2.6/threading.py", line 484, in run
self.__target(*self.__args, **self.__kwargs)
--- <exception caught here> ---
--
File "/tools/buildbotcustom/buildbotcustom/scheduler.py", line 69, in start_HEAD_build
if ss is None:
File "/tools/buildbotcustom/buildbotcustom/misc_scheduler.py", line 277, in ssFunc
File "/tools/buildbotcustom/buildbotcustom/misc_scheduler.py", line 138, in lastChangeset
sourcestamps.revision IS NOT NULL AND
exceptions.NameError: global name 'changeEventGeneratorWithTransaction' is not defined
Looks to me like just a typo, as you actually named it changeEventGeneratorInTransaction - am I right in that should fix this?
Comment 10•14 years ago
(In reply to comment #9)
> Looks to me like just a typo, as you actually named it
> changeEventGeneratorInTransaction - am I right in that should fix this?
I guess it somehow kept retrying, because I saw that error repeated, and when I fixed that line and did a reconfig, it apparently triggered nightlies correctly.
| Assignee | ||
Comment 11•14 years ago
I figured there was a typo lurking there somewhere.
I'd like to get catlee's review, too, before committing this, since he wrote the original implementation. Do you mind letting this "bake" until he's back on Jan 4?
Attachment #499042 -
Attachment is obsolete: true
Attachment #499293 -
Flags: superreview?
Attachment #499293 -
Flags: review?(catlee)
Attachment #499042 -
Flags: feedback?
| Assignee | ||
Comment 12•14 years ago
Comment on attachment 499293 [details] [diff] [review]
m620115-buildbotcustom-r2.patch [applied SM]
Ben, if you have time and can take a look before catlee's back, that would be great, too.
Attachment #499293 -
Flags: superreview? → review?(bhearsum)
Comment 13•14 years ago
(In reply to comment #8)
> Dustin, thanks a lot for looking into this. The patch has been locally applied
> to the SeaMonkey buildmaster, checkconfig liked it, let's see what the next
> nightly cycle tells us.
Nightly builds have appeared today for all platforms in both the "latest-comm-central-trunk" (SeaMonkey 2.1b2pre) and "latest-comm-1.9.1" (SeaMonkey 2.0.12pre) ftp directories.
Comment 14•14 years ago
oh, and also in the corresponding l10n directories, for all languages except comm-1.9.1 Polish Linux i686 and comm-central Win32. These exceptions don't affect the langpack.xpi addons, which were generated on some other platform.
Comment 15•14 years ago
(In reply to comment #11)
> Do you mind letting this "bake" until he's back
> on Jan 4?
Sure, no problem with that, I'm happy we have that patch running there - again, thanks a ton for that!
Comment 16•14 years ago
Comment on attachment 499293 [details] [diff] [review]
m620115-buildbotcustom-r2.patch [applied SM]
>diff --git a/scheduler.py b/scheduler.py
>--- a/scheduler.py
>+++ b/scheduler.py
>@@ -58,23 +58,25 @@ class SpecificNightly(Nightly):
> Nightly.__init__(self, *args, **kwargs)
>
> def start_HEAD_build(self, t):
>- """Slightly mis-named, but this function is called when it's time to start a build. We call our ssFunc to get a sourcestamp to build.
>+ """
>+ Slightly mis-named, but this function is called when it's time to start
>+ a build. We call our ssFunc to get a sourcestamp to build.
>
>- ssFunc is called in a thread with an active database transaction running.
>+ ssFunc is called in a thread with an active database transaction
>+ running. It cannot use Deferreds, nor any db.*Now methods.
> """
>+ #### NOTE: called in a thread!
>+ ss = self.ssFunc(self, t)
> d = defer.maybeDeferred(self.ssFunc, self, t)
I think you want to delete this line now, right? Looks good otherwise.
Attachment #499293 -
Flags: review?(catlee) → review+
| Assignee | ||
Comment 17•14 years ago
In particular, you mean this line, right?
> d = defer.maybeDeferred(self.ssFunc, self, t)
Comment 18•14 years ago
(In reply to comment #17)
> In particular, you mean this line, right?
>
> > d = defer.maybeDeferred(self.ssFunc, self, t)
yes
| Assignee | ||
Comment 19•14 years ago
landed in build/buildbotcustom branch 'default': 6d5e6ca97a95
Attachment #499293 -
Attachment is obsolete: true
Attachment #499293 -
Flags: review?(bhearsum)
| Assignee | ||
Comment 20•14 years ago
moving this into the firefox releng product so we can flag it for deployment
Product: SeaMonkey → mozilla.org
Version: SeaMonkey 2.0 Branch → other
| Assignee | ||
Comment 21•14 years ago
Needs a buildmaster restart. This isn't blocking anything, so this can happen during the next otherwise-scheduled downtime.
Flags: needs-treeclosure?
| Assignee | ||
Comment 22•14 years ago
Well, this got landed as a ridealong with other changes, but unfortunately:
    q += " ORDER BY changeid DESC"
    rows = t.execute(q, tuple(args))
    for (changeid,) in rows:
        yield dbconn._txn_getChangeNumberedNow(t, changeid)
t.execute does not return a list of rows.
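A portable way to write such a loop, sketched here under stated assumptions, is to ignore the return value of execute (PEP 249 leaves it undefined) and fetch the rows from the cursor explicitly. The snippet uses the standard sqlite3 module with a made-up table, and the helper name change_ids is hypothetical; it is not the follow-up patch from this bug.

```python
import sqlite3

def change_ids(t, branch):
    # Hypothetical helper: run the query, then ask the cursor for rows
    # instead of iterating over whatever execute() happened to return.
    t.execute(
        "SELECT changeid FROM changes WHERE branch = ? "
        "ORDER BY changeid DESC",
        (branch,),
    )
    return [changeid for (changeid,) in t.fetchall()]

# Toy data; the schema is made up for this example.
conn = sqlite3.connect(":memory:")
t = conn.cursor()
t.execute("CREATE TABLE changes (changeid INTEGER PRIMARY KEY, branch TEXT)")
t.executemany(
    "INSERT INTO changes (branch) VALUES (?)",
    [("comm-central",), ("comm-1.9.1",), ("comm-central",)],
)

ids = change_ids(t, "comm-central")
```

With the three rows inserted above, the matching change ids come back newest first.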
Flags: needs-treeclosure?
Comment 24•14 years ago
Comment on attachment 502402 [details] [diff] [review]
m620115b-buildbotcustom-r1.patch
Sorry, you need a catlee-like object for this review.
Attachment #502402 -
Flags: review?(nrthomas)
| Assignee | ||
Comment 25•14 years ago
apparently the world of staging masters has moved on without me since I last staged a change, as I can't figure out how to run this in staging. I'm getting
Traceback (most recent call last):
File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-i686.egg/twisted/scripts/_twistd_unix.py", line 317, in startApplication
app.startApplication(application, not self.config['no_save'])
File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-i686.egg/twisted/application/app.py", line 648, in startApplication
service.IService(application).startService()
File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-i686.egg/twisted/application/service.py", line 278, in startService
service.startService()
File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/buildbot-0.8.2_hg_a63f22816750_production_0.8-py2.6.egg/buildbot/master.py", line 567, in startService
self.loadTheConfigFile()
--- <exception caught here> ---
File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/buildbot-0.8.2_hg_a63f22816750_production_0.8-py2.6.egg/buildbot/master.py", line 600, in loadTheConfigFile
d = self.loadConfig(f)
File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/buildbot-0.8.2_hg_a63f22816750_production_0.8-py2.6.egg/buildbot/master.py", line 624, in loadConfig
exec f in localDict
File "/builds/buildbot/builder_master1/master.cfg", line 46, in <module>
branchObjects = generateBranchObjects(BRANCHES[branch], branch)
File "/tools/buildbotcustom/buildbotcustom/misc.py", line 994, in generateBranchObjects
**extra_args
File "/tools/buildbotcustom/buildbotcustom/process/factory.py", line 1771, in __init__
MercurialBuildFactory.__init__(self, **kwargs)
File "/tools/buildbotcustom/buildbotcustom/process/factory.py", line 621, in __init__
assert self.platform in getSupportedPlatforms()
exceptions.AssertionError:
from sm01, which as far as I can tell has nothing to do with my change.
| Assignee | ||
Comment 26•14 years ago
Comment on attachment 502402 [details] [diff] [review]
m620115b-buildbotcustom-r1.patch
Ignore this patch - re-uploaded as bug 624298 attachment 502484 [details] [diff] [review]. Would that I could mark it obsolete without uploading a new patch..
| Assignee | ||
Updated•14 years ago
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → INCOMPLETE
| Reporter | ||
Comment 27•14 years ago
This should be resolved either as FIXED by another bug, or as WFM if the reason for filing it disappeared, but not as INCOMPLETE, since the bug report was in fact complete and acknowledged as filed.
| Assignee | ||
Updated•14 years ago
Resolution: INCOMPLETE → FIXED
| Reporter | ||
Comment 28•14 years ago
Thanks, from SeaMonkey's point of view this was apparently already fixed per comment #12 after applying the initial patch to its local build system.
| Assignee | ||
Comment 29•14 years ago
I think it works because this particular, incorrect syntax happens to work on sqlite.
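One plausible reading of this remark, shown as a small sketch with Python's standard sqlite3 module and a made-up table: sqlite3's Cursor.execute returns the cursor itself, and a cursor is iterable, so looping over the result of execute yields rows. Other DB-API drivers are free to return something else, since PEP 249 does not define the return value, so the same loop cannot be relied on elsewhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
t = conn.cursor()
t.execute("CREATE TABLE changes (changeid INTEGER PRIMARY KEY)")
t.executemany("INSERT INTO changes (changeid) VALUES (?)", [(1,), (2,)])

# In sqlite3, execute() hands back the same cursor object...
rows = t.execute("SELECT changeid FROM changes ORDER BY changeid DESC")
returns_cursor_itself = rows is t

# ...and iterating a cursor yields the result rows.
ids = [changeid for (changeid,) in rows]
```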
| Reporter | ||
Comment 30•14 years ago
Comment on attachment 499293 [details] [diff] [review]
m620115-buildbotcustom-r2.patch [applied SM]
Let's unobsolete this patch then as it appears to be the only one which was actually checked-in somewhere (though just locally for SeaMonkey). Otherwise it's a bit unclear what was done in this bug.
Attachment #499293 -
Attachment description: m620115-buildbotcustom-r2.patch → m620115-buildbotcustom-r2.patch [applied SM]
Attachment #499293 -
Attachment is obsolete: false
Comment 31•14 years ago
FYI, since today (i.e. starting from tomorrow's nightlies), the SeaMonkey buildmaster has been updated to buildbotcustom default branch without changes, so now runs what has been checked into the official repo.
Comment 32•14 years ago
(In reply to comment #31)
> FYI, since today (i.e. starting from tomorrow's nightlies), the SeaMonkey
> buildmaster has been updated to buildbotcustom default branch without changes,
> so now runs what has been checked into the official repo.
...and the SeaMonkey nightlies are still appearing dutifully on schedule, on both comm-central and comm-1.9.1. I think baking for a week is long enough => VERIFIED on behalf of SeaMonkey QA.
Status: RESOLVED → VERIFIED
Updated•12 years ago
Product: mozilla.org → Release Engineering