Closed Bug 448727 Opened 12 years ago Closed 11 years ago

need shark builds for 3.1 (trunk)

Categories

(Release Engineering :: General, defect, P2, critical)

x86
macOS
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: vlad, Assigned: nthomas)

Attachments

(2 files, 4 obsolete files)

Currently, http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/experimental/shark/ only has 3.0 and 3.0.1 -- we probably don't need 3.0 any more, but we do need 3.1/trunk there.
Component: Release Engineering → Release Engineering: Future
Priority: -- → P3
Okay, what gives.  "Future" doesn't cut it -- it should not be hard to just change the config to point to 3.1 for this.

We have massive responsiveness problems currently on OSX, and it also happens to be the one platform where we can give people an easy set of steps to follow to tell us exactly what's going on when firefox is beachballing.  We can't do that without giving them a useful build with symbols, which is what we used to have here.
Severity: normal → critical
Component: Release Engineering: Future → Release Engineering
Ben, can you help us understand what the constraints are here? As we get into the endgame our eye turns - as always - to performance, and Shark builds are a great boon to that.
First of all, it is not necessarily a simple configuration change. We cannot simply bump the version number of the old system, because it is running on Tinderbox, which cannot do Mercurial based builds. We haven't attempted a build of this sort on the new system - so I simply don't know how much work it is.

I can't speak to other people's workloads and priorities, but I know I've been busy with other things (release automation, 3.1b1, 3.1b2, branching) and therefore haven't had the time to pick this bug up.
(In reply to comment #0)
> Currently,
> http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/experimental/shark/ only
> has 3.0 and 3.0.1 -- we probably don't need 3.0 any more, but we do need
> 3.1/trunk there.
1) Those two shark builds were done by rhelmer in June 2008. I'll find out what was involved in doing this and which machines he used. As there are only two builds, he may have done this manually on his own machine, but I will confirm. At this time we don't have any details.


2) If we change our pool of mac slaves so they can do shark builds in addition to "normal" mac builds, then we can automatically have shark builds on all branches (mozilla-1.9.1, m-c, tracemonkey). This approach seems better than doing a series of "quick hacks" now for mozilla-191, followed by another quick hack for m-c, then...

It's a question of priorities: this compared to other work already scheduled for the closing weeks of Q4.

In the meantime, is it possible for someone in development to manually do a shark build, so you can at least start looking at performance numbers right away?
(In reply to comment #4)
> (In reply to comment #0)
> > Currently,
> > http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/experimental/shark/ only
> > has 3.0 and 3.0.1 -- we probably don't need 3.0 any more, but we do need
> > 3.1/trunk there.
> 1) Those two shark builds were done by rhelmer in june2008. I'll find out what
> was involved to do this, and what machines he used for this. As there are only
> two builds, he may have done this manually on his own machine, but I will
> confirm. However, at this time we dont have any details.
Hey, instead of emailing Rob, why don't I just cc him!? He's got a bugzilla account after all! :-0

Rob, do you remember any details on these old FF3.0 shark builds from June 2008?
nthomas found the following during our 1x1 this evening:

http://mxr.mozilla.org/seamonkey/source/tools/buildbot-configs/automation/production-1.9/master.cfg#180
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/tools/tinderbox-configs/firefox/macosx/mozconfig&rev=1.16.2.2

Good find, Nick. I stand corrected: it seems we *used* to have automated shark builds on FF3.0. I don't know when or why they were disabled, but I'd guess around 18 June 2008 from the builds on ftp.m.o. I don't know yet what is involved in bringing this up from "FF3.0 on cvs on tinderbox/buildbot" to "m-c on hg on buildbot".

From offline discussions with Vlad and Shaver, priority-wise this rates below everything on the goals list.
Yes this was done nightly through buildbot, I don't sully my own machines with building :)

This was originally set up in bug 412780 FWIW.
You've probably figured this out by now from reading the previous bug, but the reason there are only two builds is because these builds are relatively huge and we figured expedience was better than keeping an archive of dubious value.

The nightly build was in fact published to the same location, overwriting the previous.

The work needed before to do this was more or less:

1) some --enable flags in mozconfig, see attachment 299868
2) PKG_SKIP_STRIP=1 during "make package"
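
As a rough illustration of step 2, the packaging call could be wrapped as below. This is only a sketch: the thread does not say whether PKG_SKIP_STRIP was passed through the environment or on the make command line (the environment is assumed here), and the helper name and the "obj-shark" objdir are made up for the example.

```python
import os


def shark_package_command(objdir, base_env=None):
    """Build the argv and environment for a 'make package' run that skips stripping."""
    # Copy the environment so the caller's mapping is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    # PKG_SKIP_STRIP=1 is the setting named in step 2 (assumed to be read from the environment).
    env["PKG_SKIP_STRIP"] = "1"
    argv = ["make", "-C", objdir, "package"]
    return argv, env


# "obj-shark" is a hypothetical objdir name used only for illustration.
argv, env = shark_package_command("obj-shark", base_env={"PATH": "/usr/bin"})
print(" ".join(argv))
```

On a machine with a configured build, the pair could then be handed to subprocess.check_call(argv, env=env).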

Simplest way at the time seemed to be to have a special-purpose builder, although it did share the production Mac slave (only ran nightly, and on a different schedule).

It'd be nice to make setting up these kinds of experimental builds more of a no-brainer. We discussed something like that a while ago, but priority-wise it comes down to a dev need versus a release need. E.g. couldn't it be as easy as instantiating a new custom BuildFactory, and just overriding the mozconfig, "make package" and publishing location?

You'd probably want to skip the whole update-packaging bit too (or not!). 

I think this is similar to the idea of being able to quickly bring up a builder to follow a new hg repo, perhaps automatically, that we discussed a while ago. I don't want to get too offtopic for this bug though so if anyone is interested in discussing further let's do.

For the purposes of this bug I'd suggest doing something like I did before (new builder that shares the nightly buildslave), and if you want to get all forward-looking try to make it easier to spin up experimental builds in the future via a BuildFactory or something like that.
Component: Release Engineering → Release Engineering: Future
Rather than just moving this to Future, can you guys give everyone a timeline in which this *will* happen? I think that's where most of the issues came from.
Per discussion with shaver, this "shark build" request is lower priority than our existing Q4 goals, hence moving to Future. Happy to do this as a Q1 goal. We'll start this work early Q1, but I reserve comment on how long it will take until we investigate the cvs-hg, tinderbox-buildbot and disk space issues discussed below.

In the meantime, Shaver will do some shark builds using tryserver. Fingers crossed that will suffice until we get this going in production.
(In reply to comment #10)
> Per discussion with shaver, this "shark build" request is lower priority then
> our existing Q4 goals, hence moving to Future. Happy to do this as a Q1 goal.
> We'll start this work early Q1, but I reserve comment on how long it will take
> until we investigate the cvs-hg, tinderbox-buildbot and diskspace issues
> discussed below.  

Can we get an update here?  I have bad news about the try server route...
 
> In the meanwhile, Shaver will do some shark builds using tryserver. Fingers
> crossed that will suffice until we get this going in production.

...but you can't easily do working shark builds with the tryserver web interface, because there's no way to suppress the symbol stripping via mozconfig.
Please try this build to see if it has Shark built in correctly
 http://people.mozilla.org/~nthomas/misc/firefox-3.1b3pre.en-US.mac-shark.dmg

It's much smaller than I was expecting given bug 412780 comment 19, only 23MB (vs 16MB for normal opt, yes I did set PKG_SKIP_STRIP=1 for make package). mozconfig was
  . $topsrcdir/build/macosx/universal/mozconfig

  ac_add_options --enable-application=browser
  ac_add_options --disable-tests
  ac_add_options --enable-codesighs

  export CFLAGS="-gdwarf-2"
  export CXXFLAGS="-gdwarf-2"

  # Needed to enable breakpad in application.ini
  export MOZILLA_OFFICIAL=1

  # Enable parallel compiling
  mk_add_options MOZ_MAKE_FLAGS="-j4"

  # shark specific options
  ac_add_options --enable-libxul
  ac_add_options --enable-shark
  ac_add_options --enable-debugger-info-modules
which is the nightly config with update stuff removed, and these last 4 lines added.
Shark likes that build!

	0.0%	8.5%	XUL	                      nsLayoutUtils::PaintFrame(nsIRenderingContext*, nsIFrame*, nsRegion const&, unsigned int)	
	0.3%	8.1%	XUL	                       nsDisplayList::Paint(nsDisplayListBuilder*, nsIRenderingContext*, nsRect const&) const	
	0.0%	7.5%	XUL	                        nsDisplayClip::Paint(nsDisplayListBuilder*, nsIRenderingContext*, nsRect const&)	
	0.0%	7.5%	XUL	                         nsDisplayList::Paint(nsDisplayListBuilder*, nsIRenderingContext*, nsRect const&) const	
	0.0%	2.7%	XUL	                          nsDisplayClip::Paint(nsDisplayListBuilder*, nsIRenderingContext*, nsRect const&)	
	0.0%	2.7%	XUL	                           nsDisplayList::Paint(nsDisplayListBuilder*, nsIRenderingContext*, nsRect const&) const	
	0.0%	1.7%	XUL	                            nsDisplayBorder::Paint(nsDisplayListBuilder*, nsIRenderingContext*, nsRect const&)	
	0.0%	1.7%	XUL	                             nsCSSRendering::PaintBorder(nsPresContext*, nsIRenderingContext&, nsIFrame*, nsRect const&, nsRect const&, nsStyleBorder const&, nsStyleContext*, int)	
	0.0%	1.7%	XUL	                              nsCSSBorderRenderer::DrawBorders()	
	0.0%	1.0%	XUL	                               nsCSSBorderRenderer::DrawBorderSides(int)	

etc.

Yay!

(Word on the street is that
 ac_add_options --disable-install-strip
will work too!)
Awesome. I saw some talk on IRC about line numbers. Are we good to go as is, or does it need more work?
Assignee: nobody → nthomas
Status: NEW → ASSIGNED
Component: Release Engineering: Future → Release Engineering
Priority: P3 → P2
This mostly uses MercurialBuildFactory; it just overrides doUpload to avoid getting a 2nd copy of the package in tinderbox-builds, and makes sure the package has a unique name.
Attachment #363612 - Flags: review?(bhearsum)
Attached patch Staging buildbot-configs (obsolete)
* enables shark builds for m-c, m-1.9.1, & tracemonkey
* adds the mozconfig (nightly less codesighs, update lines, plus shark stuff)
* changes to master-main.cfg to set up the build using the factory

This stuff works OK on a separate master; I would verify on full staging before proceeding further.
Attachment #363613 - Flags: review?(bhearsum)
Attached patch Production buildbot-configs (obsolete)
Almost the same as the staging configs (separated for clarity reasons), with some whitespace changes to sync mozilla2/master.cfg and mozilla-staging/master-main.cfg.
Attachment #363614 - Flags: review?(bhearsum)
Attachment #363612 - Attachment is obsolete: true
Attachment #363612 - Flags: review?(bhearsum)
Comment on attachment 363612
Add SharkBuildFactory to buildbotcustom

>+        self.env.update({'MOZ_PKG_SPECIAL': 'shark'})

This ends up in the environment for other mac builds, not so good. Testing a fix.
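
One plausible way for a setting like this to leak, sketched here purely as an assumption since the thread does not spell out the cause, is that several factories hold a reference to the same env dict, so an in-place update() is visible to all of them. The classes below are hypothetical stand-ins, not the real buildbotcustom factories, and the copy-based variant is a generic remedy rather than the fix that was actually tested.

```python
class FactorySketch:
    """Hypothetical stand-in for a build factory that stores its env."""

    def __init__(self, env):
        self.env = env  # keeps a reference to the caller's dict, no copy


class SharedEnvSharkFactory(FactorySketch):
    def __init__(self, env):
        super().__init__(env)
        # Mutates the dict that other factories may also be holding.
        self.env.update({'MOZ_PKG_SPECIAL': 'shark'})


class CopiedEnvSharkFactory(FactorySketch):
    def __init__(self, env):
        # Work on a private copy so the update stays local to this factory.
        super().__init__(dict(env))
        self.env.update({'MOZ_PKG_SPECIAL': 'shark'})


# Shared dict: the normal mac factory sees the shark-only setting.
mac_env = {'MOZ_OBJDIR': 'obj-firefox'}
normal = FactorySketch(mac_env)
SharedEnvSharkFactory(mac_env)
leaked = 'MOZ_PKG_SPECIAL' in normal.env

# Copied dict: the normal mac factory is unaffected.
mac_env2 = {'MOZ_OBJDIR': 'obj-firefox'}
normal2 = FactorySketch(mac_env2)
shark2 = CopiedEnvSharkFactory(mac_env2)
isolated = 'MOZ_PKG_SPECIAL' not in normal2.env

print(leaked, isolated)
```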
Can't say this is pretty, but it doesn't screw up non-shark builds.
Attachment #363655 - Flags: review?(bhearsum)
Attachment #363655 - Flags: review?(bhearsum) → review+
Comment on attachment 363655
Add SharkBuildFactory to buildbotcustom, v2

Yeah, not ideal, but this is a pretty minimal amount of code duplication.
You could also change this to do the RepackFactory thing and add self.postUploadCmd in __init__. Totally up to you. r=bhearsum if you'd rather keep it how it is though.
Comment on attachment 363613
Staging buildbot-configs

>+                 nightly=True,

AFAICT this is a dep build - it appears like it will be triggered on-change and periodically. Is there a reason we need to flip this bit? You're overriding doUpload, so the only other reason I can see is because it clobbers. I dunno if we want to do a clobber build on two platforms for every build, though.

>+                 createSnippet=False,
>+                 buildSpace=buildSpace,
>+                 clobberURL=BASE_CLOBBER_URL,
>+                 clobberTime=clobberTime,
>+                 buildsBeforeReboot=pf['builds_before_reboot']

It kind of sucks that we can't do this as just another platform (eg, 'macosx-shark') - I don't particularly like all the special casing. We'd have to move the mozconfig definition and BuildFactory type out to that, though. Could get icky, eg:

mozilla2_shark_factory = pf['build_factory'](
...
)

That route would end up with a lot of duplication in config.py, I guess, so it's not great either way.

Leaving open because of the first paragraph.
Comment on attachment 363614
Production buildbot-configs

After re-reading this (and Nick pointing it out) I realized that this is indeed a nightly builder, so nightly=True makes sense.

We talked on IRC a bit and decided it would be best to get rid of the doUpload override, to avoid duplicating most of that logic.

Nick, r=me with that change.
Attachment #363614 - Flags: review?(bhearsum) → review+
Comment on attachment 363613
Staging buildbot-configs

Same thing here.
Attachment #363613 - Flags: review?(bhearsum) → review+
It belatedly occurs to me that I could set MOZ_PKG_SPECIAL in the mozconfig, and do without a buildbotcustom change at all - testing that instead.
(In reply to comment #24)
This works ok, provided I export it rather than use mk_add_options. Alternative patch to follow.

Regarding removing doUpload, we'll get a copy in tinderbox-builds/ and in a nightly/YYYY/MM/YYYY-MM-DD-HH-branch directory. The first shouldn't confuse talos as the FtpPoller is only matching on "en-US.mac.dmg" (files here are en-US.mac-shark.dmg). Alice filed bug 458243 about a similar linux situation so it's clearly unhelpful. The nightly copy will chew up space, but 23MB/branch/day isn't much.
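
To make the filename point concrete, here is a small sketch of why a match on "en-US.mac.dmg" skips the shark package. It assumes the poller compares the end of the filename, which the thread does not confirm; the function name is made up, and the non-shark filename is an illustrative guess at what a regular nightly would be called.

```python
def is_talos_candidate(filename, suffix="en-US.mac.dmg"):
    """Return True if the filename ends with the suffix the poller is said to match."""
    return filename.endswith(suffix)


names = [
    "firefox-3.1b4pre.en-US.mac.dmg",        # illustrative regular nightly name
    "firefox-3.1b4pre.en-US.mac-shark.dmg",  # shark package name from this bug
]
matched = [name for name in names if is_talos_candidate(name)]
print(matched)
```

A plain substring test gives the same answer here, because "en-US.mac-shark.dmg" never contains "en-US.mac.dmg".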
Ben, how does this float your boat? It does it all in the buildbot-configs repo by exporting in the mozconfig and switching mozilla2_shark_factory to NightlyBuildFactory.
Looks fine to me
Comment on attachment 363855
buildbot-configs using NightlyBuildFactory

Ok, carrying over review from earlier patches. Staging part of this pushed
  http://hg.mozilla.org/build/buildbot-configs/rev/b2437ed6e866
Attachment #363855 - Flags: review+
Attachment #363613 - Attachment is obsolete: true
Attachment #363614 - Attachment is obsolete: true
Attachment #363655 - Attachment is obsolete: true
Comment on attachment 363855
buildbot-configs using NightlyBuildFactory

Pushed the production part of this
  http://hg.mozilla.org/build/buildbot-configs/rev/44c9349df30c
so all of this attachment is now landed.
Attachment #363855 - Flags: checked‑in+
Master reconfig'd to put this into use in production.

Shark builds will be triggered at 3:02am with all the other nightly builds, starting tomorrow. Example links to the result are
http://ftp.build.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-1.9.1/firefox-3.1b4pre.en-US.mac-shark.dmg
http://ftp.build.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-central/firefox-3.2a1pre.en-US.mac-shark.dmg
http://ftp.build.mozilla.org/pub/mozilla.org/firefox/nightly/latest-tracemonkey/firefox-3.2a1pre.en-US.mac-shark.dmg
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Shark builds are like nightlies and drop from the waterfall some 12 or so hours after running. This changes our waterfall monitoring script to treat them the same and not cause a nagios alert.
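
The idea can be sketched roughly as follows. This is not the actual check_tinderbox_status logic, which is not shown in this bug; the function, the builder names, and the marker strings are all hypothetical and only illustrate treating shark builders like nightlies when they are missing from the waterfall.

```python
def builders_needing_alert(expected, on_waterfall, ignored_markers=("nightly", "shark")):
    """Return expected builders that are missing from the waterfall and not ignorable."""
    missing = [name for name in expected if name not in on_waterfall]
    # Builders that only run once a day are expected to drop off, so skip them.
    return [
        name for name in missing
        if not any(marker in name for marker in ignored_markers)
    ]


# Hypothetical builder names, for illustration only.
expected = [
    "OS X mozilla-central build",
    "OS X mozilla-central nightly",
    "OS X mozilla-central shark",
]
alerts = builders_needing_alert(expected, on_waterfall=[])
print(alerts)
```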
Attachment #366677 - Flags: review?(aki)
Comment on attachment 366677
Ignore shark builds dropping from waterfall

r=aki for the check_tinderbox_status part.
Attachment #366677 - Flags: review?(aki) → review+
Comment on attachment 366677
Ignore shark builds dropping from waterfall

Checked in check_tinderbox_status part, apologies for the extra file sneaking in there.
Added the shark files to the monitoring for "freshness" of nightlies (all three repos).
Product: mozilla.org → Release Engineering