
Comment 4

•

13 years ago

(In reply to Gary Kwong [:gkw, :nth10sd] from comment #3)
> (In reply to Ehsan Akhgari [:ehsan] from comment #2)
> > Done. For future reference, you can click on Tree Info and then Adjust
> > Hidden Builders and do it yourself.
>
> I don't have a sheriff password.

Lies! ;-)

> Please also unhide "Linux x86-64 mozilla-central valgrind".

Done.

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Updated

•

13 years ago

Status: REOPENED → RESOLVED

Closed: 13 years ago → 13 years ago

Resolution: --- → FIXED

Reporter

Updated

•

13 years ago

Status: RESOLVED → VERIFIED

Phil Ringnalda (:philor)

Comment 5

•

13 years ago

Um. A job which is not hidden on mozilla-central is a tier 1 job which may not be broken, and when it fails the cause must be immediately backed out. Valgrind doesn't run on try, it doesn't run on inbound, it doesn't run on fx-team, it doesn't run on services-central, it doesn't run on-push, it is not tier-1. Rehidden.

Status: VERIFIED → REOPENED

Resolution: FIXED → ---

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 6

•

13 years ago

> Valgrind doesn't run on try, it doesn't run on inbound, it doesn't run on
> fx-team, it doesn't run on services-central, it doesn't run on-push, it is
> not tier-1.
>
> Rehidden.

Valgrind builds can eventually be made to run on try and all the other branches, but it will take up a lot of resources to be run on-push, and we never planned for it to be run on-push. Does this mean it can never be tier-1, and never be unhidden?

Phil Ringnalda (:philor)

Comment 7

•

13 years ago

No, the other option would be to make it the first thing ever which is tier 1 despite running only once a day (desktop nightly builds don't really count, because they are 99% identical to jobs that run throughout the day, and the other 1% is absolutely essential even if it's miserable to have it broken by something within the last 24 hours).

Well, the first thing since the Netscape days - from what I hear, they used to close the tree to build nightly builds, manually test them, and then chase after people who committed during the previous day to figure out who had broken what, because they didn't have on-push tests to speak of. Seemed like a pretty miserable way to develop, to me.

But that would be fairly similar to what would need to happen with a 3am nightly-only tier 1 job - at 4am, edmorley would close mozilla-central, close mozilla-inbound since that would almost certainly be where the bustage had come from, and begin bisecting, all the while leaving mozilla-central and mozilla-inbound closed.

Of course, it would be pointless to have him do the bisecting, since Valgrind is not his field of expertise, and it would be pointless to have mozilla-central and mozilla-inbound closed, since the reason you close a tree when it is busted is to prevent adding more undetectable bustage, but piling bustage on bustage is exactly what a once-a-day job is all about.

So, you could make it the only visible thing which does not require an immediate backout or tree closure, and does not require the sheriff to figure out what busted it, at which point... why is it visible again?

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Comment 8

•

13 years ago

So you're saying that a test can/should only be visible if it's on-push?

Reporter

Comment 9

•

13 years ago

The way to deal with Valgrind issues could be to file a bug and add a suppression for the new bug, then (get Releng to) retrigger only Valgrind builds after the suppression is landed as a DONTBUILD. Assuming it is the only problem found, the retriggered Valgrind build should be green again, and we'd have a bug on file to chase down the problem. Just giving a bit of overview here.

Comment 10

•

13 years ago

(In reply to Nicholas Nethercote [:njn] from comment #8)
> So you're saying that a test can/should only be visible if it's on-push?

Up until now, pretty much yes (PGO is every 3/6 hours depending on tree; Nightly is unavoidable + 99% similar as philor said). They also need to run on Try + all trunk trees that merge into mozilla-central.

I've thought for a while it would be useful to have this documented somewhere, since it comes up most of the times we add something new to TBPL that isn't running on all trees (last one was Marionette iirc).

/me adds to pile of sheriffing things to update on the wiki.

Comment 11

•

13 years ago

Gary has a good point in comment 9 -- this case is a bit different because there's a mechanical process for getting the test green again while yielding a bug on file. IIRC we run with --gen-suppressions=yes so Valgrind even spits out the necessary suppression.
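As a rough illustration of the "mechanical process" mentioned here, the sketch below pulls the generated suppression blocks out of a Valgrind log and gives each one a name that points at a bug. It assumes the log contains the brace-delimited blocks that Valgrind prints when suppression generation is enabled, each with the `<insert_a_suppression_name_here>` placeholder. The function names and the bug number used for naming are hypothetical and are not part of any Mozilla tooling.

```python
def extract_suppressions(log_text):
    """Return every brace-delimited suppression block found in log_text.

    Assumption: each block starts on a line containing only "{" and ends
    on a line containing only "}", as in Valgrind's generated output.
    """
    blocks = []
    current = None
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped == "{":
            # Start collecting a new block.
            current = [line]
        elif current is not None:
            current.append(line)
            if stripped == "}":
                blocks.append("\n".join(current))
                current = None
    return blocks


def name_suppression(block, bug_number):
    """Replace Valgrind's placeholder name with a reference to a bug.

    The "Bug N" naming scheme is an illustrative assumption.
    """
    placeholder = "<insert_a_suppression_name_here>"
    return block.replace(placeholder, f"Bug {bug_number}", 1)
```

The named block could then be appended to the suppression file by hand; filing the bug and landing the change remain manual steps.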

Andrew McCreight [:mccr8]

Comment 12

•

13 years ago

I agree that it's more of a grey area than for other cases, but it still puts extra load on our (other than myself) volunteer sheriffs, who until now have not had to actively check things into mozilla-central on a daily basis just to maintain the green status-quo.

That said, given that the Valgrind builds only run once a day & will presumably complete within my timezone, I guess perhaps it may not affect other sheriffs as much (other than weekends). I also don't have any idea how often it will turn red - if infrequently, then perhaps we are worrying over nothing (especially if it's just a case of copy-pasta-ing a snippet from the log & filing a short bug).

Philor, RyanVM, thoughts?

Comment 13

•

13 years ago

Is there some reason to unhide it? If any time it goes orange, you are just going to push an autogenerated patch to ignore the failure, I'm not sure what the value is in showing it.

Ryan VanderMeulen [:RyanVM]

Comment 14

•

13 years ago

(In reply to Andrew McCreight [:mccr8] from comment #13)

What he said. If we're pushing changes to the suppression file to keep things green, then it seems that we're effectively ignoring it anyway. Seems like it would just end up causing confusion amongst people to have it showing. I agree that it's probably best left hidden by default unless we get to a point of running it at the same frequency as all of our other regression tests.

Ryan VanderMeulen [:RyanVM]

Comment 15

•

13 years ago

That said, maybe there's a compromise here. If enough machine resources could be allocated to run Valgrind builds on the main branches (m-c, m-i, m-a, m-b, m-r, m-esr10/17, s-c, fx-team, try maybe optionally), could we have an arrangement similar to PGO builds where they run as often as possible, but not necessarily on each push? That would at least narrow down a regression window to a shorter timeframe.

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Comment 16

•

13 years ago

> Is there some reason to unhide it?

If it's hidden, will anyone notice if it stops being green?

Running the tests more often would be great, if possible. How long do the tests take?

> I also don't have any idea how often it will turn red - if
> infrequently, then perhaps we are worrying over nothing

Indeed. If it turns out to happen frequently, we could go to plan B.

Reporter

Comment 17

•

13 years ago

> Running the tests more often would be great, if possible. How long do the
> tests take?

They take < 1 hour now, but that's because they're only running PGO tests. The more tests we run, the longer it would take. Julian has done tests on our mochitest suite w/ Valgrind and we think it takes about 11-12 hours.

> Indeed. If it turns out to happen frequently, we could go to plan B.

fwiw, we had another green Valgrind build this morning. I'd say it depends on what the developers land to influence prevalence of redness.

Updated

•

13 years ago

Assignee: ehsan → nobody

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Comment 18

•

13 years ago

> fwiw, we had another green Valgrind build this morning.

And we've had them for the five days since, AFAICT.

(Well, https://tbpl.mozilla.org/?noignore=1&jobname=valgrind&rev=942ed5747b63 is red today but that looks like "abort: data/toolkit/Makefile.in.i@1b08914858da: no match found!" is the problem -- maybe an infrastructure problem?)

Julian Seward [:jseward]

Comment 19

•

13 years ago

(In reply to Gary Kwong [:gkw, :nth10sd] from comment #17)
> Julian has done tests on our mochitest suite w/ Valgrind and we think it
> takes about 11-12 hours.

Roughly 10 CPU hours and 2.5GB real memory, for x86_64-linux built at gcc -O2, running on a 3.47 GHz Core i5.

Being able to run Mochitests here would be awesome (for lack of a better word :). It routinely picks up new bugs on the once-per-month basis that I've been running it by hand for a while. It will take some effort to get it green, but it'd be well worth the effort.

Reporter

Comment 20

•

13 years ago

> (Well,
> https://tbpl.mozilla.org/?noignore=1&jobname=valgrind&rev=942ed5747b63 is
> red today but that looks like "abort:
> data/toolkit/Makefile.in.i@1b08914858da: no match found!" is the problem --
> maybe an infrastructure problem?)

The subsequent rebuilds are all green.

Comment 21

•

13 years ago

(In reply to comment #20)
> > (Well,
> > https://tbpl.mozilla.org/?noignore=1&jobname=valgrind&rev=942ed5747b63 is
> > red today but that looks like "abort:
> > data/toolkit/Makefile.in.i@1b08914858da: no match found!" is the problem --
> > maybe an infrastructure problem?)
>
> The subsequent rebuilds are all green.

That's hg repository corruption.

Comment 22

•

13 years ago

(In reply to Nicholas Nethercote [:njn] from comment #18)
> (Well,
> https://tbpl.mozilla.org/?noignore=1&jobname=valgrind&rev=942ed5747b63 is
> red today but that looks like "abort:
> data/toolkit/Makefile.in.i@1b08914858da: no match found!" is the problem --
> maybe an infrastructure problem?)

That run should have retried, but didn't due to mock; filed bug 802114.

Updated

•

13 years ago

Depends on: 801955

Comment 23

•

12 years ago

Like any other build/testsuite, displaying by default is dependent on running per push (see bug 801955), which I don't see happening given the resource usage and comments from releng. Please reopen the bug if it does.

In a TBPLv2 world (TBPL rewrite, happening this year), Valgrind is the kind of job that could be made more visible if we implement team-specific view modes etc. In the meantime, you'll just need to use &noignore=1.
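For anyone scripting links to the hidden job, here is a minimal sketch of building a TBPL URL that includes `noignore=1`. The parameter names `noignore`, `jobname` and `rev` are taken from the URL quoted earlier in this bug. The helper name `tbpl_url` and its arguments are hypothetical, and this says nothing about what other parameters TBPL accepts.

```python
from urllib.parse import urlencode

TBPL_BASE = "https://tbpl.mozilla.org/"


def tbpl_url(rev=None, jobname=None, show_hidden=True):
    """Build a TBPL link, optionally showing hidden (ignored) jobs.

    Hypothetical helper: only the three query parameters seen in this
    bug's comments are used.
    """
    params = []
    if show_hidden:
        params.append(("noignore", "1"))
    if jobname:
        params.append(("jobname", jobname))
    if rev:
        params.append(("rev", rev))
    if not params:
        return TBPL_BASE
    return TBPL_BASE + "?" + urlencode(params)


# Reproduces the link quoted in comment 18.
print(tbpl_url(rev="942ed5747b63", jobname="valgrind"))
```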

Status: REOPENED → RESOLVED

Closed: 13 years ago → 12 years ago

Resolution: --- → INCOMPLETE

Nobody; OK to take it and work on it

Updated

•

12 years ago

No longer blocks: valgrind-on-tbpl

Depends on: valgrind-on-tbpl

Assignee

Updated

•

11 years ago

Product: Webtools → Tree Management