Closed Bug 594415 Opened 15 years ago Closed 12 years ago

TBPL should report on individual suite results.

Tracking

(Not tracked)

Status:

RESOLVED INCOMPLETE

People

(Reporter: catlee, Assigned: Swatinem)

References

Details

Attachments

(3 files, 3 obsolete files)

group reftest suites 15 years ago Arpad Borsos [:Swatinem] 6.88 KB, patch	mstange : review+	Details \| Diff \| Splinter Review
do not use logID as runID 15 years ago Arpad Borsos [:Swatinem] 9.41 KB, patch		Details \| Diff \| Splinter Review
do not use logID as runID 15 years ago Arpad Borsos [:Swatinem] 11.20 KB, patch	mstange : review+ dbaron : review-	Details \| Diff \| Splinter Review
split mochitest-other into individual tests 15 years ago Arpad Borsos [:Swatinem] 9.02 KB, patch		Details \| Diff \| Splinter Review
group reftest suites (fixed) 15 years ago Arpad Borsos [:Swatinem] 6.47 KB, patch	Swatinem : review+	Details \| Diff \| Splinter Review
split mochitest-other into individual tests 15 years ago Arpad Borsos [:Swatinem] 6.65 KB, patch	mstange : feedback+	Details \| Diff \| Splinter Review

Chris AtLee [:catlee]

Reporter

Description

•

15 years ago

In suites like mochitest-other, several individual suites are run together. They're currently all lumped together under M(oth). It would be nice to break each suite out into its own box somehow. This is also blocking us merging the crashtest and jsreftest suites together, since developers want the results reported separately.

(not currently active) Ted Mielczarek

Comment 1

•

15 years ago

My thought was to report it something like we currently do with the split mochitests. Right now we have M(1 2 3 4 5 oth), so it's not much of a stretch to split that out to M(1 2 3 4 5 B C A P) (browser, chrome, a11y, plugins). We could similarly group the 3 reftest suites, as R(R C J).

Arpad Borsos [:Swatinem]

Assignee

Comment 2

•

15 years ago

Grouping the reftests should be straightforward. Splitting the results however would require a complete rework of our “run” concept in tbpl.

Arpad Borsos [:Swatinem]

Assignee

Comment 3

•

15 years ago

Attached patch group reftest suites (obsolete) — Details — Splinter Review

As suggested by comment 1, this groups the reftest suites, alphabetically.

Assignee: nobody → arpad.borsos

Status: NEW → ASSIGNED

Attachment #474653 - Flags: review?(mstange)

Arpad Borsos [:Swatinem]

Assignee

Comment 4

•

15 years ago

(In reply to comment #3) And it does only group C J R, not X. Maybe that should also be in the group?

Arpad Borsos [:Swatinem]

Assignee

Comment 5

•

15 years ago

Attached patch do not use logID as runID (obsolete) — Details — Splinter Review

Attachment #474655 - Flags: review?(mstange)

Arpad Borsos [:Swatinem]

Assignee

Comment 6

•

15 years ago

Attached patch do not use logID as runID — Details — Splinter Review

Was missing a change in AddCommentUI.

Attachment #474655 - Attachment is obsolete: true

Attachment #474656 - Flags: review?(mstange)

Attachment #474655 - Flags: review?(mstange)

Markus Stange [:mstange]

Comment 7

•

15 years ago

Comment on attachment 474653 [details] [diff] [review] group reftest suites + return machineType missing semicolon r=me, though this doesn't really have anything to do with the issue this bug was filed about, does it?

Attachment #474653 - Flags: review?(mstange) → review+

Markus Stange [:mstange]

Comment 8

•

15 years ago

Comment on attachment 474656 [details] [diff] [review] do not use logID as runID diff --git a/js/TinderboxJSONUser.js b/js/TinderboxJSONUser.js - - getBuildScrape: function TinderboxJSONUser_getBuildScrape(td, machineRunID) { - return td.scrape[machineRunID]; - }, - I know what you must be thinking... this was a much longer function when we were still parsing the Tinderbox HTML. + var runID = index++; So this is preparation for using the buildbot build ID in the future? Great!

Attachment #474656 - Flags: review?(mstange) → review+

Arpad Borsos [:Swatinem]

Assignee

Comment 9

•

15 years ago

Attached patch split mochitest-other into individual tests (obsolete) — Details — Splinter Review

I’m not too happy about this patch, it introduces a lot of quirks and does not play very well with the legend and the machine type handling. This also fixes an issue I noticed in the “group reftest suites” patch that had duplicated names in the title of links. I also noticed that B for browser-chrome collides with the B for build. The way stars work is also somehow messed up. You can star without problems, but the star only shows up on the individual test until you refresh, then the same star is shown on all the tests. This may also lead to accidental double starring. It does however show you right away which one of the other tests is failing.

Attachment #474672 - Flags: feedback?(mstange)

Arpad Borsos [:Swatinem]

Assignee

Comment 10

•

15 years ago

Attached patch group reftest suites (fixed) — Details — Splinter Review

Attachment #474653 - Attachment is obsolete: true

Attachment #474673 - Flags: review+

Arpad Borsos [:Swatinem]

Assignee

Comment 11

•

15 years ago

group reftest suites: http://hg.mozilla.org/users/mstange_themasta.com/tinderboxpushlog/rev/7fb6d0c8a7af do not use logID as runID: http://hg.mozilla.org/users/mstange_themasta.com/tinderboxpushlog/rev/19a181cfb2ab

Arpad Borsos [:Swatinem]

Assignee

Comment 12

•

15 years ago

(In reply to comment #9) > I also noticed that B for browser-chrome collides with the B for build. And the C is ambiguous for crashtest or chrome.

Phil Ringnalda (:philor)

Comment 13

•

15 years ago

And I'm betting that this is the reason that the summary for every orange build I've looked at today is "invalid id" making it impossible to star anything with tbpl. Markus, if you're still awake, please undeploy this (or whatever else you might have pulled overnight, if it isn't actually this).

Markus Stange [:mstange]

Comment 14

•

15 years ago

undeployed

Arpad Borsos [:Swatinem]

Assignee

Comment 15

•

15 years ago

pushed a bustage fix: http://hg.mozilla.org/users/mstange_themasta.com/tinderboxpushlog/rev/335c76ca4e5b

(not currently active) Ted Mielczarek

Comment 16

•

15 years ago

(In reply to comment #4) > (In reply to comment #3) > And it does only group C J R, not X. Maybe that should also be in the group? X is xpcshell, that's totally different. (In reply to comment #2) > Grouping the reftests should be straightforward. Splitting the results however > would require a complete rework of our “run” concept in tbpl. That's unfortunate, since splitting the mochitest suites was the real impetus of this bug. bug 586418 aims to reduce wasted setup/teardown time by combining some suites, but we'd like the reporting to be better.

Arpad Borsos [:Swatinem]

Assignee

Comment 17

•

15 years ago

Attached patch split mochitest-other into individual tests — Details — Splinter Review

Attachment #474672 - Attachment is obsolete: true

Attachment #475088 - Flags: feedback?(mstange)

Attachment #474672 - Flags: feedback?(mstange)

Markus Stange [:mstange]

Comment 18

•

15 years ago

Comment on attachment 475088 [details] [diff] [review] split mochitest-other into individual tests I think this is the right approach. It obviously requires some more work to iron out the kinks you mentioned.

Attachment #475088 - Flags: feedback?(mstange) → feedback+

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Comment 19

•

15 years ago

Comment on attachment 474656 [details] [diff] [review] do not use logID as runID I think this patch caused a pretty serious regression in TBPL, which is that if you have a build highlighted and TBPL reloads data from tinderbox, the build that's highlighted will jump to some other build. We need to use a persistent ID here; using a counter that changes each reload doesn't work.

Attachment #474656 - Flags: review-

Arpad Borsos [:Swatinem]

Assignee

Comment 20

•

15 years ago

http://hg.mozilla.org/users/mstange_themasta.com/tinderboxpushlog/rev/98535a5c1fd3 Backed out. This will probably re-land once we don’t reload the whole list all over, instead just appending new runs.

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Comment 21

•

15 years ago

Also, I'm skeptical that this is a good approach to take at this time. It's going to break starring. And, furthermore, TBPL isn't really where we should be trying to split up build logs, which is quite expensive. I think splitting the runs into the appropriate units that we want to display belongs on the server side.

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Comment 22

•

15 years ago

... I should probably explain "break starring" slightly more: right now we store stars on tinderbox, which means that a star is associated with whatever tinderbox considers a build. If we do this, we'll end up starring all of the split builds when we attempt to star one. Now, at some point we should move starring off of tinderbox. But there's still the issue that TPBL is slow enough already, and we're better off doing expensive splitting operations on the server side.

(not currently active) Ted Mielczarek

Comment 23

•

15 years ago

The immediate use case here is that in bug 586418, we want to combine crashtest+jsreftest into a single run without making the data display worse than it currently is. FWIW, for what you're describing, if we split the display of mochitest-other out, it's not going to be worse than it currently is.

David Baron :dbaron: (⌚️UTC-5, no longer working on Mozilla)

Comment 24

•

15 years ago

(In reply to comment #23) > The immediate use case here is that in bug 586418, we want to combine > crashtest+jsreftest into a single run without making the data display worse > than it currently is. FWIW, for what you're describing, if we split the display > of mochitest-other out, it's not going to be worse than it currently is. It will be worse, since we tend to follow the "don't star a build unless you've figured out everything that's wrong with it" rule. With this change, people will be unable to follow that rule.

Arpad Borsos [:Swatinem]

Assignee

Comment 25

•

15 years ago

I already mentioned the starring problem in comment 9. Moving it to the server side definetely leads to a much clearer client. But we are not quite there yet.

Markus Stange [:mstange]

Comment 26

•

15 years ago

(In reply to comment #0) > In suites like mochitest-other, several individual suites are run together. > They're currently all lumped together under M(oth). > > It would be nice to break each suite out into its own box somehow. What should we do with the logs? Should we make the server side log parsing look for separators between the individual log parts and break them up into separate logs?

cmtalbert

Comment 27

•

14 years ago

(In reply to comment #22) > ... I should probably explain "break starring" slightly more: right now we > store stars on tinderbox, which means that a star is associated with > whatever tinderbox considers a build. If we do this, we'll end up starring > all of the split builds when we attempt to star one. > > Now, at some point we should move starring off of tinderbox. But there's > still the issue that TPBL is slow enough already, and we're better off doing > expensive splitting operations on the server side. So, we've moved starring off of tinderbox now with the advent of Orange Factor. Is there anything else blocking this bug from landing?

Phil Ringnalda (:philor)

Comment 28

•

14 years ago

You've got your tenses mixed up: "we will have moved starring off of tinderbox in the future, so at that time will there be anything else?" I think the answer is "that depends on whether the future buildbot-only version of tbpl knows how to split apart multiple arbitrary suites in a single log, by having buildbot always insert the (currently undefined) One True Separator between suites." The other problem that I don't see quite explicitly brought up yet is that currently a log has a status, busted or testfailed (or, I've heard, success), which would need to be output for each suite rather than once per log. Quite often it's the result of the run's status rather than a suite's status, so I suspect releng will need to rework all the stuff that currently decides whether a testfailed in a suite turns the whole log to testfailed, or doesn't because an infra failure in another buildstep made it busted rather than just testfailed - some failures, like failing to download the build, make every test suite in the run busted, but others, like failing to unpack jsreftests in a combined run of J and X, would only bust part.

Nick Thomas [:nthomas] (UTC+12)

Updated

•

14 years ago

Blocks: 659328

Armen [:armenzg]

Comment 29

•

14 years ago

Where are we at in this bug? This is currently blocking me from merging unit test jobs. It seems that grouping or no grouping does not affect when dealing with talos runs (as I discovered it when adding tp5 to the tp4 run).

Markus Stange [:mstange]

Comment 30

•

14 years ago

(In reply to comment #29) > Where are we at in this bug? I haven't done any work on this yet. > This is currently blocking me from merging unit test jobs. Which ones? Still crashtest and jsreftest, as stated in comment 0? > It seems that grouping or no grouping does not affect when dealing with > talos runs (as I discovered it when adding tp5 to the tp4 run). Well, depends on what you mean. You didn't get separate "T"s for tp4 and tp5; they shared the same T. If you don't mind having the shared crashtest/jsreftest run to have a shared "CJ", for example, then we're already all set. But splitting up the "letters" is what this bug was originally about, and that's harder. If we want split letters, we'll need to solve the problems philor mentioned in comment 28: We need a decision about the log separator (this is the easy part), and we need to teach the buildbot scripts that create the build json how to specify what parts a build consists of, along with a result code for each one. This need to be done on the buildbot side before anything can be done on the TBPL side.

Armen [:armenzg]

Updated

•

14 years ago

No longer blocks: 659328

Ed Morley [:emorley]

Comment 31

•

12 years ago

I think our requirements have changed somewhat here - and any future use cases will need to be handled by treeherder anyway. Closing per IRC discussions with Swatinem :-)

Status: ASSIGNED → RESOLVED

Closed: 12 years ago

Resolution: --- → INCOMPLETE

Nobody; OK to take it and work on it

Updated

•

11 years ago

Product: Webtools → Tree Management

Nobody; OK to take it and work on it

Updated

•

10 years ago

Product: Tree Management → Tree Management Graveyard

You need to log in before you can comment on or make changes to this bug.