Optimize by replacement appears to be considering canceled tasks
Categories
(Firefox Build System :: Task Configuration, defect, P2)
Tracking
(firefox85 fixed)
Tracking | Status | |
---|---|---|
firefox85 | --- | fixed |
People
(Reporter: ahal, Assigned: bc)
References
Details
Attachments
(1 file)
In this push:
https://treeherder.mozilla.org/jobs?repo=try&revision=09bb1f2929ed3a4977f99a2b8e658a2cdaed3bf1
There were tons of build and test tasks scheduled (see the decision task log), but none of them ended up running. Digging into the decision task, bc found that many of the docker-image tasks actually had an exception. E.g.:
https://firefox-ci-tc.services.mozilla.com/tasks/H6Nonoe6TASfWD7QSaKp0g
This is the image task for the lint image, which has "canceled" as its "reasonResolved". It originally ran on an earlier push and was optimized by replacement here.
One would assume that a canceled task is not a valid replacement :). This could be broader than just canceled tasks too, I'm not sure.
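A minimal sketch of the check being asked for, assuming each candidate replacement comes with a status object shaped like the one the Taskcluster queue reports (a top-level "state" plus per-run "reasonResolved"). The helper names `is_valid_replacement` and `pick_replacement` are hypothetical and are not the actual taskgraph code; which states should count as usable is also an assumption.

```python
# Hypothetical helpers for illustration; not the real taskgraph implementation.

# Assumption: a finished-successfully task is usable, and a task that is
# still pending or running may yet succeed, so it is allowed as well.
USABLE_STATES = {"completed", "pending", "running"}


def is_valid_replacement(status):
    """Return True if a task with this status may replace a new task.

    `status` is assumed to look like
    {"state": "...", "runs": [{"state": "...", "reasonResolved": "..."}]}.
    Tasks in the "exception" state (e.g. reasonResolved "canceled" or
    "deadline-exceeded") and "failed" tasks are rejected.
    """
    return status.get("state") in USABLE_STATES


def pick_replacement(candidates):
    """Return the first usable task id from (task_id, status) pairs, or None."""
    for task_id, status in candidates:
        if is_valid_replacement(status):
            return task_id
    return None
```

With a check like this, the canceled lint image task above would be skipped and the image would be rebuilt instead of being optimized away.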
Comment 1•4 years ago
I'm not sure what I'm looking at here, but I was seeing similar behavior with my sixgill toolchain jobs. I think this might be one of them:
This was pushed with try fuzzy requesting a sixgill toolchain job and two dependent haz jobs, none of which ran. The decision task shows toolchain-linux64-gcc-sixgill in the "EXCEPTION" state, with reasonResolved=DEADLINE_EXCEEDED. No such job seems to appear on treeherder for that push.
The following push is similar, this time with toolchain-linux64-gcc-10 showing DEADLINE_EXCEEDED.
These don't match what I remembered from the time (which was that I did some pushes for haz jobs that depended on the sixgill toolchain). But maybe that's being added in before pushing try_task_config.json? I guess that would make sense. Sorry if this is unrelated noise; I don't understand this stuff well enough to know whether it's related to this bug or not.
I see a preceding failed gcc-10 fetch job on a previous push: https://treeherder.mozilla.org/jobs?repo=try&author=sfink%40mozilla.com&fromchange=ef6a1fccbcc16d75a45f7118e69f8b6ab801f517&collapsedPushes=812259%2C812245&selectedTaskRun=Vq8rPKgETD67XPRUrcnDDw.0
When I look at the decision task here, though, it seems to say these were not being optimized out. So I suspect whatever I was running into is probably not related. I also don't remember any DEADLINE_EXCEEDED or timed out jobs at the time; back then, I remember it seeming like those jobs just weren't showing up at all. My faulty memory of what was happening:
- I did a push selecting only the two haz jobs, which depend on sixgill and gcc
- my push touched the code for one or the other, so it tried to rebuild the toolchain and failed
- I did another push selecting only the two haz jobs, with a fix for the toolchain
- it didn't run anything of interest, and looking at the decision job it had some kind of failure status on the toolchain listed
- I thought I had "fixed" it by explicitly requesting the toolchain jobs along with the haz jobs. But looking at it now, I suspect that that was mostly irrelevant; somehow the next push got what I wanted.
I'm having difficulty matching that recollection to my historical try pushes, though.
Comment 2•4 years ago (by the assignee)
Comment 5•4 years ago
bugherder |