when requesting a talos test via titanic backfill, I always get an error

RESOLVED WONTFIX


People

(Reporter: jmaher, Unassigned)


Details

Attachments

(1 attachment)

(Reporter)

Description

4 years ago
I have made many requests for titanic to backfill jobs. These requests almost always require a build for each revision. I keep getting errors, and I believe what is happening is that we launch the builds successfully, then immediately try to launch the tests, which fails because we do not wait for the builds to finish. I suspect we need to adjust our test launching process.

Here is an edited version of the output:
INFO:titanic:Your return code is: 202
INFO:titanic:https://secure.pub.build.mozilla.org/buildapi/revision/mozilla-inbound/e59e29b14cd6
Building for Job...
Builds are triggered!
Error: For e59e29b14cd6 Windows XP 32-bit mozilla-inbound pgo talos chromez
Can you list a few other tests which error out?

As an optimization, we decided to let the processing "fall through", i.e. process multiple stages in one go. The problem, I think, is that the build system takes some time to register the existence of the builds via the API we are using. So my suggestion would be to not fall through from "updated" status to "building" status, and instead break when we update the status to "building".

I think this could be a simple fix.
https://github.com/gakiwate/titanic/blob/master/backfill.py#L81

What do you think?
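The proposed change can be sketched roughly as follows. This is a minimal illustration, not the code in backfill.py: the function name `process_request`, the callback parameters, the dict-based request, and the "testing" status are all assumptions made for the example. Only the "updated" and "building" statuses come from the discussion above.

```python
def process_request(request, trigger_builds, builds_finished, trigger_tests):
    """Advance a backfill request by at most one build-related stage per call.

    All names here are hypothetical; only the "updated" and "building"
    statuses are taken from the bug discussion.
    """
    if request["status"] == "updated":
        trigger_builds(request)
        request["status"] = "building"
        # Stop here instead of falling through: the build API may not have
        # registered the new builds yet, so checking them now could fail.
        return request["status"]

    if request["status"] == "building":
        if not builds_finished(request):
            # Builds are still running; try again on the next pass.
            return request["status"]
        trigger_tests(request)
        request["status"] = "testing"  # assumed follow-up status

    return request["status"]
```

With this shape, a request that has just triggered its builds is left alone until the next pass, which gives the build system time to report them.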
(Reporter)

Comment 2

4 years ago
I have not seen any of my pushes work without an error in the last 10 or so tries; in fact, all of them need at least one build, if not many. Could we do the following:
schedule builds
sleep 2 minutes
check builds / schedule tests

If we run this in a cron job every 10-15 minutes, then we don't need a sleep but just do this:
if any builds are needed:
    schedule_builds()
    return
else:
    schedule_tests()
    return

I want to weigh in here that if we need to do builds, waiting 15 extra minutes to schedule some tests is not a big deal. The reason for backfilling is to look at data from point a->b, and that cannot be completed until all the data is there.
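As a rough sketch of the cron-style pass described above, applied per revision: each run either schedules builds or schedules tests for a revision, never both. The names `backfill_pass`, `missing_builds`, and the callback signatures are hypothetical and not part of titanic.

```python
def backfill_pass(revisions, missing_builds, schedule_builds, schedule_tests):
    """One cron-style pass over the revisions being backfilled.

    For each revision, schedule builds if any are needed, otherwise
    schedule tests. All helper names are assumptions for illustration.
    """
    actions = {}
    for revision in revisions:
        needed = missing_builds(revision)
        if needed:
            # Builds come first; tests wait for a later pass.
            schedule_builds(revision, needed)
            actions[revision] = "builds"
        else:
            schedule_tests(revision)
            actions[revision] = "tests"
    return actions
```

Because the next cron run picks up where this one left off, no sleep is needed inside the loop. A real implementation would also have to tell builds that are still running apart from builds that were never requested, which this sketch leaves to `missing_builds`.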
Created attachment 8522739 [details] [diff] [review]
0001-BUG-1098250-Proposed-Fix.patch

I was thinking something as simple as this. I am not very keen on having a sleep inside the loop as it might lead to trouble later on and it's just bad practice! :)
Attachment #8522739 - Flags: review?(jmaher)
(Reporter)

Comment 4

4 years ago
Comment on attachment 8522739 [details] [diff] [review]
0001-BUG-1098250-Proposed-Fix.patch

Review of attachment 8522739 [details] [diff] [review]:
-----------------------------------------------------------------

this looks great!
Attachment #8522739 - Flags: review?(jmaher) → review+
(Reporter)

Comment 5

4 years ago
We are now backfilling with mozilla_ci_tools (https://github.com/armenzg/mozilla_ci_tools), which has taken over a lot of titanic's pieces and knowledge.
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → WONTFIX