Closed Bug 1492271 Opened 6 years ago Closed 6 years ago

Build a cluster-wide integration-testing framework

Tracking

(Not tracked)

Status:

RESOLVED DUPLICATE of bug 1575956

People

(Reporter: dustin, Unassigned)

References

Details

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Description

•

6 years ago

We should have a robust integration / smoke testing framework that can be run against an active cluster. This will relieve some of the pressure on per-service unit tests to cover integration-related concerns (for example, does the index properly index real pulse messages from the queue, or are its fake messages subtly different?). It also means we can test things that are difficult to fake, such as workers' interactions with the rest of the cluster.

Per discussion with Brian:

* It'd be nice to have the tests relevant to a particular service or repo be located *in* that repo, and extracted during the cluster build process. In which case it'd also be nice to be able to run those alone without building/deploying (to iterate on the tests themselves, for example). The other option is a dedicated repo.
* Where a test originates from service A, but involves services B and C, it should depend only on stable, documented behavior of B and C, but can rely on implementation details of A. Then we'll see bustage in smoke tests if we accidentally change documented behavior in B and C.
* This should have some amount of UI, so that it can be triggered and the results reviewed in a browser aimed at the deployment in question. This doesn't have to be in the tc-web/tc-tools UI, but maybe that makes sense.
* These tests will probably make assumptions about cluster configuration, such as what services are enabled. They should get access to some of that configuration to decide when to skip a test.
* These tests will need some configuration in place. Is it enough to create a client that has access to set up that configuration? Can that be done with sufficiently limited scopes that it's not dangerous to have active in Firefox CI production?
* The taskcluster-diagnostics repo might provide a starting place - it's basically a `mocha` run with some reporting hooked up. Another alternative is a task-like interface where we run a collection of commands in docker images. That would let tests be language-specific, and give a nice consistent API for how each test case is run and how its results are reported.

This is not high-priority, but I'll continue to think about this and start with something simple that can provide a robust kernel for a fuller implementation to develop.
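To make the "skip based on cluster configuration" point concrete, here is a rough Python sketch of a runner that skips tests whose required services are not enabled and reports every result in one consistent shape. Nothing here comes from an existing Taskcluster tool: `SmokeTest`, `run_suite`, and the idea of reducing cluster configuration to a set of enabled service names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable


@dataclass(frozen=True)
class SmokeTest:
    """One smoke test plus the services it needs (hypothetical structure)."""
    name: str
    requires: FrozenSet[str]
    run: Callable[[], None]  # expected to raise an exception on failure


def run_suite(tests: Iterable[SmokeTest],
              enabled_services: Iterable[str]) -> Dict[str, str]:
    """Run each test, skipping those whose required services are not enabled.

    Returns a mapping of test name to "pass", "fail: <reason>",
    or "skip: missing <services>".
    """
    enabled = set(enabled_services)
    results: Dict[str, str] = {}
    for test in tests:
        missing = test.requires - enabled
        if missing:
            # The cluster does not run everything this test needs.
            results[test.name] = "skip: missing " + ", ".join(sorted(missing))
            continue
        try:
            test.run()
            results[test.name] = "pass"
        except Exception as exc:  # any exception counts as a failure
            results[test.name] = f"fail: {exc}"
    return results
```

A test that needs a service the deployment does not run is then reported as skipped rather than failed, which keeps the output meaningful on clusters with different sets of enabled services.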

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Updated

•

6 years ago

Depends on: tc-monorepo

Nobody; OK to take it and work on it

Assignee

Updated

•

6 years ago

Component: Redeployability → Services

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 1

•

6 years ago

I still like this idea, just shouldn't pretend I'm working on it.

Assignee: dustin → nobody

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 2

•

6 years ago

We talked a bit about this this week.

This should be a tool that can be used both by us and by an organization deploying TC -- a good way to get a sense of whether something is seriously broken, but without any attempt to cover all functionality. So, for example, cloudops could run it against new staging deployments to catch cases where something has gone wrong in the deployment process.

It can also address some known regressions, so that it's useful during an outage. A current example: sometimes the queue's dependency resolver stops resolving dependencies. Integration tests could check whether dependencies resolve. Then an operations team would have a good way to find out quickly that this is the issue.
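A check for the dependency-resolver case might look roughly like the Python sketch below. It assumes a task with unresolved dependencies sits in the `unscheduled` state until they resolve. The three callables (`create_task`, `resolve_task`, `get_state`) are hypothetical stand-ins for whatever queue client the real tool would use; they are not actual Taskcluster client methods.

```python
import time
from typing import Callable, List


def dependencies_resolve(
    create_task: Callable[[List[str]], str],   # hypothetical: takes dependency IDs, returns a task ID
    resolve_task: Callable[[str], None],       # hypothetical: marks a task as completed
    get_state: Callable[[str], str],           # hypothetical: returns the task's current state
    attempts: int = 30,
    poll_interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if a dependent task leaves "unscheduled" once its dependency resolves."""
    parent = create_task([])
    child = create_task([parent])
    resolve_task(parent)
    # Poll the child: a working resolver should move it out of "unscheduled".
    for _ in range(attempts):
        if get_state(child) != "unscheduled":
            return True
        sleep(poll_interval)
    return False
```

Injecting the callables keeps the check independent of any particular client library, and lets it be exercised against an in-memory fake before it is pointed at a real deployment.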

We'd probably like to write this in Go, as part of Taskcluster-CLI. Something like `taskcluster diagnose`.

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 3

•

6 years ago

Oh, also, I think bstack agreed to write at least the framework of this.

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Updated

•

6 years ago

Depends on: 1560650

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Updated

•

6 years ago

Status: NEW → RESOLVED

Closed: 6 years ago

Resolution: --- → DUPLICATE


Categories

(Taskcluster :: Services, enhancement)
