Stand up TPS in services infrastructure

RESOLVED INCOMPLETE

Status

RESOLVED INCOMPLETE
3 years ago
2 years ago

People

(Reporter: sphilp, Assigned: kthiessen)

Tracking

unspecified
Points:
---
Bug Flags:
firefox-backlog +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments)

(Reporter)

Description

3 years ago
https://developer.mozilla.org/en-US/docs/TPS

Looking at moving this into the services qa infrastructure so that :markh can take ownership of build and test, and use it for sync going forward.
Flags: firefox-backlog+
Priority: -- → P2
Whiteboard: [fxsync]
I replied to the email that was sent out last week. I hope it gives you all the necessary bits to get started. If not, you can always ask me further questions.
Created attachment 8681488 [details]
Log showing failure of test when supplied with a valid FxA account

This is different from failures associated with bad username/password.
(Assignee)

Updated

3 years ago
QA Contact: kthiessen
Attachment #8681488 - Attachment mime type: text/x-log → text/plain
(In reply to Karl Thiessen [:kthiessen] from comment #2)
> This is different from failures associated with bad username/password.

That's a good result :) It looks like it reflects a relatively successful environment but an early test failure. I'll find (or open) the client-side bug to fix that tomorrow, but reproducing that failure in a CI environment would mean we are almost there.
Stuart, can we close this bug or is there further work to do here?
Flags: needinfo?(sphilp)
(Reporter)

Comment 5

3 years ago
I wouldn't mind keeping this open until we have all the pieces complete. Although TPS technically runs in our Jenkins, it's not 100% in CI (building on changes and such), and there's also reporting into Treeherder (though that part can/should be a separate bug).
Flags: needinfo?(sphilp)
Created attachment 8725875 [details]
Test run showing FxA/Hawk errors

This shows the output of a TPS run from the QA Jenkins box.
(Reporter)

Comment 7

3 years ago
Woo! Need to figure out the best way to get this to build on commit so markh can see progress
Good news and bad news, as always.

Good news: We've gotten TPS to run from a job on the QA Jenkins box.  Woot!

Bad news:  The tests do not appear to be running correctly.  :markh, could you have a look at the above log attachment and tell me if anything jumps out at you?
Flags: needinfo?(markh)
The clock on this machine has diverged a lot, so make sure you have NTP installed and running. See the failure:

1456952781036	Hawk	DEBUG	(Response) /account/login?keys=true: code: 400 - Status text: Bad Request
1456952781036	Hawk	DEBUG	Clock offset vs https://api.accounts.firefox.com/v1: -1036
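A rough way to spot this in later runs is to pull the Hawk clock-offset lines out of the log and flag large values. This is only a sketch: the regular expression is inferred from the two log lines above, the 500 threshold is an arbitrary illustrative value, and the assumption that the offset is in milliseconds is not confirmed anywhere in this bug.

```python
import re

# Inferred from the log line quoted above, e.g.
#   "... Hawk DEBUG Clock offset vs https://api.accounts.firefox.com/v1: -1036"
OFFSET_RE = re.compile(r"Clock offset vs (\S+): (-?\d+)")


def find_clock_offsets(log_text):
    """Return a list of (server_url, offset) pairs found in the log text."""
    return [(m.group(1), int(m.group(2))) for m in OFFSET_RE.finditer(log_text)]


def offsets_over(log_text, max_abs_offset=500):
    """Return only the offsets whose absolute value exceeds the threshold.

    The default threshold is a hypothetical value chosen for illustration;
    the unit is assumed (not confirmed) to be milliseconds.
    """
    return [
        (url, offset)
        for url, offset in find_clock_offsets(log_text)
        if abs(offset) > max_abs_offset
    ]


if __name__ == "__main__":
    sample = (
        "1456952781036\tHawk\tDEBUG\t(Response) /account/login?keys=true: "
        "code: 400 - Status text: Bad Request\n"
        "1456952781036\tHawk\tDEBUG\tClock offset vs "
        "https://api.accounts.firefox.com/v1: -1036\n"
    )
    for url, offset in offsets_over(sample):
        print(f"Large clock offset vs {url}: {offset}")
```

A check like this only reports what the client logged; fixing the drift itself is still a matter of running NTP on the box.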
> CROSSWEAVE INFO: Login user: __FX_ACCOUNT_USERNAME__

It looks like config.json doesn't have the username and password of a real Firefox account - that's the default value that we expect to be edited - https://dxr.mozilla.org/mozilla-central/source/testing/tps/config/config.json.in#8

It kinda sucks that we need that, but a solution isn't obvious - I doubt we want to create a new one each run.
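One cheap guard against this is to fail early if config.json still contains unedited template values. The sketch below assumes that placeholders always look like `__FX_ACCOUNT_USERNAME__` (upper-case text wrapped in double underscores); it makes no assumption about which keys the file contains, and the function names are made up for illustration.

```python
import json
import re

# Assumption: unedited template values look like __SOME_NAME__,
# as in the __FX_ACCOUNT_USERNAME__ value quoted above.
PLACEHOLDER_RE = re.compile(r"^__[A-Z0-9_]+__$")


def find_placeholders(value, path=""):
    """Recursively collect (path, value) pairs for placeholder strings."""
    found = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else key
            found.extend(find_placeholders(child, child_path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            found.extend(find_placeholders(child, f"{path}[{index}]"))
    elif isinstance(value, str) and PLACEHOLDER_RE.match(value):
        found.append((path, value))
    return found


def check_config_text(text):
    """Parse JSON text and return any placeholder values it still contains."""
    return find_placeholders(json.loads(text))


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], encoding="utf-8") as handle:
        leftovers = check_config_text(handle.read())
    for path, value in leftovers:
        print(f"Unedited placeholder at {path}: {value}")
    sys.exit(1 if leftovers else 0)
```

Run as a pre-step in the Jenkins job, with the path to config.json as the only argument, this would stop the run before Firefox is launched with placeholder credentials.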
Flags: needinfo?(markh)
Created attachment 8725984 [details]
Jenkins output, 2016-03-02

Account credentials properly in place, but some unexpected failures.
This looks much better, but I may need some help chasing down those last few failures.
> JavaScript error: resource://tps/tps.jsm, line 616: TypeError: Async.isShutdownException is not a function

My first speculation is that this is being called after Firefox has torn down, but the logs don't make it obvious that's the case. I guess I should see if I can reproduce this on Linux.
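To make failures like this easier to find in a long Jenkins log, the error lines can be extracted with a small script. This is a sketch only: the pattern is inferred from the single error line quoted above and may not cover every form the console output takes.

```python
import re

# Inferred from the one line quoted above:
#   "JavaScript error: resource://tps/tps.jsm, line 616: TypeError: ..."
JS_ERROR_RE = re.compile(
    r"JavaScript error: (?P<source>.*?), line (?P<line>\d+): (?P<message>.*)"
)


def find_js_errors(log_text):
    """Return a list of (source, line_number, message) tuples from the log."""
    return [
        (m.group("source"), int(m.group("line")), m.group("message").strip())
        for m in JS_ERROR_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "JavaScript error: resource://tps/tps.jsm, line 616: "
        "TypeError: Async.isShutdownException is not a function\n"
    )
    for source, line_number, message in find_js_errors(sample):
        print(f"{source}:{line_number}: {message}")
```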
Flags: needinfo?(markh)
Thom, do you think you would be able to take a look at this? "tps" is a test framework that is mostly in-tree, but is a bit funky to get running. What's different about this test suite is that it is able to start and stop Firefox multiple times and with multiple profiles, and it compares the sync state between runs to check things have worked as expected. Once we get this working reliably in automation, I think it will make sense to hook your validator into it, so every test does the validation check and fails if it finds a problem. It shouldn't find any (the tests run in a fairly controlled fashion, so they are unlikely to trigger the cases where we screw up, such as premature shutdowns or creating bookmarks while syncing), but it still seems worthwhile: it tests Sync in a way that the rest of the in-tree tests don't.

Anyway, short term task is to get it running on Linux without errors, so it can run automagically and we get to deal with any failures.

Basic docs are at https://developer.mozilla.org/en-US/docs/Mozilla/Projects/TPS and https://developer.mozilla.org/en-US/docs/Mozilla/Projects/TPS_Tests, and the source is split between services/sync/tests/tps (the test definitions), services/sync/tps (an addon used by the harness), and testing/tps (the harness itself).
Flags: needinfo?(markh) → needinfo?(tchiovoloni)
Yep, definitely can add this to my list of things to do.
Flags: needinfo?(tchiovoloni)
This test is running on the QA Jenkins instance, with output going to IRC #services-test once an hour.

It's currently failing with the same error given above -- console output is here:
   https://s3-us-west-1.amazonaws.com/services-qa-jenkins-artifacts/jobs/sync_e2e-test_prod/342/test_log.txt

The '342' in that URL can be replaced by higher numbers to get output from later runs.
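As a small convenience, the substitution described above can be scripted. The template below is copied from the URL in this comment; the helper name is made up for illustration, and nothing here checks that a given build number actually exists.

```python
# Template taken from the log URL above, with the build number parameterised.
LOG_URL_TEMPLATE = (
    "https://s3-us-west-1.amazonaws.com/services-qa-jenkins-artifacts/"
    "jobs/sync_e2e-test_prod/{build}/test_log.txt"
)


def log_url_for_build(build):
    """Return the test_log.txt URL for a given Jenkins build number."""
    if not isinstance(build, int) or build < 1:
        raise ValueError("build must be a positive integer")
    return LOG_URL_TEMPLATE.format(build=build)


if __name__ == "__main__":
    # Print the URLs for build 342 and the two builds after it.
    for number in range(342, 345):
        print(log_url_for_build(number))
```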

The QA Jenkins requires VPN access -- this job is at:
   https://services-qa-jenkins.stage.mozaws.net:8443/job/sync_e2e-test_prod/
Many thanks to :rpapa for his tireless work getting this stood up.  Yay, Richard!
See Also: → bug 1273347
Priority: P2 → --
In discussion in Sync meetings last week, Mark suggested that we break this into two pieces:

1. A minimal once-a-day job that runs against production, in such a way as not to disrupt production FxA.

2. A more thorough job that runs more often against stage.

Unless someone objects this week, I intend to assign myself these two tasks as new bugs and close this one out.
Status: NEW → ASSIGNED
(Assignee)

Updated

2 years ago
Depends on: 1293426
(Assignee)

Updated

2 years ago
Depends on: 1287365
Whiteboard: [fxsync]
Seems I neglected to close this out.  Doing so now: bug 1287365 is still outstanding.
Assignee: sphilp → kthiessen
QA Contact: kthiessen
(Assignee)

Updated

2 years ago
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → INCOMPLETE