Closed Bug 1168643 Opened 10 years ago Closed 9 years ago

Automate the test cases for Telemetry data QA

Tracking

()

Status:

RESOLVED WORKSFORME

People

(Reporter: areinald.bug, Assigned: areinald.bug)

References

(Blocks 1 open bug)

Details

(Whiteboard: [measurement:client])

Attachments

(2 files, 9 obsolete files)

test-telemetry.py 10 years ago André Reinald 1.38 KB, text/x-python-script		Details
bugzilla-1168643-1.patch 10 years ago André Reinald 5.41 KB, patch		Details \| Diff \| Splinter Review
bugzilla-1168643-2.patch 9 years ago André Reinald 6.12 KB, patch		Details \| Diff \| Splinter Review
bugzilla-1168643-3.patch 9 years ago André Reinald 9.03 KB, patch		Details \| Diff \| Splinter Review
bugzilla-1168643-4.patch 9 years ago André Reinald 13.97 KB, patch		Details \| Diff \| Splinter Review
bugzilla-1168643-5.patch 9 years ago André Reinald 16.08 KB, patch		Details \| Diff \| Splinter Review
bugzilla-1168643-6.patch 9 years ago André Reinald 17.21 KB, patch		Details \| Diff \| Splinter Review
bugzilla-1168643-7.patch 9 years ago André Reinald 18.31 KB, patch	gfritzsche : feedback+	Details \| Diff \| Splinter Review
bugzilla-1168643-8.patch 9 years ago André Reinald 18.13 KB, patch		Details \| Diff \| Splinter Review
test-telemetry.sh 9 years ago André Reinald 1015 bytes, application/x-sh		Details
telemetry-delays-prefs.patch 9 years ago André Reinald 4.79 KB, patch	gfritzsche : feedback+	Details \| Diff \| Splinter Review

André Reinald

Assignee

Description

•

10 years ago

Write tests to execute plan described in bug 1156718.

André Reinald

Assignee

Comment 1

•

10 years ago

Attached file test-telemetry.py (obsolete) — Details

WIP

Assignee: nobody → bugzilla

Status: NEW → ASSIGNED

André Reinald

Assignee

Updated

•

10 years ago

Depends on: 1156718

André Reinald

Assignee

Updated

•

10 years ago

Attachment #8610896 - Flags: feedback?(ato)

André Reinald

Assignee

Updated

•

10 years ago

Version: unspecified → Trunk

André Reinald

Assignee

Comment 2

•

10 years ago

(In reply to André Reinald from comment #0) > Write tests to execute plan described in bug 1156718. This bug is about translating the test plan from plain English (bug 1156718) to Python-Marionette, so it can be machine executed repeatedly instead of requiring manual QA.

Andreas Tolfsen ❲:ato❳

Comment 3

•

10 years ago

Comment on attachment 8610896 [details] test-telemetry.py I’m not really sure what you want me to give feedback on here? The program implements functions for starting and stopping Firefox, which the Marionette test runner already supplies. There are also various other libraries such as mozprofile, to provision profiles, and mozrunner, to manage the binary lifetime.

Attachment #8610896 - Flags: feedback?(ato)

André Reinald

Assignee

Comment 4

•

10 years ago

Attached patch bugzilla-1168643-1.patch (obsolete) — Details — Splinter Review

WIP

Georg Fritzsche [:gfritzsche]

Comment 5

•

10 years ago

Comment on attachment 8613978 [details] [diff] [review] bugzilla-1168643-1.patch Review of attachment 8613978 [details] [diff] [review]: ----------------------------------------------------------------- That looks like a good start. I don't know about marionette, but this seems to implement some things manually that should already be provided (as pointed out above). #automation might offer some help or point to examples. Installing addons seems like a more common task too, maybe that functionality already exists for marionette? ::: .gitignore @@ +52,5 @@ > .settings/ > > # Python virtualenv artifacts. > +bin/ > +_virtualenv/ Why is that change needed now? That seems like it should be at least in a different patch.

Georg Fritzsche [:gfritzsche]

Comment 6

•

10 years ago

Did the test plan look ok to you? Was there anything standing out or missing?

Andreas Tolfsen ❲:ato❳

Comment 7

•

10 years ago

If this test is meant to run in CI you will need to integrate it with structural logging (provided by the mozlog.structured module). It also seems to rely rather heavily on system specific characteristics (such as the path to profiles, binaries, &c.) which isn’t ideal. It spins up the pingserver on a static port, which might conflict with existing used ports on the system. It’s better to bind to port 0 and query the assigned port later when needed. There is use of local IO through a temporary folder that is removed after the program is done that doesn’t look safe at all. As mentioned earlier you should also consider reusing existing libraries in mozbase, such as for profile management, preferences writing, extension injection and so on. Generally the program should conform to the PEP8 coding style and not rely on hard coded waits because we need programs to be reproducible regardless of the system it runs on. I hope this is useful.

André Reinald

Assignee

Comment 8

•

10 years ago

(In reply to Andreas Tolfsen (:ato) from comment #7) > I hope this is useful. I first wanted to have a few tests coded and running, as a proof of concept that marionette could automate the test plan. I'm well aware that there is lots of room for improvement, and this will be part 2. I'll try to reach you on IRC, so I have more interactive help. Thanks Andreas.

André Reinald

Assignee

Comment 9

•

10 years ago

(In reply to Georg Fritzsche [:gfritzsche] from comment #6) > Did the test plan look ok to you? > Was there anything standing out or missing? I don't yet feel qualified enough to point out if there is anything missing in the test plan itself. You've all been working in this area much longer that myself, so I basically have to trust you. I guess that if there are inconsistencies remaining, they will show up while I'm "translating" English to Python.

André Reinald

Assignee

Comment 10

•

10 years ago

(In reply to Georg Fritzsche [:gfritzsche] from comment #5) > > # Python virtualenv artifacts. > > +bin/ > > +_virtualenv/ > > Why is that change needed now? > That seems like it should be at least in a different patch. bin is not needed. _virtualenv is for my own setup (if absent, we get a much larger patch). I will remove that from the final patch.

André Reinald

Assignee

Comment 11

•

9 years ago

Attached patch bugzilla-1168643-2.patch (obsolete) — Details — Splinter Review

WIP solved issues with xpi installation starting every test from a new profile tests are enumerated on command line added lz4 decompression and json parsing

Attachment #8613978 - Attachment is obsolete: true

André Reinald

Assignee

Comment 12

•

9 years ago

Attached patch bugzilla-1168643-3.patch (obsolete) — Details — Splinter Review

WIP Added sleep after launch to allow ping sending. Added relaunch to allow previously unsent pings. Added manual launch (FF + ping server, with same options). Testing ping coherence after automated and manual tests: assert (receivedPings + savedPings == archivedPings) It seems that in some cases (TBD) pings get lost if server is not running.

Attachment #8617300 - Attachment is obsolete: true

Alessio Placitelli [:Dexter]

Comment 13

•

9 years ago

(In reply to André Reinald from comment #12) > It seems that in some cases (TBD) pings get lost if server is not running. That's interesting, there may be a bug lurking around in our code there then. It would be great if the conditions triggering those cases could be pinned down!

André Reinald

Assignee

Comment 14

•

9 years ago

Attached patch bugzilla-1168643-4.patch (obsolete) — Details — Splinter Review

WIP: - integrated ping server - same file - different process -> different thread - clean up and factorization - split ARCH5 test into ARCH5 (no expiring pings) and ARCH6 (expiring pings). - implemented above + ARCH4.

Attachment #8623018 - Attachment is obsolete: true

André Reinald

Assignee

Comment 15

•

9 years ago

Attached patch bugzilla-1168643-5.patch (obsolete) — Details — Splinter Review

WIP: - code factorization - portable path constructions - more meaningful messages - more thorough tests - archiving of ping folders after each test for later examination ARCH1 is failing: - pings are sent and saved even when telemetry is disabled

Attachment #8630176 - Attachment is obsolete: true

André Reinald

Assignee

Comment 16

•

9 years ago

Attached patch bugzilla-1168643-6.patch (obsolete) — Details — Splinter Review

WIP: - GEN1 test failing: ping is sent on addon installation but it contains the addon (it should not)

Attachment #8635502 - Attachment is obsolete: true

Alessio Placitelli [:Dexter]

Comment 17

•

9 years ago

Thanks André! As we discussed over Vidyo: - ARCH test should not really test if pings get sent to the server or not. Moreover, due to the introduction of the "UNIFIED" telemetry pref the "toolkit.telemetry.enabled" pref behaves a bit differently from before. Please see the in-tree docs [1] for an up-to-date overview of the expected behaviour. - The QA document I wrote a while back contains the wrong preference name: "toolkit.telemetry.telemetry.enabled", but it really is just "toolkit.telemetry.enabled" - I'll check and manually confirm the GEN1 test. [1] - https://ci.mozilla.org/job/mozilla-central-docs/Tree_Documentation/toolkit/components/telemetry/telemetry/preferences.html

Alessio Placitelli [:Dexter]

Comment 18

•

9 years ago

(In reply to André Reinald from comment #16) > Created attachment 8636100 [details] [diff] [review] > bugzilla-1168643-6.patch > > WIP: > - GEN1 test failing: ping is sent on addon installation but it contains the > addon (it should not) I can't seem to be able to reproduce this issue: would you mind double checking that manually?

André Reinald

Assignee

Comment 19

•

9 years ago

Marionette stopped working. Python issue I guess. Will investigate this week (though I'm on PTO officially, I want to understand). I wanted to re-run the automated tests with unchanged delays see if they are the cause.

André Reinald

Assignee

Comment 20

•

9 years ago

Attached patch bugzilla-1168643-7.patch (obsolete) — Details — Splinter Review

WIP: - ARCH6 became obsolete because changes in ping sending policy (no more expiring) - GEN1 still failing - GEN3 added Fixed bugs and improved code. New reference doc at: https://gecko.readthedocs.org/en/latest/toolkit/components/telemetry/telemetry/index.html

Attachment #8636100 - Attachment is obsolete: true

Georg Fritzsche [:gfritzsche]

Comment 21

•

9 years ago

Comment on attachment 8643009 [details] [diff] [review] bugzilla-1168643-7.patch Review of attachment 8643009 [details] [diff] [review]: ----------------------------------------------------------------- This is coming along, nice. One major request that i have is to put focus on landing a minimal version of this in the tree first and make sure it can be run straight from it (i.e. no hard-coded locations or anything). That way we would be easily able to reproduce issues and could start using it as a sanity check before checkin etc. We could also start to add test cases too. Also, what do you think about breaking the work down into multiple bugs with smaller scope, to make it a bit more manageable? ::: toolkit/components/telemetry/tests/marionette/test-telemetry.py @@ +16,5 @@ > +import stat > +import string > + > + > +FIREFOX_BINARY_HOME_RELATIVE = os.path.join('Programmes', 'mozilla-central', 'obj-firefox.noindex', 'dist', 'Nightly.app', 'Contents', 'MacOS', 'firefox') Can we launch this from the obj dir, so we don't need to put Firefox in a special location? @@ +26,5 @@ > +USER_PROFILE_PATH = os.path.join(HOME_PATH, 'TelemetryTestProfile') > +SAVED_PINGS_PATH = os.path.join(USER_PROFILE_PATH, 'saved-telemetry-pings') > +DATAREPORTING_PATH = os.path.join(USER_PROFILE_PATH, 'datareporting') > +ARCHIVED_PINGS_PATH = os.path.join(DATAREPORTING_PATH, 'archived', date.today().strftime('%Y-%m')) > +XPI_PATH = os.path.join(HOME_PATH, 'Desktop', 'restartless.xpi') We generate a restartless.xpi for our unit tests, you could just use that one and avoid depending on paths on your machine. @@ +51,5 @@ > + ('setIntPref', 'toolkit.telemetry.min-subsession-length', '10'), > + ('setIntPref', 'toolkit.telemetry.tick.interval', '15'), > + ('setIntPref', 'toolkit.telemetry.tick.idle-interval', '120'), > + ('setIntPref', 'toolkit.telemetry.scheduler.midnight-tolerance', '900'), > + ('setIntPref', 'toolkit.telemetry.scheduler.coalesce-threshold', '120') Can you put the patch that implements those prefs up too? @@ +123,5 @@ > + else: > + plainData = postData > + > + jsonData = json.loads(plainData) > + fileName = jsonData["id"] + '.json' It would be nice to prefix the filenames with a running counter or the current timestamp. That way, if we ever need to inspect them manually, they naturally sort in received-order. @@ +187,5 @@ > +def readPing(fileName, compressed): > + file = open(fileName, 'rb') > + if compressed: > + file.seek(8) > + jsonData = lz4.decompress(file.read()) Ah, cool, its as simple as that? Good find. @@ +202,5 @@ > + pingId = pingJson['id'] > + diskPings[pingId] = pingJson > + return diskPings > + > +def nestedEqual(a, b): This is for deep-equal comparisons of objects? I think most functions could use a short comment on what they are supposed to do. @@ +226,5 @@ > + if isinstance(a, set): > + return a == b > + return a == b > + > +def altern(cond, valYes, valNo): We don't need that, Python has: valYes if cond else valNo @@ +253,5 @@ > + > +def findAddon(ping, label): > + addonFound = False > + try: > + addons = ping[u'payload'][u'info'][u'addons'].split(',') This is an old field in Telemetry that is left in for backwards compability for now. We should instead look at: ping.environment.addons.activeAddons/.activePlugins/etc. See: https://gecko.readthedocs.org/en/latest/toolkit/components/telemetry/telemetry/environment.html @@ +301,5 @@ > + > +def installXPI(client, xpiPath): > + print ' Installing xpi: ' + xpiPath > + client.set_context(client.CONTEXT_CHROME) > + scriptString = str(FILE_INSTANCE_JS + '\n' I think we should move all the JavaScript into a separate JS file, then import it using import_script: http://marionette-client.readthedocs.org/en/latest/reference.html#marionette_driver.marionette.Marionette.import_script Its pretty hard to read and maintain as embedded scripts.

Attachment #8643009 - Flags: feedback+

Georg Fritzsche [:gfritzsche]

Updated

•

9 years ago

Depends on: 1191324

André Reinald

Assignee

Comment 22

•

9 years ago

Attached patch bugzilla-1168643-8.patch — Details — Splinter Review

Not yet taking into account previous comments by Georg, but a step to make the script runnable on other machines (as talked with Stuart). Constant paths are read from environment vars.

Attachment #8643009 - Attachment is obsolete: true

André Reinald

Assignee

Comment 23

•

9 years ago

Attached file test-telemetry.sh — Details

Shell script which sets up environment vars. Explanations to setup in script comments.

André Reinald

Assignee

Comment 24

•

9 years ago

Attached patch telemetry-delays-prefs.patch (obsolete) — Details — Splinter Review

This patch allows telemetry delays to be adjusted by prefs rather than hardcoded. Current marionette telemetry tests rely on shorter delays than the default ones.

Attachment #8610896 - Attachment is obsolete: true

Georg Fritzsche [:gfritzsche]

Comment 25

•

9 years ago

Comment on attachment 8644447 [details] [diff] [review] telemetry-delays-prefs.patch Review of attachment 8644447 [details] [diff] [review]: ----------------------------------------------------------------- We should also document the prefs in toolkit/components/telemetry/docs/preferences.rst ... probably in a separate "testing" section. I've left some comments on changing pref names etc., but otherwise this looks good. ::: toolkit/components/telemetry/TelemetryController.jsm @@ +51,5 @@ > > const PING_FORMAT_VERSION = 4; > > // Delay before intializing telemetry (ms) > +const TELEMETRY_DELAY = Preferences.get("toolkit.telemetry.delay", 60) * 1000; This pref should be more explicit about what the delay is for, e.g.: toolkit.telemetry.initDelay ::: toolkit/components/telemetry/TelemetrySession.jsm @@ +43,5 @@ > > const ENVIRONMENT_CHANGE_LISTENER = "TelemetrySession::onEnvironmentChange"; > > const MS_IN_ONE_HOUR = 60 * 60 * 1000; > +const MIN_SUBSESSION_LENGTH_MS = Preferences.get("toolkit.telemetry.min-subsession-length", 10 * 60) * 1000; Pref-naming is usually camelCase. @@ +75,5 @@ > // Maximum number of content payloads that we are willing to store. > const MAX_NUM_CONTENT_PAYLOADS = 10; > > // Do not gather data more than once a minute > +const TELEMETRY_INTERVAL = Preferences.get("toolkit.telemetry.interval", 60) * 1000; This is odd and only used for the cycle collector - do we need to change it? @@ +80,2 @@ > // Delay before intializing telemetry (ms) > +const TELEMETRY_DELAY = Preferences.get("toolkit.telemetry.delay", 60) * 1000; toolkit.telemetry.initDelay @@ +82,4 @@ > // Delay before initializing telemetry if we're testing (ms) > const TELEMETRY_TEST_DELAY = 100; > // Execute a scheduler tick every 5 minutes. > +const SCHEDULER_TICK_INTERVAL_MS = Preferences.get("toolkit.telemetry.tick.interval", 5 * 60) * 1000; toolkit.telemetry.scheduler.tickInterval @@ +88,1 @@ > // The maximum number of times a scheduled operation can fail. toolkit.telemetry.scheduler.idleTickInterval @@ +88,5 @@ > // The maximum number of times a scheduled operation can fail. > const SCHEDULER_RETRY_ATTEMPTS = 3; > > // The tolerance we have when checking if it's midnight (15 minutes). > +const SCHEDULER_MIDNIGHT_TOLERANCE_MS = Preferences.get("toolkit.telemetry.scheduler.midnight-tolerance", 15 * 60) * 1000; toolkit.telemetry.scheduler.midnightTolerance @@ +93,4 @@ > > // Coalesce the daily and aborted-session pings if they are both due within > // two minutes from each other. > +const SCHEDULER_COALESCE_THRESHOLD_MS = Preferences.get("toolkit.telemetry.scheduler.coalesce-threshold", 2 * 60) * 1000; We don't need this - i killed the coalescing logic in bug 1187339, apparently i missed removing this const though. @@ +97,5 @@ > > // Seconds of idle time before pinging. > // On idle-daily a gather-telemetry notification is fired, during it probes can > // start asynchronous tasks to gather data. > +const IDLE_TIMEOUT_SECONDS = Preferences.get("toolkit.telemetry.idle.timeout", 5) * 60; toolkit.telemetry.idleTimeout

Attachment #8644447 - Flags: feedback+

Georg Fritzsche [:gfritzsche]

Updated

•

9 years ago

Blocks: 1122482

André Reinald

Assignee

Updated

•

9 years ago

Depends on: 1197292

André Reinald

Assignee

Comment 26

•

9 years ago

Comment on attachment 8644447 [details] [diff] [review] telemetry-delays-prefs.patch Attached updated telemetry-delays-prefs.patch to bug 1197292, and added dependency accordingly. Marking current attachment in this bug as obsolete.

Attachment #8644447 - Attachment is obsolete: true

Thomas Huelbert

Updated

•

9 years ago

Priority: -- → P3

Whiteboard: [measurement:client]

Alessio Placitelli [:Dexter]

Updated

•

9 years ago

Priority: P3 → P4

Georg Fritzsche [:gfritzsche]

Updated

•

9 years ago

No longer depends on: 1156718

Georg Fritzsche [:gfritzsche]

Comment 27

•

9 years ago

The work here was completed but efforts have moved to bug 1191324.

Blocks: 1247589
No longer blocks: 1122482

Status: ASSIGNED → RESOLVED

Closed: 9 years ago

No longer depends on: 1191324

Resolution: --- → WORKSFORME

Georg Fritzsche [:gfritzsche]

Updated

•

9 years ago

No longer depends on: 1197292

You need to log in before you can comment on or make changes to this bug.