Closed Bug 1634064 Opened 4 years ago Closed 4 years ago

Implement the ping tagging system (for pings generated in automation)

Categories

(Data Platform and Tools :: Glean: SDK, task, P1)

task

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Dexter, Assigned: Dexter)

References

Details

Attachments

(4 files)

For end to end testing, we need to be able to tag our pings in a way that makes it easy to tell if some ping was generated in automation or not, for the same application id.

This might also be used by the perf team to run Fenix on CI and submit tagged telemetry to our pipeline, in order to capture telemetry from builds on CI.

Priority: -- → P3
Whiteboard: [telemetry:glean-rs:m?]
Priority: P3 → P2
Whiteboard: [telemetry:glean-rs:m?]
Assignee: nobody → alessio.placitelli
Priority: P2 → P1

Chris, Raphael and Mike, would you kindly review the proposal in comment 1?

Flags: needinfo?(rpierzina)
Flags: needinfo?(mdroettboom)
Flags: needinfo?(chutten)

Whoops, I forgot about Frank :-) Can you kindly review the proposal in comment 1?

Flags: needinfo?(fbertsch)

Feedback provided in the document. Tracking it there.

Flags: needinfo?(chutten)

Also added feedback in doc.

Flags: needinfo?(fbertsch)

Read the doc -- I basically agree with :chutten and :frank's concerns, so looking forward to reading v2 :)

Flags: needinfo?(mdroettboom)

Feedback provided in the document.

Flags: needinfo?(rpierzina)
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Hey Frank, Raphael, Mike

I went over this document again and would appreciate your feedback on it.

@Frank, I've changed the parts on the SDK side after our last meeting, but it's unclear to me if we plan on doing something on the pipeline/data views. Nothing is strictly required for Raphael's use-case, because burnham currently sends stuff to its own tables, but things may be different in the future when released applications will be tested.

Flags: needinfo?(rpierzina)
Flags: needinfo?(mdroettboom)
Flags: needinfo?(fbertsch)

Read through. Only had one minor and mostly ignorable comment.

Flags: needinfo?(mdroettboom)

Hey Alessio, I added my suggestions to the document.

Flags: needinfo?(rpierzina)

Hi Alessio, I just added a new section to the document describing two concerns around tagging on Android and testing the staging Data Platform.

Flags: needinfo?(fbertsch)

Ok, the proposal was reviewed. This is now ready for implementation.

Summary: Enable tagging pings generated in automation → Implement the ping tagging system (for pings generated in automation)
See Also: → 1650798

This is stuck again: we need to find a way to enable tagging for tests running on Android, since we can't necessarily use the glean debug activity to run tests on CI.

The reason why we can't use the GleanDebugActivity is that some perf tests require to be launched with a specific activity, if we launch them with the GleanDebugActivity they will fail. Unfortunately, we can't set environment variables on Android devices (apparently, that's not a thing on Android!), so we need to work around this problem.

One potential solution we found is to enable starting other activities after executing the GleanDebugActivity. This would mean, for example, executing the tests as:

adb shell am start -n [applicationId]/mozilla.telemetry.glean.debug.GleanDebugActivity --es tagPings automation --es nextActivity PerfTestActivity

Michael, would this be helpful?

Flags: needinfo?(michael.l.comella)

Hey Andrew, I have a few questions with respect to bug 1648183 comment 7, since I believe I found a solution :-)

You said:

(In reply to Andrew Creskey [:acreskey] [he/him] from comment #7)

(In reply to Alessio Placitelli [:Dexter] from comment #6)

(In reply to Andrew Creskey [:acreskey] [he/him] from comment #5)

As far as I can tell, it won't be possible for us to set an environment variable that can be read on the android device.
However we do have a hook for setting performance options in Fenix (e.g. disabling certain pop-ups which interfere with visual metrics).
We could in theory add the additional glean metric here.

@Andrew, Ah. How does that work? Can you provide me a link to that piece of code?

:Dexter, this is the function in fenix that prepares it for performance tests:
https://github.com/mozilla-mobile/fenix/blob/82facbfa00b4a701b6919d618df3ee6459bc74b7/app/src/main/java/org/mozilla/fenix/perf/Performance.kt#L26-L33

This is enable by an additional intent argument, an example use in a new test is here:
https://searchfox.org/mozilla-central/rev/a87a1c3b543475276e6d57a7a80cb02f3e42b6ed/testing/performance/hooks_home_activity.py#15-17

I don't actually see this option being passed in for the general CI performance tests, but it shouldn't be hard to add it.

I think that the reason why starting the tests using the GleanDebugActivity has failed is that Glean does not propagate any extra option that is passed to the main Activity that gets launched, and that's why these options don't get processed and make the perf test fail when running in tagged mode.

To make sure that's the problem (and it's easily fixable on our end!), I'd need to know what the final adb command that start the test looks like when all these options are passed. This would also allow me to test this locally. Any chance you could help me with that or point me to who could?

Flags: needinfo?(michael.l.comella) → needinfo?(acreskey)

(In reply to Alessio Placitelli [:Dexter] from comment #14)
...

I think that the reason why starting the tests using the GleanDebugActivity has failed is that Glean does not propagate any extra option that is passed to the main Activity that gets launched, and that's why these options don't get processed and make the perf test fail when running in tagged mode.

To make sure that's the problem (and it's easily fixable on our end!), I'd need to know what the final adb command that start the test looks like when all these options are passed. This would also allow me to test this locally. Any chance you could help me with that or point me to who could?

:Dexter, yes, because we launch Fenix into various activities for performance tests, we would need to ability to tell the GleanDebugActivity which activity to proceed to next, as you mentioned in Comment 13.

Depending on the test being run, there are different calls.

• Browsertime pageload
This one boils down to this:
adb shell am start -d "about:home" -a android.intent.action.VIEW org.mozilla.fenix.nightly/org.mozilla.fenix.IntentReceiverActivity --ez performancetest true

• mach perftest (VIEW)
This is can be run from the latest mozilla-central.
You can customize the android-install-apk to be a local apk
./mach perftest --flavor mobile-browser --android --android-capture-logcat logcat --output artifacts --android-app-name org.mozilla.fenix.nightly --android-activity org.mozilla.fenix.IntentReceiverActivity --android-clear-logcat --android-perf-tuning --perfherder-metrics processLaunchToNavStart --android-install-apk https://firefox-ci-tc.services.mozilla.com/api/index/v1/task/project.mobile.fenix.v2.nightly.2020.07.16.latest/artifacts/public/build/armeabi-v7a/geckoNightly/target.apk --hooks testing/performance/hooks_applink.py --perfherder testing/performance/perftest_applink.js --browsertime-iterations 5

But it also boils down to
adb shell am start -d {some url} -a android.intent.action.VIEW org.mozilla.fenix.nightly/org.mozilla.fenix.IntentReceiverActivity --ez performancetest true

For the MAIN / home activity tests, they start like this:
adb shell am start -n org.mozilla.fenix.nightly/.App --ez performancetest true

As a less disruption solution would it not be possible to set the pingTags to automation in this function in Fenix on launch?
https://github.com/mozilla-mobile/fenix/blob/82facbfa00b4a701b6919d618df3ee6459bc74b7/app/src/main/java/org/mozilla/fenix/perf/Performance.kt#L26-L33
Or is it already too late at that point?

Flags: needinfo?(acreskey)

(In reply to Andrew Creskey [:acreskey] [he/him] from comment #15)

But it also boils down to
adb shell am start -d {some url} -a android.intent.action.VIEW org.mozilla.fenix.nightly/org.mozilla.fenix.IntentReceiverActivity --ez performancetest true

For the MAIN / home activity tests, they start like this:
adb shell am start -n org.mozilla.fenix.nightly/.App --ez performancetest true

This is of great help! Thank you so much!

As a less disruption solution would it not be possible to set the pingTags to automation in this function in Fenix on launch?
https://github.com/mozilla-mobile/fenix/blob/82facbfa00b4a701b6919d618df3ee6459bc74b7/app/src/main/java/org/mozilla/fenix/perf/Performance.kt#L26-L33
Or is it already too late at that point?

I'm afraid this would work for this specific class of tests, but not for other classes nor for products other than Fenix :-)
It's ok to fix this, I think the fact that we don't propagate intent options is a bug on the Glean SDK side.

Blocks: 1654197

(In reply to Andrew Creskey [:acreskey] [he/him] from comment #15)

The PRs in this bug were merged and we're bubbling up the updates to Fenix. It will then be possible to change the commands as follows:

• Browsertime pageload
This one boils down to this:
adb shell am start -d "about:home" -a android.intent.action.VIEW org.mozilla.fenix.nightly/org.mozilla.fenix.IntentReceiverActivity --ez performancetest true

adb shell am start -d "about:home" -a android.intent.action.VIEW -n org.mozilla.fenix.nightly/mozilla.telemetry.glean.debug.GleanDebugActivity --esa sourceTags automation, browsertime-pageload --es startNext org.mozilla.fenix.IntentReceiverActivity --ez performancetest true

• mach perftest (VIEW)
But it also boils down to
adb shell am start -d {some url} -a android.intent.action.VIEW org.mozilla.fenix.nightly/org.mozilla.fenix.IntentReceiverActivity --ez performancetest true

adb shell am start -d {some url} -a android.intent.action.VIEW -n org.mozilla.fenix.nightly/mozilla.telemetry.glean.debug.GleanDebugActivity --esa sourceTags automation, mach-perftest --es startNext org.mozilla.fenix.IntentReceiverActivity --ez performancetest true

For the MAIN / home activity tests, they start like this:
adb shell am start -n org.mozilla.fenix.nightly/.App --ez performancetest true

adb shell am start -n org.mozilla.fenix.nightly/mozilla.telemetry.glean.debug.GleanDebugActivity --esa sourceTags automation, main-home --es startNext .App --ez performancetest true

Note that nothing changes too much: we're only telling adb to start the GleanDebugActivity, which tags to apply and which activity to start next.

@Andrew, to keep on fixing this, I need to ask you a few more questions:

  1. Where should the commands be changed?
  2. Are tests running only on the latest org.mozilla.fenix.nightly? If so, this means we should land the changes to the test runner as soon as a new Nightly is available.
Flags: needinfo?(acreskey)

:Dexter, these commands are setup in a few places:

  1. I believe that the activity for Fenix used in the browsertime pageload tests (including the live runs that affect telemetry) is set here:
    https://searchfox.org/mozilla-central/rev/3b6958c26049c1e27b2790a43154caaba9f6dd4a/taskcluster/ci/test/browsertime-mobile.yml#51
    I'm not sure where the additional arguments will go (I'm not that familiar with the CI browsertime solution).
    And yes, I believe that these only test org.mozilla.fenix.nightly
    cc: davehunt

  2. mach perftest (VIEW)

  3. mach perftest (MAIN /Home activity)
    These are both defined in
    https://searchfox.org/mozilla-central/rev/3b6958c26049c1e27b2790a43154caaba9f6dd4a/taskcluster/ci/perftest/android.yml
    These measure time elapsed from process start to either navigationStart in the content process or else the HomeActivity being rendered.
    It won't be hard to try adding the new parameters -- but if this changes the results significantly then the test isn't useful anymore.

Note: tests 2 and 3 run once per day and only 14 iterations each. This seems pretty minor to me.

Flags: needinfo?(dave.hunt)

:sparky can you answer the questions related to where these changes would be needed and what we're running against?

:dexter would there need to be further work for any other tests running against Fenix? We also have YouTube media playback tests, resource tests, and benchmarks such as speedometer. Could we have a common tag rather than have distinct tags such as 'browsertime-pageload', 'mach-perftest', and 'main-home'? These as they are could be misleading (for example 'main-home' is also run using mach perftest).

Flags: needinfo?(gmierz2)
Flags: needinfo?(dave.hunt)
Flags: needinfo?(alessio.placitelli)

(In reply to Dave Hunt [:davehunt] [he/him] ⌚BST from comment #22)

:dexter would there need to be further work for any other tests running against Fenix? We also have YouTube media playback tests, resource tests, and benchmarks such as speedometer.

Yes, any custom run of Fenix needs to be fixed.

Could we have a common tag rather than have distinct tags such as 'browsertime-pageload', 'mach-perftest', and 'main-home'? These as they are could be misleading (for example 'main-home' is also run using mach perftest).

Oh, sure! We don't need the second tag at all, we only require automation. Any other tag might be helpful if we want to understand which CI generated which telemetry.

Flags: needinfo?(alessio.placitelli)

(In reply to Andrew Creskey [:acreskey] [he/him] from comment #21)

Note: tests 2 and 3 run once per day and only 14 iterations each. This seems pretty minor to me.

Agreed, but we'd still want to fix everything and have the cleanest possible data, unless this is a ton of work :)

(In reply to Alessio Placitelli [:Dexter] from comment #23)

(In reply to Dave Hunt [:davehunt] [he/him] ⌚BST from comment #22)

:dexter would there need to be further work for any other tests running against Fenix? We also have YouTube media playback tests, resource tests, and benchmarks such as speedometer.

Yes, any custom run of Fenix needs to be fixed.

I'm 99% certain that we only need to modify tests that use live internet.
So none of the tests that run against the recorded http archives need to be changed. (Since they can't be hitting the telemetry backend due to their https proxy to mitmproxy).

(In reply to Alessio Placitelli [:Dexter] from comment #24)

(In reply to Andrew Creskey [:acreskey] [he/him] from comment #21)

Note: tests 2 and 3 run once per day and only 14 iterations each. This seems pretty minor to me.

Agreed, but we'd still want to fix everything and have the cleanest possible data, unless this is a ton of work :)
I can do a run with the generic perf tests (applink and MAIN/Home_activity) and the modified activity/arguments.
My concern (unproven!), is that launching into a different activity will significantly change the results. (We will see).
An alternative here is to make these test run against mitmproxy archives as well.

(In reply to Andrew Creskey [:acreskey] [he/him] from comment #26)

(In reply to Alessio Placitelli [:Dexter] from comment #23)

(In reply to Dave Hunt [:davehunt] [he/him] ⌚BST from comment #22)

:dexter would there need to be further work for any other tests running against Fenix? We also have YouTube media playback tests, resource tests, and benchmarks such as speedometer.

Yes, any custom run of Fenix needs to be fixed.

I'm 99% certain that we only need to modify tests that use live internet.
So none of the tests that run against the recorded http archives need to be changed. (Since they can't be hitting the telemetry backend due to their https proxy to mitmproxy).

Oh, yes, good point.

I can do a run with the generic perf tests (applink and MAIN/Home_activity) and the modified activity/arguments.
My concern (unproven!), is that launching into a different activity will significantly change the results. (We will see).

This is good to verify. Please keep me posted.

An alternative here is to make these test run against mitmproxy archives as well.

Can you MITM the telemetry endpoint and make it just respond HTTP 200?

Intent arguments would need to be added here: https://searchfox.org/mozilla-central/source/testing/raptor/raptor/browsertime/android.py#74-79
Activity changes are done where acreskey pointed to.

Flags: needinfo?(gmierz2)

:Dexter, I tried this with today's Fenix nightly (July 14th) but it would not launch the app using the mozilla.telemetry.glean.debug.GleanDebugActivity

This binary:
https://firefox-ci-tc.services.mozilla.com/api/index/v1/task/mobile.v2.fenix.nightly.latest.armeabi-v7a/artifacts/public/build/armeabi-v7a/geckoNightly/target.apk

This command:
adb shell am start -d https://www.mozilla.org -a android.intent.action.VIEW org.mozilla.fenix/mozilla.telemetry.glean.debug.GleanDebugActivity --esa automation, mach-perftest --es startNext org.mozilla.fenix.IntentReceiverActivity --ez performancetest true

This was the output:
Starting: Intent { act=android.intent.action.VIEW dat=https://www.mozilla.org/... cmp=org.mozilla.fenix/mozilla.telemetry.glean.debug.GleanDebugActivity }
(But the application was only briefly visible)

adb logs here: https://paste.rs/uHr

Flags: needinfo?(acreskey) → needinfo?(alessio.placitelli)

(In reply to Andrew Creskey [:acreskey] [he/him] from comment #29)

This command:
adb shell am start -d https://www.mozilla.org -a android.intent.action.VIEW org.mozilla.fenix/mozilla.telemetry.glean.debug.GleanDebugActivity --esa automation, mach-perftest --es startNext org.mozilla.fenix.IntentReceiverActivity --ez performancetest true

Sorry, my bad, I typed in the wrong commands in comment 20 (I have now fixed them). This command is missing the -n switch before the debug activity name and the property name for the --esa option. The correct command is this one:

adb shell am start -d https://www.mozilla.org -a android.intent.action.VIEW -n org.mozilla.fenix/mozilla.telemetry.glean.debug.GleanDebugActivity --es startNext org.mozilla.fenix.IntentReceiverActivity --ez performancetest true --esa sourceTags automation, mach-perftest

You can convince yourself this is working correctly by additionally adding a --es debugViewTag andrew-glean-1 and then checking in the Glean Debug View for andrew-glean-1 after the app is moved to background.

Would you kindly confirm this works for you?
Any chance you might be able to help me out with the changes to the test runners?

Flags: needinfo?(alessio.placitelli) → needinfo?(acreskey)
Blocks: 1655477

Closing this as the implementation landed and is correctly working. See bug 1650798 for follow-up work to filter telemetry.

Blocks: 1650798
Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
Flags: needinfo?(acreskey)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: