Existing FOG testing prefs aren't working out
Categories
(Toolkit :: Telemetry, defect, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox96 | --- | fixed |
People
(Reporter: chutten, Assigned: chutten)
References
(Blocks 1 open bug)
Details
Attachments
(1 file)
The existing behaviour of "If datareporting.healthreport.uploadEnabled
is false
FOG neither records nor reports data" isn't gonna work out for testing.
In order for instrumentation tests to work we must support recording and forbid reporting/uploading. The current way we support this is by requiring d.h.uploadEnabled
to be true
and a telemetry.fog.test.localhost_port
set to a non-0 value (-1
for "pretend everything's a success without touching the network", any positive value for "connect to a local server on the provided port").
There are too many systems in Firefox that treat a true
value of d.h.uploadEnabled
as license to send data. Many, like PingCentre have their own enabled prefs we could whackamole our way through... but this would be a lot of effort to try and plaster over the reality that d.h.uploadEnabled
isn't just for Telemetry any more (if it ever was).
I assert that this means we need to support instrumentation tests in situations where d.h.uploadEnabled
is false
, and propose we address this by allowing FOG to enable Glean when d.h.uploadEnabled
is false
IF AND ONLY IF telemetry.fog.test.localhost_port
is set to -1
.
( I thought about proposing it for any non-0
port, but the risk of letting any data escape firefox even to a local server when the user's stated preference is for no data reporting doesn't seem worth it. )
Assignee | ||
Comment 1•3 years ago
|
||
ni?raphael and Jan-Erik for their input. Please feel free to needinfo anyone else you think should be consulted.
My intent is to put a telemetry.fog.test.localhost_port
of -1
next to every test suite that sets datareporting.healthreport.uploadEnabled
to false
and change FOG to treat that -1
as a reason to enable upload. This means FOG'll also need to query d.h.uploadEnabled
wherever it currently queries t.f.t.localhost_port
so nothing accidentally escapes.
Assignee | ||
Comment 2•3 years ago
|
||
Oh, and any time t.f.t.localhost_port
goes from -1
to any other value it should disable and re-enable Glean so we wipe storage. I'm not sure it'll matter in any real situations, but we should try our best to keep any data that was recorded-but-not-reported from being reported under any circumstances.
Comment 3•3 years ago
|
||
That does sound like a reasonable way out of this.
I agree being careful and wiping storage on changes is good and shouldn't be too much additional work either, we already have a pref observer.
Comment 4•3 years ago
|
||
(In reply to Chris H-C :chutten from comment #0)
I assert that this means we need to support instrumentation tests in situations where
d.h.uploadEnabled
isfalse
, and propose we address this by allowing FOG to enable Glean whend.h.uploadEnabled
isfalse
IF AND ONLY IFtelemetry.fog.test.localhost_port
is set to-1
.
Not sure if this will help you, but you might also consider using xpc::IsInAutomation()
which asserts that non-local connections are disabled.
Assignee | ||
Comment 5•3 years ago
|
||
Comment 6•3 years ago
|
||
Hi :chutten,
I'm back from PTO now. Thanks for the ni. This is an interesting problem!
If I understand correctly there are at least three different scenarios:
- Firefox and FOG are running outside of automated tests (This is the default mode)
- Firefox and FOG are running in integration tests in which we verify recording metrics, submitting pings and sending HTTP requests with the pings
- FOG is running in instrumentation tests in which FOG consumers verify that feature code records the correct metrics and submits the correct pings
This translates to the following preferences values and behavior in FOG with your proposed changes:
- The
telemetry.fog.test.localhost_port
pref is set to0
and FOG is configured to usehttps://incoming.telemetry.mozilla.org
as the ping server. Firefox respects the user's choice and will only send pings ifdatareporting.healthreport.uploadEnabled
istrue
- The
telemetry.fog.test.localhost_port
pref is set to the port of the HTTP server running onlocalhost
, meaning that FOG is configured to usehttp://localhost:port
as the ping server anddatareporting.healthreport.uploadEnabled
is set totrue
in the test runner. It's required that Firefox (or pingsender) sends the pings to that server so that we can verify the JSON contents of the HTTP request. - The
telemetry.fog.test.localhost_port
pref is set to-1
which instructs FOG to tell Glean thatdatareporting.healthreport.uploadEnabled
istrue
regardless of its actual value, so that we can inspect the recorded metrics and submitted pings in Glean, but FOG will not send any of these stored pings as HTTP requests
Is my understanding correct?
Integration test prefs for reference:
Assignee | ||
Comment 7•3 years ago
|
||
You are correct in all three cases, Raphael... with one very minor nit:
FOG doesn't
tell Glean that
datareporting.healthreport.uploadEnabled
istrue
regardless of its actual value
Because Glean doesn't know anything about prefs (Prefs being a very Firefox-specific tech). It is more correct to say that, in the event that telemetry.fog.test.localhost_port
is -1
, FOG calls set_upload_enabled
with true
regardless of datareporting.healthreport.uploadEnabled
value even though FOG makes sure that no uploads reach the network.
(( it would be even more correct to say that, if the localhost_port pref is -1
before init then the Glean SDK is configured with upload_enabled
of true
regardless of d.h.uploadEnabled
, and then use the words in the paragraph above if the pref changes values after the Glean SDK is init'd... but now we're getting more correct while becoming harder to understand)
Comment 8•3 years ago
|
||
Thank you for clarifying that, :chutten. That sounds reasonable to me.
Comment 10•3 years ago
|
||
Backed out for causing failures on test_FOGPrefs.js. CLOSED TREE
Backout link : https://hg.mozilla.org/integration/autoland/rev/ff765823111f8ead2e1edb3e847b48ab82649273
Push with failures: https://treeherder.mozilla.org/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception%2Crunnable&revision=3ded83469acf9277c9d829720fb7030953867e03
Link to failure log :
https://treeherder.mozilla.org/logviewer?job_id=357414332&repo=autoland&lineNumber=1804
https://treeherder.mozilla.org/logviewer?job_id=357412761&repo=autoland&lineNumber=2410
Assignee | ||
Comment 11•3 years ago
|
||
Turns out if you turn on data upload, data might be uploaded. Whoops.
Think I just need to set toolkit.telemetry.server
to a local address and we'll be sorted. Got it up here on try to find out.
Comment 12•3 years ago
|
||
Assignee | ||
Comment 14•3 years ago
|
||
Well that's weird, I tested that on try
and it came back clean. I wonder what was different on autoland
.
Comment 15•3 years ago
|
||
Comment 16•3 years ago
|
||
Backed out changeset edefcf1ad9a5 (Bug 1737157) for causing xpcshell failures on test_FOGPrefs.js.
Backout link
Push with failures
Failure Log
Comment 17•3 years ago
|
||
Assignee | ||
Comment 18•3 years ago
|
||
I don't understand. This should be impossible.
I need to take a closer look.
Comment 19•3 years ago
|
||
Comment 20•3 years ago
|
||
Backed out for causing xpc failures in glean/tests/xpcshell/test_FOGPrefs.js
Backout link: https://hg.mozilla.org/integration/autoland/rev/3a3c8e058d2f25ef7e8df30b825781539b4fa16b
Assignee | ||
Comment 21•3 years ago
|
||
ARGH. But it succeeded when I ran it here. What in the world is going on.
Jan-Erik, could you please take a look at this? I've run out of ways my mind can conceive of this going wrong. And I've run out of ways to test this without pushing it.
Comment 22•3 years ago
|
||
Comment 23•3 years ago
|
||
bugherder |
Updated•3 years ago
|
Description
•