Closed Bug 1749356 Opened 3 years ago Closed 3 years ago

Default-browser-agent with “1” in document_type space should be dropped in MessageScrubber

Categories

(Data Platform and Tools :: General, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: akomar, Assigned: wstuckey)

References

Details

(Whiteboard: [dataquality])

This came up in the missing doctypes/versions section of the Platform Health Check: https://mozilla.cloud.looker.com/dashboards/387?Submission+Date=90+day

Assignee: nobody → akomarzewski

This would involve a change in https://github.com/mozilla/gcp-ingestion/blob/main/ingestion-beam/src/main/java/com/mozilla/telemetry/decoder/MessageScrubber.java and adding an associated test.

We would throw UnwantedDataException in the case that document_type = "1" and document_namespace = "default-browser-agent". See https://github.com/mozilla-services/mozilla-pipeline-schemas for more context on these concepts.
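To make the proposed check concrete, here is a minimal sketch of the condition described above. It is not the actual MessageScrubber code: the class name `ScrubberSketch`, the `scrub` signature, and the locally defined exception class are all hypothetical stand-ins, and the attribute map simply assumes the namespace and doctype are available as string attributes.

```java
import java.util.Map;

// Hypothetical, simplified stand-in for MessageScrubber; all names are illustrative.
class ScrubberSketch {

  /** Local stand-in for the pipeline's exception type (assumed, not the real class). */
  static class UnwantedDataException extends RuntimeException {
    UnwantedDataException(String bugNumber) {
      super("Dropped as unwanted data, see bug " + bugNumber);
    }
  }

  /** Throws if the attributes match the namespace/doctype pair we want to drop. */
  static void scrub(Map<String, String> attributes) {
    String namespace = attributes.get("document_namespace");
    String docType = attributes.get("document_type");
    if ("default-browser-agent".equals(namespace) && "1".equals(docType)) {
      throw new UnwantedDataException("1749356");
    }
  }

  public static void main(String[] args) {
    // A message with any other doctype passes through untouched.
    scrub(Map.of("document_namespace", "default-browser-agent",
        "document_type", "some-valid-doctype"));

    // The bogus doctype "1" is rejected.
    try {
      scrub(Map.of("document_namespace", "default-browser-agent",
          "document_type", "1"));
    } catch (UnwantedDataException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

An associated test would cover both branches: one message that matches the pair and is dropped, and one in the same namespace with a different doctype that is kept.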

I'm reassigning this to :wil as a good first bug to gain some familiarity with the platform.

Assignee: akomarzewski → wstuckey

Wil found that we already handle this case. See https://bugzilla.mozilla.org/show_bug.cgi?id=1626020 and this block in MessageScrubber. So these should already be filtered out, which means there's something we're missing.

See Also: → 1626020

We discard this by throwing AffectedByBugException. There's a comment a few lines above mentioning that these exceptions should appear in monitoring. This is consistent with the query that "fired" this alarm.

I think that, since this bug has been fixed in the client, we can switch the exception to UnwantedDataException. This will make it disappear from the monitoring dashboard.
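The difference between the two exception classes can be sketched as follows. This is only an illustration of the behavior described in the comments above, under the assumption that AffectedByBugException drops are surfaced to monitoring while UnwantedDataException drops are not. The class `DropRoutingSketch`, its fields, and the `handle` method are hypothetical, and the real routing in gcp-ingestion may work differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of how the exception class could decide
// whether a dropped message shows up in monitoring.
class DropRoutingSketch {

  /** Local stand-in: data dropped because of a known client bug. */
  static class AffectedByBugException extends RuntimeException {
    AffectedByBugException(String bugNumber) {
      super("Affected by bug " + bugNumber);
    }
  }

  /** Local stand-in: data dropped because it is simply not wanted. */
  static class UnwantedDataException extends RuntimeException {
    UnwantedDataException(String bugNumber) {
      super("Unwanted data, see bug " + bugNumber);
    }
  }

  /** Drops that would be visible on a dashboard in this sketch. */
  final List<String> monitored = new ArrayList<>();

  /** Drops that are counted but never surfaced. */
  int silentlyDropped = 0;

  void handle(RuntimeException e) {
    if (e instanceof UnwantedDataException) {
      silentlyDropped++;
    } else {
      monitored.add(e.getMessage());
    }
  }

  public static void main(String[] args) {
    DropRoutingSketch routing = new DropRoutingSketch();
    routing.handle(new AffectedByBugException("1626020"));
    routing.handle(new UnwantedDataException("1626020"));
    System.out.println("monitored: " + routing.monitored);
    System.out.println("silently dropped: " + routing.silentlyDropped);
  }
}
```

Under that assumption, changing only the exception class thrown for this namespace/doctype pair is enough to keep dropping the messages while removing them from the alerting query.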

The exception class has been switched to UnwantedDataException in https://github.com/mozilla/gcp-ingestion/pull/1998

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Whiteboard: [data-quality] → [dataquality]