Closed Bug 1425055 Opened 6 years ago Closed 5 years ago

subsession_start_date not compatible with presto's `from_iso8601_timestamp`

Categories

(Data Platform and Tools :: General, defect, P3)

defect
Points:
2

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: amiyaguchi, Unassigned)

References

Details

The subsession start date (and creation date) will break the from_iso8601_timestamp function. The subsession start date contains a time zone designator "[+|-]hh" outside of the [-12, 12] range.[1]

[1] https://sql.telemetry.mozilla.org/queries/49895/source
Timezones can be outside of the [-12, 12] range AFAIK:
https://en.wikipedia.org/wiki/Time_zone#List_of_UTC_offsets

Do you suspect a client issue here? Are there actual invalid timezones submitted?
Or do we need a presto fix?
Or dataset normalization as a workaround?
Flags: needinfo?(amiyaguchi)
There are timezones out of those UTC offsets (+17, -19, etc). The workaround during query time is to use `date_parse(substr(___))`, but something in main_summary might be worthwhile to avoid the hassle involved with different date/time formats.
Flags: needinfo?(amiyaguchi)
These are probably timezones clients are legitly reporting (from a Telemetry perspective), so i would suggest that we fix them up for derived datasets if needed.
This would be good to document though for people that are interested in the real offsets clients see.
Component: Telemetry → Datasets: General
Product: Toolkit → Data Platform and Tools
Points: --- → 2
Priority: -- → P1
Looks like the plan is to clean the data flowing into main_summary, so moving this bug to that component.
Component: Datasets: General → Datasets: Main Summary
Priority: P1 → P3
See Also: → 1456264
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX
Component: Datasets: Main Summary → General
You need to log in before you can comment on or make changes to this bug.