Closed Bug 1547683 Opened 6 years ago Closed 5 years ago

Handle negative durations in pre-account ping

Categories

(Toolkit :: Telemetry, defect, P3)

defect

Tracking

()

RESOLVED WONTFIX

People

(Reporter: janerik, Unassigned)

References

Details

As a followup, with the new data we have, we should make a decision on how to handle negative durations.

Some options:

  1. Don't handle them, but document behavior.
  2. Catch them client-side and submit 0? And maybe also increase an error counter?

Meanwhile: see if it is an bug of the underlying timestamp mechanism and fix that instead.

Assignee: nobody → jrediger
Priority: -- → P1

Analysis notebook: https://dbc-caf9527b-e073.cloud.databricks.com/#notebook/106375/

Example data:

{'duration': -922334468, 'new': u'-922334467857.4777', 'old': u'0', 'os': u'Windows_NT'},

In words: The internal implementation of monotonicNow[^1] is giving us a negative value.
Notable: Negative durations only from Windows.

[^1]: That is: TimeStamp::NowLoRes() - TimeStamp::ProcessCreation()

I have the suspicion that we are getting INT64_MIN (-9223372036854775808) or INT64_MAX (9223372036854775807) somewhere internally. That value looks very similar to the reported new/duration in its higher digits.

:chutten's suggestion to split up this work:

  1. Fix pre-account durations (because that's the bug)
  2. Figure out how deeply this affects Telemetry code
  3. Bring in a Windows/Timestamp expert to weigh in on what this might be and might mean for Firefox at large

The whole call stack for the timestamps:

  1. TelemetryUtils.monotonicNow
  2. TelemetryImpl::MsSinceProcessStart
  3. TelemetryCommon::MsSinceProcessStart
  4. TimeStamp::NowLoRes() - TimeStamp::ProcessCreation() (in TimeStamp.cpp)

:nfroyd, :wlach told me he talked to you about this a while ago. Do you happen to know this part of the code and know what's going on or should I try finding someone else?

Flags: needinfo?(nfroyd)

(In reply to Jan-Erik Rediger [:janerik] from comment #4)

The whole call stack for the timestamps:

  1. TelemetryUtils.monotonicNow
  2. TelemetryImpl::MsSinceProcessStart
  3. TelemetryCommon::MsSinceProcessStart
  4. TimeStamp::NowLoRes() - TimeStamp::ProcessCreation() (in TimeStamp.cpp)

:nfroyd, :wlach told me he talked to you about this a while ago. Do you happen to know this part of the code and know what's going on or should I try finding someone else?

I remember having discussions about this issue. Searches in my inbox don't seem to be turning up a lot, but bug 1204823 looks relevant to this discussion? It doesn't seem quite like the case you have here, though...

Flags: needinfo?(nfroyd)

Is this actively worked on?
Is it blocking anything currently?

Flags: needinfo?(jrediger)

(wrong comment in wrong bug)

Flags: needinfo?(jrediger)

It's not actively worked on. Not sure if this is actually blocking things. I'm also out of ideas on how to further tackle this. Unassigning and moving to triage.

Assignee: jrediger → nobody
Priority: P1 → --

The priority flag is not set for this bug.
:gfritzsche, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(gfritzsche)

Currently the ecosystem work is paused until dependencies are cleared.
Moving this to P3, we can move it back to active work later.

Flags: needinfo?(gfritzsche)
Priority: -- → P3

Bulk action: Closing down old Ecosystem Telemetry bug tree. Long live Accounts Ecosystem Telemetry!

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.