Closed Bug 1571219 Opened 5 years ago Closed 5 years ago

Intermittent dom/performance/tests/test_worker_user_timing.html | X === Y : measure start time should be equal to the mark startTime

Categories

(Core :: Performance, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla72
Tracking Status
firefox-esr68 --- unaffected
firefox67 --- unaffected
firefox68 --- unaffected
firefox69 --- unaffected
firefox70 --- wontfix
firefox71 --- fixed
firefox72 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: tjr)

References

(Regression)

Details

(Keywords: intermittent-failure, regression)

Attachments

(2 files)

Filed by: ncsoregi [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=259779563&repo=mozilla-inbound
Full log: https://queue.taskcluster.net/v1/task/YlkN7TsbTz2p7FU1SmlgUg/runs/0/artifacts/public/logs/live_backing.log


[task 2019-08-03T14:56:23.394Z] 14:56:23 INFO - 3367 INFO TEST-PASS | dom/performance/tests/test_worker_user_timing.html | true: Running measure addition with only start time test
[task 2019-08-03T14:56:23.394Z] 14:56:23 INFO - Buffered messages finished
[task 2019-08-03T14:56:23.394Z] 14:56:23 INFO - 3368 INFO TEST-UNEXPECTED-FAIL | dom/performance/tests/test_worker_user_timing.html | 32.46 === 32.48: measure start time should be equal to the mark startTime
[task 2019-08-03T14:56:23.394Z] 14:56:23 INFO - SimpleTest.ok@SimpleTest/SimpleTest.js:275:18
[task 2019-08-03T14:56:23.394Z] 14:56:23 INFO - worker.onmessage@dom/performance/tests/test_worker_user_timing.html:21:7
[task 2019-08-03T14:56:23.394Z] 14:56:23 INFO - 3369 INFO TEST-PASS | dom/performance/tests/test_worker_user_timing.html | true: measure duration should not be negative

Summary: Intermittent dom/performance/tests/test_worker_user_timing.html | 32.46 === 32.48: measure start time should be equal to the mark startTime → Intermittent dom/performance/tests/test_worker_user_timing.html | X === Y : measure start time should be equal to the mark startTime

test_worker_user_timing.html continues to fail both here and in bug 1571099, having about 30 failures/week.

Entire history looks like this: https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2019-06-10&endday=2019-10-08&tree=trunk&bug=1571219

https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2019-06-10&endday=2019-10-08&tree=trunk&bug=1571099

Geoff I have created a disable patch, do you think we should try another way here?

Flags: needinfo?(gbrown)

This test has been failing for a long time and fails frequently; I agree it should be fixed or disabled very soon. Let's ni the triage owner and test author (from bug 1155761) to see if there is interest in fixing this.

Blocks: 1155761
Flags: needinfo?(vchin)
Flags: needinfo?(gbrown)
Flags: needinfo?(amarchesini)
See Also: → 1571099

Thank you.
@sheriffs please don't land the patch until there is a reply from the developers.

I think this is something that should be fixed or at least it shouldn't be something that we can simply disable and leave it there.

The test uses performance.mark API to create a mark, and performance.measure API is used here to calculate the elapsed time between marks. The test fails because the measurement's startTime doesn't equal to the startTime of the starting mark. So given the definition of performance.measure API , I'd expect those 2 timestamps to be the same.

My 2 cents guess is, given the fact that the failing startTime differences were small (32.46 === 32.48), and the test sets privacy.reduceTimerPrecision pref to false explicitly, I'd expect the pref has something to do for this failure.

Assignee: nobody → sefeng

By looking at this a bit more, I wonder it might be related to the performance marker timer precision changes we made in https://phabricator.services.mozilla.com/D39969.

Tom, what do you think?

Flags: needinfo?(tom)

Yes. I believe that patch is now double reducing different timestamps and needs to be fixed. I'll try to get to it in the next week or two.

Assignee: sefeng → tom
Flags: needinfo?(vchin)
Flags: needinfo?(tom)
Flags: needinfo?(amarchesini)
Priority: P5 → P2
Regressed by: 1387894

Tom should we disable this until there is a fix?

Flags: needinfo?(tom)

Sure; sorry. I'll try to get to it soon though.

Flags: needinfo?(tom)
Pushed by tritter@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8b72f1465217
Refactor PerformanceMeasure to avoid double-reducing timestamps r=baku
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla72

Please nominate this for Beta uplift when you get a chance.

Comment on attachment 9103375 [details]
Bug 1571219 - Refactor PerformanceMeasure to avoid double-reducing timestamps r?baku

Beta/Release Uplift Approval Request

  • User impact if declined: We'll get intermittent test failures. Web Platform impact is zero or near-zero.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: High
  • Why is the change risky/not risky? (and alternatives if risky): I'm going to mark this as High; because while this code is covered by automated tests; it's entirely possible this will create a new intermittent. (I don't think it will, of course, otherwise I wouldn't have landed it - but that's what I thought about the patch that caused this intermittent.) The whole situation we have with accidently double-reducing timestamps and causing intermittents is pretty brittle, and in hindsight a better architecture should have been chosen.

I think the chance of crashes or web platform problems are zero or close to zero. But the chance of accidentally causing a new intermittent, seems historically to be high. So maybe we want to wait on landing this for some time until we see if any new intermittent pop up on central.

  • String changes made/needed:
Flags: needinfo?(tom)
Attachment #9103375 - Flags: approval-mozilla-beta?

(In reply to Tom Ritter [:tjr] (needinfo for responses to sec-[approval/ratings/advisories/cve's]) from comment #28)

  • Risk to taking this patch: High
  • Why is the change risky/not risky? (and alternatives if risky): I'm going to mark this as High; because while this code is covered by automated tests; it's entirely possible this will create a new intermittent. (I don't think it will, of course, otherwise I wouldn't have landed it - but that's what I thought about the patch that caused this intermittent.) The whole situation we have with accidently double-reducing timestamps and causing intermittents is pretty brittle, and in hindsight a better architecture should have been chosen.

To be clear, is this a user-facing risk or one only affecting automation?

Flags: needinfo?(tom)

Just automation.

Flags: needinfo?(tom)

Comment on attachment 9103375 [details]
Bug 1571219 - Refactor PerformanceMeasure to avoid double-reducing timestamps r?baku

Intermittent fix, no risk for end users but may cause other automation failures, approved for 71 beta 6, if it causes more intermittent failures than it fixes, we will back it out.

Attachment #9103375 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: