Open Bug 1920369 Opened 1 month ago Updated 5 days ago

Intermittent /webdriver/tests/bidi/network/before_request_sent/before_request_sent.py | test_url_with_fragment - AssertionError

Categories

(Remote Protocol :: WebDriver BiDi, defect, P5)

Tracking

(Not tracked)

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: intermittent-failure)

Filed by: abutkovits [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=475360372&repo=try
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/MXeFzyINTDiF6E4a5SpXkw/runs/0/artifacts/public/logs/live_backing.log


[task 2024-09-22T12:20:40.789Z] 12:20:40     INFO - TEST-PASS | /webdriver/tests/bidi/network/before_request_sent/before_request_sent.py | test_serviceworker_request 

The assertion that is failing here is the following:

[task 2024-09-22T12:20:40.819Z] 12:20:40     INFO -     def _(actual: Any) -> None:
[task 2024-09-22T12:20:40.819Z] 12:20:40     INFO -         any_number(actual)
[task 2024-09-22T12:20:40.820Z] 12:20:40     INFO - >       assert start <= actual <= end
[task 2024-09-22T12:20:40.820Z] 12:20:40     INFO - E       AssertionError
[task 2024-09-22T12:20:40.820Z] 12:20:40     INFO - 
[task 2024-09-22T12:20:40.820Z] 12:20:40     INFO - actual     = 1727007639646
[task 2024-09-22T12:20:40.820Z] 12:20:40     INFO - end        = 1727007639678
[task 2024-09-22T12:20:40.820Z] 12:20:40     INFO - start      = 1727007639649

That's a change that comes from bug 1916685.

But it's strange. Could some other network event be received before the expected one, causing this failure? This failure has occurred only twice so far, so it's not really a priority for now.

Depends on: 1916685
Flags: needinfo?(jdescottes)

Could some other network event be received before the expected one, causing this failure?

Sounds unlikely. This is the only test where we hit https://web-platform.test:8443/webdriver/tests/bidi/network/support/empty.html#foo and the URL looks correct in the event payload.

The timing we see here comes from nsITimedChannel's channelCreationTime.
Maybe the channel we end up using here was created earlier and recycled? I am wondering if using asyncOpenTime would be a better fit.
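The recycled-channel hypothesis can be sketched as a timeline. All values below are made up for illustration; the point is only that a pooled channel's creation time can predate the window the test records, while its asyncOpen time for this request would still fall inside it:

```python
# Illustrative timeline, in ms since epoch (values are hypothetical):
channel_creation = 1727007639646  # channel created earlier and pooled
test_start       = 1727007639649  # test records `start`
async_open       = 1727007639655  # recycled channel is asyncOpen'd for this request
test_end         = 1727007639678  # test records `end`

# channelCreationTime predates the test window...
assert not (test_start <= channel_creation <= test_end)
# ...while asyncOpenTime falls inside it.
assert test_start <= async_open <= test_end
```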

Valentin, do you know which channel timing would be the closest to fetch's start time? channelCreationTime or asyncOpenTime (or something else)?

Flags: needinfo?(jdescottes) → needinfo?(valentin.gosu)

Based on https://www.w3.org/TR/navigation-timing/#dom-performancetiming-fetchstart

fetchStart attribute
If the new resource is to be fetched using HTTP GET or equivalent, fetchStart must return the time immediately before the user agent starts checking any relevant application caches. Otherwise, it must return the time when the user agent starts fetching the resource.

I think that maybe asyncOpenTime would be the one we should be using?

But I'm not 100% sure if that's the problem. Considering it's the test_serviceworker_request test that is failing, and these comments in the IDL:
https://searchfox.org/mozilla-central/rev/466db50bfdddc209f068e281e552a2583d82f958/netwerk/base/nsITimedChannel.idl#40-45

// These properties should only be written externally when they must be
// propagated across an internal redirect.  For example, when a service
// worker interception falls back to network we need to copy the original
// timing values to the new nsHttpChannel.
[noscript] attribute TimeStamp channelCreation;
[noscript] attribute TimeStamp asyncOpen;

I'm thinking the mismatch may be caused by the fact that we copy some of the timestamps over to the service worker timings.
I think we do that here

Flags: needinfo?(valentin.gosu)

Thanks!

it's the test_serviceworker_request test that is failing

The bug summary is a bit confusing, but I think all the failures so far have been with test_url_with_fragment. I will try to see if asyncOpenTime helps (not sure if it's easy to trigger on try though)

Depends on: 1921712
See Also: 1921712

It's a bit hard to tell whether using asyncOpenTime actually fixes the issue, but I had two try pushes with many retriggers. It failed once with the current baseline, with the following assertion failure:

[task 2024-09-30T09:12:54.193Z] 09:12:54     INFO - >       assert start <= actual <= end
[task 2024-09-30T09:12:54.193Z] 09:12:54     INFO - E       AssertionError
[task 2024-09-30T09:12:54.193Z] 09:12:54     INFO - 
[task 2024-09-30T09:12:54.193Z] 09:12:54     INFO - actual     = 1727687567415
[task 2024-09-30T09:12:54.193Z] 09:12:54     INFO - end        = 1727687567451
[task 2024-09-30T09:12:54.193Z] 09:12:54     INFO - start      = 1727687567417

1727687567415 < 1727687567417, a 2 ms difference. Whereas with the patch using asyncOpenTime, the only failure was:

[task 2024-09-30T11:48:34.436Z] 11:48:34     INFO - >       assert start <= actual <= end
[task 2024-09-30T11:48:34.436Z] 11:48:34     INFO - E       AssertionError
[task 2024-09-30T11:48:34.436Z] 11:48:34     INFO - 
[task 2024-09-30T11:48:34.436Z] 11:48:34     INFO - actual     = 1727696907751.183
[task 2024-09-30T11:48:34.436Z] 11:48:34     INFO - end        = 1727696907794
[task 2024-09-30T11:48:34.437Z] 11:48:34     INFO - start      = 1727696907752

1727696907751.183 < 1727696907752, less than 1 ms of difference; could it be a ceil/rounding issue?
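That sub-millisecond failure would be consistent with a rounding mismatch: if the event timestamp keeps its fractional milliseconds while the recorded start is rounded up to a whole millisecond, actual can land just below start. A small sketch of that effect (purely illustrative; this is not the actual harness code):

```python
import math

# Suppose both clocks read the same instant, with fractional ms:
raw = 1727696907751.183

actual = raw            # event payload keeps the fraction
start = math.ceil(raw)  # recorded bound stored as a whole millisecond

# The comparison now fails, but by less than 1 ms:
assert actual < start
assert start - actual < 1
```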

Depends on: 1922390