Closed Bug 1542561 Opened 6 months ago Closed 4 months ago

Intermittent netwerk/test/unit/test_trr.js | test6 - [test6 : 84] 2152398878 == 0

Categories

(Core :: Networking: DNS, defect, P3)

Tracking

()

RESOLVED FIXED
mozilla69
Tracking Status
firefox-esr60 --- unaffected
firefox67 --- wontfix
firefox68 --- fixed
firefox69 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: valentin)

References

Details

(Keywords: intermittent-failure, regression, Whiteboard: [necko-triaged][trr][stockwell unknown])

Attachments

(2 files)

Filed by: csabou [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=238623057&repo=autoland

https://queue.taskcluster.net/v1/task/alhzoND0SUeC-yNfkOmgzQ/runs/0/artifacts/public/logs/live_backing.log

[task 2019-04-07T00:06:16.131Z] 00:06:16 INFO - TEST-START | services/sync/tests/unit/test_clients_engine.js
[task 2019-04-07T00:06:22.772Z] 00:06:22 INFO - TEST-PASS | services/sync/tests/unit/test_clients_engine.js | took 6644ms
[task 2019-04-07T00:06:22.780Z] 00:06:22 INFO - Retrying tests that failed when run in parallel.
[task 2019-04-07T00:06:22.787Z] 00:06:22 INFO - TEST-START | netwerk/test/unit/test_trr.js
[task 2019-04-07T00:06:23.239Z] 00:06:23 WARNING - TEST-UNEXPECTED-FAIL | netwerk/test/unit/test_trr.js | xpcshell return code: 0
[task 2019-04-07T00:06:23.239Z] 00:06:23 INFO - TEST-INFO took 451ms
[task 2019-04-07T00:06:23.240Z] 00:06:23 INFO - >>>>>>>
[task 2019-04-07T00:06:23.241Z] 00:06:23 INFO - (xpcshell/head.js) | test MAIN run_test pending (1)
[task 2019-04-07T00:06:23.242Z] 00:06:23 INFO - (xpcshell/head.js) | test run_next_test 0 pending (2)
[task 2019-04-07T00:06:23.243Z] 00:06:23 INFO - (xpcshell/head.js) | test MAIN run_test finished (2)
[task 2019-04-07T00:06:23.244Z] 00:06:23 INFO - running event loop
[task 2019-04-07T00:06:23.245Z] 00:06:23 INFO - "CONSOLE_MESSAGE: (info) No chrome package registered for chrome://branding/locale/brand.properties"
[task 2019-04-07T00:06:23.246Z] 00:06:23 INFO - netwerk/test/unit/test_trr.js | Starting setup
[task 2019-04-07T00:06:23.246Z] 00:06:23 INFO - (xpcshell/head.js) | test setup pending (2)
[task 2019-04-07T00:06:23.247Z] 00:06:23 INFO - PID 13186 | start!
[task 2019-04-07T00:06:23.247Z] 00:06:23 INFO - TEST-PASS | netwerk/test/unit/test_trr.js | setup - [setup : 16] "37043" != null

Following bug 1540656 it seems that bug 1465504 has morphed into this one.
As Daniel says in bug 1465504 comment 28 these are failures to get an AAAA response from the TRR h2 test server.
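
For readers wondering about the number in the summary: 2152398878 is 0x804B001E, which is the nsresult value of NS_ERROR_UNKNOWN_HOST, so the assertion "2152398878 == 0" means the lookup status was expected to be NS_OK but the resolver reported an unknown host. Below is a minimal sketch of how such a value can be split into its fields. The helper name `decode_nsresult` is made up for illustration, and the 0x45 module base offset and the network module number 6 are stated from memory of Gecko's nsError.h, so treat them as assumptions.

```python
# Hypothetical helper, not part of the test harness: split an nsresult
# into severity, module and code the way Gecko's NS_ERROR_GET_* macros do.
NS_ERROR_MODULE_BASE_OFFSET = 0x45  # assumption: value used in nsError.h


def decode_nsresult(value: int) -> dict:
    return {
        "hex": hex(value),
        "failed": bool(value >> 31),  # top bit set means failure
        "module": ((value >> 16) - NS_ERROR_MODULE_BASE_OFFSET) & 0x1FFF,
        "code": value & 0xFFFF,
    }


if __name__ == "__main__":
    print(decode_nsresult(2152398878))
```

Under those assumptions the value decodes to a failure in module 6 (networking) with code 30.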

Assignee: nobody → valentin.gosu
Blocks: DoH
Component: Networking → Networking: DNS
Priority: P5 → P3
Whiteboard: [necko-triaged][trr]
Duplicate of this bug: 1465504
Duplicate of this bug: 1539121

Over the last 7 days there are 31 failures present on this bug. These happen on linux32-shippable, linux64, linux64-qr, linux64-shippable, linux64-shippable-qr, windows10-64-shippable, windows7-32-shippable

Here is the most recent log example: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=241591552&repo=mozilla-central&lineNumber=2141

Flags: needinfo?(valentin.gosu)

There are 49 failures in the past 7 days happening mostly on linux64-shippable.

Recent log link: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=243164751&repo=mozilla-inbound&lineNumber=2171

[task 2019-04-28T09:44:01.729Z] 09:44:01 INFO - TEST-START | netwerk/test/unit/test_trr.js
[task 2019-04-28T09:44:02.169Z] 09:44:02 WARNING - TEST-UNEXPECTED-FAIL | netwerk/test/unit/test_trr.js | xpcshell return code: 0
[task 2019-04-28T09:44:02.171Z] 09:44:02 INFO - TEST-INFO took 436ms
[task 2019-04-28T09:44:02.171Z] 09:44:02 INFO - >>>>>>>
[task 2019-04-28T09:44:02.171Z] 09:44:02 INFO - (xpcshell/head.js) | test MAIN run_test pending (1)
[task 2019-04-28T09:44:02.172Z] 09:44:02 INFO - (xpcshell/head.js) | test run_next_test 0 pending (2)
[task 2019-04-28T09:44:02.174Z] 09:44:02 INFO - (xpcshell/head.js) | test MAIN run_test finished (2)
[task 2019-04-28T09:44:02.175Z] 09:44:02 INFO - running event loop

Valentin, can you take a look here?

Flags: needinfo?(valentin.gosu)

This test uses prefs added in Bug 1518730, but the pref is ignored when it
doesn't exist, so the test is still valid.

Depends on D33471

Pushed by valentin.gosu@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/71341d91163e
TRR: Don't return NS_ERROR_UNKNOWN_HOST when a AAAA response comes back first, but the second A response is NXDOMAIN r=dragana
https://hg.mozilla.org/integration/autoland/rev/584d67c324a8
Test that a IPv4 NXDOMAIN still uses the IPv6 response, regardless which one comes back first r=dragana
Status: NEW → RESOLVED
Closed: 4 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla69

Please nominate this for Beta approval when you get a chance.

Flags: needinfo?(valentin.gosu)
Flags: in-testsuite+

Comment on attachment 9069313 [details]
Bug 1542561 - TRR: Don't return NS_ERROR_UNKNOWN_HOST when a AAAA response comes back first, but the second A response is NXDOMAIN r=dragana

Beta/Release Uplift Approval Request

  • User impact if declined: Intermittent test failures.
    TRR might not work for IPv6 only hosts.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The patch makes sure we return the actual status code of the record we report, instead of the status of the latest response (which could be an error code).
    Low risk because it's covered by unit tests and TRR is not yet enabled by default.
  • String changes made/needed:
Flags: needinfo?(valentin.gosu)
Attachment #9069313 - Flags: approval-mozilla-beta?
Attachment #9069315 - Flags: approval-mozilla-beta?

Comment on attachment 9069313 [details]
Bug 1542561 - TRR: Don't return NS_ERROR_UNKNOWN_HOST when a AAAA response comes back first, but the second A response is NXDOMAIN r=dragana

DoH fix; approved for 68.0b8

If you'd care to humor me, I'm curious how returning NXDOMAIN for a rrtype and a non-empty response for a different rrtype but same name is even valid (as a non-transient situation, anyway).

Attachment #9069313 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9069315 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

(In reply to Julien Cristau [:jcristau] from comment #21)

Comment on attachment 9069313 [details]
If you'd care to humor me, I'm curious how returning NXDOMAIN for a rrtype and a non-empty response for a different rrtype but same name is even valid (as a non-transient situation, anyway).

It's possible for a domain to have only an IPv4 address or only an IPv6 address, without having the other.
When resolving example.com with TRR, we issue both an A and an AAAA request, and wait for both responses. If a domain only has IPv6 and the AAAA response comes back first, we save it. Then the A response comes back, but with an error result. In this case we want to use the AAAA response, but we mistakenly used the status code from the A response, which was interpreted as a failure.
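
To make the ordering problem concrete, here is a rough model of the behaviour described above. It is only a sketch and not the actual patch: the real code is C++ in the TRR implementation, and the names `Response`, `resolve_buggy` and `resolve_fixed` are invented for this illustration. The only value taken from the bug is NS_ERROR_UNKNOWN_HOST (0x804B001E).

```python
from dataclasses import dataclass, field
from typing import List, Tuple

NS_OK = 0
NS_ERROR_UNKNOWN_HOST = 0x804B001E


@dataclass
class Response:
    """One TRR answer (hypothetical model, not a Gecko type)."""
    rrtype: str                      # "A" or "AAAA"
    status: int                      # NS_OK or an error code
    addresses: List[str] = field(default_factory=list)


def resolve_buggy(responses: List[Response]) -> Tuple[int, List[str]]:
    """Old behaviour: report the status of whichever response came last."""
    addresses: List[str] = []
    status = NS_ERROR_UNKNOWN_HOST
    for resp in responses:           # list order models arrival order
        if resp.status == NS_OK:
            addresses.extend(resp.addresses)
        status = resp.status         # overwritten by the latest response
    return status, addresses


def resolve_fixed(responses: List[Response]) -> Tuple[int, List[str]]:
    """Fixed behaviour: the status matches the record that is reported."""
    addresses: List[str] = []
    first_error = NS_ERROR_UNKNOWN_HOST
    saw_error = False
    for resp in responses:
        if resp.status == NS_OK:
            addresses.extend(resp.addresses)
        elif not saw_error:
            first_error, saw_error = resp.status, True
    # Succeed if any usable record was saved, whatever arrived afterwards.
    return (NS_OK, addresses) if addresses else (first_error, addresses)


if __name__ == "__main__":
    aaaa_first = [
        Response("AAAA", NS_OK, ["2001:db8::1"]),
        Response("A", NS_ERROR_UNKNOWN_HOST),
    ]
    print(resolve_buggy(aaaa_first))
    print(resolve_fixed(aaaa_first))
```

With the AAAA answer arriving first, the buggy variant returns the error status even though it holds a valid IPv6 address, while the fixed variant returns NS_OK. If the A error arrives first, both variants succeed, which matches the failure being intermittent.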

There are conflicts for the second revision. Does bug 1552886 (and anything else) also need to be uplifted?

Flags: needinfo?(valentin.gosu)

(In reply to Sebastian Hengst [:aryx] (needinfo on intermittent or backout) from comment #23)

There are conflicts for the second revision. Does bug 1552886 (and anything else) also need to be uplifted?

Oh right, it depends on some changes made in bug 1552886. I don't think we need to uplift that (although we can).
I don't think the test needs to be uplifted either. attachment 9069313 [details] should fix the intermittent test failures in beta anyway.
attachment 9069315 [details] is just a more thorough test that exercises the code path that used to fail.

Flags: needinfo?(valentin.gosu)