Closed Bug 1560960 Opened 1 year ago Closed 1 year ago

Fix timeout handling for "test_servers()"

Categories

(Testing :: web-platform-tests, defect, P1)

69 Branch
defect

Tracking

(firefox69 fixed)

RESOLVED FIXED
mozilla69

People

(Reporter: whimboo, Assigned: jgraham)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

Within ensure_started(), the method test_servers() is called every 10ms:

https://searchfox.org/mozilla-central/rev/0b7007a23bc16c857f894140e12f307bfeef2fdd/testing/web-platform/tests/tools/wptrunner/wptrunner/environment.py#207-217

This is actually causing a crash on macOS, maybe because some socket has not been freed yet. When I increase this value to 1s, it all works fine. I think this is what we should do anyway; it doesn't make sense to run the check that often.

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000106da7a3a
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Segmentation fault: 11
Termination Reason:    Namespace SIGNAL, Code 0xb
Terminating Process:   exc handler [79388]

VM Regions Near 0x106da7a3a:
    MALLOC_LARGE           0000000106d96000-0000000106da7000 [   68K] rw-/rwx SM=COW  
--> 
    MALLOC_LARGE           0000000106e13000-0000000106e24000 [   68K] rw-/rwx SM=COW  

Application Specific Information:
crashed on child side of fork pre-exec

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_trace.dylib         	0x00007fff59fca90a _os_log_preferences_refresh + 76
1   libsystem_trace.dylib         	0x00007fff59fcb13d os_log_type_enabled + 627
2   libsystem_info.dylib          	0x00007fff59edc30f si_destination_compare_internal + 1023
3   libsystem_info.dylib          	0x00007fff59edbd3f si_destination_compare + 559
4   libsystem_info.dylib          	0x00007fff59eba6df _gai_addr_sort + 111
5   libsystem_c.dylib             	0x00007fff59e64e5b _isort + 193
6   libsystem_c.dylib             	0x00007fff59e64d88 _qsort + 2125
7   libsystem_info.dylib          	0x00007fff59eb1f2d _gai_sort_list + 781
8   libsystem_info.dylib          	0x00007fff59eb0885 si_addrinfo + 2021
9   libsystem_info.dylib          	0x00007fff59eaff77 _getaddrinfo_internal + 231
10  libsystem_info.dylib          	0x00007fff59eafe7d getaddrinfo + 61
11  _socket.so                    	0x0000000106cad74c setipaddr + 356

Also, because of the runtime of test_servers(), the loop is not actually aborted after 30s; it takes much longer.
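One way to make the 30s limit hold regardless of how long each check takes is to compare against a wall-clock deadline rather than relying on the sleep interval alone. The following is only a minimal sketch and not the code in environment.py: the helper name wait_until and its parameters are hypothetical, and check stands in for a call such as test_servers().

```python
import time


def wait_until(check, timeout=30.0, interval=1.0):
    """Poll check() until it returns True or `timeout` seconds have elapsed.

    Hypothetical helper for illustration only. The deadline is computed once
    from a monotonic clock, so time spent inside check() counts against the
    timeout as well as time spent sleeping.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))
```

With this shape the loop can still overrun the deadline by the duration of a single check() call, so the check itself also needs a bounded runtime.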

Another issue with the current approach is that the temporary socket created for the check doesn't set a timeout. This means that if the server on the specified port hasn't been started yet, the call to connect() will block forever. That is what I was facing with the crash, which actually stops Python from creating at least one of the test servers. To fix that, a socket timeout has to be set.
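A connection check with a timeout could look roughly like the sketch below. This is an illustration under assumptions, not the patch that landed: the function name port_is_open and the 1s default are made up here, and it assumes Python 3, where socket timeouts are reported as OSError subclasses.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds in time.

    Hypothetical helper for illustration only. The timeout bounds connect(),
    so an unresponsive port cannot block the caller indefinitely.
    """
    try:
        # create_connection() applies the timeout before connecting, and the
        # context manager closes the temporary socket again afterwards.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections as well as timeouts.
        return False
```

Each call then returns within roughly timeout seconds per address tried, which keeps the overall wait loop predictable.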

Pushed by james@hoppipolla.co.uk:
https://hg.mozilla.org/integration/autoland/rev/13f1ce87a10a
Fix waiting for wpt servers to start, r=whimboo
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/17488 for changes under testing/web-platform/tests
Upstream web-platform-tests status checks passed, PR will merge once commit reaches central.

The crash is actually coming from spawning the test server threads, and is covered by bug 1561224.

Assignee: hskupin → james
Severity: critical → normal
OS: macOS → All
Hardware: x86_64 → All
Summary: Repeatedly calling "test_servers()" each 10ms causes a crash on MacOS → Fix timeout handling for "test_servers()"
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla69
Upstream PR merged