Closed Bug 1528061 Opened 9 months ago Closed 4 months ago

./mach wpt stalls due to install_windows_font() never returning

Categories

(Testing :: web-platform-tests, defect)

Version 3
Unspecified
Windows 10
defect
Not set

Tracking

(firefox70 fixed)

RESOLVED FIXED
mozilla70
Tracking Status
firefox70 --- fixed

People

(Reporter: bryce, Assigned: bryce)

Details

Attachments

(1 file)

When running ./mach wpt, mach stalls and the web-platform-tests are never run. If I add logging to the Python code involved, I'm able to trace the issue to this line never returning. I'm able to work around it by short-circuiting the font install, with (seemingly) no problems.

STR:

  • ./mach wpt testing/web-platform/tests/encrypted-media/
  • I get some output, but then the process appears to freeze. Example output below:
 0:00.18 INFO Skipping manifest download because existing file is recent
 0:10.19 mozversion INFO application_buildid: 20190214083728
 0:10.19 mozversion INFO application_changeset: f0ea53f47215d2449c7e759fa36aa76863ced3ff
 0:10.19 mozversion INFO application_display_name: Nightly
 0:10.19 mozversion INFO application_id: {ec8030f7-c20a-464f-9b0e-13a3a9e97384}
 0:10.19 mozversion INFO application_name: Firefox
 0:10.19 mozversion INFO application_remotingname: firefox
 0:10.19 mozversion INFO application_vendor: Mozilla
 0:10.19 mozversion INFO application_version: 67.0a1
 0:10.19 mozversion INFO platform_buildid: 20190214083728
 0:10.19 mozversion INFO platform_changeset: f0ea53f47215d2449c7e759fa36aa76863ced3ff
 0:10.19 mozversion INFO platform_version: 67.0a1
 0:10.85 INFO Using 1 client processes

I've waited ~5 minutes to test if it's just running slowly, but no new output is produced.

If I replace the problematic line with a return True, the test run progresses as expected.

I don't believe this is a result of bug 1522696, as my problems with this preceded the changes there (and I've just been slack about reporting this until now).

It sounds like this is likely a result of SendMessageW broadcasting to all top-level windows and one of them not responding. I therefore closed various windows to see whether that would remedy the problem and give an indication of what the issue is. The problem appeared to be resolved when closing vscode, which seems like an unlikely culprit. In addition, the issue does not recur when vscode is reopened. I closed Visual Studio prior to vscode, and it seems more plausible that I had a zombie Firefox debug process hanging off of Visual Studio that was eating messages, and that I did not wait long enough after killing Visual Studio before trying the tests again.
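To illustrate the suspected failure mode, here is a minimal, platform-independent simulation of a blocking broadcast versus a notify-style broadcast. It does not call any Win32 API: `FakeWindow`, `blocking_broadcast` and `notify_broadcast` are hypothetical names invented for this sketch, and the timeout exists only so the demo terminates (the real SendMessageW has no timeout, which is why one unresponsive window can stall the caller indefinitely).

```python
import queue
import threading


class FakeWindow:
    """Stand-in for a top-level window with a message queue (not a real HWND)."""

    def __init__(self, name, pumps_messages=True):
        self.name = name
        self.inbox = queue.Queue()
        if pumps_messages:
            # A responsive window runs a loop that handles queued messages.
            threading.Thread(target=self._pump, daemon=True).start()

    def _pump(self):
        while True:
            _message, handled = self.inbox.get()
            handled.set()  # signal the sender that the message was processed


def blocking_broadcast(windows, message, timeout):
    """Send to each window in turn and wait for it to handle the message.

    Returns the names of windows that did not respond within `timeout`.
    """
    stalled = []
    for window in windows:
        handled = threading.Event()
        window.inbox.put((message, handled))
        if not handled.wait(timeout):
            stalled.append(window.name)
    return stalled


def notify_broadcast(windows, message):
    """Queue the message for every window and return without waiting."""
    for window in windows:
        window.inbox.put((message, threading.Event()))
    return len(windows)


windows = [FakeWindow("terminal"), FakeWindow("hung", pumps_messages=False)]
print(blocking_broadcast(windows, "WM_FONTCHANGE", timeout=0.2))
print(notify_broadcast(windows, "WM_FONTCHANGE"))
```

In this model the blocking variant is held up by the window that never pumps its queue, while the notify variant returns as soon as the messages are queued.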

Now that I understand what causes this, I'll try to keep an eye out to see if I can isolate the culprit.

I have encountered this same issue today. It was reproduced after performing a clean reboot with no windows other than task tray items and a terminal open.

Adding the "return True" on the problematic line enabled the tests to continue for me also.

I've run into this a few more times and it's still not clear what the culprit is.

My workaround in comment 0 is kinda janky. I hadn't suggested further fixes since I wasn't sure if this was impacting anybody else and I wanted to find the culprit. However, we could use SendNotifyMessageW instead of SendMessageW, which would result in a non-blocking call.

I've got a bit of a backlog to clear, but will look into cooking up a patch for this. Throwing up NI on myself so I don't forget. If anyone else wants this before I get to it, please clear my NI.

Flags: needinfo?(bvandyk)

It's also possible that this codepath will disappear since there's ongoing work to remove the requirement to install Ahem. But a patch to use an async method seems desirable anyway.

(In reply to James Graham [:jgraham] from comment #4)

It's also possible that this codepath will disappear since there's ongoing work to remove the requirement to install Ahem. But a patch to use an async method seems desirable anyway.

Sounds good. Putting up a patch to mitigate this in the meantime, since I'm running into it again.


For posterity this is a useful reference should anyone come across this and want to know more.

Flags: needinfo?(bvandyk)

This changes the wptrunner to use SendNotifyMessageW instead of SendMessageW
when installing and removing fonts on Windows. The difference between these calls
is that SendNotifyMessageW will not block on each window processing the message
while SendMessageW will. This addresses an issue where the wptrunner would stall
if a window was not processing the font added or removed message.
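
A rough sketch of what such a non-blocking font install could look like with ctypes follows. This is not the code from the patch: the function name `install_font_nonblocking` is hypothetical, and the early return on non-Windows platforms is an assumption made so the snippet can be loaded anywhere. The Win32 calls (`AddFontResourceW`, `SendNotifyMessageW`) and the constants `HWND_BROADCAST` and `WM_FONTCHANGE` are the standard ones.

```python
import ctypes
import sys

HWND_BROADCAST = 0xFFFF  # pseudo-handle addressing all top-level windows
WM_FONTCHANGE = 0x001D   # tells windows that the set of fonts has changed


def install_font_nonblocking(font_path):
    """Add a font for the current session and announce it without blocking.

    Returns True if the font was added and the notification was sent,
    False otherwise (including on non-Windows platforms).
    """
    if sys.platform != "win32":
        return False  # assumption for this sketch: silently do nothing elsewhere

    from ctypes import wintypes

    gdi32 = ctypes.WinDLL("gdi32")
    user32 = ctypes.WinDLL("user32")

    # AddFontResourceW returns the number of fonts added, 0 on failure.
    gdi32.AddFontResourceW.argtypes = [wintypes.LPCWSTR]
    gdi32.AddFontResourceW.restype = ctypes.c_int
    if gdi32.AddFontResourceW(font_path) == 0:
        return False

    # Unlike SendMessageW, SendNotifyMessageW does not wait for windows
    # owned by other threads to process the message.
    user32.SendNotifyMessageW.argtypes = [
        wintypes.HWND, wintypes.UINT, wintypes.WPARAM, wintypes.LPARAM,
    ]
    user32.SendNotifyMessageW.restype = wintypes.BOOL
    return bool(user32.SendNotifyMessageW(HWND_BROADCAST, WM_FONTCHANGE, 0, 0))
```

Removing the font would follow the same pattern with `RemoveFontResourceW` in place of `AddFontResourceW`. The trade-off is that the caller no longer knows when, or whether, every window has processed the change.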

Pushed by bvandyk@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cde6d92a2230
wptrunner sends non-blocking message when handling fonts on Windows. r=jgraham
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/17793 for changes under testing/web-platform/tests
Upstream web-platform-tests status checks passed, PR will merge once commit reaches central.
Status: NEW → RESOLVED
Closed: 4 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla70
Assignee: nobody → bvandyk
Upstream PR merged