Closed Bug 1896106 Opened 29 days ago Closed 26 days ago

Marionette's processhandler for psutil should not wait for reader thread when calling `wait()` from within the `poll()` method (causing a 1s extra delay when starting the session)

Categories

(Testing :: Marionette Client and Harness, defect, P2)

defect
Points:
2

Tracking

(firefox-esr115 unaffected, firefox125 unaffected, firefox126 unaffected, firefox127 fixed, firefox128 fixed)

RESOLVED FIXED
128 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox125 --- unaffected
firefox126 --- unaffected
firefox127 --- fixed
firefox128 --- fixed

People

(Reporter: whimboo, Assigned: whimboo)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression, Whiteboard: [webdriver:m11])

Attachments

(2 files, 1 obsolete file)

When running Marionette tests on MacOS these days it can be seen that there is a 1s delay between each and every test. After doing some investigation I can trace this back to using psutil. This feature was added over on bug 1884401 for Firefox 127.

Specifically the delay comes from calling this line in marionette.py:

returncode = self.instance.runner.returncode

And that then drills down to waiting for the reader thread to join in Marionette's own process handler for psutil:

https://searchfox.org/mozilla-central/rev/8e885f04a0a4ff6d64ea59741c10d9b8e45d9ff8/testing/marionette/client/marionette_driver/processhandler.py#352

After comparing to the mozprocess implementation I can see that this one doesn't call into wait() for poll(), which means that we never try to join the reader threads and that the poll() method returns immediately.

Actually the poll() method is not available when the process handler is observing an external process like Firefox after a restart. That means we have to continue using wait() but potentially should not wait for the reader thread to have joined.

Summary: Marionette's processhandler for psutil should not call `wait()` from the `poll()` method (causing a 1s extra delay when starting the session) → Marionette's processhandler for psutil should not wait for reader thread when calling `wait()` from within the `poll()` method (causing a 1s extra delay when starting the session)
Blocks: 1892940
Points: --- → 2
Priority: -- → P2
Whiteboard: [webdriver:m11]
Pushed by hskupin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/78ea7783f16f
[marionette] Don't wait for reader thread when polling psutil process. r=webdriver-reviewers,jgraham
Status: ASSIGNED → RESOLVED
Closed: 26 days ago
Resolution: --- → FIXED
Target Milestone: --- → 128 Branch

The patch landed in nightly and beta is affected.
:whimboo, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox127 to wontfix.

For more information, please visit BugBot documentation.

Flags: needinfo?(hskupin)

Comment on attachment 9401088 [details]
Bug 1896106 - [marionette] Don't wait for reader thread when polling psutil process.

Beta/Release Uplift Approval Request

  • User impact if declined: Users of Marionette on MacOS will expect a 1s delay at the beginning of each test that gets run, which can accumulate quite a lot of time
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Detecting if a process is running doesn't need to wait for the readers of stdout and stderr to wait. This patch just removed these parts and made it equal to what subprocess is running.
  • String changes made/needed: No
  • Is Android affected?: No
Flags: needinfo?(hskupin)
Attachment #9401088 - Flags: approval-mozilla-beta?

Perfherder has detected a awsy performance change from push 78ea7783f16f564aa8886dd8e51b42c2971c508b.

Improvements:

Ratio Test Platform Options Absolute values (old vs new)
10% Images macosx1015-64-shippable-qr fission tp6 6,642,001.27 -> 5,979,633.87

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 221

For more information on performance sheriffing please see our FAQ.

Keywords: perf-alert
Attachment #9402207 - Flags: approval-mozilla-beta?

beta Uplift Approval Request

  • User impact if declined: Users of Marionette on MacOS will expect a 1s delay at the beginning of each test that gets run, which can accumulate quite a lot of time
  • Code covered by automated testing: yes
  • Fix verified in Nightly: yes
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: None
  • Risk associated with taking this patch: Low
  • Explanation of risk level: Detecting if a process is running doesn't need to wait for the readers of stdout and stderr to wait. This patch just removed these parts and made it equal to what subprocess is running.
  • String changes made/needed: No
  • Is Android affected?: no
Attachment #9402206 - Attachment is obsolete: true
Attachment #9401088 - Flags: approval-mozilla-beta?
Attachment #9402207 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: