1450938 - [meta] Intermittent "Automation Error: mozprocess timed out after 1000 seconds running"

Reporter

Description

•

7 years ago

Let's have a meta bug for all of the known mozprocess timeout issues after 1000 seconds of runtime. Maybe it gives us a chance to figure out the underlying issue.

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 1

•

7 years ago

All the "Automation Error" failures for mozprocess actually come from mozharness:

https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/base/script.py#1400

Nothing in that file that would be related to the mozprocess timeout issues has changed in the last couple of months. But something that comes to mind is the updates to the copy of mozprocess used by mozharness. Maybe one of those caused a regression:

https://hg.mozilla.org/mozilla-central/log/c44f60c43432d468639b5fe078420e60c13fd3de/testing/mozharness/mozprocess/processhandler.py

I wonder if we should forward the maximum execution timeout of 1000s to the harnesses themselves, or have more fine-grained timeouts for them. That should help us figure out whether the hang is inside the harness code itself or occurs while the harness is controlling the Firefox binary.

Geoff, what do you think?
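To illustrate what more fine-grained timeouts could look like, here is a minimal standard-library sketch that reports separately whether a child process exceeded its total runtime or simply stopped producing output. It is not the mozharness or mozprocess implementation; `run_with_timeouts` and its status strings are hypothetical names chosen for this example.

```python
import subprocess
import sys
import threading
import time


def run_with_timeouts(cmd, total_timeout, output_timeout):
    """Run cmd and return (status, lines).

    status is "ok", "total-timeout" (ran too long overall) or
    "output-timeout" (no output for output_timeout seconds).
    Hypothetical helper for illustration only.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    lines = []
    last_output = [time.monotonic()]

    def reader():
        # Drain the pipe continuously so the child never blocks on a full buffer.
        for line in proc.stdout:
            lines.append(line.rstrip("\n"))
            last_output[0] = time.monotonic()

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    start = time.monotonic()
    status = "ok"
    while proc.poll() is None:
        now = time.monotonic()
        if now - start > total_timeout:
            status = "total-timeout"
        elif now - last_output[0] > output_timeout:
            status = "output-timeout"
        if status != "ok":
            proc.kill()
            break
        time.sleep(0.05)

    proc.wait()
    thread.join(timeout=5)
    return status, lines


if __name__ == "__main__":
    print(run_with_timeouts([sys.executable, "-c", "print('hi')"], 10, 10))
```

With two distinct statuses, a log could show whether the process was still producing output when it was killed, which is the distinction the 1000 second catch-all does not make.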

Flags: needinfo?(gbrown)

Summary: [meta] Intermittent "mozprocess timed out after 1000 seconds running" → [meta] Intermittent "Automation Error: mozprocess timed out after 1000 seconds running"

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 2

•

7 years ago

Please also note my comment 6 on bug 1444831, where we had such a hang because the geckodriver executable produced too much logging output. Once we reduced the number of logged lines, the hang was gone. So maybe we are hitting the deadlock case in `Popen.wait()` that can occur when PIPE is used.
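For reference, the Python `subprocess` documentation warns that `Popen.wait()` can deadlock with `stdout=PIPE` when the child writes enough output to fill the OS pipe buffer, and recommends `Popen.communicate()` instead. The sketch below shows the safe pattern with a stand-in child process; `CHILD` and `run_and_drain` are hypothetical names for this example, and it does not reproduce the geckodriver setup.

```python
import subprocess
import sys

# Stand-in child that writes 1 MiB to stdout, more than a typical pipe buffer holds.
CHILD = "import sys; sys.stdout.write('x' * (1 << 20))"


def run_and_drain(timeout=30):
    """Run the child and read its output while waiting for it to exit."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD], stdout=subprocess.PIPE
    )
    # Calling proc.wait() here without reading proc.stdout could block forever:
    # the child would stall on a full pipe and never exit.
    # communicate() reads until EOF and then reaps the process.
    out, _ = proc.communicate(timeout=timeout)
    return proc.returncode, len(out)


if __name__ == "__main__":
    print(run_and_drain())
```

If the output has to be processed line by line while the process runs, a dedicated reader thread that drains the pipe serves the same purpose.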

Geoff Brown [:gbrown]

Comment 3

•

7 years ago

(In reply to Henrik Skupin (:whimboo) from comment #1)
> I wonder if we should better forward the maximum execution timeout of 1000s
> to the harnesses themselves, or have better fine-grained timeouts for them. It
> might / should help us to figure out if the hang is inside the harness code
> itself, or when those are controlling the Firefox binary.
>
> Geoff, what do you think?

The reftest "mozprocess timed out after 1000" logs in bug 1436237 look very much like the crashtest/jsreftest "application timed out after 370" logs in bug 1441580: if the logs can be trusted, browser startup is not completing, and the harness is waiting for the browser. It seems like sometimes the harness's 370 second timeout is reported correctly and sometimes that mechanism fails -- an intermittent fault in mozprocess, I suppose.

At any rate, since we already have the 370 second timeout in the harnesses (is it actually in mozrunner?), I'm not sure what else we can watch for timeouts, or where. What do you have in mind?

Flags: needinfo?(gbrown)

Geoff Brown [:gbrown]

Updated

•

7 years ago

Updated

•

7 years ago

Priority: -- → P3

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 5

•

7 years ago

Geoff, OK, so what I'm missing to make further progress for reftests is screenshots similar to those we have for mochitests. Do we have a plan to get those added?

Flags: needinfo?(gbrown)

Geoff Brown [:gbrown]

Comment 6

•

7 years ago

I filed bug 1443654 for that, but it seems more complicated than I had hoped for, and I'm not finding time to pursue it.

Flags: needinfo?(gbrown)

Comment hidden (Intermittent Failures Robot)

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Updated

•

6 years ago

Depends on: 1457830

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Updated

•

6 years ago

Depends on: 1391545

Comment hidden (Intermittent Failures Robot)

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 10

•

6 years ago

Can we please try to classify failures against the depending bugs per test harness instead of this meta bug? Currently this destroys the OF metrics. Thanks.

Flags: needinfo?(aryx.bugmail)

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 11

•

6 years ago

All of these classifications were made by the same person, and I explained it to them last week.

Flags: needinfo?(aryx.bugmail)

BugBot [:suhaib / :marco/ :calixte]

Updated

•

6 years ago

Keywords: meta

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 12

•

6 years ago

All dependencies of this meta bug have been fixed. As such I don't see why we have to keep this bug open any longer. Closing as WFM.

Status: NEW → RESOLVED

Closed: 6 years ago

Resolution: --- → WORKSFORME

Categories

(Testing :: General, defect, P3)

Tracking

(Not tracked)

People

(Reporter: whimboo, Unassigned)

References

Details

(Keywords: meta)

Crash Data

Security

(public)
