Closed Bug 950401 Opened 6 years ago Closed 3 years ago

Missing process logging on Mac/BSD

Categories

(Core :: IPC, defect)

Tracking

()

RESOLVED FIXED
mozilla50
Tracking Status
firefox49 --- fixed
firefox50 --- fixed

People

(Reporter: billm, Assigned: whimboo)

Attachments

(1 file, 1 obsolete file)

Attached patch process-logging (obsolete) — Splinter Review
Pretty self-explanatory. We need this code for zombie checking to work on tinderbox. I just copied the code from the Linux file.
Attachment #8347669 - Flags: review?(benjamin)
Attachment #8347669 - Flags: review?(benjamin) → review+
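For context, here is a minimal sketch of what a harness-side zombie check could look like once the child PIDs are known. This is a hypothetical illustration, not the tinderbox or mochitest code: the function names `is_alive` and `find_leftover_processes` are made up, and it assumes a POSIX system where signal 0 probes a process without delivering anything.

```python
import os


def is_alive(pid):
    """Return True if a process with this pid currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def find_leftover_processes(pids):
    """Return the pids from `pids` that still exist after shutdown."""
    return [pid for pid in pids if is_alive(pid)]
```

Without a process log the harness has no list of child PIDs to pass to such a check, which is presumably why the logging is needed on Mac/BSD as well.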
Backed out in http://hg.mozilla.org/integration/mozilla-inbound/rev/85c56bbcf37c for introducing several OSX mochitest timeouts like these: 
https://tbpl.mozilla.org/php/getParsedLog.php?id=32119410&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=32119219&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=32120161&tree=Mozilla-Inbound


If this was expected and these are just existing timeouts that are now just being reported and need to be fixed, please say so and reland. :)
I think these were existing timeouts, but I'm not able to work on this.
Assignee: wmccloskey → nobody
I pushed this to Try because we'll need this to fix bug 1143547 effectively and I don't believe this caused those timeouts:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5c085c2c0812
Oops, silly build bustage from a header that has apparently been removed. Pushed again:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a7a1eba89579
Terrifyingly there seems to be a reproducible debug Jetpack crash on that Try push, so I rebased to a newer base changeset and tried again:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=881de5eb8b16
That Try push looks clean, I'll land this.
Assignee: nobody → wmccloskey
Backed out in:
https://hg.mozilla.org/integration/mozilla-inbound/rev/335f1295e99b
for causing miscellaneous orange on 10.6 browser-chrome mochitests, mostly on mochitest-e10s, but not entirely.

In particular, what I'm blaming on this are all the oranges involving:
FATAL ERROR: AsyncShutdown timeout in ShutdownLeaks: Wait for cleanup to be finished before checking for leaks
(which is a message from https://mxr.mozilla.org/mozilla-central/source/toolkit/components/asyncshutdown/AsyncShutdown.jsm that it writes shortly before calling gDebug.abort()), and sometimes preceded by some other failures.

Regression window narrowed with:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-searchStr=Rev4%20MacOSX%20Snow%20Leopard%2010.6%20mozilla-inbound%20opt%20test%20mochitest-e10s-browser-chrome-3&fromchange=4b51391dc2a1&tochange=d084a35e8d79

I'll compile some failure logs in another comment.
Flags: needinfo?(ted)
https://treeherder.mozilla.org/logviewer.html#?job_id=8508779&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=8490608&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=8491954&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=8493084&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=8497716&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=8508288&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=8501714&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=8505490&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=8505491&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=8505492&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=8505747&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=8507723&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=8508465&repo=mozilla-inbound (on 10.10, not 10.6)
https://treeherder.mozilla.org/logviewer.html#?job_id=8508444&repo=mozilla-inbound (not mochitest-e10s, although I think some others above also were not)
https://treeherder.mozilla.org/logviewer.html#?job_id=8508722&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=8508729&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=8508457&repo=mozilla-inbound
Sorry about that, I got green on Try and didn't retrigger. I'll be more cautious with Try after I sort out why this is breaking things.
Flags: needinfo?(ted)
I pushed this to try and got ~50 green re-triggers of OSX 10.6 mochitest-e10s-browser-chrome [1]. If we can figure out a way to track child processes from outside the browser this won't be a problem, but if we can't I'm inclined to re-land this (and keep an eye on the trees).

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=532abb8ba1e1
I was able to get psutil installed on all the test slaves, so I'm not intending to pursue this.
No longer blocks: 1143547
We need this feature for bug 1176758 to be able to kill Firefox via mozprocess after an update on OS X, where it spawns itself into a new process group. When that happens we currently lose all control of the process. :(

So I will try once more to get this feature added. I actually had some code working locally before I found this bug, so there are some subtle differences from Bill's patch attached to this bug. I did a try push here:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ada87682e8ff

I will check later today how the mochitest results look on OS X, but if there are failures I cannot look into them before next week.
Blocks: 1176758
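To illustrate the process-group problem described above, here is a rough sketch using only the Python standard library. It is not mozprocess code: `spawn_detached` and `terminate_group_of` are hypothetical names, and a sleeping Python child stands in for the restarted Firefox. The point is only that a harness which learns the new PID (for example from a process log) can find and signal the new process group again.

```python
import os
import signal
import subprocess
import sys


def spawn_detached():
    """Start a stand-in child in a new session, and so in a new process group."""
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        start_new_session=True,  # the child calls setsid() before exec
    )


def terminate_group_of(pid):
    """Send SIGTERM to the whole process group that `pid` belongs to."""
    os.killpg(os.getpgid(pid), signal.SIGTERM)
```

Signalling the harness's own process group would not reach such a child, because `os.getpgid(child.pid)` differs from the harness's group; the PID is the piece of information that restores control.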
So far the try run looks promising. David and Ted, what specifically should I look out for? None of the links above are usable anymore. I also can't find the mentioned mochitest(-browser) tests, and I assume they have since been split into multiple chunks?

I will upload my patch because Bill's patch is missing the declaration of the global gProcessLog.
Flags: needinfo?(ted)
Flags: needinfo?(dbaron)
I really don't remember.
Flags: needinfo?(dbaron)
Oh, and we also desupported OS X 10.6, so it might be a great time to try again! :)
Comment on attachment 8759782 [details]
Bug 950401 - Add process logging to OS X / BSD.

https://reviewboard.mozilla.org/r/57648/#review55464
Attachment #8759782 - Flags: review?(benjamin) → review+
I never really looked into this very much, I was just trying to get the patch landed (as you are). I would say retrigger some extra Mochitest runs on OS X on try and make sure nothing looks broken, and then feel free to land it.

Per comment 13 this is probably fine anyway.
Flags: needinfo?(ted)
Yes, try results look fine. I'm just going to push it. Fingers crossed that all will be fine now!
Assignee: wmccloskey → hskupin
Status: NEW → ASSIGNED
OS: Linux → All
Hardware: x86_64 → All
Attachment #8347669 - Attachment is obsolete: true
Pushed by hskupin@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/66211bb627bf
Add process logging to OS X / BSD. r=bsmedberg
https://hg.mozilla.org/mozilla-central/rev/66211bb627bf
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
Comment on attachment 8759782 [details]
Bug 950401 - Add process logging to OS X / BSD.

Approval Request Comment
[Feature/regressing bug #]: New feature on OS X and BSD which is hidden behind an env variable
[User impact if declined]: None, but it is necessary for test harnesses to track the process IDs of Firefox.
[Describe test coverage new/current, TreeHerder]: No changes
[Risks and why]: It will only be set by test harnesses or if the user explicitly enables it.
[String/UUID change made/needed]: None
Attachment #8759782 - Flags: approval-mozilla-aurora?
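As a sketch of how a test harness might consume such a log to track process IDs: the harness would point the env variable at a file before launching Firefox and read the file afterwards. The line shape matched below ("... launched child process <pid>") is an assumption for illustration, since the log format is not quoted in this bug, and `child_pids_from_log` is a hypothetical helper name.

```python
import re

# Assumed line shape; the real log format is not quoted in this bug.
LAUNCH_RE = re.compile(r"launched child process (\d+)\s*$")


def child_pids_from_log(lines):
    """Collect child pids from an iterable of process-log lines."""
    pids = []
    for line in lines:
        match = LAUNCH_RE.search(line)
        if match:
            pids.append(int(match.group(1)))
    return pids
```

Lines that do not match the assumed shape are ignored, so unrelated output in the same file would not break the harness.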
Comment on attachment 8759782 [details]
Bug 950401 - Add process logging to OS X / BSD.

Seems worth a try to uplift to improve testing capabilities on aurora. If we see any regressions from this please be ready to back it out.
Attachment #8759782 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+