Closed
Bug 488298
Opened 16 years ago
Closed 14 years ago
Talos should indicate more clearly when it forcibly kills the browser
Categories
(Testing :: Talos, defect, P3)
Testing
Talos
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: ted, Assigned: anodelman)
References
Details
(Whiteboard: [talos])
Attachments
(1 file, 2 obsolete files)
22.24 KB,
patch
|
jmaher
:
review+
|
Details | Diff | Splinter Review |
There's some confusion in bug 486580 about some Talos runs where the browser appears to have crashed, but there's no stack trace. Specifically, we see output like this:
../Minefield.app/Contents/MacOS/run-mozilla.sh: line 399: 958 Terminated
"$prog" ${1+"$@"}
Failed tp:
Stopped Thu, 02 Apr 2009 05:34:54
FAIL: Busted: tp
FAIL: browser crash
I *think* this means that Talos believed the browser was hung and forcibly killed it, but it's not very clear. It would be nice if Talos explicitly noted that it killed the browser in that situation. Replacing "FAIL: browser crash" with "FAIL: browser was hung, killed" or something like that would probably clear up some confusion here. If we do hit that situation a lot, we could then look at fixing bug 483968.
Updated•16 years ago
|
Assignee: nobody → anodelman
Assignee: anodelman → nobody
Component: Release Engineering: Talos → Release Engineering
Updated•16 years ago
|
Assignee: anodelman → nobody
Updated•16 years ago
|
Assignee: nobody → anodelman
Updated•16 years ago
|
Priority: -- → P3
Assignee | ||
Comment 1•15 years ago
|
||
Moving to future.
Assignee: anodelman → nobody
Component: Release Engineering → Release Engineering: Future
Comment 2•15 years ago
|
||
Mass move of bugs from Release Engineering:Future -> Release Engineering. See
http://coop.deadsquid.com/2010/02/kiss-the-future-goodbye/ for more details.
Component: Release Engineering: Future → Release Engineering
Updated•15 years ago
|
Whiteboard: [talos]
Comment 3•15 years ago
|
||
From comment#0, I *think* this is about changing how the talos suite detects a hung browser and reports killing it. afaict, this is not about how RelEng invokes the overall talos suite.
]
Moving this bug to Testing:General for suite development, but please reassign this back if I misunderstood.
Component: Release Engineering → General
Product: mozilla.org → Testing
QA Contact: release → general
Version: other → unspecified
Reporter | ||
Comment 4•15 years ago
|
||
This is about how the Talos code itself works. Do those bugs no longer belong in the Releng component?
Assignee | ||
Comment 6•14 years ago
|
||
This looks like a minor fix, assigning to myself.
Assignee: nobody → anodelman
![]() |
||
Comment 7•14 years ago
|
||
Any chance of fixing this one soon, Alice?
Assignee | ||
Comment 8•14 years ago
|
||
This totally fell off my radar. Let me take another look.
Assignee | ||
Comment 9•14 years ago
|
||
Assignee | ||
Comment 10•14 years ago
|
||
Did initial testing on fedora only, works fine. Indicates which processes were killed and with what signal, then prints stack. This should clear up the confusion.
Actual check-in/deployment will be blocked on a more complete test run.
Attachment #544618 -
Flags: review?(jmaher)
Comment 11•14 years ago
|
||
Comment on attachment 544618 [details] [diff] [review]
indicate when talos kills processes (take 1)
Review of attachment 544618 [details] [diff] [review]:
-----------------------------------------------------------------
overall this is good as it cleans up some stuff and simplifies the flow. I would like to clean up some of the code related to the remote process before r+.
::: bcontroller.py
@@ +154,5 @@
> self.bwaiter = BrowserWaiter(self.command, self.log, self.timeout, self.mod, self.ffprocess)
> noise = 0
> prev_size = 0
> while not self.bwaiter.hasTime():
> if noise > self.timeout: # check for frozen browser
so if we remove ffprocess.cleanupProcesses from bcontroller, we only use ffprocess for the remote stuff. Can we clean up the code to pass in host to BrowserWaiter (instead of self.ffprocess) and handle the setup and managing of the remote device inside there?
::: ffprocess.py
@@ -104,5 @@
> -
> - # Timed out.
> - self.TerminateAllProcesses(browser_wait, process_name)
> - return (None, True)
> -
why is this removed, I assumed it was because we don't use it anymore?
::: ffprocess_remote.py
@@ +132,5 @@
> try:
> self.testAgent.killProcess(process_name)
> + if result:
> + result = result + ', '
> + result = result + process_name + ': terminated by testAgent.killProcess'
we never set result here. killProcess returns None|String, so if we return a value it might be similar to this "Successfully killed 12906 org.mozilla.fennec".
Attachment #544618 -
Flags: review?(jmaher) → review-
Assignee | ||
Comment 12•14 years ago
|
||
::: ffprocess.py
@@ -104,5 @@
> -
> - # Timed out.
> - self.TerminateAllProcesses(browser_wait, process_name)
> - return (None, True)
> -
why is this removed, I assumed it was because we don't use it anymore?
This was removed as the cleanup is now handled in ttest.py - I wanted to better control who was killing the browser so that I could bubble up the correct message to the user. If the browser is terminated here, then I won't know in ttest.py if the browser crashed or froze.
::: ffprocess_remote.py
@@ +132,5 @@
> try:
> self.testAgent.killProcess(process_name)
> + if result:
> + result = result + ', '
> + result = result + process_name + ': terminated by testAgent.killProcess'
we never set result here. killProcess returns None|String, so if we return a value it might be similar to this "Successfully killed 12906 org.mozilla.fennec".
The way I set this up is to generate a simple generic "we killed something" - since there is no return value from killProcess. I just assume that killProcess was successful; and, even if it wasn't, having the message that we attempted to kill a process can also be useful. This code chunk also brings the behavior in line with the rest of the TerminateAllProcesses procedures.
Comment 13•14 years ago
|
||
this makes sense. What about cleaning up bcontroller.py now that we have only one dependency on ffprocess (with this patch)?
Assignee | ||
Comment 14•14 years ago
|
||
This patch includes cleanup of bcontroller. Only remaining reference to ffprocess is for the remote device case. Also removed a bunch of extraneous imports.
Attachment #544595 -
Attachment is obsolete: true
Attachment #544618 -
Attachment is obsolete: true
Attachment #545463 -
Flags: review?(jmaher)
Comment 15•14 years ago
|
||
Comment on attachment 545463 [details] [diff] [review]
[checked in]indicate when talos kills processes (take 2)
Review of attachment 545463 [details] [diff] [review]:
-----------------------------------------------------------------
looks good, is this tested in staging and on android?
Attachment #545463 -
Flags: review?(jmaher) → review+
Assignee | ||
Comment 16•14 years ago
|
||
This is just waiting on availability of the ateam staging environment.
Assignee | ||
Comment 17•14 years ago
|
||
My staging run is going green across the board. Just waiting for confirmation that it is okay on mobile and then it is good to land.
Assignee | ||
Comment 18•14 years ago
|
||
jmaher says green on mobile. This is ready for check in and deployment.
Assignee | ||
Comment 19•14 years ago
|
||
Comment on attachment 545463 [details] [diff] [review]
[checked in]indicate when talos kills processes (take 2)
changeset: 241:8ae066ed37c8
Attachment #545463 -
Attachment description: indicate when talos kills processes (take 2) → [checked in]indicate when talos kills processes (take 2)
![]() |
||
Comment 20•14 years ago
|
||
Thanks Alice! Can you also reply to bug 486580 comment 8 noting how forced kills can be detected?
Assignee | ||
Comment 21•14 years ago
|
||
This is now deployed.
All done here.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•