Closed Bug 581733 Opened 15 years ago Closed 15 years ago

mozmill-restart can't shutdown browser if a modal dialog is open (Timeout: bridge.execFunction)

Categories

(Testing Graveyard :: Mozmill, defect)

Product:

Component:

Type:

defect

Priority:

Not set

Severity:

normal

Tracking

(Not tracked)

Status:

VERIFIED FIXED

People

(Reporter: whimboo, Assigned: k0scist)

Details

(Whiteboard: [mozmill-1.4.2+])

Attachments

(2 files)

reproduction script for this bug 15 years ago Jeff Hammel 474 bytes, text/plain		Details
workaround for killing browser 15 years ago Jeff Hammel 1.01 KB, patch	ahal : review+	Details \| Diff \| Splinter Review

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Description

•

15 years ago

With the beta 1 of Mozmill 1.4.2 installed on our Ubuntu 9.10 machine in the QA lab the browser doesn't get closed during the restart tests, i.e. installing an extension. Sometimes we hang in the modal installation dialog. I can see the following error in the console: Timeout: bridge.execFunction("40a9a744-9739-11df-8c69-000c291ab448", bridge.registry["{23062957-73e0-4730-b6f3-fe43481f45d4}"]["cleanQuit"], []) Haven't ever seen this before. So I would assume this is a regression introduced with the global timeout.

Assignee

Comment 1

•

15 years ago

Do you have reproduction steps?

Andrew Halberstadt [:ahal]

Comment 2

•

15 years ago

Isn't beta 1 just the re-factored directory structure (i.e before global timeout was added)? Also I haven't been able to reproduce this while running the extension install/uninstall test on Ubuntu 10.04.

Updated

•

15 years ago

Whiteboard: [mozmill-1.4.2?] → [mozmill-1.4.2+]

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 3

•

15 years ago

Reproduced today while running Mozmill (latest dev build) tests against the patches mozmill-tests tests for localized builds: TEST-START | /Volumes/data/testing/mozmill/default/firefox/testTabbedBrowsing/testBackgroundTabScrolling.js | teardownModule Timeout: bridge.execFunction("204f65e2-9a4f-11df-845c-002332b130e8", bridge.registry["{3d0f67c4-f96f-244a-a58d-ea170984e29c}"]["runTestDirectory"], ["/Volumes/data/testing/mozmill/default/firefox"]) TEST-UNEXPECTED-FAIL | Disconnect Error: Application unexpectedly closed INFO Passed: 191 INFO Failed: 8 INFO Skipped: 8 Timeout: bridge.set("14a96214-9a50-11df-845c-002332b130e8", Components.utils.import('resource://mozmill/modules/mozmill.js'))

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 4

•

15 years ago

Well, on my local box Firefox get closed correctly. So seeing this message on qa-horus means Firefox has closed and we have that message in the logs now? WFM?

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 5

•

15 years ago

Ok, this can be related to the mystic hangs which we cover on bug 571630. Steps: 1. Do a git pull mozauto 1.4.2 to get the latest dev version 2. Get all mozmill-tests 3. Execute Mozmill like "mozmill -b /Applications/Minefield.app/ --show-errors -t firefox/testTabbedBrowsing/" As you can see above we hang inside of the teardownModule. It's not related to its content because even removing all the code doesn't help. We are still waiting for something and don't continue with the next test. That means after the global timeout we kill Firefox. If this the expected behavior I'm fine with it and we should continue the discussion on bug 571630.

Assignee

Updated

•

15 years ago

Assignee: nobody → jhammel

Assignee

Comment 6

•

15 years ago

i'll take this bug if i can reproduce

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 7

•

15 years ago

Well, isn't that the exact notice we want to give when a test-run has been stopped by killing the process?

Assignee

Comment 8

•

15 years ago

(In reply to comment #5) > Ok, this can be related to the mystic hangs which we cover on bug 571630. > > Steps: > 1. Do a git pull mozauto 1.4.2 to get the latest dev version > 2. Get all mozmill-tests > 3. Execute Mozmill like "mozmill -b /Applications/Minefield.app/ --show-errors > -t firefox/testTabbedBrowsing/" > > As you can see above we hang inside of the teardownModule. It's not related to > its content because even removing all the code doesn't help. We are still > waiting for something and don't continue with the next test. That means after > the global timeout we kill Firefox. > > If this the expected behavior I'm fine with it and we should continue the > discussion on bug 571630. I don't get this hang, ubuntu 10.04, minefield from a few days ago (can update to today's version if applicable), tests and mozmill 1.4.2 from today. Am I doing something wrong or is this system dependent?

Assignee

Comment 9

•

15 years ago

Is there anything displayed on the failing system that might be a clue as to why it is failing?

Assignee

Comment 10

•

15 years ago

(In reply to comment #8) > (In reply to comment #5) > > Ok, this can be related to the mystic hangs which we cover on bug 571630. > > > > Steps: > > 1. Do a git pull mozauto 1.4.2 to get the latest dev version > > 2. Get all mozmill-tests > > 3. Execute Mozmill like "mozmill -b /Applications/Minefield.app/ --show-errors > > -t firefox/testTabbedBrowsing/" > > > > As you can see above we hang inside of the teardownModule. It's not related to > > its content because even removing all the code doesn't help. We are still > > waiting for something and don't continue with the next test. That means after > > the global timeout we kill Firefox. > > > > If this the expected behavior I'm fine with it and we should continue the > > discussion on bug 571630. > > I don't get this hang, ubuntu 10.04, minefield from a few days ago (can update > to today's version if applicable), tests and mozmill 1.4.2 from today. Am I > doing something wrong or is this system dependent? Likewise, tried on Mac OSX and do not get this hang.

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 11

•

15 years ago

All the failures below happen with the testrun_daily.py automation script on our Windows boxes: Windows Vista - Mozmill 1.4.2 Beta 1: Test Failed : testInstallExtension in c:\users\mozilla\appdata\local\temp\tmpzeathw.mozmill-tests\firefox\restartTests\testExtensionInstallUninstall\test1.js Test Failed : testInstallExtension in c:\users\mozilla\appdata\local\temp\tmpzeathw.mozmill-tests\firefox\restartTests\testExtensionInstallUninstall\test1.js Timeout: bridge.execFunction("67063a31-9b29-11df-a1e7-000c297be8a8", bridge.registry["{f2d9e826-07de-47f1-a60a-4949be896a3b}"]["cleanQuit"], []) 2 Namoroka instances were open in parallel and there were some error dialogs that the update cannot be installed because another version is running. Windows XP - Mozmill 1.4.2 Beta 1: 41 files updated, 0 files merged, 0 files removed, 0 files unresolved Test Failed : testLarryGreen in c:\docume~1\mozilla\locals~1\temp\tmpqmj3d6.mozmill-tests\firefox\testSecurity\testGreenLarry.js Test Failed : testNotificationBar in c:\docume~1\mozilla\locals~1\temp\tmpqmj3d6.mozmill-tests\firefox\testSecurity\testSafeBrowsingNotificationBar.js Passed 224 :: Failed 2 :: Skipped 0 Sending results to 'http://brasstacks.mozilla.com/couchdb/mozmill/_design/reports/_update/report' failed (No JSON object could be decoded: line 1 column 0 (char 0)). Timeout: bridge.execFunction("55f70ce5-9bee-11df-bfef-000c29bced40", bridge.registry["{877fa6cb-398f-438d-8c2b-05df1547ad5e}"]["runTestFile"], ["c:\\docume~1\\mozilla\\locals~1\\temp\\tmpqmj3d6.mozmill-tests\\firefox\\restartTests\\testThemeInstallUninstall\\test1.js"]) No hanging instances on this box. Everything has been shutdown cleanly. Windows 2000 - Mozmill 1.4.2 Beta 1: Sending results to 'http://brasstacks.mozilla.com/couchdb/mozmill/_design/reports/_update/report' failed (Extra data: line 1 column 9 - line 1 column 32 (char 9 - 32)). Timeout: bridge.execFunction("afd94350-9bed-11df-97f3-000c29e0532e", bridge.registry["{f10f35e7-d866-4482-89ab-cc750f1320b0}"]["runTestFile"], ["c:\\docume~1\\mozilla\\locals~1\\temp\\tmpngwqdw.mozmill-tests\\firefox\\restartTests\\testMasterPassword\\test1.js"]) Hang in master password dialog is correctly handled and browser has been killed. No running instances. So the only issue I can see here is on the Vista box with the additional message: bridge.registry["{f2d9e826-07de-47f1-a60a-4949be896a3b}"]["cleanQuit"]

OS: Linux → All

Hardware: x86 → All

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 12

•

15 years ago

Windows 2000, XP, and Vista were hanging today in the restart test restartTests/testInstallMultipleExtensions. The hang happened in test1.js when trying to install the extensions. For one of those the timeout was too short for the download and waitForEval throws an error. That normally should close the browser but there is a modal dialog which asks you, if you want to cancel the download. In that state our test-run hung and the browser hasn't been killed. To reproduce that the timeout for the download would simply have to be decreased to under 1s. Should be easy to reproduce. I hope that will help you Jeff.

Assignee

Comment 13

•

15 years ago

(In reply to comment #12) > Windows 2000, XP, and Vista were hanging today in the restart test > restartTests/testInstallMultipleExtensions. I assume you mean testMultipleExtensionInstallation? http://hg.mozilla.org/qa/mozmill-tests/file/tip/firefox/restartTests/testMultipleExtensionInstallation

Assignee

Comment 14

•

15 years ago

(In reply to comment #12) > Windows 2000, XP, and Vista were hanging today in the restart test > restartTests/testInstallMultipleExtensions. The hang happened in test1.js when > trying to install the extensions. For one of those the timeout was too short > for the download and waitForEval throws an error. That normally should close > the browser but there is a modal dialog which asks you, if you want to cancel > the download. In that state our test-run hung and the browser hasn't been > killed. > > To reproduce that the timeout for the download would simply have to be > decreased to under 1s. Should be easy to reproduce. > > I hope that will help you Jeff. I get an addons failure using nightly minefield and latest mozmill + tests. Should I be running any particular version of FF? $ Scripts/mozmill-restart.exe -b /c/Program\ Files/Minefield/firefox.exe -t src /mozmill-tests/firefox/restartTests/testMultipleExtensionInstallation/ --showal l DEBUG | mozmill.startRunner | true DEBUG | mozmill.setState | "currentState" DEBUG | mozmill.setModule | "currentModule" DEBUG | mozmill.setState | "currentState" TEST-START | c:\mozmill\mozmill\src\mozmill-tests\firefox\restartTests\testMulti pleExtensionInstallation\test1.js | setupModule DEBUG | mozmill.setTest | {"name": "setupModule", "filename": "c:\\mozmill\\mozm ill\\src\\mozmill-tests\\firefox\\restartTests\\testMultipleExtensionInstallatio n\\test1.js"} WARNING | setupModule | (SKIP) Bug 569813: New add-ons manager not supported yet DEBUG | mozmill.endTest | {"passes": [], "fails": [], "skipped": true, "name": " setupModule", "filename": "c:\\mozmill\\mozmill\\src\\mozmill-tests\\firefox\\re startTests\\testMultipleExtensionInstallation\\test1.js", "failed": 0, "passed": 0, "skipped_reason": "Bug 569813: New add-ons manager not supported yet"} TEST-START | c:\mozmill\mozmill\src\mozmill-tests\firefox\restartTests\testMulti pleExtensionInstallation\test1.js | testInstallExtensions DEBUG | mozmill.setTest | {"name": "testInstallExtensions", "filename": "c:\\moz mill\\mozmill\\src\\mozmill-tests\\firefox\\restartTests\\testMultipleExtensionI nstallation\\test1.js"} INFO | Test Skipped: "setupModule failed." WARNING | testInstallExtensions | (SKIP) setupModule failed. DEBUG | mozmill.endTest | {"passes": [], "fails": [], "skipped": true, "name": " testInstallExtensions", "filename": "c:\\mozmill\\mozmill\\src\\mozmill-tests\\f irefox\\restartTests\\testMultipleExtensionInstallation\\test1.js", "failed": 0, "passed": 0, "skipped_reason": "setupModule failed."} DEBUG | mozmill.persist | {"addons": [{"url": "https://preview.addons.mozilla.or g/firefox/addon/6543/", "name": "Nightly Tester Tools", "id": "{8620c15f-30dc-4d ba-a131-7c5d20cf4a29}"}, {"url": "https://preview.addons.mozilla.org/firefox/add on/5428", "name": "Mozilla QA Companion", "id": "{667e9f3d-0096-4d2b-b171-9a96af babe20}"}]} DEBUG | mozmill.endRunner | true DEBUG | mozmill.startRunner | true DEBUG | mozmill.setState | "currentState" DEBUG | mozmill.setModule | "currentModule" DEBUG | mozmill.setState | "currentState" TEST-START | c:\mozmill\mozmill\src\mozmill-tests\firefox\restartTests\testMulti pleExtensionInstallation\test2.js | setupModule DEBUG | mozmill.setTest | {"name": "setupModule", "filename": "c:\\mozmill\\mozm ill\\src\\mozmill-tests\\firefox\\restartTests\\testMultipleExtensionInstallatio n\\test2.js"} WARNING | setupModule | (SKIP) Bug 569813: New add-ons manager not supported yet DEBUG | mozmill.endTest | {"passes": [], "fails": [], "skipped": true, "name": " setupModule", "filename": "c:\\mozmill\\mozmill\\src\\mozmill-tests\\firefox\\re startTests\\testMultipleExtensionInstallation\\test2.js", "failed": 0, "passed": 0, "skipped_reason": "Bug 569813: New add-ons manager not supported yet"} TEST-START | c:\mozmill\mozmill\src\mozmill-tests\firefox\restartTests\testMulti pleExtensionInstallation\test2.js | testCheckInstalledExtensions DEBUG | mozmill.setTest | {"name": "testCheckInstalledExtensions", "filename": " c:\\mozmill\\mozmill\\src\\mozmill-tests\\firefox\\restartTests\\testMultipleExt ensionInstallation\\test2.js"} INFO | Test Skipped: "setupModule failed." WARNING | testCheckInstalledExtensions | (SKIP) setupModule failed. DEBUG | mozmill.endTest | {"passes": [], "fails": [], "skipped": true, "name": " testCheckInstalledExtensions", "filename": "c:\\mozmill\\mozmill\\src\\mozmill-t ests\\firefox\\restartTests\\testMultipleExtensionInstallation\\test2.js", "fail ed": 0, "passed": 0, "skipped_reason": "setupModule failed."} DEBUG | mozmill.persist | {"addons": [{"url": "https://preview.addons.mozilla.or g/firefox/addon/6543/", "name": "Nightly Tester Tools", "id": "{8620c15f-30dc-4d ba-a131-7c5d20cf4a29}"}, {"url": "https://preview.addons.mozilla.org/firefox/add on/5428", "name": "Mozilla QA Companion", "id": "{667e9f3d-0096-4d2b-b171-9a96af babe20}"}]} DEBUG | mozmill.endRunner | true INFO Passed: 0 INFO Failed: 0 INFO Skipped: 4

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 15

•

15 years ago

All add-ons test on default (Minefield) are marked as skipped, as the report says. You would have to use Namoroka or Shiretoko. But that's not really necessary. I have a much better and totally simple testcase for you. The problem we have is, once a test is using a modal dialog which can also be a system dialog, we are failing to kill the browser when the global timeout kicks in: var testNoKill = function() { controller = mozmill.getBrowserController(); // Open the "File Open" dialog controller.keypress(null, "o", {accelKey: true}); } Here the log from the command line: TEST-START | /Volumes/data/build/tools/mozmill/mozmill/test/restart/test_user_restart/test1.js | testRestartBeforeTimeout Timeout: bridge.execFunction("04059ae0-9f32-11df-8fbc-002332b130e8", bridge.registry["{13ba93ab-fcf6-2d41-a5c6-844d9b07cf18}"]["runTestFile"], ["/Volumes/data/build/tools/mozmill/mozmill/test/restart/test_user_restart/test1.js"]) TEST-UNEXPECTED-FAIL | Disconnect Error: Application unexpectedly closed INFO Passed: 1 INFO Failed: 0 INFO Skipped: 0

Assignee

Comment 16

•

15 years ago

(In reply to comment #15) > All add-ons test on default (Minefield) are marked as skipped, as the report > says. You would have to use Namoroka or Shiretoko. But that's not really > necessary. I have a much better and totally simple testcase for you. > > The problem we have is, once a test is using a modal dialog which can also be a > system dialog, we are failing to kill the browser when the global timeout kicks > in: > > var testNoKill = function() { > controller = mozmill.getBrowserController(); > > // Open the "File Open" dialog > controller.keypress(null, "o", {accelKey: true}); > } > > Here the log from the command line: > > TEST-START | > /Volumes/data/build/tools/mozmill/mozmill/test/restart/test_user_restart/test1.js > | testRestartBeforeTimeout > Timeout: bridge.execFunction("04059ae0-9f32-11df-8fbc-002332b130e8", > bridge.registry["{13ba93ab-fcf6-2d41-a5c6-844d9b07cf18}"]["runTestFile"], > ["/Volumes/data/build/tools/mozmill/mozmill/test/restart/test_user_restart/test1.js"]) > > TEST-UNEXPECTED-FAIL | Disconnect Error: Application unexpectedly closed > INFO Passed: 1 > INFO Failed: 0 > INFO Skipped: 0 I cannot reproduce this on the windows 2000 VM. I run mozmill with the above test case. I do get the bridge disconnect error above. But the minefield gets correctly closed. Additional runs have no problem starting up and produce identical results. I *do* get cleanup errors for the profile -- windows complains that the files are in use by another process (presumedly the Firefox process before its shutdown is complete). But can't reproduce the failure to shutdown the browser error.

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 17

•

15 years ago

Alright, as my last comment showed I was running restart tests with mozmill-restart. I haven't executed "mozmill". Now with the latest statement from Jeff I have tried it again on all systems and I have seen that the normal mozmill command works quite fine. Only the mozmill-restart command is not able to kill the application. I'm able to reproduce this behavior with the above mentioned script across all platforms.

Summary: Mozmill sometimes can't shutdown browser (Timeout: bridge.execFunction) → mozmill-restart can't shutdown browser if a modal dialog is open (Timeout: bridge.execFunction)

Assignee

Comment 18

•

15 years ago

Attached file reproduction script for this bug — Details

this script will reproduce the bug. should work on any os with bash

Assignee

Comment 19

•

15 years ago

(In reply to comment #0) > With the beta 1 of Mozmill 1.4.2 installed on our Ubuntu 9.10 machine in the QA > lab the browser doesn't get closed during the restart tests, i.e. installing an > extension. Sometimes we hang in the modal installation dialog. I can see the > following error in the console: > > Timeout: bridge.execFunction("40a9a744-9739-11df-8c69-000c291ab448", > bridge.registry["{23062957-73e0-4730-b6f3-fe43481f45d4}"]["cleanQuit"], []) > > Haven't ever seen this before. So I would assume this is a regression > introduced with the global timeout. This is not a regression. Behaviour is identical to previous to the global timeout except that the python end gets closed.

Assignee

Comment 20

•

15 years ago

This has to do with how we stop the runner and cleanup our harnesses. Note that this logic hasn't at all changed for `mozmill-restart`. It has changed slightly (in implementation, hopefully not in intent) for regular `mozmill`, but mostly due to the fact that I couldn't figure out what the intent was of cleanup for `mozmill-restart`, I left that part alone entirely except moving code around. So our cleanup code is here http://github.com/mozautomation/mozmill/blob/master/mozmill/mozmill/__init__.py#L456 and here http://github.com/mozautomation/mozmill/blob/master/mozmill/mozmill/__init__.py#L497 Note that there is no end-of-test stop for MozmillRestart: http://github.com/mozautomation/mozmill/blob/master/mozmill/mozmill/__init__.py#L659 Again, this is all, control-flow-wise, as it was originally. These blocks of code are really messy. close_bridge is set to false for MozmillRestart's call to stop_runner. This is switched on to determine what the stop_runner function does. In the case that it is true, additional cleanup is done. Additionally, (fallback) cleanup is done in Mozmill.stop() . None of this is robust. Also, none of this is particularly obvious to me as far as why it is done this way. I wanted to clean up this control flow a bit, but I couldn't figure out what was necessary as far as what should be called and the order it should be called. This can probably be worked around for now with a hard runner kill in MozmillRestart.stop() def stop(self, hard=False): if hard: self.runner.kill() self.runner.profile.cleanup() If it is desirable to do this one-off, we can. However, the usual caveats: - this is not robust! There will be edge cases that this does not suffice for! the solution for this of course is... - this entire control should be reconsidered. We should make sure that we're doing what we think we're doing. We should make this robust. Since it is vitally important to automation that things are cleaned up correctly, this is the critical point

Comment 21

•

15 years ago

(In reply to comment #20) > This can probably be worked around for now with a hard runner kill in > MozmillRestart.stop() > > def stop(self, hard=False): > if hard: > self.runner.kill() > self.runner.profile.cleanup() > > If it is desirable to do this one-off, we can. However, the usual caveats: > > - this is not robust! There will be edge cases that this does not suffice > for! the solution for this of course is... > > - this entire control should be reconsidered. We should make sure that we're > doing what we think we're doing. We should make this robust. Since it is > vitally important to automation that things are cleaned up correctly, this is > the critical point Yeah, I hear Henrik's point that he wants to be able to ensure randomly started modal dialogs don't block tests. However, randomly started modal dialogs have ALWAYS blocked mozmill tests. This is not a new issue, and it's certainly not a regression. If this hard kill does the trick with modal dialogs, then let's take that for 1.4.2. But this is as far down the rabbit hole as we go on this issue. You will always be able to find one more modal dialog that won't die, and until we completely refactor this code, we're just chasing ghosts. We can't refactor the code for 1.4.2, so let's try this simple fix. If it works in this case, and in the normal "shutdown case" when a test ends, then we take it. But if you find another edge case here, I wouldn't be surprised, and I wouldn't hold 1.4.2 for it.

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 22

•

15 years ago

(In reply to comment #21) > Yeah, I hear Henrik's point that he wants to be able to ensure randomly started > modal dialogs don't block tests. However, randomly started modal dialogs have > ALWAYS blocked mozmill tests. This is not a new issue, and it's certainly not > a regression. First thanks to Jeff for the investigation. It's true that a restart test-run never has been killed before, but the same was the case for restartless test-runs. While the latter one has been changed, so that we can kill the application, the former one hasn't. I got the impression that the global timeout patch will enable us to kill any of our supported test types. But as I can see now, that's not the case. > If this hard kill does the trick with modal dialogs, then let's take that for > 1.4.2. But this is as far down the rabbit hole as we go on this issue. You > will always be able to find one more modal dialog that won't die, and until we > completely refactor this code, we're just chasing ghosts. We can't refactor > the code for 1.4.2, so let's try this simple fix. If it works in this case, > and in the normal "shutdown case" when a test ends, then we take it. But if > you find another edge case here, I wouldn't be surprised, and I wouldn't hold > 1.4.2 for it. Agree. On the other hand that would definitely throw bad light on Mozmill regarding the buildbot integration. When we can't kill the application when a restart test hangs, releng will not be happy about. If the hard kill trick doesn't work should we target a follow-up release for those critical items only? I think we should add the dependency to bug 516984.

Keywords: regression

Assignee

Comment 23

•

15 years ago

Attached patch workaround for killing browser — Details — Splinter Review

This works around this bug for the reproduction script above. I have not tested it on user restart tests and, as the long comment in the patch says, it is not robust and i'm sure doesn't handle all cases. But if we want a quick fix, this could be it.

Attachment #462856 - Flags: review?(ahalberstadt)

Assignee

Comment 24

•

15 years ago

(In reply to comment #21) > (In reply to comment #20) > > This can probably be worked around for now with a hard runner kill in > > MozmillRestart.stop() > > > > def stop(self, hard=False): > > if hard: > > self.runner.kill() > > self.runner.profile.cleanup() > > > > If it is desirable to do this one-off, we can. However, the usual caveats: > > > > - this is not robust! There will be edge cases that this does not suffice > > for! the solution for this of course is... > > > > - this entire control should be reconsidered. We should make sure that we're > > doing what we think we're doing. We should make this robust. Since it is > > vitally important to automation that things are cleaned up correctly, this is > > the critical point > Yeah, I hear Henrik's point that he wants to be able to ensure randomly started > modal dialogs don't block tests. However, randomly started modal dialogs have > ALWAYS blocked mozmill tests. This is not a new issue, and it's certainly not > a regression. It is also not just modal dialogues. Anything that triggers the global timeout with mozmill-restart will not clean up correctly. Not surprising, since no one wrote any cleanup code. As per comment 20, I wanted to clean up correctly when I was looking at it, but I couldn't figure out the logic that is actually there. > If this hard kill does the trick with modal dialogs, then let's take that for > 1.4.2. But this is as far down the rabbit hole as we go on this issue. You > will always be able to find one more modal dialog that won't die, and until we > completely refactor this code, we're just chasing ghosts. We can't refactor > the code for 1.4.2, so let's try this simple fix. If it works in this case, > and in the normal "shutdown case" when a test ends, then we take it. But if > you find another edge case here, I wouldn't be surprised, and I wouldn't hold > 1.4.2 for it. I've added another level of cleanup here. It is undoubtedly in the wrong place and will not handle all edge cases, but it does workaround this issue. For 1.4.2 I would hate to do much more. See attachment.

Assignee

Comment 25

•

15 years ago

(In reply to comment #22) > (In reply to comment #21) > > Yeah, I hear Henrik's point that he wants to be able to ensure randomly started > > modal dialogs don't block tests. However, randomly started modal dialogs have > > ALWAYS blocked mozmill tests. This is not a new issue, and it's certainly not > > a regression. > > First thanks to Jeff for the investigation. It's true that a restart test-run > never has been killed before, but the same was the case for restartless > test-runs. While the latter one has been changed, so that we can kill the > application, the former one hasn't. I got the impression that the global > timeout patch will enable us to kill any of our supported test types. But as I > can see now, that's not the case. Firstly, there are two issues here. Global timeout *does* kill mozmill for all of our cases. That's all it does. It does not add additional cleanup code. Since the cleanup code didn't exist before, nothing additional (like shutting down the browser). Our entire stop procedure is very brittle. As per comment 20, since this is vital to automation, this is a big deal. This deals with one obvious edge case. But as the comment on the patch says, it is not robust and there are probably several edge cases where it won't work. This fix is, plainly put, dead wrong. If another edge case came up that this didn't fix, and it were to be fixed in the same way, it would involve adding yet another kill functionality at the end without understanding any of the preceding code. > > If this hard kill does the trick with modal dialogs, then let's take that for > > 1.4.2. But this is as far down the rabbit hole as we go on this issue. You > > will always be able to find one more modal dialog that won't die, and until we > > completely refactor this code, we're just chasing ghosts. We can't refactor > > the code for 1.4.2, so let's try this simple fix. If it works in this case, > > and in the normal "shutdown case" when a test ends, then we take it. But if > > you find another edge case here, I wouldn't be surprised, and I wouldn't hold > > 1.4.2 for it. > > Agree. On the other hand that would definitely throw bad light on Mozmill > regarding the buildbot integration. When we can't kill the application when a > restart test hangs, releng will not be happy about. If the hard kill trick > doesn't work should we target a follow-up release for those critical items > only? I think we should add the dependency to bug 516984. The tests cherry-picked for mozilla-central should be robust and thoroughly tested and staged as documented on bug 516984. Ultimately, when global timeout is triggered, it usually means that something is wrong with a test. While we are still finding harness failures, it will always be possible to write a test that does something bad. Again, the correct solution is to make our closing procedure as robust as possible, which means providing a single point of failure. This patch does not address that. Mozmill has a number of bugs and serious architecture issues. Hopefully these will be addressed by mozmill 2.0. But this shouldn't be a criteria for using the software in production; if it was, MS Office wouldn't be used by any business. The question is, for the set of tests and buildbot steps provided, will Mozmill hang? The answer in practicality is probably "No if we make sure we pick a good set of tests".

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Updated

•

15 years ago

Attachment #462807 - Attachment mime type: application/x-sh → text/plain

Assignee

Comment 26

•

15 years ago

see also the bigger issue: https://bugzilla.mozilla.org/show_bug.cgi?id=584464

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 27

•

15 years ago

The patch works fine on all platforms, except that we kill the whole restart test-run. As talked with Jeff on IRC I have filed that as bug 584470 which is definitely post Mozmill-1.4.2 stuff.

Andrew Halberstadt [:ahal]

Comment 28

•

15 years ago

Comment on attachment 462856 [details] [diff] [review] workaround for killing browser Tested with the user-restart tests as well as manually closing other random tests and it appears to work fine. It also fixes the case where the user doesn't specify a user restart and then restarts the browser. It used to just throw the 'Application Disconnected' error and then the instance of the browser would pop back up and be left running. Now the restarted instance gets closed. r+, r=ahalberstadt

Attachment #462856 - Flags: review?(ahalberstadt) → review+

Assignee

Comment 29

•

15 years ago

pushed to master http://github.com/mozautomation/mozmill/commit/2852266041a448fdacdefb93cb1c875f623a3232 I'll push to 1.4.2 as soon as I can figure out how to do that

Assignee

Comment 30

•

15 years ago

pushed to 1.4.2 http://github.com/mozautomation/mozmill/commit/1decfe71d8f7a23be7dd19562e55e0179d707d9b

Status: NEW → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

Henrik Skupin [:whimboo][⌚️UTC+2]

Reporter

Comment 31

•

15 years ago

Works as expected on OS X and Windows. Lets call it verified fixed.

Status: RESOLVED → VERIFIED

Nobody; OK to take it and work on it

Updated

•

9 years ago

Product: Testing → Testing Graveyard

You need to log in before you can comment on or make changes to this bug.