Closed Bug 1057246 Opened 10 years ago Closed 6 years ago

Firefox doesn't honor application.restart() and simply quits

Categories

(Mozilla QA Graveyard :: Mozmill Tests, defect)

All
Windows 8
defect
Not set
normal

Tracking

(firefox34 affected)

RESOLVED WONTFIX

People

(Reporter: andrei, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

With attachment 8466966 [details] I see what I initially thought was bug 1046078, but I think it might be a different issue. This fails on Windows; this is what the log shows (with opt builds):
> (restart) C:\work\bugs\1046078\testcase>mozmill -t testcase.js -b C:\Mozilla\Nig
> htly\en-US\2014-08-18_firefox-34.0a1.en-US.win32\firefox\firefox.exe
> ^/(.*)$
> ^/(.*)\.asis$
> ^/(.*)\.py$
> No handlers could be found for logger "wptserve"
> ^/(.*)$
> ←[1;34mTEST-START←[0m | C:\work\bugs\1046078\testcase\testcase.js | test1
> ←[1;32mTEST-PASS←[0m | C:\work\bugs\1046078\testcase\testcase.js | test1
> TEST-END | C:\work\bugs\1046078\testcase\testcase.js | finished in 1172ms
> 
> ###!!! [Child][MessageChannel::SendAndWait] Error: Channel closing: too late to
> send/recv, messages will be lost
> 
> 2014-08-22 00:08:43: stackwalker.cc:125: INFO: Couldn't load symbols for: ntdll.
> pdb|86C5ABD953F644798B23D22818BE23A42
> 2014-08-22 00:08:43: stackwalker.cc:125: INFO: Couldn't load symbols for: kernel
> base.pdb|DD4522B344B04180A82AA5F4E0378A1A2
> 2014-08-22 00:08:43: stackwalker.cc:125: INFO: Couldn't load symbols for: nss3.p
> db|0C7067079C7C4938B7E3F7B4EE22E16D1
> 2014-08-22 00:08:43: stackwalker.cc:125: INFO: Couldn't load symbols for: xul.pd
> b|156728B68CAD4406B8F2CD73B42A5D7D2
> 2014-08-22 00:08:43: basic_code_modules.cc:88: INFO: No module at 0xdcf528
> 2014-08-22 00:08:43: basic_code_modules.cc:88: INFO: No module at 0xdcf650
> 2014-08-22 00:08:43: basic_code_modules.cc:88: INFO: No module at 0xdcf528
> 2014-08-22 00:08:43: basic_code_modules.cc:88: INFO: No module at 0x11114a0
> ←[1;31mTEST-UNEXPECTED-FAIL←[0m | Connection to application lost (exit code: Non
> e)
> ←[1;31mTEST-UNEXPECTED-FAIL←[0m | Disconnect Error: Application unexpectedly clo
> sed
> RESULTS | Passed: 1
> RESULTS | Failed: 1
> RESULTS | Skipped: 0
> Traceback (most recent call last):
>   File "c:\work\mozmill\mozmill\mozmill\__init__.py", line 899, in run
>     mozmill.run(tests, self.options.restart)
>   File "c:\work\mozmill\mozmill\mozmill\__init__.py", line 444, in run
>     self.handle_disconnect(e)
>   File "c:\work\mozmill\mozmill\mozmill\__init__.py", line 516, in handle_discon
> nect
>     self.stop_runner()
>   File "c:\work\mozmill\mozmill\mozmill\__init__.py", line 585, in stop_runner
>     raise errors.ShutdownError('client process shutdown unsuccessful')
> ShutdownError: client process shutdown unsuccessful
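
For triaging logs like the one above, a small script can pull out the result counts and the unexpected failures. The following is a minimal sketch, not part of Mozmill: the function name `summarize_log` is hypothetical, and it assumes the log format shown in this comment (quoted with "> ", color codes pasted as "←[...m"). It does not rejoin lines that the console wrapped at 80 columns.

```python
import re

# Terminal color codes; the pasted log shows the ESC byte rendered as "←".
ANSI_RE = re.compile(r"(?:\x1b|←)\[[0-9;]*m")
RESULT_RE = re.compile(r"^RESULTS \| (Passed|Failed|Skipped): (\d+)$")


def summarize_log(text):
    """Return (counts, failures) extracted from a Mozmill-style console log.

    Hypothetical helper for illustration; assumes the format shown above.
    """
    counts = {}
    failures = []
    for raw in text.splitlines():
        # Drop the "> " quote prefix, then any color codes.
        line = ANSI_RE.sub("", re.sub(r"^> ?", "", raw)).rstrip()
        match = RESULT_RE.match(line)
        if match:
            counts[match.group(1).lower()] = int(match.group(2))
        elif line.startswith("TEST-UNEXPECTED-FAIL | "):
            # Keep only the message after the status column.
            failures.append(line.split(" | ", 1)[1])
    return counts, failures
```

Calling `summarize_log()` on the saved console output returns a dict of the `RESULTS` counters and a list of failure messages, which makes it easier to compare runs across builds.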


First started failing with the 18th August build.
Pushlog from mc: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=94ba78a42305&tochange=0aaa2d3d15cc

From which I suspect bug 1026561 is to blame.
I'll check tinderbox builds to make sure.
Tinderbox pushlog:
https://hg.mozilla.org/integration/fx-team/pushloghtml?fromchange=82722ad59732&tochange=05f052f7c894

Which is bug 1026561.

I can't see how the change in the number of columns would affect this.
We are seeing some instability with all the recent newTab tiles changes.

Ed, do you know how the change in the number of columns could affect this?
The testcase referenced in comment 1 does only a few things:

1) open a page
2) wait for it to load
3) open about:newtab
4) restart firefox // this is where something goes wrong
Blocks: 1026561
Flags: needinfo?(edilee)
Attached file log_osx.txt
Well, I've seen this now on OSX while running a complete testrun.
In the process of landing bug 1026561, we ran into a talos crash that was triggered because empty tiles started appearing with bug 1026561's patch. I.e., going from 9 tiles filled with 9 directory tiles to showing more than 9 tiles while still only having 9 directory tiles to fill in content.

Because there weren't enough directory tiles, history tiles started filling in and the thumbnailer would try to capture a screenshot in the background (by making requests to pages that need thumbnails), which eventually led to a crash in some message passing.
Flags: needinfo?(edilee)
(In reply to Andrei Eftimie from comment #2)
> Created attachment 8477302 [details]
> log_osx.txt
> 
> Well, I've seen this now on OSX while running a complete testrun.

This is not a Mozmill Test related issue but related to debug symbol packages. Is this a self-built Firefox? If yes, have you created the symbol packages before running the test?

Also the above should have nothing to do with the disconnect and the lost connection to Firefox. Can you reproduce it with the given testcase? If yes, which behavior of Firefox do you see? Does it restart or is the formerly opened process still running?
(In reply to Henrik Skupin (:whimboo) from comment #4)
> (In reply to Andrei Eftimie from comment #2)
> > Created attachment 8477302 [details]
> > log_osx.txt
> > 
> > Well, I've seen this now on OSX while running a complete testrun.
> 
> This is not a Mozmill Test related issue but related to debug symbol
> packages.

It failed on the testcase I had for bug 1046078.
We usually file bugs against mozmill-test for test failures, then file a blocker in the appropriate component. I do agree that in this case this bug could be moved directly to the right component.

> Is this a self-built Firefox? If yes, have you created the symbol
> packages before running the test?
As mentioned in comment 1 this is reproducible with opt builds.

> Also the above should have nothing to do with the disconnect and the lost
> connection to Firefox. Can you reproduce it with the given testcase? If yes,
> which behavior of Firefox do you see? Does it restart or is the formerly
> opened process still running?

We don't do anything _after_ the restart in the testcase, so I will have to add a check there to make sure. From the looks of it, I don't think Firefox manages to restart: it fails to stop properly and spews out the mentioned messages.

From Ed's comment, this seems to happen if we try stopping Firefox while some background tasks related to the about:newtab tiles are still being executed.
So it's not clear to me if that Tile issue in Firefox has been fixed meanwhile or not. Ed, do you have details for that? Shall we file a bug for it, or is it not necessary?
Flags: needinfo?(edilee)
In comment 3, I mentioned that bug 1026561/increasing columns ran into talos crashes when it initially tried to land. Before it finally landed, bug 1031303 happened to be fixed so that increasing the number of columns no longer caused talos crashes.

It sounds like whatever fixed talos crashes (I don't quite understand what broke and got fixed) did not help in preventing this bug.

(In reply to Henrik Skupin (:whimboo) from comment #6)
> So it's not clear to me if that Tile issue in Firefox has been fixed
> meanwhile or not.
What tile issue are you referring to? I'm still not entirely sure what this bug is about either.

Andrei, could you try reproducing this bug, but with browser.pagethumbnails.capturing_disabled set to true?

This was created in bug 892875:

// Background thumbnails in particular cause grief, and disabling thumbnails
// in general can't hurt - we re-enable them when tests need them.
user_pref("browser.pagethumbnails.capturing_disabled", true);
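
One way to try this locally is to append the pref to the test profile's user.js, which Firefox reads at startup. This is a minimal sketch under assumptions: `write_user_prefs` is a hypothetical helper, and this bug does not say how the pref was actually set for the Mozmill run.

```python
import json
import os


def write_user_prefs(profile_dir, prefs):
    """Append user_pref() lines for `prefs` to user.js in `profile_dir`.

    Hypothetical helper for illustration; returns the path of the file written.
    """
    path = os.path.join(profile_dir, "user.js")
    with open(path, "a", encoding="utf-8") as handle:
        for name, value in prefs.items():
            # json.dumps renders True as `true` and double-quotes strings,
            # which matches the literal syntax used in prefs files.
            handle.write(
                "user_pref(%s, %s);\n" % (json.dumps(name), json.dumps(value))
            )
    return path
```

For example, `write_user_prefs(profile_dir, {"browser.pagethumbnails.capturing_disabled": True})` adds the line quoted above to the profile before the test run starts.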
Flags: needinfo?(edilee)
(In reply to Ed Lee :Mardak from comment #7)
> (In reply to Henrik Skupin (:whimboo) from comment #6)
> > So it's not clear to me if that Tile issue in Firefox has been fixed
> > meanwhile or not.
> What tile issue are you referring to? I'm still not entirely sure what this
> bug is about either.

I'm also not totally sure, so I would leave the next steps up to Andrei. He should test your proposed steps.
Flags: needinfo?(andrei.eftimie)
Indeed with:
> user_pref("browser.pagethumbnails.capturing_disabled", true);

The failure doesn't reproduce.
Flags: needinfo?(andrei.eftimie)
(In reply to Andrei Eftimie from comment #1)
> 4) restart firefox // this is where something goes wrong

So can you tell what exactly goes wrong here? Which specific issue is this bug about: the missing symbols or the suspicious behavior of Firefox?
(In reply to Henrik Skupin (:whimboo) from comment #10)
> (In reply to Andrei Eftimie from comment #1)
> > 4) restart firefox // this is where something goes wrong
> 
> So can you tell what exactly goes wrong here? Can you please tell, which
> specific issue this bug is about? The missing symbols or the suspicious
> behavior of Firefox.

All info is in comment 1 and comment 2

Testcase: attachment 8466966 [details]
Testcase steps:
1) open a page
2) wait for it to load
3) open about:newtab
4) restart firefox // this is where something goes wrong

On step 4) Firefox doesn't restart and we see this info in the console.

This is again a problem with the tiles (as we've had many recently). The weird part is the symbol messages in the console.

The failure itself could have the same underlying cause as bug 1046078.
Oh I see Cosmin had the same output in bug 1046078 comment 38.
So let's get the summary updated; it's not clear at all. Now I understand, and I think it might be another manifestation of bug 974971.
Blocks: 974971
Summary: Failure logged as "stackwalker.cc:125: INFO: Couldn't load symbols for: [..]" → Firefox doesn't honor application.restart() and simply quits
Mozmill is dead, WONTFIX the remaining bugs.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
Product: Mozilla QA → Mozilla QA Graveyard