Closed Bug 1611078 Opened 4 years ago Closed 4 years ago

Investigate harness crashes for "Puppeteer.launch" unit test's (launcher.spec.js)

Categories

(Remote Protocol :: Agent, task, P3)

task

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: whimboo, Unassigned)

References

Details

All the unit tests for the launcher under Puppeteer.launch are currently disabled due to possible hangs of maybe Firefox, and a resulting crash of the test harness. Given that launching the browser is a crucial part of Puppeteer support, we should get this investigated and fixed.

Actually I can see that when running the tests locally an instance of Firefox is NOT closed and as such remains open and most likely blocking the websocket port. As such all successive tests are going to fail.

Given that those tests open and close Firefox kinda quickly I wonder if it's overlapping some startup and shutdown code. Lots of failures can be seen in the log for domains not being destructed, including the Network domain for Puppeteer.launch userDataDir option:

1579766759922	RemoteAgent	ERROR	unable to stop listener: [Exception... "ServiceManager::GetService returned failure code:"  nsresult: "0x80570016 (NS_ERROR_XPC_GS_RETURNED_FAILURE)"  location: "JS frame :: chrome://remote/content/domains/parent/network/ChannelEventSink.jsm :: ChannelEventSinkFactory.getService :: line 104"  data: no] Stack trace: ChannelEventSinkFactory.getService()@ChannelEventSink.jsm:104
dispose()@NetworkObserver.jsm:86
disable()@Network.jsm:82

If any destructor in the clear() method of the DomainCache fails, it will bail out immediately, instead of trying to destruct the other domains, and the rest of the teardown code. Moving the try/catch from RemoteAgent to that code would maybe make more sense:

https://searchfox.org/mozilla-central/rev/e878e5b81bb319c141900ce9cfcde732df5c8449/remote/domains/DomainCache.jsm#105

And for ChannelEventSinkFactory.getService() we should return null if the service is not available.

I actually have the launcher unit tests fixed as part of my WIP on Bug 1596886. The tests were launching Firefox in a way that didn't make sense.

I think what you're referring to in Comment 2 is due to the browser being killed after a harness timeout. The fix you propose isn't a bad idea, but it's not a blocker for anything.

That's great to hear! So we can move this out of beta.

Depends on: 1596886
Whiteboard: [puppeteer-beta-reserve]

Maja, this is all good now given that your patch landed?

Flags: needinfo?(mjzffr)

There are two failures that are unresolved: tests that work fine on Travis but hang (and the harness crashes) in TaskCluster

Flags: needinfo?(mjzffr)

Which exact tests, which still need to be fixed, are remaining here?

Please also see https://github.com/puppeteer/puppeteer/issues/5466 for one reason why we have those crashes when running the Puppeteer unit tests. Maybe there is some other or similar behavior here for the launcher tests.

Depends on: 1623128

Given that Puppeteer 3 came with a lot of harness changes, and Maja currently works on a downstream sync via bug 1632710, we should wait until that patch landed before doing any further investigation here.

Depends on: 1632710

Maja, given that you were able to remove the skip from all the tests can I assume that we no longer have the crashes in case of missing Promise resolutions? That would be great!

Flags: needinfo?(mjzffr)

Yep

Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(mjzffr)
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.