Closed Bug 1271000 Opened 6 years ago Closed 6 years ago

Document local unit test failures

Categories

(Firefox Build System :: Android Studio and Gradle Integration, defect)

All
Android
defect
Not set
normal

Tracking

(firefox49 fixed)

RESOLVED FIXED
mozilla49
Tracking Status
firefox49 --- fixed

People

(Reporter: mcomella, Assigned: mcomella)

References

Details

Attachments

(1 file)

They pass on treeherder but not locally – it'd be great to figure out why so we can make the command line invocation of this tool useful.

Note that the automation job runs the task, "app:testAutomationDebugUnitTest".
TestHTTPServerTestHelper fails via the command line but not in IJ – HM! The stack trace is:

org.mozilla.android.sync.test.helpers.HTTPServerTestHelper$HTTPServerAlreadyRunningError: org.mozilla.android.sync.test.helpers.HTTPServerTestHelper$HTTPServerStartedError
	at org.mozilla.android.sync.test.helpers.HTTPServerTestHelper.throwIfServerAlreadyRunning(HTTPServerTestHelper.java:150)
	at org.mozilla.android.sync.test.helpers.HTTPServerTestHelper.startHTTPServer(HTTPServerTestHelper.java:179)
	at org.mozilla.android.sync.test.helpers.HTTPServerTestHelper.startHTTPServer(HTTPServerTestHelper.java:200)
	at org.mozilla.android.sync.test.helpers.test.TestHTTPServerTestHelper.testForceStartTwice(TestHTTPServerTestHelper.java:92)
...
Caused by: org.mozilla.android.sync.test.helpers.HTTPServerTestHelper$HTTPServerStartedError
	at org.mozilla.android.sync.test.helpers.HTTPServerTestHelper.registerServerAsRunning(HTTPServerTestHelper.java:159)
	at org.mozilla.android.sync.test.helpers.HTTPServerTestHelper.startHTTPServer(HTTPServerTestHelper.java:188)
	at org.mozilla.android.sync.test.helpers.HTTPServerTestHelper.startHTTPServer(HTTPServerTestHelper.java:200)
	at org.mozilla.android.sync.net.test.TestClientsEngineStage.downloadClientRecords(TestClientsEngineStage.java:147)

---
It sounds like the test helper may not be getting cleaned up between test runs because TestHTTPServerHelperTestHelper failed by TestClientsEngineStage. If only I could figure out how to run tests individually via the command line...
Summary: Figure out why unit tests fail locally → Figure out why unit tests fail locally via command line
Assignee: nobody → michael.l.comella
Javadoc for HTTPServerTestHelper:

 * Test helper code to bind <code>MockServer</code> instances to ports.
 * <p>
 * Maintains a collection of running servers and (by default) throws helpful
 * errors if two servers are started "on top" of each other. The
 * <b>unchecked</b> exception thrown contains a stack trace pointing to where
 * the new server is being created and where the pre-existing server was
 * created.

Maybe the ports are:
 * already bound on my system (e.g. running a server for other reasons)
 * restricted ports that require sudo access (e.g. port 80)
 * not cleaned up between tests locally, but cleaned up in automation

In any case, we should make sure this gizmo cleans up its ports between tests.
That's weird. I don't see the failure when running testAutomationDebugUnitTest locally. But looking into the test report it seems like TestClientsEngineStage doesn't run at all..!?

(I only see testDSAGeneration fail. But that's probably because I did not setup the crypto policy: https://mail.mozilla.org/pipermail/mobile-firefox-dev/2015-September/001499.html)
Also: We recently saw a similar behavior: A test failing but not for everyone (and intermittent in automation): bug 1265979. Back then we realized that the order tests run in is different on some systems - we didn't investigate this further and instead fixed the test that depended on a particular order.
(In reply to Sebastian Kaspari (:sebastian) from comment #3)
> That's weird. I don't see the failure when running
> testAutomationDebugUnitTest locally. But looking into the test report it
> seems like TestClientsEngineStage doesn't run at all..!?

Nick showed me bug 1251053 which is about running the unit tests in parallel. My tests run faster than Nick's (34s vs. Nick's 120s; but Nick runs 20s in parallel) so I wonder if mine are incorrectly running in parallel.
I discovered testDownloadHasOurRecord fails due to a crypto error but it's masked by a "could not connect to host error" because the test is divided into a client/server architecture and only the client reports its errors.

Specifically, where we fail:
 DownloadLocalRecordMockServer.handle
  DLRMS.cryptoFromClient
   CryptoRecord.encrypt
    CryptoInfo.encrypt (static)
     CI.encrypt (instance)
      Cipher.init

Where a CryptoException is thrown: "Illegal key size or default parameters"
I tried installing the crypto files from comment 3 but it hasn't seemed to change anything but it's possible I installed them incorrectly.
(In reply to Michael Comella (:mcomella) from comment #7)
> I tried installing the crypto files from comment 3 but it hasn't seemed to
> change anything but it's possible I installed them incorrectly.

...and after sitting in the debugger for a few minutes trying to figure it out, all the tests except one are passing (the one sebastian mentioned in comment 3 – filed bug 1275423). I think the crypto policy I installed just took some time to get loaded into the JVM (or knocked out of cache) or something.

Conclusion: in order to get all of the tests to pass, you need to install the crypto policy. However, the failure messages aren't very descriptive because of the client/server nature of the tests. Additionally, some tests might leave bound ports on failure, causing the error messages to be even less descriptive.

What we can do about it - options:
* Make the client/server tests fail with an appropriate error message – the assertions thrown in the server don't seem to reach the console so maybe we can fix that
* Improve the documentation – explain what needs to be install, what fails, and why (basically what the conclusion here is/was).
* When gradle test fails, add a tidbit at the bottom. "Seeing failures? Did you install the crypto policies? <link>" We can also filter this so it'd only appear when crypto-related tests fail.
(In reply to Michael Comella (:mcomella) from comment #8)
> * Improve the documentation – explain what needs to be install, what fails,
> and why (basically what the conclusion here is/was).

Updated https://wiki.mozilla.org/Mobile/Fennec/Android/Testing#JUnit4_tests & sent out an email: https://mail.mozilla.org/pipermail/mobile-firefox-dev/2016-May/002085.html
(In reply to Michael Comella (:mcomella) from comment #8)
> * When gradle test fails, add a tidbit at the bottom. "Seeing failures? Did
> you install the crypto policies? <link>" We can also filter this so it'd
> only appear when crypto-related tests fail.

I could only get it to print above all tasks (including non-testing tasks) by including it inside `testOptions { unitTests.all {` so I opted not to include it at all.
re comment 10, I tried to add `doLast{...}` in various places, but the android plugin doesn't appear to contain real tasks and I tried to add `<test-task>.finalizedBy <print-task>` but it doesn't appear to recognize the tasks either.
(In reply to Michael Comella (:mcomella) from comment #8)
> * Make the client/server tests fail with an appropriate error message – the
> assertions thrown in the server don't seem to reach the console so maybe we
> can fix that

It appears JUnit does not report errors from background threads so you have to pass assertion failures back to the main thread. This appears to be non-trivial and there is no single convention to do this. Some posts suggest testing the two threads separately (and not putting them together), or running the interactions between the two threads sequentially. 

It'd be difficult to find all of the cases of multi-threaded testing in our unit test code and even if we could, I don't think it'd be a priority to rewrite them all.

I think the best solution here would be to add documentation – I've already documented the "how to run tests" documentation but I could also throw in a comment or two in the code.

[1]: http://stackoverflow.com/a/314729
Summary: Figure out why unit tests fail locally via command line → Document local unit test failures
Attachment #8756461 - Flags: review?(s.kaspari) → review+
Comment on attachment 8756461 [details]
MozReview Request: Bug 1271000 - Add explanation of potential test failures in TestClientsEngineStage. r=sebastian

https://reviewboard.mozilla.org/r/55194/#review51896
https://hg.mozilla.org/integration/fx-team/rev/c80d58a15cff0cecf948168917c45c50dbf473b1
Bug 1271000 - Add explanation of potential test failures in TestClientsEngineStage. r=sebastian
https://hg.mozilla.org/mozilla-central/rev/c80d58a15cff
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 49
Product: Firefox for Android → Firefox Build System
Target Milestone: Firefox 49 → mozilla49
You need to log in before you can comment on or make changes to this bug.