Closed Bug 656736 Opened 13 years ago Closed 13 years ago

Upgrade to latest MozMill for Thunderbird's tests

Categories

(Thunderbird :: Testing Infrastructure, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 9.0

People

(Reporter: standard8, Assigned: rain1)

References

Details

Attachments

(3 files, 10 obsolete files)

5.05 KB, patch
Details | Diff | Splinter Review
6.92 KB, patch
Details | Diff | Splinter Review
2.43 MB, application/zip
Details
Andrew's recommending we update our MozMill usage to the latest version, as this will hopefully solve some of the issues we currently have.

The latest available versions are:

- http://pypi.python.org/packages/source/m/mozrunner/mozrunner-2.5.4.tar.gz
- http://pypi.python.org/packages/source/j/jsbridge/jsbridge-2.4.3.tar.gz
- http://pypi.python.org/packages/source/m/mozmill/mozmill-1.5.3.tar.gz
Attached patch WIP (obsolete) — Splinter Review
This gets the basics working, but is hopelessly complete, and there's various test failures in there.
asuth: Can you take a look at this?

I've disabled all the error handling hooks in runtest.py, but I still only get TEST-UNEXPECTED-FAIL and no detail lines. Is this something you've done in our test code itself?

Also, I'm running on a single directory, and setupModule seems to fail for any test file after the first, despite the fact that running the subsequent files on their own pass fine...

I can't see anything in the 1.5.2/1.5.3 fixed list that says this has changed, so I'm kinda lost without any errors being logged.
(In reply to comment #2)
> I've disabled all the error handling hooks in runtest.py, but I still only
> get TEST-UNEXPECTED-FAIL and no detail lines. Is this something you've done
> in our test code itself?

(note: I did the disabling locally)
Assignee: nobody → bugmail
good news / bad news... I think I have it "working", but am seeing the same problem we are trying to escape:

error: uncaptured python exception, closing channel <jsbridge.network.BackChannel connected 127.0.0.1:24242 at 0x2be88c0> (<type 'exceptions.KeyError'>:'message' [/usr/lib64/python2.7/asyncore.py|readwrite|104] [/usr/lib64/python2.7/asyncore.py|handle_read_event|435] [/usr/lib/python2.7/site-packages/jsbridge-2.4.3-py2.7.egg/jsbridge/network.py|handle_read|94] [/usr/lib/python2.7/site-packages/jsbridge-2.4.3-py2.7.egg/jsbridge/network.py|process_read|255] [/usr/lib/python2.7/site-packages/jsbridge-2.4.3-py2.7.egg/jsbridge/network.py|fire_callbacks|270] [/usr/lib/python2.7/site-packages/jsbridge-2.4.3-py2.7.egg/jsbridge/network.py|fire_event|291] [/usr/lib/python2.7/site-packages/mozmill-1.5.3-py2.7.egg/mozmill/__init__.py|__call__|97] [runtest.py|logFailure|371])
oops, no that's my fancy pants logging that I need to update.
So it appears that the problem that is breaking mozmill right now is that an nsIException is bubbling up to cause test failures in setupModule.  Through some combination of violated assumptions on the part of my fancy pants logging, this was killing the python process when it saw an exception object that was just {}.

I have things "working" on mozmill, where "working" translates to we are now dying from:
nsIException: Component returned failure code: 0x80550013 [nsIMsgFolder.createSubfolder] (2153054227)

This raises the two major questions:

- Did something change in the platform error reporting recently that nsIExceptions would start coming out of the woodwork?

- Did something change recently in our createSubfolder semantics that would be a jerk?
Summary: Upgrade to latest MozMill for Thunderbird's tests → Resolve Thunderbird MozMill bad death failures, possibly via upgrade to latest MozMill for Thunderbird's tests
uh, and after I post that I immediately realize we might be screwing up our profile directory.  I'll look at that now...
Sorry for the chicken-with-its-head-cut-off posts here.

It appears that the createSubfolder problem is because newer mozmill uses a different submodule loading system.  Our shared-modules' defense about being initialized multiple times was being defeated, which was causing us to attempt to re-initialize all helper subsystems multiple times.  This resulted in us trying to re-create the INBOX multiple times, which led to createSubfolder getting deeply upset, which led to nsIException instances, which led to fancy pants error reporting falling over.
Summary: Resolve Thunderbird MozMill bad death failures, possibly via upgrade to latest MozMill for Thunderbird's tests → Upgrade to latest MozMill for Thunderbird's tests
Amendment to previous comment: the mechanics of module loading are basically the same (although there is now a 'require' provided that provides securable module semantics.)

The problem is that the python Mozmill runner no longer issues a single call to runTestDirectory, but rather N calls to runTestFile.  runTestFile instantiates a new Collector each time which means that our shared-module helpers each get loaded N times.
So, I tried to make this backwards compatible by detecting:
- whether self.options.test is a string or a list of strings
- whether we are on 1.5.x and therefore need to monkeypatch to change things to directory-centric behavior.

folder-display passes for me save for one archive test failure that the fancy pants logging correctly warns us about ("about to ignore command because it's disabled: cmd_archive").

I somehow confused myself into thinking that Thunderbird still was experiencing perma-orange, but since that is not the case, I won't be able to look into this any further.
Attachment #532026 - Attachment is obsolete: true
Attachment #532401 - Flags: feedback?(mbanner)
Assignee: bugmail → nobody
So, I'm not totally clear on where you are with this.  Do you need a hand with anything or are you ok?

One issue in the low level (javascript) platform that you want to avoid is the "errors are not enumerable" issue that we just had to patch in mozmill.  We just landed a fix for that on 1.5.4pre, which is available on pypi.  We will probably release 1.5.4 next week, depending on what else QA wants in the release.  

You can pick up the code by checking out the hotfix-1.5 branch from the mozautomation repo -- https://github.com/mozautomation/mozmill.
(In reply to comment #11)
> So, I'm not totally clear on where you are with this.  Do you need a hand
> with anything or are you ok?

First, thanks for the interest in helping us out here and making sure we are okay!

Short answer:

I think we would appreciate guidance on the proper way to provide support logic, preferably durable logic, but otherwise are good.  Thunderbird is a lot more stateful than Firefox and it would be simpler if we can have helpers loaded via mozmill that stick around without having to move them to be in-tree JS modules.  (We could also have the code unload at the end of each test and notice when we are being evaluated into a 'dirty' state and adapt to that, but that requires more work.)  If there is documentation on this, that would be great.

Assuming my patch works for us, I think Thunderbird wants to get on the mozmill 1.5.x bandwagon so we get nice fixes like that latest one.  What is the appropriate mailing list/RSS feed/venue for Thunderbird to keep abreast of such releases and notable fixes?


I believe the patch addresses all outstanding problems, which I'm breaking out here for Standard8's benefit:

- Thunderbird's custom logger is brittle in the face of unexpected exception objects (both with 1.4.x mozmill and mozmill 1.5.x).  The non-enumerable thing is likely the cause of the change in behaviour (bug 637207 landed apr 14th with follow-up apr 18th).  My attached patch addresses this.

- The change of 1.5.x to run tests individually with a separate JS context each time fouled up our helper code.  My attached patch addresses this by reverting to using runTestDirectory.

- The change of 1.5.x to having options.test be a list instead of a single string broke some of our driver code.  My attached patch address this by dealing with things being an array of strings; it also tries to deal with the old-school case where it is a string.


The "I won't be able to look into this any further" is a recognition that I'm no longer full-time on Thunderbird and blew my amortized Thunderbird-helping budget last week and am turfing this back to Standard8 for the time being.  (I am available for reviews and consultation.)

> One issue in the low level (javascript) platform that you want to avoid is
> the "errors are not enumerable" issue that we just had to patch in mozmill. 
> We just landed a fix for that on 1.5.4pre, which is available on pypi.  We
> will probably release 1.5.4 next week, depending on what else QA wants in
> the release.  

Righto, so this push:
https://github.com/mozautomation/mozmill/commit/198251db6273e5ec0d4b6dbd425c0519a9ab98ac
Blocks: 627829
Attached patch patch that works on Windows, v1 (obsolete) — Splinter Review
OK, so this passes for me on Windows.

The upgrade exposed a bug in a test, which I've fixed, but now I'm having second thoughts about whether this is the right way to fix it. A better way would probably be to make sure archiving is set back to enabled at the end of <https://mxr.mozilla.org/comm-central/source/mail/test/mozmill/folder-display/test-message-commands.js#412>, in teardownModule.

The reason the test hasn't been failing until now is because we don't check for a timeout at <https://mxr.mozilla.org/comm-central/source/mail/test/mozmill/shared-modules/test-folder-display-helpers.js#1429>. MozMill 1.5 always fails if controller.waitForEval times out :)
Assignee: nobody → sid.bugzilla
Attachment #532401 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #532401 - Flags: feedback?(mbanner)
Attachment #537855 - Flags: review?(mbanner)
great, glad you fixed this.

Yeah, my inclination is to make sure individual tests clean up after themselves; it's more maintainable that way.
For the record, MozMill 1.5.4b3 is up on pypi, so nothing blocks us from that side.
review ping -- Mark, if you're too busy, bienvenu's offered to look at the patch.
Comment on attachment 537855 [details] [diff] [review]
patch that works on Windows, v1

So this is looking reasonable, but on my Mac the message-header tests are failing, with no indication as to the actual failure still (just jsbridge complaining).

A few other tests fail, but they at least report the failure points.
Two more fixes:

1. MozMill changed its documentLoaded variable to mozmillDocumentLoaded, as a sort of namespacing I guess.
2. The a11y stuff's properly disabled on Mac now. Seems like we'd silently fail on Mac earlier. I had to move it to the end because that's how a function inside an if statement would be ordered regardless.
Attachment #537855 - Attachment is obsolete: true
Attachment #537855 - Flags: review?(mbanner)
Attachment #541133 - Flags: review?(mbanner)
(In reply to comment #18)
> 
> 1. MozMill changed its documentLoaded variable to mozmillDocumentLoaded, as
> a sort of namespacing I guess.

(bug 661408)
Comment on attachment 541133 [details] [diff] [review]
patch that works on Windows and Mac. v2

Ok, from what I can tell this works fine, but please check it is still ok with 1.5.4b4.

What I suggest we (well, mainly you!) do is:

- Work out a time for when we can close the trees and upgrade all the machines and trees (I'm assuming we need to update everything at the same time - does this still apply to 1.9.2?).
-- Talk to jhopkins, but I would propose Thursday as it should be a couple of days after the release, but it'll be before the merge to beta.
- File a bug on getting the builders upgraded, mention the specific versions.
- Publicise the closure
- Then we can push it out, I suspect there may be some failures that we'll need to fix up, hence if several of us are around for the upgrade time that would be good.
Attachment #541133 - Flags: review?(mbanner) → review+
Depends on: 668058
As discussed, the best way forward seems to be to switch to an in-tree MozMill. The important bits are in this patch -- the rest is just a straight code import of whichever python packages we need.

The patch works similarly to what's in mozilla-central. make -C mail/test/mozmill will
- copy over all the packages to objdir/mozilla/_tests/mozmill/resources.
- set up a virtualenv in objdir/mozilla/_tests/mozmill-virtualenv. Note that per a recommendation from jhammel, we deliberately do not set up a virtualenv inside objdir/mozilla/_tests/mozmill because it isn't guaranteed to be portable across different systems, even those running the same OS.
- install our libraries in the virtualenv.

Then, make mozmill and make mozmill-one simply run runtestlist.py/runtest.py using the virtualenv's python.

Buildbot will have to run
(1) installmozmill.py whatever/dir/should/be/the/virtualenv
(2) whatever/dir/should/be/the/virtualenv/{bin|Scripts}/python runtestlist.py etc as usual.

Other notes:
- PYTHONHOME overrides virtualenv so we unset it before trying to do anything. Buildbot will have to do the same.
- The installmozmill.py is mostly lifted from <https://mxr.mozilla.org/comm-central/source/mozilla/testing/mozmill/installmozmill.py>, with a bunch of unncessary stuff removed. (I don't think installing from pip is a good idea at all :) )
- I haven't really been able to test this on the tryserver because it involves buildbot changes. So we'll probably have to close the tree and iron any issues out.
Attachment #541133 - Attachment is obsolete: true
Attachment #543027 - Flags: review?(mbanner)
Sid, just wondering, will switching to an in-tree MozMill make it easy to upgrade to, say, Mozmill 2.0 in the future?

Or will it be so intertwined that another major uplifting effort will be required?

Also, will the version of Mozmill on c-c be exposed in some file?
(In reply to comment #22)
> Sid, just wondering, will switching to an in-tree MozMill make it easy to
> upgrade to, say, Mozmill 2.0 in the future?

Yes, modulo any APIs we use that Mozmill 2.0 breaks.

> 
> Also, will the version of Mozmill on c-c be exposed in some file?

Uh, well, I guess you could read it in from the in-tree setup.py.

---

The complete patch is too big to host on Bugzilla, so I'm currently hunting for another place to host it.
(In reply to comment #23)
> 
> The complete patch is too big to host on Bugzilla, so I'm currently hunting
> for another place to host it.

It's up at http://www.cse.iitk.ac.in/users/sagarwal/656736-mozmill-upgrade
Blocks: 581661
No real changes from v2, just s/RECURSIVE_INSTALL/DIR_INSTALL.
Attachment #543027 - Attachment is obsolete: true
Attachment #543027 - Flags: review?(mbanner)
Attachment #545719 - Flags: review?(mbanner)
Mark, you CC'ed me here, what is the action item I need to notice/what part of this matters to me. There is a lot of data here and not being someone who ever used MozMill directly, its hard to digest easily.
If mozmill doesn't get packaged how do the unit test builders install it? (they don't necessarily have a source tree checked out).
The MozMill source tree's packaged, but the virtualenv isn't. So unit test builders need to run installmozmill.py to create a virtualenv from the source tree.
We might want to switch to http://k0s.org/mozilla/hg/carton at some point though -- it's a tool that creates a single python file which when run expands into a virtualenv.
Comment on attachment 545719 [details] [diff] [review]
patch v3 that imports mozmill into the tree (important bits)

diff --git a/mail/test/mozmill/folder-display/test-message-commands.js b/mail/test/mozmill/folder-display/test-message-commands.js

+  enable_archiving(true);
+}
\ No newline at end of file

nit: please fix.

+# Copy MozMill and its dependencies over, and set up a virtualenv. The
+# virtualenv directory is outside because we don't want to bundle it up during
+# stage-package.
+VIRTUALENV_DIR = $(_DEST_DIR)/../mozmill-virtualenv
+mozmill-virtualenv: NSDISTMODE=copy
+mozmill-virtualenv:
+	$(DIR_INSTALL) $(topsrcdir)/mail/test/resources $(_DEST_DIR)
+	rm -rf $(VIRTUALENV_DIR) && \
+	mkdir $(VIRTUALENV_DIR) && \
+	$(PYTHON) $(_DEST_DIR)/resources/installmozmill.py $(VIRTUALENV_DIR)

The comment on this section is confusing. How about something more like "virtualenv goes into its own directory and doesn't get packaged into tests - as it is only applicable to the build on this machine."


Reminder re buildbot, we'll need it to not run installmozmill.py on the branches, probably the best way to do this is to detect if the file exists or not.
Attachment #545719 - Flags: review?(mbanner) → review+
This patch and the next one are both untested, but I think they should work...
Attachment #548288 - Flags: review?(jhopkins)
Attachment #548289 - Flags: review?(jhopkins)
Attachment #548288 - Attachment is obsolete: true
Attachment #548288 - Flags: review?(jhopkins)
Attachment #548291 - Flags: review?(jhopkins)
I fixed the indentation.
Attachment #548289 - Attachment is obsolete: true
Attachment #548289 - Flags: review?(jhopkins)
Attachment #548292 - Flags: review?(jhopkins)
Depends on: 676132
Comment on attachment 548291 [details] [diff] [review]
better patch to buildbotcustom-thunderbird

>diff --git a/process/factory.py b/process/factory.py
>+                # Older comm-central branches use a centrally-installed
>+                # MozMill. We figure this out by seeing if installmozmill.py is
>+                # present.
>+                if os.path.exists('build/mozmill/resources/installmozmill.py'):

Unless I am _very_ confused, this will check the path on the buildbot master, using its cwd as a reference. Which will never find this file.
Attachment #548291 - Flags: feedback-
Yes, we already figured that out on IRC. I updated the patch locally and will put it up on bugzilla once we can get it working on tinderbox.
This is what gozer gave me. I've cleaned up the patch just a little (fixing up comments etc), so it'd be great if you could apply this. I'll also upload a fixed buildbotcustom-thunderbird-try patch shortly.
Attachment #548291 - Attachment is obsolete: true
Attachment #548291 - Flags: review?(jhopkins)
This isn't tested, but it's really the same code (though I might have messed something up...)
Attachment #548292 - Attachment is obsolete: true
Attachment #548292 - Flags: review?(jhopkins)
I'm still seeing errors with a few tests on staging mini-01, but (over an ssh connection) I found that the current version of mozmill on that machine has the very same problems, and that my own OS X install doesn't. So I'm going to attribute this to a problem with the build slave or the build itself, and recommend that we get everything into the tree on Monday if possible.
OK, so the Mac and Linux issues seem to have been sorted out. Let's try getting this in tomorrow!
http://hg.mozilla.org/comm-central/rev/790cc2369783

The patch stuck.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 9.0
No longer depends on: 676132
Attached file final patch checked in
(the patch is too big to upload to bugzilla as is, so I've zipped it up)
Attachment #545719 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: