Bug 823989 - Permaorange browser-chrome on Aurora Linux64 and Win7 nightly builds since merge of Firefox 20 to Aurora

Status: RESOLVED FIXED
Keywords: crash, intermittent-failure
Product: Testing
Classification: Components
Component: BrowserTest
Version: 20 Branch
Hardware: All / OS: All
Importance: -- critical (1 vote)
Assigned To: Nobody; OK to take it and work on it
Duplicates: 825246
Depends on: 831854, 832702, 841029
Blocks: 784681

Reported: 2012-12-21 09:50 PST by Richard Newman [:rnewman]
Modified: 2013-04-25 02:11 PDT
CC: 16 users

Attachments
Work-around (1.03 KB, patch) - 2013-01-17 10:54 PST, :Ehsan Akhgari - mak77: review+
Extreme measures (680 bytes, patch) - 2013-01-17 19:47 PST, Phil Ringnalda (:philor, back in August) - no flags

Description Richard Newman [:rnewman] 2012-12-21 09:50:43 PST
+++ This bug was initially created as a clone of Bug #818990 +++

*** Start BrowserChrome Test Results ***
TEST-INFO | checking window state
TEST-INFO | unknown test url | must wait for focus
TEST-INFO | (browser-test.js) | Console message: PAC file installed from data:text/plain,function%20FindProxyForURL(url,%20host){%20%20var%20origins%20=%20['http://127.0.0.1:80',%20'…
INFO | automation.py | Application ran for: 0:05:36.670928
INFO | automation.py | Reading PID log: /tmp/tmpQDy7Vppidlog
Downloading symbols from: http://ftp-scl3.mozilla.com/pub/mozilla.org/firefox/tinderbox-builds/mozilla-aurora-linux64/1356092418/firefox-19.0a2.en-US.linux-x86_64.crashreporter-symbols.zip
PROCESS-CRASH | automation.py | application crashed [@ libc-2.11.so + 0xd4aa3]
Crash dump filename: /tmp/tmpXMOrul/minidumps/451a7cee-12ba-4cd4-039396d9-6fa8d400.dmp
Operating system: Linux
                  0.0.0 Linux 2.6.31.5-127.fc12.x86_64 #1 SMP Sat Nov 7 21:11:14 EST 2009 x86_64
CPU: amd64
     family 6 model 23 stepping 10
     2 CPUs

Crash reason:  SIGABRT
Crash address: 0x1f4000008a1

Thread 0 (crashed)
 0  libc-2.11.so + 0xd4aa3
    rbx = 0x00007f3b2a4d4d00   r12 = 0x00000000ffffffff
    r13 = 0x00000034d4ce5160   r14 = 0x0000000000000008
    r15 = 0x00007f3b456596d8   rip = 0x00000034d2ed4aa3
    rsp = 0x00007fff2068fa40   rbp = 0x0000000000000008
    Found by: given as instruction pointer in context
 1  libxul.so!PollWrapper [nsAppShell.cpp:4287525881ec : 35 + 0xd]
    rip = 0x00007f3b42b8409e   rsp = 0x00007fff2068fa70
    Found by: stack scanning
 2  libglib-2.0.so.0.2200.2 + 0x3c9fb
    rbx = 0x00007f3b456596d0   r12 = 0x00007f3b42b84070
    rip = 0x00000034d4a3c9fc   rsp = 0x00007fff2068fa90
    rbp = 0x00007f3b2a4d4d00
    Found by: call frame info
 3  libglib-2.0.so.0.2200.2 + 0x2e4ac7
    rip = 0x00000034d4ce4ac8   rsp = 0x00007fff2068fa98
    Found by: stack scanning
 4  libglib-2.0.so.0.2200.2 + 0x2e4aff
    rip = 0x00000034d4ce4b00   rsp = 0x00007fff2068faa0
    Found by: stack scanning
 5  libpthread-2.11.so + 0x8daf
    rip = 0x00000034d3608db0   rsp = 0x00007fff2068faf8
    Found by: stack scanning
 6  libglib-2.0.so.0.2200.2 + 0x3cd39
    rip = 0x00000034d4a3cd3a   rsp = 0x00007fff2068fb10
    Found by: stack scanning
 7  libxul.so!nsAppShell::ProcessNextNativeEvent(bool) [nsAppShell.cpp:4287525881ec : 135 + 0xa]
    rip = 0x00007f3b42b8405f   rsp = 0x00007fff2068fb40
    Found by: stack scanning
 8  libxul.so!nsBaseAppShell::DoProcessNextNativeEvent(bool, unsigned int) [nsBaseAppShell.cpp:4287525881ec : 139 + 0x5]
    rip = 0x00007f3b42b89ec9   rsp = 0x00007fff2068fb50
    Found by: call frame info
 9  libxul.so!nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, bool, unsigned int) [nsBaseAppShell.cpp:4287525881ec : 298 + 0x4]
    rbx = 0x00007f3b37b53080   r12 = 0x00000000002aa076
    rip = 0x00007f3b42b8a081   rsp = 0x00007fff2068fb80
    rbp = 0x00007f3b45625d40
    Found by: call frame info
10  libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:4287525881ec : 600 + 0x7]
    rbx = 0x00007f3b45625d40   r12 = 0x0000000000000001
Comment 1 Ed Morley [:emorley] 2012-12-21 09:56:05 PST
Don't suppose you have the log/tbpl url? :-)
Comment 2 Richard Newman [:rnewman] 2012-12-21 10:34:07 PST
Sorry, haven't had my coffee yet!

https://tbpl.mozilla.org/php/getParsedLog.php?id=18162696&tree=Mozilla-Aurora
Comment 3 Ed Morley [:emorley] 2012-12-21 10:36:13 PST
Thank you :-)
Comment 4 Treeherder Robot 2012-12-21 18:11:27 PST
heycam
https://tbpl.mozilla.org/php/getParsedLog.php?id=18185213&tree=Mozilla-Inbound
Rev3 Fedora 12x64 mozilla-inbound opt test mochitest-1 on 2012-12-21 17:33:55
slave: talos-r3-fed64-040

TEST-UNEXPECTED-FAIL | /tests/content/events/test/test_bug508479.html | application timed out after 330 seconds with no output
PROCESS-CRASH | /tests/content/events/test/test_bug508479.html | application crashed [@ libc-2.11.so + 0xd4aa3]
Thread 0 (crashed)
Comment 5 Treeherder Robot 2012-12-22 20:10:03 PST
mbrubeck
https://tbpl.mozilla.org/php/getParsedLog.php?id=18202686&tree=Mozilla-Aurora
Rev3 Fedora 12x64 mozilla-aurora pgo test mochitest-browser-chrome on 2012-12-22 06:10:41
slave: talos-r3-fed64-041

TEST-UNEXPECTED-FAIL | automation.py | application timed out after 330 seconds with no output
PROCESS-CRASH | automation.py | application crashed [@ libc-2.11.so + 0xd4aa3]
Thread 0 (crashed)
Comment 6 Treeherder Robot 2012-12-23 10:29:21 PST
RyanVM
https://tbpl.mozilla.org/php/getParsedLog.php?id=18221672&tree=Mozilla-Aurora
Rev3 Fedora 12x64 mozilla-aurora pgo test mochitest-browser-chrome on 2012-12-23 07:03:04
slave: talos-r3-fed64-030

TEST-UNEXPECTED-FAIL | automation.py | application timed out after 330 seconds with no output
PROCESS-CRASH | automation.py | application crashed [@ libc-2.11.so + 0xd4aa3]
Thread 0 (crashed)
Comment 7 Treeherder Robot 2012-12-24 08:05:18 PST
RyanVM
https://tbpl.mozilla.org/php/getParsedLog.php?id=18240125&tree=Mozilla-Aurora
Rev3 Fedora 12x64 mozilla-aurora pgo test mochitest-browser-chrome on 2012-12-24 06:14:29
slave: talos-r3-fed64-046

TEST-UNEXPECTED-FAIL | automation.py | application timed out after 330 seconds with no output
PROCESS-CRASH | automation.py | application crashed [@ libc-2.11.so + 0xd4aa3]
Thread 0 (crashed)
Comment 8 Treeherder Robot 2012-12-26 05:37:11 PST
RyanVM
https://tbpl.mozilla.org/php/getParsedLog.php?id=18256038&tree=Mozilla-Aurora
Rev3 Fedora 12x64 mozilla-aurora pgo test mochitest-browser-chrome on 2012-12-25 06:13:20
slave: talos-r3-fed64-046

TEST-UNEXPECTED-FAIL | automation.py | application timed out after 330 seconds with no output
PROCESS-CRASH | automation.py | application crashed [@ libc-2.11.so + 0xd4aa3]
Thread 0 (crashed)
Comment 9 Treeherder Robot 2012-12-26 06:26:37 PST
RyanVM
https://tbpl.mozilla.org/php/getParsedLog.php?id=18269330&tree=Mozilla-Aurora
Rev3 Fedora 12x64 mozilla-aurora pgo test mochitest-browser-chrome on 2012-12-26 06:09:37
slave: talos-r3-fed64-066

TEST-UNEXPECTED-FAIL | automation.py | application timed out after 330 seconds with no output
PROCESS-CRASH | automation.py | application crashed [@ libc-2.11.so + 0xd4aa3]
Thread 0 (crashed)
Comment 10 Jeff Hammel 2012-12-26 10:51:49 PST
As best I can tell, this isn't a talos bug?
Comment 11 Treeherder Robot 2012-12-27 14:43:03 PST
RyanVM
https://tbpl.mozilla.org/php/getParsedLog.php?id=18293088&tree=Mozilla-Aurora
Rev3 Fedora 12x64 mozilla-aurora pgo test mochitest-browser-chrome on 2012-12-27 06:11:29
slave: talos-r3-fed64-019

TEST-UNEXPECTED-FAIL | automation.py | application timed out after 330 seconds with no output
PROCESS-CRASH | automation.py | application crashed [@ libc-2.11.so + 0xd4aa3]
Thread 0 (crashed)
Comment 12 Treeherder Robot 2012-12-28 09:15:24 PST
jdm
https://tbpl.mozilla.org/php/getParsedLog.php?id=18318609&tree=Mozilla-Aurora
Rev3 Fedora 12x64 mozilla-aurora pgo test mochitest-browser-chrome on 2012-12-28 06:50:30
slave: talos-r3-fed64-068

TEST-UNEXPECTED-FAIL | automation.py | application timed out after 330 seconds with no output
PROCESS-CRASH | automation.py | application crashed [@ libc-2.11.so + 0xd4aa3]
Thread 0 (crashed)
Comment 13 Treeherder Robot 2012-12-29 07:05:54 PST
RyanVM
https://tbpl.mozilla.org/php/getParsedLog.php?id=18336711&tree=Mozilla-Aurora
Rev3 Fedora 12x64 mozilla-aurora pgo test mochitest-browser-chrome on 2012-12-29 06:21:16
slave: talos-r3-fed64-025

TEST-UNEXPECTED-FAIL | automation.py | application timed out after 330 seconds with no output
PROCESS-CRASH | automation.py | application crashed [@ libc-2.11.so + 0xd4aa3]
Thread 0 (crashed)
Comment 14 Phil Ringnalda (:philor, back in August) 2012-12-29 13:36:50 PST
Charmingly enough, it's not just Linux64 and it's not particularly intermittent either.

https://tbpl.mozilla.org/?tree=Mozilla-Aurora&rev=2f801d18884d was the first rev to build nightlies after 19 merged to Aurora, and it hit this. Since then, there have been only 3 or 4 Linux browser-chrome runs that should have been running against the nightly build which have not hit this (which may mean it's nearly permaorange rather than actually permaorange, or may mean that tests against nightlies don't always actually run on the nightly, I didn't investigate them).

The "crash" signature is different between Linux64 and Linux32, but I don't think that's significant, it's just where they happen to be sitting idling when the timeout kills them. The visible and possibly significant differences between working runs on dep jobs and failing runs on nightlies seem to be that testpilot is enabled and installed, and that the nightlies have that line as shown abbreviated in comment 0 about "TEST-INFO | (browser-test.js) | Console message: PAC file installed from data:text/plain,function%20FindProxyForURL".
Comment 15 Phil Ringnalda (:philor, back in August) 2012-12-29 13:37:37 PST
*** Bug 825246 has been marked as a duplicate of this bug. ***
Comment 16 Phil Ringnalda (:philor, back in August) 2012-12-30 08:09:23 PST
Repros on try (https://tbpl.mozilla.org/?tree=Try&rev=5ab7bc28b255) with "export MOZ_UPDATE_CHANNEL=aurora" so that testpilot winds up installed (and failed to repro when I took a different and failed approach to getting testpilot installed by just hacking at extension/Makefile.in). Not sure if there are other byproducts of setting MOZ_UPDATE_CHANNEL, though.

https://tbpl.mozilla.org/php/getParsedLog.php?id=18352661&tree=Mozilla-Aurora
https://tbpl.mozilla.org/php/getParsedLog.php?id=18352610&tree=Mozilla-Aurora
Comment 21 Phil Ringnalda (:philor, back in August) 2013-01-04 09:12:42 PST
Not quite perma, since I only got the one https://tbpl.mozilla.org/php/getParsedLog.php?id=18463151&tree=Mozilla-Aurora out of https://tbpl.mozilla.org/?tree=Mozilla-Aurora&onlyunstarred=1&rev=32dba69af0fa and the linux32 one does appear to have downloaded the nightly.
Comment 24 Phil Ringnalda (:philor, back in August) 2013-01-08 15:28:14 PST
And not 19, since Aurora 20 is affected, so I'll bet what I really meant was "any build which includes testpilot, but the only ones of those where I see the tests are Aurora nightlies." 

https://tbpl.mozilla.org/php/getParsedLog.php?id=18605589&tree=Mozilla-Aurora
https://tbpl.mozilla.org/php/getParsedLog.php?id=18603373&tree=Mozilla-Aurora
Comment 26 Ed Morley [:emorley] 2013-01-16 09:02:57 PST
akeybl, this is permaorange on Aurora nightlies. Aurora has now been closed since no one has been forthcoming in fixing it. Could you find someone with some cycles that could take a look?
Comment 27 Phil Ringnalda (:philor, back in August) 2013-01-16 09:27:12 PST
Along with the Linux permaorange that came in on the 19 merge, Mac and Windows b-c became permaorange on the 20 merge - Mac with "leaking until shutdown" like https://tbpl.mozilla.org/php/getParsedLog.php?id=18817908&tree=Mozilla-Aurora and Windows with those plus failures a la https://tbpl.mozilla.org/php/getParsedLog.php?id=18822117&tree=Mozilla-Aurora which remind me a great deal of the Aurora bustage that turned out to be OOM from too many threads from bug 802239. 

Does testpilot spin up a huge number of threads?

Do we really want to keep this situation where we run tests with testpilot included, but only on Aurora nightlies and not anywhere else?
Comment 28 Alex Keybl [:akeybl] 2013-01-16 14:16:24 PST
I'm going to reach out to fx-team, given the reproducible steps in comment 16.
Comment 29 :Ehsan Akhgari 2013-01-17 08:18:10 PST
This is easily reproducible locally as well.
Comment 30 :Ehsan Akhgari 2013-01-17 10:42:18 PST
So here's what happens.  During testpilot init, we get to this code: <http://mxr.mozilla.org/mozilla-central/source/browser/app/profile/extensions/testpilot@labs.mozilla.com/modules/interface.js#87>.  As part of BrowserToolboxCustomizeDone, the content area gets focused: <http://mxr.mozilla.org/mozilla-central/source/browser/base/content/browser.js#3681>.

Later on, when we want to start running the tests, we get to this point: <http://mxr.mozilla.org/mozilla-central/source/testing/mochitest/browser-test.js#281>.  waitForWindowsState calls waitForFocus here: <http://mxr.mozilla.org/mozilla-central/source/testing/mochitest/browser-test.js#154> and we attempt to wait for focus on the window, but the focus event is never dispatched since the element to be focused is already focused, but the focus manager's activeWindow property returns null, so we can't detect that case.

nsFocusManager::WindowRaised seems to be responsible for updating mActiveWindow, and when I turn on focus manager logging, WindowRaised is called way after this stuff.
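The race Ehsan describes can be sketched in a few lines. This is a hypothetical, simplified model, not the real browser-test.js or nsFocusManager code; all names here are stand-ins:

```javascript
// Simplified model of the hang: waitForFocus fires its callback immediately
// if the focus manager says the window is already active, and otherwise
// waits for a "focus" event.
function waitForFocus(win, focusManager, onFocused) {
  // Intended fast path: window already focused, run the callback right away.
  if (focusManager.activeWindow === win) {
    onFocused();
    return;
  }
  // Slow path: wait for a focus event. If the window was focused *before*
  // this listener was attached, the event never fires and we hang here.
  win.addEventListener("focus", onFocused);
}

// Failing scenario from the comment above: the content area is already
// focused (testpilot's toolbar customization focused it), but
// nsFocusManager::WindowRaised has not run yet, so activeWindow is null.
const win = {
  listeners: [],
  addEventListener(type, cb) { this.listeners.push(cb); },
};
const staleFocusManager = { activeWindow: null };

let testsStarted = false;
waitForFocus(win, staleFocusManager, () => { testsStarted = true; });
console.log(testsStarted ? "tests started" : "hang: no focus event will ever fire");
```

Neither path can succeed: the fast path misses because activeWindow is stale, and the slow path waits forever, which matches the 330-second no-output timeouts in the logs.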
Comment 31 :Ehsan Akhgari 2013-01-17 10:54:24 PST
Created attachment 703412 [details] [diff] [review]
Work-around

This patch works around the problem by preventing the testpilot extension from trying to customize the toolbar and hence screwing with focus.
Comment 32 :Ehsan Akhgari 2013-01-17 10:58:42 PST
Filed bug 831854 as a follow-up to fix this issue for real.  I think I'll go ahead and push the patch pending post-landing review.  I've already wasted enough time on this.
Comment 34 Ed Morley [:emorley] 2013-01-17 15:16:00 PST
I've filed bug 832050 for making sure that Nightly-only test breakage is more obvious (they are currently indistinguishable from pgo test results, meaning a later green pgo result implies it was only an intermittent).
Comment 35 Phil Ringnalda (:philor, back in August) 2013-01-17 18:49:00 PST
After Ehsan's fix and finally getting runs on every platform, our status now is that Linux32, Mac, and WinXP only have "leaked until shutdown" errors from devtools tests, but Win7 and Linux64 have those plus things just like the symptoms of bug 798849 (timeouts in devtools tests, yeah, but get them out of the way and you have to deal with pdfjs timeouts, get them out of the way and you have to deal with addonmgr timeouts and browser_bug666317.js and a host of others) that bug 802239 fixed. Whether testpilot uses (or leaks) a ton of memory, or we're right on the threshold anyway and it pushes us over, or it's something else, we *look* exactly like we do when we're OOM.
Comment 36 Phil Ringnalda (:philor, back in August) 2013-01-17 19:47:23 PST
Created attachment 703732 [details] [diff] [review]
Extreme measures

Couple of choices:

You can pass this bug around through your top generalists, khuey and bz and roc and dbaron and bsmedberg and billm and karlt and I'll think of the next set of people who aren't afraid to look at something that could be coming from any part of the codebase when I need to, until you hit on one who wants to land something on aurora badly enough to borrow a slave (since they're unlikely to have a sufficiently hobbled machine to let them repro OOM) and figure out what's going wrong remotely.

Or you can just land this patch, stop building testpilot on the only tree where we actually look at the results of testing with it built, and reopen aurora in a few hours.

Personally, I can't quite decide which choice I'd take, if I were in the unfortunate position to choose.
Comment 37 Phil Ringnalda (:philor, back in August) 2013-01-17 21:41:37 PST
I installed testpilot, and when I went looking for the active tests that would explain why we are shipping it, it looks like there's one (either active or forgotten) for Thunderbird, and the last active Firefox tests were in the spring of 2011.

I take it back, I *can* decide whether I'd take a weeks-long closure of aurora while burning the time of some of our most expensive developers or stop shipping an addon that hasn't done anything for a year and a half.
Comment 38 David Baron :dbaron: ⌚️UTC+2 (mostly busy through August 4; review requests must explain patch) 2013-01-18 02:11:32 PST
So do we have a good regression window for this?  It looks like it started before the last merge (which was January 7?)... which makes me puzzled as to why it's not happening on beta now too.
Comment 39 David Baron :dbaron: ⌚️UTC+2 (mostly busy through August 4; review requests must explain patch) 2013-01-18 02:21:47 PST
I've been scrolling way down on https://tbpl.mozilla.org/?tree=Mozilla-Aurora&jobname=Rev3%20Fedora%2012x64%20mozilla-aurora%20pgo%20test%20mochitest-browser-chrome (though I suppose I could have pulled the nightly changeset hashes off FTP); hopefully I'll have an answer at some point.
Comment 40 David Baron :dbaron: ⌚️UTC+2 (mostly busy through August 4; review requests must explain patch) 2013-01-18 02:31:20 PST
Actually, I'm guessing it's not showing up on Beta because we don't do nightlies on Beta (or at least there aren't any on tbpl).

And as I scrolled further down, I realized it probably was the previous merge (when 19 merged to aurora), so I pulled:
https://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2012/11/2012-11-19-04-20-13-mozilla-aurora/firefox-18.0a2.en-US.linux-x86_64.txt
https://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2012/11/2012-11-20-04-20-14-mozilla-aurora/firefox-19.0a2.en-US.linux-x86_64.txt

which led to:
https://tbpl.mozilla.org/?tree=Mozilla-Aurora&rev=edc2aedfaed5
https://tbpl.mozilla.org/?tree=Mozilla-Aurora&rev=5f19747d3410

But I guess philor found that already in comment 14; I should have read more closely.  Why don't I put it in the summary where it belongs so others don't do the same, at least.
Comment 41 Marco Bonardo [::mak] (Away 6-20 Aug) 2013-01-18 02:48:05 PST
Comment on attachment 703412 [details] [diff] [review]
Work-around

Review of attachment 703412 [details] [diff] [review]:
-----------------------------------------------------------------

nit: would have been better to keep the testpilot prefs near each other (there was already one some rows below)
Comment 42 :Ehsan Akhgari 2013-01-18 07:28:11 PST
(In reply to Marco Bonardo [:mak] from comment #41)
> Comment on attachment 703412 [details] [diff] [review]
> Work-around
> 
> Review of attachment 703412 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> nit: would have been better to keep the testpilot prefs near each other
> (there was already one some rows below)

(Landed on trunk as https://hg.mozilla.org/integration/mozilla-inbound/rev/7d5fdfc2b165, totally not worth backporting to Aurora)
Comment 43 :Ehsan Akhgari 2013-01-18 08:20:41 PST
Comment on attachment 703732 [details] [diff] [review]
Extreme measures

FWIW I'd take this if it gives us all green nightly builds.  AFAIK we're not actually running any user studies through this extension.
Comment 44 David Baron :dbaron: ⌚️UTC+2 (mostly busy through August 4; review requests must explain patch) 2013-01-18 08:38:54 PST
So I'd missed comment 27, though I think the new failures likely belong in another bug.

However, to test philor's theory in comment 36 that all of the failures are due to testpilot, which after discussion appears not to have been confirmed, I did two try runs off of aurora, one with testpilot:
https://tbpl.mozilla.org/?tree=Try&rev=09687ee6aec9
and one without:
https://tbpl.mozilla.org/?tree=Try&rev=54bfcb35a934
(at least assuming I did it correctly).
Comment 45 Phil Ringnalda (:philor, back in August) 2013-01-18 10:44:18 PST
(In reply to David Baron [:dbaron] from comment #44)
> I did two try runs off of aurora, one with testpilot:
> https://tbpl.mozilla.org/?tree=Try&rev=09687ee6aec9

That didn't actually get testpilot for you, because you have to export the env var before http://mxr.mozilla.org/mozilla-aurora/source/browser/config/mozconfigs/linux32/nightly#1 (or redo that line after you export it in the override).
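The ordering pitfall philor points out can be illustrated like this (a hypothetical shell sketch; everything except MOZ_UPDATE_CHANNEL is a stand-in, not the real mozconfig):

```shell
# Simulate a mozconfig that captures MOZ_UPDATE_CHANNEL when it is sourced.
unset MOZ_UPDATE_CHANNEL                             # clean slate for the demo

mozconfig_channel="${MOZ_UPDATE_CHANNEL:-default}"   # value baked in at source time

export MOZ_UPDATE_CHANNEL=aurora                     # exported too late: already captured

echo "channel the build saw: ${mozconfig_channel}"   # "default", not "aurora"
```

This is why the try push in comment 44 didn't actually build testpilot: the export has to happen before the mozconfig line that reads the variable, or that line has to be re-evaluated afterwards.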
Comment 46 David Baron :dbaron: ⌚️UTC+2 (mostly busy through August 4; review requests must explain patch) 2013-01-18 11:24:58 PST
Indeed.  New pair, overriding the configure option directly:

As things are now on aurora, with aurora update channel:
https://tbpl.mozilla.org/?tree=Try&rev=3186beecad30

Plus removal of testpilot:
https://tbpl.mozilla.org/?tree=Try&rev=0d401b02bc4a
Comment 47 :Ehsan Akhgari 2013-01-18 12:28:59 PST
(In reply to comment #46)
> Indeed.  New pair, overriding the configure option directly:
> 
> As things are now on aurora, with aurora update channel:
> https://tbpl.mozilla.org/?tree=Try&rev=3186beecad30
> 
> Plus removal of testpilot:
> https://tbpl.mozilla.org/?tree=Try&rev=0d401b02bc4a

Seems like the second push is burning at least on Linux and Mac.
Comment 48 David Baron :dbaron: ⌚️UTC+2 (mostly busy through August 4; review requests must explain patch) 2013-01-18 14:32:56 PST
New second try push:
https://tbpl.mozilla.org/?tree=Try&rev=bd8b84c0b9b1
in which:
https://hg.mozilla.org/users/dbaron_mozilla.com/patches-aurora/raw-file/2a9366c139d9/no-aurora-testpilot
replaces attachment 703732 [details] [diff] [review] from comment 36.
Comment 49 Phil Ringnalda (:philor, back in August) 2013-01-18 20:32:52 PST
Current status:

Aurora is closed because Linux64 and Win7 browser-chrome against nightlies fail in a way which stops the test suite from finishing, making it impossible to tell whether any new failures have been added.

On those two platforms we get a complex of test failures which looks exactly like the bug 798849 OOM failures that we hit in both June/July and October 2012. We have no idea what fixed them in June/July; in October it turned out that we were winding up with ~300 storage threads.
Comment 50 Phil Ringnalda (:philor, back in August) 2013-01-18 20:39:32 PST
Comment on attachment 703732 [details] [diff] [review]
Extreme measures

I don't know why I'm surprised that this awful method of enabling or disabling building an extension based on overloading an env var instead of using configure leads to confusion and bustage.
Comment 51 David Baron :dbaron: ⌚️UTC+2 (mostly busy through August 4; review requests must explain patch) 2013-01-19 00:21:11 PST
So this pair of try runs:

> As things are now on aurora, with aurora update channel:
> https://tbpl.mozilla.org/?tree=Try&rev=3186beecad30
> 
> Plus removal of testpilot:
> https://tbpl.mozilla.org/?tree=Try&rev=bd8b84c0b9b1

seems to show that disabling testpilot fixes all of the browser-chrome failures.

(Ignore the android builds; the mechanism I used to override the update channel setting and simulate a nightly on try didn't apply to them anyway due to a build system bug that I'll prepare an m-c patch for shortly; I'm not sure why they're orange, though.)
Comment 52 David Baron :dbaron: ⌚️UTC+2 (mostly busy through August 4; review requests must explain patch) 2013-01-19 01:11:15 PST
So let me try to summarize the current state of what's going on here:

Our continuous integration testing on TBPL generates builds for pushes, and runs tests on them, occasionally coalescing them.  This happens on all of our active development branches.  On mozilla-central and mozilla-aurora (but not mozilla-beta or mozilla-release, I think), the nightly builds we generate also show up on TBPL, and have unit tests run on them.

This bug covers a set of permanent test failures (perma-oranges) that occur *only* on the unit tests of nightly builds (which differ in some ways from the other builds, most notably by setting the update channel) and not on the unit tests of the push-generated builds.  Furthermore, these test failures are happening only on Aurora, and the Aurora tree is currently closed for those failures.

Disabling the testpilot extension fixes *all* of these failures (see comment 51); the patch to disable it is the patch linked in comment 48.   Since building testpilot is conditional on the update channel being aurora or beta, the only place we run unit tests on builds with testpilot is the unit tests we run of nightly builds on mozilla-aurora.

These failures (again, all fixed by disabling testpilot) were introduced at separate points:

 (a) when Firefox 19 merged to aurora, we introduced a focus-related perma-orange on the browser-chrome tests on Linux.  This permaorange was worked around yesterday by https://hg.mozilla.org/releases/mozilla-aurora/rev/12f52471747d and bug 831854 covers fixing it better.

 (b) when Firefox 20 merged to aurora, additional browser-chrome failures were introduced.  These failures were similar to failures previously observed twice before (see comment 49)

 (c) There was also a set of leaks from devtools tests, investigated in bug 824016 rather than this bug, which I believe (but am not sure) were also introduced when Firefox 20 merged to aurora.  These tests have been disabled in https://hg.mozilla.org/releases/mozilla-aurora/rev/a8d6394508a3 after a set of attempts to fix them failed.  Since that fix was not included in the with-and-without testpilot comparative try runs in comment 51 (though the previous attempts to fix those failures were), these devtools leaks also appear related to testpilot.



I am aware of three options going forward:

 (1) Decide that our push-based testing is sufficient test coverage and that we're ok reopening the aurora tree with permanent test failures in the tests of *nightly* builds, and reopen mozilla-aurora.  (jlebar and I were advocating this in the thread on dev-platform; ehsan was against, as I think were some others; this was before option (2) was confirmed to be an available solution.)

 (2) Disable the testpilot extension on aurora using the patch in comment 48, and reopen mozilla-aurora.  comment 43 says that we're not currently running any studies using testpilot (and also that ehsan supports this solution).

 (3) Continue to hold mozilla-aurora closed for further investigation of the group (b) failures above.  This does not provide a clear path to reopening or to shipping Firefox 20.
Comment 53 David Baron :dbaron: ⌚️UTC+2 (mostly busy through August 4; review requests must explain patch) 2013-01-19 01:16:54 PST
(In reply to David Baron [:dbaron] from comment #52)
>  (c) There was also a set of leaks from devtools tests, investigated in bug
> 824016 rather than this bug, which I believe (but am not sure) were also
> introduced when Firefox 20 merged to aurora.

philor confirms that these were indeed introduced when Firefox 20 merged to aurora.
Comment 54 David Baron :dbaron: ⌚️UTC+2 (mostly busy through August 4; review requests must explain patch) 2013-01-19 01:21:15 PST
One other point to add to the summary, actually:  tests of nightlies aren't currently distinguished on tbpl from tests of pgo builds (bug 832050 covers fixing this).  This meant that *all* of the failures described in this bug appeared to be intermittent failures rather than permanent failures unless they were examined very closely.  That's one of the reasons it took so long for these failures to lead to the tree being closed.
Comment 55 :Ehsan Akhgari 2013-01-19 06:51:48 PST
I support option 2 in comment 52.
Comment 56 :Ms2ger (⌚ UTC+1/+2) 2013-01-19 10:45:54 PST
https://hg.mozilla.org/mozilla-central/rev/1b1be4ac343f
Comment 57 Alex Keybl [:akeybl] 2013-01-19 16:33:14 PST
(In reply to :Ehsan Akhgari from comment #55)
> I support option 2 in comment 52.

Agreed - Cheng and Jinghua (the main creators of testpilot surveys) hopefully don't have any urgent surveys in the short term while we continue our investigation. a=akeybl on option 2.
Comment 58 Alex Keybl [:akeybl] 2013-01-19 16:35:06 PST
When I say I'm in support of option 2, I am assuming that we'll continue to investigate and find a final resolution allowing testpilot surveys on Aurora soon, of course.
Comment 59 Phil Ringnalda (:philor, back in August) 2013-01-19 17:02:02 PST
Landed the patch to turn off building testpilot on aurora in https://hg.mozilla.org/releases/mozilla-aurora/rev/c489c87349b5
Comment 60 Phil Ringnalda (:philor, back in August) 2013-01-19 17:22:33 PST
Filed bug 832702 - Reenable building testpilot on mozilla-aurora when it no longer causes test failures, dependent on bug 832703 - testpilot causes browser-chrome leaks on Mac and Linux and bug 832705 - Complex of OOM failures in Linux64 and Win7 browser-chrome tests with testpilot enabled.
Comment 61 Phil Ringnalda (:philor, back in August) 2013-01-19 20:16:30 PST
Aurora's reopened.
Comment 62 Phil Ringnalda (:philor, back in August) 2013-01-20 17:14:12 PST
And once the light of future merges dawned on me, pushed to m-c in https://hg.mozilla.org/mozilla-central/rev/4919e8091542
Comment 63 Gregg Lind (User Advocacy - Heartbeat - Test Pilot) 2013-02-04 11:21:21 PST
We use Test Pilot all the time, and continually deploy new tests on it.  

The situation with its code is bad, and we are trying to decide what the best way to handle this going forward is...

Fix 1.2?  Build 2.0?
Comment 64 Alex Keybl [:akeybl] 2013-02-11 16:40:36 PST
(In reply to Gregg Lind (User Research - Test Pilot) from comment #63)
> We use Test Pilot all the time, and continually deploy new tests on it.  
> 
> The situation with its code is bad, and we are trying to decide what the
> best way to handle this going forward is...
> 
> Fix 1.2?  Build 2.0?

Do you have an ETA on owners for bug 832703 and bug 832705?
Comment 65 Lukas Blakk [:lsblakk] use ?needinfo 2013-02-20 13:27:26 PST
Gregg: this is tracking for Firefox 20 which is now on Beta and will ship in 6 weeks - anything you can do to advance the investigation here (put additional pressure on the 6 day old bug about getting a 64 bit machine?)?
Comment 66 Lukas Blakk [:lsblakk] use ?needinfo 2013-02-25 14:58:52 PST
Moving this over to FF21 tracking (current Aurora) as I don't believe there is anything to do here for FF20.
Comment 67 Phil Ringnalda (:philor, back in August) 2013-02-25 16:01:58 PST
Well, yes and no - there's absolutely no reason to believe that 20-on-beta isn't leaking and OOMing just because we don't actually run the tests (or to be more painfully accurate, just because we run the tests, on release builds, but absolutely positively not one person ever looks at the results of the tests) that would tell us that it is.

As far as I know we haven't done any investigation about whether any of the test failures, those two or the screwy focus that we "fixed" by insisting that the addon stop customizing the toolbar, were actually things that users would also see.
Comment 68 Gregg Lind (User Advocacy - Heartbeat - Test Pilot) 2013-02-26 11:09:54 PST
(Ask:  Real-time help me build and test on Linux-64-opt) 

I am blocked on this, honestly.  I need some real-time help building this on Unix and running the tests.  I have a build host, and have done mach builds on OSX, but unless I can get someone to walk me through the simplest testing / patching path, I really am failing at doing this.  

I want to fix this, and have time authorized to fix this, but the cost of re-figuring out the build/test process without guidance is very very expensive.  Help me lower it :)
Comment 69 Lukas Blakk [:lsblakk] use ?needinfo 2013-02-27 09:31:47 PST
Taking Gregg off this bug for now, since assigning Neil to bug 831854 looks to be the next steps here.  Also marking this tracking again for FF20 since, as philor calls out, we do need this test suite running prior to FF20 release to ensure we are not leaking and OOMing.
Comment 70 :Ehsan Akhgari 2013-02-27 12:12:36 PST
(In reply to comment #69)
> Taking Gregg off this bug for now, since assigning Neil to bug 831854 looks to
> be the next steps here.  Also marking this tracking again for FF20 since, as
> philor calls out, we do need this test suite running prior to FF20 release to
> ensure we are not leaking and OOMing.

Note that it might be possible to work around the focus issue in testpilot in case we won't have an immediate fix for bug 831854.
Comment 71 Robert Kaiser 2013-03-12 13:18:16 PDT
What's the status here? The doors are closing pretty soon on 20 and this is still marked for tracking that one...
Comment 72 Alex Keybl [:akeybl] 2013-03-14 10:00:29 PDT
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #71)
> What's the status here? The doors are closing pretty soon on 20 and this is
> still marked for tracking that one...

Is there any risk here outside of Test Pilot? If not, we can untrack for FF20 at this point (releasing in 2 weeks).
Comment 73 Phil Ringnalda (:philor, back in August) 2013-03-14 10:17:50 PDT
Nothing outside of testpilot - it needed the flag that doesn't exist, tracking-the-20-betas, since they may or may not have leaked and OOMed, but we don't build or ship testpilot with releases, so at this point it's 20-whatever and on to shipping 21 betas that may or may not leak and OOM.
Comment 74 :Ehsan Akhgari 2013-03-14 14:10:36 PDT
Yeah, what philor said.
Comment 75 Alex Keybl [:akeybl] 2013-04-03 12:57:27 PDT
We're untracking in favor of bug 840108.
