Open Bug 1339492 Opened 3 years ago Updated 2 years ago

Mozregression fails to launch OSX builds older than 2016-07-12

Categories

(Testing :: mozregression, defect)

Version 3
x86_64
macOS
defect
Not set

Tracking

(Not tracked)

People

(Reporter: jib, Assigned: pyang)

Details

Using mozregression 2.3.9.

STR:

★ ~/moz/mozilla-central $ mozregression --good 2016-07-11 --bad 2017-02-14
 0:05.51 INFO: Testing good and bad builds to ensure that they are really good and bad...
 0:05.51 INFO: Downloading build from: https://archive.mozilla.org/pub/firefox/nightly/2016/07/2016-07-11-14-37-35-mozilla-central/firefox-50.0a1.en-US.mac.dmg
===== Downloaded 100% =====
 0:21.75 INFO: Running mozilla-central build for 2016-07-11
 0:49.82 INFO: Launching /private/var/folders/mf/7kfd8gkj3mlcv2_14yxmwh2r0000gp/T/tmp9FtHCk/FirefoxNightly.app/Contents/MacOS/firefox
 0:49.82 INFO: Application command: /private/var/folders/mf/7kfd8gkj3mlcv2_14yxmwh2r0000gp/T/tmp9FtHCk/FirefoxNightly.app/Contents/MacOS/firefox -foreground -profile /var/folders/mf/7kfd8gkj3mlcv2_14yxmwh2r0000gp/T/tmpRf6m64.mozrunner
 0:49.83 INFO: application_buildid: 20160711143735
 0:49.83 INFO: application_changeset: 214884d507ee369c1cf14edb26527c4f9a97bf48
 0:49.83 INFO: application_name: Firefox
 0:49.83 INFO: application_repository: https://hg.mozilla.org/mozilla-central
 0:49.83 INFO: application_version: 50.0a1
Was this nightly build good, bad, or broken? (type 'good', 'bad', 'skip', 'retry' or 'exit' and press Enter): 

Expected result: Firefox appears.

Actual result: Firefox never appears.

Workaround: mozregression --good 2016-07-12 --bad 2017-02-14
I just got burned by this and wasted time figuring out what was going on. Console.app didn't seem to show anything relevant at a quick glance. It would be good to announce this issue, especially if we don't intend to fix it.

I would guess this has something to do with Gatekeeper.
Paul, would this be something you'd be able to look into? If not I can try to find some time to investigate.
Flags: needinfo?(pyang)
Assignee: nobody → pyang
Flags: needinfo?(pyang)
I tried to run 2016-07-11 directly and it printed "firefox(72123,0x7fffce8893c0) malloc: *** malloc_zone_unregister() failed for 0x7fffce87f000", so something looks entirely wrong.

As a follow-up, I think we can capture this message and display it properly.
I tried 2016-01-01 from
https://ftp.mozilla.org/pub/firefox/nightly/2016/01/2016-01-01-03-03-30-mozilla-central/firefox-46.0a1.en-US.mac.dmg
on my latest MacBook Pro. FirefoxNightly couldn't be launched and hung, so we have no way to know whether Gecko worked properly.
This might be a machine- or OS-version-dependent issue; if so, we may need to try all combinations.

Moreover, mozregression actually does capture the log from comment 3; it appears right after the prompt.

On the Gecko side, a welcome message or log line would be nice, but we might still hit edge cases such as a failure right after the welcome log is flushed.
In the meantime, what I can do immediately is adjust the position of the prompt so that Firefox's output is shown.

Will, what do you think?
Flags: needinfo?(wlachance)
If the problem is firefox-specific, I think as long as we show something in the console indicating that the launch failed we'd be ok here. Ideally we'd also log some kind of specific message in mozregression (in addition to the gecko log) like:

 0:49.83 ERROR: Application failed to launch

Does that seem doable?
Flags: needinfo?(wlachance)
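
A minimal sketch of the check suggested above, assuming the launcher can simply wait a few seconds and see whether the process is still alive. The function name `launch_and_check` and the `grace_seconds` parameter are hypothetical and not part of mozregression or mozrunner. It only covers the case where the application exits early; a build that hangs without ever showing a window, as reported in this bug, would not be detected this way.

```python
import logging
import subprocess
import sys

logging.basicConfig(format="%(levelname)s: %(message)s")
log = logging.getLogger("launch_check")


def launch_and_check(command, grace_seconds=5.0):
    """Start `command` and report whether it survived the grace period.

    Returns the running Popen object, or None if the process exited
    (with any status) before `grace_seconds` elapsed.
    """
    # Output is inherited, so the application's own messages still
    # reach the console alongside the error logged below.
    proc = subprocess.Popen(command)
    try:
        returncode = proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # Still running after the grace period: assume the launch worked.
        return proc
    log.error("Application failed to launch (exit code %s)", returncode)
    return None


if __name__ == "__main__":
    # Stand-in for a browser binary that dies immediately.
    crashing = [sys.executable, "-c", "raise SystemExit(1)"]
    launch_and_check(crashing)
```

Treating any exit within the grace period as a failure is a design choice for this sketch: a browser that quits within seconds of starting is not usable for bisection, whatever its exit status.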
After discussing with wlach, we can prevent users from fetching builds from a certain date or period. I'll check whether we can set this config per platform.
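
One way such a per-platform restriction could look is sketched below. Everything here is an assumption for illustration: the table `OLDEST_LAUNCHABLE`, the platform key `"mac"`, and the function `check_build_date` are hypothetical names, not mozregression configuration. The only fact taken from this bug is the 2016-07-12 cutoff on macOS.

```python
from datetime import date

# Hypothetical table of the oldest nightly known to launch, per platform.
# The macOS entry is the cutoff reported in this bug.
OLDEST_LAUNCHABLE = {"mac": date(2016, 7, 12)}


def check_build_date(platform, requested):
    """Return `requested` unchanged, or raise if it predates the cutoff."""
    oldest = OLDEST_LAUNCHABLE.get(platform)
    if oldest is not None and requested < oldest:
        raise ValueError(
            f"Builds older than {oldest.isoformat()} are not known to "
            f"launch on {platform}; requested {requested.isoformat()}."
        )
    return requested
```

The sketch rejects the date instead of silently moving it forward, since the user's claim that a build is good applies only to the date they actually named.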
Unfortunately I'm running into the same issue on my macOS 10.12.4 machine, which is really hindering my ability to do my job right now. I've gone back to using my Ubuntu VM as my main platform for mozregression, as I've wasted way too much time over the past few days trying to figure out what was going on.

As mentioned in comment 3, I'm getting the following error in the terminal when using --build-type debug (it happens on regular builds as well) with builds from 2016-07-12 or older:

* firefox(45209,0x7fffbfdd43c0) malloc: *** malloc_zone_unregister() failed for 0x7fffbfdca000

Turning off macOS Gatekeeper via "sudo spctl --master-disable" under macOS Sierra didn't help either.

The main issue is that if a developer needs a regression range and the issue only occurs on macOS and is older than 2016-07-12, there's nothing QA can do right now other than maybe doing it manually, which takes a very long time.