Before offering an update, check that the build starts up afterwards
Categories
(Release Engineering :: General, defect, P3)
Tracking
(firefox88 fixed)
Tracking | Status | |
---|---|---|
firefox88 | --- | fixed |
People
(Reporter: khuey, Assigned: sfraser)
Attachments
(4 files)
Comment 1•14 years ago
Comment 2•14 years ago
Reporter
Comment 3•14 years ago
Comment 4•14 years ago
Comment 5•14 years ago
Reporter
Comment 6•14 years ago
Comment 7•14 years ago
Comment 9•11 years ago
Updated•11 years ago
Comment 10•10 years ago
Assignee
Updated•8 years ago
Comment 11•7 years ago
Updated•7 years ago
Comment 12•4 years ago
Joel, if we wanted the most robust test harness to verify that Firefox starts up with minimal intermittents, what would you choose?
Comment 13•4 years ago
Technically, to build the profile we use for PGO (aka shippable), we run the browser and execute some perf tests. If we are just looking for it to start up, load a simple page, and nothing more, then anything would work. One concern I have is that any test that runs as a separate task (aka another known test harness) could be more complicated to wire into the creation of shippable.
As it stands, every single harness has a lot of failures; the ones with the fewest failures are the non-browser-interactive ones. Even simple ones with few tests still have regular intermittent failures every day.
Is the concern that the shippable build fails during the PGO process?
Comment 14•4 years ago
I imagine we would create a new test task that just starts up the browser.
The most recent case of the browser not starting up happened because of mac signing changes. We need to wire in a test that blocks publishing releases if Firefox doesn't start up. But that means we really want it to be rock solid in terms of intermittents.
I'd be fine with a single test in the harness: does Firefox start up? I'm not sure how to craft that test, or what the different harnesses do. It may involve loading a super simple HTML page that just says "hello world" or something, and then shutting down.
Comment 15•4 years ago
All our harnesses use Marionette to start the browser. They require some type of profile, so we need to have a pre-seeded one with permissions set so we can manage the browser. Sadly, we have a large volume of timeout failures when starting the browser: Marionette launches the browser, and then we seem to time out. This has been plaguing us for years on all harnesses.
Maybe just the firefox-ui tests should run?
Comment 16•4 years ago
Are the firefox-ui tests more robust? I wonder if we can force-kill and retry on a shorter timeout before the task times out.
I would lean towards stripping all the tests out of the manifest except for a single "does it run" test. This would block nightlies from shipping 2x/day, betas 3x/week, esr+releases 2x/4weeks, all of which we want to happen quickly and without intermittents, so reducing the test set to 1 seems prudent.
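The "force-kill and retry on a shorter timeout" idea could look roughly like the sketch below. It is only an illustration, not anything that exists in the tree: `run_with_retries`, `attempt_timeout`, and `attempts` are made-up names, and the choice to retry only on hangs (and to fail immediately on a non-zero exit) is an assumption about what we would want.

```python
import subprocess


def run_with_retries(cmd, attempt_timeout, attempts=3):
    """Run cmd, killing and retrying it if it hangs past attempt_timeout seconds.

    Returns True if some attempt exits with status 0, False otherwise.
    Hypothetical helper for illustration only.
    """
    for attempt in range(1, attempts + 1):
        try:
            # subprocess.run kills the child and waits for it before
            # raising TimeoutExpired, so no stray process is left here.
            result = subprocess.run(cmd, timeout=attempt_timeout)
        except subprocess.TimeoutExpired:
            print(f"attempt {attempt}: no result after {attempt_timeout}s, killed")
            continue
        if result.returncode == 0:
            return True
        # A definite failure is not treated as an intermittent: stop here.
        print(f"attempt {attempt}: exited with status {result.returncode}")
        return False
    return False
```

Keeping `attempt_timeout * attempts` well below the task's own timeout is what lets the retries happen before the task is killed from outside. Note that only the direct child is killed; any processes it spawned would need separate handling.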
Comment 17•4 years ago
I am not aware of any existing test harness that loads the browser that is not failing intermittently. I would make the process support intermittents. Our builds are intermittent, lint jobs are intermittent, etc.
Comment 18•4 years ago
Do we even need a real test harness for this? I admit I'm looking at this very simply, but to me, all we need at a high level is:
1. Download the fully signed browser
2. Run it, and make sure it stays running for N seconds
3. End task successfully
We obviously need some sort of wrapper to do the launching and checking to see if it's running, but unless I'm missing something, a very simple python or even bash script should be enough.
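For a sense of scale, the "run it and make sure it stays running for N seconds" step can be sketched in a few lines of Python using only the standard library. This is a rough sketch under assumptions, not the script that was eventually written: `stays_running` is a hypothetical name, and the download step is left out entirely.

```python
import subprocess
import time


def stays_running(cmd, seconds):
    """Launch cmd and return True if the process is still alive after `seconds`.

    Hypothetical helper: any exit before the deadline, clean or not,
    counts as a failure.
    """
    proc = subprocess.Popen(cmd)
    try:
        deadline = time.monotonic() + seconds
        while time.monotonic() < deadline:
            if proc.poll() is not None:
                # The process exited (or crashed) before the deadline.
                return False
            time.sleep(0.1)
        # One last check in case it exited during the final sleep.
        return proc.poll() is None
    finally:
        # Always clean up so the task does not leave a stray process behind.
        if proc.poll() is None:
            proc.terminate()
            try:
                proc.wait(timeout=10)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()
```

A call would look something like `stays_running(["/path/to/firefox"], 15)`, where the path is a placeholder for wherever the signed build was unpacked.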
Comment 19•4 years ago
It could run but
4. hang/freeze
5. crash the content process
6. fail to connect to servers
...
Comment 20•4 years ago
(In reply to Sebastian Hengst [:aryx] (needinfo on intermittent or backout) from comment #19)
> It could run but
> 4. hang/freeze
> 5. crash the content process
> 6. fail to connect to servers
> ...
IMO that is not what we're trying to solve here, at least not immediately. I strongly suggest that we start with something simple, and then enhance it later. The 1, 2 & 3 from my previous comment will protect us from start up crashes and code signing issues already, and that's better than what we have today.
Comment 21•4 years ago
(In reply to bhearsum@mozilla.com (:bhearsum) from comment #18)
> Do we even need a real test harness for this? I admit I'm looking at this very simply, but to me, all we need at a high level is:
> 1. Download the fully signed browser
> 2. Run it, and make sure it stays running for N seconds
> 3. End task successfully
> We obviously need some sort of wrapper to do the launching and checking to see if it's running, but unless I'm missing something, a very simple python or even bash script should be enough.
Perfect. It does sound like the original bug wanted us to download the previous build, apply an update, and verify it starts up. I'm guessing the new signature+notarization changes will apply during the update, so that will essentially test the current build? or we could have 2 flavors, one bare current build, the other previous build + update to current build.
[edit]: We do need to double-check whether the way we launch the build matters (for example, whether launching from the command line avoids the quarantine issues that double-clicking the app would hit).
Comment 22•4 years ago
(In reply to Aki Sasaki [:aki] (he/him) from comment #21)
> (In reply to bhearsum@mozilla.com (:bhearsum) from comment #18)
> > Do we even need a real test harness for this? I admit I'm looking at this very simply, but to me, all we need at a high level is:
> > 1. Download the fully signed browser
> > 2. Run it, and make sure it stays running for N seconds
> > 3. End task successfully
> > We obviously need some sort of wrapper to do the launching and checking to see if it's running, but unless I'm missing something, a very simple python or even bash script should be enough.
> Perfect. It does sound like the original bug wanted us to download the previous build, apply an update, and verify it starts up. I'm guessing the new signature+notarization changes will apply during the update, so that will essentially test the current build? or we could have 2 flavors, one bare current build, the other previous build + update to current build.
IIRC, the original bug was written with the idea that this would be done as part of update verify, which would require us to have it for nightly (we don't). This would be another wonderful enhancement to have down the line.
My idea in theory leaves us with the risk that the dmg and applying the mar end up with a different .app, which means the mar case is untested. These days the risk of that is very, very low (even without update verify), but not zero. The biggest risk is when new files are added or removed, and we forget to update the update manifest.
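To make the "different .app" risk concrete: one way to detect it is to hash every file in the freshly installed tree and in the updated tree and report what differs. The sketch below is a generic directory comparison, not how update verify is implemented; `tree_digest` and `diff_trees` are hypothetical names.

```python
import hashlib
from pathlib import Path


def tree_digest(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_trees(installed, updated):
    """Report files that exist in only one tree or whose contents differ."""
    a, b = tree_digest(installed), tree_digest(updated)
    return {
        "only_in_installed": sorted(a.keys() - b.keys()),
        "only_in_updated": sorted(b.keys() - a.keys()),
        "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }
```

A file that was added to the build but left out of the update manifest would show up under `only_in_installed`; a file that should have been removed would show up under `only_in_updated`.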
Comment 23•4 years ago
Aha. We will get update verify in nightly graphs if/when we move nightly automation to shipit + relpro. Maybe we start off with a simple "start up Firefox, make sure it's running, and kill it (and clean up afterwards)" and leave update verify for the nightly shipit/relpro roadmap item.
Comment 24•4 years ago
You will most likely need to create a custom profile. mozbase has mozprofile for profile creation and mozprocess for managing processes (hangs, terminate, stdout, etc.); mozcrash can also look for crashes.
Comment 25•4 years ago
I'm guessing we probably don't want a custom profile, and we should just start up Firefox and make sure it stays running for x amount of time as Ben said. I'd be worried about the profile getting out of date and causing issues with the test.
Comment 26•4 years ago
How will you know the test is doing anything? There are a lot of first-run issues to sort out, and if you want to automate anything once the browser is launched (e.g. force upgrade, notify done, etc.), you will need some custom preferences set.
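For reference, setting custom preferences does not strictly require a harness: a profile directory containing a `user.js` file with `user_pref(...)` lines is enough. The sketch below writes one with the standard library only. `make_profile` is a hypothetical helper, and the preference names in the usage note are purely illustrative, not a vetted list of what a startup test would need.

```python
import json
import tempfile
from pathlib import Path


def make_profile(prefs):
    """Create a throwaway profile directory containing a user.js with prefs.

    Hypothetical helper. Values are serialized with json.dumps, which
    yields valid literals for booleans, integers, and strings.
    """
    profile_dir = Path(tempfile.mkdtemp(prefix="startup-test-profile-"))
    lines = [
        f"user_pref({json.dumps(name)}, {json.dumps(value)});"
        for name, value in prefs.items()
    ]
    (profile_dir / "user.js").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return profile_dir
```

Usage might look like `make_profile({"browser.shell.checkDefaultBrowser": False})`, with the resulting directory passed to Firefox via `-profile`. Because the profile is generated fresh on every run, it cannot drift out of date the way a checked-in, pre-seeded profile could.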
Comment 27•4 years ago
All we need to test is that Firefox starts up and doesn't crash immediately, so having Firefox running after n seconds is the test. We need it to be robust. Because all of our test harnesses have intermittents, bypassing them seems like the most viable path forward.
Comment 28•4 years ago
Joel, Aki and I synced up today to try to sort this out. We agreed that:
- The MVP here is to launch Firefox with a new profile, wait 15 seconds to make sure it doesn't crash, and then declare the task a success.
- Because this test will block updates, intermittent failures must be avoided at all costs.
- We will not aim to use existing test harnesses for this, because they come with a lot of baggage/set-up time. We're hopeful that a shell or python script will be enough, and remain very simple. If this script ends up getting too complex, we can revisit using an existing harness (we don't want to accidentally grow a new full fledged test harness).
- We need to watch out for places on the filesystem where a previous task may interfere with the next one; we probably need to do some cleanup at the start of the script to help avoid this.
- Once implemented, we should start running this test, and wait a week or so to see how well it works before blocking updates on it.
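The MVP agreed above (clean up first, launch with a new profile, wait 15 seconds, declare success) could be sketched as follows. This is an assumption-laden outline rather than the script that landed: `startup_test`, `work_dir`, and `extra_args` are hypothetical names, and the only cleanup shown is of the script's own working directory, not of any other filesystem locations a previous task might have touched.

```python
import shutil
import subprocess
from pathlib import Path


def startup_test(binary, work_dir, wait_seconds=15, extra_args=()):
    """Launch binary with a fresh profile; True if it is still running after the wait."""
    work_dir = Path(work_dir)
    # Remove anything a previous task may have left behind, then start clean.
    shutil.rmtree(work_dir, ignore_errors=True)
    profile = work_dir / "profile"
    profile.mkdir(parents=True)

    proc = subprocess.Popen([str(binary), *extra_args, "-profile", str(profile)])
    try:
        # wait() returning at all means the process exited early: a failure.
        proc.wait(timeout=wait_seconds)
    except subprocess.TimeoutExpired:
        # Still running after the wait, which is the success condition.
        return True
    finally:
        if proc.poll() is None:
            proc.kill()
            proc.wait()
        # Leave nothing behind for the next task.
        shutil.rmtree(work_dir, ignore_errors=True)
    return False
```

Using `Popen.wait(timeout=...)` keeps the logic to a single blocking call, which is about as simple as the check can get. If the script ever needs much more than this, that is probably the signal to revisit an existing harness.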
Comment 29•4 years ago
It supports installing Mozilla applications with mozinstall, and simply running the thing it was instructed to download.
Comment 30•4 years ago
As far as I can tell, this was simply never implemented because it hasn't been needed until now.
Depends on D107543
Comment 31•4 years ago
Depends on D107544
Comment 32•4 years ago
Comment 33•4 years ago
bugherder
Updated•4 years ago
Comment 34•4 years ago
This moves the startup tests to Tier 1 (required, because they will block a Tier 1 task), and adds them as a dependency for Balrog submission on both Nightly and Release branches.
Comment 35•4 years ago
Comment 36•4 years ago
bugherder
Comment 37•4 years ago
This is now enabled on Nightly. It will ride the trains naturally to release branches. Let's track any potential follow-up work elsewhere.