Closed Bug 1349227 Opened 3 years ago Closed 2 years ago

[Meta] Ship 2 Firefox nightlies per day

Categories: Release Engineering :: General, defect, Not set
Tracking: firefox57 fixed
Status: RESOLVED FIXED
People: Reporter: pascalc, Assigned: aki
Attachments: 1 file

Currently we ship Nightly once a day, around 6AM PST which means that Nightly users (community and staff) in the EMEA and Asia timezones get their update in the afternoon/evening.

This situation is not ideal for our users in Europe who would like to get their Nightly update at the beginning of the day to have time to report bugs.

Furthermore, when a major functional regression (usually a top crash) is detected, we manually respin the builds but we don't do that for bugs that are annoyances even though they are often backed out or fixed in less than 24 hours. Having the possibility to update your nightly to a newer build sooner if you are very impacted by a regression is better than waiting 24h and would help with user retention.

We believe that having builds twice a day would help with Nightly retention in EMEA. In parallel, we would be less aggressive with the big Update window that shows up on Nightly if you haven't applied a downloaded update in the last 12 hours.
Depends on: 1349234
Do we have a way to track Nightly retention in EMEA already? If so, can we establish what the baseline is now, so that we can see if changing the frequency of Nightly builds improves retention?

We'll need to adjust the frequency in two places: Taskcluster and Buildbot.
Yes, we do have Nightly ADIs per country, so we should be able to see changes over time. Note, however, that we have activities in Europe aimed at growing our Nightly community; if we succeed in gaining a larger user base in EMEA, the increase could also be due to that. Measuring our Nightly numbers in some European countries where we do little or no promotion to get new users might work.
(In reply to Pascal Chevrel:pascalc from comment #0)
> Currently we ship Nightly once a day, around 6AM PST which means that
> Nightly users (community and staff) in the EMEA and Asia timezones get their
> update in the afternoon/evening.
> 
> This situation is not ideal for our users in Europe who would like to get
> their Nightly update at the beginning of the day to have time to report bugs.
> 
> Furthermore, when a major functional regression (usually a top crash) is
> detected, we manually respin the builds but we don't do that for bugs that
> are annoyances even though they are often backed out or fixed in less than
> 24 hours. Having the possibility to update your nightly to a newer build
> sooner if you are very impacted by a regression is better than waiting 24h
> and would help with user retention.
> 
> We believe that having builds twice a day would help with Nightly retention
> in EMEA. In parallel, we would be less agressive with the big Update window
> that shows up on Nightly if you haven't applied a downloaded update in the
> last 12 hours.

While I agree with this idea, since it is good to ship more fixes faster and get more bugs filed,
I have a specific question about how you are going to handle the partial updates.

Since 2 updates will be provided each day, what happens if the user misses the first partial? For example, as things stand now, if a user fails to apply or skips today's build, is tomorrow's Nightly served as a full update instead of two partial updates?

This is a major concern because a partial update is about 10% of the size of a full update (currently a bit bigger, as compression is enabled, which increases the size), so partials save bandwidth for the user as well as for the org. Falling back to full updates after a missed partial will increase bandwidth use significantly.

Do you have a plan to overcome this problem? For instance, if a partial update is missed, would the partials be applied one by one, or would the user just download the big full update, which hogs their bandwidth?

Looking forward to hearing what you think.
P.S. Sorry, I am new here, and sorry for my bad English.
We are generating partials for builds up to 4 days ago. So with two nightly builds per day, you could easily miss a day, and nearly 2. But yes, it halves the chance of being served a partial update.
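
To make the arithmetic concrete, here is a minimal sketch. It assumes partials are generated against a fixed number of previous builds; the function name and the figure of four builds are illustrative and not taken from the actual funsize configuration.

```python
def partial_coverage_days(partials_kept: int, builds_per_day: int) -> float:
    """Return how many days of builds the available partials span.

    partials_kept: number of previous builds a partial is generated
                   against (hypothetical parameter).
    builds_per_day: how many nightlies are shipped per day.
    """
    if partials_kept < 0 or builds_per_day <= 0:
        raise ValueError("partials_kept must be >= 0 and builds_per_day > 0")
    return partials_kept / builds_per_day


# One nightly per day versus two nightlies per day, same number of partials.
print(partial_coverage_days(4, 1))
print(partial_coverage_days(4, 2))
```

Under these assumptions, keeping the window at four days with two builds per day would mean generating eight partials per build instead of four.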
(In reply to Henrik Skupin (:whimboo) from comment #4)
> We are generating partials for builds up to 4 days ago. So with two nightly
> builds per day, you could easily miss a day, and nearly 2. But yes, it
> divides the change to get a partial update served into the half.

We can generate more partials to compensate. Keep in mind bug 353804 though. If you've already downloaded an update, Firefox will not check for any newer ones. It's more likely that Firefox will find another update when you restart if we increase the frequency of Nightly builds.
(In reply to Henrik Skupin (:whimboo) from comment #4)
> We are generating partials for builds up to 4 days ago. So with two nightly
> builds per day, you could easily miss a day, and nearly 2. But yes, it
> divides the change to get a partial update served into the half.

Oh, OK.
Can it be tweaked to stay at 4 days (8 builds)? A window of just over one day would still be pretty tight.

There is also one more major issue that needs fixing: if you try to update manually just before the partial is released, it downloads the full build. Why are full builds available before partials, although a partial is smaller and should be built faster than, or before, the full build even if the jobs start together? Is this a known issue? Are there any fixes planned, since this bug will trigger it more often?

So when will this bug land and be ready for testing?
(In reply to Sunil Kumar from comment #6)
> plus there is one more major issue which needs fixing which is if you try to
> update builds manually(just before partial is released) it downloads full
> builds(why do full builds are available before partial although partial is
> smaller and should be built faster/before full builds even if the jobs start
> together ??). Is this a known issue? Any fixes for that as this bug will
> trigger that more so .

I thought that we don't do this anymore. Simon, mind giving us the details?
Flags: needinfo?(sfraser)
Among the things that ought to be considered before deciding to do this is how often you'll actually wind up with two *different* nightlies per day. Of course, awareness of there being two will shift when people merge (both intentionally, and because a nightly causes the need to avoid merging too close to when it starts, so along with the 1-3am time when merging to m-c is a bad idea, there will also be a 1-3pm time when merging to m-c is a bad idea), but it might be interesting to look at how often over the last couple of months the two nightlies would have been built from the same rev, or the same rev except for a meaningless hpkp/hsts/blocklist update. No matter what happens on weekdays, you're looking at a maximum of one merge per day on weekends, so either two pairs of identical nightlies or three identical and a single, or four identical, and it's probably a rare week when there aren't at least one or two days with only one set of merges to m-c, netting only 3-4 actually different half-nightlies a week.
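
One way to check this against real data could look like the sketch below. It is only an illustration: the merge log format, the helper names, and the 3am/3pm trigger times are assumptions, and it ignores the hpkp/hsts/blocklist-only case mentioned above.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Optional, Tuple

Merge = Tuple[datetime, str]  # (time the merge landed on m-c, revision id)


def revisions_at(triggers: List[datetime], merges: List[Merge]) -> List[Optional[str]]:
    """Return the tip revision each nightly trigger would have built from."""
    merges = sorted(merges)
    times = [landed for landed, _ in merges]
    tips = []
    for trigger in sorted(triggers):
        # Index of the first merge strictly after the trigger time.
        i = bisect_right(times, trigger)
        tips.append(merges[i - 1][1] if i else None)
    return tips


def count_distinct_nightlies(triggers: List[datetime], merges: List[Merge]) -> int:
    """Count nightlies whose revision differs from the previous nightly's."""
    distinct = 0
    previous = object()  # sentinel that never equals a real revision
    for rev in revisions_at(triggers, merges):
        if rev != previous:
            distinct += 1
        previous = rev
    return distinct


# A weekend with a single merge per day (made-up revisions).
merges = [
    (datetime(2017, 4, 7, 20, 0), "rev0"),  # Friday evening
    (datetime(2017, 4, 8, 20, 0), "revA"),  # Saturday evening
]
triggers = [
    datetime(2017, 4, 8, 3, 0), datetime(2017, 4, 8, 15, 0),
    datetime(2017, 4, 9, 3, 0), datetime(2017, 4, 9, 15, 0),
]
print(revisions_at(triggers, merges))
print(count_distinct_nightlies(triggers, merges))
```

In this made-up weekend the four triggers yield two pairs of identical nightlies, which is one of the outcomes described above.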
(In reply to Henrik Skupin (:whimboo) from comment #7)
> (In reply to Sunil Kumar from comment #6)
> > plus there is one more major issue which needs fixing which is if you try to
> > update builds manually(just before partial is released) it downloads full
> > builds(why do full builds are available before partial although partial is
> > smaller and should be built faster/before full builds even if the jobs start
> > together ??). Is this a known issue? Any fixes for that as this bug will
> > trigger that more so .
> 
> I thought that we don't do this anymore. Simon, mind giving us the details?

I thought we didn't either, but it does appear that the changes never landed. I don't have access to do that myself. I've poked the person I was talking to in the original bug.
Flags: needinfo?(sfraser)
(In reply to Sunil Kumar from comment #6)
> why do full builds are available before partial although partial is
> smaller and should be built faster/before full builds even if the jobs start
> together ??). Is this a known issue? Any fixes for that as this bug will
> trigger that more so .

You need the complete build and resulting MAR update file before you can generate the partial, i.e. you can't create the binary diff for the partial until you know everything that has changed in the new build.
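
The ordering constraint can be illustrated with a toy sketch. This is not the MAR format or the funsize implementation (real partials contain binary diffs of the changed files); it only shows that the list of files to add, patch, or remove cannot be computed until the complete contents of both the old and the new build are known. All names below are hypothetical.

```python
import hashlib
from typing import Dict, List


def manifest(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each path in a complete build to a hash of its contents."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def partial_plan(old_build: Dict[str, bytes], new_build: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Work out what a partial update would need to touch.

    Both complete builds are required as input: without the finished
    new build there is nothing to diff the old build against.
    """
    old, new = manifest(old_build), manifest(new_build)
    return {
        "add": sorted(p for p in new if p not in old),
        "patch": sorted(p for p in new if p in old and new[p] != old[p]),
        "remove": sorted(p for p in old if p not in new),
    }


# Made-up file contents standing in for two complete builds.
yesterday = {"firefox": b"v1", "omni.ja": b"v1", "old.dll": b"v1"}
today = {"firefox": b"v1", "omni.ja": b"v2", "new.dll": b"v1"}
print(partial_plan(yesterday, today))
```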
(In reply to Chris Cooper [:coop] from comment #10)
> You need the complete build and resulting MAR update file before you can
> generate the partial, i.e. you can't create the binary diff for the partial
> until you know everything that has changed in the new build.

But we could delay offering this complete mar file to users until funsize completed the partial mar files. Given that we still have a huge delay between when builds are available and partial updates generated, we might ship the complete update to a lot of users, which is causing extra traffic on both sides.
(In reply to Henrik Skupin (:whimboo) from comment #11) 
> But we could delay offering this complete mar file to users until funsize
> completed the partial mar files. Given that we still have a huge delay
> between when builds are available and partial updates generated, we might
> ship the complete update to a lot of users, which is causing extra traffic
> on both sides.

As we move to TaskCluster and use funsize to generate partials, we *are* switching to publishing all updates (both partials and complete) at the end of that process. The trade-off is that there will be a longer delay before any update is offered.
(In reply to Henrik Skupin (:whimboo) from comment #11)
> (In reply to Chris Cooper [:coop] from comment #10)
> > You need the complete build and resulting MAR update file before you can
> > generate the partial, i.e. you can't create the binary diff for the partial
> > until you know everything that has changed in the new build.
> 
> But we could delay offering this complete mar file to users until funsize
> completed the partial mar files. Given that we still have a huge delay
> between when builds are available and partial updates generated, we might
> ship the complete update to a lot of users, which is causing extra traffic
> on both sides.

This is exactly what is happening right now.
Can partial nightly updates still be available for 4 days (8 partials) after this? Partials save lots of bandwidth compared to full builds, *even* over a week.


(In reply to Chris Cooper [:coop] from comment #12)
> (In reply to Henrik Skupin (:whimboo) from comment #11) 
> > But we could delay offering this complete mar file to users until funsize
> > completed the partial mar files. Given that we still have a huge delay
> > between when builds are available and partial updates generated, we might
> > ship the complete update to a lot of users, which is causing extra traffic
> > on both sides.
> 
> As we move to TaskCluster and use funsize to generate partials, we *are*
> switching to publishing all updates (both partials and complete) at the end
> of that process. The trade-off is that there will be a longer delay before
> any update is offered.

Thanks for the explanation of why the partial cannot be offered earlier.

How much of a *delay* are you talking about, and *when* will this be implemented?
If two nightlies are offered per day, I won't mind a bit of delay.


Also, can partial nightly updates still be available for 4 days (8 partials) after this lands?
Hi, here are some insights from the sheriff workflow:

Nightlies are triggered for m-c every day at 3am Pacific, which, after the timezone change on Sunday, is noon in Europe. Generating the nightlies then takes some time.


Sheriffs try to merge at least 2 times a day. I say "try" because a merge to m-c requires special attention: we *don't* want to merge startup crashes that would bust the Nightly testers' builds, nor do we want to merge test regressions. For example, one requirement is to have green PGO builds. So the "merge changesets" from autoland and mozilla-inbound are specially selected changesets, not just the tip rev.

Hurrying along the lines of "omg, it's close to noon and I need something to merge now so that it makes the nightly" when the requirements for a merge to m-c are not met is the perfect way to disaster. I have myself burned nightlies with startup crashes in an effort to merge something to m-c in time for the nightlies: everything besides the PGO builds was green and finished, only the Windows PGO builds were not, and those turned out to be busted, and so were the nightlies.

In terms of timelines, Kwierso and I try for at least these 2 merges (one in each sheriff's shift) on workdays, so what I think we could do is:

-> Keep the automated nightly trigger at 3am Pacific (or even shift it a little) 
-> Have sheriffs manually trigger nightly builds after the 2nd merge (normally during the PST afternoon) to create a 2nd nightly that is available sometime in the EU morning. We also have a sheriff in Taipei who could do that if Wes is out.

Triggering nightly builds is fairly easy (basically just pasting a rev id into a build self-serve webpage and, for TC nightly builds, into a Taskcluster site), so I guess this would be doable within the sheriff workflow.
Or maybe an automated task to trigger that 2nd nightly at the end of the PST workday, around 6pm Pacific, would work too.
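
For reference, the two trigger times mentioned above can be converted to central European time with a short sketch. It uses fixed summer-time offsets (PDT = UTC-7, CEST = UTC+2) as a simplifying assumption, so it does not handle the weeks in which the US and Europe are on different sides of a daylight-saving change; the date and function name are arbitrary.

```python
from datetime import datetime, timedelta, timezone

# Fixed summer-time offsets (assumption; no DST transitions are modelled).
PDT = timezone(timedelta(hours=-7), "PDT")
CEST = timezone(timedelta(hours=2), "CEST")


def pacific_to_central_europe(hour: int, minute: int = 0) -> datetime:
    """Convert a Pacific wall-clock trigger time to central European time."""
    trigger = datetime(2017, 4, 10, hour, minute, tzinfo=PDT)  # arbitrary date
    return trigger.astimezone(CEST)


for label, hour in (("automated nightly", 3), ("proposed second nightly", 18)):
    local = pacific_to_central_europe(hour)
    print(f"{label}: {hour:02d}:00 PDT -> {local:%Y-%m-%d %H:%M} CEST")
```

With these offsets the 3am Pacific trigger corresponds to noon in central Europe, and a 6pm Pacific trigger to 03:00 the following morning, before build and update generation time is added.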
See https://bugzilla.mozilla.org/show_bug.cgi?id=932211#c5 for an alternate proposal.  I'm still drafting an email to send to various teams/lists.
Component: Release Automation → General Automation
QA Contact: rail → catlee
(In reply to Henrik Skupin (:whimboo) from comment #7)
> (In reply to Sunil Kumar from comment #6)
> > plus there is one more major issue which needs fixing which is if you try to
> > update builds manually(just before partial is released) it downloads full
> > builds(why do full builds are available before partial although partial is
> > smaller and should be built faster/before full builds even if the jobs start
> > together ??). Is this a known issue? Any fixes for that as this bug will
> > trigger that more so .
> 
> I thought that we don't do this anymore. Simon, mind giving us the details?

Will this be fixed soon?
Downloading the entire build every day or every other day kills bandwidth, especially for people like us who have 300/day.
Also, Mozilla's servers must be wasting bandwidth, as most users are downloading full builds instead of the smaller partials.
Flags: needinfo?(sfraser)
Flags: needinfo?(philringnalda)
Flags: needinfo?(pascalc)
Flags: needinfo?(hskupin)
Flags: needinfo?(cbook)
Flags: needinfo?(catlee)
Flags: needinfo?(philringnalda)
(In reply to Mefoster from comment #17)
> (In reply to Henrik Skupin (:whimboo) from comment #7)
> > (In reply to Sunil Kumar from comment #6)
> > > plus there is one more major issue which needs fixing which is if you try to
> > > update builds manually(just before partial is released) it downloads full
> > > builds(why do full builds are available before partial although partial is
> > > smaller and should be built faster/before full builds even if the jobs start
> > > together ??). Is this a known issue? Any fixes for that as this bug will
> > > trigger that more so .
> > 
> > I thought that we don't do this anymore. Simon, mind giving us the details?
> 
> Will this be fixed soon?
> Downloading entire build everyday or every other day kills bandwidth
> especially people like us who have 300/day,
> plus Mozilla severs must be wasting bandwidth as most users are downloading
> full builds instead of partial low size builds

That work is happening in bug 1324922. Please don't spam bugs with needinfo requests.
Flags: needinfo?(sfraser)
Flags: needinfo?(pascalc)
Flags: needinfo?(hskupin)
Flags: needinfo?(cbook)
Flags: needinfo?(catlee)
Aki, is there any blocker left for this to happen? Thanks
Flags: needinfo?(aki)
The alternate proposal does not block, if that's what you're asking.
I'm guessing we're only blocked on finding an owner.
Flags: needinfo?(aki)
Catlee, could you help finding an owner? This is still a recurring issue for us in Europe. Thanks!
Flags: needinfo?(catlee)
This is a pretty simple fix. It doesn't take into account whether any changes have been made since the previous nightly; if we plan on m-c being idle for all hands or whatever, we may want to back this out at those times.
I'm worried about possibly overlapping nightly funsize processes, especially if we end up triggering manual nightlies at some point.

Preferably we'd wait for bug 1324922 to be done.
Flags: needinfo?(catlee)
Not to mention potential disruptions to release processes due to limited worker capacity? Not sure how well we prioritize those jobs?
Chris, do you have an eta for the other bug? 

Ryan, do you have data on this or it is a supposition?
I'm pretty sure I've seen nightly respins hold up signing jobs for release activities in the past, but I can't point to a specific instance offhand. As long as we've ensured that priorities are set properly so that beta/release take precedence over m-c, we should be OK. Or alternatively, add more workers to account for the higher volume of jobs.
How about we try it for a week or two and see if we hit backlogs?
Comment on attachment 8901913 [details]
bug 1349227 - 2 nightlies per day on m-c.

https://reviewboard.mozilla.org/r/173346/#review179228
Attachment #8901913 - Flags: review?(catlee) → review+
Pushed by asasaki@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2f39c0312466
2 nightlies per day on m-c. r=catlee DONTBUILD
can anyone point me to where the addon-dev/unbranded dev builds without signing are going now?

the relevant tinderbox location hasn't been updated since 56 beta 5.
(In reply to Danial Horton from comment #32)
> can anyone point me to where the addon-dev/unbranded dev builds without
> signing are going now?
> 
> the relevant tinderbox location hasn't been updated since 56 beta 5.

I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1385558 for this, but there has been no response so far. Seems Mozilla is just ignoring it.
(In reply to Danial Horton from comment #32)
> can anyone point me to where the addon-dev/unbranded dev builds without
> signing are going now?
> 
> the relevant tinderbox location hasn't been updated since 56 beta 5.

The addon-devel builds for Beta have been superseded by the Developer Edition builds. See bugs 1391283 and 1393071. Please direct further comments there, this bug is specifically about nightly scheduling.
How does this affect Nightly users? Will they now be prompted to update every 12 hours instead of every 24?

My experience as a Nightly user is that the 24 hour restart prompts are a bit annoying and easily the worst part of the Nightly user experience. If they double in frequency, that is even more annoying. I understand that it would be great on the rare case that a bad regression occurs, but that's not common.
Flags: needinfo?(aki)
I believe that's correct, we'll be prompted every 12 hours.
Flags: needinfo?(aki)
I would think/hope that the 'update' mechanism would be smart enough to check which build you are running before offering the update and, since the builds are based on the same cset, there would really be no reason to push another 'update'; the only difference is the time-stamp on the builds.

If this is not the case (I've not tested), then perhaps a bug is needed to 'enhance' the updater.
I just spotted a 'one-off' from my statement above: looking at the 'treeherder' m-c builds, I see a new Nightly has started, which I assume is the 2nd one for today, and it's NOT built from the same cset as this morning's build. 

So it would appear that yes, it will be possible to get two 'updates' per day whenever the 2 Nightlies for the day (USA-wise) have different csets. Not a big deal really.
(In reply to Nicholas Nethercote [:njn] from comment #35)
> How does this affect Nightly users? Will they now be prompted to update
> every 12 hours instead of every 24?
> 
> My experience as a Nightly user is that the 24 hour restart prompts are a
> bit annoying and easily the worst part of the Nightly user experience. If
> they double in frequency, that is even more annoying. I understand that it
> would be great on the rare case that a bad regression occurs, but that's not
> common.

That's one of the reasons the update notification is now far more discreet and less pushy than it used to be. I don't know the exact timer, but I think it is no less than 24 hours, and maybe 48 hours or more, before the pushier version appears.
Someone else might add the correct delay here.
Looking at the prefs in about:config...

Nightly checks for updates every 2 hours, release checks every 12 hours. Beta probably uses release's 12 hours. DevEdition probably picks up the Aurora setting of 8 hours.

Nightly shows an "update available" badge on the hamburger button immediately, everything else shows it after four days.

The big "update available" prompt shows up for nightly users after 12 hours, everything else shows it after 192 hours.
(In reply to Jim Jeffery not reading bug-mail 1/2/11 from comment #37)
> I would think/hope that the 'update' mechanism would be smart enough to see
> what version you are running before offering the update and since the builds
> are based on the same cset - there would be really no reason to push another
> 'update', only difference is time-stamp on the builds.

The two builds are based on the same cset? That seems strange to me. Maybe I have misunderstood how this works?


> Nightly shows an "update available" badge on the hamburger button
> immediately, everything else shows it after four days.
> 
> The big "update available" prompt shows up for nightly users after 12 hours,
> everything else shows it after 192 hours.

As a Nightly user I find the prompts very noticeable, especially the "update available" prompt. Because sometimes restarting isn't a big deal, but sometimes it will cause you to lose state (e.g. position within a page) and you want to delay it. As I said before, I have found that the frequent restarting is definitely the worst part about using Nightly.
(In reply to Nicholas Nethercote [:njn] from comment #41)
> As a Nightly user I find the prompts very noticeable, especially the "update
> available" prompt. Because sometimes restarting isn't a big deal, but
> sometimes it will cause you to lose state (e.g. position within a page) and
> you want to delay it. As I said before, I have found that the frequent
> restarting is definitely the worst part about using Nightly.

As a counterpoint, sometimes I want to update sooner, or even immediately, to verify or try out a patch. So it would be better to provide a pref to configure the update interval, as well as a button to update immediately to the latest builds (tinderbox-builds or m-c builds).
(In reply to Nicholas Nethercote [:njn] from comment #41)
> As I said before, I have found that the frequent
> restarting is definitely the worst part about using Nightly.

Increase app.update.interval if you want fewer automatic updates.

http://kb.mozillazine.org/App.update.interval
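
As a rough illustration, the pref takes a value in seconds according to the linked article, so a desired interval in hours has to be converted. The helper below is hypothetical and simply formats a line that could be placed in a profile's user.js; the same value can also be set by hand in about:config.

```python
def update_interval_pref(hours: float) -> str:
    """Format a user.js line setting app.update.interval (value in seconds)."""
    seconds = int(hours * 3600)
    if seconds <= 0:
        raise ValueError("interval must be positive")
    return f'user_pref("app.update.interval", {seconds});'


# Check for updates once a day instead of every couple of hours.
print(update_interval_pref(24))
```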
> Increase the app.update.interval if you want less automatic update.

That's good to know, but defaults matter. How many people know about that option? It's not easily discoverable.
Is anyone else getting full updates?

I am getting the full MAR instead of a partial if automatic updates are on, or if I manually check around the time updates are generally released.
If I wait a few hours and then check for updates, I get the partial. With 2 nightlies per day, full updates are a big no-no.
(In reply to Nicholas Nethercote [:njn] from comment #44)
> > Increase the app.update.interval if you want less automatic update.
> 
> That's good to know, but defaults matter. How many people know about that
> option? It's not easily discoverable.

Yes, but if people do not update frequently enough, then there's no point in using Nightly.
The point is to be up to date, to help discover issues and give feedback.
An outdated build might even send telemetry values that are no longer correct.
(In reply to hulmelo from comment #45)
> I am getting full mar instead of partial if automatic updates are on or if I
> manually check during the time update are generally released,
> if i wait few hours and check for updates then get partial and with 2
> nightly per-day full updates are big no no.

Yes, please see comment 24. It's a known issue.
I agree with :njn that the prompts to restart are the worst part of using Nightly. Part of this is losing state, but also for example having to type in my master password and redoing HTTP authentication for one of my open tabs. So maybe the update notification can be somewhat decoupled from the availability of updates?

Separately, this makes two of my long-standing other gripes worse:

- Partial MARs AFAICT are still only available one update back, so if you miss one this increases the chance of having to redownload the whole thing, which just seems wasteful.

- When an update has been downloaded, Firefox no longer checks for following updates, so with this increase in frequency, chances also increase that I have to restart twice in a row (bug 353804).
(In reply to Dirkjan Ochtman (:djc) from comment #48)
> - Partial MARs AFAICT are still only available one update back, so if you
> miss one this increases the chance of having to redownload the whole thing,
> which just seems wasteful.

This is not correct. Partial updates are available up to 4 versions back since bug 1176550 was fixed two years ago.

Bug 1324922 means that complete updates are published before partials are ready though, so you may end up getting a complete if you check at the wrong time.
The "About Nightly" dialog displays the Firefox version, but only shows the day. 
For example: " 57.0a1 (2017-09-04) (64-bit) "
However, since there are 2 versions per day from now on, there is no way to tell which one of them I am using, by looking at the "About Nightly" dialog.
I think it should be fixed to display a version identifier that is unique to the Nightly version used.
There's about:buildconfig which gives the source url.
There is also the buildid in about:support
(In reply to Sylvestre Ledru [:sylvestre] from comment #52)
> There is also the buildid in about:support

Even though you can get the version id in other ways, that was not the issue. 
The issue was that IMO the "About Nightly" dialog should display a unique version id. Users expect the "About" dialog to display the version, and that it should be unique.
It did before, the date displayed was unique for each version, but not anymore after this change since there are 2 Nightly versions per day.
It can be fixed easily by adding the hour:minutes of the version to the "About Nightly" dialog, or display the build id.
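
A minimal sketch of what such a label could look like, assuming the build ID is the 14-digit YYYYMMDDHHMMSS timestamp shown in about:support. The sample build IDs and the function names are made up for illustration; this is not how the "About Nightly" dialog is actually implemented.

```python
from datetime import datetime


def parse_build_id(build_id: str) -> datetime:
    """Parse a 14-digit build ID of the form YYYYMMDDHHMMSS."""
    return datetime.strptime(build_id, "%Y%m%d%H%M%S")


def about_label(version: str, build_id: str) -> str:
    """Build a version label that stays unique with two nightlies per day."""
    built = parse_build_id(build_id)
    return f"{version} ({built:%Y-%m-%d %H:%M})"


# Two hypothetical nightlies from the same day.
morning = about_label("57.0a1", "20170904100131")
evening = about_label("57.0a1", "20170904220027")
print(morning)
print(evening)
```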
You should open a separate bug for that :)
(In reply to nivtwig from comment #53)
> (In reply to Sylvestre Ledru [:sylvestre] from comment #52)
> > There is also the buildid in about:support
> 
> Even though you can get the version id in other ways, that was not the
> issue. 
> The issue was that IMO the "About Nightly" dialog should display a unique
> version id. Users expect the "About" dialog to display the version, and that
> it should be unique.
> It did before, the date displayed was unique for each version, but not
> anymore after this change since there are 2 Nightly versions per day.
> It can be fixed easily by adding the hour:minutes of the version to the
> "About Nightly" dialog, or display the build id.

Actually, it could already happen that it wasn't unique, as manual respins could happen.
IMHO, having 2 nightlies per day is a very good idea because with a tool like clouseau (bug 1396527) I'm able to find regressions quicker.
More generally, when a regression is introduced, since the pushlog is smaller, it's easier to identify the guilty patch.
Component: General Automation → General