Closed Bug 918068 Opened 6 years ago Closed 5 years ago

switch b2g builds to use aus4.mozilla.org as their update server

Categories

(Release Engineering :: General, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking

(b2g-v1.4 fixed)

RESOLVED FIXED
Tracking Status
b2g-v1.4 --- fixed

People

(Reporter: kairo, Assigned: bhearsum)

References

Details

(Whiteboard: [leave-open])

Attachments

(5 files, 3 obsolete files)

In a Vidyo meeting in May, catlee agreed to change back the channel values set inside the builds to "standard" names.
Socorro crash-stats as well as possibly platform code is built around a small set of values for the update/release channel, and the hackish path trickery introduced in bug 861426 and bug 862526 is disturbing that significantly. Also, release channels might be used in URLs, and introducing slashes into URLs is a potential security risk as well. catlee said he can fix the update paths differently, with the channel names still being the standard nightly/aurora/beta/release/esr/default values (or some compound of that and a dash-separated suffix).

Also, only releng is using those weird channel names, and we are advising partners/manufacturers to use the standard scheme of channel names so we can analyze them correctly in our crash data; see https://developer.mozilla.org/en-US/docs/Crash_Reporting_Guide_for_Firefox_OS_Partners#Release_Channels


So, all in all, releng should switch to "standard" channel names again, and for different update URLs should probably modify app.update.url directly, or something like that (that variable is set in http://mxr.mozilla.org/mozilla-central/source/b2g/app/b2g.js#504 FYI).
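To make the naming rule concrete, here is a minimal Python sketch of a check for "standard" channel names. The set of base names comes from the description above; the exact suffix grammar (one or more dash-separated alphanumeric parts) and the function name are assumptions for illustration, not anything Socorro or the build system is known to implement.

```python
import re

# Base channel names listed in this bug as the "standard" set.
STANDARD_CHANNELS = {"nightly", "aurora", "beta", "release", "esr", "default"}

# Assumed grammar: a lowercase base name, optionally followed by
# dash-separated suffixes. Slashes can never match.
_CHANNEL_RE = re.compile(r"^([a-z]+)((?:-[A-Za-z0-9.]+)*)$")


def is_standard_channel(name: str) -> bool:
    """Return True if `name` is a standard channel or standard-plus-suffix."""
    match = _CHANNEL_RE.match(name)
    return bool(match) and match.group(1) in STANDARD_CHANNELS


if __name__ == "__main__":
    for candidate in ("nightly", "beta-partner", "device/1.0/nightly"):
        print(candidate, is_standard_channel(candidate))
```

Anchoring the pattern at both ends is what rejects path-style values: a slash anywhere in the string prevents a match, so compound names built with "/" fail while dash-separated ones pass.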
Component: Release Automation → General Automation
QA Contact: bhearsum → catlee
Doesn't strictly block bug 869905, but related.
Blocks: 869905
This requires changing http://mxr.mozilla.org/mozilla-central/source/b2g/app/b2g.js#516 to use more than just %CHANNEL% for the update query.

I don't know who's relying on our OTA updates any more. Naoki, do you know which devices and branches (if any!) we need to support OTA updates for currently?
Flags: needinfo?(nhirata.bugzilla)
As far as I understand it, we still do OTA testing on Hamachi nightly for our smoke test (we were doing it on unagi); having said that, we are currently blocked on it; see bug 907444.

We also do need to do update testing from 1.0.1 to 1.1, 1.1 to 1.2, etc. even if it may be rare to do it.  (Hamachi, inari, unagi)

We're also blocked on leo, due to a driver issue.

I am not sure if I've covered all the devices, but I believe those are the update testing that we have currently.  The future may include helix.
Flags: needinfo?(nhirata.bugzilla)
KaiRo and I spoke on IRC yesterday about this.

We're going to try and modify the URL used for updates to include %B2G_VERSION% and %PRODUCT_MODEL%. We'll also require changes to the mozharness configs to change the channel back to just being 'beta' or 'nightly'
Blocks: 883169
Not currently working on this.

We'd like to switch updates to balrog, and upload MAR files to FTP rather than update.boot2gecko.org. I have some code that adds the new balrog functionality to b2g_build.py, but it needs more testing.

Once we've switched updates over to balrog we can fix the update channel.
Assignee: catlee → nobody
Assignee: nobody → aki
Assignee: aki → bhearsum
Summary: Use standard release channel names for B2G builds, remove slashes → Use standard release channel names for B2G builds, remove slashes (b2g balrog)
Ben:

* partial updates?  - these seem like they're wanted-but-not-blockers and we'll need a game plan
* 1.2? - (bajaj) skip it!
** :bajaj also says "In -tree changes needed here, so lets start with a FxOS on central/master and we can take uplifts accordingly", so looks like Flame on Central is a great starting place.  She's also open to answering any b2g update-related questions.
(In reply to Aki Sasaki [:aki] from comment #6)
> Ben:
> 
> * partial updates?  - these seem like they're wanted-but-not-blockers and
> we'll need a game plan
> * 1.2? - (bajaj) skip it!
> ** :bajaj also says "In -tree changes needed here, so lets start with a FxOS
> on central/master and we can take uplifts accordingly", so looks like Flame
> on Central is a great starting place.  She's also open to answering any b2g
> update-related questions.

If I read my mail and bug 989128 correctly, I think we can also ignore Tarako across all branches?
Aki and I talked about this yesterday, and I think I have a decent understanding of what we need here. I'll be focused on just Flame builds on central/master to start with, so the following is focused on that:
* Make sure MARs are uploaded to the public FTP server
* Post MAR info to Balrog (we can probably do that for all devices right away - it doesn't do any harm).
** Some Mozharness code for this may exist in bug 858797
* Figure out what a b2g update URL looks like using /update/3 format (https://aus4.mozilla.org/update/3/%PRODUCT%/%VERSION%/%BUILD_ID%/%BUILD_TARGET%/%LOCALE%/%CHANNEL%/%OS_VERSION%/%DISTRIBUTION%/%DISTRIBUTION_VERSION%/update.xml).
** Some of this is obvious, but there's questions around a few things:
*** Is %VERSION% the b2g version? What about %OS_VERSION%? Does %OS_VERSION% even mean anything in the b2g world? We could do %VERSION% = gecko version and %OS_VERSION% = b2g version, which might make more semantic sense.
*** Does b2g have an overall buildid?
*** What does "locale" mean in a multilocale context?
* Set-up rule in Balrog
** Hope to use one rule per branch. Eg, FirefoxOS-mozilla-central-nightly-latest will contain latest updates for all devices on central/master
* Test test test test test
* Switch update URL for builds (probably a gecko or b2g change).

Once we switch the URL for builds, that update will be published to update.boot2gecko.org, and any older builds will take that update, and then start looking to aus4.mozilla.org instead. We can then turn off update.boot2gecko.org publishing for Flame on master.

And as a note to myself, we're looking at the non-eng nightlies regardless of device.

--

Once Flame is working, most additional devices/branches can be set-up by changing their update URL.

Aki tells me that some devices (at least Hamachi) require an extra "isOSUpdate" parameter in their update XML, so those devices won't be supported until we add that to the Balrog backend. I would prefer to wait until bug 748698 is fixed to do that, because it will make it trivial to do.
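As a rough illustration of the /update/3 format discussed above, the following Python sketch fills the %TOKEN% placeholders in the template in a single pass. The function name and all of the sample values are illustrative assumptions; what each field should actually carry for b2g is exactly the open question raised in this comment, and the real substitution happens in the client's update service, not in Python.

```python
import re
from urllib.parse import quote

# Template copied from the /update/3 format quoted in this bug.
TEMPLATE = (
    "https://aus4.mozilla.org/update/3/%PRODUCT%/%VERSION%/%BUILD_ID%/"
    "%BUILD_TARGET%/%LOCALE%/%CHANNEL%/%OS_VERSION%/%DISTRIBUTION%/"
    "%DISTRIBUTION_VERSION%/update.xml"
)

_TOKEN_RE = re.compile(r"%([A-Z_]+)%")


def fill_update_url(template: str, values: dict) -> str:
    """Replace each %TOKEN% with its percent-encoded value.

    Raises KeyError if the template uses a token missing from `values`.
    """
    def substitute(match):
        # safe="" also encodes "/", so a value cannot add path segments.
        return quote(values[match.group(1)], safe="")

    return _TOKEN_RE.sub(substitute, template)


if __name__ == "__main__":
    # Placeholder values, chosen only to show the shape of the result.
    sample = {
        "PRODUCT": "B2G",
        "VERSION": "32.0a1",
        "BUILD_ID": "20140101000000",
        "BUILD_TARGET": "flame",
        "LOCALE": "en-US",
        "CHANNEL": "nightly",
        "OS_VERSION": "Example OS 1.0",
        "DISTRIBUTION": "default",
        "DISTRIBUTION_VERSION": "default",
    }
    print(fill_update_url(TEMPLATE, sample))
```

Doing the substitution in one regex pass over the template avoids rescanning already-encoded values, which could otherwise be mistaken for tokens.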
Summary: Use standard release channel names for B2G builds, remove slashes (b2g balrog) → switch b2g builds to use aus4.mozilla.org as their update server
Depends on: 1000204
Depends on: 1000207
Depends on: 1000208
Depends on: 1000212
Depends on: 1000215
Depends on: 1000217
Depends on: 1000221
(In reply to Ben Hearsum [:bhearsum] from comment #8)
> Aki and I talked about this yesterday, and I think I have a decent
> understanding of what we need here. I'll be focused on just Flame builds on
> central/master to start with, so the following is focused on that:
> * Make sure MARs are uploaded to the public FTP server

bug 1000207 for flame/master

> * Post MAR info to Balrog (we can probably do that for all devices right
> away - it doesn't do any harm).

Adding this capability is bug 1000208.

> * Figure out what a b2g update URL looks like using /update/3 format
> (https://aus4.mozilla.org/update/3/%PRODUCT%/%VERSION%/%BUILD_ID%/
> %BUILD_TARGET%/%LOCALE%/%CHANNEL%/%OS_VERSION%/%DISTRIBUTION%/
> %DISTRIBUTION_VERSION%/update.xml).

bug 1000212.

> * Set-up rule in Balrog

bug 1000215.

> * Switch update URL for builds (probably a gecko or b2g change).

We'll deal with this here once the deps are dealt with.

> Once we switch the URL for builds, that update will be published to
> update.boot2gecko.org, and any older builds will take that update, and then
> start looking to aus4.mozilla.org instead. We can then turn off
> update.boot2gecko.org publishing for Flame on master.

bug 1000217.

> Aki tells me that some devices (at least Hamachi) require an extra
> "isOSUpdate" parameter in their update XML, so those devices won't be
> supported until we add that to the Balrog backend. I would prefer to wait
> until bug 748698 is fixed to do that, because it will make it trivial to do.

bug 1000221
(In reply to Ben Hearsum [:bhearsum] from comment #8)
> * Figure out what a b2g update URL looks like using /update/3 format
> (https://aus4.mozilla.org/update/3/%PRODUCT%/%VERSION%/%BUILD_ID%/
> %BUILD_TARGET%/%LOCALE%/%CHANNEL%/%OS_VERSION%/%DISTRIBUTION%/
> %DISTRIBUTION_VERSION%/update.xml).

We need to make sure those replacements even work in the B2G code. Some testing might be useful there.
It's probably the code in http://mxr.mozilla.org/comm-central/source/mozilla/toolkit/mozapps/update/nsUpdateService.js#3542 that drives this for B2G as well.

> ** Some of this is obvious, but there's questions around a few things:
> *** Is %VERSION% the b2g version? What about %OS_VERSION%? Does %OS_VERSION%
> even mean anything in the b2g world? We could do %VERSION% = gecko version
> and %OS_VERSION% = b2g version, which might make more semantic sense.

I think %VERSION% is converted by the global code to be the Gecko version as it's also the version of the "B2G" product on the low levels of the binary.
Sending the B2G_OS_Version, i.e. the actual Firefox OS version string, in OS_VERSION makes sense, but this probably will need a patch to the product code. Right now, there is a token defined that is called %B2G_VERSION% that resolves to that version, maybe we should just put it in the URL instead of %OS_VERSION% for now.

> *** Does b2g have an overall buildid?

Yes, it's the Gecko build ID that we get there.

> *** What does "locale" mean in a multilocale context?

Probably nothing in this case.

What would make more sense is to send the device identifier in some field, see %PRODUCT_MODEL% in the code cited above. If we want to avoid changes to the URL format so that balrog doesn't need to support a different format, maybe that could go in for %DISTRIBUTION% or even %BUILD_TARGET% (after all, the build is created specifically for that device, which is even more detailed build target info than the usual variable) in the current format.
(In reply to Ben Hearsum [:bhearsum] from comment #7)
> (In reply to Aki Sasaki [:aki] from comment #6)
> > Ben:
> > 
> > * partial updates?  - these seem like they're wanted-but-not-blockers and
> > we'll need a game plan
> > * 1.2? - (bajaj) skip it!
> > ** :bajaj also says "In -tree changes needed here, so lets start with a FxOS
> > on central/master and we can take uplifts accordingly", so looks like Flame
> > on Central is a great starting place.  She's also open to answering any b2g
> > update-related questions.
> 
> If I read my mail and bug 989128 correctly, I think we can also ignore
> Tarako across all branches?

This is probably true.  Bug 989128 is going that direction.
No longer depends on: 1000204
Michael, I'm not sure if you're the best person to look at this or not. We decided what we should put into the update URL over in bug 1000212, so I don't think that part is a concern. I'd like to be able to roll this out just for Flame to start with though, but I'm not sure how to do that. I tried comparing against PRODUCT_MODEL, but that doesn't appear to be accessible in this file. Any suggestions?
Attachment #8412709 - Flags: feedback?(mwu)
Comment on attachment 8412709 [details] [diff] [review]
wip to switch flame to aus4/balrog

Discussed on IRC - there's no mechanism by which we can check for specific devices at compile time in gecko builds.
Attachment #8412709 - Flags: feedback?(mwu)
Depends on: 1007180
Attached patch use aus4 only for flame (obsolete) — Splinter Review
Turns out the preprocessor doesn't support > or <. I think this will be good enough for the temporary scenario. Once we're ready to switch all devices over, I'll just remove the #if block and use aus4 for everything. Given that partners have been able to serve updates thus far, I assume such a change won't break them? I.e., they have their own b2g.js or some pref overrides that change the update server?
Attachment #8412709 - Attachment is obsolete: true
Attachment #8418952 - Flags: feedback?(mwu)
Comment on attachment 8418952 [details] [diff] [review]
use aus4 only for flame

Seems ok for now, assuming we switch over completely later. The Flame is updating to KitKat at some point, so the ANDROID_VERSION == 18 check won't hold for too long.
Attachment #8418952 - Flags: feedback?(mwu) → feedback+
Depends on: 1010313
(In reply to Michael Wu [:mwu] from comment #15)
> Comment on attachment 8418952 [details] [diff] [review]
> use aus4 only for flame
> 
> Seems ok for now, assuming we switch over completely later. The Flame is
> updating to KitKat at some point, so the ANDROID_VERSION == 18 check won't
> hold for too long.

I tested this patch by hand, and with all of bug 1001542, I got this update URL:
https://aus4.mozilla.org/update/3/B2G/32.0a1/20140515163731/flame/en-US/nightly/Boot2Gecko%202.0.0.0-prerelease/default/default/update.xml?force=1

...which is exactly what we want. Once bug 1000208 is resolved we'll have the update data being submitted to Balrog, and be able to test the full update process. If all goes well, we should be able to switch Flame over to aus4.m.o next week.
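For anyone wanting to sanity-check a URL like the one above, here is a small Python sketch that splits it back into named fields. The field names follow the /update/3 token order; the function itself is a hypothetical helper for illustration and is not part of Balrog or Gecko.

```python
from urllib.parse import urlsplit, unquote

# Field order of the /update/3 URL format.
FIELDS = [
    "product", "version", "build_id", "build_target", "locale",
    "channel", "os_version", "distribution", "distribution_version",
]


def parse_update_url(url: str) -> dict:
    """Split an /update/3 URL into a dict of decoded field values."""
    segments = urlsplit(url).path.split("/")
    # Expected path shape: /update/3/<nine fields>/update.xml
    if segments[:3] != ["", "update", "3"] or segments[-1] != "update.xml":
        raise ValueError("not an /update/3 URL")
    values = segments[3:-1]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return {name: unquote(value) for name, value in zip(FIELDS, values)}


if __name__ == "__main__":
    observed = (
        "https://aus4.mozilla.org/update/3/B2G/32.0a1/20140515163731/flame/"
        "en-US/nightly/Boot2Gecko%202.0.0.0-prerelease/default/default/"
        "update.xml?force=1"
    )
    for name, value in parse_update_url(observed).items():
        print(f"{name}: {value}")
```

The field-count check also shows why slash-containing channel names are awkward: an unescaped value such as device/version/channel would add extra path segments and the URL would no longer match the nine-field layout.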
Comment on attachment 8418952 [details] [diff] [review]
use aus4 only for flame

Probably should've requested review instead of feedback in the first place...
Attachment #8418952 - Flags: review?(mwu)
Attachment #8418952 - Flags: review?(mwu) → review?(fabrice)
Depends on: 1011486
Depends on: 1011489
Depends on: 1011501
Comment on attachment 8418952 [details] [diff] [review]
use aus4 only for flame

Review of attachment 8418952 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with a comment explaining what the version = 18 means in terms of android dessert name.
Attachment #8418952 - Flags: review?(fabrice) → review+
Depends on: 1015407
aus4 has had enough data to serve updates for the past few days now. A few of us have been dogfooding Flame by pushing app.update.url and app.update.channel prefs to the device. It's all been going well aside from bug 1015967, which appears to be a Gaia issue.

Given this, I intend to switch over the entire Flame mozilla-central/master channel in the next day or so. I want to consult with QA and probably other folks first though.

If the Flame switch goes well, we can switch over the rest of the devices once bugs 1000221, 1011489, and 1015407 are fixed. bug 1000221 may be the long pole there, because it's blocked on another patch for Balrog. Once everything on mozilla-central/master is switched we can talk about whether or not we'll be backporting this work to other branches, and how far.
This patch adds the requested comment and also updates the Flame config to set the update channel properly.

Aki, my plan is to get rid of {nightly_,}_update_channel and the associated logic from b2g_build.py after we kill update.boot2gecko.org. The only thing it's used for besides making/uploading updates is setting the Socorro channel, which we should be updating to follow the new channel the same time this patch lands.

In the meantime, I'll leave all of the old update stuff alone so we continue to upload there, in case we need to fall back to it.
Attachment #8418952 - Attachment is obsolete: true
Attachment #8429252 - Flags: review?(aki)
Haven't tested this yet, wanted to make sure you're OK with how it looks first.
Attachment #8429256 - Flags: feedback?(aki)
Comment on attachment 8429256 [details] [diff] [review]
make socorro use the right channel; add removal todos

Ended up testing this. Seems to have worked.

socorro.json from a Flame build w/ both of these patches applied:
[cltbld@b-linux64-hp-0021.build.releng.scl3.mozilla.com build]$ cat socorro.json 
{"buildid": "20140527040202", "update_channel": "nightly", "version": "32.0a1"}

And from a Nexus (which isn't switching channels yet):
[cltbld@b-linux64-hp-0021.build.releng.scl3.mozilla.com build]$ cat socorro.json 
{"buildid": "20140527040202", "update_channel": "nexus-4/2.0.0/nightly", "version": "32.0a1"}

Kairo, does the Flame's new socorro.json look right to you? Based on my understanding of early comments of this bug, it seems right to me. Though I'm not entirely certain how Socorro finds device information, I assume that's coming from some other field?
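The difference between the two socorro.json samples above can be checked mechanically. This Python sketch reads the file contents and reports whether the channel is still in the old path-style form. The assumption that the old form always ends in the real channel name (as in "nexus-4/2.0.0/nightly") is inferred from that single example, and the helper names are hypothetical.

```python
import json


def base_channel(update_channel: str) -> str:
    """Return the trailing path component of a channel value.

    Assumes legacy values end in the real channel name; values
    without a slash are returned unchanged.
    """
    return update_channel.rsplit("/", 1)[-1]


def describe_socorro(raw_json: str) -> dict:
    """Summarise the channel information in a socorro.json payload."""
    data = json.loads(raw_json)
    channel = data["update_channel"]
    return {
        "update_channel": channel,
        "legacy_format": "/" in channel,
        "base_channel": base_channel(channel),
    }


if __name__ == "__main__":
    flame = '{"buildid": "20140527040202", "update_channel": "nightly", "version": "32.0a1"}'
    nexus = '{"buildid": "20140527040202", "update_channel": "nexus-4/2.0.0/nightly", "version": "32.0a1"}'
    print(describe_socorro(flame))
    print(describe_socorro(nexus))
```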
Attachment #8429256 - Flags: review?(aki)
Attachment #8429256 - Flags: feedback?(kairo)
Attachment #8429256 - Flags: feedback?(aki)
Comment on attachment 8429256 [details] [diff] [review]
make socorro use the right channel; add removal todos

Review of attachment 8429256 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me.
Attachment #8429256 - Flags: feedback?(kairo) → feedback+
Attachment #8429252 - Flags: review?(aki) → review+
Attachment #8429256 - Flags: review?(aki) → review+
Comment on attachment 8429256 [details] [diff] [review]
make socorro use the right channel; add removal todos

Thanks Aki. This patch is a no-op until the other one lands, so I landed it. It needs to get onto production before the other lands. If no one else has done a merge, I'll be doing one tomorrow afternoon to make sure that happens.
Attachment #8429256 - Flags: checked-in+
Comment on attachment 8429252 [details] [diff] [review]
flame only balrog, v2

https://hg.mozilla.org/integration/b2g-inbound/rev/92a49fd23593

Sheriffs told me that it's likely that this will make it into tomorrow morning's nightlies.
Whiteboard: [leave-open]
I took my Flame from update.boot2gecko.org (the early morning build from the 28th) -> aus4.mozilla.org (the evening build from the 28th) and then received another update this morning. Looks like the transition went fine for Flame. Bugs 1000221 and 1011489 need to be fixed before we can switch over the rest of the devices.
Merged to production and deployed.
No longer blocks: 1018276
Bug 1000221 is landed, but still needs to be verified by someone with a device. Once that's done we can switch all other devices on central/master to aus4. This patch will do that.

I'll probably end up landing this on Monday, because I'd rather not do it on a Friday...

After that, we'll probably backport all the necessary stuff to 1.4 so we can kill update.b2g after 1.3 dies.
Attachment #8434956 - Flags: review?(fabrice)
These were scattered about different bugs before. This should include all the changes needed to upload all the necessary bits of device builds to Balrog. Once this and the changes from bugs 1001542 and 1011550 are landed on 1.4, we should be able to switch it over to aus4.mozilla.org.
Attachment #8435005 - Flags: review?(aki)
Attachment #8435005 - Flags: review?(aki) → review+
Comment on attachment 8435005 [details] [diff] [review]
rollup of upload changes needed for 1.4

Drivers, this is another patch needed to move 1.4 over to our production update server. It's a roll-up of a few patches from different bugs to get us uploading the necessary bits to FTP.
Attachment #8435005 - Flags: approval-mozilla-b2g30?
Comment on attachment 8434956 [details] [diff] [review]
switch all devices on master/2.0 to aus4

Vivien, could you review this since Fabrice won't be around for a while? The tl;dr version is that we're ready to switch over the rest of the B2G builds on master/central to aus4.mozilla.org, and this is where we do it. If you need more context, let me know.
Attachment #8434956 - Flags: review?(fabrice) → review?(21)
This patch is for mozilla-central. It needs to land around the same time as the one that switches the update URL so that we set the channel names appropriately.

I think I actually need something similar in the b2g30 patch too....
Attachment #8435208 - Flags: review?(aki)
Comment on attachment 8435005 [details] [diff] [review]
rollup of upload changes needed for 1.4

Going to need to set B2G_UPDATE_CHANNEL for these as well.
Attachment #8435005 - Attachment is obsolete: true
Attachment #8435005 - Flags: approval-mozilla-b2g30?
Attachment #8435208 - Flags: review?(aki) → review+
We talked about the channel name on IRC a bit. I decided to go with "nightly-b2g30" instead of "beta-b2g30" because these are more like nightlies than betas AFAICT. Ie, they get released before any QA is done on them.
Attachment #8435777 - Flags: review?(aki)
Updating backwards dependency.
Blocks: 1000217
No longer depends on: 1000217
(In reply to Ben Hearsum [:bhearsum] from comment #35)
> We talked about the channel name on IRC a bit. I decided to go with
> "nightly-b2g30" instead of "beta-b2g30"

That means it will be hard to support that in Socorro. But for now, we first need to make sure that we get decent support for at least *some* B2G stuff in Socorro, so I won't fight over this; it just will continue to not be supported for even longer than other stuff that might come up.
Attachment #8435777 - Flags: review?(aki) → review+
Attachment #8435777 - Flags: approval-mozilla-b2g30?
Comment on attachment 8435208 [details] [diff] [review]
set update channel for other devices on mozilla-central

https://hg.mozilla.org/mozilla-central/rev/16f3cac5e8fe
Attachment #8435208 - Flags: checked-in+
I've triggered new nightlies to make this testable sooner, too.
(In reply to Ben Hearsum [:bhearsum] from comment #40)
> I've triggered new nightlies to make this testable sooner, too.

All of these look good to me. I pinged QA and asked them to let me know if they hear of any issues.

Aki is going to make sure that these changes make it with the Aurora merge today, and someone (probably me) will need to set-up rules after the first nightly reports in.

The only thing left to do here now is backport to 1.4.
Comment on attachment 8434956 [details] [diff] [review]
switch all devices on master/2.0 to aus4

Another patch to move b2g30 to aus4.m.o
Attachment #8434956 - Flags: approval-mozilla-b2g30?
Comment on attachment 8434956 [details] [diff] [review]
switch all devices on master/2.0 to aus4

discussed offline with :bhearsum
Attachment #8434956 - Flags: approval-mozilla-b2g30? → approval-mozilla-b2g30+
Attachment #8435777 - Flags: approval-mozilla-b2g30? → approval-mozilla-b2g30+
As of tomorrow, everyone on central, aurora, and b2g30 should be getting updates through aus4.mozilla.org. The only follow-up is to fix the MAR locations for Dolphin builds, which is bug 1023523.

Thanks to all who helped with this.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Component: General Automation → General