Closed Bug 899095 Opened 7 years ago Closed 7 years ago

Switch to http://talos-bundles.pvt.b.m.o for talos bundles

Categories

(Infrastructure & Operations :: Change Requests, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

References

Details

(Whiteboard: [leave open])

Attachments

(5 files)

Process:
 - land changes to buildbot-configs, buildbotcustom, and sut_tools
 - Buildbot reconfig
 - ?? anything needed for sut_tools?

Impact:
 - The new and old URLs serve the same content, so no impact.

Rollback:
 - Revert changes in hg
 - Buildbot reconfig

People:
 - Need someone from releng to do the landings and reconfigs
Flags: cab-review?
(This is the CAB bug for bug 657046)
Blocks: 657046
Flags: cab-review? → cab-review+
This can be done by changing the URL value for talos.zip. It can be changed in advance. No downtime is needed. You can even try it on the try server first.

https://hg.mozilla.org/mozilla-central/file/default/testing/talos/talos.json
hwine, no downtime needed for this bug.
Armen, MXR shows that URL in a number of other places:
  https://mxr.mozilla.org/build/search?string=build.mozilla.org/talos
It's not just the ZIPs.  Here are the URLs, with hit counts on one of the two webheads, from the month of August:

[root@web1.releng.webapp.scl3 build.mozilla.org-internal]# cat access_2013-08-* | cut -d] -f 2 | cut -d\" -f 2 | cut -d' ' -f 2 | sort | uniq -c | sort -n
     11 /talos/zips/talos.38e088867f7b.zip
     13 /talos/zips/talos.e405acebbbf9.zip
     25 /talos/zips/talos.5f421eb0d1ed-tresize.zip
     31 /talos/tools/buildfarm/utils/installdmg.sh
     87 /talos/zips/talos.a11542b55a70.zip
    225 /talos/zips/mobile_tp4.zip
    258 /talos/mobile/sutAgentAndroid.1.17.apk
    281 /talos/tools/buildfarm/maintenance/count_and_reboot.py
    288 /talos/zips/retry.zip
    325 /talos/zips/flash64_11_0_d1_98.zip
    354 /talos/zips/flash32_10_3_183_5.zip
    445 /talos/zips/tp5n.zip
    447 /talos/profiles/dirtyDBs.zip
    459 /talos/profiles/dirtyMaxDBs.zip
    507 /talos/zips/talos.fcbb9d7d3c78.zip

All of those files are available at http://talos-bundles.pvt.b.m.o, without the /talos/ directory component.
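
The mapping described above (same files, new host, no /talos/ directory component) can be sketched as a small helper. This is only an illustration: the function name `to_talos_bundles` and the constants are hypothetical, the old host is assumed to be build.mozilla.org as in the logs above, and the code is written for Python 3 rather than whatever the actual patches use.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed values, taken from the URLs quoted in this bug.
OLD_HOST = "build.mozilla.org"
NEW_HOST = "talos-bundles.pvt.build.mozilla.org"
OLD_PREFIX = "/talos/"


def to_talos_bundles(url):
    """Return the talos-bundles equivalent of an old build.mozilla.org/talos URL.

    URLs that do not point at the old location are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc != OLD_HOST or not parts.path.startswith(OLD_PREFIX):
        return url
    # Drop the leading /talos/ component but keep the rest of the path.
    new_path = "/" + parts.path[len(OLD_PREFIX):]
    return urlunsplit((parts.scheme, NEW_HOST, new_path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(to_talos_bundles("http://build.mozilla.org/talos/zips/tp5n.zip"))
```

Something like this could be used to audit the MXR hits before patching each repository by hand.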
Flags: needinfo?(armenzg)
Attachment #788137 - Flags: review?(bugspam.Callek)
Flags: needinfo?(armenzg)
Assignee: server-ops → armenzg
Status: NEW → ASSIGNED
Attachment #788140 - Flags: review?(bugspam.Callek)
Attachment #788138 - Flags: review?(aki) → review+
I'm pushing this to try.

We'll have to create a patch like this for every tree.
Attachment #788230 - Flags: review?(jmaher)
Comment on attachment 788230 [details] [diff] [review]
[checked-in] mc.diff

Review of attachment 788230 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/talos/talos.json
@@ +24,5 @@
>                  "tspaint_places_generated_max"
>              ],
>              "talos_addons": [
> +                "http://talos-bundles.pvt.build.mozilla.org/profiles/dirtyDBs.zip",
> +                "http://talos-bundles.pvt.build.mozilla.org/profiles/dirtyMaxDBs.zip"

These two .zip files are updated daily on build.m.o.  Are they also being updated daily on talos-bundles.pvt.build.m.o?
Attachment #788230 - Flags: review?(jmaher) → review+
The URLs are served from the same directory.
Comment on attachment 788137 [details] [diff] [review]
[checked-in & live] switch_talos_bundles.tools.diff

Review of attachment 788137 [details] [diff] [review]:
-----------------------------------------------------------------

/me is sad that this doesn't resolve (for me) on VPN, but it is the correct change
Attachment #788137 - Flags: review?(bugspam.Callek) → review+
Attachment #788139 - Flags: review?(bugspam.Callek) → review+
Attachment #788140 - Flags: review?(bugspam.Callek) → review+
Comment on attachment 788137 [details] [diff] [review]
[checked-in & live] switch_talos_bundles.tools.diff

https://hg.mozilla.org/build/tools/rev/b36268aa7570
Attachment #788137 - Attachment description: switch_talos_bundles.tools.diff → [checked-in] switch_talos_bundles.tools.diff
Comment on attachment 788138 [details] [diff] [review]
[checked-in & live] switch_talos_bundles.mozharness.diff

Mozharness patch is now live.
http://hg.mozilla.org/build/mozharness/rev/f8b737e5127d
Attachment #788138 - Attachment description: switch_talos_bundles.mozharness.diff → [checked-in & live] switch_talos_bundles.mozharness.diff
Comment on attachment 788140 [details] [diff] [review]
[checked-in & live] switch_talos_bundles.diff

https://hg.mozilla.org/build/buildbot-configs/rev/fd5262822e4d
Attachment #788140 - Attachment description: switch_talos_bundles.diff → [checked-in & not live] switch_talos_bundles.diff
Comment on attachment 788139 [details] [diff] [review]
[checked-in & live] switch_talos_bundles.bc.diff

https://hg.mozilla.org/build/buildbotcustom/rev/ebe52936fcdf
Attachment #788139 - Attachment description: switch_talos_bundles.bc.diff → [checked-in & not live] switch_talos_bundles.bc.diff
Comment on attachment 788230 [details] [diff] [review]
[checked-in] mc.diff

https://hg.mozilla.org/integration/mozilla-inbound/rev/d5a9b3ef1706
Attachment #788230 - Attachment description: mc.diff → [checked-in] mc.diff
Whiteboard: [leave open]
Comment on attachment 788137 [details] [diff] [review]
[checked-in & live] switch_talos_bundles.tools.diff

Deployed with this:
python buildfarm/maintenance/manage_foopies.py -H all -j 16 -f buildfarm/mobile/devices.json update
Attachment #788137 - Attachment description: [checked-in] switch_talos_bundles.tools.diff → [checked-in & live] switch_talos_bundles.tools.diff
Adding some sheriffs just in case something happens.
It should not but we never know.
(Adding remaining sheriffs :-))
Mass back out from production.
We should figure out on Cedar whether this is involved in the tree burning.
All I see on the webheads for the new URL is a few of these:

10.22.81.211 - - [13/Aug/2013:09:14:29 -0700] "GET /mobile/sutAgentAndroid.1.17.apk HTTP/1.1" 200 228341 "-" "Python-urllib/2.7"

and no errors
Attachment #788138 - Attachment description: [checked-in & live] switch_talos_bundles.mozharness.diff → [checked-in] switch_talos_bundles.mozharness.diff
(In reply to Dustin J. Mitchell [:dustin] from comment #23)
> All I see on the webheads for the new URL is a few of these:
> 
> 10.22.81.211 - - [13/Aug/2013:09:14:29 -0700] "GET
> /mobile/sutAgentAndroid.1.17.apk HTTP/1.1" 200 228341 "-" "Python-urllib/2.7"
> 
> and no errors

I had to backout the mozharness part of it.
We're still waiting on a reconfig.

You should also start seeing this file being reached:
http://talos-bundles.pvt.build.mozilla.org/zips/talos.fcbb9d7d3c78.zip

Only *Mobile* talos jobs after this changeset should start using it:
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=d5a9b3ef1706&jobname=mozilla-inbound%20talos%20remote

See in the log:
https://tbpl.mozilla.org/php/getParsedLog.php?id=26492716&tree=Mozilla-Inbound&full=1
> INFO: Downloading http://talos-bundles.pvt.build.mozilla.org/zips/talos.fcbb9d7d3c78.zip as talos.zip
In production.
Merged again to the production branch of mozharness.
Live as of ~7:40AM PDT.
Attachment #788138 - Attachment description: [checked-in] switch_talos_bundles.mozharness.diff → [checked-in & live] switch_talos_bundles.mozharness.diff
Attachment #788139 - Attachment description: [checked-in & not live] switch_talos_bundles.bc.diff → [checked-in & live] switch_talos_bundles.bc.diff
Attachment #788140 - Attachment description: [checked-in & not live] switch_talos_bundles.diff → [checked-in & live] switch_talos_bundles.diff
Assignee: armenzg → dustin
(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) (EDT/UTC-4) from comment #28)

Please don't double-land on b2g18 and b2g18_v1_1_0hd unless there is some actual branch-specific reason for doing so. Non-branch-specific landings for v1_1_0hd are handled by daily merges from b2g18, and double-landing makes the history more confusing.
I see five of these in the last half-hour:

10.22.81.211 - - [15/Aug/2013:06:03:07 -0700] "GET /talos/zips/talos.fcbb9d7d3c78.zip HTTP/1.1" 200 17031973 "-" "Python-urllib/2.6"

The source IP is the load balancer, so that's not terribly helpful.  Should I see these requests dwindle over the course of the day?
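
One way to watch whether requests for the old /talos/ paths dwindle is to tally them per hour from the access log. The following is a rough sketch, assuming log lines in the format quoted above; the regular expression and the function name `old_url_hits_per_hour` are illustrative, not part of any existing tool.

```python
import re
from collections import Counter

# Matches the access-log format quoted in this bug, e.g.
# 10.22.81.211 - - [15/Aug/2013:06:03:07 -0700] "GET /talos/... HTTP/1.1" 200 ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):(?P<hour>\d{2}):[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def old_url_hits_per_hour(lines, prefix="/talos/"):
    """Count requests whose path starts with the old prefix, keyed by (day, hour)."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path").startswith(prefix):
            hits[(match.group("day"), match.group("hour"))] += 1
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python tally.py < access_2013-08-15
    for (day, hour), count in sorted(old_url_hits_per_hour(sys.stdin).items()):
        print(day, hour, count)
```

Lines that do not match the expected format are skipped silently, which is fine for a quick check but would hide malformed entries.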
With actual source IPs:
> 10.12.50.46 - - [15/Aug/2013:07:08:14 -0700] "GET /talos/zips/talos.5f421eb0d1ed-tresize.zip HTTP/1.1" 200 9769899 "-" "Python-urllib/2.5"
> 10.12.49.211 - - [15/Aug/2013:07:08:34 -0700] "GET /talos/zips/talos.5f421eb0d1ed-tresize.zip HTTP/1.1" 200 9769899 "-" "Python-urllib/2.5"
> 10.12.49.184 - - [15/Aug/2013:07:24:40 -0700] "GET /talos/zips/talos.5f421eb0d1ed-tresize.zip HTTP/1.1" 200 9769900 "-" "Python-urllib/2.5"
> 10.12.49.198 - - [15/Aug/2013:07:24:40 -0700] "GET /talos/zips/talos.5f421eb0d1ed-tresize.zip HTTP/1.1" 200 9769900 "-" "Python-urllib/2.5"
> 10.12.49.188 - - [15/Aug/2013:07:24:41 -0700] "GET /talos/zips/talos.5f421eb0d1ed-tresize.zip HTTP/1.1" 200 9769899 "-" "Python-urllib/2.5"
> 10.12.49.194 - - [15/Aug/2013:07:24:41 -0700] "GET /talos/zips/talos.5f421eb0d1ed-tresize.zip HTTP/1.1" 200 9769899 "-" "Python-urllib/2.5"
> 10.12.51.149 - - [15/Aug/2013:07:25:17 -0700] "GET /talos/zips/talos.5f421eb0d1ed-tresize.zip HTTP/1.1" 200 9769899 "-" "Python-urllib/2.5"
> 10.12.50.85 - - [15/Aug/2013:07:25:28 -0700] "GET /talos/zips/talos.5f421eb0d1ed-tresize.zip HTTP/1.1" 200 9769899 "-" "Python-urllib/2.5"
> 10.12.50.162 - - [15/Aug/2013:07:25:29 -0700] "GET /talos/zips/talos.5f421eb0d1ed-tresize.zip HTTP/1.1" 200 9769899 "-" "Python-urllib/2.5"
> 10.12.51.212 - - [15/Aug/2013:07:25:42 -0700] "GET /talos/zips/talos.5f421eb0d1ed-tresize.zip HTTP/1.1" 200 9769899 "-" "Python-urllib/2.5"
> 10.12.50.29 - - [15/Aug/2013:07:25:44 -0700] "GET /talos/zips/talos.5f421eb0d1ed-tresize.zip HTTP/1.1" 200 9769899 "-" "Python-urllib/2.5"
> 10.12.49.210 - - [15/Aug/2013:07:25:42 -0700] "GET /talos/zips/talos.5f421eb0d1ed-tresize.zip HTTP/1.1" 200 9769899 "-" "Python-urllib/2.5"
> 10.12.51.201 - - [15/Aug/2013:07:25:45 -0700] "GET /talos/zips/talos.5f421eb0d1ed-tresize.zip HTTP/1.1" 200 9769899 "-" "Python-urllib/2.5"

So the list of hosts is

talos-r3-w7-099.build.scl1.mozilla.com.
talos-r3-w7-109.build.scl1.mozilla.com.
talos-r3-w7-118.build.scl1.mozilla.com.
talos-r3-w7-122.build.scl1.mozilla.com.
talos-r3-w7-128.build.scl1.mozilla.com.
talos-r3-w7-132.build.scl1.mozilla.com.
talos-r3-xp-078.build.scl1.mozilla.com.
talos-r3-xp-095.build.scl1.mozilla.com.
talos-r3-xp-104.build.scl1.mozilla.com.
talos-r3-xp-106.build.scl1.mozilla.com.
talos-r3-xp-120.build.scl1.mozilla.com.
talos-r3-xp-121.build.scl1.mozilla.com.
talos-r3-xp-130.build.scl1.mozilla.com.

Is it possible this is from a try push based on a commit before your patches above?
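
The step from the source IPs to the host list above amounts to collecting the distinct client addresses and doing a reverse DNS lookup on each. A minimal sketch, assuming the client IP is the first field of each log line; the helper names are hypothetical, and the lookup only returns useful names when run somewhere that can resolve the internal addresses.

```python
import socket


def source_ips(lines):
    """Return the distinct client IPs in first-seen order.

    A leading "> " quote marker, as in the pasted log lines, is ignored.
    """
    seen = []
    for line in lines:
        fields = line.lstrip("> ").split()
        if fields and fields[0] not in seen:
            seen.append(fields[0])
    return seen


def reverse_lookup(ip):
    """Return the host name for an IP, or None if it cannot be resolved."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def hosts_for(lines, resolve=reverse_lookup):
    """Map each distinct client IP to the name returned by the resolver."""
    return {ip: resolve(ip) for ip in source_ips(lines)}
```

The resolver is passed in as a parameter so the same code can be exercised with a static table when DNS is unavailable.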
Flags: needinfo?(armenzg)
It is most likely changesets prior to my landing.

It could be one of these cases:
* mozilla-central based branches that have not yet merged from mozilla-central
* project branches that are based on older branches which are not merging from mozilla-central
* try pushes of older revisions
* b2g18 PGO triggers of older changesets
** I believe we do this before merging to b2g18_1_1_0_hd

The list shows rev3 minis, which means either esr17- or b2g18-based branches.

Let's look at one of the slaves e.g. talos-r3-w7-099:
> 10.12.50.85 - - [15/Aug/2013:07:25:28 -0700] "GET /talos/zips/talos.5f421eb0d1ed-tresize.zip HTTP/1.1" 200 9769899 "-" "Python-urllib/2.5"
https://secure.pub.build.mozilla.org/buildapi/recent/talos-r3-w7-099
https://tbpl.mozilla.org/?tree=Mozilla-B2g18&rev=692d3414bb12&jobname=pgo%20talos

Changeset 692d3414bb12 is prior to my landing.

dustin, can we check next week? I expect us to have a much lower number of hits.
Is there the possibility of placing redirects?
If needed, could we push the switchover?

We can start announcing on dev.platform that older changesets will need to apply the attachment above; what do you think?
Flags: needinfo?(armenzg)
Sure - we can wait until Sept 30 when build.mozilla.org will be turned off (see bug 604688) to turn off the old URLs here.  A note to dev.platform now would be helpful, too.  Will you be able to send that?

As long as Sept 30 is OK for the http://build.mozilla.org/talos URLs being disabled, I think we can close this bug.
Note to dev.platform done. [1]

Sept. 30th is good.


[1]
Hi,
After Sep. 30th we will not be grabbing files anymore from http://build.mozilla.org/talos but from http://talos-bundles.pvt.build.mozilla.org.

As of today, all changes have landed and been made live for all of our development trees (including esr and b2g18 trees).

Any _talos_ jobs that are pushed to the try server with older changesets [1] or any _talos_ jobs re-triggered on older changesets will fail.

To avoid this on try pushes, make sure to change the following two files:
- testing/talos/talos.json
- testing/talos/talos_from_code.py

For re-triggers, you will have to push to the try server with the mentioned changes.

best regards,
Armen

[1]
https://hg.mozilla.org/mozilla-central/rev/d5a9b3ef1706
https://hg.mozilla.org/releases/mozilla-aurora/rev/932868752f82
https://hg.mozilla.org/releases/mozilla-beta/rev/32c8013dbee9
https://hg.mozilla.org/releases/mozilla-release/rev/b5e2368423eb
https://hg.mozilla.org/releases/mozilla-esr17/rev/8ac7eb904308
https://hg.mozilla.org/releases/mozilla-b2g18/rev/5bd5ac591922
https://hg.mozilla.org/releases/mozilla-b2g18_v1_1_0_hd/rev/30741bd27846
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
No longer blocks: 2013-08-24-maint
Product: mozilla.org → Infrastructure & Operations
Change Request: --- → approved
Flags: cab-review+