Closed
Bug 1093196
Opened 10 years ago
Closed 7 years ago
reschedule HSTS and HPKP automatic updates to run daily, and be visible on treeherder
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task, P5)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: keeler, Assigned: aobreja)
References
Details
Attachments
(3 files)
1.77 KB, patch (kmoir: review+, aobreja: checked-in+)
1.79 KB, patch
116.77 KB, image/png
The HSTS and HPKP automatic update scripts aren't quite fool-proof yet (see e.g. bug 1092606), and they occasionally break the build. It's particularly a bummer when this happens since they're currently scheduled to happen on Saturdays, when nobody is around. Let's re-schedule for something like Thursday mornings.
Comment 1•10 years ago
What do you want to do, among things that are possible, when-not-if it loses a push race? The reason they run Saturday morning is because nobody wants to write the code to deal with someone else pushing between the time the updater pulls and when it pushes.
Reporter
Comment 2•10 years ago
The scripts update files that only they touch (i.e. normally no human-initiated commit touches those files). If there's a push between the time the scripts check out and when they check in, in the majority of cases an automatic merge should be successful. If not, they can abandon the attempted changes and send an email that they failed or something.
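The approach described in comment 2 can be sketched roughly as follows. This is only an illustration, not the updater's actual code: `land_update`, `PushRaceError`, and the four callables are hypothetical names, and the real pull, merge, and push steps would be whatever version-control commands the updater already runs.

```python
from typing import Callable


class PushRaceError(Exception):
    """Hypothetical error raised by `push` when someone else pushed first."""


def land_update(
    generate: Callable[[], None],
    pull_and_merge: Callable[[], bool],
    push: Callable[[], None],
    notify_failure: Callable[[str], None],
    max_attempts: int = 3,
) -> bool:
    """Generate the update once, then try to land it, retrying lost push races.

    Returns True if the push succeeded and False if the update was abandoned.
    """
    generate()  # the expensive step; done once, outside the retry loop
    for _ in range(max_attempts):
        # Pull whatever landed in the meantime and merge it with our change.
        if not pull_and_merge():
            notify_failure("automatic merge failed; update abandoned")
            return False
        try:
            push()
            return True
        except PushRaceError:
            # Someone pushed between our pull and our push; pull again.
            continue
    notify_failure("update abandoned after %d lost push races" % max_attempts)
    return False
```

Because the generated files are normally touched only by the scripts, the merge step is expected to succeed in most cases; the failure paths only report and give up rather than trying anything cleverer.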
Reporter
Comment 5•8 years ago
Sure, we could. Note that the HSTS updater takes on the order of an hour; the HPKP updater is faster.
Flags: needinfo?(dkeeler)
Comment 6•8 years ago
OOC, why do they check in this error log? https://dxr.mozilla.org/mozilla-central/source/security/manager/ssl/StaticHPKPins.errors
Would those be better as logs in the Treeherder job?
Reporter
Comment 7•8 years ago
Yeah - as long as they're accessible somewhere, I don't think we need to check (either of) the error logs in.
Updated•8 years ago
Summary: reschedule HSTS and HPKP automatic updates for Thursday mornings (PST) or something → reschedule HSTS and HPKP automatic updates to run daily, and be visible on treeherder
Updated•8 years ago
Component: General Automation → Buildduty
QA Contact: catlee → bugspam.Callek
Comment 8•8 years ago
That's a bit of a problem with our plans for autoland: we want to never have merges when we "merge" autoland to m-c, which requires that there not be anything on m-c below the merge point that isn't also on autoland. Having this land on m-c every day would therefore require that it happen at a time when a sheriff is available to merge it to autoland, and autoland couldn't be merged back until that push (or whatever push above it has backed out everything busted) had finished PGO builds.
We could half-ass around it by just having actual merges from autoland for a while, until the ocean boils and there are hardly any pushes going to mozilla-inbound and this can be switched to push there without much fear of push races, but the ideal would be either teaching this to deal with push races or, even prettier, teaching it to do whatever it would take to let autoland do its landing for it.
Assignee
Comment 9•8 years ago
Could you please tell me on which server I can see the logs that are generated each Saturday?
Flags: needinfo?(dkeeler)
Reporter
Comment 10•8 years ago
https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-central-linux64/
https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-aurora-linux64/
https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-esr45-linux64/
(they're at the bottom - search for "periodicupdate")
Flags: needinfo?(dkeeler)
Comment 11•8 years ago
Andrei, you asked earlier in the day re updating treeherder so these jobs appear.
I think you need to write a patch to include these jobs here:
github.com:mozilla/treeherder-service.git
treeherder/etl/buildbot.py
and update the tests as well:
tests/etl/test_buildbot.py
You can probably ask questions in #treeherder if you need more details.
Assignee
Updated•8 years ago
Assignee: nobody → aobreja
Assignee
Comment 12•8 years ago
Patch for the buildbotcustom repository; it should reschedule the HSTS and HPKP automatic updates to run daily at 3:02 AM.
Attachment #8798857 - Flags: review?(kmoir)
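To illustrate what the schedule change buys, here is a small standalone sketch comparing when the next run happens under a daily schedule and under a Saturday-only one. It is not the buildbotcustom patch itself: the function names are made up, datetimes are assumed to be naive and in the scheduler's local time, and reusing 3:02 AM for the weekly case is an assumption made only so the two are comparable.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now: datetime, at: time = time(3, 2)) -> datetime:
    """Return the first moment strictly after `now` that falls on `at`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def next_weekly_run(now: datetime, weekday: int = 5, at: time = time(3, 2)) -> datetime:
    """Same as above, but only on one weekday (Monday is 0, so Saturday is 5)."""
    candidate = next_daily_run(now, at)
    while candidate.weekday() != weekday:
        candidate += timedelta(days=1)
    return candidate
```

With a daily schedule, a failed run is retried within a day and on a day when people are around, instead of waiting up to a week for the next Saturday.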
Assignee
Comment 13•8 years ago
Patch that should make these jobs visible on treeherder.
Thanks Kim for the hint.
Attachment #8798858 - Flags: feedback?(kmoir)
Comment 14•8 years ago
That's not going to make it visible on treeherder because, just like tbpl before it, what treeherder really is is "a list of pushes, and the jobs that are pending/running/finished on them" rather than "a list of jobs, some of them associated with pushes."
Unlike pretty much everything else we run (release tagging in the non-release-promotion world is the only other direct parallel), the periodic update job doesn't start out with a revision that it ran on, it instead either creates a revision if it succeeds or doesn't create one if it fails in any way.
Even if you altered the script to capture the revision it creates by pushing, and altered the job to suddenly be run on that revision rather than on no revision as the final step (which I don't have any idea about the feasibility of doing), you're still only going to be able to usefully make it visible on treeherder when it succeeds, and not when it fails, since when it fails there's no revision where it makes the tiniest bit of sense to display it.
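The data model described in comment 14 can be pictured with a toy example. This is not treeherder's code or schema; `group_jobs_by_push` and the tuple layout are invented purely to show why a job without a revision has nowhere to be displayed.

```python
from typing import Dict, List, Optional, Tuple

# (job name, revision the job ran on, or None if it ran on no revision)
Job = Tuple[str, Optional[str]]


def group_jobs_by_push(
    pushes: List[str], jobs: List[Job]
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Attach each job to its push; return (per-push view, jobs with no push)."""
    view: Dict[str, List[str]] = {revision: [] for revision in pushes}
    orphans: List[str] = []
    for name, revision in jobs:
        if revision in view:
            view[revision].append(name)
        else:
            # No matching push: a push-centric UI has no row to show this on.
            orphans.append(name)
    return view, orphans
```

In this picture the periodic update job always ends up in `orphans`, because it starts with no revision and only produces one if it succeeds.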
Comment 15•8 years ago
Comment on attachment 8798858 [details] [diff] [review]
bug1093196_treeherder.patch
Usually I get someone from the treeherder team to review these requests. (edmorley) Also, usually I create a pull request against their github repo which allows you to run the tests and see if they pass.
Attachment #8798858 - Flags: feedback?(kmoir)
Updated•8 years ago
Attachment #8798857 - Flags: review?(kmoir) → review+
Assignee
Updated•8 years ago
Attachment #8798857 - Flags: checked-in+
Comment 16•8 years ago
Comment on attachment 8798857 [details] [diff] [review]
bug1093196_b_custom_notifications.patch
http://hg.mozilla.org/build/buildbotcustom/rev/3fbd15421d2a to remove the accidental commit of misc.py.orig in 3d94b2506858,
http://hg.mozilla.org/build/buildbotcustom/rev/c3eb75097fe3 to merge to production.
Comment 17•8 years ago
Note that this still doesn't solve the fundamental problem that caused the recent public stir - namely that changes to our release cycle have invalidated the assumption that not doing these updates on Beta is OK because we release often enough that users will get a newer version before we run out of time. Running every day on Trunk/Aurora isn't going to magically do anything to change the fact that we can go 7+ weeks without an update after we go to Beta. And that we throttle updates to release users for multiple weeks sometimes.
We still need a better story for not cutting it so close with the expiration date if we want to avoid ever getting stuck in this situation again.
Comment 18•8 years ago
(And note that I already had to do a manual update to the expiration time for Fx50 to avoid the same problem happening again months after the last time)
Assignee
Comment 19•8 years ago
By checking the latest logs from (1) I found that we don't have a revision number which should be set in set_script_properties (see (2) on step 8).
The script_repo_revision is getting a value here (2) in step 6 (get_script_repo_revision) but the value is not set in revision in step 8, so we don't have a value for this revision.
Here we have an example of what is run on set_script_properties where we should set the revision number: (3)
My guess is that we should force the revision to take the value of the script_repo_revision.
(1) https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-aurora-linux64/
(2) http://buildbot-master72.bb.releng.usw2.mozilla.com:8001/builders/Linux%20x86-64%20mozilla-aurora%20periodic%20file%20update/builds/4
(3) http://buildbot-master72.bb.releng.usw2.mozilla.com:8001/builders/Linux%20x86-64%20mozilla-aurora%20periodic%20file%20update/builds/4/steps/set_script_properties/logs/stdio
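The guess in comment 19 amounts to a simple fallback, sketched here with a plain dict standing in for the build's properties. This is not buildbot's properties API, and `with_revision_fallback` is a hypothetical helper; only the property names `revision` and `script_repo_revision` come from the comment.

```python
def with_revision_fallback(properties: dict) -> dict:
    """Return a copy whose "revision" falls back to "script_repo_revision"."""
    result = dict(properties)
    # Only fill in "revision" when it is missing or empty and a fallback exists.
    if not result.get("revision") and result.get("script_repo_revision"):
        result["revision"] = result["script_repo_revision"]
    return result
```

Whether that value is a revision treeherder could meaningfully display the job against is a separate question, as comment 14 points out.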
Assignee
Comment 20•8 years ago
The set_script_properties step is set here (4)
To set the revision number we can also use this part (5) (lines 52-58).
(4) http://hg.mozilla.org/build/buildbotcustom/file/default/process/factory.py
(5) http://hg/build/tools/file/tip/scripts/valgrind/valgrind.sh
Comment 21•8 years ago
Andrei, what's the current status of this bug? Perhaps we can discuss in the standup tomorrow morning so we can get it unblocked
Flags: needinfo?(aselagea)
Comment 22•8 years ago
I guess this was intended for Andrei, so shifting the ni request to him :)
Flags: needinfo?(aselagea) → needinfo?(aobreja)
Comment 23•8 years ago
I talked to Andrei about this in our standup and he said that the job runs once a day. The problem is that because the script doesn't reference a revision, it doesn't appear on treeherder.
Flags: needinfo?(aobreja)
Comment 24•7 years ago
Note explaining the priority level: P5 doesn't mean we've lowered the priority; quite the contrary. However, we're aligning these levels to the buildduty quarterly deliverables, where P1-P3 are taken by our daily waterline KTLO operational tasks.
Priority: -- → P5
Comment 25•7 years ago
Can we close this now that this work has landed?
Flags: needinfo?(bugspam.Callek)
Comment 26•7 years ago
Yes, this was fixed by scheduling via taskcluster (so the buildbot job has a revision to work from). Bug 1402457
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(bugspam.Callek)
Resolution: --- → FIXED
Comment 27•7 years ago
(In reply to Justin Wood (:Callek) from comment #26)
> Yes, this was fixed by scheduling via taskcluster (so the buildbot job has a
> revision to work from). Bug 1402457
Awesome, thanks!
Updated•7 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard