Closed Bug 1266229 Opened 9 years ago Closed 9 years ago

Final fixes prior to turning on ingestion of jobs from a taskcluster exchange

Categories

(Tree Management :: Treeherder: Data Ingestion, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: camd, Assigned: camd)

References

Details

Attachments

(6 files)

The primary work for this is done. But there are a few issues here and there that need fixing. Subscribing to a task-cluster owned exchange wasn't working. Have a bug in my Queue creation/binding... The data format needs some tweaks Create a process/server that runs the ingest_from_pulse management command If any of these get kind of big, I'll break into a separate bug. But much of these will be tweaks.
Assignee: nobody → cdawson
Blocks: 1169320
Component: Treeherder → Treeherder: Data Ingestion
Blocks: 1266584
Comment on attachment 8746862 [details] [review] [treeherder] mozilla:pulse-schema-changes > mozilla:master some preliminary work to turning on pulse ingestion
Attachment #8746862 - Flags: review?(emorley)
Comment on attachment 8746862 [details] [review] [treeherder] mozilla:pulse-schema-changes > mozilla:master nm... Dang, got test failures I hadn't noticed... Sorry.
Attachment #8746862 - Flags: review?(emorley)
Comment on attachment 8746862 [details] [review] [treeherder] mozilla:pulse-schema-changes > mozilla:master Hey Will-- I've been sending these to Ed mostly for the pulse ingestion changes, but would you be up for taking a look at this one?
Attachment #8746862 - Flags: review?(wlachance)
Comment on attachment 8746862 [details] [review] [treeherder] mozilla:pulse-schema-changes > mozilla:master LGTM, aside from the fact that I'd prefer the schema changes be in their own commit.
Attachment #8746862 - Flags: review?(wlachance) → review+
Thanks, wlach. I made those changes, including the schema changes in their own commit. :)
Commits pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/7b0b72e3c317f6a0058ed9bb98a7d9d80263a3b7 Bug 1266229 - Scheme / Model change job group type name to 100 Change the ``name`` field for job types and job groups to allow 100 characters because some names were being truncated. This now matches what is already set in the reference_data_signature table. https://github.com/mozilla/treeherder/commit/be37805e08be736770181e8430d4f7816ae935ef Bug 1266229 - Pulse ingestion fixes and updates These changes were discovered to be needed after direct testing against Task Cluster Pulse exchanges. -Made some JSON/YML schema changes to be more precise for several fields -Modified to job_loader to be more resilient to optional data being missing
Comment on attachment 8746677 [details] [review] [treeherder] mozilla:pulse-ingestion-fixes > mozilla:master Here are some more fixes and updates prior to turning on pulse ingestion. This stops short before actually flipping the switch by adding it to the procfile and queue
Attachment #8746677 - Flags: review?(emorley)
Attachment #8746677 - Flags: review?(emorley) → review+
Commit pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/f98bf3f8b0f869bed290fee460aea77dbdcc39b1 Bug 1266229 - Fix Pulse job ingestion prior to enabling it This contains several tweaks and fixes that allow us to ingest data from a real Task Cluster owned exchange. One of the main fixes is the way I was binding to the exchange and routing keys with the same Queue. Before, it was re-creating the Queue, so would miss some of the bindings. This will also prune the durable queue if the config has removed some exchanges and/or routing keys.
Hey Ed-- I went ahead and merged this after my changes since you gave me an r+. But I didn't want to just drop your question in the PR: ================== The docs say that the URL can be used to set options, eg ?ssl=1: http://docs.celeryproject.org/projects/kombu/en/latest/userguide/connections.html#urls ...so this option could be combined into PULSE_DATA_INGESTION_CONFIG? The docs also aren't clear on what happens if True is passed vs the full dict: http://docs.celeryproject.org/projects/kombu/en/latest/reference/kombu.connection.html#connection ...ie: does it still verifying the certs or not? Don't suppose you could check? (Would be good to avoid the potential for MITM :-)) tbh: I don't know (and haven't been able to find) the answer. I searched around a bit, but couldn't get anything specific. I also took a look at mozillapulse and they don't seem to do anything special with ssl. Just set it to true. So I feel like we're ok leaving it as it is. Does that sound ok by you?
Flags: needinfo?(emorley)
Yeah I'm happy to check - added to my todo list :-) (Don't think it need block this work though)
Flags: needinfo?(emorley)
Depends on: 1276355
Attachment #8757480 - Flags: review?(emorley)
Hey Ed-- I'm hoping to begin testing this on stage next week, so it's the appropriate time (imo) to add these queues. I'm still waiting on some work from Fubar to start flipping these switches... :)
Attachment #8757480 - Flags: review?(emorley) → review-
Comment on attachment 8757480 [details] [review] [treeherder] mozilla:turn-on-pulse-ingestion > mozilla:master Hey Ed-- Thanks for catching those issues. These adjustments should help. The next PR will add the ``read_from_pulse`` dyno definition to the procfile.
Attachment #8757480 - Flags: review- → review?(emorley)
Comment on attachment 8757480 [details] [review] [treeherder] mozilla:turn-on-pulse-ingestion > mozilla:master Looks great :-)
Attachment #8757480 - Flags: review?(emorley) → review+
Commit pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/3402f00fbc99d336118f5cde0fe3c36ec8f11b76 Bug 1266229 - Create store_pulse_jobs queue and prep to turn on Pulse ingestion Rename ``ingest_from_pulse`` management command to ``read_pulse_jobs`` to indicate that this step does not actually do any ingesting. It just populates the celery queue ``store_pulse_jobs`` that DOES do the actual ingesting.
No longer blocks: 1169320
Depends on: 1169320
Depends on: 1277724
Tweaking summary since the switching on will be happening in bug 1266584, which this bug blocks :-)
Summary: Turn on ingestion of jobs from a taskcluster exchange → Final fixes prior to turning on ingestion of jobs from a taskcluster exchange
Depends on: 1277975
Attachment #8760343 - Flags: review?(emorley)
Attachment #8760343 - Flags: review?(emorley) → review+
Attachment #8760430 - Flags: review?(wlachance)
Comment on attachment 8760430 [details] [review] [treeherder] mozilla:make-executable > mozilla:master Stealing to help balance the review queues a bit :-)
Attachment #8760430 - Flags: review?(wlachance) → review+
Depends on: 1277955, 1277962
Cameron - given bug 1277304 comment 1, you're welcome to proceed with this work before the Heroku move, should you wish :-)
(In reply to Ed Morley [:emorley] from comment #23) > Some new exceptions on stage: > https://rpm.newrelic.com/accounts/677903/applications/5585473/ > traced_errors?tw%5Bend%5D=1465263284&tw%5Bstart%5D=1465259684 > > store-pulse-jobs treeherder.model.models:MultipleObjectsReturned(get() > returned more than one JobDetail -- it returned 2!) > https://rpm.newrelic.com/accounts/677903/applications/5585473/traced_errors/ > 553a8f-83e1d9fc-2c4a-11e6-b947-b82a72d22a14 Not sure why this is happening. Looking into it now... > > store-pulse-jobs > exceptions:KeyError(u'a34348876e63c94c8ef78948d8db77ca3d310f6b') > https://rpm.newrelic.com/accounts/677903/applications/5585473/traced_errors/ > 553a98-cf831fa0-2c4f-11e6-b947-b82a72d22a14 This one was due to Bug 1277955 with revision hashes being sent in as revisions. I just merged the fix for that, so should be remedied on the next stage push.
Created a new bug for the first error seen in Comment 25: bug 1278711
Depends on: 1278711
Depends on: 1279386
No longer depends on: 1278711
Depends on: 1280306
Depends on: 1281412
Depends on: 1281821
Depends on: 1283202
Depends on: 1283845
Depends on: 1283865
Depends on: 1278711
No longer depends on: 1281821
No longer depends on: 1278711
Everything has been resolved/unblocked wrt deployment to prod.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
garndt gave me an r+ in the PR
Depends on: 1306588
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: