Closed
Bug 1372581
Opened 7 years ago
Closed 7 years ago
KeyError: 'payload' during store-pulse-resultsets
Categories
(Tree Management :: Treeherder: Data Ingestion, defect, P1)
Tree Management
Treeherder: Data Ingestion
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: armenzg, Assigned: camd)
Details
Attachments
(2 files)
I was trying to grab a sample Pulse message to fix bug 1355812 (I was doing an "Add new jobs" request) and I noticed that the Pulse listener was not getting anything. I looked in Treeherder's Papertrail and noticed the following. Searching for "Error transforming resultset" in the Papertrail logs show hundreds of these messages. This is important to fix as it affects day-to-day tasks for sheriffs. It is also important to figure out why NewRelic isn't alerting about this. Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: [2017-06-13 13:36:40,439] ERROR [treeherder.etl.resultset_loader:47] Error transforming resultset Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: Traceback (most recent call last): Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: File "/app/treeherder/etl/resultset_loader.py", line 29, in process Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: message_body["payload"]) Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: KeyError: 'payload' Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: [2017-06-13 13:36:40,439: ERROR/Worker-1] Error transforming resultset Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: Traceback (most recent call last): Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: File "/app/treeherder/etl/resultset_loader.py", line 29, in process Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: message_body["payload"]) Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: KeyError: 'payload' Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: [2017-06-13 13:36:40,479] ERROR [treeherder.etl.resultset_loader:47] Error transforming resultset Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: Traceback (most recent call last): Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: File "/app/treeherder/etl/resultset_loader.py", line 29, in process Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: message_body["payload"]) Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: KeyError: 'payload' Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: [2017-06-13 13:36:40,479: ERROR/Worker-2] Error transforming resultset Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: Traceback (most recent call last): Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: File "/app/treeherder/etl/resultset_loader.py", line 29, in process Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: message_body["payload"]) Jun 13 09:36:40 treeherder-prod app/worker_store_pulse_resultsets.1: KeyError: 'payload' Jun 13 09:36:43 treeherder-prod app/worker_store_pulse_jobs.3: [2017-06-13 13:36:43,482] WARNING [treeherder.etl.job_loader:97] Pulse job had revision_hash instead of revision: mozilla-central:06e5fb0f474b056f9ca6c1100dc673db4db60120 Jun 13 09:36:43 treeherder-prod app/worker_store_pulse_jobs.3: [2017-06-13 13:36:43,482: WARNING/Worker-2] Pulse job had revision_hash instead of revision: mozilla-central:06e5fb0f474b056f9ca6c1100dc673db4db60120 Jun 13 09:36:47 treeherder-prod app/worker_store_pulse_jobs.1: [2017-06-13 13:36:46,950] WARNING [treeherder.etl.job_loader:97] Pulse job had revision_hash instead of revision: mozilla-central:06e5fb0f474b056f9ca6c1100dc673db4db60120 Jun 13 09:36:47 treeherder-prod app/worker_store_pulse_jobs.1: [2017-06-13 13:36:46,950: WARNING/Worker-2] Pulse job had revision_hash instead of revision: mozilla-central:06e5fb0f474b056f9ca6c1100dc673db4db60120 Jun 13 09:36:47 treeherder-prod app/worker_store_pulse_jobs.3: [2017-06-13 13:36:47,313] WARNING [treeherder.etl.job_loader:97] Pulse job had revision_hash instead of revision: mozilla-central:06e5fb0f474b056f9ca6c1100dc673db4db60120 Jun 13 09:36:47 treeherder-prod app/worker_store_pulse_jobs.3: [2017-06-13 13:36:47,313: WARNING/Worker-3] Pulse job had revision_hash instead of revision: mozilla-central:06e5fb0f474b056f9ca6c1100dc673db4db60120 Jun 13 09:36:48 treeherder-prod app/worker_store_pulse_jobs.1: [2017-06-13 13:36:48,566] WARNING [treeherder.etl.job_loader:97] Pulse job had revision_hash instead of revision: mozilla-central:06e5fb0f474b056f9ca6c1100dc673db4db60120 Jun 13 09:36:48 treeherder-prod app/worker_store_pulse_jobs.1: [2017-06-13 13:36:48,566: WARNING/Worker-3] Pulse job had revision_hash instead of revision: mozilla-central:06e5fb0f474b056f9ca6c1100dc673db4db60120 Jun 13 09:36:51 treeherder-prod app/worker_store_pulse_jobs.4: [2017-06-13 13:36:51,587] WARNING [treeherder.etl.job_loader:97] Pulse job had revision_hash instead of revision: mozilla-central:06e5fb0f474b056f9ca6c1100dc673db4db60120 Jun 13 09:36:51 treeherder-prod app/worker_store_pulse_jobs.4: [2017-06-13 13:36:51,587: WARNING/Worker-2] Pulse job had revision_hash instead of revision: mozilla-central:06e5fb0f474b056f9ca6c1100dc673db4db60120 Jun 13 09:36:52 treeherder-prod app/worker_store_pulse_jobs.4: [2017-06-13 13:36:52,209] WARNING [treeherder.etl.job_loader:97] Pulse job had revision_hash instead of revision: mozilla-central:06e5fb0f474b056f9ca6c1100dc673db4db60120 Jun 13 09:36:52 treeherder-prod app/worker_store_pulse_jobs.4: [2017-06-13 13:36:52,209: WARNING/Worker-3] Pulse job had revision_hash instead of revision: mozilla-central:06e5fb0f474b056f9ca6c1100dc673db4db60120
Comment 1•7 years ago
|
||
Thank you for filing - this is definitely something we want to fix, however it's not related to "add new jobs" - this is pulse push ingestion. Cameron, could you take a look at the KeyError, and I'll look at why the New Relic Python agent isn't catching this?
Component: Treeherder → Treeherder: Data Ingestion
Flags: needinfo?(cdawson)
Priority: -- → P1
Summary: Requesting Treeherder actions is throwing some errors → KeyError: 'payload' during store-pulse-resultsets
Reporter | ||
Comment 2•7 years ago
|
||
As emorley points out this is a red herring to my bug.
No longer blocks: 1355812
Comment 3•7 years ago
|
||
Cameron, do you know when you might be able to look at this? These are exceptions during data ingestion on production that are presumably causing data loss.
Assignee | ||
Comment 4•7 years ago
|
||
This exception is happening when we get a push for which we don't have an entry in the Repository table. The exception happens when trying to get info to put into the new relic insights ``record_custom_event``. I'll create a PR to avoid the exception and still post SOMETHING to insights even if it can't find the data it's looking for. Hopefully this will help us track down any incorrect event skipping. Actually, I think this isn't causing any actual data loss. But this will help us determine that.
Flags: needinfo?(cdawson)
Comment 5•7 years ago
|
||
Assignee | ||
Comment 6•7 years ago
|
||
Comment on attachment 8884881 [details] [review] [treeherder] mozilla:key-error-pulse-push-ingest > mozilla:master Hey Ed-- This should eliminate this exception and help me track down if there's legitimate push events that are getting dropped.
Attachment #8884881 -
Flags: review?(emorley)
Assignee | ||
Comment 7•7 years ago
|
||
Need-info'ing myself to remind me to check insights when this is deployed to see what kind of reporting we get from these messages.
Flags: needinfo?(cdawson)
Comment 8•7 years ago
|
||
Comment on attachment 8884881 [details] [review] [treeherder] mozilla:key-error-pulse-push-ingest > mozilla:master Left some comments :-)
Attachment #8884881 -
Flags: review?(emorley) → review-
Updated•7 years ago
|
Assignee: nobody → cdawson
Assignee | ||
Comment 9•7 years ago
|
||
Comment on attachment 8884881 [details] [review] [treeherder] mozilla:key-error-pulse-push-ingest > mozilla:master Hey there, thanks for the feedback. Helped prompt me to look a little deeper and find what I think are all the problems now. :)
Flags: needinfo?(cdawson)
Attachment #8884881 -
Flags: review- → review?(emorley)
Comment 10•7 years ago
|
||
Comment on attachment 8884881 [details] [review] [treeherder] mozilla:key-error-pulse-push-ingest > mozilla:master Left a comment :-)
Attachment #8884881 -
Flags: review?(emorley)
Assignee | ||
Comment 11•7 years ago
|
||
Comment on attachment 8884881 [details] [review] [treeherder] mozilla:key-error-pulse-push-ingest > mozilla:master Hey Ed-- OK, I made that change we talked about. :)
Attachment #8884881 -
Flags: review?(emorley)
Comment 12•7 years ago
|
||
Comment on attachment 8884881 [details] [review] [treeherder] mozilla:key-error-pulse-push-ingest > mozilla:master Many thanks :-)
Attachment #8884881 -
Flags: review?(emorley) → review+
Comment 13•7 years ago
|
||
Commit pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/3a16745fdf35d25a97b8146094c3fff479b88ea0 Bug 1372581 - Fix exception in error reporting for missing Repository (#2624)
Assignee | ||
Updated•7 years ago
|
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Comment 14•7 years ago
|
||
On New Relic I'm now seeing exceptions during store-pulse-resultsets of form: File "/app/treeherder/etl/tasks/pulse_tasks.py", line 30, in store_pulse_resultsets File "/app/treeherder/etl/resultset_loader.py", line 45, in process File "/app/treeherder/etl/resultset.py", line 70, in store_result_set_data File "/app/treeherder/etl/resultset.py", line 13, in store_push exceptions:AttributeError: 'NoneType' object has no attribute 'get' eg: https://rpm.newrelic.com/accounts/677903/applications/14179733/filterable_errors#/show/976912c0-67bb-11e7-931c-0242ac110012_0_4022/stack_trace?top_facet=transactionUiName&primary_facet=error.class&barchart=barchart&_k=48kcqw Holding off on prod deploy for now.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Updated•7 years ago
|
Flags: needinfo?(cdawson)
Assignee | ||
Comment 15•7 years ago
|
||
Interesting. I'm still seeing errors such as these occasionally. Looks like they all come from the "Servo" repo that I see so far. In papertrail I'm seeing just prior to the attribute error: Jul 17 15:26:26 treeherder-stage app/worker_store_pulse_resultsets.1: [2017-07-17 22:26:26,268: ERROR/Worker-3] Error fetching commits Jul 17 15:26:26 treeherder-stage app/worker_store_pulse_resultsets.1: Traceback (most recent call last): Jul 17 15:26:26 treeherder-stage app/worker_store_pulse_resultsets.1: File "/app/treeherder/etl/resultset_loader.py", line 105, in fetch_resultset Jul 17 15:26:26 treeherder-stage app/worker_store_pulse_resultsets.1: commit["commit"]["author"]["email"]), Jul 17 15:26:26 treeherder-stage app/worker_store_pulse_resultsets.1: UnicodeEncodeError: 'ascii' codec can't encode character u'\xc1' in position 13: ordinal not in range(128) What happens is that I'm catching the exception in ``resultset_loader`` and logging it to new relic and the logs, but not handling it. So I return a null resultset and then we try to get the "revision" out of the resultset and can't. I shouldn't be catching that exception there. But also need to perhaps sanitize my data a bit. Still working...
Assignee | ||
Comment 16•7 years ago
|
||
So, it ends up this is actually because we had a commit with the user named "Fausto Núñez Alberro". When I tried to use the string .format() function with that, it bails because I'm trying to put non-ascii characters into a string. So if I change that field to unicode, it works. I updated the unit test for this. In reality, this is not related to the original bug here, just a coincidence of timing and code location. :) But still needed fixing.
Flags: needinfo?(cdawson)
Comment 17•7 years ago
|
||
Assignee | ||
Comment 18•7 years ago
|
||
Comment on attachment 8887270 [details] [review] [treeherder] mozilla:attribute-error > mozilla:master Here's another bundle of joy for you, Ed.
Attachment #8887270 -
Flags: review?(emorley)
Comment 19•7 years ago
|
||
Ah good spot! I'll take a look tomorrow with fresher eyes :-)
Comment 20•7 years ago
|
||
Comment on attachment 8887270 [details] [review] [treeherder] mozilla:attribute-error > mozilla:master Many thanks! A couple of review comments but doesn't need re-review after :-)
Attachment #8887270 -
Flags: review?(emorley) → review+
Comment 21•7 years ago
|
||
Commits pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/0c4387f1570d611ca229597873690952df4701f4 Bug 1372581 - Better new relic reporting for resultset_loader Add fields of ``url`` and ``repo`` in case exceptions happen. This was catching an exception that could happen in ``fetch_resultset`` for github pushes, but wasn't raising it, so we would get exceptions later due to a null object returned from that function. This allows the exception to raise and not try to do any more processing on that push. https://github.com/mozilla/treeherder/commit/9183e56d27fb03c0c0469f948da8f50626d1a995 Bug 1372581 - Handle latin characters in commit author names We use python .format() to concatenate the commit author’s name and email address. When we hit an author name with non-ASCII characters, it would bomb out trying to format it into a String. So this makes sure to put those fields into a Unicode field to preserve them.
Assignee | ||
Updated•7 years ago
|
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•