Closed Bug 1252854 Opened 9 years ago Closed 9 years ago

Enable classification of all job failures through a single panel

Categories

(Tree Management :: Treeherder, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jgraham, Assigned: jgraham)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

Based on sheriff feedback, the main problem with the existing job panel was that it did not show all the error lines. To fix this, the plan is to allow all error lines to be shown, and assigned a bug number, under the autoclassification panel. Implmentation wise this involves working out which structured lines correspond to which unstructured lines to enable deduplication. This change will not enable autoclassification of unstructured lines, since that would require moving away from the artifact storage, which would be a lot of yak to shave.
Attached file PR
camd: can you look at the frontend and log ingestion parts? emorley, mdoglio, can you look at the backend? I guess this will need a few iterations, but it's already a fairly large change.
Attachment #8725726 - Flags: review?(mdoglio)
Attachment #8725726 - Flags: review?(emorley)
Attachment #8725726 - Flags: review?(cdawson)
Comment on attachment 8725726 [details] [review] PR Unfortunately I don't even know where to start with beginning this review. I've lost track of what was reviewed before (since we never landed it on master, so now have an even larger PR) and the new work has drastically increased the complexity of our celery tasks, such that I'm going to need to spend several days digging through this next week (I'm mostly on PTO yesterday/today). I think the best way forwards might be for us to have a meeting where we can discuss the bigger picture for the way we split of the log parse tasks and see if there is a simpler way we can work avoid the current complex dependency chains which have required the use of the celery results backend for the first time, along with the custom sync/locking solution (https://github.com/mozilla/treeherder/commit/7edf602975d8c9334a709daa74d9bb01d188a52a).
Attachment #8725726 - Flags: review?(emorley)
Comment on attachment 8725726 [details] [review] PR It looks like we are going to have a meeting to discuss these changes. I'll hold on this review until then.
Attachment #8725726 - Flags: review?(mdoglio)
Attachment #8725726 - Flags: review?(mdoglio)
Attachment #8725726 - Flags: review?(emorley)
Attachment #8725726 - Flags: review?(emorley) → feedback+
Comment on attachment 8725726 [details] [review] PR I only reviewed the UI code.
Attachment #8725726 - Flags: review?(cdawson) → review+
Attachment #8725726 - Flags: review?(emorley)
Attachment #8725726 - Flags: review?(emorley) → review+
I don't think I can review attachment 1 [details] [diff] [review] anytime before next week. Given that both :emorley and :camd reviewed it I would be tempted to clear the r? flag on me. :jgraham are you ok with that?
I would prefer if you did the (hopefully trivial) reviews for the webapi bits and db changes next week. I think it's more important to do this right than do it by March 31st.
Comment on attachment 8725726 [details] [review] PR Sorry I won't be able to review this patch anytime soon. Is there anything specific you want some feedback on?
Attachment #8725726 - Flags: review?(mdoglio)
Comment on attachment 8725726 [details] [review] PR mdoglio: I think it should be fine; emorley and camd should know all the unreviewed code.
Attachment #8725726 - Flags: review?(emorley)
Comment on attachment 8725726 [details] [review] PR Looks good, thank you. As we discussed yesterday, I'll be away until 19th April. I saw another few CloudAMQP alerts this morning (even after the queues were reset yesterday) - I'll leave it to you to check whether that blocks testing on stage. For the landing of this I imagine that it may be worth landing the migrations first (presuming everything is currently switched off on stage/prod that uses the affected schema) and then the rest separately. The only gotcha might be how long it takes for the migrations to run on prod, but I guess all we can do it try :-) (And if they time out, comment out the migration step in the Chief update.py script and run them manually after deployment from one of the nodes). Prior to landing on prod, it's probably worth keeping an eye on the following for stage to see if there are any performance regressions/issues: * New Relic APM exceptions/error pages (https://rpm.newrelic.com/accounts/677903/applications/5585473/filterable_errors) * New Relic APM transaction page: web transactions (https://rpm.newrelic.com/accounts/677903/applications/5585473/transactions?type=app) * New Relic APM transaction page: non-web transactions (https://rpm.newrelic.com/accounts/677903/applications/5585473/transactions?type=other&show_browser=false) * New Relic APM database page (https://rpm.newrelic.com/accounts/677903/applications/5585473/datastores#/overview/All?value=total_call_time_per_minute) * New Relic Plugin page for rabbitmq (ie https://rpm.newrelic.com/accounts/677903/plugins/16138) Good luck! :-)
Attachment #8725726 - Flags: review?(emorley) → review+
ni? KWierso at his own request to try this out on stage.
Flags: needinfo?(wkocher)
Blocks: 1265095
Commits pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/7e4084d24fcf6d543fde9d9a2515be1e4072ee70 Bug 1252854 - Backend work for matching unstructured and structured log summary lines. https://github.com/mozilla/treeherder/commit/af1db114519b14033548dc3f35826e3a4d924caf Bug 1252854 - Front end changes to allow structured and unstructured lines to both be displayed in the autoclassify UI. https://github.com/mozilla/treeherder/commit/daf296dc30c030e8292b4915d5766b86c8b7dacf Merge pull request #1414 from mozilla/autoclassify_integration Bug 1252854 - Integrate classification of structured and unstructured error lines.
I never had a chance to get to this over the weekend. I'll try to look at an updated stage sometime this week, I guess.
Flags: needinfo?(wkocher)
Depends on: autostar
Blocks: autostar
No longer depends on: autostar
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Blocks: 1215107
Depends on: 1277506
Depends on: 1281463
Depends on: 1303765
Depends on: 1311509
Assignee: nobody → james
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: