The failure summary panel is continually refreshed even if the job hasn't completed yet

NEW
Unassigned

Status

P3
normal
4 years ago
a year ago

People

(Reporter: emorley, Unassigned)

Tracking

Details

Attachments

(1 attachment)

46 bytes, text/x-github-pull-request
emorley
: review-
Details | Review | Splinter Review
1) Click on an unfinished (eg running job)
2) Watch the failure summary panel

Expected:
Failure summary shows "No logs available for this job" but doesn't refresh every 5 seconds.

Actual:
After bug 1080760 it refreshes every 5 seconds.

We should wait until the job changes state.

We end up spamming requests to parse the log, eg:

[2015-06-17 09:50:09,512] DEBUG [treeherder.log_parser.utils:114] Finished posting artifact for mozilla-inbound e9a3c4982f8e888d7fe6983696f72223b8edd2ca
[2015-06-17 09:50:09,512] DEBUG [treeherder.log_parser.utils:80] Downloading/parsing log for mozilla-inbound a495aa0d5cbd770f2ac712f50745c365862724ef (http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux64-asan/1434551452/mozilla-inbound_ubuntu64-asan_vm_test-mochitest-3-bm53-tests1-linux64-build412.txt.gz)
[2015-06-17 09:50:10,424] DEBUG [treeherder.log_parser.utils:114] Finished posting artifact for mozilla-inbound a495aa0d5cbd770f2ac712f50745c365862724ef
[2015-06-17 09:50:10,424] DEBUG [treeherder.log_parser.utils:80] Downloading/parsing log for mozilla-inbound 00bcf595cbdd741719c8de112fb094d4508a4ebc (http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android-api-11/1434551450/mozilla-inbound_ubuntu64_vm_armv7_large_test-plain-reftest-11-bm53-tests1-linux64-build142.txt.gz)
[2015-06-17 09:50:11,187] DEBUG [treeherder.log_parser.utils:114] Finished posting artifact for mozilla-inbound 00bcf595cbdd741719c8de112fb094d4508a4ebc
Wrong log excerpt, I meant to paste:

[17/Jun/2015 09:53:30] "GET /api/project/mozilla-inbound/bug-job-map/?job_id=116 HTTP/1.1" 200 2
datasource.hubs.MySQL.MySQL debug message:
        host:localhost db:mozilla_inbound_jobs_1 host_type:read_host proc:jobs.selects.get_job_artifact
        Executing SQL:SELECT id, job_id, name, `type`, `blob` FROM job_artifact WHERE active_status = 'active' AND type = %s AND job_id = %s AND name = %s ORDER BY id DESC, name LIMIT 0,10
        Execution Time:5.2190e-04 sec

[17/Jun/2015 09:53:35] "GET /api/project/mozilla-inbound/artifact/?job_id=116&name=Bug+suggestions&type=json HTTP/1.1" 200 2
datasource.hubs.MySQL.MySQL debug message:
        host:localhost db:mozilla_inbound_jobs_1 host_type:read_host proc:jobs.selects.get_job_artifact
        Executing SQL:SELECT id, job_id, name, `type`, `blob` FROM job_artifact WHERE active_status = 'active' AND type = %s AND job_id = %s AND name = %s ORDER BY id DESC, name LIMIT 0,10
        Execution Time:4.0412e-04 sec

[17/Jun/2015 09:53:40] "GET /api/project/mozilla-inbound/artifact/?job_id=116&name=Bug+suggestions&type=json HTTP/1.1" 200 2
datasource.hubs.MySQL.MySQL debug message:
        host:localhost db:mozilla_inbound_jobs_1 host_type:read_host proc:jobs.selects.get_job_artifact
        Executing SQL:SELECT id, job_id, name, `type`, `blob` FROM job_artifact WHERE active_status = 'active' AND type = %s AND job_id = %s AND name = %s ORDER BY id DESC, name LIMIT 0,10
        Execution Time:3.4904e-04 sec

[17/Jun/2015 09:53:45] "GET /api/project/mozilla-inbound/artifact/?job_id=116&name=Bug+suggestions&type=json HTTP/1.1" 200 2
datasource.hubs.MySQL.MySQL debug message:
        host:localhost db:mozilla_inbound_jobs_1 host_type:read_host proc:jobs.selects.get_job_artifact
        Executing SQL:SELECT id, job_id, name, `type`, `blob` FROM job_artifact WHERE active_status = 'active' AND type = %s AND job_id = %s AND name = %s ORDER BY id DESC, name LIMIT 0,10
        Execution Time:3.7003e-04 sec

[17/Jun/2015 09:53:51] "GET /api/project/mozilla-inbound/artifact/?job_id=116&name=Bug+suggestions&type=json HTTP/1.1" 200 2
Assignee: nobody → cdawson
Would this be what's needed? https://github.com/mozilla/treeherder/pull/671
Flags: needinfo?(cdawson)
Created attachment 8626375 [details] [review]
PR 671

This should be ready for a first pass at review. I *think* it does what's wanted. (If the selected job's state is "pending" or "running", it just returns an empty array without even looking at the log, but it preserves the 5 second timer to try again later.)


I'd make the argument that for pending/running jobs, we could probably change the timer to a longer timeout than 5 seconds, since it's not likely that we'll have something parse-able anytime soon, but that's not relevant to this patch.
Flags: needinfo?(cdawson)
Attachment #8626375 - Flags: review?(emorley)
Attachment #8626375 - Flags: review?(cdawson)
Comment on attachment 8626375 [details] [review]
PR 671

lgtm
Attachment #8626375 - Flags: review?(cdawson) → review+
Only thing I haven't tested is if you can select a pending job and leave it selected as it progresses to running and then completed and then actually displays the log details or failure summary. It should work, assuming it worked prior to this patch.
Hrm, I left Treeherder running with this patch applied and selected a pending job to make sure it can make the transition to the completed state where it loads the job details pane and shows the failure summary if the job failed. The selected job was coalesced away.

I then selected a running job to test the same thing (without any chance of coalescing), and the job details never were filled in until I clicked the job again to manually reload it. I'm gonna try doing this again without my patch applied to see if this is a regression or if it regressed prior to this patch (likely bug 1080760 if it is indeed broken without my patch).
I'm having mixed results for things without my patch. Selecting a running taskcluster job, it seems to load the failure summary correctly. Doing the same for a buildbot job leaves the failure summary section blank just like it did with my patch (and the Job Details pane is similarly blank, and jumping between the two never shows the content).

Both with and without my patch, the failure summary section stops doing the five second reload once the job finishes, so I know for a fact that it knows that the job actually finished (and parsed?), it's just not updating the UI correctly. Maybe I should store the previous state of the job somewhere in the job itself so that when the state changes from running to completed, we can trigger something to automatically reselect the job, causing the summary and details panes to be filled with content?
Assignee: cdawson → wkocher
Status: NEW → ASSIGNED
:KWierso what's the status of attachment 8626375 [details] [review]? I see there's a r+ from camd, do you want to merge it or we can close it? Ifyou are not sure I would be for closing it (and eventually reopen it if needed).
Judging by comment 7 I don't think this is ready to land.
Priority: P2 → P3
Assignee: wkocher → nobody
Status: ASSIGNED → NEW
Attachment #8626375 - Flags: review+ → review-
Component: Treeherder → Treeherder: Log Parsing & Classification
You need to log in before you can comment on or make changes to this bug.