Closed Bug 1042751 Opened 10 years ago Closed 9 years ago

pushlog-ingestion times out when the response from json-pushes contains thousands of changesets

Categories

(Tree Management :: Treeherder: Data Ingestion, defect, P3)

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1119028

People

(Reporter: emorley, Unassigned)

Details

On:
https://treeherder.mozilla.org/ui/#/jobs?repo=build-system

The main page content doesn't contain any pushes (it's expected that there are no results, since there have been no new jobs for ages).

a) We should display an error message if the pushlog failed to load
b) Why is the pushlog empty for this repo but not for others that are similarly inactive? (e.g. https://treeherder.mozilla.org/ui/#/jobs?repo=accessibility)
Priority: P2 → P3
Yeah, if there was an error, you would see it displayed. But in this case, we just don't have any data for that repo, so we should put up a message saying that, perhaps including the URL we would get the resultsets from (json-pushes).
There seem to be two issues here:
1) Why is there no data for this repo, when we have data for other repos with a similar level of inactivity?
2) Displaying a message for no result sets found (similar to bug 1045609, except that's about finding no jobs, when a result set was found).
So, if I click the little "i" button on treeherder's build-system page it shows a message about "build-system is not supported in treestatus.mozilla.org". Maybe that's why it doesn't load the results?
Flags: needinfo?(emorley)
The pull request in bug 1065541 adds separate error messages for unknown repositories and known repositories that don't return any resultsets, which takes care of comment 2's second point.
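As a rough illustration of the distinction described above (unknown repository versus known repository with no resultsets), the choice of message could look something like the sketch below. It is written in Python for illustration only; the function name, the message wording, and the `known_repos` set are all hypothetical and are not taken from the actual pull request.

```python
def empty_state_message(repo, known_repos, resultsets):
    """Return a message for an empty page, or None if there is data to show.

    All names and message strings here are hypothetical.
    """
    if repo not in known_repos:
        # The repository name is not one the service knows about.
        return f"Unknown repository: {repo}"
    if not resultsets:
        # The repository is known, but nothing has been ingested for it.
        return f"No resultsets found for {repo}"
    return None


known = {"build-system", "alder"}
message_for_empty_repo = empty_state_message("build-system", known, [])
message_for_unknown_repo = empty_state_message("no-such-repo", known, [])
```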
(In reply to Wes Kocher (:KWierso) from comment #3)
> So, if I click the little "i" button on treeherder's build-system page it
> shows a message about "build-system is not supported in
> treestatus.mozilla.org". Maybe that's why it doesn't load the results?

Ingestion of results from json-pushes is done in the service & is unrelated to treeherder-ui polling treestatus.mozilla.org. The reason build-system is not listed on treestatus.mozilla.org is that it doesn't have the hook installed iirc, so it wouldn't do anything (other than allow for the MOTD/closure message).
Flags: needinfo?(emorley)
The following repos don't show pushlog data when they should:

https://treeherder.mozilla.org/ui/#/jobs?repo=alder
https://tbpl.mozilla.org/?tree=Alder

https://treeherder.mozilla.org/ui/#/jobs?repo=build-system
https://tbpl.mozilla.org/?tree=Build-System
Summary: build-system repo displays no pushes & no error message → build-system & alders repos are failing to ingest their pushlogs
Clearing the pushlog cache via Django admin didn't help.
Summary: build-system & alders repos are failing to ingest their pushlogs → build-system & alder repos are failing to ingest their pushlogs
Blocks: 1076750
Priority: P3 → P2
Blocks: 1080757
No longer blocks: treeherder-dev-transition
Treeherder seems to be in sync with tbpl on Alder.
(In reply to Mauro Doglio [:mdoglio] from comment #8)
> Treeherder seems to be in sync with tbpl on Alder.

Ah alder was reset recently, so the pushlog cleared.

build-system is still broken for some reason.
Treeherder is not ingesting the build-system pushlog successfully because one of those pushes has 12746 revisions attached. Inserting those rows in the database is taking a while, and gunicorn is currently set to time out requests after 45 seconds. I'll further increase the timeout to 60 seconds; hopefully that will be enough. :edmorley, is it common to have so many revisions on a single push?
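For illustration, one general way to keep a very large push from becoming a single long-running insert is to write the revisions in fixed-size batches. The sketch below is not Treeherder's code: the `revision` table, the `store_revisions` helper, and the batch size are hypothetical, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def store_revisions(conn, push_id, revisions, batch_size=500):
    """Insert revisions in batches, committing after each batch.

    Table and column names are hypothetical.
    """
    inserted = 0
    for batch in chunked(revisions, batch_size):
        conn.executemany(
            "INSERT INTO revision (push_id, node) VALUES (?, ?)",
            [(push_id, node) for node in batch],
        )
        conn.commit()
        inserted += len(batch)
    return inserted


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revision (push_id INTEGER, node TEXT)")
# 12746 placeholder hashes, matching the size of the push described above.
revisions = [f"{i:040x}" for i in range(12746)]
total = store_revisions(conn, push_id=1, revisions=revisions)
```

Batching does not by itself shorten the total work, so it would only help with a request timeout if the batches were processed outside a single request, for example by a background task.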
Ah that makes sense. No it's not common - the build-system repo is a project repo that isn't very active - I imagine someone tried to push the latest changes from mozilla-central after it hadn't been merged for months.

We could always just request a wipe and re-clone of build-system if needs be.
(In reply to Ed Morley [:edmorley] from comment #11)
> Ah that makes sense. No it's not common - the build-system repo is a
> project repo that isn't very active - I imagine someone tried to push the
> latest changes from mozilla-central after it hadn't been merged for months.
> 
> We could always just request a wipe and re-clone of build-system if needs be.

Re-clone from mozilla-central that is, and the wipe would empty the pushlog for build-system, the same as when the twig repos like alder are reset.
build-system and services-central were both commented-out in buildbot last May to get more max-refs capacity. Since they are no longer the way we generally work, their only real use in the last few years before they were commented-out was "someone with a clone of them is pissed off that inbound is closed, so they'll use it as a private inbound and then merge it to m-c themselves." It would be a far better thing to push on to just finally kill both of them in buildbot and remove them from treeherder and tbpl.
Yeah agreed, it was more I wanted to figure out what bug we were hitting and decide if we needed to handle it :-)
Summary: build-system & alder repos are failing to ingest their pushlogs → pushlog-ingestion times out when the response from json-pushes contains thousands of changesets
The worker timeouts have been raised to 60s, let's lower priority for now, since that may already have fixed this.
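For reference, gunicorn's worker timeout is controlled by its `timeout` setting, which can be set in a Python config file or with `--timeout` on the command line. A minimal sketch of the 60-second value mentioned above, assuming a config file is used (the file name is an assumption about this deployment):

```python
# gunicorn.conf.py (hypothetical file name and location for this deployment)

# Workers that stay silent for longer than this many seconds are killed
# and restarted. 60 is the value described in the comment above.
timeout = 60
```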
Priority: P2 → P3
No longer blocks: 1080757
Component: Treeherder → Treeherder: Data Ingestion
No longer blocks: 1076750
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → DUPLICATE