Closed Bug 1429030 Opened 6 years ago Closed 3 years ago

filing new bugs is not using the bugzilla component defined in-tree

Categories

(Tree Management :: Treeherder, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jmaher, Unassigned)

References

(Depends on 1 open bug)

Details

Attachments

(1 file, 1 obsolete file)

There is a process the tree sheriffs use for filing new intermittent failure bugs; unfortunately I don't know where it is documented or what tools are used.

I do know that many of the web-platform-tests intermittent failure bugs are not filed in the Bugzilla component the tests are associated with; instead they are filed in Testing :: web-platform-tests. There are 560 intermittent failures in that component, 55 in the last month. I spot-checked a few of the 55 recent ones and they do have valid components associated in moz.build files. For reference, here are the recent ones:
https://bugzilla.mozilla.org/buglist.cgi?keywords=intermittent-failure%2C%20&keywords_type=allwords&list_id=13958611&resolution=---&classification=Components&chfieldto=Now&query_format=advanced&chfieldfrom=2017-12-01&component=web-platform-tests&product=Testing

We need to fix up our tools to work properly, otherwise these bugs are not going to ever be seen and we are just wasting time.
:coop, I believe you are headmaster of the sheriffs, is this something you can drive to ensure our tools are working properly and the work the sheriffs are doing is not for waste?
Flags: needinfo?(coop)
I think this should be treated as a meta bug rather than duping against a specific technical issue.
Depends on: 1354791
Thank you for filing this - I agree it's important that the tools help the bugs end up in the right place, so they don't get missed.

The bug filer tool that files the intermittent failure bugs was created by and almost solely maintained by Wes. It would be good to have someone outside the Treeherder team take over maintenance now that he's left. If it would help to have a completely separate Bugzilla component for bugs relating to this tool, I have the necessary permissions to create one (it's currently part of the "Treeherder: Log Parsing & Classification" component).

The code for the tool is here:
https://github.com/mozilla/treeherder/blob/master/ui/js/controllers/bugfiler.js
https://github.com/mozilla/treeherder/blob/master/tests/ui/unit/controllers/bugfiler.tests.js

Re this class of bugs (wrong component), the best way to figure out whether it's an issue with the bug filer or the metadata returned from hg.m.o is to:
* follow the treeherder.m.o/logviewer.html link in the bug description
* then click the "revision" link (that links back to the main Treeherder jobs view with that job selected)
* then switch to the failure summary panel
* then click the bugfiler icon next to that failure line (I think this may need "&bugfiler" added to the URL if you're not a sheriff?)
* find the XHR request made to hg.m.o (eg https://hg.mozilla.org/mozilla-central/json-mozbuildinfo?p=testing/web-platform/mozilla/tests/wasm/f32.wast.js.html) and check that (a) the request was for the correct file, (b) the response is as expected (see the sketch below)
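
For the last step, a quick way to sanity-check the hg.m.o response outside the browser is a small script along these lines (just a sketch using the requests library; the path is the example from the URL above, swap in the file from the failure line you're checking):

import requests

# Example path from the URL above; substitute the file you are checking.
test_path = "testing/web-platform/mozilla/tests/wasm/f32.wast.js.html"

resp = requests.get(
    "https://hg.mozilla.org/mozilla-central/json-mozbuildinfo?p=" + test_path)
data = resp.json()

if "error" in data:
    # This is the failure mode seen later in this bug:
    # {"error": "unable to obtain moz.build info"}
    print("hg.m.o lookup failed:", data["error"])
else:
    print(data)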
Component: Treeherder → Treeherder: Log Parsing & Classification
No longer depends on: 1354791
(Sorry missed the dep change when using "make comment anyway")
Depends on: 1354791
OK, the problem looks to be the hg service. Locally I have:
$ ./mach file-info bugzilla-component testing/web-platform/tests/html/semantics/embedded-content/media-elements/track/track-element/track-cues-missed.html
Core :: DOM
  testing/web-platform/tests/html/semantics/embedded-content/media-elements/track/track-element/track-cues-missed.html

but querying hg via the web:
https://hg.mozilla.org/mozilla-central/json-mozbuildinfo/?p=testing/web-platform/tests/html/semantics/embedded-content/media-elements/track/track-element/track-cues-missed.html

I get:
{
  "error": "unable to obtain moz.build info"
}


:gps, do you know why this would be happening?
Flags: needinfo?(gps)
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #1)
> :coop, I believe you are headmaster of the sheriffs, is this something you
> can drive to ensure our tools are working properly and the work the sheriffs
> are doing is not for waste?

Redirecting NI request to RyanVM
Flags: needinfo?(coop) → needinfo?(ryanvm)
json-mozbuildinfo has been failing for a while. Bug 1354791 tracks.

Getting it working is a non-trivial amount of work, both now and on an ongoing basis. I would encourage tools to consume the JSON produced by Firefox CI that contains Bugzilla metadata. See e.g. https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter-searchStr=bugzilla&selectedJob=156606589
Flags: needinfo?(gps)
This means that the new bug filing tool needs to use data from another source, not a query to the hg server. One thought is that we could query ActiveData, given that all this information is ingested there.

Kyle, do you have concerns with having all "new bugs" filed with the sheriff/Treeherder tool use ActiveData to look up the Bugzilla component for a given file? The only concern I can think of is uptime/reliability, which comes down to error handling and is probably less error prone than our current method.
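
To make the idea concrete, here is a rough sketch of the shape such a lookup could take. The ActiveData /query endpoint and its JSON query syntax are real, but the table and field names below are hypothetical placeholders - I don't know which index actually holds the file-to-component mapping:

import requests

# Hypothetical sketch only: "files.metadata", "file.path" and "file.bug_component"
# are placeholder names, not a confirmed ActiveData schema.
query = {
    "from": "files.metadata",
    "select": "file.bug_component",
    "where": {"eq": {"file.path": (
        "testing/web-platform/tests/html/semantics/embedded-content/"
        "media-elements/track/track-element/track-cues-missed.html")}},
    "limit": 1,
}
resp = requests.post("https://activedata.allizom.org/query", json=query)
print(resp.json())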
Flags: needinfo?(klahnakoski)
I have no concerns.  The amount of data is tiny, and the queries against it are simple.
Flags: needinfo?(klahnakoski)
Why not just use TH/TC as gps suggested, to avoid a new dependency? The tool just needs to grab https://treeherder.mozilla.org/api/project/mozilla-central/jobs/?count=1&job_type_name=source-test-file-metadata-bugzilla-components or something to work out a recent bugzilla job, and then either convert the job guid to a TC guid directly and grab https://queue.taskcluster.net/v1/task/<guid>/runs/0/artifacts/public/components.json directly, or go via the relevant job details endpoint.
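
A rough sketch of that flow (the two URLs are the ones above; the response field names, the guid format, and the artifact's mapping shape are assumptions that would need checking against the real API responses):

import requests

TH_JOBS = ("https://treeherder.mozilla.org/api/project/mozilla-central/jobs/"
           "?count=1&job_type_name=source-test-file-metadata-bugzilla-components")

# Find a recent bugzilla-components job. The response shape here is assumed;
# treat this as pseudocode for the flow rather than the exact API contract.
jobs = requests.get(TH_JOBS).json()
job_guid = jobs["results"][0]["job_guid"]      # assumed field name
task_id, run_id = job_guid.split("/")          # assumed "<taskId>/<runId>" guid format

# Fetch the components.json artifact produced by that task.
artifact_url = ("https://queue.taskcluster.net/v1/task/%s/runs/%s"
                "/artifacts/public/components.json" % (task_id, run_id))
components = requests.get(artifact_url).json()

# Inspect the mapping before relying on it - whether it is keyed by file or by
# component is an assumption here.
print(list(components.items())[:3])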
Flags: needinfo?(ryanvm)
I'd much rather just fix bug 1354791 than add more complexity in Treeherder, unless I'm overlooking something else?
I would like to work on this. I am not aware of how bug 1354791 will fix this; maybe I don't understand that bug or the solution there.
The bug filer uses an API (on hg.mozilla.org) to fetch the bug component from in-repo metadata. That API is currently broken. 

:gps suggested using data from a different source instead; however, I'm saying it might mean less complexity in Treeherder if we just fixed the original API instead.

If the long term goal were to get everyone to use the new data source instead (or if there were some other advantage of it, other than it not being broken), then I could be persuaded otherwise :-)
In case it isn't obvious, the effect on web-platform-tests intermittents is that they basically all end up in Testing :: web-platform-tests and therefore don't get seen by anyone who could fix the problem, at least not without further manual triage. Therefore this is having a real effect on our ability to handle bugs and other intermittent issues.

It would be good to determine the actual cost/benefit of different approaches to solving this problem, because currently the maintainers of the two most obvious pieces of code where a fix could be applied are both claiming it's too difficult to make this work in their system, and so no progress is being made.
On the Treeherder side, there is no maintainer. The one person who looked after this feature no longer works at Mozilla and the Treeherder team does not have the resources to take another feature under our wing. If people want that to change, then we need more headcount.
I have a pending NI for :gps https://bugzilla.mozilla.org/show_bug.cgi?id=1354791#c11 to solve this same issue.

:aryx, is the Treeherder bug filing tool something you can fix on your end?
Flags: needinfo?(aryx.bugmail)
I understand that the treeherder team is chronically understaffed. I also understand that this is a frustrating case because the feature relied on a "third party" API that suddenly stopped working. However the bug is wasting the time of engineers who have to deal with misclassified bugs, or have to ignore intermittent failures that would be fixed if only the right people knew that they existed.

If we don't have the resources to make the bug filer work correctly, it should be disabled, as we did for the equally undermaintained autoclassify panel. If that isn't an acceptable solution, we need to work out between the Treeherder team and the sheriffs' team how to prioritise fixing this kind of high-impact issue.
I agree this is something that should receive resources to fix. Coop, could you find someone to do this, and coordinate with the sheriffs to ensure they don't continue to mis-file bugs in the meantime?
Flags: needinfo?(coop)
I have a JS implementation locally, doing some cleanup right now.
Flags: needinfo?(aryx.bugmail)
Amazing - thank you! :-)
Flags: needinfo?(coop)
FWIW, the "high impact" part of this could have just been fixed months ago by removing the three lines at https://github.com/mozilla/treeherder/blob/cca48d14df73d470a59f79b8c1a4991c93a0da0b/ui/js/controllers/bugfiler.js#L172
Comment on attachment 8960747 [details] [review]
Link to GitHub pull-request: https://github.com/mozilla/treeherder/pull/3355

Left some comments :-)

Since leaving those, I've seen Phil's comment above - I agree it's easiest to remove that for now.
Attachment #8960747 - Flags: review?(cdawson) → review-
Now that bug 1447771 has fixed the main issue this was causing, this seems to be a dupe of bug 1354791 (fixing the hg.m.o API), unless we've decided that it's preferred to move away from that API longer term?

This conversation hasn't moved in a while, and we seem to get good-quality bugs these days; closing unless there is new information.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX

Reopening as it's a time waster.

Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---

Hi Chris, are you the new taskcluster SRE person? Could you help get the deployment of https://prototype.treeherder.nonprod.cloudops.mozgcp.net/jobs?repo=autoland and the related database schemas going? I'd like to check that the new data ingestion works as expected before it gets merged into production. Thank you in advance.

Flags: needinfo?(cvalaas)

I've merged the new cronjob/schedule into the nonprod treeherder deploy spec.
I don't see anything regarding databases in the cloudops code, but I'm new on this project, so I could be missing something ... ?

Flags: needinfo?(cvalaas)

It's also my first time having such a request.

Questions:

  1. Does the migration (python manage.py migrate, see https://docs.djangoproject.com/en/3.1/topics/migrations/#workflow) need to be run manually (I think so)?
  2. The call to /api/repository/ fails and causes https://prototype.treeherder.nonprod.cloudops.mozgcp.net/jobs?repo=autoland to remain blank. This data gets loaded from treeherder/model/fixtures/repository.json into the database when the app starts/reboots. It got modified in the recent push, but I don't spot an issue (and it ran successfully on the local machine): https://github.com/mozilla/treeherder/commit/22155749149bfeb018ffc3b1a3ed849b591d07dc#diff-4d440bff7edf0c62897c28f2cd87538d1e2929e06fbc8409856f3f1e492762fe As far as I remember, I found a message about a missing column related to performance earlier this week (in GCP?) but am unable to find it.
    Log Explorer shows nothing for resource.labels.database_id="moz-fx-treeherde-nonprod-34ec:treeherder-nonprod-prototype-v1" - could you check in GCP whether failure messages are being logged elsewhere?
    Thank you in advance.

Found the error: https://console.cloud.google.com/errors/CLLUuf7x2fOAXw?time=P1D&project=moz-fx-treeherde-nonprod-34ec
OperationalError: (1054, "Unknown column 'repository.life_cycle_order' in 'field list'")
So it's missing https://github.com/mozilla/treeherder/pull/7151/files#diff-69a54a8c36fc6091dc4cb899a8593707d4166e68e8db8c08dd33c4740877c879R107 - the migration command from comment 31 (in full: docker-compose run backend ./manage.py migrate) should fix this.
(The performance error message I remembered was this one.)

It does seem like the db migrate needs to be run manually. It is not part of the prototype Jenkins pipeline, though it is part of the stage and production pipelines.
I don't see any documentation in any of cloudops' repos regarding db migrations for treeherder.
:sclements, since you seem to know more than us about how these deploys go, do you know how db migrations work for the prototype env?

Flags: needinfo?(sclements)

After s'more poking, it looks like the MIGRATE step in Jenkins does run a container with the entrypoint set to "release", which, according to entrypoint_prod.sh calls ./bin/pre_deploy, which runs the django migration.
Whether or not this should happen on prototype, though, I'm still unclear. If I understand prototype correctly, it gets reset to master every-so-often, so automatic db migrations may not be desirable (or workable)...

(In reply to chris valaas [:cvalaas] from comment #34)

> After s'more poking, it looks like the MIGRATE step in Jenkins does run a container with the entrypoint set to "release", which, according to entrypoint_prod.sh calls ./bin/pre_deploy, which runs the django migration.
> Whether or not this should happen on prototype, though, I'm still unclear. If I understand prototype correctly, it gets reset to master every-so-often, so automatic db migrations may not be desirable (or workable)...

Hi Chris, prototype should behave the exact same way as stage and prod - so migrations should run on every deploy. People will only occasionally use the prototype branch by pushing directly to it via Git; it's not designed to automatically reset to master as far as I'm aware. Also, for your reference, here are some docs that might be useful for you - just a general FYI. https://treeherder.readthedocs.io/infrastructure/administration.html#database-management-cloudops

Flags: needinfo?(sclements)

Okay, I added the MIGRATE step to the prototype deploy. Should happen next prototype deployment.

Prototype got a new push from me yesterday (a merge of master, because the prototype branch is protected against force pushes) and it shows the pushes now but no tasks - this could be explained if it fails to connect to Pulse and retrieve the messages about the tasks. This was working 11 days earlier and the recent commits look unrelated. The database is working because the pushes are being stored.

(In reply to Sebastian Hengst [:aryx] (needinfo on intermittent or backout) from comment #37)

> Prototype got a new push from me yesterday (a merge of master because the prototype branch is protected force pushes) and it shows the pushes now but no tasks - this could be explained if it fails to connect to Pulse and retrieve the messages about the tasks. This was working 11 days earlier and the recent commits look unrelated. The database is working because the pushes are being stored.

I'd file anything like this under cloudOps. Chris, can you look into this please? You'll probably need to look at the pulse_listener_tasks or other *task worker.

Flags: needinfo?(cvalaas)

Most of the log messages from the last 26 hours for pulse_listener_tasks in prototype look like this (several lines per second):

[2021-08-18 20:34:22,714] DEBUG [treeherder.services.pulse.consumers:155] received job message from exchange/taskcluster-queue/v1/task-completed#primary.QugQcbIrTCmCTpBz6tWUfg.0.us-west-1.i-08cb54ae59a438e02.gecko-t.t-linux-xlarge.gecko-level-1.b-Xxq0DqRJyxCYZSZwqoYA._

If I filter out "received job message", these are the only log messages left:

2021-08-17 22:33:45.847 | WARNING  | mozci.configuration:__init__:123 - Configuration path mozci_config.toml is not a file.
22:33:45.878 | INFO     | mozci.data.base:51  - Sources selected, in order of priority: ('hgmo', 'taskcluster', 'treeherder_client').
[2021-08-17 22:33:46,015] INFO [treeherder.services.pulse.consumers:102] Pulse queue queue/treeherder-prototype/tasks bound to: exchange/taskcluster-queue/v1/task-pending #.#
[2021-08-17 22:33:46,019] INFO [treeherder.services.pulse.consumers:102] Pulse queue queue/treeherder-prototype/tasks bound to: exchange/taskcluster-queue/v1/task-running #.#
[2021-08-17 22:33:46,022] INFO [treeherder.services.pulse.consumers:102] Pulse queue queue/treeherder-prototype/tasks bound to: exchange/taskcluster-queue/v1/task-completed #.#
[2021-08-17 22:33:46,025] INFO [treeherder.services.pulse.consumers:102] Pulse queue queue/treeherder-prototype/tasks bound to: exchange/taskcluster-queue/v1/task-failed #.#
[2021-08-17 22:33:46,028] INFO [treeherder.services.pulse.consumers:102] Pulse queue queue/treeherder-prototype/tasks bound to: exchange/taskcluster-queue/v1/task-exception #.#
[2021-08-17 22:33:46,138] INFO [treeherder.services.pulse.consumers:102] Pulse queue queue/treeherder-prototype/tasks bound to: exchange/taskcluster-queue/v1/task-pending #.#
[2021-08-17 22:33:46,142] INFO [treeherder.services.pulse.consumers:102] Pulse queue queue/treeherder-prototype/tasks bound to: exchange/taskcluster-queue/v1/task-running #.#
[2021-08-17 22:33:46,146] INFO [treeherder.services.pulse.consumers:102] Pulse queue queue/treeherder-prototype/tasks bound to: exchange/taskcluster-queue/v1/task-completed #.#
[2021-08-17 22:33:46,152] INFO [treeherder.services.pulse.consumers:102] Pulse queue queue/treeherder-prototype/tasks bound to: exchange/taskcluster-queue/v1/task-failed #.#
[2021-08-17 22:33:46,159] INFO [treeherder.services.pulse.consumers:102] Pulse queue queue/treeherder-prototype/tasks bound to: exchange/taskcluster-queue/v1/task-exception #.#

(There was a similar batch of messages from 19:41 (which seems to match the time of the deploy) that I left out. Same messages though.)

I don't see any other workload ending in *tasks in the cluster.

Flags: needinfo?(cvalaas)

Let's take this log message:

{
insertId: "h8lmn78h4q5vr7lc"
labels: {10}
logName: "projects/moz-fx-treeherde-nonprod-34ec/logs/stderr"
receiveTimestamp: "2021-08-18T21:14:02.870117823Z"
resource: {2}
severity: "ERROR"
textPayload: "[2021-08-18 21:13:59,696] DEBUG [treeherder.etl.taskcluster_pulse.handler:179] Message received for task YBAIl9huRU6FA3QPXRr0jw"
timestamp: "2021-08-18T21:13:59.696802846Z"
}

The mentioned task ran 9 days ago - the prototype instance might try to process the backlog of messages since my first push to prototype because it's unbounded. https://pulseguardian.mozilla.org/queues shows only my own queue.

The queue should be deleted and prototype restarted, which will recreate it (this worked with my queue) - that should get rid of the backlog. https://mana.mozilla.org/wiki/display/ITEO/Systems+Engineering+Team might have admin access in case you don't have it. The credentials are likely in the vault Sarah passed to cloudOps when they took over (I don't have access).

From what I can see via CloudAMQP.com, the treeherder-prototype instance has 19 queues. All but one are empty. The store_pulse_tasks queue has 3+ million messages.
It looks like I can purge that queue, would that work?
It also seems I can delete it, but if purging it is sufficient, that seems the easier solution.

Are there other store_pulse_tasks queues (one per instance, because acked messages are not sent to other instances)?

If the answer is yes: 3+ million messages sounds like this cannot be production. Please purge the messages.

treeherder-prod and treeherder-stage both have store_pulse_tasks queues, but they're both empty (messages are popping in and out, but they're hovering at 0).

Shall I go ahead and purge store_pulse_tasks on treeherder-prototype?

(In reply to Sebastian Hengst [:aryx] (needinfo on intermittent or backout) from comment #42)

> Are there other store_pulse_tasks queues (one per instance, because acked messages are not sent to other instances (?)).
>
> If the answer is yes: 3+ million messages sound like it cannot be production. Please purge the messages.

Each deployment has its own store_pulse_task worker and listeners, so deleting one should not affect other deployments. To clarify, the pulse_listener_tasks is a cloudAMQP queue, not a pulse guardian queue. It takes those messages from pulse guardian queues and then processes and stores those tasks with store_pulse_tasks, which also then kicks off log process workers if applicable. So we want to purge the store_pulse_tasks queue as Chris said, not delete the pulse guardian queues.
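
For reference, a minimal sketch of what inspecting and purging such a queue could look like with pika (the AMQP URL is a placeholder for the CloudAMQP credentials; in this bug the purge appears to have been done through the CloudAMQP console instead):

import pika

# Placeholder URL - the real credentials live with cloudOps / CloudAMQP.
AMQP_URL = "amqps://user:password@host/vhost"

conn = pika.BlockingConnection(pika.URLParameters(AMQP_URL))
channel = conn.channel()

# passive=True only checks the queue and reports its depth without creating it.
status = channel.queue_declare(queue="store_pulse_tasks", passive=True)
print("messages queued:", status.method.message_count)

# Drop the backlog; workers keep consuming new messages as they arrive.
channel.queue_purge(queue="store_pulse_tasks")
conn.close()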

(In reply to chris valaas [:cvalaas] from comment #43)

> treeherder-prod and treeherder-stage both have store_pulse_tasks queues, but they're both empty (messages are popping in and out, but they're hovering at 0).
>
> Shall I go ahead and purge store_pulse_tasks on treeherder-prototype?

Before you purge it, we should figure out why the messages aren't being acknowledged, because that would be why the tasks are not showing up in prototype. Is it under-resourced or is there some other error?

I'm not seeing the connection between your changes on prototype, Sebastian, and why the tasks stopped being stored in the database.

We've had occasional issues with workers not working for some random infra reason and never being alerted to it until someone notices something is broken. Any ideas on how to set up alerts when something fails or we reach some sort of unacknowledged message limit, Chris?

Aha, I see some errors in new relic: https://onenr.io/08dQeJVA5we

So it looks like this error, OperationalError: (1054, "Unknown column 'repository.life_cycle_order' in 'field list'"), that Sebastian mentioned was the cause of the backlog of storing tasks, and the last occurrence was August 17th. So it's safe to proceed with purging the queue then. But I wonder if the other 1.7 million messages from "Retry in 30s: MissingPushException('No push found in try for revision 0d206fdbd6564bd64904ffac7bf83ea3112fbe13 for task BkRVyClsRUO_OoOwnwmTkw')" are also an issue. That seems to be ongoing. That might need to be looked into more.

Chris, if you're going to be the Treeherder point of contact for troubleshooting I can give you access to New Relic if you'd like.

> So safe to proceed with purging the queue then.

Queue has been purged.

> Chris, if you're going to be the Treeherder point of contact for troubleshooting I can give you access to New Relic if you'd like.

That'd be great, thanks!

Sarah, could you merge this to master, please?

Flags: needinfo?(sclements)

Merged.

Flags: needinfo?(sclements)

Added new cron to stage (I assume you want it in stage? If not, let me know) and prod. Awaiting review and merge.
https://github.com/mozilla-services/cloudops-infra/pull/3358

EDIT: merged.

Flags: needinfo?(cvalaas)

Could you check the config, please? Searching Log Explorer for update_files_bugzilla_map finds nothing for treeherder-prod but does find results for treeherder-nonprod - or is this awaiting deployment?

Flags: needinfo?(cvalaas)

Looks like :sclements approved a prod deploy within the last hour ... ?

Flags: needinfo?(cvalaas)

Thanks, I wasn't sure whether this needed a new TH deploy or whether the cloudops-infra change was managed independently. Production shows the desired behavior now.

Status: REOPENED → RESOLVED
Closed: 4 years ago → 3 years ago
Resolution: --- → FIXED
Component: Treeherder: Log Parsing & Classification → TreeHerder