Recent Firefox Test Engineering results not appearing in ActiveData

RESOLVED FIXED

Status

Testing
ActiveData
RESOLVED FIXED
a month ago
4 days ago

People

(Reporter: davehunt, Assigned: ekyle)

Tracking

Version 3
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

Created attachment 8880796 [details]
screenshot of all expected outcomes chart

I noticed a sharp drop (see attached screenshot) in the number of outcomes shown in our dashboard at http://net-mozaws-stage-fx-test-report.s3-website-us-east-1.amazonaws.com/, and discovered that some recent results, although present in our S3 bucket, are not returned by ActiveData queries.

The following query should return the results from https://s3.amazonaws.com/net-mozaws-stage-fx-test-activedata/jenkins-fxapom.stage-241/py27_raw.txt

{
  "from":"fx-test",
  "where":{"eq":{"run.job_name":"fxapom.stage","run.build_number":"241"}},
  "format":"table"
}

Instead, it returns "no records to show".
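To reproduce this outside the query tool, the query above could be sent from a script along the following lines. This is only a sketch: the endpoint URL is an assumption (the bug links only the query tool on that host), and `build_query` and `run_query` are hypothetical helper names, not part of ActiveData.

```python
import json
import urllib.request

# Assumed endpoint; this bug only links the query tool on the same host.
ACTIVEDATA_URL = "https://activedata.allizom.org/query"


def build_query(job_name, build_number):
    """Build the query from this report for a given job name and build number."""
    return {
        "from": "fx-test",
        "where": {
            "eq": {
                "run.job_name": job_name,
                # The original query passes the build number as a string.
                "run.build_number": str(build_number),
            }
        },
        "format": "table",
    }


def run_query(query, url=ACTIVEDATA_URL, timeout=30):
    """POST the query as JSON and return the decoded reply (needs network access)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_query(build_query("fxapom.stage", 241))` would then issue the same request as the query tool, provided the endpoint assumption holds.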
Flags: needinfo?(klahnakoski)
(Assignee)

Comment 1

a month ago
Looking at this soon after the bug was filed, the data appeared to be a day behind, but not enough to be noticeable. The problem appears to be that this data series is not ingested at all; even old data is not showing. My next step is to review what happens when the S3 bucket is scanned. It scans in alphabetical order, so it may be skipping some entries.
Flags: needinfo?(klahnakoski)
(Assignee)

Comment 2

a month ago
Ignore that last comment. I looked at the S3 bucket scanner, and it looks good. The ETL pipeline that feeds ActiveData is behind; only the last result, #241, is missing. I scaled up the ETL, and I will check later today whether it has caught up.
(Assignee)

Comment 3

a month ago
I have confirmed the ETL processing was behind, and that is the cause of this problem. The ETL is still behind for a couple of other reasons, which have manual workarounds until the code is fixed.

https://github.com/marco-c/grcov/issues/31
https://github.com/klahnakoski/SpotManager/issues/5

Comment 4

Thanks Kyle. I currently perform queries with a range of results from 8 weeks ago to 1 day ago. This is because we may not yet have results for today, and I don't want to alarm viewers of the report if they see the charts fall to zero. It looks like this range is still not including all results. Should I adjust the range to 2 days ago?
Flags: needinfo?(klahnakoski)
(Assignee)

Comment 5

a month ago
The root cause was that code coverage was slowing down the workers and making them fall behind.

You should adjust the range to 2 days ago, so that if this slowdown happens again there is less alarm. That said, I did appreciate you notifying me of the slowdown.
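The suggested window (8 weeks ago up to 2 days ago) can be computed as in the sketch below. The helper names are hypothetical, and the filter shape is an assumption: it supposes that `result.start_time` holds epoch seconds and that the query language accepts `gte` and `lt` range operators inside an `and` clause.

```python
from datetime import datetime, timedelta, timezone


def report_window(now, weeks_back=8, lag_days=2):
    """Return (start, end) at UTC midnight for the reporting window.

    `now` must be a timezone-aware datetime. The window runs from
    `weeks_back` weeks ago up to, but not including, `lag_days` days ago.
    """
    today = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return today - timedelta(weeks=weeks_back), today - timedelta(days=lag_days)


def window_filter(now, field="result.start_time"):
    """Express the window as a range filter (assumed shape, epoch seconds)."""
    start, end = report_window(now)
    return {
        "and": [
            {"gte": {field: start.timestamp()}},
            {"lt": {field: end.timestamp()}},
        ]
    }
```

Keeping the lag as a parameter means the report can be widened or narrowed later without touching the rest of the query.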
Flags: needinfo?(klahnakoski)
(Assignee)

Comment 6

29 days ago
I am still tracking this problem; ActiveData is catching up. My apologies for the delay.
(Assignee)

Comment 7

4 days ago
Use this query to monitor the ingestion.

https://activedata.allizom.org/tools/query.html#query_id=S4TfgG3z

> {
>     "from":"fx-test",
>     "edges":[{
>         "value":"result.start_time",
>         "domain":{
>             "type":"time",
>             "min":"today-month",
>             "max":"eod",
>             "interval":"day"
>         }
>     }]
> }
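Once the per-day record counts from the query above are in hand, the ingestion lag can be estimated as in the following sketch. The function name is hypothetical, and converting the query response into a date-to-count mapping is left out because the response layout is not shown in this bug.

```python
from datetime import date


def ingestion_lag_days(daily_counts, today):
    """Days between `today` and the most recent day with at least one record.

    `daily_counts` maps a `date` to the number of records found for that day.
    Returns None when no day has any records.
    """
    days_with_data = [day for day, count in daily_counts.items() if count > 0]
    if not days_with_data:
        return None
    return (today - max(days_with_data)).days
```

A lag larger than the reporting offset (2 days, per Comment 5) would suggest the ETL has fallen behind again.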
Status: NEW → RESOLVED
Last Resolved: 4 days ago
Resolution: --- → FIXED