Closed
Bug 973944
Opened 10 years ago
Closed 10 years ago
b2g datazilla hamachi shows no data since Feb 10, 2014
Categories
(Firefox OS Graveyard :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bkelly, Unassigned)
It appears datazilla for b2g master on hamachi has been busted for the last week. Can someone from a-team take a look at what is going on?
Comment 1•10 years ago
I restarted the b2g-3 device earlier today because it was hit by bug 971747 (https://bugzilla.mozilla.org/show_bug.cgi?id=971747), and I restarted a job. b2g-0 is still hit by it, but I cannot restart it because I don't know the password.
Comment 2•10 years ago
Actually b2g-3 looks to have fallen foul of bug 971747 again. The UI tests are not hit by this bug anymore but it seems the perf tests are.
Comment 3•10 years ago
I think this is a manifestation of bug 971605. I've just bumped b2gperf and we'll see if that fixes it.
Comment 4•10 years ago
That problem fixed, new problem found: bug 973822. Will fix with bug 974092.
Comment 5•10 years ago
This seems to have done the trick: https://datazilla.mozilla.org/b2g/?branch=master&device=hamachi&range=7&test=cold_load_time&app_list=browser,calendar,camera,clock,contacts,email%20FTU,fm_radio,gallery,messages,music,phone,settings,template,usage,video&app=settings&gaia_rev=f9a37c77efb4621a&gecko_rev=70e808a2cbea876f&plot=avg I'll make sure all of the b2gperf jobs are updated correctly.
Comment 6•10 years ago
Do we have some monitoring in place that could send emails every time a b2gperf build fails?
Reporter | ||
Comment 7•10 years ago
(In reply to Anthony Ricaud (:rik) from comment #6)
> Do we have some monitoring in place that could send emails every time a
> b2gperf build fails?

It's in the works and discussed weekly as part of the Signal from Noise meetings.
Comment 8•10 years ago
(In reply to Ben Kelly [:bkelly] from comment #7)
> (In reply to Anthony Ricaud (:rik) from comment #6)
> > Do we have some monitoring in place that could send emails every time a
> > b2gperf build fails?
>
> It's in the works and discussed weekly as part of the Signal from Noise
> meetings.

In the interim, we can have Jenkins send an e-mail for every failing build, but this generates a lot of noise.
Comment 9•10 years ago
This seems to be working now.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Comment 10•10 years ago
(In reply to Jonathan Griffin (:jgriffin) from comment #8)
> (In reply to Ben Kelly [:bkelly] from comment #7)
> > (In reply to Anthony Ricaud (:rik) from comment #6)
> > > Do we have some monitoring in place that could send emails every time a
> > > b2gperf build fails?
> >
> > It's in the works and discussed weekly as part of the Signal from Noise
> > meetings.
>
> In the interim, we can have Jenkins send an e-mail for every failing build,
> but this generates a lot of noise.

These are already sent to the webqa-ci mailing list [1], so you can subscribe, but you will want to configure filters because there's a lot of noise.

[1] https://mail.mozilla.org/listinfo/webqa-ci
Comment 11•10 years ago
(In reply to Jonathan Griffin (:jgriffin) from comment #8)
> In the interim, we can have Jenkins send an e-mail for every failing build,
> but this generates a lot of noise.

Why is that a lot of noise? Failing builds are a signal. It's not ok to lose a week of data because no one was warned.

My question here is "how can we make sure we do better next time?".
Comment 12•10 years ago
(In reply to Anthony Ricaud (:rik) from comment #11)
> (In reply to Jonathan Griffin (:jgriffin) from comment #8)
> > In the interim, we can have Jenkins send an e-mail for every failing build,
> > but this generates a lot of noise.
> Why is that a lot of noise? Failing builds are a signal. It's not ok to lose
> a week of data because no one was warned.

A failing build does not necessarily mean that no results were gathered and submitted to DataZilla, so from that point of view the failure e-mails can be noise. I usually monitor these jobs myself (including the existing email notifications), but I have been on PTO for the last two weeks.

> My question here is "how can we make sure we do better next time?".

I have high hopes for the data ingestion alerts that are being worked on for datazilla, which will send an alert when the rate of data received drops. I'm not sure of the bug or ETA for this though. I'm a little surprised that this wasn't noticed for so long, so if anybody monitoring the results sees such an outage I would encourage them to raise a bug sooner.

One possible reason this went unnoticed could be that datazilla shows the available data spanning the available width of the chart, so a lack of recent results might not be clear until there are no results in the last week (the default range). I wonder if there's a way in the datazilla UI we could make lack of recent data more obvious?
Flags: needinfo?(jeads)
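For illustration only, here is a minimal sketch of the kind of check an ingestion alert like the one mentioned in comment 12 might perform. It is not the Datazilla implementation: the function name `ingestion_rate_dropped`, its parameters, and the 50% threshold are all assumptions made up for this example. It compares the number of results received in the most recent window against the average of the preceding windows.

```python
from datetime import datetime, timedelta, timezone


def ingestion_rate_dropped(timestamps, now, window=timedelta(days=1),
                           baseline_windows=7, min_ratio=0.5):
    """Return True when the latest window received far fewer results than usual.

    Hypothetical helper: `timestamps` are the submission times of results for
    one branch/device pair; none of these names come from Datazilla itself.
    """
    recent_start = now - window
    baseline_start = recent_start - window * baseline_windows

    # Results received in the most recent window.
    recent = sum(1 for t in timestamps if recent_start < t <= now)
    # Results received in the windows before that, used as the baseline.
    baseline = sum(1 for t in timestamps if baseline_start < t <= recent_start)

    expected = baseline / baseline_windows
    if expected == 0:
        return False  # no history to compare against, so do not alert
    return recent < expected * min_ratio


if __name__ == "__main__":
    now = datetime(2014, 2, 17, tzinfo=timezone.utc)
    # Roughly four results per day for eight days.
    healthy = [now - timedelta(hours=6 * i + 1) for i in range(32)]
    # Same series, but nothing has arrived in the last two days.
    stalled = [t for t in healthy if t <= now - timedelta(days=2)]
    print(ingestion_rate_dropped(healthy, now))
    print(ingestion_rate_dropped(stalled, now))
```

A check like this keys on results actually arriving rather than on build status, which fits the point above that a failing build can still submit data.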
Comment 13•10 years ago
In this case, we did notice the problem, but we misattributed its cause. Bug 971747 popped up around the same time as a couple of unrelated problems, some of which had similar effects. We didn't notice the unrelated problems as quickly as we might have, because we assumed they were instances of bug 971747 and were waiting on help from devs to figure out what was going on with that.

I agree that the Datazilla alerting system which is in development will help here, but it won't eliminate the kind of confused identity problem we encountered in this particular case.
Comment 14•10 years ago
(In reply to Dave Hunt (:davehunt) from comment #12)
> One possible reason this went unnoticed could be that datazilla shows the
> available data spanning the available width of the chart, so a lack of
> recent results might not be clear until there are no results in the last
> week (the default range). I wonder if there's a way in the datazilla UI we
> could make lack of recent data more obvious?

Yes, that, totally! I think the rightmost part of datazilla should be the current time and not the last result.
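As a rough sketch of that suggestion (hypothetical names, not the Datazilla charting code), the x-axis bounds can be derived from the current time and the selected range instead of from the newest data point. Any outage then shows up as empty space on the right of the chart.

```python
from datetime import datetime, timedelta, timezone


def chart_time_axis(result_times, now, range_days=7):
    """Return (x_min, x_max, trailing_gap, empty_fraction) for a chart.

    Hypothetical helper: the right edge is pinned to `now`, so the gap
    between the newest result and the right edge stays visible.
    """
    x_max = now
    x_min = now - timedelta(days=range_days)

    # Only results inside the selected range are plotted.
    visible = [t for t in result_times if x_min <= t <= x_max]
    if visible:
        trailing_gap = now - max(visible)
    else:
        trailing_gap = x_max - x_min  # the whole chart is empty

    # Share of the chart width with no data on the right-hand side.
    empty_fraction = trailing_gap / (x_max - x_min)
    return x_min, x_max, trailing_gap, empty_fraction


if __name__ == "__main__":
    now = datetime(2014, 2, 17, tzinfo=timezone.utc)
    # One result per day, with the newest one four days old.
    results = [now - timedelta(days=d) for d in range(4, 15)]
    x_min, x_max, gap, frac = chart_time_axis(results, now)
    print(x_min, x_max, gap, round(frac, 2))
```

With the axis scaled to the data instead, the same series would fill the full width and the four-day gap would be invisible.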
Comment 15•10 years ago
(In reply to Anthony Ricaud (:rik) from comment #14)
> (In reply to Dave Hunt (:davehunt) from comment #12)
> > One possible reason this went unnoticed could be that datazilla shows the
> > available data spanning the available width of the chart, so a lack of
> > recent results might not be clear until there are no results in the last
> > week (the default range). I wonder if there's a way in the datazilla UI we
> > could make lack of recent data more obvious?
> Yes, that, totally! I think the rightmost part of datazilla should be the
> current time and not the last result.

I've raised bug 974860.
Flags: needinfo?(jeads)