Closed
Bug 925462
(talos-r4-lion-027)
Opened 11 years ago
Closed 11 years ago
talos-r4-lion-027 problem tracking
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: armenzg, Unassigned)
References
Details
(Whiteboard: [buildduty][buildslaves][capacity])
We saw one instance of the _Black_ Pixel of Death: https://tbpl.mozilla.org/php/getParsedLog.php?id=28937841&tree=Mozilla-Inbound#error0
Reporter
Comment 1•11 years ago
Nothing to do, though. RyanVM: should the test get a little more fuzzing?
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment 2•11 years ago
Why would I add fuzz to a test for a one-off failure? Looking at the screenshot and the location of the wrong pixel, I find it hard to believe that the failure had anything to do with the specific test being run.
Reporter
Comment 3•11 years ago
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #2)
> Why would I add fuzz to a test for a one-off failure? Looking at the
> screenshot and the location of the wrong pixel, I find it hard to believe
> that the failure had anything to do with the specific test being run.

I was hoping that it would hide this type of one-off failure, so we would see one less intermittent orange.
Comment 4•11 years ago
This is one weird slave. PPoD is perfectly understandable: that shade of pink is one bit off from white, so one bit of memory flipped to the wrong state gives you pink instead of white. But https://tbpl.mozilla.org/php/getParsedLog.php?id=29473226&tree=Fx-Team is another black pixel of death. How is it doing that?

And no, fuzzing every single reftest in the tree to paper over bad slaves isn't something we're going to do. Somewhere we have another bug where you suggested that, and I ranted for 500 words about how reftests cleanly show bad memory, while other suites show it by crashing because a bit in a pointer was flipped so it accesses memory it shouldn't, or by failing because the actual result of something was one number, but it passed through the bad memory and came back as a different number.
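To make the bit-flip reasoning in comment 4 concrete, here is a minimal sketch. It assumes a 24-bit RGB pixel packed into one integer and picks the top bit of the green channel as the flipped bit; the thread does not say which bit or which pixel layout was actually involved, so both are illustrative assumptions, as are all the function names.

```python
# Illustrative only: assumes 24-bit RGB packed as 0xRRGGBB.
WHITE = 0xFFFFFF
BLACK = 0x000000


def flip_bit(pixel, bit):
    """Return the pixel with one bit (0..23) inverted."""
    return pixel ^ (1 << bit)


def to_rgb(pixel):
    """Unpack 0xRRGGBB into an (r, g, b) tuple."""
    return ((pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF)


def bits_different(a, b):
    """Count the bits that differ between two pixels (Hamming distance)."""
    return bin(a ^ b).count("1")


def max_channel_diff(a, b):
    """Largest per-channel difference, the kind of tolerance a fuzzy
    comparison would need in order to accept b in place of a."""
    return max(abs(x - y) for x, y in zip(to_rgb(a), to_rgb(b)))


# Hypothetical choice: flip the most significant bit of the green channel.
pinkish = flip_bit(WHITE, 15)

print(to_rgb(pinkish))                   # (255, 127, 255)
print(bits_different(WHITE, pinkish))    # 1
print(bits_different(WHITE, BLACK))      # 24
print(max_channel_diff(WHITE, pinkish))  # 128
```

Under these assumptions, a single flipped bit is enough to turn white into a pink-like colour, whereas white and black differ in all 24 bits, which is consistent with why a black pixel is harder to explain as a single-bit memory error. The last line also shows that a per-channel tolerance large enough to hide such a flip could be very wide.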
Reporter
Comment 5•11 years ago
It seems that our only option is to replace the memory. We could send the memory to dolske to see if he finds anything interesting.
Reporter
Comment 6•11 years ago
Is comment 5 accurate?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 7•11 years ago
Back in production.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Updated•6 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•4 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard