(In reply to Joel Maher ( :jmaher ) (UTC -8) from comment #13)
> I am out of ideas for why fonts and icons would be changed as a result of removing the extra scratch disk. Given the theory that the theme is crashing or failing to fully load, I imagine the fonts are related to that as well.

Right, it seems some collection of "inessential system resources" is occasionally not available.

> Maybe there is a timing issue or something or different issues with larger amounts of disk IO (reftests do run very fast).

Note that when things go wrong and we get the wrong theme, they go wrong before we even start running the tests; we get the wrong theme right at Firefox startup time.

> Some thoughts on next steps:
> 1) We could move reftests back to workers with scratch disks.

This seems worth trying to me.

> 2) for certain manifests we could run the reftests a little slower using slow-if (src)

It's not clear to me that that annotation would make a difference here, though I might be misunderstanding.

> 3) something else I haven't thought of yet?

Ideally, it feels worth getting closer to the bottom of this, e.g. by testing much-reduced versions of our tasks to see if we can identify when/where things go wrong. I'm not sure how easy it is to troubleshoot these runners, but simply seeing what `gsettings get org.gnome.desktop.interface gtk-theme` returns seems like a reasonable investigation tactic. (I would bet that this yields different results near the start of "good" vs. "bad" tasks.)

I'm not sure how much time it's worth sinking into that, particularly if there's a sunset on our usage of Ubuntu 18.04 on the horizon anyway (which I imagine there might be before too long).

So that makes me lean towards option (1) as a quick-and-easy return to a more reliable state, even if it's a bit unsatisfying.
Bug 1895092 Comment 14 Edit History
Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.
(In reply to Joel Maher ( :jmaher ) (UTC -8) from comment #13)
> I am out of ideas for why fonts and icons would be changed as a result of removing the extra scratch disk. Given the theory that the theme is crashing or failing to fully load, I imagine the fonts are related to that as well.

Right, it seems some collection of "inessential system resources" is occasionally not available.

> Maybe there is a timing issue or something or different issues with larger amounts of disk IO (reftests do run very fast).

Note that when things go wrong and we get the wrong theme, they go wrong before we even start running the tests; we get the wrong theme right at Firefox startup time.

> Some thoughts on next steps:
> 1) We could move reftests back to workers with scratch disks.

This seems worth trying to me.

> 2) for certain manifests we could run the reftests a little slower using slow-if (src)

It's not clear to me that that annotation would make a difference here, though I might be misunderstanding.

> 3) something else I haven't thought of yet?

Ideally, it feels worth getting closer to the bottom of this, e.g. by testing much-reduced versions of our tasks to see if we can identify when/where things go wrong. I'm not sure how easy it is to troubleshoot these runners, but simply seeing what `gsettings get org.gnome.desktop.interface gtk-theme` returns seems like a reasonable investigation tactic. (I would bet that this yields different results near the start of "good" vs. "bad" tasks.)

I'm not sure how much (more) time it's worth sinking into that, though, particularly if there's a sunset on our usage of Ubuntu 18.04 on the horizon anyway (which I imagine there might be before too long).

So that makes me lean towards option (1) as a quick-and-easy return to a more reliable state, even if it's a bit unsatisfying.
(In reply to Joel Maher ( :jmaher ) (UTC -8) from comment #13)
> I am out of ideas for why fonts and icons would be changed as a result of removing the extra scratch disk. Given the theory that the theme is crashing or failing to fully load, I imagine the fonts are related to that as well.

Right, it seems some collection of "inessential system resources" is occasionally not available.

> Maybe there is a timing issue or something or different issues with larger amounts of disk IO (reftests do run very fast).

Note that when things go wrong and we get the wrong theme, they go wrong before we even start running the tests; we get the wrong theme right at Firefox startup time.

> Some thoughts on next steps:
> 1) We could move reftests back to workers with scratch disks.

This seems worth trying to me.

> 2) for certain manifests we could run the reftests a little slower using slow-if (src)

It's not clear to me that that annotation would make a difference here, though I might be misunderstanding.

> 3) something else I haven't thought of yet?

Ideally, it feels worth getting closer to the bottom of this, e.g. by testing much-reduced versions of our tasks to see if we can identify when/where things go wrong, and if they ever go from wrong to right or vice-versa partway through a task, etc. I'm not sure how easy it is to troubleshoot these runners, but simply seeing what `gsettings get org.gnome.desktop.interface gtk-theme` returns seems like a reasonable investigation tactic. (I would bet that this yields different results near the start of "good" vs. "bad" tasks.)

I'm not sure how much (more) time it's worth sinking into that, though, particularly if there's a sunset on our usage of Ubuntu 18.04 on the horizon anyway (which I imagine there might be before too long).

So that makes me lean towards option (1) as a quick-and-easy return to a more reliable state, even if it's a bit unsatisfying.
(In reply to Joel Maher ( :jmaher ) (UTC -8) from comment #13)
> I am out of ideas for why fonts and icons would be changed as a result of removing the extra scratch disk. Given the theory that the theme is crashing or failing to fully load, I imagine the fonts are related to that as well.

Right, it seems some collection of "inessential system resources" is occasionally not available.

> Maybe there is a timing issue or something or different issues with larger amounts of disk IO (reftests do run very fast).

Note that when things go wrong and we get the wrong theme, they go wrong before we even start running the tests; we get the wrong theme right at Firefox startup time.

> Some thoughts on next steps:
> 1) We could move reftests back to workers with scratch disks.

This seems worth trying to me.

> 2) for certain manifests we could run the reftests a little slower using slow-if (src)

It's not clear to me that that annotation would make a difference here, though I might be misunderstanding.

> 3) something else I haven't thought of yet?

Ideally, it feels worth getting closer to the bottom of this, e.g. by testing much-reduced versions of our tasks to see if we can identify when/where things go wrong, and if they ever go from wrong to right or vice-versa partway through a task, etc. I'm not sure how easy it is to troubleshoot these runners, but simply seeing what `gsettings get org.gnome.desktop.interface gtk-theme` returns seems like a reasonable investigation tactic. (I would bet that this yields different results near the start of "good" vs. "bad" tasks.) (Note: this `gsettings` command is just cribbed from the attached patch in Phabricator here.)

I'm not sure how much (more) time it's worth sinking into that, though, particularly if there's a sunset on our usage of Ubuntu 18.04 on the horizon anyway (which I imagine there might be before too long).

So that makes me lean towards option (1) as a quick-and-easy return to a more reliable state, even if it's a bit unsatisfying.
(In reply to Joel Maher ( :jmaher ) (UTC -8) from comment #13)
> I am out of ideas for why fonts and icons would be changed as a result of removing the extra scratch disk. Given the theory that the theme is crashing or failing to fully load, I imagine the fonts are related to that as well.

Right, it seems some collection of "inessential system resources" is occasionally not available.

> Maybe there is a timing issue or something or different issues with larger amounts of disk IO (reftests do run very fast).

Note that when things go wrong and we get the wrong theme, they go wrong before we even start running the tests; we get the wrong theme right at Firefox startup time.

> Some thoughts on next steps:
> 1) We could move reftests back to workers with scratch disks.

This seems worth trying to me.

> 2) for certain manifests we could run the reftests a little slower using slow-if (src)

It's not clear to me that that annotation would make a difference here, though I might be misunderstanding.

> 3) something else I haven't thought of yet?

Ideally, it feels worth getting closer to the bottom of this, e.g. by testing much-reduced versions of our tasks to see if we can identify when/where things go wrong, and if they ever go from wrong to right or vice-versa partway through a task, etc. I'm not sure how easy it is to troubleshoot these runners, but simply seeing what `gsettings get org.gnome.desktop.interface gtk-theme` returns seems like a reasonable investigation tactic. (I would bet that this yields different results near the start of "good" vs. "bad" tasks, at least.) (Note: this `gsettings` command is just cribbed from the attached patch in Phabricator here.)

I'm not sure how much (more) time it's worth sinking into that, though, particularly if there's a sunset on our usage of Ubuntu 18.04 on the horizon anyway (which I imagine there might be before too long).

So that makes me lean towards option (1) as a quick-and-easy return to a more reliable state, even if it's a bit unsatisfying.
(In reply to Joel Maher ( :jmaher ) (UTC -8) from comment #13)
> I am out of ideas for why fonts and icons would be changed as a result of removing the extra scratch disk. Given the theory that the theme is crashing or failing to fully load, I imagine the fonts are related to that as well.

Right, it seems some collection of "inessential system resources" is occasionally not available.

> Maybe there is a timing issue or something or different issues with larger amounts of disk IO (reftests do run very fast).

Note that when things go wrong and we get the wrong theme, they go wrong before we even start running the tests; we get the wrong theme right at Firefox startup time.

> Some thoughts on next steps:
> 1) We could move reftests back to workers with scratch disks.

This seems worth trying to me.

> 2) for certain manifests we could run the reftests a little slower using slow-if (src)

It's not clear to me that that annotation would make a difference here, though I might be misunderstanding.

> 3) something else I haven't thought of yet?

Ideally, it feels worth getting closer to the bottom of this, e.g. by testing much-reduced versions of our tasks to see if we can identify when/where things go wrong, and if they ever go from wrong to right or vice-versa partway through a task, etc. I'm not sure how easy it is to troubleshoot these runners, but simply seeing what `gsettings get org.gnome.desktop.interface gtk-theme` returns seems like a reasonable investigation tactic. (I would bet that this yields different results near the start of "good" vs. "bad" tasks, at least.) (Note: this `gsettings` command is just cribbed from the attached patch in Phabricator here.)

I'm not sure how much (more) time it's worth sinking into that, though, particularly if it's specific to these 18.04 machines and assuming that there's a sunset on our usage of Ubuntu 18.04 on the horizon anyway (which I imagine there might be before too long).

So that makes me lean towards option (1) as a quick-and-easy return to a more reliable state, even if it's a bit unsatisfying.
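
For what it's worth, the `gsettings` check suggested in the comment could be sketched roughly as below. This is only an illustration, not something taken from the patch or the task configuration: the helper names `read_gtk_theme` and `format_theme_line` and the `task-start` label are hypothetical, and where such a check would be called from in a task is left open. The idea is just to log the reported theme at a few labelled points so that "good" and "bad" runs can be compared.

```python
import shutil
import subprocess

# The command quoted in the comment; everything else here is illustrative.
GSETTINGS_CMD = ["gsettings", "get", "org.gnome.desktop.interface", "gtk-theme"]


def read_gtk_theme(run=subprocess.run, which=shutil.which):
    """Return the theme name reported by gsettings, or None if unavailable.

    `run` and `which` are injectable so the helper can be exercised
    on machines that do not have gsettings installed.
    """
    if which("gsettings") is None:
        return None
    try:
        result = run(GSETTINGS_CMD, capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.SubprocessError):
        return None
    if result.returncode != 0:
        return None
    # gsettings prints string values wrapped in single quotes, e.g. 'Adwaita'.
    return result.stdout.strip().strip("'")


def format_theme_line(label, theme):
    """Build one log line tagging the theme with a caller-chosen label."""
    shown = theme if theme is not None else "<unavailable>"
    return f"gtk-theme [{label}]: {shown}"


if __name__ == "__main__":
    # Hypothetical label; a real task might log at several points.
    print(format_theme_line("task-start", read_gtk_theme()))
```

Logging the same line again later in a task (with a different label) would also show whether the theme ever flips from wrong to right or vice-versa partway through.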