Bug 1895092 Comment 6 Edit History


(In reply to Joel Maher ( :jmaher ) (UTC -8) from comment #5)
> probably the right thing we should do is ensure at the start of a test job before launching the browser that we are running the correct theme (whichever we choose to be desired).  This would give consistency longer term to our CI.

That sounds great, as long as we can either ensure or re-validate that the theme stays correct throughout the run.
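For illustration, here is a rough sketch of what such a check might look like. It assumes the GNOME `gsettings` command is available on the test machine and that the `org.gnome.desktop.interface gtk-theme` key reflects the theme Firefox ends up with, which may not hold in every setup. The function names and the expected theme name are hypothetical, not taken from our harness.

```python
import subprocess
from typing import Callable, Sequence


def _run_gsettings(argv: Sequence[str]) -> str:
    """Run a command and return its stdout as text."""
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout


def current_gtk_theme(run: Callable[[Sequence[str]], str] = _run_gsettings) -> str:
    """Return the configured GTK theme name, e.g. "Ambiance".

    gsettings prints the value quoted, like 'Ambiance', so strip
    the surrounding whitespace and quotes.
    """
    raw = run(["gsettings", "get", "org.gnome.desktop.interface", "gtk-theme"])
    return raw.strip().strip("'\"")


def theme_is_expected(expected: str,
                      run: Callable[[Sequence[str]], str] = _run_gsettings) -> bool:
    """True if the configured theme matches the one we want in CI."""
    return current_gtk_theme(run) == expected
```

A harness could call `theme_is_expected(...)` once before launching the browser and again at the end of the job, and flag the run if either call returns False.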

> Why do we get different themes?  

That is a good question. :)  The theme that we're using just comes from a call to a GTK API, so something about the system GTK configuration is intermittently different, it seems.

> I wonder if the theme sometimes changes mid run, i.e. something happens to the OS and we fall back to Ambiance.  

That's an interesting idea, yeah.  It does seem possible, given that the issue is more frequent on (though not exclusive to) "heavier" test runs like TSAN/ASAN.  Note, though, that we hit this issue the first time that Firefox starts up (e.g. in the "bad" try run linked in comment 2 here); so if something goes wrong partway through, it happens before Firefox itself starts.
[EDIT: sorry, left this^ thought unfinished at first. Editing the comment in place to fill out what I had meant to say.]

> There is a pattern I notice, specifically the failures all take >5 minutes to download and decompress the docker image (typically 10+ minutes), whereas the successful tasks all do this in <5 minutes (usually 4 minutes), specifically this is not downloading or decompressing, but the loading of the docker image

I noticed a similar pattern in bug 1894564, but I also noticed some counterexamples where we had reftest failures of this sort (with transparent scrollbars and an unexpected icon set) and yet the log was still pretty short. For example, in this one the whole job was "only" 22 min long, and only 1 second elapses from the start of the log to `Image 'public/image.tar.zst' from task 'Xy1CZOamS5SeLiFFT-mRrA' loaded`:
https://treeherder.mozilla.org/logviewer?job_id=456585940&repo=autoland&lineNumber=12227
https://firefoxci.taskcluster-artifacts.net/bUKWv5afRVq1mHu_D8nj7w/0/public/logs/live_backing.log

Not sure what to make of that. (There's no mention of docker in this log, so perhaps the setup is slightly different.)
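
To compare that "start of log to image loaded" interval across many jobs, something like the sketch below could help. It assumes each relevant log line starts with a bracketed prefix of the form `[<source> YYYY-MM-DD HH:MM:SS.mmmZ]`; that format is my assumption and would need checking against the real logs. The function names are hypothetical.

```python
import re
from datetime import datetime
from typing import Iterable, Optional

# Assumed prefix: "[<source> 2024-01-01 00:00:00.000Z] ..." (not verified).
TIMESTAMP_RE = re.compile(
    r"^\[\w+ (\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}\.\d{1,6})Z\]"
)


def _timestamp(line: str) -> Optional[datetime]:
    """Parse the leading timestamp of a log line, or return None."""
    match = TIMESTAMP_RE.match(line)
    if match is None:
        return None
    text = match.group(1).replace("T", " ")
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def seconds_until_image_loaded(lines: Iterable[str]) -> Optional[float]:
    """Seconds from the first timestamped line to the "Image ... loaded" line.

    Returns None if no such line (or no timestamp) is found.
    """
    first = None
    for line in lines:
        ts = _timestamp(line)
        if ts is None:
            continue
        if first is None:
            first = ts
        if "Image '" in line and "loaded" in line:
            return (ts - first).total_seconds()
    return None
```

Running this over the lines of a downloaded `live_backing.log` for both failing and passing jobs would show whether the slow-image-load correlation holds up or whether counterexamples like the one above are common.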
