Open Bug 1279821 Opened 4 years ago Updated 1 month ago

Intermittent browser/components/sessionstore/test/browser_windowStateContainer.js | Test timed out

Categories

(Firefox :: Session Restore, defect, P3)

Tracking

()

Fission Milestone M4.1

People

(Reporter: philor, Unassigned, NeedInfo)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell needswork])

Attachments

(1 obsolete file)

Flags: needinfo?(allstars.chh)
Assignee: nobody → allstars.chh
Flags: needinfo?(allstars.chh)
Priority: -- → P3
Assignee: allstars.chh → nobody
As a note, I saw this while running tests by themselves (fresh profile, process with no tests before/after):
https://treeherder.mozilla.org/#/jobs?repo=try&revision=72375369a12a65e716cdc7afa3a81200a4857dfb&filter-searchStr=bc9&selectedJob=30145054
This test has been failing about 30 times a week for several months now, on Windows and OS X; it needs some attention.

:allstars.chh -- Could you have another look / make this test more reliable?
Flags: needinfo?(allstars.chh)
I notice most logs show many instances of

ERROR	TelemetrySend::_doPing - error making request to https://127.0.0.1:8888/telemetry-dummy...

before the timeout. I don't see a close connection between this test and telemetry, but maybe that's a contributing factor.
:Dexter -- Can you comment on the TelemetrySend issue? I see you made some changes in bug 1329978.

For example, see the error messages and backtraces starting at https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=76671296&lineNumber=4360. I understand why prefs are used to configure telemetry-dummy and expect that to fail if attempted, but shouldn't such attempts be disabled during most tests?
Flags: needinfo?(alessio.placitelli)
(In reply to Geoff Brown [:gbrown] from comment #28)
> :Dexter -- Can you comment on the TelemetrySend issue? I see you made some
> changes in bug 1329978.

Yes! However, it looks like it's not related to this problem. The preference that controls that change is false by default anyway, and will just be set to true in Marionette tests.

> For example, see the error messages and backtraces starting at
> https://treeherder.mozilla.org/logviewer.
> html#?repo=autoland&job_id=76671296&lineNumber=4360. I understand why prefs
> are used to configure telemetry-dummy and expect that to fail if attempted,
> but shouldn't such attempts be disabled during most tests?

Seeing these log lines in Mochitests is normal, as Telemetry shouldn't be disabled in these tests: we want to make sure that Telemetry behaves correctly when running alongside other components. I agree that these log lines can be misleading and clutter the log, so we might think about disabling them in another bug.

The problem with this test seems to be around here: http://searchfox.org/mozilla-central/rev/12cf11303392edac9f1da0c02e3d9ad2ecc8f4d3/browser/components/sessionstore/test/browser_windowStateContainer.js#48

For some reason, it gets stuck there and doesn't move on after the 3rd iteration, but Telemetry doesn't seem to be the culprit here :(
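
One way to narrow down a hang like this is to put a deadline on each awaited step, so the log names the iteration that stalled instead of only reporting the harness-level timeout. The following is a minimal sketch under stated assumptions, not code from the test or the mochitest harness: withTimeout, runIterations, and the step callback are hypothetical names introduced for illustration, and the sketch uses only standard JavaScript promises and timers.

```javascript
// Hypothetical helper (not part of the mochitest harness): rejects if
// `promise` does not settle within `ms` milliseconds, naming the step.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} did not settle within ${ms} ms`)),
      ms
    );
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical driver: awaits `step(i)` for each zero-based iteration and
// fails on the first one that exceeds the per-step deadline.
async function runIterations(count, step, ms) {
  for (let i = 0; i < count; i++) {
    await withTimeout(step(i), ms, `iteration ${i}`);
  }
  return count;
}
```

With a per-step deadline well below the harness timeout, a failing run would report which iteration never completed, which may help separate a hang in the test's own loop from unrelated noise in the log.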
Flags: needinfo?(alessio.placitelli)
This is still a problematic test. :dexter, can you take a look at this or nudge allstars.chh to look into it?
Flags: needinfo?(alessio.placitelli)
Whiteboard: [stockwell needswork]
(In reply to Alessio Placitelli [:Dexter] from comment #30)
> > For example, see the error messages and backtraces starting at
> > https://treeherder.mozilla.org/logviewer.
> > html#?repo=autoland&job_id=76671296&lineNumber=4360. I understand why prefs
> > are used to configure telemetry-dummy and expect that to fail if attempted,
> > but shouldn't such attempts be disabled during most tests?
> 
> Seeing these logs lines in Mochitests is normal, as Telemetry shouldn't be
> disabled in these tests: we want to make sure that Telemetry behaves
> correctly when running along with other components. I agree that these log
> lines can be misleading and clutter the log, so we might think of disabling
> them in another bug.

I agree, that is separate from the test issue here.
Let's file a separate bug to consider just disabling Telemetry upload in the tests.

> The problem of this test seems to be around here:
> http://searchfox.org/mozilla-central/rev/
> 12cf11303392edac9f1da0c02e3d9ad2ecc8f4d3/browser/components/sessionstore/
> test/browser_windowStateContainer.js#48
> 
> For some reason, it gets stuck there and doesn't move on after the 3rd
> iteration, but Telemetry doesn't seem to be the culprit here :(

Agreed, someone from the changelog of the actual test might know more:
https://hg.mozilla.org/mozilla-central/log/tip/browser/components/sessionstore/test/browser_windowStateContainer.js

Either way, this doesn't seem related in any way to our team's work.
Flags: needinfo?(alessio.placitelli)
(In reply to Georg Fritzsche [:gfritzsche] from comment #33)
> (In reply to Alessio Placitelli [:Dexter] from comment #30)
> > > For example, see the error messages and backtraces starting at
> > > https://treeherder.mozilla.org/logviewer.
> > > html#?repo=autoland&job_id=76671296&lineNumber=4360. I understand why prefs
> > > are used to configure telemetry-dummy and expect that to fail if attempted,
> > > but shouldn't such attempts be disabled during most tests?
> > 
> > Seeing these logs lines in Mochitests is normal, as Telemetry shouldn't be
> > disabled in these tests: we want to make sure that Telemetry behaves
> > correctly when running along with other components. I agree that these log
> > lines can be misleading and clutter the log, so we might think of disabling
> > them in another bug.
> 
> I agree, that is separate from the test issue here.
> Let's file a separate bug to consider just disabling Telemetry upload in the
> tests.

Filed bug 1343243.
I'll look at this, however I still need a few days (or even weeks) to start working on this.
Assignee: nobody → allstars.chh
Flags: needinfo?(allstars.chh)
Luckily this dropped a bit last week; I suspect that just reflects which jobs were run more frequently.
Whiteboard: [stockwell needswork] → [stockwell unknown]
Checking in here: it has been a couple of weeks and this bug still fails at the same rate (slightly below our radar for pushing for a fix or temporarily disabling), primarily on osx opt and win7-pgo. It would be nice to pick this up when we have a little time to hack on it.
Summary: Intermittent browser_windowStateContainer.js | Test timed out → Intermittent browser/components/sessionstore/test/browser_windowStateContainer.js | Test timed out
Attached patch Disable the test on win and mac. (obsolete) — Splinter Review
I still don't have enough time to investigate this in detail, so I'd like to disable it on Windows and Mac first.
Attachment #8858708 - Flags: review?(mdeboer)
If this fails only a few times a week (as in the last 3 weeks), I don't see an urgent need to disable it on Windows/Mac.
Assignee: allstars.chh → nobody

Mike,
This is failing quite frequently on Windows 10 and Windows 10 QR, both opt; there have been 29 failures in the last 7 days.

Could you please take a look at it?

Flags: needinfo?(mdeboer)

Update:
There have been 31 failures within the last 7 days:

  • 9 failures on Windows 10 x64 opt
  • 22 failures on Windows 10 x64 QuantumRender opt

Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=278174684&repo=mozilla-central&lineNumber=6263

Whiteboard: [stockwell unknown] → [stockwell needswork]

In the last 7 days there have been 27 occurrences on Windows 10 64 build type opt.

This is only failing on fission windows10-64/qr opt. Neha, could you please redirect this to someone to take a look? Thank you.
It has 131 total failures in the last 30 days: https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2019-11-08&endday=2019-12-08&tree=trunk&bug=1279821
Recent failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=268866721&repo=mozilla-central

Fission Milestone: --- → ?
Flags: needinfo?(nkochar)

browser/components/sessionstore/test/browser_windowStateContainer.js is timing out on fission windows10-64/qr opt. Andrew, can you look into the cause of this?

Flags: needinfo?(nkochar) → needinfo?(continuation)
Depends on: 1602609

Given that it is disabled entirely on debug Fission and opt Linux, I think it isn't too unreasonable to disable it also on opt Fission on Windows. I filed bug 1602609 for that. There's nothing obviously going wrong in the logs. Some of the logs have Activity Stream throwing an error "TypeError: NetworkError when attempting to fetch resource" when trying to take a screenshot, but not all of them. I would guess that Activity Stream just happens to run sometimes when the browser is idle, because of whatever is causing this to hang.

Flags: needinfo?(continuation)
Fission Milestone: ? → M5
Fission Milestone: M5 → M4.1

Mike, this bug is a blocker for Fission's current milestone (M4.1 aka "fix all the mochitests"), but it's currently unassigned. The Fission team is hoping teams will fix their mochitests for Fission before the end of Q1 (75 or 76 Nightly).

Will your team be able to prioritize this bug for Q1? If you don't think this mochitest failure should block shipping Fission, just let me know.

If you have questions for Fission engineers, you can reach them in the #fission channel on Slack or Riot.
