1734020 - (tv-chaosmode-timeout-wpt) web-platform tests can time out in test-verify because of delays from MOZ_CHAOSMODE=0xfb

Reporter

Description

3 years ago

Initially spotted in https://bugzilla.mozilla.org/show_bug.cgi?id=1733717. The test failing in test-verify there takes too long in chaos mode and times out. A normal run of the test takes between 10 and 20 seconds, but in chaos mode it takes more than 45 seconds. I see a similar ratio locally when comparing runs with and without MOZ_CHAOSMODE.

In Bug 1390884, MOZ_CHAOSMODE was initially set to MOZ_CHAOSMODE=3 for test-verify, and one year ago it was updated to MOZ_CHAOSMODE=0xfb. With 3, test performance was very close to what we have in normal mode, but with 0xfb, adding tabs, for example, is much slower.

For instance, a simple test adding 100 tabs takes 25 seconds with MOZ_CHAOSMODE=0xfb and 6 seconds in normal mode (or with MOZ_CHAOSMODE=3).

Looking at the list of features of ChaosMode, a few are especially about delaying things:

enum ChaosFeature {
  None = 0x0,
  // Altering thread scheduling.
  ThreadScheduling = 0x1,
  // Altering network request scheduling.
  NetworkScheduling = 0x2,
  // Altering timer scheduling.
  TimerScheduling = 0x4,
  // Read and write less-than-requested amounts.
  IOAmounts = 0x8,
  // Iterate over hash tables in random order.
  HashTableIteration = 0x10,
  // Randomly refuse to use cached version of image (when allowed by spec).
  ImageCache = 0x20,
  // Delay dispatching threads to encourage dispatched tasks to run.
  TaskDispatching = 0x40,
  // Delay task running to encourage sending threads to run.
  TaskRunning = 0x80,
  Any = 0xffffffff,
};

So I imagine it's not too surprising that tests can time out more easily in chaos mode. Maybe we should increase the default timeout for tests running in chaos mode then? Note that 0xfb clears only bit 0x4, so with MOZ_CHAOSMODE=0xfb all chaos features except TimerScheduling are enabled.
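To make the mask easier to read, here is a minimal sketch that decodes a MOZ_CHAOSMODE value against the enum quoted above. The bit values are copied from that enum; `NamedFeature`, `kFeatures`, `FeaturesWhere` and `CheckChaosMasks` are hypothetical names used only for illustration and are not Gecko APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Bit values copied from the ChaosFeature enum quoted above.
struct NamedFeature {
  uint32_t bit;
  const char* name;
};

static const NamedFeature kFeatures[] = {
    {0x1, "ThreadScheduling"},    {0x2, "NetworkScheduling"},
    {0x4, "TimerScheduling"},     {0x8, "IOAmounts"},
    {0x10, "HashTableIteration"}, {0x20, "ImageCache"},
    {0x40, "TaskDispatching"},    {0x80, "TaskRunning"},
};

// Hypothetical helper (not a Gecko API): returns the names of the features
// whose bit is set in `mask` when `wantSet` is true, or cleared when false.
std::vector<std::string> FeaturesWhere(uint32_t mask, bool wantSet) {
  std::vector<std::string> names;
  for (const NamedFeature& f : kFeatures) {
    const bool isSet = (mask & f.bit) != 0;
    if (isSet == wantSet) names.push_back(f.name);
  }
  return names;
}

// Example checks; call from main() or from a test.
void CheckChaosMasks() {
  // 0xfb is 0b11111011: only bit 0x4 is cleared.
  assert((FeaturesWhere(0xfb, false) ==
          std::vector<std::string>{"TimerScheduling"}));
  // 0x3 sets only the two lowest bits.
  assert(FeaturesWhere(0x3, true).size() == 2);
}
```

Since 0xfb is 0b11111011 in binary, the only named bit it leaves cleared is 0x4, while 0x3 sets just 0x1 and 0x2.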

Overall, this leads to false positives from test-verify. It would be great to avoid them so that we can trust the reports from test-verify.

Julian Descottes [:jdescottes]

Reporter

Comment 1

3 years ago

Attached file: Chaos mode performance difference

Geoff Brown [:gbrown]

Comment 2

3 years ago

Thanks Julian. I was aware of performance differences with chaos mode, but I hadn't seen such big differences and didn't know they were problematic. I agree with the idea of increasing the timeout, perhaps like this:

https://searchfox.org/mozilla-central/rev/9bc5dcea99c59dc18eae0de7064131aa20cfbb66/testing/mochitest/runtests.py#2133
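The idea of extending the timeout only when chaos mode is delaying work could be sketched as follows. This is an illustration and not the harness code: the actual harness is Python, and `ParseChaosMode`, `EffectiveTimeoutSeconds`, `kDelayingFeatures`, the hexadecimal parsing and the multiplier are all assumptions. The mask covers only the two features whose enum comments say "Delay" (TaskDispatching and TaskRunning).

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Assumption: only the features documented as delaying work extend the
// timeout. TaskDispatching = 0x40, TaskRunning = 0x80 in the quoted enum.
constexpr uint32_t kDelayingFeatures = 0x40 | 0x80;

// Hypothetical parser: treats the MOZ_CHAOSMODE value as hexadecimal, with
// an optional "0x" prefix. Returns 0 (no chaos features) for null or empty.
uint32_t ParseChaosMode(const char* value) {
  if (!value || !*value) return 0;
  return static_cast<uint32_t>(std::strtoul(value, nullptr, 16));
}

// Hypothetical policy: multiply the base timeout by `factor` when any
// delaying feature is enabled, otherwise leave it unchanged.
int EffectiveTimeoutSeconds(int baseSeconds, uint32_t chaosMask, int factor) {
  return (chaosMask & kDelayingFeatures) ? baseSeconds * factor : baseSeconds;
}

// Example checks; call from main() or from a test.
void CheckTimeouts() {
  assert(EffectiveTimeoutSeconds(10, ParseChaosMode("0xfb"), 4) == 40);
  assert(EffectiveTimeoutSeconds(10, ParseChaosMode("3"), 4) == 10);
}
```

With a factor of 4, which is a placeholder roughly matching the 25 second versus 6 second measurement in the description, a 10 second timeout becomes 40 seconds under 0xfb and stays at 10 seconds under 3.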

Chaos mode performance difference | 3 years ago | Julian Descottes [:jdescottes] | 48 bytes, text/x-phabricator-request
Bug 1734020 - extend timeout for test-verify chaosmode mode. r=gbrown! | 2 years ago | Joel Maher (:jmaher) (UTC -8) | 48 bytes, text/x-phabricator-request