Bug 1734020 Comment 20 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #17)
> (In reply to Daniel Holbert [:dholbert] from comment #7)
> > Reading the notes in comment 0 here... I wonder if the "Altering thread scheduling", "Delay dispatching threads", and "Delay task running" items there might cause **particular slowness for style-system tests**, since our style system is multi-threaded.
> > 
> > That would explain why `outline-valid-mandatory.html`'s seemingly-benign calls to `test_valid_value()` (which resolves some CSS) would end up being 50-100x slower with `MOZ_CHAOSMODE=0xfb`.
> 
> Emilio, would you have some feedback regarding this possible idea?

I spoke to him & he agrees. Essentially all this means is that this bug will be particularly bad for anything that uses multiple threads. I see three options to address it:

(1) Turn off Stylo's multi-threading just for this test-verify task (run with `STYLO_THREADS=1`).
(I'm including this option for completeness, but really this would be silly; it would defeat the purpose of the task in the first place. Also, it wouldn't help other components such as WebRTC, where folks are hitting this a lot too, per karlt's note in bug 1786445 comment 2.)

(2) Use a less-aggressive chaos mode that doesn't cause as much of a perf impact. Comment 1 here suggests that `MOZ_CHAOSMODE=3` had less overhead. But of course it wouldn't exercise edge cases as well & wouldn't catch as many real bugs/failures.

(3) Use a longer timeout when we're running under chaos mode as part of this task (higher than 10 seconds).  Anecdotally, I see tests take 50-100x as long when run under this configuration (see bug 1787569 comment 6), so even increasing the timeout by 10x here would be conservative.

Option 3 seems like the best outcome here (maintaining our strictness while avoiding false-positive timeouts). I think it'd be a test harness change in our test-verify setup somewhere.
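
To make option 3 concrete, here's a rough sketch of what scaling the timeout could look like. This is not the actual test-verify code: `effective_timeout`, `BASE_TIMEOUT_S`, and `CHAOS_TIMEOUT_FACTOR` are hypothetical names, the 10x factor is just the conservative figure mentioned above, and parsing `MOZ_CHAOSMODE` as a hex bitmask is an assumption on my part.

```python
import os

# Hypothetical values: the 10-second base timeout and the 10x factor are taken
# from the discussion above, not from the real harness.
BASE_TIMEOUT_S = 10.0
CHAOS_TIMEOUT_FACTOR = 10.0


def effective_timeout(base_s=BASE_TIMEOUT_S, env=None, factor=CHAOS_TIMEOUT_FACTOR):
    """Return the per-test timeout, scaled up if chaos mode appears to be on.

    Assumption: MOZ_CHAOSMODE holds a hex bitmask (e.g. "0xfb" or "3"), and any
    non-zero value means some chaos feature is enabled.
    """
    env = os.environ if env is None else env
    raw = env.get("MOZ_CHAOSMODE", "").strip()
    if not raw:
        return base_s  # chaos mode not requested
    try:
        mask = int(raw, 16)  # accepts both "fb" and "0xfb"
    except ValueError:
        return base_s  # unparseable value: leave the timeout unchanged
    return base_s * factor if mask != 0 else base_s
```

With this sketch, a run with `MOZ_CHAOSMODE=0xfb` would get a 100-second timeout, and a normal run would keep the 10-second one.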

whimboo, do you know if this is possible & who we might point at that?
