Open Bug 2034815 Opened 1 month ago Updated 22 hours ago

Multiple wpt leaks caused by wpt-sync

Categories

(Core :: General, defect)

Tracking Status
firefox-esr140 --- unaffected
firefox150 --- unaffected
firefox151 --- unaffected
firefox152 --- wontfix

People

(Reporter: amarc, Unassigned, NeedInfo)

References

(Depends on 1 open bug, Regression)

Details

(Keywords: intermittent-failure, regression, Whiteboard: [stockwell needswork:owner])

Flags: needinfo?(aborovova)
Keywords: regression
Regressed by: 2033390
Duplicate of this bug: 2034658
Duplicate of this bug: 2034460
Duplicate of this bug: 2034772
Duplicate of this bug: 2034742
Duplicate of this bug: 2034768

Set release status flags based on info from the regressing bug 2033390

Component: web-platform-tests → DOM: Core & HTML
Product: Testing → Core

Andrew, could you check what started these leaks or redirect the request? Thank you in advance.

Flags: needinfo?(continuation)

Hmm, maybe the changes to split out leakcheck large by directory somehow broke leak-threshold? That's the only recent change I can think of.

I looked at one random log, which was failing in /css/css-pseudo/ with what looks like 8912 bytes of leaks. Hmm, I was going to say it has a leak-threshold value, but I see now that's for a tab process and this leak is in the default process.

Come to think of it, I did just file bug 2032137 about how somebody noticed that we were getting this in every WPT:

INFO Browser exited with return code -15
WARNING Firefox didn't exit cleanly, not processing leak logs

which was effectively disabling all leak checking in WPTs. Maybe whatever was causing those processes to crash in shutdown magically fixed itself and now we're getting months of accumulated leaks showing up again.
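A minimal sketch of how one might check a log for this condition, assuming the two messages appear verbatim as quoted above. The function name and the idea of feeding it a list of log lines are hypothetical and not part of wptrunner.

```python
import re

# Messages quoted in the comment above. The surrounding log format
# (prefixes, timestamps) is an assumption, so we only search for substrings.
EXIT_RE = re.compile(r"Browser exited with return code (-?\d+)")
SKIPPED_MSG = "Firefox didn't exit cleanly, not processing leak logs"


def leak_check_skipped(log_lines):
    """Return (skipped, return_codes) for one log.

    skipped is True if the "not processing leak logs" warning was seen;
    return_codes lists every browser exit code found in the log.
    """
    return_codes = []
    skipped = False
    for line in log_lines:
        match = EXIT_RE.search(line)
        if match:
            return_codes.append(int(match.group(1)))
        if SKIPPED_MSG in line:
            skipped = True
    return skipped, return_codes


sample = [
    "INFO Browser exited with return code -15",
    "WARNING Firefox didn't exit cleanly, not processing leak logs",
]
print(leak_check_skipped(sample))
```

A log where the warning shows up is one where leak results were never evaluated, so a "green" run of that kind says nothing about leaks.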

Flags: needinfo?(continuation) → needinfo?(james)
See Also: → 2032137

It's not so much that it magically fixed itself, as that I fixed it upstream but apparently my prior try run in m-c didn't correctly anticipate the extent of the problems it would cause.

So yes, I think that this is "just" the metadata being very out of date and leaks not being deterministic enough for the wptsync to add all the necessary annotations in a single pass.

Flags: needinfo?(james)

Does wptsync not pick up those leaks? There was a sync e.g. on May 5.

Flags: needinfo?(james)

It's definitely supposed to. https://hg-edge.mozilla.org/mozilla-central/rev/03ab982e43ac has some leak metadata updates, but not many. https://treeherder.mozilla.org/jobs?repo=try&revision=d4c8752b5855ed9a72c112220d6464dbe8907a85 doesn't look like it has loads of leaks.

It's possible that the leaks are just intermittent enough that we don't catch them all on a single try push. Alternatively maybe we see more on central for some reason. We could try updating the metadata from a central push, if there's one that's otherwise green but shows a lot of leaks.
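To illustrate why a single push may not be enough, here is a rough sketch that unions the leaking frames seen across several runs and reports what any one run would have missed. The data shape (a dict of directory to set of frame names per run), the directory names, and both function names are hypothetical; this is not how wptsync stores or updates metadata.

```python
from collections import defaultdict


def merge_leak_observations(runs):
    """Union the leaking frames seen per test directory across several runs."""
    merged = defaultdict(set)
    for run in runs:
        for directory, frames in run.items():
            merged[directory].update(frames)
    return {directory: sorted(frames) for directory, frames in merged.items()}


def missed_by_single_run(runs, index=0):
    """Frames present in the merged view but absent from runs[index]."""
    merged = merge_leak_observations(runs)
    single = runs[index]
    missed = {}
    for directory, frames in merged.items():
        absent = [f for f in frames if f not in single.get(directory, ())]
        if absent:
            missed[directory] = absent
    return missed


# Made-up observations: the frame names come from this bug, the
# directories and which run saw what are purely illustrative.
runs = [
    {"/example/a/": {"changeTableSize"}},
    {"/example/a/": {"changeTableSize", "gkrust"},
     "/example/b/": {"nsGlobalWindowInner"}},
]
print(missed_by_single_run(runs, index=0))
```

With intermittent leaks, annotations derived from the first run alone would leave the frames reported here unannotated, which matches the pattern of failures reappearing after a sync.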

Flags: needinfo?(james)

The intermittent failures view for this bug shows 7-12 task failures for mozilla-central classified with this leak bug, with at least 3 kinds of leaks:

  • gkrust
  • changeTableSize
  • large nsGlobalWindowInner | dom

gkrust looks like it is generic Rust library code. It could be added to lsan-allowed in testing/web-platform/meta/__dir__.ini. That probably means we'd ignore a wide variety of Rust leaks, but it would at least hide the problem.

Depends on: 2038448
See Also: → 2038411
Flags: needinfo?(aborovova)

Maybe you have looked into some of this recently?

(But this doesn't seem to be specific to DOM: Core & HTML.)

Component: DOM: Core & HTML → General
Flags: needinfo?(florian)
Flags: needinfo?(dothayer)

Well, this bug is bad news. It looks like all WPT leaks are just being dumped in here now so there's no way to see what is going on.

Skimming through some recent failures, it looks like there are two major buckets of failures.

The first is the gkrust leaks, which look like "leak at gkrust.87458e216685fe30-cgu.0". I opened 4 or 5 of these, and they are all nsHttpChannel::ParseDictionary leaking things via urlpattern. I suspect the issue here is that the auto ignorer keeps adding entries, but then the build changes or whatever, that weird number changes, and we get the failure again. It looks like there are two specific directories where this is happening. I'll file a bug to clean up the lsan-allowed entries and also a bug to hopefully get networking people to investigate the leak.
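One way to see whether these failures really are the same leak under different build-specific names is to strip the hash part before bucketing. This is only a sketch: the regular expression assumes the frame always has the shape crate.hexhash-cgu.N seen in the one example above, the second hash in the sample data is made up, and nothing here reflects how wptrunner actually matches lsan-allowed entries.

```python
import re
from collections import Counter

# Assumed shape, based on the single example in this bug:
#   <crate>.<hex hash>-cgu.<number>
CGU_RE = re.compile(r"^(?P<crate>[A-Za-z_][A-Za-z0-9_]*)\.[0-9a-f]+-cgu\.\d+$")


def normalize_frame(frame):
    """Drop the build-specific hash and codegen-unit suffix, if present."""
    match = CGU_RE.match(frame)
    return match.group("crate") if match else frame


def bucket_frames(frames):
    """Count leak frames after normalization."""
    return Counter(normalize_frame(frame) for frame in frames)


frames = [
    "gkrust.87458e216685fe30-cgu.0",  # from this bug
    "gkrust.0123456789abcdef-cgu.0",  # hypothetical hash from another build
    "changeTableSize",
]
print(bucket_frames(frames))
```

If the normalized counts collapse into a single gkrust bucket, that supports the theory that the annotations go stale whenever the hash changes.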

The second is where a ton of different processes have the "missing output line for total leaks!" error message. I guess this is for regular leakcheck. I'm not sure what is going on here or how to address this, but they appear to all be wdspec WPTs, whatever that is.

Depends on: 2044021
Depends on: 2044025
No longer depends on: 2044021
Depends on: 2034558