AddressSanitizer isn't using symbols on at least autoland today

NEW
Unassigned

Status

defect
P3
normal
2 years ago
Last year

People

(Reporter: KWierso, Unassigned)

Tracking

Version 3

Details

I'm trying to figure out what's causing bug 1395422, but it's made more difficult because the symbolizer doesn't appear to be working.

Backscroll from #developers:
15:29:32 <RyanVM> mccr8: https://public-artifacts.taskcluster.net/aTs05-UsTpO16aCmgs5pbQ/0/public/logs/live_backing.log is concerning
15:29:45 <RyanVM> mccr8: "==1596==WARNING: Failed to use and restart external symbolizer!"
15:30:13 <RyanVM> we think we know what push is causing the leaks - *really* hoping that whatever's causing the leaks is what's breaking LSAN too, though...
15:30:41 <mccr8> RyanVM: well, that sounds like an existing intermittent
15:30:54 <RyanVM> this is consistent
15:31:06 <RyanVM> (trying to track down a leak on autoland and every instance is hitting it)
15:31:12 <mccr8> yeah, "} else for" is really terrifying...
15:31:23 <mccr8> RyanVM: Ah. Well, a big leak could certainly cause that.
15:31:37 <RyanVM>  GECKO(2492) | ==2649==ERROR: AddressSanitizer failed to allocate 0x22000 (139264) bytes of LargeMmapAllocator (error code: 12) 
15:31:40 <RyanVM> from another log
15:31:49 <mccr8> RyanVM: the theory is that the symbolizer uses a ton of memory, so if you don't have much free memory then it fails.
15:32:01 <mccr8> because it has to load in the whole Firefox binary, or something.
15:32:09 <RyanVM> interesting, we upgraded the instances the devtools asan runs are on too IIRC
15:32:35 <mccr8> yeah, I think that helped...
15:32:37 <RyanVM> welp, I feel bad for the dev that needs to hunt down his leaks after he gets backed out :P
15:41:01 <mccr8> Hmm the leak itself is very small so I can't see it causing the symbolizer to break. That is unfortunate....
15:50:02 <%KWierso> mccr8: yeah, only a few kb as far as I can see...
16:29:58 <%KWierso> mccr8: hrm, still happening on the backout of the main suspect
16:31:50 <mccr8> KWierso: So, I think that the real problem is the OOM, not the leak. If that makes sense.
16:32:11 <mccr8> KWierso: When we can't run the symbolizer, then the leak white list does not work. 
16:32:40 <RyanVM|bbl> KWierso: i still think the backout was justified given the various netmonitor timeouts it was causing :P
16:32:46 <RyanVM|bbl> but still, boooo
16:32:49 <mccr8> because it matches against the stack, and obviously libxul.so isn't something in the list.
16:33:04 <mccr8> ==1596==WARNING: failed to fork (errno 12)
16:33:07 <mccr8> I see a lot of that.
16:33:31 <%KWierso> really wish there wasn't hours of build failures...
16:34:16 <mccr8> I see "GECKO(1442) | Completed ShutdownLeaks collections in process 1596" but no earlier references to process pid 1596, so I'm not sure what that process is, or what it is doing...
16:34:27 <%KWierso> and the history rewriting makes it harder to tell what started when
16:34:49 <%KWierso> I guess I could just back out anything touching devtools as the next guess?
16:35:51 <mccr8> yeah. something that deals with preallocated processes might be suspect too....
16:36:10 <mccr8> I don't remember anything like that landing today but I could be wrong.
16:37:21 <mccr8> KWierso: which branch are the failures on?
16:37:28 <%KWierso> autoland
16:37:34 <%KWierso> https://treeherder.mozilla.org/#/jobs?repo=autoland&fromchange=32607ab7ecb69318f7c98d2a0b7428dbcfb89793&noautoclassify&filter-searchStr=asan%20dt&group_state=expanded
16:37:50 <%KWierso> YMMV looking at these, since the chunks are probably shuffling tests around at some point
16:37:57 <mccr8> Ah, ok. I saw something for preallocated processes, but that's in inbound.
16:40:02 <%KWierso> mccr8: how about https://hg.mozilla.org/integration/autoland/rev/ce0752c07ff698c8dd7c94928e5160812318edfd ?
16:40:12 <mccr8> Bah, I don't see any of these pids in the log, so I guess that's just a red herring theory.
16:41:08 <mccr8> KWierso: that does sound a little scary, and the initial bug talks about devtools, so I guess it is worth a shot...
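
For anyone triaging similar logs, here is a rough sketch of the reasoning in the backscroll: the out-of-memory messages (error code 12) stop the symbolizer from running, and without symbolized frames a name-based leak allowlist has nothing to match against. This is a simplified model and not how LSan actually implements suppression matching. The frame strings, the path, and the suppression name `HypotheticalAllowedLeak` are made up for illustration; only the marker strings are taken from the log above.

```python
import errno

# Marker strings copied from the log excerpts quoted above.
OOM_MARKERS = (
    "Failed to use and restart external symbolizer",
    "failed to fork (errno 12)",
    "failed to allocate",
)


def find_oom_markers(log_lines):
    """Return the log lines that contain one of the markers above."""
    return [line for line in log_lines if any(m in line for m in OOM_MARKERS)]


def is_suppressed(stack_frames, suppressions):
    """Simplified model (an assumption, not LSan's real logic): a leak is
    suppressed if any pattern is a substring of any frame in its stack."""
    return any(p in frame for frame in stack_frames for p in suppressions)


# Hypothetical allowlist entry and stacks, for illustration only.
suppressions = ["HypotheticalAllowedLeak"]

# With a working symbolizer, frames carry function names.
symbolized = [
    "#0 0x7f00 in malloc",
    "#1 0x7f10 in HypotheticalAllowedLeak foo.cpp:12",
]

# Without it, frames only carry module + offset, so name patterns never match.
unsymbolized = [
    "#0 0x7f00 (/path/to/libxul.so+0x1234)",
    "#1 0x7f10 (/path/to/libxul.so+0x5678)",
]

if __name__ == "__main__":
    # On Linux, error code 12 maps to ENOMEM.
    print("errno 12 is", errno.errorcode.get(12, "unknown"))
    print("symbolized stack suppressed:", is_suppressed(symbolized, suppressions))
    print("unsymbolized stack suppressed:", is_suppressed(unsymbolized, suppressions))
```

If this model is roughly right, every previously allowlisted leak would start being reported once the symbolizer fails, which would fit the failures persisting after the backout of the main suspect.
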
Blocks: 1245527
Priority: -- → P3