AddressSanitizer isn't using symbols on at least autoland today





(Reporter: KWierso, Unassigned)





I'm trying to figure out what's causing bug 1395422, but it's made more difficult because the symbolizer doesn't appear to be working.

Backscroll from #developers:
15:29:32 <RyanVM> mccr8: is concerning
15:29:45 <RyanVM> mccr8: "==1596==WARNING: Failed to use and restart external symbolizer!"
15:30:13 <RyanVM> we think we know what push is causing the leaks - *really* hoping that whatever's causing the leaks is what's breaking LSAN too, though...
15:30:41 <mccr8> RyanVM: well, that sounds like an existing intermittent
15:30:54 <RyanVM> this is consistent
15:31:06 <RyanVM> (trying to track down a leak on autoland and every instance is hitting it)
15:31:12 <mccr8> yeah, "} else for" is really terrifying...
15:31:23 <mccr8> RyanVM: Ah. Well, a big leak could certainly cause that.
15:31:37 <RyanVM>  GECKO(2492) | ==2649==ERROR: AddressSanitizer failed to allocate 0x22000 (139264) bytes of LargeMmapAllocator (error code: 12) 
15:31:40 <RyanVM> from another log
15:31:49 <mccr8> RyanVM: the theory is that the symbolizer uses a ton of memory, so if you don't have much free memory then it fails.
15:32:01 <mccr8> because it has to load in the whole Firefox binary, or something.
15:32:09 <RyanVM> interesting, we upgraded the instances the devtools asan runs are on too IIRC
15:32:35 <mccr8> yeah, I think that helped...
15:32:37 <RyanVM> welp, I feel bad for the dev that needs to hunt down his leaks after he gets backed out :P
15:41:01 <mccr8> Hmm the leak itself is very small so I can't see it causing the symbolizer to break. That is unfortunate....
15:50:02 <%KWierso> mccr8: yeah, only a few kb as far as I can see...
16:29:58 <%KWierso> mccr8: hrm, still happening on the backout of the main suspect
16:31:50 <mccr8> KWierso: So, I think that the real problem is the OOM, not the leak. If that makes sense.
16:32:11 <mccr8> KWierso: When we can't run the symbolizer, then the leak white list does not work. 
16:32:40 <RyanVM|bbl> KWierso: i still think the backout was justified given the various netmonitor timeouts it was causing :P
16:32:46 <RyanVM|bbl> but still, boooo
16:32:49 <mccr8> because it matches against the stack, and obviously isn't something in the list.
16:33:04 <mccr8> ==1596==WARNING: failed to fork (errno 12)
16:33:07 <mccr8> I see a lot of that.
16:33:31 <%KWierso> really wish there wasn't hours of build failures...
16:34:16 <mccr8> I see "GECKO(1442) | Completed ShutdownLeaks collections in process 1596" but no earlier references to process pid 1596, so I'm not sure what that process is, or what it is doing...
16:34:27 <%KWierso> and the history rewriting makes it harder to tell what started when
16:34:49 <%KWierso> I guess I could just back out anything touching devtools as the next guess?
16:35:51 <mccr8> yeah. something that deals with preallocated processes might be suspect too....
16:36:10 <mccr8> I don't remember anything like that landing today but I could be wrong.
16:37:21 <mccr8> KWierso: which branch are the failures on?
16:37:28 <%KWierso> autoland
16:37:34 <%KWierso>
16:37:50 <%KWierso> YMMV looking at these, since the chunks are probably shuffling tests around at some point
16:37:57 <mccr8> Ah, ok. I saw something for preallocated processes, but that's in inbound.
16:40:02 <%KWierso> mccr8: how about ?
16:40:12 <mccr8> Bah, I don't see any of these pids in the log, so I guess that's just a red herring theory.
16:41:08 <mccr8> KWierso: that does sound a little scary, and the initial bug talks about devtools, so I guess it is worth a shot...
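
To illustrate mccr8's point that the leak whitelist stops working once the symbolizer can't run: the sketch below is a simplified model, not LSan's actual matcher. It assumes suppressions are plain substrings matched against each stack frame's text. The suppression entries, function names, and module paths are all hypothetical and are not taken from the real Mozilla suppression list or from these logs.

```python
# Simplified model of suppression matching against a leak stack.
# Assumption: a suppression "leak:<pattern>" matches if <pattern> appears
# as a substring of any frame in the reported stack.

# Hypothetical suppression entries (not the real list).
SUPPRESSIONS = ["leak:libfontconfig.so", "leak:KnownLeakyInit"]


def is_suppressed(stack, suppressions):
    """Return True if any frame in the stack matches any leak suppression."""
    patterns = [s.split(":", 1)[1] for s in suppressions if s.startswith("leak:")]
    return any(p in frame for frame in stack for p in patterns)


# Stack as it would look with a working symbolizer (hypothetical frames).
symbolized = [
    "#0 0x4a1b2c in malloc",
    "#1 0x7f0012 in KnownLeakyInit() foo.cpp:42",
]

# Same stack when the symbolizer could not be started: only module+offset
# is available, so a function-name suppression has nothing to match.
unsymbolized = [
    "#0 0x4a1b2c (/hypothetical/path/firefox+0xa1b2c)",
    "#1 0x7f0012 (/hypothetical/path/libxul.so+0x7f0012)",
]

print(is_suppressed(symbolized, SUPPRESSIONS))    # leak is whitelisted
print(is_suppressed(unsymbolized, SUPPRESSIONS))  # leak gets reported
```

Under this model, a leak that is normally whitelisted by function name is reported as new whenever the symbolizer fails to start, which would be consistent with failures persisting after the main suspect was backed out.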
Blocks: 1245527
Priority: -- → P3