Open Bug 1395424 Opened 3 years ago Updated 9 months ago
Sanitizer isn't using symbols
I'm trying to figure out what's causing bug 1395422, but it's made more difficult because the symbolizer doesn't appear to be working.

Backscroll from #developers:

15:29:32 <RyanVM> mccr8: https://public-artifacts.taskcluster.net/aTs05-UsTpO16aCmgs5pbQ/0/public/logs/live_backing.log is concerning
15:29:45 <RyanVM> mccr8: "==1596==WARNING: Failed to use and restart external symbolizer!"
15:30:13 <RyanVM> we think we know what push is causing the leaks - *really* hoping that whatever's causing the leaks is what's breaking LSAN too, though...
15:30:41 <mccr8> RyanVM: well, that sounds like an existing intermittent
15:30:54 <RyanVM> this is consistent
15:31:06 <RyanVM> (trying to track down a leak on autoland and every instance is hitting it)
15:31:12 <mccr8> yeah, "} else for" is really terrifying...
15:31:23 <mccr8> RyanVM: Ah. Well, a big leak could certainly cause that.
15:31:37 <RyanVM> GECKO(2492) | ==2649==ERROR: AddressSanitizer failed to allocate 0x22000 (139264) bytes of LargeMmapAllocator (error code: 12)
15:31:40 <RyanVM> from another log
15:31:49 <mccr8> RyanVM: the theory is that the symbolizer uses a ton of memory, so if you don't have much free memory then it fails.
15:32:01 <mccr8> because it has to load in the whole Firefox binary, or something.
15:32:09 <RyanVM> interesting, we upgraded the instances the devtools asan runs are on too IIRC
15:32:35 <mccr8> yeah, I think that helped...
15:32:37 <RyanVM> welp, I feel bad for the dev that needs to hunt down his leaks after he gets backed out :P
15:41:01 <mccr8> Hmm the leak itself is very small so I can't see it causing the symbolizer to break. That is unfortunate....
15:50:02 <%KWierso> mccr8: yeah, only a few kb as far as I can see...
16:29:58 <%KWierso> mccr8: hrm, still happening on the backout of the main suspect
16:31:50 <mccr8> KWierso: So, I think that the real problem is the OOM, not the leak. If that makes sense.
16:32:11 <mccr8> KWierso: When we can't run the symbolizer, then the leak white list does not work.
16:32:40 <RyanVM|bbl> KWierso: i still think the backout was justified given the various netmonitor timeouts it was causing :P
16:32:46 <RyanVM|bbl> but still, boooo
16:32:49 <mccr8> because it matches against the stack, and obviously libxul.so isn't something in the list.
16:33:04 <mccr8> ==1596==WARNING: failed to fork (errno 12)
16:33:07 <mccr8> I see a lot of that.
16:33:31 <%KWierso> really wish there wasn't hours of build failures...
16:34:16 <mccr8> I see "GECKO(1442) | Completed ShutdownLeaks collections in process 1596" but no earlier references to process pid 1596, so I'm not sure what that process is, or what it is doing...
16:34:27 <%KWierso> and the history rewriting makes it harder to tell what started when
16:34:49 <%KWierso> I guess I could just back out anything touching devtools as the next guess?
16:35:51 <mccr8> yeah. something that deals with preallocated processes might be suspect too....
16:36:10 <mccr8> I don't remember anything like that landing today but I could be wrong.
16:37:21 <mccr8> KWierso: which branch are the failures on?
16:37:28 <%KWierso> autoland
16:37:34 <%KWierso> https://treeherder.mozilla.org/#/jobs?repo=autoland&fromchange=32607ab7ecb69318f7c98d2a0b7428dbcfb89793&noautoclassify&filter-searchStr=asan%20dt&group_state=expanded
16:37:50 <%KWierso> YMMV looking at these, since the chunks are probably shuffling tests around at some point
16:37:57 <mccr8> Ah, ok. I saw something for preallocated processes, but that's in inbound.
16:40:02 <%KWierso> mccr8: how about https://hg.mozilla.org/integration/autoland/rev/ce0752c07ff698c8dd7c94928e5160812318edfd ?
16:40:12 <mccr8> Bah, I don't see any of these pids in the log, so I guess that's just a red herring theory.
16:41:08 <mccr8> KWierso: that does sound a little scary, and the initial bug talks about devtools, so I guess it is worth a shot...
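
To illustrate mccr8's point that the leak whitelist stops working when the symbolizer can't run: the sketch below is a deliberately simplified model of suppression matching, not the sanitizer's real matcher. The pattern name, frame strings, and the is_suppressed helper are all hypothetical. It only shows why a stack that names nothing but libxul.so cannot match an entry written against a function name.

```python
def is_suppressed(stack, patterns):
    """Return True if any pattern occurs as a substring of any frame.

    Simplified stand-in for suppression matching (assumption: the real
    matcher is more elaborate; this only models the substring idea).
    """
    return any(pattern in frame for frame in stack for pattern in patterns)


# Hypothetical suppression entry, for illustration only.
patterns = ["SomeKnownLeakyFunction"]

# Frames as they might look when the symbolizer works (made-up data).
symbolized = [
    "#0 malloc",
    "#1 SomeKnownLeakyFunction /src/example.cpp:42",
]

# Frames as they might look when the symbolizer fails: module + offset only.
unsymbolized = [
    "#0 (libxul.so+0x1234)",
    "#1 (libxul.so+0x5678)",
]

symbolized_hit = is_suppressed(symbolized, patterns)
unsymbolized_hit = is_suppressed(unsymbolized, patterns)
print(symbolized_hit, unsymbolized_hit)
```

Under this model the same leak is suppressed in the first case and reported in the second, which would be consistent with known, whitelisted leaks suddenly showing up once the symbolizer fails to fork.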
Summary: AddressSanitizer isn't using symbols on at least autoland today → AddressSanitizer isn't using symbols
Flags: needinfo?(continuation) → needinfo?(nfroyd)