Closed Bug 1511025 Opened 6 years ago Closed 5 years ago

Network error when trying to search for AppendElements function

Categories

(Webtools :: Searchfox, defect)

defect
Not set
normal

Tracking

(firefox71 fixed)

RESOLVED FIXED
Tracking Status
firefox71 --- fixed

People

(Reporter: bzbarsky, Assigned: asuth)

References

()

Details

Attachments

(2 files)

Steps to reproduce:

1) Load https://searchfox.org/mozilla-central/source/xpcom/ds/nsTArray.h#1743
2) Click on the "AppendElements" function name.
3) Select "Search for function nsTArray_Impl::AppendElements".

I get a network error, in both Firefox and Chrome.  It looks like we're generating a 500KB query string, for what it's worth.

If code ends up moving, line 1743 is:

  elem_type* AppendElements(nsTArray_Impl<Item, Allocator>&& aArray);
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #0)
> Steps to reproduce:
> 
> 1) Load https://searchfox.org/mozilla-central/source/xpcom/ds/nsTArray.h#1743
> 2) Click on the "AppendElements" function name.
> 3) Select "Search for function nsTArray_Impl::AppendElements".
> 
> I get a network error, in both Firefox and Chrome.  It looks like we're
> generating a 500KB query string, for what it's worth.

So this happens because we try to link to all the symbols of all the instantiations... We could try to identify and give an unique symbol name to generic functions, and then make all instantiations reference that symbol.

WDYT kats?
(In reply to Emilio Cobos Álvarez (:emilio) from comment #1)
> So this happens because we try to link to all the symbols of all the
> instantiations... We could try to identify and give an unique symbol name to
> generic functions, and then make all instantiations reference that symbol.
> 
> WDYT kats?

That sound reasonable enough. But I'm wondering if this is a regression from any recent changes, because I'm pretty sure I've looked up things like this before and it worked.

Also relatedly, click on the Release() call at https://searchfox.org/mozilla-central/source/mfbt/RefPtr.h#47 - this gives a giant list of all the template instantiations and is another case that maybe could be improved.
Hmm, https://github.com/mozsearch/mozsearch/pull/167/commits/189b701ac9c354a5e61fc3f93bebe219cf0a9986 is the only code that has touched this recently...

Looking at the code it doesn't seem we should be generating a single giant symbol, string, but a bunch of different links instead...
Assignee: nobody → emilio
Actually, this looks like breakage on the indexer side. Just searching for:

  symbol:_ZN13nsTArray_Impl13AppendElementEOT_RKSt9nothrow_t

Finds all the relevant concrete instantiations of that template correctly but looking at the generated ANALYSIS_DATA looks like we're generating a bunch of symbols for that line.

The only thing that should make us end up with a bunch of symbols there is findOverridenMethods:

  https://searchfox.org/mozilla-central/rev/f2028b4c38bff2a50ed6aa1763f6dc5ee62b0cc4/build/clang-plugin/mozsearch-plugin/MozsearchIndexer.cpp#643

Which I wouldn't expect it to return all the instantiations for a generic method, but maybe clang has changed behavior here or something? I doubt so...

The other possible culprit is the analysis merging code you wrote...
I'll take a look at the analysis data from taskcluster to try to pinpoint this.
Yeah, so this looks like a somewhat unintended regression from the merging code...

The unmerged nsTArray.h contains one entry per symbol, but the merged one merges all the symbols and we end up with that monstrosity :)

The reason this used to work is because when we're building the menu entries we discard all the previous entries with the same name, so we effectively only chose one of them...

Kats, any particular suggestion? I think the tricky bit is differentiating between the "has multiple symbols because the function has been overriden" vs. "has multiple symbols because we've merged them".

But I _think_ that distinction may not even matter, at least searching for any of the overrides of nsIFrame::BuildDisplayList, for example, like:

  symbol:_ZN8nsIFrame16BuildDisplayListEP20nsDisplayListBuilderRK16nsDisplayListSet

Returns all the uses. So we only really need a single symbol.

Kats, does that make sense? If so, I'll submit a patch just using the first symbol for the menu. The other bits probably do need the multiple symbols...
Flags: needinfo?(kats)
Hmm, that's not right, only the first symbol might work for AppendElements, but it might not for e.g. the Release() call in RefPtr... But that's a use not a definition...
Hmm, so before the merging code in https://github.com/mozsearch/mozsearch/pull/154 landed, we'd generate instead a bunch of "Go to definition of nsTArray_Impl::AppendElements" with different symbols.

An alternative would be to just generate one entry per symbol unconditionally, though that may not be great.

An (even better) alternative may be, I think, teach the C++ indexer to generate the specialized "pretty" name for the symbol.

Let me know and stop me if anything I'm saying makes no sense at all Kats :)
Somewhat blind attempt at getting better pretty names: https://github.com/emilio/mozsearch/compare/pretty-name
(In reply to Emilio Cobos Álvarez (:emilio) from comment #8)
> Hmm, so before the merging code in
> https://github.com/mozsearch/mozsearch/pull/154 landed, we'd generate
> instead a bunch of "Go to definition of nsTArray_Impl::AppendElements" with
> different symbols.

Are you sure about this? How did you determine this? Even if this was the case, this doesn't sound particularly desirable either.

> An alternative would be to just generate one entry per symbol
> unconditionally, though that may not be great.

That would be undesirable for cross-platform work, because you might end up with different symbols per platform.

> An (even better) alternative may be, I think, teach the C++ indexer to
> generate the specialized "pretty" name for the symbol.

This would effectively also result in a long menu of different AppendElements, which I think is not desirable. We should be able to have just one item that does the Right Thing. I need to think about it more though.

(In reply to Emilio Cobos Álvarez (:emilio) from comment #9)
> Somewhat blind attempt at getting better pretty names:
> https://github.com/emilio/mozsearch/compare/pretty-name

I noticed you kicked off a dev push for this, but changing MozsearchIndexer.cpp in a dev push has no effect, because it will download the data from the last taskcluster m-c push which won't have your change. You need to do a try push of searchfox jobs with the MozsearchIndexer.cpp change, and then a dev push with the mozilla-central/setup file modified to use the try results. Complicated, I know :(
Flags: needinfo?(kats)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #10)
> (In reply to Emilio Cobos Álvarez (:emilio) from comment #8)
> > Hmm, so before the merging code in
> > https://github.com/mozsearch/mozsearch/pull/154 landed, we'd generate
> > instead a bunch of "Go to definition of nsTArray_Impl::AppendElements" with
> > different symbols.
> 
> Are you sure about this? How did you determine this? Even if this was the
> case, this doesn't sound particularly desirable either.

I basically downloaded the latest indexer run from linux64 and:

  cat xpcom/ds/nsTArray.h | grep AppendElements

And then run merge-analysis with that file only and grep in the same way.

I recall seeing some long lists pointing to different symbols in the past, though I didn't verify it.

Also I concur it's not the best.

> > An alternative would be to just generate one entry per symbol
> > unconditionally, though that may not be great.
> 
> That would be undesirable for cross-platform work, because you might end up
> with different symbols per platform.

Hmm, that is true, let's scratch that.

> > An (even better) alternative may be, I think, teach the C++ indexer to
> > generate the specialized "pretty" name for the symbol.
> 
> This would effectively also result in a long menu of different
> AppendElements, which I think is not desirable. We should be able to have
> just one item that does the Right Thing. I need to think about it more
> though.

So I think there are two different cases, the "use" case (the `Release()` call you pointed above for example), and the definition / declaration case (the AppendElements call).

For the first you probably do need the long list, since they're different functions with different symbols, etc. I don't think there's much else you can do. It's not a virtual function, or anything, it's just a template that gets instantiated a gazillion times.

For the definition case I think we can just get the first symbol (of either the virtual function or the template function) and it'll just work.

> (In reply to Emilio Cobos Álvarez (:emilio) from comment #9)
> > Somewhat blind attempt at getting better pretty names:
> > https://github.com/emilio/mozsearch/compare/pretty-name
> 
> I noticed you kicked off a dev push for this, but changing
> MozsearchIndexer.cpp in a dev push has no effect, because it will download
> the data from the last taskcluster m-c push which won't have your change.
> You need to do a try push of searchfox jobs with the MozsearchIndexer.cpp
> change, and then a dev push with the mozilla-central/setup file modified to
> use the try results. Complicated, I know :(

Oh, buuh, I can give that a shot tomorrow, if you don't beat me to it.
So it's not clear to me what purpose is served by having all the different template instantiations getting listed at all. Taking one random example symbol:

_ZNK13nsTArray_ImplI10AutoTArrayI6RefPtrI21nsDOMMutationObserverELm4EE27nsTArrayInfallibleAllocatorE8ContainsERKT_

If I grep the entire linux analysis tarball, the only hits for this symbol are in xpcom/ds/nsTArray.h, despite the fact that there is a "use" site for this symbol (nsTArray<RefPtr<nsDOMMutationObserver>>::Contains()) at [1]. So it's currently not useful in terms of finding the calls specific to this instantiation, which is the only thing such a symbol might be useful for.

Therefore, if we just stop generating these template-instantiation analysis lines that would fix this bug and shouldn't impact anything. I'm not sure offhand where in MozsearchIndexer.cpp we generate these but it shouldn't be hard to track down. Does that sound reasonable?

[1] https://searchfox.org/mozilla-central/rev/49e78df13e7a505827a3a86daae9efdf827133c6/dom/base/nsDOMMutationObserver.cpp#841
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #12)
> Therefore, if we just stop generating these template-instantiation analysis
> lines that would fix this bug and shouldn't impact anything. I'm not sure
> offhand where in MozsearchIndexer.cpp we generate these but it shouldn't be
> hard to track down. Does that sound reasonable?

It sounds sensible. I saw you started to look at this, but I can do it if you want. Just let me know so we don't duplicate work :)
I wasn't able to make much progress due to family things so feel free to take it.
So, I took a look at this today, I have a test-case which reproduces the issue on the vagrant VM and is somewhat minimal (I managed to reduce it from an old bindgen input that I have lurking around in bindgen for testing[1]).

I managed to run the indexer in isolation and reproduce it there, but both rr and gdb fail in epic ways, so I think that means I should go to sleep for today.

Current progress is at https://github.com/emilio/mozsearch/tree/pretty-name, if you're curious.

[1]: https://github.com/rust-lang/rust-bindgen/blob/master/tests/stylo.hpp
Dang, gdb is even failing to debug its own core dump :(
Attached file Reduced test-case.
I spent some more time reducing this today, this should be workable, though I ran out of time for the day.
Attachment #9032613 - Attachment mime type: text/x-c++src → text/plain

Stealing with permission, I have some cycles to look into this. Turns out the analysis for nsTArray.h is so massive it's slowing down a lot of things so fixing this would help performance as well.

Assignee: emilio → kats

The WIP seems to remove some stuff which we want to remove, but not everything. And it removes some stuff that we don't want to remove, I think. So I need to fiddle with this a bit more.

I've been digging into this a bit using Emilio's excellent reduced test-case as I try to understand how templates work in searchfox a bit more.

I augmented the clang indexer to generate "trace" records. Emilio's test case has the following notable lines/definitions:

  • line 9: templated class nsTArray defined.
  • line 11: E* AppendElements(const Item* aArray, unsigned aArrayLen) with the empty return nullptr.
  • line 17: E* AppendElements(Span<const Item> aSpan) with the contents:
  • line 19: return AppendElements<Item>(aSpan.Elements(), aSpan.Length());

Ignoring the source/target record distinction, at line 11 we get 2 records emitted, both with the same pretty of nsTArray::AppendElements:

  1. The generic symbol _ZN8nsTArray14AppendElementsEPKT_j with a trace of ClassTemplateDecl.FunctionTemplateDecl.NamedDecl.FunctionDecl.CXXMethodDecl.actual
  2. The excess template-expanded instance symbol _ZN8nsTArrayI17ServoAttrSnapshotE14AppendElementsEPKT_j with a trace of ClassTemplateDecl.dep[00019:38].spec[0_0].FunctionTemplateDecl.NamedDecl.FunctionDecl.CXXMethodDecl.actual.

It's important to note that when we are processing templates, we push an AutoTemplateContext in 1 of 2 modes:

  • "gather": We visit every identifier we find but don't recurse into template instantiations/specializations. If VisitCXXDependentScopeMemberExpr is called, we make note of that location, saving it off for the next mode.
  • "analyze" mode: We do recurse into template instantiations/specializations but visitIdentifier will only output data if the location of the symbol corresponds to a dependent location from VisitCXXDependentScopeMemberExpr we saved off in the first pass.

The problem we are running into is that in the 2nd symbol above, we have a ClassTemplateDecl in "analyze" mode on the stack that has traversed into a instantiation/specialization (that's the dep[00019:38].spec[0_0] bit) but it has traversed into a FunctionTemplateDecl that freshly starts out in "gather" mode again which makes visitIdentifier start emitting everything it sees again. (That FunctionTemplateDecl already fully ran in "gather" mode while its parent ClassTemplateDecl was in "gather" mode.)

If we change AutoTemplateContext to inherit its mode from its parent when present then everything seems to work as intended. That is, we only generate the 1st symbol from Emilio's test case, while still generating the multiple dependent symbols in templates2.cpp.

I'm going to run a full mozilla-central indexing run with the tentative fix and the trace explanation stuff in a little bit and we'll see how it goes. (I want to fix the router.py stuff I currently broke on "dev". The menu will still be giant, however.) I'll chime in here when it being done and me being awake intersects.

Early results look good. nsTArray.h analysis is now (linux-only, not merged) 25M down from 606M. And that's with the somewhat chatty "trace" and "tracecontextsym" lines in the output bloating things.

Results are up at https://dev.searchfox.org/. Post-merging (and discarding of trace fields), analysis/xpcom/ds/nsTArray.h is now 980K, down from 606M. The analysis dir is down a little from 8.7G to 8.0G, identifiers way down from 523M to 310M. The crossref file is actually up from 4.7G to 5.7G but that's because this build also includes a new "consumes" kind which are the out edges from a symbol to the things it uses (as exposed by contextsym in target records), plus a "meta" structure that holds the syntaxkind, type, and typesym.

You are my hero! (The fix looks/behaves reasonably enough for me. I've tried to understand that template-handling code a few times but never quite wrapped my head around it). Did you happen to note how long the windows taskcluster job took with the change? IIRC having to rewrite nsTArray.h's analysis data a lot of times was causing some slowness and I wonder if this patch ameliorates it as well.

Assignee: kats → bugmail

220 minutes still, unfortunately. https://treeherder.mozilla.org/#/jobs?repo=try&revision=0813746beea00c369c29b1e3e998c236e2fd5373&selectedJob=259872303 is the job.

It's possible the trace generation or my other changes ate up any wins, though. I'll put up just the fix for review and I'll run try builds with that as well.

Pushed by bugmail@asutherland.org:
https://hg.mozilla.org/integration/autoland/rev/6e7d2195d5dc
Correct searchfox template traversal. r=emilio

Note that this fix is only the fix and doesn't include the "trace" and "tracecontextsym" changes from the graph-support branch, so we may also see indexing speed wins. However, this won't impact today's indexing runs (the nightly build and its searchfox jobs are initiated at 6am), though we do expect those to speed up a lot because of changes that did land in mozsearch as part of bug 1567724.

Might be worth uplifting this to beta and release branches too.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: