Closed Bug 1490853 Opened 6 years ago Closed 6 years ago

Allow searching generated files across all platforms

Categories

(Webtools :: Searchfox, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kats, Assigned: kats)

Now that we are indexing both linux and macOS C++/Rust code, we run into the problem where generated source files may be different on the two platforms. Right now we just use the Linux one, but Ehsan suggested in bug 1487583 comment 6 that it might be better to let the user pick which platform they care about the most and default to that.

We should be able to easily store the generated files and analyses in separate folders in the searchfox index and then pick the appropriate one from the router code based on the user's selection or cookie.
I thought about this and we should actually be able to expose all variants of the generated files without having to manually select which one you want. For generated files that differ across platforms we can just put them in platform-specific folders under __GENERATED__. So e.g. __GENERATED__/__linux64__/mozilla-config.h would be different from __GENERATED__/__win64__/mozilla-config.h, and both would get searched at the same time when doing searches etc.

I'll try that first and see if it's workable.
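A rough sketch of that layout rule, in Python for illustration only. The function name and the in-memory dict of per-platform contents are assumptions made for this example, not the actual searchfox indexer code. It also assumes a file whose contents are identical on every platform that produces it goes directly under __GENERATED__.

```python
from typing import Dict


def layout_generated(path: str, contents_by_platform: Dict[str, bytes]) -> Dict[str, bytes]:
    """Map one generated file to its location(s) in the index.

    Hypothetical helper: `contents_by_platform` maps a platform name
    (e.g. "linux64") to that platform's bytes for `path`.
    """
    distinct = set(contents_by_platform.values())
    if len(distinct) == 1:
        # Same content everywhere: keep a single shared copy.
        return {f"__GENERATED__/{path}": next(iter(distinct))}
    # Content differs: one copy per platform, in a platform-specific folder.
    return {
        f"__GENERATED__/__{platform}__/{path}": data
        for platform, data in contents_by_platform.items()
    }
```

With this rule, a search only returns multiple copies of a generated file when the platforms actually disagree about its contents.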
Assignee: nobody → kats
Does this mean when looking up a name in a generated file (e.g. IPC bindings) you'll find 3 copies of everything instead of 1? Hopefully we can somehow keep those use cases from becoming too painful. :-)
You'd only find 3 copies if the generated code is not the same on all platforms. Let's see how many generated files this ends up affecting - if it's too many then we can try some modification or go with your original suggestion.
I did a first implementation and it seems promising. It's currently deployed to dev.searchfox.org. There aren't that many files that differ across platforms. The only exception is the headers generated from idl files in dist/include, and that's only because they contain an absolute pathname to the file they were generated from, and the absolute pathname is different on Windows vs mac/linux. This produces unnecessarily duplicated results for e.g. [1]. But that should be easy to fix: I can just normalize those paths with sed before comparing.

[1] https://dev.searchfox.org/mozilla-central/search?q=symbol:T_mozIStoragePendingStatement&redirect=false
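The normalization idea could look roughly like the following. This is a Python sketch, not the sed command mentioned above, and the source-directory prefixes are made-up placeholders since the real build-machine paths are not given in this bug.

```python
import re

# Hypothetical absolute source-checkout prefixes; the real ones depend on
# the build machines and are NOT taken from this bug.
SRCDIR_PREFIXES = [
    "/builds/worker/workspace/build/src",
    "z:/build/build/src",
]

_PREFIX_RE = re.compile(
    "|".join(re.escape(prefix) for prefix in SRCDIR_PREFIXES),
    re.IGNORECASE,
)


def normalize(text: str) -> str:
    """Replace any known absolute source prefix with a fixed placeholder."""
    return _PREFIX_RE.sub("$SRCDIR", text)


def same_after_normalizing(a: str, b: str) -> bool:
    """True if two generated files differ only in their absolute source prefix."""
    return normalize(a) == normalize(b)
```

Files that compare equal after this step can then be treated as platform-independent and stored once.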
Updated and redeployed, seems better.
The number of per-platform generated files:
- linux: 68
- macosx: 34
- windows: 68

34 of the ones from linux and windows are in dist/stl_wrappers, which probably nobody really cares about. The remaining 34 are listed below. Given the low number of per-platform generated files, I think this approach is reasonable enough to go with. Thoughts? Feel free to try searching for generated stuff on dev.searchfox.org and see if there's any undesirable behaviour.



./toolkit/components/telemetry/TelemetryHistogramNameMap.h
./toolkit/components/telemetry/TelemetryScalarEnums.h
./toolkit/components/telemetry/TelemetryHistogramEnums.h
./toolkit/components/telemetry/TelemetryHistogramData.inc
./toolkit/components/telemetry/TelemetryScalarData.h
./gfx/cairo/cairo/src/cairo-features.h
./mozilla-config.h
./ipc/ipdl/PCompositorWidget.cpp
./ipc/ipdl/PlatformWidgetTypes.cpp
./ipc/ipdl/PDocAccessibleParent.cpp
./ipc/ipdl/PDocAccessibleChild.cpp
./ipc/ipdl/IPCMessageTypeName.cpp
./ipc/ipdl/PCompositorWidgetParent.cpp
./ipc/ipdl/_ipdlheaders/mozilla/widget/PlatformWidgetTypes.h
./ipc/ipdl/_ipdlheaders/mozilla/widget/PCompositorWidgetParent.h
./ipc/ipdl/_ipdlheaders/mozilla/widget/PCompositorWidgetChild.h
./ipc/ipdl/_ipdlheaders/mozilla/widget/PCompositorWidget.h
./ipc/ipdl/_ipdlheaders/mozilla/layers/PCompositorBridgeChild.h
./ipc/ipdl/_ipdlheaders/mozilla/layers/PCompositorBridgeParent.h
./ipc/ipdl/_ipdlheaders/mozilla/a11y/PDocAccessibleChild.h
./ipc/ipdl/_ipdlheaders/mozilla/a11y/PDocAccessibleParent.h
./ipc/ipdl/_ipdlheaders/mozilla/a11y/PDocAccessible.h
./ipc/ipdl/PCompositorWidgetChild.cpp
./ipc/ipdl/PDocAccessible.cpp
./build/automation.py
./js/src/js-confdefs.h
./js/src/ctypes/libffi/fficonfig.h
./js/src/ctypes/libffi/include/ffi.h
./js/src/shell/shellmoduleloader.out.h
./xpcom/xpcom-private.h
./xpcom/reflect/xptinfo/xptdata.cpp
./xpcom/xpcom-config.h
./xpcom/idl-parser/xpidl/xpidlyacc.py
./dist/include/mozilla-config.h
We might want to exclude xptdata.cpp. I can't imagine there's much of importance that is different there, and it is one of the largest files we have, at almost 200k lines.
Hmm, in comment 2 I was worried about this approach ruining search results such as <https://dev.searchfox.org/mozilla-central/search?q=symbol:_ZN7mozilla3dom12ContentChild29RecvClearSiteDataReloadNeededERK9nsTStringIDsE%2C_ZN7mozilla3dom13PContentChild29RecvClearSiteDataReloadNeededERK9nsTStringIDsE&redirect=false>. But now that I've tried it out, I see your suggestion was to list all three files only in cases where the generated files differ, is that correct?

I think I can easily get behind your approach, thanks for trying it out and providing us with a way to play with it.  :-)
(In reply to Andrew McCreight [:mccr8] from comment #7)
> We might want to exclude xptdata.cpp.

Just to be clear, in this case you want to keep just the Linux version? Or ditch the file altogether?

(In reply to :Ehsan Akhgari from comment #8)
> But now that I tried it out, I see your suggestion
> was to only list all three files for cases where there is a difference
> between the generated files, is that correct?

Yup, exactly.

> I think I can easily get behind your approach, thanks for trying it out and
> providing us with a way to play with it.  :-)

Great! :)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #9)
> Just to be clear, in this case you want to keep just the Linux version? Or
> ditch the file altogether?

I think keeping one version around would be nice.
This is deployed now.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Summary: Use generated files from user-specified platform (saved in cookie) → Allow searching generated files across all platforms