Closed Bug 820531 Opened 12 years ago Closed 4 years ago

Index all trees that are in MXR

Categories

(Webtools Graveyard :: DXR, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: jcranmer, Unassigned)

References

Details

(Whiteboard: mxr-parity)

Right now, only mozilla-central is being indexed on dxr.mozilla (and dxr.allizom?). MXR contains a much larger set of trees:

{mozilla/comm}-{central,aurora,beta,release,esr10,esr17,2.0,1.9.2,1.9.1} + l10n variants of many of these.

chromium, b2g/gaia, full CVS, various CVS branches, the classic snapshot, NSPR, NSS, websites, webtools, etc. Oh, and there are some "hidden" branches which require LDAP authentication to see--I won't mention them here because I don't know if their existence is security-sensitive or not.

Most of these branches are basically only indexable as plain-text: all of the website code and webtools code in particular. Other projects are also effectively plain-text due to unrecognized languages (Gaia is heavily JS-based, Bugzilla is perl), at least for the near future. The historical branches are highly unlikely to actually compile for DXR--that would be anything before esr10 at this point.

So in terms of practical utility for C/C++ indexing, we have only the current release trains.

For full MXR-parity, we'd probably need to have copies of many of these trees that we can do at least as well as grep on; for the short-term, adding comm-central and mozilla-aurora would probably be sufficient for most uses.
Depends on: 820527
Whiteboard: [parity-mxr] → mxr-parity
Bug 850479 (cosmetics) makes multiple trees annoying.
Bug 874667 has some data on how frequently each tree is used on MXR. To summarize, in order of usage:

mozilla-central
CVS snapshots [I expect a combination of old links and people wanting m-c but not getting it makes this relatively high, though it may also be a starting point for deep archaeology]
comm-central
mozilla-aurora
l10n [Maybe l10n-central was desired? I don't know]
mozilla-release
chromium
[ and a long tail begins ]

After staring at this list and correlating with the original list, I think some pruning may be in order. Basically, we'd need
Production code:
  {m,c,l10n}-{c,a,b,r,e*,2.0,1.9.2,1.9.1,b2g18}
  Gaia
  NSPR, NSS, JSS, Python-NSS
  Bugzilla [ + old versions]
  "Other" projects (build, projects, webtools, labs, services) {mostly dumps of several repositories at once}
Archived versions:
  CVS Trunk, 1.9, 1.8, 1.8.0, Aviary
  Mozilla Classic

Pretty much all of the other trees not mentioned can probably be "implemented" as a hidden redirect to existing pages.
RelEng has quite a few repos that would be nice to index too. There are some high-priority ones that are must-haves before mxr can die:
build/{buildbot-configs,buildbotcustom,tools,puppet,mozharness}

And some medium to low priority ones that are nice to have:
build/{braindump,buildapi,buildbot,cloud-tools,mozpool,partner-repacks,talos}

And a couple of medium-priority repos that are only hosted on git.m.o:
http://git.mozilla.org/?p=build/release-kickoff.git;a=summary
http://git.mozilla.org/?p=build/slaveapi.git;a=summary
Wait, are we preparing to kill mxr?
Ehsan: medium-term. We obviously have to bring DXR up to parity first. Watch https://wiki.mozilla.org/DXR_UI_Refresh#Plans_And_Priorities for some priorities and blockers soon. Did you see the big dev-platform and dev-static-analysis threads I posted? https://groups.google.com/forum/#!topic/mozilla.dev.platform/TW8If68UYZw
Unbuggy multiple tree support will be landing as part of the ui branch. Then it'll just be a matter of configuring our nightly build script.
Depends on: 860271
comm-central is now indexed!


To reconfirm and burrow in a bit, here are the repos that are probably statically analyzable C/C++:

m-{c,a,b,r,esr24,b2g18,b2g26_v1_2,b2g28_v1_3}
c-{c,a,b,r,esr24}
B2G - Gaia, NSPR, NSS

Those are the hard ones, with build processes to set up. The rest will be just text for the moment.
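
For anyone unfamiliar with the shell-style shorthand in the list above, here is a rough sketch of how it expands into individual tree names. The helper names (expand_braces, long_name) and the abbreviation tables are illustrative assumptions, not part of DXR or MXR; the long forms follow the {mozilla/comm}-{central,aurora,beta,release,...} naming used earlier in this bug.

```python
import itertools
import re

# Assumed abbreviation tables, inferred from the naming used in this bug.
PREFIXES = {"m": "mozilla", "c": "comm"}
BRANCHES = {"c": "central", "a": "aurora", "b": "beta", "r": "release"}


def expand_braces(pattern):
    """Expand shell-style {a,b,c} groups (no nesting) into all combinations."""
    parts = re.split(r"\{([^{}]*)\}", pattern)
    # Even indexes hold literal text, odd indexes hold comma-separated choices.
    choices = [
        part.split(",") if i % 2 else [part]
        for i, part in enumerate(parts)
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]


def long_name(short):
    """Turn a short tree name such as 'm-c' into 'mozilla-central'."""
    prefix, branch = short.split("-", 1)
    # Branches without a known abbreviation (esr24, b2g18, ...) pass through.
    return f"{PREFIXES.get(prefix, prefix)}-{BRANCHES.get(branch, branch)}"


if __name__ == "__main__":
    shorthand = [
        "m-{c,a,b,r,esr24,b2g18,b2g26_v1_2,b2g28_v1_3}",
        "c-{c,a,b,r,esr24}",
    ]
    for pattern in shorthand:
        for short in expand_braces(pattern):
            print(long_name(short))
```

This only handles flat, non-nested groups, which is all the shorthand in this bug uses.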


Oh, and here's a link to a newer (and more permanent) DXR roadmap: https://wiki.mozilla.org/DXR_Roadmap.
Commenting since :erik pointed me here. I'm looking to have the build repos indexed in dxr; there is no C/C++ building needed here (it's mostly shell/python/puppet/etc.)

e.g. a copy of mxr.mozilla.org/build/

This is a "merge" of many repos at hg.mozilla.org/build/, and the mxr update script for this specific set of things is at http://hg.mozilla.org/webtools/mxr/file/76ae30bc7fd7/update-src.pl#l314 (basically: for dir in *; do hg -R "$dir" update; done)

Not having the build/* repos blocks my ability to test dxr for my daily use/needs.
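As a minimal sketch of the per-repository update loop described above (not the actual update-src.pl logic), something like the following could refresh a directory of build/* checkouts before indexing. The function names and the dry-run flag are hypothetical, and skipping directories without a .hg folder is an assumption the original loop does not make.

```python
import subprocess
from pathlib import Path


def build_update_commands(root):
    """Return one `hg -R <dir> update` argv per checkout directly under root."""
    root = Path(root)
    return [
        ["hg", "-R", str(child), "update"]
        for child in sorted(root.iterdir())
        # Assumption: only directories containing .hg are Mercurial checkouts.
        if (child / ".hg").is_dir()
    ]


def update_all(root, dry_run=True):
    """Print the update commands, or run them when dry_run is False."""
    for argv in build_update_commands(root):
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True stops at the first checkout that fails to update.
            subprocess.run(argv, check=True)
```

Calling update_all on the directory holding the checkouts prints the commands by default; passing dry_run=False actually runs hg, which must be on the PATH.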
While I love DXR and use it to search for strings in Firefox products, most of my work and my team's work is on website products. All of our websites' source is up on GitHub, and we would love to have one place to go to find strings in our website code.

Most of the searching I am doing is for bedrock (https://github.com/mozilla/bedrock), but there are plenty of other GitHub repos at https://github.com/mozilla/.* that I search from time to time. I know that you can search GitHub at github.com/search with queries like (https://github.com/search?q=optimizely+repo%3Amozilla%2Fbedrock&type=Code&ref=searchresults), but it would be nice to be able to use the power of DXR and search all of our code from one service.

Maybe GitHub's search is "good enough" for everyone and having DXR index the static resources isn't a worthwhile endeavor. I think it is worth asking to see whether others would find this beneficial.
(In reply to Ben Hearsum [:bhearsum] from comment #3)
> RelEng has quite a few repos that would be nice to index too. There's some
> high priority ones that are must haves before mxr can die:
> build/{buildbot-configs,buildbotcustom,tools,puppet,mozharness}
> 

I've compiled a list of releng repos; not all of them are in hg, and not all of them are currently indexed by MXR. MXR parity only needs the ones that are in hg and currently in MXR. But for completeness' sake:

https://wiki.mozilla.org/ReleaseEngineering/Repositories
Depends on: 1024438
Once we've indexed all the MXR trees (or even before then), we should move the rest of the tree-indexing requests into a different bug. I want to keep this one MXR-only for dependency-tracking purposes.
Depends on: 1047554
Summary: Index multiple trees → Index all trees that are in MXR
A crazy idea hwine had was to use our try servers: we'd run a Jenkins, and it would kick off try jobs. However, I suspect the try servers would be useful to us only for moz-central and similar repos, not for, say, web sites. We also have the question of whether we can get the try servers access to our ES cluster. Jenkins is sounding better and better as at least a central coordination point.
jlund in #releng is a good person to ask about the releng infra.
andym and the add-ons team are interested in the addons tree, which is really just a mess of things rsynced to an MXR indexing box atm. They may move the source to a series of GH repos in the future. Talk to andym. In any case, this pseudo-tree would need to be behind auth. The main use case is searching across the sum of all add-ons to detect use of various APIs.
Depends on: 1281218
See Also: → 1281443

We are not going to work on that for dxr!

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX
Product: Webtools → Webtools Graveyard