Closed Bug 1045183 Opened 10 years ago Closed 9 years ago

Request-time file rendering

Categories

(Webtools Graveyard :: DXR, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: erik, Unassigned)

Instead of building 22GB of static HTML at index time, render file views at request time, pulling syntax-coloring offsets out of ES.

An initial benchmark: pulling all 2M docs making up sqlite3.c takes 1.4s of wall-clock time. 700ms of that is ES running the query, and the other 700ms is the (local) TCP transfer and JSON serialization.

% time curl -s -XGET 'http://127.0.0.1:9200/dxr_test/line/_search?pretty' -d '{
    "query": {
        "filtered": {
            "query": {
                "match_all": {}
            },
            "filter": {
                "term": {"path": "db/sqlite3/src/sqlite3.c"}
            }
        }
    },
    "size": 9999999,
    "sort": "number"
}' > /dev/null
0.01s user 0.03s system 2% cpu 1.435 total
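For scripting the same benchmark, the request body can be built programmatically. This is only a sketch: the helper name `file_lines_query` is made up for illustration, and it reproduces the `filtered` query from the curl command above, which later Elasticsearch releases removed.

```python
import json


def file_lines_query(path, size=9999999):
    """Build a search body fetching every line doc of one file, sorted by line number.

    Hypothetical helper: it mirrors the curl request body above and is not
    taken from the DXR codebase.
    """
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                # Exact-match filter on the file's path within the tree
                "filter": {"term": {"path": path}},
            }
        },
        "size": size,
        "sort": "number",
    }


# Serialize to the JSON string that would be sent as the request body.
body = json.dumps(file_lines_query("db/sqlite3/src/sqlite3.c"))
```
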
Motivations include:

* Being able to deploy markup changes without waiting for a reindex
* Making indexing faster, thus being able to do it more often and handle more trees on less hardware. (We assume that most of the files we currently build are never requested.)
* Not needing a huge, expensive, shared NetApp to hold all the static files

A straightforward caching reverse proxy could be used to boost the speed for frequently requested files, if necessary.
Blocks: 1047554
Additional motivations:

* Be able to syntax-color search results
* Maybe someday incrementally load large files (sqlite3.c)
We'll also need to insert into ES the info needed to render refs and links.
QA Contact: erik
Commit pushed to es at https://github.com/mozilla/dxr

https://github.com/mozilla/dxr/commit/559c7f69e1a299ceb72302619e9e61de9a5fad42
Implement request-time rendering of files. Fixes bug 1045183.

* Stop rendering file HTML at index time, and instead insert refs, regions, annotations, and links into ES for use at request time.
* Bump format version. DXR instances now consist of just config.py in an otherwise-empty folder. And most of that will go away once we implement independent tree indexing.
* The generated config.py and the HTTP controllers had a misguided idea that there is one set of enabled plugins across all trees. Corrected that. That would have been a big surprise once we started indexing more trees, with different sets of plugins. The only reason it even ran was that the generating code referred to a "tree" variable, the iterator of a list comp which leaked out (since this isn't Python 3).
* Correct the documentation of refs() and regions(). Neither is actually per line.
* Since HTML is never built at index time, remove the "html" option for "skip_stages".
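To illustrate what rendering at request time from stored offsets might look like, here is a minimal sketch. It assumes, purely for illustration, that regions arrive as non-overlapping `(start, end, css_class)` tuples of character offsets into a line; the actual shape DXR stores in ES is not given here, and `render_line` is a hypothetical name.

```python
from html import escape


def render_line(text, regions):
    """Return HTML for one line, wrapping each region in a <span>.

    Assumption: regions are non-overlapping (start, end, css_class) tuples
    with character offsets into text. This is not DXR's actual format.
    """
    out = []
    pos = 0
    for start, end, css_class in sorted(regions):
        # Emit the uncolored text between the previous region and this one.
        out.append(escape(text[pos:start]))
        out.append('<span class="%s">%s</span>'
                   % (escape(css_class), escape(text[start:end])))
        pos = end
    # Emit whatever follows the last region.
    out.append(escape(text[pos:]))
    return "".join(out)


html_line = render_line("int x = 1;", [(0, 3, "k"), (8, 9, "n")])
```

Because the markup is generated per request, changing the `<span>` template takes effect on the next page load and needs no reindex.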
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Commit pushed to master at https://github.com/mozilla/dxr

https://github.com/mozilla/dxr/commit/559c7f69e1a299ceb72302619e9e61de9a5fad42
Implement request-time rendering of files. Fixes bug 1045183.
Product: Webtools → Webtools Graveyard