Closed Bug 1094452 Opened 10 years ago Closed 4 years ago

Request: Please rebuild HTML documentation tree to reflect HTML5's new status

Categories

(developer.mozilla.org Graveyard :: General, defect)

Type: defect
Priority: Not set
Severity: major

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: sheppy, Unassigned)

Details

Now that HTML5 has moved to Recommendation status, every place where our version info macros still label it as PR is out of date. Please trigger a rebuild of the subtree starting at https://developer.mozilla.org/en-US/docs/Web/HTML as soon as is practicable, to update all of these references.

Thanks!
Unfortunately there are too many pages to regenerate all of them by hand.
Flags: needinfo?(lcrouch)
Summary: Request: Please rebuild HTML documentation tree → Request: Please rebuild HTML documentation tree to reflect HTML5's new status
Severity: normal → major
There's a `render_document` management command we can use for this [1], which already includes the ability to run:

python manage.py render_document /en-US/docs/HTML

Looks like the stage server also has the full HTML tree available, so we can rehearse the command there:

https://developer.allizom.org/en-US/docs/Web/HTML

[1] https://github.com/mozilla/kuma/blob/master/kuma/wiki/management/commands/render_document.py
Flags: needinfo?(lcrouch)
:jezdez - can you file a bug under this to add or change code to use the new celery chord & group code?
Flags: needinfo?(jezdez)
Ok, I've talked to :robhudson in detail and have formed a few ideas, depending on how much work we want to invest in this.

Currently the management command schedules rendering tasks immediately when using the --defer command line option. That translates to executing those rendering tasks in parallel on all three workers, which guarantees a DDoS via the kumascript/kuma request/response cycle.

There are a few options we can implement to make sure that scheduling those tasks doesn't lead to a DDoS:

1. extend the management command to use a Celery chain to make sure the rendering tasks are executed sequentially wrapped in a chord with a callback that reports when the rendering is done

Pro: Simple to develop
Con: Will take as long as actually running the management command without the defer CLI option, probably very long

2. use Celery's rate-limit feature to limit a task per worker instance to a sensible amount per time unit, and use a Celery task group to run the rendering tasks in parallel but under control, reducing the risk of overwhelming kuma/kumascript

Pro: improved scalability and potentially speed
Con: hard to guess how many tasks kuma/kumascript can take, so it is not clear how strict the rate limit should be; also hard to test outside of the prod environment due to missing dev/stage/prod parity

3. use a separate Celery queue to isolate these tasks from other Celery tasks, with a hard limit on the number of tasks and/or processing time, a.k.a. a "global" rate-limit

Pro: best isolation from other important Celery tasks such as email sending. other tasks could profit from that as well, e.g. have a separate "render" queue just for rendering docs
Con: ops intensive, as it requires a separate Celery worker process setup, including in the dev environment

Obviously the main factor in choosing one of the options above (and they could easily be combined as well) is the constraint on developer time. That's outside my jurisdiction though ;)
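
For reference, options 2 and 3 above could look roughly like the settings fragment below. This is only a sketch: the task's dotted name, the "10/m" rate and the settings layout are assumptions for illustration, not values taken from kuma. The setting names are the ones used by current Celery releases; older releases spell them CELERY_ANNOTATIONS and CELERY_ROUTES.

```python
# Hypothetical Celery settings fragment; names and values are placeholders.

# Assumed dotted name of the task that renders a single document.
RENDER_TASK = "kuma.wiki.tasks.render_document"

# Option 2: cap how often each worker instance may start the task.
# Celery applies rate_limit per worker instance, so with three workers
# the overall ceiling is roughly three times this value.
task_annotations = {
    RENDER_TASK: {"rate_limit": "10/m"},  # "10 per minute" is a guess
}

# Option 3: route the task to its own queue, served by a dedicated
# worker, so rendering cannot crowd out tasks such as email sending.
task_routes = {
    RENDER_TASK: {"queue": "render"},
}
```

A worker dedicated to that queue could then be started with something like `celery -A <app> worker -Q render --concurrency=1`, where `<app>` stands in for the project's Celery app.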
Flags: needinfo?(jezdez)
Assignee: nobody → robhudson.mozbugs
In a meeting this morning we decided to go with option 1 for now. Option 3 is attractive but we will save that for later if/when we add a self-service option for rebuilding subtrees.

In addition to option 1 we will add tasks in the chain that send emails to mdn-dev@ after each 20% of the tasks are processed and once again when everything is complete.
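
A rough sketch of how the step order for such a chain might be computed, in plain Python so it can be tried without a broker. The function name, the tuple labels and the rounding rule are all assumptions made for illustration; the actual change may be structured differently.

```python
def build_render_steps(doc_pks, percent_step=20):
    """Return the ordered steps for a sequential re-render of doc_pks.

    Each step is ("render", pk), ("notify", percent_done) or
    ("done", total). A notify step follows each 20% boundary and the
    final "done" step stands in for the completion email.
    """
    total = len(doc_pks)
    # 1-based position after which each progress email is due,
    # rounded up so a mark is never reported early.
    marks = {}
    for percent in range(percent_step, 100, percent_step):
        position = (total * percent + 99) // 100
        marks[position] = percent  # highest percent wins on a collision
    steps = []
    for position, pk in enumerate(doc_pks, start=1):
        steps.append(("render", pk))
        if position in marks and position < total:
            steps.append(("notify", marks[position]))
    if total:
        steps.append(("done", total))
    return steps


steps = build_render_steps(list(range(1, 11)))
```

With Celery, each tuple would presumably become an immutable task signature (`.si(...)`) and the resulting list would be passed to `chain(...)`, so that every step starts only after the previous one has finished.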
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/b8fc1b41e7ceb7c43f6fe0d1839661aeef26c3df
Bug 1094452 - Convert render_document command to use chain

https://github.com/mozilla/kuma/commit/c917d99252d07cb654b5cc6463da509bb4cb148c
Merge pull request #3100 from robhudson/1094452-render

Bug 1094452 - Convert render_document command to use chain
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/43de314f85500cc3e7c146c6792c1aeb5ea006a6
Bug 1094452 - Fix date comparison in query

https://github.com/mozilla/kuma/commit/2c0d5be1741abbf1b61ffc62c7f99f5f8ea7a4b2
Merge pull request #3111 from robhudson/fix-date-comparison

Bug 1094452 - Fix date comparison in query
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/12331eda798220ebb25bbe63a45d281e3122ec0b
Bug 1094452 - Fix call to document render with pk

https://github.com/mozilla/kuma/commit/bb9cd3c084b0a7454bef66ea2ad7d6563beb65d5
Merge pull request #3114 from robhudson/render-fix

Bug 1094452 - Fix call to document render with pk
I sadly never was able to focus on general search improvements.
Assignee: robhudson → nobody
MDN Web Docs' bug reporting has now moved to GitHub. From now on, please file content bugs at https://github.com/mdn/sprints/issues/ and platform bugs at https://github.com/mdn/kuma/issues/.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX
Product: developer.mozilla.org → developer.mozilla.org Graveyard