Closed Bug 671762 Opened 13 years ago Closed 10 years ago

Site-wide search and replace

Categories

(developer.mozilla.org Graveyard :: Editing, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: openjck, Unassigned)

For example, when approaching a product release, the version number changes.
Priority: -- → P3
This is also needed, for example, because in Gecko 10.0, all uses of PRBool changed to the standard C++ bool type. Updating every doc that uses PRBool by hand is tedious, and the change can almost always be handled automatically with a global find and replace feature. For this reason, I'm bumping priority.
Priority: P3 → P2
Blocks: 756266
Priority: P2 → P3
No longer blocks: 756266
Blocks: 756266
Can this be done with a page in the admin?

Can this be a straight text replacement or do you need to use regular expressions?

Do you need to approve all the replacements? Do you need a list of all the places where changes occurred?

What happens if you discover you made a grave error?
I'd like this to be permission controlled so that only people with high permissions can use it (admins and trusted editors).

Ideally, we'd enter in the find and replace strings, but get a list of where changes would occur so we could look at the changes before they're applied, and possibly drop some out of the list if need be.

I'm fine with straight text replacement but some people want regexp. I'd say the former is fine for starters, and maybe do regexps later?

This should absolutely leave the change in the history for each page, so they can be undone if a mistake was made. The revision comment should include a note like "Global find and replace X -> Y".
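
A minimal sketch of the preview-then-apply flow described above, assuming an in-memory stand-in for documents. The `Page` class and the `preview` and `apply_changes` functions are hypothetical names for illustration, not part of Kuma. Straight text replacement is the default and regular expressions are opt-in.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical stand-in for a wiki document with a revision history."""
    slug: str
    content: str
    history: list = field(default_factory=list)  # (comment, previous content)


def preview(pages, find, replace, use_regex=False):
    """Return (page, new_content, match_count) for each page that would change."""
    # In plain-text mode, escape the search string so it is matched literally.
    pattern = re.compile(find if use_regex else re.escape(find))
    # In plain-text mode, a callable keeps backslashes in `replace` literal.
    repl = replace if use_regex else (lambda _match: replace)
    proposed = []
    for page in pages:
        new_content, count = pattern.subn(repl, page.content)
        if count:
            proposed.append((page, new_content, count))
    return proposed


def apply_changes(proposed, find, replace, skip=()):
    """Apply previewed changes, except for slugs the reviewer dropped."""
    changed = []
    for page, new_content, _count in proposed:
        if page.slug in skip:
            continue
        # Record the old content with a revision comment so it can be undone.
        comment = f"Global find and replace {find} -> {replace}"
        page.history.append((comment, page.content))
        page.content = new_content
        changed.append(page.slug)
    return changed
```

A reviewer would call `preview`, inspect the returned list, pass the slugs to leave alone as `skip`, and only then call `apply_changes`. Nothing is written until that second step.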
New comments about this from Eric and Jean-Yves:

https://bugzilla.mozilla.org/show_bug.cgi?id=779218#c0
https://bugzilla.mozilla.org/show_bug.cgi?id=779218#c1
Summary: Site-wide search and replace for article titles and bodies → Site-wide search and replace
Version: Kuma → unspecified
Component: Website → Landing pages
Component: Landing pages → Editing
Priority: P3 → --
Whiteboard: u=contributor c=contentmgmt p= → p=
No longer blocks: 756266
Whiteboard: p=
In order to accomplish zone-related cleanup and organization work before zones launch, this will need to be implemented sooner rather than later. We have a great deal of content work to do that cannot realistically be done without this functionality.

Additional notes:

* The Django admin has a search but cannot search only current versions of pages, so you get hits for all of history. It has no replace.

* We need to have replace functionality locked to people with advanced permissions (either admins or, ideally, a new permission). There's a lot of risk inherent to this feature's misuse.

* Site-wide search should be available more broadly.

* The search should search on the raw content of pages, before templates are executed, so that we can find template uses to update them if they need fixing after template changes.
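
To illustrate the last point, here is a rough sketch of searching raw page source for template calls. It assumes macro calls are written as `{{ name(args) }}` and that names are matched case-insensitively. The function name `find_template_uses` and the sample macro name `gecko` are hypothetical, and arguments containing a closing parenthesis are not handled.

```python
import re


def find_template_uses(raw_pages, template_name):
    """Return {slug: [matched calls]} for pages whose raw source calls the template."""
    pattern = re.compile(
        r"\{\{\s*" + re.escape(template_name) + r"\s*(?:\([^)]*\))?\s*\}\}",
        re.IGNORECASE,
    )
    hits = {}
    for slug, raw in raw_pages.items():
        found = pattern.findall(raw)  # no capture groups, so whole matches
        if found:
            hits[slug] = found
    return hits


# Raw (pre-render) sources; page "b" holds only rendered output.
raw_pages = {
    "a": 'Intro {{ gecko("10.0") }} text',
    "b": "<p>Rendered Gecko 10.0</p>",
    "c": "{{gecko}} and {{ Gecko('2.0') }}",
}
hits = find_template_uses(raw_pages, "gecko")
```

Page "b" is not reported because the macro call no longer exists once templates have run, which is why the search has to work on the raw content.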
Blocks: 907234
No longer blocks: 907234
Blocks: 907248
No longer blocks: 907248
Some notes off the top of my head while I'm looking at this bug:

* A site-wide find is less troublesome than a site-wide replace. In fact, bug 926316 has sprouted up for just the find part.

* A replace could take long enough to warrant executing it as an offline queue task with some sort of (email?) notification when it's done.

* A replace could potentially affect 100s to 10000s of documents. We should probably pair this with a site-wide undo feature that keeps track of all the docs affected by a replace operation, and perform a mass revert of all those pages in case the operation was a mistake. Otherwise, the fix is a manual page-by-page repair process.

* What happens if someone edits a page within the scope of an ongoing replace operation? And what if the edit just happens via race condition to collide with the replace operation?

* Basically, what are all the failure modes for this kind of whole-site modification operation, and how can we recover from each?
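
One way to think about the undo and race-condition points above is an operation log that records which revision the replace saw and which one it wrote. The sketch below is only an in-memory illustration under that assumption. `Doc`, `ReplaceLog`, `replace_in_doc`, and `mass_revert` are hypothetical names, and a real implementation would need the check and the write to happen atomically in the database.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical document: a list of revision contents, newest last."""
    slug: str
    revisions: list

    @property
    def current_id(self):
        return len(self.revisions) - 1


@dataclass
class ReplaceLog:
    """Tracks every doc touched by one replace operation."""
    find: str
    replace: str
    touched: dict = field(default_factory=dict)    # slug -> (before_id, written_id)
    conflicts: list = field(default_factory=list)  # slugs edited mid-operation


def replace_in_doc(doc, seen_revision_id, log):
    """Apply the replace only if nobody edited the doc since it was matched."""
    if doc.current_id != seen_revision_id:
        log.conflicts.append(doc.slug)
        return False
    doc.revisions.append(doc.revisions[-1].replace(log.find, log.replace))
    log.touched[doc.slug] = (seen_revision_id, doc.current_id)
    return True


def mass_revert(docs_by_slug, log):
    """Revert every touched doc unless it has been edited since the replace."""
    reverted, skipped = [], []
    for slug, (before_id, written_id) in log.touched.items():
        doc = docs_by_slug[slug]
        if doc.current_id != written_id:
            skipped.append(slug)  # later edits exist, so leave it for manual review
            continue
        # Revert by adding a new revision so the history stays intact.
        doc.revisions.append(doc.revisions[before_id])
        reverted.append(slug)
    return reverted, skipped
```

Docs that changed between the find and the replace end up in `conflicts` instead of being overwritten, and docs edited after the replace are skipped by the revert, so neither path silently discards someone's edit.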
Not sure if this is more or less scary, but this kind of change is something that a git-based backend would make much more practical to implement. (see bug 756547)
(In reply to Les Orchard [:lorchard] from comment #7)

> * A site-wide find is less troublesome than a site-wide replace. In fact,
> bug 926316 has sprouted up for just the find part.

Agreed; that's why we filed the other bug, because we know the replace is scary and complicated, and we don't want the find part to be blocked on that.

> * A replace could take long enough to warrant executing it as an offline
> queue task with some sort of (email?) notification when it's done.

Agreed.

> * A replace could potentially affect 100s to 10000s of documents. We should
> probably pair this with a site-wide undo feature that keeps track of all the
> docs affected by a replace operation, and perform a mass revert of all those
> pages in case the operation was a mistake. Otherwise, the fix is a manual
> page-by-page repair process.

Yeah, that's probably true. We'd also want the replace to first show you what's going to happen, rather than just blindly doing it. Indeed, I would not be opposed to it actually prompting you page by page to confirm that you want the change to be applied there, instead of it going through them all without any user interaction. That would be much, much safer, yet still faster than manually editing each found page.

> * What happens if someone edits a page within the scope of an ongoing
> replace operation? And what if the edit just happens via race condition to
> collide with the replace operation?

This is another reason prompting for each page might make sense.

> * Basically, what are all the failure modes for this kind of whole-site
> modification operation, and how can we recover from each?

That, I'm not qualified to answer, but I suspect the one-by-one replace would help a lot.
(In reply to Eric Shepherd [:sheppy] from comment #9)

> > * A replace could potentially affect 100s to 10000s of documents. We should
> > probably pair this with a site-wide undo feature that keeps track of all the
> > docs affected by a replace operation, and perform a mass revert of all those
> > pages in case the operation was a mistake. Otherwise, the fix is a manual
> > page-by-page repair process.
> 
> Yeah, that's probably true. We'd also want the replace to first show you
> what's going to happen, rather than just blindly doing it. Indeed, I would
> not be opposed to it actually prompting you page by page to confirm that you
> want the change to be applied there, instead of it going through them all
> without any user interaction. That would be much, much safer, yet still
> faster than manually editing each found page.

So, some implications of supporting a manual page-by-page replace:

* Need a model of a replace operation - i.e., store a list of all the pages matched by the find, the desired replace, and a pointer to current progress in the list.

* UI views to manage a user's progress through a replace set, along with options to possibly cancel.

* UI views to manage possible multiple ongoing replace operations owned by a user.

* UI views to review each replacement operation with actions to execute or skip.

* Still probably need an undo, but I guess that could be lower priority if every replacement needs manual confirmation to begin with.
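
The model in the first bullet could look roughly like the sketch below: the matched pages, the desired replace, a cursor for progress, and execute, skip, and cancel actions. This is an illustration under assumed names (`ReplaceOperation`, `Decision`, and a plain dict standing in for page storage), not a proposal for the actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    EXECUTED = "executed"
    SKIPPED = "skipped"


@dataclass
class ReplaceOperation:
    """One user's page-by-page replace, with a pointer to current progress."""
    owner: str
    find: str
    replace: str
    matched_slugs: list                  # pages matched by the find
    cursor: int = 0                      # index of the next page to review
    decisions: dict = field(default_factory=dict)
    cancelled: bool = False

    @property
    def done(self):
        return self.cancelled or self.cursor >= len(self.matched_slugs)

    def current(self):
        """Slug awaiting review, or None when finished or cancelled."""
        return None if self.done else self.matched_slugs[self.cursor]

    def _advance(self, decision):
        self.decisions[self.matched_slugs[self.cursor]] = decision
        self.cursor += 1

    def execute(self, contents):
        """Apply the replace to the current page, then move on."""
        slug = self.current()
        if slug is None:
            raise RuntimeError("operation is finished or cancelled")
        contents[slug] = contents[slug].replace(self.find, self.replace)
        self._advance(Decision.EXECUTED)

    def skip(self):
        """Leave the current page untouched, then move on."""
        if self.current() is None:
            raise RuntimeError("operation is finished or cancelled")
        self._advance(Decision.SKIPPED)

    def cancel(self):
        self.cancelled = True
```

Because each operation carries its own owner and cursor, a user could have several of these in flight at once, and the `decisions` map doubles as the list of pages an undo would need to revisit.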
The search is working well, and the replace won't be done that way.

I'm closing this as part of our bug slaughtering hour.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Product: developer.mozilla.org → developer.mozilla.org Graveyard