Closed Bug 863692 Opened 7 years ago Closed 2 years ago

Pages disappearing

Categories

(developer.mozilla.org :: General, defect, critical)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: sheppy, Assigned: espressive)

Details

(Whiteboard: [specification][type:bug])

Attachments

(5 files)

What did you do?
================
Lots of pages are just gone! What's going on?

Some examples:

https://developer.mozilla.org/en-US/docs/Project:Introduction_to_KumaScript
https://developer.mozilla.org/en-US/docs/Project:Getting_started_with_Kuma
https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Releases/10

There are more, but those are the three that come to mind.

What happened?
==============
They're just not there

What should have happened?
==========================
They should be there

Is there anything else we should know?
======================================
These are not pages anyone would delete. They haven't been deleted. This is in addition to other pages we mentioned over the last few days, where the records are there in Kuma's database but the pages are 404ing.
Severity: normal → blocker
Priority: -- → P1
Depends on: 851199
My thinking out loud from IRC, fwiw:

Hmm, I'd be surprised that docs get deleted ever, but there *is* just one spot where I can tell Document.delete ever gets called - https://github.com/mozilla/kuma/blob/master/apps/wiki/models.py#L998

that *should* only apply when you're trying to save a document atop an existing redirect

No idea how clobbering existing redirects could figure into this, but I can't think of where else in Django code we might try to delete a doc

Unless maybe... the redirect gets deleted, but then there's some problem in the rest of the save, and so then nothing is stored to replace what was deleted
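
To make that last hypothesis concrete, here is a minimal sketch of how a delete followed by a failed save can leave nothing behind. It uses an in-memory SQLite table as a stand-in for wiki_document; the table layout, function names and the NOT NULL failure are all assumptions for illustration, not Kuma's actual code.

```python
import sqlite3


def make_db():
    """Create an in-memory table standing in for wiki_document (illustrative only)."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
    conn.execute("CREATE TABLE doc (slug TEXT PRIMARY KEY, body TEXT NOT NULL)")
    conn.execute("INSERT INTO doc VALUES ('Firefox/14', 'REDIRECT')")
    return conn


def replace_unsafely(conn, slug, body):
    """Delete the redirect, then insert its replacement as two separate steps."""
    conn.execute("DELETE FROM doc WHERE slug = ?", (slug,))
    # If this insert fails, the delete above has already taken effect.
    conn.execute("INSERT INTO doc VALUES (?, ?)", (slug, body))


def replace_atomically(conn, slug, body):
    """Same two steps, but rolled back together if either one fails."""
    conn.execute("BEGIN")
    try:
        conn.execute("DELETE FROM doc WHERE slug = ?", (slug,))
        conn.execute("INSERT INTO doc VALUES (?, ?)", (slug, body))
        conn.execute("COMMIT")
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        raise


def count_docs(conn):
    """Return how many rows remain in the stand-in table."""
    return conn.execute("SELECT COUNT(*) FROM doc").fetchone()[0]


def try_replace(replace, body):
    """Run one replacement strategy on a fresh table and report rows left."""
    conn = make_db()
    try:
        replace(conn, "Firefox/14", body)
    except sqlite3.IntegrityError:
        pass  # a body of None violates NOT NULL, simulating a failed save
    return count_docs(conn)
```

Calling try_replace(replace_unsafely, None) leaves the table empty, while try_replace(replace_atomically, None) keeps the original redirect row. Whether anything like this actually happened in production is still only a guess.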
No longer depends on: 851199
Attached the output of mysql --default-character-set=utf8 -e "select id, slug from wiki_document;" from yesterday's backup dump and the current production data.
So, I imported a DB backup from yesterday and a dump taken from production today, and ran this:

mysql -uroot -B -e'select concat("https://developer.mozilla.org/", locale, "/docs/", slug) from devmo_2013_04_18.wiki_document where id not in (select id from kuma.wiki_document) order by modified' > missing-since-yesterday

The result is attached as a list of 173 URLs.
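
The same comparison can also be done offline against the two attached id/slug dumps. A minimal sketch, assuming each dump is tab-separated "id, slug" text with a header row; the function names are made up for illustration, and it orders by id rather than by modified because the dumps carry no timestamp:

```python
def parse_dump(text):
    """Parse tab-separated 'id<TAB>slug' lines into {id: slug}, skipping the header."""
    docs = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        doc_id, _, slug = line.partition("\t")
        if doc_id == "id":  # header row emitted by the mysql client
            continue
        docs[int(doc_id)] = slug
    return docs


def missing_documents(backup_text, current_text):
    """Return (id, slug) pairs present in the backup but absent from current data."""
    backup = parse_dump(backup_text)
    current = parse_dump(current_text)
    missing_ids = sorted(backup.keys() - current.keys())
    return [(doc_id, backup[doc_id]) for doc_id in missing_ids]
```

This matches on id only, like the NOT IN subquery above, so a document whose slug changed between the two dumps is not reported as missing.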
Progress report as of right now:

1) We added a temporary hack to the Document model that throws an exception if anything tries to delete a document in Django. We don't really want to delete anything ever, so this should help us detect what might be doing it in server error emails.

2) I worked up a list of 173 docs missing since a SQL dump yesterday, and selectively restored them & their revisions with some SQL imports.

3) We re-enabled editing.

Still not sure what might have caused this, but #1 should help prevent it and give us diagnostics. #2 brings a bunch of pages back, but there still might be some missing if this issue has gone on longer than yesterday (let us know).
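
For anyone wondering what the hack in #1 amounts to, here is a rough sketch of the idea in plain Python. The class and exception names are stand-ins, not the real Kuma model or the actual commit:

```python
class DocumentDeletionError(Exception):
    """Raised when something attempts to delete a document (hypothetical name)."""


class Document:
    """Stand-in for the wiki Document model; not the real Kuma class."""

    def __init__(self, slug):
        self.slug = slug

    def delete(self):
        # Refuse outright so the caller's traceback ends up in the error emails.
        raise DocumentDeletionError(f"Attempt to delete document {self.slug!r}")
```

Because the exception is raised before anything touches the database, the document stays in place and the traceback shows which code path asked for the delete.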
Oh, and for future reference when we want to remove the hack in delete():

https://github.com/mozilla/kuma/commit/67489d50ea6bf222db3b4968792cced12730b0a7

We'll want to remove or refine that, because it's currently breaking tests.
I suspect that the strange behaviour of the ListSubpages template on this page https://developer.mozilla.org/pt-PT/docs/Mozilla/Firefox/Releases may be related to this bug.

It should be listing only subpages, but is listing https://developer.mozilla.org/pt-PT/docs/Firefox_3.6_para_desenvolvedores which is definitely not a subpage.
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/eefb6f69543621c3aa469a21a8bde513496619da
bug 863692: Waffle switch "wiki_error_on_delete" to enable error-on-delete in wiki

https://github.com/mozilla/kuma/commit/f2f5ab152557c53a940fa84b097b718721eff004
Merge pull request #1035 from lmorchard/863692-delete-exception-waffle

bug 863692: Waffle switch "wiki_error_on_delete"
Going to make bug 863344 the one-stop-shop for restoring pages that have gone missing. Please add to the list with any missing pages you come across.
Blocks: 863344
FWIW: Still trying to figure out why this happened. As per comment 9, we implemented a hack that causes the site to throw an internal server error on an attempt to delete a Document.

Since 4/22, it's been triggered 3 times that I've seen:

* Once for /zh-CN/docs/CSS/开始$edit
* Twice for /en-US/docs/Mozilla/Firefox/Versions/14$move

Seems like both of these cases were attempts to delete a redirect and replace it with another document. This doesn't seem to explain the missing documents discovered for this bug, though. So... no idea what happened yet.
I keep seeing bug reports about this happening on pages with special characters (if you consider underscores to be special) and Firefox version documentation. No idea what any of these things have to do with each other. I might just be reading too much into things.

Have Firefox/Versions pages been moving more frequently than average lately?
(In reply to John Karahalis [:openjck] from comment #12)
> I keep seeing bug reports about this happening on pages with special
> characters (if you consider underscores to be special) and Firefox version
> documentation. No idea what any of these things have to do with each other.
> I might just be reading too much into things.
> 
> Have Firefox/Versions pages been moving more frequently than average lately?

Those pages moved once or twice.

Les - Someone accidentally moved the Firefox 14 for developers page, and we need to move it back, but can't because of the disabled ability to delete redirect pages.

I'm still not 100% convinced this is related to page moving, but that does seem the most likely candidate. However, we do know for certain that it's not only moved pages that have gone missing.
Priority: P1 → P2
Priority: P2 → P1
(In reply to Eric Shepherd [:sheppy] from comment #13)

> Les - Someone accidentally moved the Firefox 14 for developers page, and we
> need to move it back, but can't because of the disabled ability to delete
> redirect pages.

The delete ability can be turned back on in the admin panel, by turning off the wiki_error_on_delete switch here:

https://developer.mozilla.org/admin/waffle/switch/

You can turn that off, do the deletes/moves, and turn it back on.
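
If that off/on dance is ever scripted rather than done by hand in the admin panel, the main risk is forgetting to turn the switch back on. A minimal sketch of one way to guard against that, using an in-memory dict as a hypothetical stand-in for the switch store (none of these names come from waffle or Kuma):

```python
from contextlib import contextmanager

# Hypothetical in-memory switch store; the real switch lives in the admin panel.
SWITCHES = {"wiki_error_on_delete": True}


@contextmanager
def switch_disabled(name):
    """Turn a switch off for the duration of a block, then restore it."""
    previous = SWITCHES.get(name, False)
    SWITCHES[name] = False
    try:
        yield
    finally:
        # Restore even if the move or delete inside the block raises.
        SWITCHES[name] = previous


def delete_redirect(slug):
    """Pretend to delete a redirect, honouring the error-on-delete switch."""
    if SWITCHES.get("wiki_error_on_delete"):
        raise RuntimeError(f"Deleting {slug!r} is blocked by wiki_error_on_delete")
    return f"deleted {slug}"
```

Inside "with switch_disabled('wiki_error_on_delete'):" the delete goes through; once the block exits, the switch is back to whatever it was before.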

> I'm still not 100% convinced this is related to page moving, but that does
> seem the most likely candidate. However, we do know for certain that it's
> not only moved pages that have gone missing.

No, the initial set of missing documents didn't seem to point at redirects or page moving in particular. That's why I say we still have no idea why it happened.
Now I can't access the following pages.

 https://developer.mozilla.org/ja/docs/tag/CSS%20R%C3%A9f%C3%A9rence (Unnecessary)
 https://developer.mozilla.org/ja/docs/tag/CSS%20Reference

I don't know whether this problem is the same bug.
lorchard? ubernostrum? Any ideas?
(In reply to Eric Shepherd [:sheppy] from comment #16)
> lorchard? ubernostrum? Any ideas?

No, those aren't disappeared pages. That's an ISE on tag pages, totally different bug.
(In reply to ghost@ethertank.jp from comment #15)
> Now I can't access the following pages.
> 
>  https://developer.mozilla.org/ja/docs/tag/CSS%20R%C3%A9f%C3%A9rence
> (Unnecessary)
>  https://developer.mozilla.org/ja/docs/tag/CSS%20Reference
> 
> I don't know whether this problem is the same bug.

Different issue entirely. Filed bug 871638 to capture it.
I changed the original location of the following page.
(Purpose: so that the parent page follows its child pages)

  Moved from Web/HTML/Canvas/Tutorial to HTML/Canvas/Tutorial
    https://developer.mozilla.org/en-US/docs/Canvas_tutorial

Then I changed the location of the child pages.

After that, pages disappeared.
You can see some of them here (red links):
https://developer.mozilla.org/en-US/docs/Canvas_tutorial

"A Basic RayCaster" and its child pages have also disappeared.
Please help.
(In reply to ghost@ethertank.jp from comment #20)
> You can see some of them here (red links):
> https://developer.mozilla.org/en-US/docs/Canvas_tutorial
> 
> "A Basic RayCaster" and its child pages have also disappeared.
> Please help.

Looks like you moved the Canvas tutorial page several times (at least, that's what the history says) and things got lost along the way. What exactly did you do while working on that page, so I can try to figure out where the child pages might be now?
(In reply to ghost@ethertank.jp from comment #19)
> I changed the original location of the following page.
> (Purpose: so that the parent page follows its child pages)
> 
>   Moved from Web/HTML/Canvas/Tutorial to HTML/Canvas/Tutorial
>     https://developer.mozilla.org/en-US/docs/Canvas_tutorial
> 
> Then I changed the location of the child pages.
> 
> After that, pages disappeared.

Also, I don't understand why you were moving those pages; they were in the correct place already.
(In reply to Eric Shepherd [:sheppy] from comment #22)

Because the child pages had already been placed in the correct position before the parent page was moved, I was not able to move the parent page into the right place.
The parent page should have been placed in the correct position before the child pages were moved.
That's the reason for that operation.

But I already found another solution.
(In reply to Eric Shepherd [:sheppy] from comment #21)
> Looks like you moved the Canvas tutorial page several times (at least,
> that's what the history says) and things got lost along the way. What
> exactly did you do while working on that page, so I can try to figure out
> where the child pages might be now?

Were you ever able to find these? If they're still in the DB -- and if they were part of a move they *ought* to be -- then the process of fixing bug 881327 gets a lot simpler.
Flags: needinfo?(eshepherd)
(In reply to James Bennett [:ubernostrum] from comment #24)
> (In reply to Eric Shepherd [:sheppy] from comment #21)
> > Looks like you moved the Canvas tutorial page several times (at least,
> > that's what the history says) and things got lost along the way. What
> > exactly did you do while working on that page, so I can try to figure out
> > where the child pages might be now?
> 
> Were you ever able to find these? If they're still in the DB -- and if they
> were part of a move they *ought* to be -- then the process of fixing bug
> 881327 gets a lot simpler.

No, as far as I can tell these pages are *gone* from the database; at least, I can't find them even searching by name in the Django admin panel. :(
Flags: needinfo?(eshepherd)
Looks like this work will be pretty complex. Maybe we could break this bug down, and move the smaller-sized dependencies through the Kanban board as normal?
(In reply to John Karahalis [:openjck] from comment #26)
> Looks like this work will be pretty complex. Maybe we could break this bug
> down, and move the smaller-sized dependencies through the Kanban board as
> normal?

Well, first thing would be to restore the canvas tutorial pages, and anything else missing as of the latest round.

Second thing is to leave the waffle switch enabled, and just watch for attempts to delete (which will fail and generate error mail with the switch enabled). Unless/until we get enough data from that, we will not know what's causing pages to disappear, and we will not be able to resolve this. I'm going to add this as blocking bug 886013 since that's the first step.
Depends on: 886013
(In reply to James Bennett [:ubernostrum] from comment #27)
> (In reply to John Karahalis [:openjck] from comment #26)
> > Looks like this work will be pretty complex. Maybe we could break this bug
> > down, and move the smaller-sized dependencies through the Kanban board as
> > normal?
> 
> Well, first thing would be to restore the canvas tutorial pages, and
> anything else missing as of the latest round.

We have rebuilt the canvas tutorial pages from old copies left around here and there, so that part is done.
Sheppy, is anything else needed on your end at this time? From what I understand, the next step on our end is to keep an eye out, and review the (newly set up) log when this happens again.
Flags: needinfo?(eshepherd)
As far as I'm aware, we're back into waiting mode again.

Although I'd still like to know how this switch got undone in the first place. :)
Flags: needinfo?(eshepherd)
FYI we can now use errormill to watch for these:

https://errormill.mozilla.org/mdn/mdn/
Bumping the priority back down since we haven't seen this in a while, and Sentry should catch us when someone tries to delete again.
Severity: blocker → critical
Priority: P1 → --
Assignee: nobody → schalk.neethling.bugs
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/790a4d784fa7c67827973c0d9bd129cf517f4852
Fix Bug 863692, remove wiki_error_on_delete switch

https://github.com/mozilla/kuma/commit/870c41a54e4d59ac7323731a1ffab07c328cc3fd
Merge pull request #4562 from schalkneethling/bug863692-remove-wiki-error-on-delete-switch

Fix Bug 863692, remove wiki_error_on_delete switch
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED