Bug 706948 - In some cases, a question displays "No replies" with a reply below
Status: RESOLVED FIXED
u=contributor c=questions p=1
Product: support.mozilla.org
Classification: Other
Component: Questions
Version: unspecified
Platform: All All
Importance: P2 normal
Target Milestone: 2012.8
Assigned To: Ricky Rosario [:rrosario, :r1cky]
Mentors:
Duplicates: 707070 717163 736586
Depends on:
Blocks:
Reported: 2011-12-01 13:12 PST by Ricky Rosario [:rrosario, :r1cky]
Modified: 2012-05-07 18:28 PDT
CC List: 8 users
See Also:
QA Whiteboard:
Iteration: ---
Points: ---


Description Ricky Rosario [:rrosario, :r1cky] 2011-12-01 13:12:34 PST
For example:

https://support.mozilla.com/en-US/questions/901169
http://cl.ly/2q2D1Y3q0j3V3x0v063w
Comment 1 Swarnava Sengupta (:Swarnava) 2011-12-01 23:59:08 PST
*** Bug 707070 has been marked as a duplicate of this bug. ***
Comment 2 Ricky Rosario [:rrosario, :r1cky] 2012-01-11 06:23:37 PST
*** Bug 717163 has been marked as a duplicate of this bug. ***
Comment 3 Swarnava Sengupta (:Swarnava) 2012-01-11 06:37:27 PST
This started again after the 10-01-12 SUMO release.
Comment 4 Ricky Rosario [:rrosario, :r1cky] 2012-01-11 06:39:30 PST
There seem to be some cases where an answer gets saved but the question's denormalized fields don't get updated. (I have no idea how this can happen.)
Comment 5 Muhammed Hasan 2012-01-28 10:00:06 PST
The problem is still happening. Related explanation: https://support.mozilla.org/en-US/forums/contributors/708120
Comment 6 Ricky Rosario [:rrosario, :r1cky] 2012-02-14 07:37:14 PST
I still have no idea how this happens. Making this a 1pter to look into it. This does seem infrequent, so we shouldn't spend more than about half a day going down a rat hole.
Comment 7 Will Kahn-Greene [:willkg] 2012-02-17 14:49:29 PST
Grabbing this one to look into on Monday.
Comment 8 Will Kahn-Greene [:willkg] 2012-02-21 13:49:02 PST
The "No Replies" message is displayed if question.num_answers is 0.

I think that if the question comes from cache, num_answers could be 0, while question.answers.all() could kick off a query that doesn't come from cache and shows the answers in the db. You could thus get two different "understandings" of the state of things.

I think the easy fix is to change questions/answers.html to do one of two things:

1. rely on question.num_answers: if num_answers is 0, then don't do a db query and don't show answers

2. rely on question.answers: check whether question.answers is empty when deciding whether to show "No replies"


I'm inclined to go with the latter. I don't think it'll be any worse performance-wise since we're already doing the query to show the answers.
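
A rough sketch of the two options, next to the behavior that produces the mismatch. This is plain Python rather than the real template, and the Question class and render_* helpers are hypothetical stand-ins, not kitsune code:

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Question:
    # Hypothetical stand-in for the question model; not kitsune's actual class.
    num_answers: int = 0                               # denormalized counter
    answers: List[str] = field(default_factory=list)   # rows actually in the db


def render_mismatch(q: Question) -> Tuple[str, List[str]]:
    # Message comes from the counter, answers come from the query,
    # so the two can disagree.
    message = "No replies" if q.num_answers == 0 else ""
    return message, list(q.answers)


def render_option_1(q: Question) -> Tuple[str, List[str]]:
    # Option 1: trust num_answers; skip the answers query when it is 0.
    if q.num_answers == 0:
        return "No replies", []
    return "", list(q.answers)


def render_option_2(q: Question) -> Tuple[str, List[str]]:
    # Option 2: trust the answers that are about to be rendered.
    answers = list(q.answers)
    return ("No replies" if not answers else ""), answers


# A question whose counter was not updated after an answer was saved.
stale = Question(num_answers=0, answers=["Try safe mode."])
```

With the stale question above, render_mismatch gives the message and the answer together, option 1 hides the answer, and option 2 shows the answer without the message. Either option makes the page self-consistent; only option 2 shows what is in the db.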
Comment 9 Will Kahn-Greene [:willkg] 2012-02-21 14:05:38 PST
Hrm... according to the screenshot, the reply was 42 minutes later. So I think that probably nixes the cache theory.

I'll look into how an answer could be added without updating num_answers.
Comment 10 Will Kahn-Greene [:willkg] 2012-02-22 07:11:30 PST
I can reproduce the problem (or something that looks like the problem) locally when I have memcache enabled. I create a new question, then I answer it. The problem occurs when the answer gets saved. There's a section of code that does:

   self.question.num_answers = self.question.answers.count()

after the answer has been saved. However, that self.question.answers.count() is returning 0--even though there should be an answer there--so then self.question.num_answers gets a 0, gets saved, and that's how we end up with a mismatch.

Three interesting things:

1. There is a test for this already and the test works fine (test_models.py TestAnswer.test_new_answer_updates_question).

2. If I switch to the dummy cache, I can't reproduce it anymore with my steps.

3. If I change the count query to:

       self.question.num_answers = Answer.objects.filter(
           question=self.question, upvotes__gte=0).count()

   that correctly returns the number of answers. The key part is the upvotes__gte
   part which is totally goofy, but causes the query to get the answer from the db
   rather than cache.


I'd rather not go with that as a solution, though. Is there a way to invalidate the cache or force the queryset to get the results from the db for "get me all answers for question x"?
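
A toy model of what I think I'm seeing, assuming the count query result is being served from a cache. ToyCountCache and its methods are made up for illustration; this is not the caching library's API or kitsune code:

```python
class ToyCountCache:
    """Hypothetical query cache for answer counts, keyed by question id."""

    def __init__(self, db):
        self.db = db        # dict: question_id -> list of answers ("the db")
        self.cache = {}     # cached count results

    def count(self, question_id):
        # Cached path: the first result is remembered and reused.
        key = "count:%s" % question_id
        if key not in self.cache:
            self.cache[key] = len(self.db.get(question_id, []))
        return self.cache[key]

    def count_uncached(self, question_id):
        # Bypass path: always ask the db.
        return len(self.db.get(question_id, []))

    def invalidate(self, question_id):
        # Drop the cached result so the next count() hits the db again.
        self.cache.pop("count:%s" % question_id, None)


db = {}
store = ToyCountCache(db)

store.count(1)                          # question has no answers yet; 0 is cached
db.setdefault(1, []).append("answer")   # the answer gets saved

cached_count = store.count(1)           # stale: still the cached 0
fresh_count = store.count_uncached(1)   # bypasses the cache
store.invalidate(1)
after_invalidate = store.count(1)       # recomputed from the db
```

In this model, writing cached_count into num_answers would reproduce the mismatch, while either the bypass or the invalidation yields the real count.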
Comment 11 Will Kahn-Greene [:willkg] 2012-02-22 08:38:38 PST
Switched it to use uncached. That fixes the num_answers mismatch for me.

https://github.com/mozilla/kitsune/pull/506
Comment 12 Will Kahn-Greene [:willkg] 2012-02-22 09:36:11 PST
Checked into master in https://github.com/mozilla/kitsune/commit/41a5e7326ba5aa7ecfd1b500eebb4efb039c0608 .

I think that change fixes the problem here. If not, we'll go back to the drawing board with new information.
Comment 13 Rebecca Billings [:rbillings] 2012-02-29 16:53:21 PST
Is there a way to test the fix manually on stage? I haven't been able to repro.
Comment 14 Will Kahn-Greene [:willkg] 2012-03-01 06:06:19 PST
(In reply to Will Kahn-Greene [:willkg] from comment #12)
> Checked into master in
> https://github.com/mozilla/kitsune/commit/41a5e7326ba5aa7ecfd1b500eebb4efb039c0608 .
> 
> I think that change fixes the problem here. If not, we'll go back to the
> drawing board with new information.

So... the problem with my steps to reproduce and premise was that they assumed we were caching counts, which turned out to be wrong. Given that, although I was able to trivially reproduce the issue locally, that reproduction doesn't apply to dev, stage or production.

After talking this over with James and Ricky, we decided to leave the fix in because it shouldn't have an adverse effect and if we ever did decide to cache counts, this would be something that would just continue to work without additional changes.

The reported issue happens sporadically. It's probably hard to test in production, though we'll find out if the frequency of the issue goes down. It's probably harder to test on staging where we have less activity and data.
Comment 15 Rebecca Billings [:rbillings] 2012-03-01 16:30:30 PST
I'm closing this as [qa-] due to the difficulty of testing on stage. This will re-open if the problem persists on production after the push.
Comment 16 John Hesling [:John99] (NeedInfo me) 2012-04-21 12:39:14 PDT
The problem possibly continues. Apparently it was noticed on a recent SUMO Day; see, for instance, /forums/contributors/708120#post-46131
Comment 17 Will Kahn-Greene [:willkg] 2012-04-22 06:48:58 PDT
Reopening.

Given that this has been worked on several times before over the last year (possibly year and a half), I think it warrants serious effort the next time we look into it since it's hard to reproduce and we've probably exhausted the set of easy causes. So it should probably be a 3 pter.
Comment 18 Tyler Downer [:Tyler] 2012-04-25 09:16:24 PDT
I've been seeing this a lot recently on the forums, especially over the past two weeks or so.
Comment 19 Ricky Rosario [:rrosario, :r1cky] 2012-04-26 11:36:42 PDT
Landed in prod. I have a good feeling this will fix it:

https://github.com/mozilla/kitsune/commit/4fbc42f0ec3fbfc8ecc3ce4cc7ed46e3e59baad0
Comment 20 Ricky Rosario [:rrosario, :r1cky] 2012-05-07 18:28:51 PDT
*** Bug 736586 has been marked as a duplicate of this bug. ***