Closed Bug 1032455 Opened 10 years ago Closed 9 years ago

[tracker] Implement "helpfulness" rating for articles

Categories

(developer.mozilla.org Graveyard :: Wiki pages, enhancement)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jswisher, Unassigned)

References

Details

(Whiteboard: [specification][type:feature][LOE-ux:10][LOE:2])

Attachments

(2 files)

What problems would this solve?
===============================
This would provide a quick feedback mechanism for users about particular articles. It would also provide a quality measure for articles, which MDN currently does not have.

Who would use this?
===================
Users would use the helpfulness rating to provide feedback about articles.

Editors, localizers, and admins would use ratings over time and across articles to assess article quality, and prioritize updates.

What would users see?
=====================
A widget at the bottom of the article, with "Was this article helpful?" and Yes and No buttons. 

There would also need to be a dashboard for tracking helpfulness metrics per article, and across articles. Contents to be determined. Also, an interface for examining "No" feedback.

What would users do? What would happen as a result?
===================================================
Users would click either Yes or No. If Yes, they would get an acknowledgement. If No, they would get the opportunity to provide more details about what's wrong. See the attached screenshot for the options available on SUMO. 

Is there anything else we should know?
======================================
SUMO has had such a system in place for years, so we should look at it and talk to the SUMO team for ideas and lessons. 

User ratings tend to be polarized, so there is no need for more than two rating values. Metrics to track might include the ratio of yes to no votes, and the ratio of ratings to visits. See http://www.amazon.com/Building-Reputation-Systems-Randy-Farmer-ebook/dp/B0043D2ES6
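
For illustration, here is a minimal sketch of those two metrics in plain Python (all names and numbers are hypothetical, not part of any existing MDN code):

```python
def helpfulness_metrics(yes_votes, no_votes, visits):
    """Compute the two candidate metrics for one article:
    the share of votes that were positive, and how often
    visitors bother to vote at all."""
    total_votes = yes_votes + no_votes
    return {
        "helpful_ratio": yes_votes / total_votes if total_votes else None,
        "votes_per_visit": total_votes / visits if visits else None,
    }

# Example with made-up numbers: 13 yes votes, 4 no votes, 1,200 visits.
print(helpfulness_metrics(13, 4, 1200))
# -> {'helpful_ratio': 0.7647..., 'votes_per_visit': 0.01416...}
```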
<3 this feature idea.

:hoosteeno, is the screenshot from SUMO enough for product definition? :) I'd be happy to just copy and tweak the feature they have now.
Flags: needinfo?(hoosteeno)
Whiteboard: [specification][type:feature] → [specification][type:feature][LOE-ux:?]
As a person who updates MDN, I can tell you for sure that this would make me update it more often and do it better. Getting the feeling that other people are being helped is awesome.

That's why I would actually recommend that we try to put the "Was this helpful?" widget on each section of the MDN page; this way it can be more useful to editors, since they can see that people specifically found their edit helpful.


I have a bunch of edits to make, but I haven't gotten around to them yet, just because I don't know how helpful all my editing has been.


It might turn out that people won't use this feature much if they have to make an account, so please allow anonymous feedback.

If people still don't use it much, that would be a bummer :(
(In reply to Luke Crouch [:groovecoder] from comment #2)
> :hoosteeno, is the screenshot from SUMO enough for product definition? :)
> I'd be happy to just copy and tweak the feature they have now.

I would not want to use SUMO's categories as-is, since they don't all fit MDN's case. The appropriate categories for MDN will emerge over time, but we'll need to figure out an initial set.

(In reply to noitidart from comment #3)
> As a person who updates MDN, I would tell you for sure that this would make
> me update it more frequently and better. Getting the feeling that other
> people are being helped is awesome.

Thanks for the feedback!

> That's why I would actually recommend that we try to put the "Was this
> helpful?" thing on each secion on the mdn page, this way it can be more
> useful to editors. Like they can see that people specifically found their
> edit more useful.

I expect we'll start with per-page feedback, and only add per-section feedback if there is demand for it. Also, see bug 665752, which is about annotations/comments on pages; it may be in line with your thoughts. (Lots of history in that bug.)

> It might turn out people might not use this feature too much if they have to
> make an account, so please allow anonymous feedback.

I think that is the way SUMO's feature works (i.e., anonymously). They use cookies or something to hide the widget after you've used it on a particular page.
I think this set of categories would probably cover the bases:

* Not enough info
* Confusing
* Incorrect
* Other
* I have feedback about a Mozilla product
If an article is poor (not helpful) and I significantly update it, how do we reset this value?
That's a good point, Jean. Maybe we can do it by IP: if a user from that IP leaves another comment/feedback, it overwrites their old one.
No, I mean that if I significantly update the article, its bad rating will stay, giving the impression that the new article is still bad.
Hmm, well, I guess a script could decide whether the change was drastic; if it was, it would create a new poll for the revision. So polls could be kept per major revision? Just thinking out loud.
:atopal - what does SUMO do when a low-rated article is significantly updated? Do you reset the helpfulness rating?
Flags: needinfo?(a.topal)
I don't think they keep a global "for all time" rating. Rather, they look at ratings in a given time period. Thus, if the article was updated in May, the ratings would be (expected to be) higher in June.
I think the components of this solution are coming together.

1) the SUMO widget in comment 1 is good as UI guidance
2) the categories in comment 5 are good

The biggest gap here is specifying what we'll eventually do with the data, and what (if anything) we have to do with it at launch to get some value from the feature. So...

3) What will we do with this data down the road? The answers to this will influence how the feature is built. Ideas:
* we'll display negative feedback in the context of editing an article to help article editors fix it
* we'll create a task (https://developer.mozilla.org/en-US/docs/MDN/Getting_started#Possible_task_types) and encourage people to fix unhelpful content (using the handy negative feedback display mentioned above), ordering the list of these by most unhelpful
* we'll award badges to "most helpful" contributors (so we probably have to capture that editor-feedback relationship somehow starting on day 1)
* we'll measure overall satisfaction levels and how they change over time and how many articles have feedback and how many people are using the widget because :groovecoder
* we'll show some of those measurements on a community metrics dashboard, report them to folks, blog about them

4) What do we need to do with the data, minimally, to launch?
* I think if we're capturing the data such that we can address #3's use cases eventually, then we have to do very little at launch. To demonstrate the basic value of the feature, I think we could simply say, "13 people found this article helpful" somewhere on an article. That's nice to see, as comment 3 points out.
* And file bugs for the most compelling things in #3.

WRT "resetting" helpful rating, I suggest we answer this by examining our potential uses:
* On an article: "10 recent visitors found this article helpful." (A rolling window, maybe 60 days long?)
* In an editing interface: "Since its last edit, 8 people have said this article is too short." (Related to edit events perhaps, but "recent" might work too)
* In a profile or badge scenario: "20 people have found articles you edited helpful!" (Over all time, because why not?)
* On a metrics dashboard or a report: "We saw 50,312 pieces of feedback in 2014. 108 were negative, the rest were positive". (Over all time)

In other words, I think we don't want to "reset" the count, because we need that data perpetually for some of these cases. We just have to experiment with the right amount of it to show per use case. (A rough sketch of the rolling-window idea follows.)
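
To make the rolling-window bullet concrete, here is a rough Django ORM sketch. The DocumentFeedback model and its fields are hypothetical stand-ins, not anything that exists in kuma today:

```python
from datetime import timedelta

from django.utils import timezone


def recent_helpful_count(document, days=60):
    """Count 'helpful' votes in a rolling window without deleting
    older votes, so the all-time use cases above keep working."""
    cutoff = timezone.now() - timedelta(days=days)
    return DocumentFeedback.objects.filter(  # hypothetical model
        document=document,
        helpful=True,
        created__gte=cutoff,
    ).count()
```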
Flags: needinfo?(hoosteeno)
How do we measure the success of this feature? Which metric(s) do we want to improve with it? And what is our prognosis for improving them?
Since this feature's data will feed *other* features, it succeeds when:

1. X% of readers use it and 
2. provide quality feedback with it

:r1cky - what's the reader:helpfulness-rater ratio on SUMO?
Flags: needinfo?(rrosario)
(In reply to Luke Crouch [:groovecoder] from comment #16)
> :r1cky - what's the reader:helpfulness-rater ratio on SUMO?

In the past month:  832,630 votes for 33,903,559 unique views
Flags: needinfo?(rrosario)
(In reply to Ricky Rosario [:rrosario, :r1cky] from comment #17)
> (In reply to Luke Crouch [:groovecoder] from comment #16)
> > :r1cky - what's the reader:helpfulness-rater ratio on SUMO?
> 
> In the past month:  832,630 votes for 33,903,559 unique views

Those are article views ^
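
(For what it's worth, that works out to 832,630 / 33,903,559 ≈ 2.5% of views producing a vote — roughly one vote per forty views.)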
:r1cky - do you have numbers from when you first launched the feature?
Flags: needinfo?(rrosario)
(In reply to Luke Crouch [:groovecoder] from comment #19)
> :r1cky - do you have numbers from when you first launched the feature?

Nope :( It has been there since the original version of Kitsune, and back then we weren't on GA yet... I think we've found that placement affects the number of people who vote.
Flags: needinfo?(rrosario)
Wow, thanks Janet for getting this rolling. I have wanted this feature for so long. When I sent out that looong-winded email nagging for "helpful ratings" via the AMO Editors thing, I was so surprised that it got a reply hahaha. It looks like lots of people want to see this happen too. :)
(In reply to Luke Crouch [:groovecoder] from comment #16)
> Since this feature's data will feed *other* features, it succeeds when:
> 
> 1. X% of readers use it and 
> 2. provide quality feedback with it
> 
1. What value of X counts as a success? We need to define this now, before implementation, not declare it a success after launch.
2. As stated, this is not something we can judge as a success or a failure; we need a measurable metric here.



Also, I don't see how much it will help toward the goals defined by Stormy. Can you provide this information:
* How many new readers do you expect we will get *this year*?
* How many new active editors per month will this feature bring the month after launch, after 3 months, and after 6 months?

Here again we need a prediction, so that we can check whether our model of how MDN works is correct or needs tweaking.
Flags: needinfo?(lcrouch)
Also, I think we need to turn this bug into a meta bug; we need a lot of sub-features to get this working.

We need to adapt the page json generation, we need to adap the MDN macros so that our l10n and,  we need a few special pages (à la /fr/docs/without-parent ), we need a UI that needs to be reviewed by the content team before sign-off.

Also, how will this work on translated pages? As they are mere translations, will the info be combined with the en-US rating? If not, how do we act on them? Is there a risk that translators get poor ratings and get frustrated because it is the en-US original that is poor?
(In reply to Jean-Yves Perrier [:teoli] from comment #23)
> we need to adap the MDN macros so that our l10n and, 
Sorry too early in the morning!

I meant:

We need to adapt several MDN macros so that our l10n and topic drivers know they have poor-quality pages in their area.
+1 noitidart - Janet, thanks for starting this thread.

+1 :teoli - thanks for helping define this feature and target it to our goals

For the success metrics & goals, I propose:

1. Target a 2.5% usage rate, like SUMO has

2a. Increase revisions to unhelpful docs by 10%
2b. Increase first-time contributors to unhelpful docs by 10%
2c. Increase helpfulness rating of unhelpful docs by 10%

Great questions re: translations. needinfo'ing :hoosteeno
Flags: needinfo?(lcrouch) → needinfo?(hoosteeno)
> Also, how will this work on translated pages? As they are mere translations,
> will the info be combined with the en-US rating? If not, how do we act on
> them? Is there a risk that translators get poor ratings and get frustrated
> because it is the en-US original that is poor?

It's a great question.

I think we can't assume that feedback about a page in Spanish (es confuso, "it's confusing") applies to the page in English. In other words, the feedback a potential editor might see could vary depending on their locale. We might save the locale of the person giving feedback and associate it with the locale of a person editing. We might also include a fallback option -- perhaps something like "Also, 23 people offered feedback in English."

As a monoglot, I'm not sure this is the right way forward. Better ideas welcome!

> Also I think we need to transform this bug in a meta bug, we need a lot of
> sub-features to get this working.

Agreed.

:groovecoder, now that the scope of this bug is becoming clearer, can you give us an estimated level of effort?
Flags: needinfo?(hoosteeno) → needinfo?(lcrouch)
Summary: Implement "helpfulness" rating for articles → [tracker] Implement "helpfulness" rating for articles
Whiteboard: [specification][type:feature][LOE-ux:?] → [specification][type:feature][LOE-ux:5]
Severity: normal → enhancement
Component: General → Wiki pages
Is it possible to have something like a monthly leaderboard? Every time something gets marked as helpful, all users who edited it get a point, and we can see a scoreboard. Monthly only, because all-time boards eventually turn the top users into permanent leaders.
That's certainly possible, though we probably won't ship it with v1 of the feature.

As for the back-end estimate ...

We already use a contentflagging app [1] to flag demos [2] and to mark wiki docs for deletion [3]. The wiki view is already set up to receive POSTs with any flag_type from FLAG_REASONS [4].

So, the backend change for the MVP is to:

1. Add 'helpful' and 'unhelpful' to WIKI_FLAG_REASONS
2. Make the doc template check if the user has already flagged the article as 'helpful' or 'unhelpful'

The rest of the work is to design & build the front-end that sends the proper POST to the flag view, and displays if - and how - the article has been flagged. (The backend steps are sketched below.)

[1] https://github.com/mozilla/kuma/tree/master/apps/contentflagging
[2] https://github.com/mozilla/kuma/blob/master/apps/demos/views.py#L250-275
[3] https://github.com/mozilla/kuma/blob/master/apps/wiki/views.py#L2448-2479
[4] https://github.com/mozilla/kuma/blob/master/settings.py#L860-873
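
To make steps 1 and 2 concrete, here is a rough sketch. The WIKI_FLAG_REASONS entries follow the settings link above, but the import path and the ContentFlag field names (content_type, object_pk, creator, flag_type) are assumptions about the contentflagging app, not verified against the current code:

```python
# settings.py -- step 1: extend the existing reasons tuple
WIKI_FLAG_REASONS = (
    # ... existing reasons stay as-is ...
    ("helpful", "This article was helpful"),
    ("unhelpful", "This article was not helpful"),
)
```

```python
# apps/wiki/views.py -- step 2: a hypothetical helper for the doc
# template context, so the widget can render as "already voted".
from django.contrib.contenttypes.models import ContentType

from contentflagging.models import ContentFlag  # path is an assumption


def existing_helpfulness_flag(request, document):
    """Return 'helpful', 'unhelpful', or None for the current user."""
    if not request.user.is_authenticated():
        return None
    doc_type = ContentType.objects.get_for_model(document)
    flag = (ContentFlag.objects
            .filter(content_type=doc_type,
                    object_pk=document.pk,
                    creator=request.user,
                    flag_type__in=("helpful", "unhelpful"))
            .first())
    return flag.flag_type if flag else None
```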
Flags: needinfo?(lcrouch)
Whiteboard: [specification][type:feature][LOE-ux:5] → [specification][type:feature][LOE-ux:5][LOE:2]
Sorry for the late reply here (offsite and all), here is some more info about how we are using this at SUMO.

1. The placement of the survey is at the bottom of the article (it gives you an order of magnitude more votes than the sidebar)

2. We show the votes over time (both total votes (yes and no) and percentages) on each article's history page: https://support.mozilla.org/en-US/kb/update-firefox-latest-version/history (click on "Show Helpfulness Votes Chart")

3. We show a short list of articles with their helpfulness rating on the KB dashboard: https://support.mozilla.org/en-US/contributors The full list lives at: https://support.mozilla.org/en-US/contributors/unhelpful

We sort that list by a combination of number of views and helpfulness rating (and I think we only take the last 30 days into account). A sketch of one possible scoring is below.

4. We currently store the comments that go with "no, this wasn't helpful" in the KB; they have no representation in the UI.

5. Every document gets its own helpfulness rating, meaning all localized articles are separate, and we show the helpfulness for them on their Localization dashboard and on the history page. (There are large differences in helpfulness between locales.)


Let me know if you have any other questions.
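
For item 3, a traffic-weighted score is one plausible way to combine views and rating. The exact SUMO weighting isn't stated above, so this sketch just illustrates the idea:

```python
def unhelpfulness_score(views_30d, yes_votes_30d, no_votes_30d):
    """Rank articles so heavily-viewed, poorly-rated ones float to
    the top of the 'needs attention' list. The weighting is invented."""
    total = yes_votes_30d + no_votes_30d
    if total == 0:
        return 0.0
    return (no_votes_30d / total) * views_30d

# e.g. docs.sort(key=lambda d: unhelpfulness_score(*d.stats), reverse=True)
```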
Flags: needinfo?(a.topal)
Depends on: 1039463
Depends on: 1039468
Increasing the UX estimate.  Stephanie says we may have to do a more thorough investigation of other rating widgets, and possibly a bit of original design work, to get a system that fits our goals.
Whiteboard: [specification][type:feature][LOE-ux:5][LOE:2] → [specification][type:feature][LOE-ux:10][LOE:2]
I was wondering how this was going. We have people posting on stackoverflow that they are finding things useful on MDN: http://stackoverflow.com/a/28030718/1828637
(In reply to noitidart from comment #31)
> I was wondering how this was going. We have people posting on stackoverflow
> that they are finding things useful on MDN:
> http://stackoverflow.com/a/28030718/1828637

That is fantastic, thanks for the link. It is always great to hear that people found their answer on MDN.

There are no engineers currently working on this feature and it is not prioritized on our near-term roadmap[0]. If any very experienced django developers or UX experts want to pitch in, please reach out here or in #mdn.

[0] https://wiki.mozilla.org/MDN/Product_roadmap#Product_Roadmap
I am not a very experienced django developer, but I do have a little django experience. I have set up the Vagrant-based dev environment. I would like to pitch in on the django-specific part as my first contribution. Any insight?
Flags: needinfo?(lcrouch)
Hi Abhishek. This particular feature may involve heavier django code, and take more time, than you'd like for a first bug. But, if you want to experiment with something and learn more django, a general approach might be to add a wiki view that can create, update, and retrieve an ActionCounterUnique model for an article. Then wire it up with some basic UI on the wiki article view page.

But be aware that this feature needs some UX attention and some product attention before we'll be able to merge or push it. So even if you get something working quickly, it may be a long while before it lands on the live site.
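
A very rough sketch of that suggestion follows. The import paths and the ActionCounterUnique helpers used here (get_unique_for_request, increment, total) are invented for illustration; check kuma's actioncounters app for the real interface:

```python
from django.http import JsonResponse
from django.shortcuts import get_object_or_404
from django.views.decorators.http import require_http_methods

from actioncounters.models import ActionCounterUnique  # path is an assumption
from wiki.models import Document  # path is an assumption


@require_http_methods(["GET", "POST"])
def helpful_counter(request, document_slug):
    """GET returns the current count; POST records one vote,
    de-duplicated per unique user/IP by ActionCounterUnique."""
    doc = get_object_or_404(Document, slug=document_slug)
    counter = ActionCounterUnique.objects.get_unique_for_request(
        doc, "helpful", request)  # invented helper
    if request.method == "POST":
        counter.increment()  # invented method
    return JsonResponse({"helpful": counter.total})
```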
Flags: needinfo?(lcrouch)
@groovecoder: Thanks for that suggestion! I'd rather not start with this bug; looking into others. Thanks :)
This is another item on my long-term to-do list to write requirements for.
Flags: needinfo?(jswisher)
I have filed a project review bug for Qualaroo, a micro-survey widget tool. I would like to use Qualaroo to help us learn the answers to some key outstanding questions related to the helpfulness bug:

* How can we convince readers to rate content? (what call to action? what placement?)
* How do we make sense of ratings? (what weight do we assign to ratings? what else might be involved -- content traffic rates? social share rate? links back? over what timespans?)
* How can we use those ratings to improve content? (what do we do with pages that have a low quality rating? how do we triage this?)
* How will we know the helpfulness rating is making content more helpful? (the content helpfulness rating will go up over time?)

Using Qualaroo cuts out a lot of development and UI work and lets us move directly to answering these key questions; then, if it proves insufficient (for example, because it does not integrate with our metrics dashboards), we will be able to implement a much more specific feature set.

I suggest we treat Qualaroo as an experiment that will help us learn what helpfulness needs to be. I attach a demo that shows how it might present a question about helpfulness.

(FWIW, we want to use Qualaroo for other things too, but it seems very suited for this.)
Just rediscovered this spreadsheet full of user stories about this feature, which was created back in October 2014: https://docs.google.com/a/mozilla.com/spreadsheets/d/1Jt1-LN_siEjT2QsRoaeVbC_YySIFKtjClABYUQgkNiI/edit#gid=0
Flags: needinfo?(jswisher)
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
I don't think this bug is fixed. But we have some great progress! Specifically...
* We have an experiment launched this week[0]
* We're collecting great data from the experiment[1]

The current widget is not good enough to be made permanent, but it is producing great information about the potential for this feature going forward.

[0] https://docs.google.com/document/d/19jZFet9i-zptxnhnzuOjqQLYPPyr2JAK38jluCKdrU0/edit#
[1] https://docs.google.com/spreadsheets/d/1b54Lbmnund4ZOnnbGX9OmV8q0radtj27ENMD4Pm6BZ0/edit#gid=305806819
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
We learned some great things from the helpfulness experiment. Briefly...

1) Helpfulness is one metric (of several, many of which are already instrumented) that we could use to identify opportunities for content improvement on a locale-by-locale, section-by-section, or page-by-page basis. If Helpfulness were instrumented, it would probably have a place in the Doc Status stats[0].

2) Helpfulness ratings could also be a useful metric for identifying, prioritizing and measuring progress on a variety of technical changes. The data the experiment produced suggest interesting explorations in UX and navigation. A helpfulness metric might also be useful as a benchmark for performance improvements. In this case Helpfulness might share space with other Engineering KPIs such as site stability and page speed. 

Both items above would depend on either constant or regular collection of Helpfulness data at a higher sample rate than the experiment had (it was potentially exposed to 5%, but actually only to 0.2% due to the page-time requirement).

A slide deck that contains the entire narrative of Helpfulness is here (for fans of circular references, it links to this bug): https://docs.google.com/presentation/d/1RPXNXHhSpvkChl-wxkCymntRJSL1E8mb4uS0SlfD8-Y/edit#slide=id.gb8cec65aa_0_51

I presented this slide deck (including its final pitch) to stakeholders in the engineering and content organizations during two optional meetings. Based on the experiment, I believe the metric is valuable enough to collect constantly, consider (though not alone) in prioritization conversations, and optimize against.

[0] e.g. https://developer.mozilla.org/en-US/docs/MDN/Doc_status/JavaScript
Just to state it clearly: the doc team currently has no plan to use this as a KPI in 2016. We do like it as a metric, but re-running the experiment once a quarter or twice a year is enough for our needs. We don't require any more development there.
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/2fab2cebccfa0047e963fcb337513ad6560fa080
Bug 1032455: Tweaks to helpfulness

2 new options for "What would make it better?".
Do not ask more than once per day.

https://github.com/mozilla/kuma/commit/afa8e0fccf162743937d704f4165fd66297dfd49
Merge pull request #3556 from stephaniehobson/bug-1032455-helpfulness

Bug 1032455: Tweaks to helpfulness
I thought the main goal of helpfulness was to make content writers feel appreciated and show them that their work is actually being used, which would encourage them to contribute more content to MDN.
In my opinion, that is a beneficial effect, but not a primary goal. The main goal is to understand which pages are helpful to readers and which ones are not, so we can improve the lower-rated pages. (Low ratings on a page don't make anybody feel good, but they're important to know about.)
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/e0c5d0c0ca131ae218ffd247a4c843e18d19d637
Bug 1032455: Helpfulness - extend wait

Extend waiting period to ask again to 7 days. Move setting localStorage variables earlier in the file so asking again is not contingent on their answer. Track waiting period for helpfulness and articles using same technique.

https://github.com/mozilla/kuma/commit/ac0628e85f74547a8fcee907da3ed8b85be658e4
Merge pull request #3634 from stephaniehobson/bug-1032455

Bug 1032455: Helpfulness - extend wait
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/eb9bb498ef725d18028b8992e07a688f4c39283d
Fix Bug 1032455: Remove Helpfulness waffle flag.

https://github.com/mozilla/kuma/commit/3f3074f2a3fa6bb0f931872fd1470d9d9d57c9f9
Merge pull request #3686 from stephaniehobson/1032455-helpfulness

Fix Bug 1032455: Remove Helpfulness waffle flag.
Status: REOPENED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Product: developer.mozilla.org → developer.mozilla.org Graveyard
