Open Bug 1058583 Opened 5 years ago Updated 5 years ago

Address Book Popularity Index needs to age

Categories

(Thunderbird :: Address Book, defect)

Tracking

(Not tracked)

People

(Reporter: educmale, Unassigned)

References

(Blocks 1 open bug)

Details

User Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0 (Beta/Release)
Build ID: 20140716183446

Steps to reproduce:

Using the address book autocomplete...

The PopularityIndex not only needs to age, but it must do so without a history of when the various email addresses were used in the first place. (Note that, in the worst case, a user could have a huge address book and no emails.)

I am assuming (am I right?) that there is a field in the AB database with the integer count, the "PopularityCount", which is incremented by one each time an email is sent using an address.



Expected results:

I think I have a simple solution path for this aging, where the PopularityIndex is really a "PopularityCount" of uses -- using this algorithm: 

Pick various values, and stash them in the config. The first is an integer "ReductionFactor" of 0-100; default=3. The second is "DaysBetween", an integer which is added to the current execution date to select the "NextExecutionDate" of the algorithm; default=30. The third is the "ReductionFactorMultiplier"; default=1.5. The fourth is the "RangeBase", which helps define the ranges in a), b) and c) below; default=10.

On or after the "NextExecutionDate", if the "ReductionFactor" is not zero, reduce the "PopularityCount" for each address in the address book, using these rules:
a) If the current PopularityCount is <= 100 (<="RangeBase^2"), don't change the count
b) If the PopularityCount is 101-1000, (RangeBase^2+1 <= PopularityCount <= RangeBase^3) reduce the count by ReductionFactor as a percent, rounding up to an integer
c) If the PopularityCount is >= 1001 (>= RangeBase^3 + 1), reduce the count by ReductionFactorMultiplier*ReductionFactor.   
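Rules a), b) and c) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Thunderbird code: the function name age_count is made up, "rounding up" is assumed to apply to the amount removed, and rule c) is assumed to mean a percentage (ReductionFactorMultiplier*ReductionFactor percent), like rule b).

```python
import math


def age_count(count, reduction_factor=3, multiplier=1.5, range_base=10):
    """Return the aged PopularityCount for one address (hypothetical helper)."""
    low = range_base ** 2     # 100 with the default RangeBase
    high = range_base ** 3    # 1000 with the default RangeBase

    # ReductionFactor of zero switches aging off; rule a) leaves small counts alone.
    if reduction_factor == 0 or count <= low:
        return count

    # Rule b): middle band is reduced by ReductionFactor percent.
    percent = reduction_factor
    # Rule c): large counts use the multiplier (assumed to be a percentage too).
    if count > high:
        percent = multiplier * reduction_factor

    # Assumption: "rounding up" applies to the amount taken off the count.
    reduction = math.ceil(count * percent / 100)
    return max(count - reduction, 0)
```

With the defaults, a count of 500 loses 15 and a count of 2000 loses 90, while anything at or below 100 is untouched.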

Solutions for hard user cases
0) Disinterested user: sets the ReductionFactor to zero
1) User is just starting to use the algorithm after years and years of not using it (and therefore has very large counts).  Do nothing.   The ReductionFactorMultiplier will smooth out this anomaly.
2) User who has a long history, including previously well used addresses that he still wants to strongly emphasize in the AB autocomplete sort: increase the RangeBase, or reduce the ReductionFactorMultiplier
3) User who doesn't access TB all that often, but still wants to age his PopularityCount: Does nothing (as this algorithm only executes once, on the next execution of TB)
4) User who CC's one address on every email:  Algorithm does not disrupt the access to this autocomplete
5) User whose address book is small:  Algorithm does not disrupt his use of AB Autocomplete
6) User whose address book is large, but all addresses are used equally: Algorithm does not disrupt AB Autocomplete

Addresses which continue to be used will continue to up the counts, and counter the aging
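How continued use counters the aging can be illustrated with a small simulation. This is a sketch only: the helper names and the number of sends per aging cycle are invented for illustration, the aging rule is a compact restatement of a) to c) with the default values, and rule c) is assumed to be a percentage.

```python
import math


def age(count, factor=3, multiplier=1.5, base=10):
    """Compact version of rules a) to c); rounds the removed amount up (assumption)."""
    if count <= base ** 2:
        return count
    percent = factor * multiplier if count > base ** 3 else factor
    return count - math.ceil(count * percent / 100)


def cycles_until_overtaken(dormant, active, sends_per_cycle, max_cycles=120):
    """Count aging cycles until the address still in use outranks the dormant one.

    Returns None if that does not happen within max_cycles.
    """
    for cycle in range(1, max_cycles + 1):
        # The active address gains its new sends, then both addresses are aged.
        active = age(active + sends_per_cycle)
        dormant = age(dormant)
        if active > dormant:
            return cycle
    return None


# Hypothetical input: a dormant address at 1000, an active one at 500
# that receives 40 more sends in every aging cycle.
print(cycles_until_overtaken(dormant=1000, active=500, sends_per_cycle=40))
```

The point of the sketch is that the aging alone does not reorder two addresses in the same range; it is the new sends to the active address, combined with the aging of the dormant one, that closes the gap.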

I suspect that heavy TB users could scan their AB database, and get a feel for whether my suggested defaults for RangeBase, ReductionFactor and ReductionFactorMultiplier make sense.

This could be either an integral part of TB or an add-on. I think it should be integral to the application.
Component: Untriaged → Address Book
Product: Firefox → Thunderbird
Version: 31 Branch → unspecified
On reflection, to normalize the Config values, the ReductionFactor should be 1-100, and there should be a boolean "bExecuteAging", which is tested to execute the aging system if True.

I am calling 'the config' the values that are stashed in About:Config -- am I labeling this place correctly?
Not explicitly stated is that the check of the "bExecuteAging" and the "NextExecutionDate" would be undertaken on TB startup, but after the completion of checking email accounts for new mail, etc.
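The startup check could look something like the sketch below. It is an illustration only: the prefs are modeled as a plain dictionary rather than the real preferences API, run_startup_check is a made-up name, and the aging rule itself is passed in as a function so the sketch stays independent of how the counts are reduced.

```python
from datetime import date, timedelta


def run_startup_check(prefs, counts, today, age):
    """Age all counts once if aging is enabled and due; return True if it ran.

    prefs  -- dict with "bExecuteAging", "NextExecutionDate", "DaysBetween"
    counts -- dict mapping address -> PopularityCount (modified in place)
    age    -- function taking a count and returning the aged count
    """
    if not prefs["bExecuteAging"] or today < prefs["NextExecutionDate"]:
        return False

    for address in counts:
        counts[address] = age(counts[address])

    # The next run is scheduled from the current execution date, so a user
    # who was away for months is still aged only once on the next startup.
    prefs["NextExecutionDate"] = today + timedelta(days=prefs["DaysBetween"])
    return True


prefs = {
    "bExecuteAging": True,
    "NextExecutionDate": date(2014, 8, 30),
    "DaysBetween": 30,
}
counts = {"someone@example.com": 200}
# Placeholder aging rule for the demonstration: take 3% off, rounded down.
ran = run_startup_check(prefs, counts, date(2014, 9, 1), lambda c: c - c * 3 // 100)
```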
John, thanks a lot for this detailed proposal to improve the sorting algorithm of recipient autocomplete results, currently based on the everlasting popularityIndex counter, with an aspect of time (conceptually similar to bug 382415), so that users will see the best matches on top.

I'm not sure if I'm able (yet) to comment on your proposal, but it can certainly get us thinking more specifically, and in the right direction, about what kind of algorithm is needed to solve the problem originally filed in bug 382415. To make the link, I've marked bug 382415 to depend on this bug, which is not a dependency in the strict sense, but rather presents this bug 1058583 as one possible solution.

I'm hoping we might benefit from the algorithm currently implemented in FF awesome bar (location bar), which I think is based on a dedicated database which maps any user input (e.g. "Jo") to the selected target ("John Ruskin <asdf@asdf.com>") and then somehow factors in frequency and recency of that mapping to show better results next time.
(In reply to john ruskin from comment #1)
> On reflection, to normalize the Config values, the ReductionFactor should be
> 1-100, and there should be a boolean "bExecuteAging", which is tested to
> execute the aging system if True.
> 
> I am calling 'the config' the values that are stashed in the About:Config --
> am I labeling this place correctly....?

Thanks for asking, correct terminology is crucial... :)

We usually call them "prefs", which is short for "preferences", but imo using the full term isn't recommended as it might be misunderstood as "Tools > Options" which is the primary UI for editing some of the prefs. Other technical prefs are not found in the main Options UI, so they are "hidden prefs". The dialog to edit all the prefs (hidden or not) is called Config Editor. The URL to get Config Editor is about:config (same as in FF). The place where most of them actually live is in your local copy of this file:
http://mxr.mozilla.org/comm-central/source/mailnews/mailnews.js
There might be other dedicated files I can't remember right now, and you can add your own in a file called user.js or so. Here's the link to the FF prefs guide, but TB might differ slightly:

https://developer.mozilla.org/en-US/docs/Mozilla/Preferences/A_brief_guide_to_Mozilla_preferences
(In reply to Thomas D. from comment #3)
> I'm hoping we might benefit from the algorithm currently implemented in FF
> awesome bar (location bar), which I think is based on a dedicated database
> which maps any user input (e.g. "Jo") to the selected target ("John Ruskin
> <asdf@asdf.com>") and then somehow factors in frequency and recency of that
> mapping to show better results next time.

A starting point to find out more about the FF frecency algorithm implementation might be Bug 382415 Comment 7, but some links are outdated.
(In reply to Thomas D. from comment #3)
Thanks, Thomas, for your kind and helpful comments.

What prompted my solution path was the complexity of a solution which attempts to measure aging, compared to the simplicity of aging the Count, directly.

I'm smiling, just -reading- the paragraph where you explained the complexity, which you so clearly and helpfully focused on:
"I'm hoping we might benefit from the algorithm currently implemented in FF awesome bar (location bar), which I think is based on a dedicated database which maps any user input (e.g. "Jo") to the selected target ("John Ruskin <asdf@asdf.com>") and then somehow factors in frequency and recency of that mapping to show better results next time."

The additional advantages of aging the count are that it:
1. does not require modification of the AB database (creating ugly problems when older addresses are imported),  
2. is a tempered approach to imported addresses and their age and history of use, and
3. does not create code conflicts with search and sort, 
4. won't require major (if any) changes to the testing platform, and finally 
5. is a low resource solution (which does not cause awkward delays due to inefficiency of search and sort, identified in various bugs...); and really finally
6. it's simple !

I'm betting that someone could create the code for aging, lickety-split, and only have to figure out where/how to insert the code into the startup region of TB.

If someone later comes up with the global and complicated and efficient solution to all the bugs on search, sort, recency and frequency, this suggested simple algorithm can be cut out.

[[ how's that for a sales pitch ! ]]
This bug has an interesting approach seeking to offer a simple trade-off solution to a bad problem seriously affecting our composition UX:

popularityIndex property of AB cards, currently used for primary sorting of recipient autocomplete results, is dull because it only measures absolute frequency of communication since time immemorial but doesn't have an aspect of time (recency). For a more detailed list of problems, see my Bug 382415 Comment 8.

For certain use patterns and depending on AB data content, these age-old design deficiencies are now more exposed for some users due to autocomplete power search twin bugs (Bug 529584 and Bug 558931):
Say you wrote 1000 emails to display name "Miss Angel" in 1999 and none ever after, then 500 emails to "Angelina" in 2014. Search for "Ang" and Thunderbird will stubbornly keep pushing your old friend "Miss Angel" to the top all the time (the point being that we now finally find "Miss Angel" even though there's no field content beginning with "Ang", which is good, but users haven't seen her turn up in autocomplete results all along, so they are now surprised when she does).

Ideally, what we want here is something along Bug 382415, a full-fledged frecency algorithm similar to FF location bar:

> Assuming that you've *recently*
> communicated with "Angelina" more frequently than with "Miss Angel", and you
> always pick "Angelina" after typing "Ang", TB should
> automatically adapt to that use pattern just like FF awesome bar
> successfully does for URLs (I never type more than 2 or 3 letters to find my
> current favourite URLs from thousands in the history).

However, John certainly has a point that a fast and simple solution might be better than waiting for the perfect but complex solution for an unpredictably long time. I'd like to hear from others what they think of the algorithm proposed here in comment 0 (I don't feel in a position to evaluate this (yet)).
^^
Also, another low-hanging fruit to significantly improve the recipient-adding UX would be Bug 325458 - (full) nickname matches must have top priority in autocomplete results. Judging from comments, that's exactly what many users are after: type "Ang", and *always* get my friend "Angelina", in a stable 1:1 shortcut relationship. Nickname feature provides just that, but - you guessed it - the design is broken in TB.
For the engineers among us (me included), the algorithm could be simplified to a transform function that modestly mimics my step function, and that is applied to any PopularityCount at or above some preset PopularityCount.

The shape of that function could be modified by a user pref: a bigger value emphasizes and retains old records, a smaller value emphasizes recent history.
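One possible form of such a transform is sketched below, purely as an illustration. The function name, the parameters (shape, threshold, max_percent) and the particular curve are all assumptions made for this example; nothing here is an existing Thunderbird pref or API. The reduction percentage ramps up smoothly from zero at the threshold towards max_percent for very large counts, and a bigger shape value flattens the ramp so that old records are retained longer.

```python
import math


def smooth_age(count, shape=2.0, threshold=100, max_percent=4.5):
    """Smooth stand-in for the step function (hypothetical sketch).

    shape       -- user pref; bigger keeps old records, smaller favours recent use
    threshold   -- counts at or below this value are never reduced
    max_percent -- upper limit of the reduction, approached for very large counts
    """
    if count <= threshold:
        return count
    # ramp is 0 at the threshold and approaches 1 as the count grows.
    ramp = 1 - (threshold / count) ** (1 / shape)
    reduction = math.ceil(count * max_percent * ramp / 100)
    return count - reduction
```

With these example values, a count of 1000 is reduced by roughly 3 percent and a count of 10000 by roughly 4 percent, which is in the neighbourhood of the step function's defaults.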

Or....

The coding could examine the deviation of PopularityCount (a quick calc, before executing the aging), and then pare back or enhance the aging effects depending on what's already there.
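A rough sketch of that idea follows, under assumptions of my own: "deviation" is taken to be the standard deviation relative to the mean, and the function name, the scaling rule and the max_scale cap are invented for illustration. When all counts are about equal the reduction shrinks towards zero; when a few addresses dominate, it grows, up to the cap.

```python
import statistics


def adjusted_reduction_factor(counts, base_factor=3.0, max_scale=2.0):
    """Scale the ReductionFactor by how uneven the counts are (hypothetical).

    counts      -- list of PopularityCount values from the address book
    base_factor -- the configured ReductionFactor
    max_scale   -- cap on how far the factor may be enhanced
    """
    if not counts:
        return 0.0
    mean = statistics.mean(counts)
    if mean == 0:
        return 0.0
    # Relative spread: 0 when every address has the same count.
    spread = statistics.pstdev(counts) / mean
    return base_factor * min(spread, max_scale)
```

An address book where every entry has the same count gets a factor of zero, so it is left alone, which matches hard case 6) in the original proposal.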
(In reply to Thomas D. from comment #9)
> Also, another low-hanging fruit to significantly improve the
> recipient-adding UX would be Bug 325458 - (full) nickname matches must have
> top priority in autocomplete results. Judging from comments, that's exactly
> what many users are after: type "Ang", and *always* get my friend
> "Angelina", in a stable 1:1 shortcut relationship. Nickname feature provides
> just that, but - you guessed it - the design is broken in TB.

See Bug 972690 comment 11
I'll confirm this as a valid idea which could be used to fix the current broken design of popularityIndex.
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Windows 8 → All
Hardware: x86_64 → All
FYI, info on FF location bar frecency scoring algorithm:
https://developer.mozilla.org/en-US/docs/Mozilla/Tech/Places/Frecency_algorithm
See Also: → 1067681
See Also: → 970456
[Thanks, Thomas...]