Closed Bug 676855 Opened 13 years ago Closed 5 years ago

need script for auto detect duplicates at getsatisfaction

Categories

(support.mozillamessaging.com Graveyard :: General, defect)

x86_64
Windows 7
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: wsmwk, Assigned: rolandtanglao)

Details

Need a script to auto-detect duplicate topics at getsatisfaction (step 1).
Perhaps two types of potential duplicates:
1. topic has the same author, and roughly matches on at least two words of, say, 5 or more characters
2. topic has a different author, and matches on at least N words of the topic

The easy one is #1.

(Step 2 will be a script to merge the duplicates found in step 1.)
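
A rough sketch of how the two rules above might look in Python, for discussion only. It is not the script from this bug. The `Topic` record and its field names are hypothetical (not the Get Satisfaction API), "roughly matches" is simplified to exact matching of lowercased words, and the value chosen for N (`other_author_min=4`) is an arbitrary assumption.

```python
import re
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Topic:
    # Hypothetical record; field names are assumptions, not a real API schema.
    url: str
    author: str
    subject: str


def significant_words(subject, min_len=5):
    """Return the set of lowercased words with at least min_len characters."""
    words = re.findall(r"[a-z0-9']+", subject.lower())
    return {w for w in words if len(w) >= min_len}


def candidate_duplicates(topics, same_author_min=2, other_author_min=4):
    """Yield (topic_a, topic_b, shared_words) for pairs passing either rule.

    Rule 1: same author, at least same_author_min shared significant words.
    Rule 2: different author, at least other_author_min (the "N") shared words.
    """
    for a, b in combinations(topics, 2):
        shared = significant_words(a.subject) & significant_words(b.subject)
        threshold = same_author_min if a.author == b.author else other_author_min
        if len(shared) >= threshold:
            yield a, b, shared


if __name__ == "__main__":
    sample = [
        Topic("topic-1", "alice", "Thunderbird crashes when sending email"),
        Topic("topic-2", "alice", "thunderbird crashes on sending mail"),
        Topic("topic-3", "bob", "Cannot import address book"),
    ]
    for a, b, shared in candidate_duplicates(sample):
        print(a.url, b.url, sorted(shared))
```

Comparing every pair is quadratic in the number of topics, which is probably acceptable for a first pass over a small forum; the pairs it reports are only candidates and would still need a human check before the step 2 merge.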
Assignee: nobody → rtanglao
Status: NEW → ASSIGNED
wayne: could you give me some test cases please?
i.e.
1. two GS URLs of two topics by the same author that are duplicates
2. same as #1 but with different authors

implementing 2. might be difficult but we'll see :-)
thanks for the test data wayne

i am getting there:
here's the pseudo code i have so far:

https://gist.github.com/1137754
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX