Open Bug 280786 Opened 20 years ago Updated 10 years ago

Bayesian analysis for auto-categorization of bugs

Categories

(Bugzilla :: Creating/Changing Bugs, enhancement)

enhancement
Not set
normal

People

(Reporter: asa, Unassigned)

Details

The easiest wins might be dupe finding and invalid-bug recognition (see bug 280750 for an example of a probably easy match), but it might also work well for product and component adjustments or suggestions. Folks at Sun attached a patch for Mozilla Mail auto-categorization (not just the binary "is junk" / "is not junk" that our current filters have) in bug 168905, which may have some useful algorithms.
There are actually Perl modules on CPAN that will do this without too much trouble. See AI::Categorizer (not AI::Categorize, which is an older version) and AI::NaiveBayes1.
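For readers unfamiliar with the technique, here is a minimal sketch of a multi-category naive Bayes classifier of the kind those modules provide. It is written in plain Python rather than against the CPAN APIs, and it is only an illustration: the class name, the training strings, and the category labels ("Layout", "Mail", "INVALID") are hypothetical and do not come from Bugzilla or from either module.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesCategorizer:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # category -> number of training bugs
        self.word_counts = defaultdict(Counter)  # category -> word -> occurrences
        self.vocabulary = set()                  # every word seen during training

    def train(self, text, category):
        """Record one already-categorized bug report."""
        self.doc_counts[category] += 1
        for word in tokenize(text):
            self.word_counts[category][word] += 1
            self.vocabulary.add(word)

    def scores(self, text):
        """Return category -> unnormalized log-probability for the text."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        words = tokenize(text)
        result = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            # Log prior: fraction of training bugs filed in this category.
            score = math.log(n_docs / total_docs)
            # Log likelihood: smoothed per-word probabilities, assumed independent.
            for word in words:
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            result[category] = score
        return result

    def classify(self, text):
        """Return the highest-scoring category (requires prior training)."""
        scores = self.scores(text)
        return max(scores, key=scores.get)


# Hypothetical training data standing in for previously triaged bugs.
nb = NaiveBayesCategorizer()
nb.train("page layout broken table cells overlap css float", "Layout")
nb.train("div renders wrong position css margin collapse", "Layout")
nb.train("junk mail filter misses spam message inbox", "Mail")
nb.train("imap folder message not downloaded inbox", "Mail")
nb.train("my computer is slow please help", "INVALID")

suggested = nb.classify("css table renders with wrong margin")
print(suggested)
```

The same scores could drive a component suggestion rather than an automatic change, for example by only offering the top category when it beats the runner-up by some margin. That threshold would need tuning against real bug data.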
See Bug 22353 "Duplicate bug detection." Bayesian analysis for classification can do many things for us. What are we trying to achieve?
1. Picking the 'best' bugs for developers' attention?
   Identifying Layout bugs out of all the others?
   ...
2. Identifying Dups
   Identifying Garbage
   ...
3. Making the use of Bugzilla more pleasant for infrequent users
   Making Bugzilla easier for infrequent users
   ...
4. Deterring non-developers from submitting bugs
   Deterring non-developers from making empty comments
   ...
5. Pointing Reporters of non-bugs to resources that will help them
Personally (putting myself in the position of 4. above), I think that the most useful thing would be an end-to-end assessment of how likely the information that I have put together is to lead to an improvement to Firefox. This would be the sum of the Bug being resolved FIXED and any improvements to documentation, the website, the help system, internationalisation and so forth. This could perhaps only be done if each Bug was rated for its value when CLOSED or RESOLVED. The drawback of doing this is that we may end up informing Reporters that their bugs rated 0 out of 5, which is not the way to make friends and influence people. The biggest win for Bugzilla is to get Reporters, no matter how awful their first attempts, interested in and enthusiastic about identifying problems with Firefox and fixing them.
I think that the next most beneficial use is 1. above, to wit picking out the (few) reports onto which our scarce resources of time and attention should be directed. Merely identifying bugs of low value or actual garbage (2. above) is probably of less benefit, as such bugs may sit in Bugzilla as UNCONFIRMED, neither garnering much attention nor causing much harm.
Since there seems to be some interest in the early detection of Dups, let's go for it. As you say, it is an easy win, and we need easy wins. You probably have your own ideas, some of which I suspect overlap with mine and some of which I have not yet thought of.
Assignee: gerv → charting
Assignee: charting → create-and-change
Component: Reporting/Charting → Creating/Changing Bugs
Hardware: x86 → All