Closed Bug 1104335 Opened 10 years ago Closed 2 months ago

Implement Bayesian probability assignment in weighted rules scenario

Categories

(Content Services Graveyard :: Classification Engine, defect)

x86
macOS
defect
Not set
normal
Points:
13

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: mzhilyaev, Assigned: mruttley)

References

Details

Rules generated from the moreover corpus have the form:

rule: [cat1: probability, cat2: probability, ....]

The current algorithm implemented in ruleClassify chooses the highest-weighted category, which is incorrect.

The correct formulation of probability for a given cat C is below:

P(C| R1 & R2 & R3) = P(R1|C) * P(R2|C) * P(R3|C) * P(C) / (P(R1) * P(R2) * P(R3))

Then either all C above a particular threshold are chosen, or the single most probable category is chosen. Either algorithm is worth testing.
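
A minimal Python sketch of the formula and both selection strategies, for illustration only. The function names and the three input tables (`likelihood` for P(R|C), `prior` for P(C), `rule_prob` for P(R)) are assumptions and are not taken from ruleClassify. Because the denominator multiplies the individual P(R) values, the scores are not guaranteed to sum to 1 across categories.

```python
from math import prod


def bayes_scores(matched_rules, likelihood, prior, rule_prob):
    """Score each category C as P(R1|C)*...*P(Rn|C)*P(C) / (P(R1)*...*P(Rn)).

    All three tables are hypothetical inputs:
      likelihood[rule][cat] -> P(rule | cat)
      prior[cat]            -> P(cat)
      rule_prob[rule]       -> P(rule)
    """
    evidence = prod(rule_prob[r] for r in matched_rules)
    scores = {}
    for cat, p_cat in prior.items():
        # A category missing from a rule's table is treated as P(R|C) = 0.
        joint = p_cat * prod(likelihood[r].get(cat, 0.0) for r in matched_rules)
        scores[cat] = joint / evidence
    return scores


def pick_above_threshold(scores, threshold):
    """Return every category whose score reaches the threshold."""
    return sorted(c for c, s in scores.items() if s >= threshold)


def pick_most_probable(scores):
    """Return the single highest-scoring category."""
    return max(scores, key=scores.get)
```

Both selectors consume the same score table, so the two variants can be compared on the same data without recomputing the probabilities.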
Blocks: 1104322
Points: --- → 8
Whiteboard: .?
Points: 8 → 13
Blocks: 1104329
This procedure does not seem to be necessary: simply selecting rules with precision above 85% seems to provide good overall precision and recall in the folding scenario.


Suggest putting it on the back burner.
No longer blocks: 1104329
Whiteboard: .?
Assignee: nobody → mruttley
Status: NEW → RESOLVED
Closed: 2 months ago
Resolution: --- → INCOMPLETE