Closed
Bug 978397
Opened 10 years ago
Closed 10 years ago
impl of CountPopulation32 is incorrect
Categories
(Core :: MFBT, defect)
Tracking
()
RESOLVED
FIXED
mozilla30
People
(Reporter: xidorn, Assigned: sunfish)
References
Details
Attachments
(1 file, 2 obsolete files)
1.76 KB,
patch
|
Details | Diff | Splinter Review |
The current long impl of CountPopulation32 will return a wrong result if there is any bit set in high 16bits. It could be fixed with one extra line: uint32_t sum32 = (sum16 & 0x0000ffff) + ((sum16 & 0xffff0000) >> 16); and make the function return sum32 instead of sum16.
Comment 1•10 years ago
|
||
Thanks for catching this. It looks like mfbt/tests/moz.build doesn't include TestCountPopulation.cpp, which explains why this wasn't caught by windows builds.
Assignee | ||
Comment 2•10 years ago
|
||
Indeed. This patch adds the test to the test list, and upgrades the code to an algorithm which fixes the bug, and is also more efficient.
Assignee: nobody → sunfish
Attachment #8386754 -
Flags: review?(evilpies)
Comment 3•10 years ago
|
||
Comment on attachment 8386754 [details] [diff] [review] popcount.patch I can't review code in this component, but I tested it for every uint32 and the results looked correct.
Attachment #8386754 -
Flags: review?(evilpies) → feedback?(evilpies)
Updated•10 years ago
|
Attachment #8386754 -
Flags: feedback?(evilpies) → feedback+
Assignee | ||
Comment 4•10 years ago
|
||
Comment on attachment 8386754 [details] [diff] [review] popcount.patch Hi :nbp, I apparently didn't notice that the code I refactored out of the register allocator wasn't actually computing a full 32-bit popcount. This fixes it.
Attachment #8386754 -
Flags: review?(nicolas.b.pierron)
Comment 5•10 years ago
|
||
Comment on attachment 8386754 [details] [diff] [review] popcount.patch Review of attachment 8386754 [details] [diff] [review]: ----------------------------------------------------------------- ::: mfbt/MathAlgorithms.h @@ +188,5 @@ > > inline uint_fast8_t > CountPopulation32(uint32_t u) > { > + uint32_t x = u - ((u >> 1) & 0x55555555); This line is equivalent to: x = (u & 0x55555555) + ((u & 0xaaaaaaaa) >> 1) as 0b11 -> 0b10, 0b10 -> 0b01, 0b01 -> 0b01, 0b00 -> 0b00 @@ +189,5 @@ > inline uint_fast8_t > CountPopulation32(uint32_t u) > { > + uint32_t x = u - ((u >> 1) & 0x55555555); > + x = (x & 0x33333333) + ((x >> 2) & 0x33333333); The sum of each components are limited to 0b100 (= 0b10 + 0b10) & 0xf @@ +190,5 @@ > CountPopulation32(uint32_t u) > { > + uint32_t x = u - ((u >> 1) & 0x55555555); > + x = (x & 0x33333333) + ((x >> 2) & 0x33333333); > + return ((x + (x >> 4) & 0xf0f0f0f) * 0x1010101) >> 24; nit: Can you add a pair of parentheses around the x + (x >> 4) expression. x + (x >> 4) sum numbers of 0b100 max with each others, which gives numbers of 0b1000 max. Then we mask the result of the sum with 0xf0f0f0f to keep result of the sum without counting them ~twice. The multiplication by 0x1010101 correspond to a shift and sum ((x << 0x1) + (x << 0x100) + …) Then the final result is shifted down to keep only the sum of the 4 components. This sounds good, and I now understand this code.
Attachment #8386754 -
Flags: review?(nicolas.b.pierron) → review+
Assignee | ||
Comment 6•10 years ago
|
||
Added parens as requested; carrying review forward.
Attachment #8386754 -
Attachment is obsolete: true
Assignee | ||
Updated•10 years ago
|
Keywords: checkin-needed
Assignee | ||
Comment 7•10 years ago
|
||
Add reviewer information to patch.
Attachment #8388773 -
Attachment is obsolete: true
Comment 8•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/4d6f4e6bba21
Keywords: checkin-needed
Comment 9•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/4d6f4e6bba21
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
You need to log in
before you can comment on or make changes to this bug.
Description
•