Bugzilla quips are not random

RESOLVED FIXED in Bugzilla 2.18

Status

()

Bugzilla
Bugzilla-General
--
minor
RESOLVED FIXED
15 years ago
5 years ago

People

(Reporter: steve camsell, Assigned: Bill McGonigle (not currently reading bugmail; please contact directly))

Tracking

unspecified
Bugzilla 2.18
Bug Flags:
approval +

Details

Attachments

(1 attachment, 3 obsolete attachments)

(Reporter)

Description

15 years ago
User-Agent:       Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.0.3705)
Build Identifier: Bugzilla Version 2.17.3

I have 100 + quips in my quips table, however the same quip is selected 
approximately 80% of the time. Having deleted the quip in question, Buzilla now 
selects another quip around 80% of the time. It seems like the ramdomizer 
function isn't randomizing very well?

Reproducible: Always

Steps to Reproduce:
1.Run a query
2.Hit Browser Refresh
3.Hit Browser Refresh Again

Actual Results:  
Same quip is selected and displayed most of the time, although there are over 
100 quips in the quip table.

Expected Results:  
A quip to be selected at random each time a query is run.
This is currently using SQL instructions to get the randomness, so it's using
whatever random mechanism that MySQL uses.

SELECT quip FROM quips ORDER BY RAND() LIMIT 1

It does support passing a seed however...  perhaps we can generate a random
number in perl to use as a seed?
Status: UNCONFIRMED → NEW
Ever confirmed: true
*** Bug 199869 has been marked as a duplicate of this bug. ***
I don't know if thats portable, either - RAND() isn't deterministic, so if it
needs to be evaluated more than once for comparison, you'll get strange results.

Maybe LIMIT 1 OFFSET x, where x is a random number (generated in perl) between 1
and SELECT COUNT(*) FROM quips ?

Or we could just use a counter, and iterate through it every time that function
is  called, or something.
The correct way, fwiw, would be something like:

SELECT quip FROM (SELECT quip, RAND() as r FROM quips) HAVING r=MIN(r) LIMIT 1
*** Bug 200964 has been marked as a duplicate of this bug. ***
Please also see bug 200964 comment 0 for another possible way to fix this. 
Michael: do you know what version of MySQL first supported Benchmark()?  I
hadn't seen it before, want to make sure it's not something new in MySQL 4 first.

Comment 7

15 years ago
This fix came from a mysql developer.  It works in 3.23.54 and it's described in
the "Core MySQL" book (Prentis Hall 2002) .  I would imagine that is MySQL
specific, but so is the bug!
see the usenet post...
http://groups.google.com/groups?selm=b2bsom%242tq8%241%40FreeBSD.csie.NCTU.edu.tw&oe=utf-8&output=gplain

With this version of MySQL (3.23.54) on RH8 and RH9, I get the same quip 100% of
the time !!
'Eww' I really don't think we want to benchmark it.

We could order by rand()*rand(), I guess. Can't we set the rand seed somehow?

Comment 9

15 years ago
The mySQL problem extends to the seed.

You could do the following...

    SendSQL("SELECT COUNT( quipid ) from quips");
    my $quip_id = FetchSQLData();
       $quip_id = sprintf( "%d", rand( $quip_id ) );
    SendSQL("SELECT quip FROM quips where quipid = " . $quip_id );

The only problem might be if a quip is deleted, and rand gives that quip_id.
Maybe there is a way to get the nth non null element in mySQL.
Well, we could lock the quips table for read, I guess. We don't have any other
locks for buglist.cgi (which is an issue in and of itself), so we can unlock it
again immediately afterwards.

Or we could file a mysql bug report asking them to call srand at startup.

Comment 11

15 years ago
Strangely enough, when i went in to the MySQL command line and ran the query 
that we are running here to get the quip i am getting results that appear to 
be random.
That's because you're still on the same DB connection.  Bugzilla is
breaking/recreating the connection between each access.

Comment 13

15 years ago
I found that doing this worked. (Saw it on the MySQL site manual for RAND()
SendSQL("SELECT quip FROM quips ORDER BY MD5(RAND(NOW())) LIMIT 1");

Now its random for me
Created attachment 130428 [details] [diff] [review]
patch to use MD5 to really randomize

FWIW, here's a patch that implements caseyg's suggestion.  Here's a reference
on MySQL RAND():

http://www.mysql.com/doc/en/Mathematical_functions.html

One user suggests using reverse for about a 2x speed-up over MD5.  He quotes 9
seconds for MD5 on 10000 rows and 5 for REVERSE, obviously sacrificing some
randomness.  That said, the MD5 works well for me and isn't noticable on my
list of ~50 quips.
Attachment #130428 - Flags: review?(bbaetz)
Comment on attachment 130428 [details] [diff] [review]
patch to use MD5 to really randomize

We don't really need cryptographical randomness, and this will mean iterating
through all of the entries.

Given that COUNT is cheap for myisam tables, can't we select COUNT(*) from
quips, and then use perl's rand function via limit + offset? (and possibly
ORDER BY quip too)
Created attachment 130971 [details] [diff] [review]
Patch w/ 2 SELECTS & perl random

Bradley, do you mean something like this?

Just to clarify, the MD5 version doesn't do MD5(quip) it does MD5 on the random
value, which should smaller/faster.

Still, the whole ORDER BY RAND() method seems like a bad strategy altogether:
>as explain shows, mysql optimizes this VERY badly (or may be better said,
>doens't optimize it at all): it uses an temporary table and an extra filesort.


So maybe two calls to mysql isn't more expensive, I just wanted to point out
that if this bug is just 'fix the randomness' md5 isn't too bad, but 'fix the
randomness and fix performance' is good too.

Also note this doesn't appear to be portable.  Postgres seems to need something
like:

+    SendSQL("SELECT quip FROM quips LIMIT 1 OFFSET $random");

Does anybody knows of a SQL standard way to express this?

Anyway, I tested this on my system, and it seems to work OK.  It gets both ends
of the list, so there probably aren't any fencepost errors.
Attachment #130428 - Attachment is obsolete: true
Attachment #130971 - Flags: review?(bbaetz)
Comment on attachment 130971 [details] [diff] [review]
Patch w/ 2 SELECTS & perl random

mysql supports what it calls 'postgresql compatible' OFFSET from 4.0.6,
although I thought that that syntax was ansi.

Doesn't matter for now, though.
Attachment #130971 - Flags: review?(bbaetz) → review+

Updated

14 years ago
Attachment #130428 - Flags: review?(bbaetz)
Flags: approval?
-> patch author

Do you need someone to check this in for you?
Assignee: justdave → bill+mozilla
Flags: approval? → approval+
OS: Windows 2000 → All
Hardware: PC → All
Target Milestone: --- → Bugzilla 2.18
>Do you need someone to check this in for you?

Yes, please.  Check to make sure I did the patch correctly first (I'm new at
this gig).

Status: NEW → ASSIGNED
yeah, the code review's already done, in theory, that's what the review+ on the
patch is for :)

Except it won't apply, so the reviewer apparently never attempted to apply it.
Looks like we're missing the code to deal with moderated quips here.  Was this
patch against 2.16.3?

The patch was generated correctly (other than being against outdated code) so
you're doing okay :)
Flags: approval+
Comment on attachment 130971 [details] [diff] [review]
Patch w/ 2 SELECTS & perl random

r- per previous comment.
Attachment #130971 - Flags: review-

Comment 22

14 years ago
Created attachment 131507 [details] [diff] [review]
Patch for CVS version

Previous patch:
+    my $random = int(rand($count+1));

This patch:
+    my $random = int(rand($count));

Reason: We want to have values between 0 and count-1 as offset for
  LIMIT $offset,1

(I did some minimal MySQL tests, but not very thoroughly.)

Updated

14 years ago
Attachment #130971 - Attachment is obsolete: true

Updated

14 years ago
Attachment #131507 - Flags: review?(justdave)
>We want to have values between 0 and count-1 as offset for LIMIT $offset,1

Yes, I have to agree after re-reading the docs on the mysql website.

The strange bit is when I wrote the patch I had read somewhere that the initial
row was 1, not 0, but I can't find that reference now.  Even stranger is, not
really believing it, I reloaded a bug list until it showed me every quip on my
list, to make sure there wasn't a fencepost error. Is there a chance this
changed in a later version of MySQL?  I'm going to change my version to not add
the +1 and see if I can get every quip.

My patch was against 2.17.3 - I see I missed a release - sorry about that.
I was able to get every quip from my quip list using the non- +1 version, so it
must be right.

I must have been smokin' crack.

Updated

14 years ago
Attachment #131507 - Flags: review?(bbaetz)
Comment on attachment 131507 [details] [diff] [review]
Patch for CVS version

This is OK, if you add a comment that COUNT() is quick because its cached for
mysql, and that we may want to revisit this when we support other databases.
Attachment #131507 - Flags: review?(bbaetz) → review+

Updated

14 years ago
Flags: approval?

Updated

14 years ago
Summary: Buzilla quips are not random → Bugzilla quips are not random
Flags: approval?
Comment on attachment 131507 [details] [diff] [review]
Patch for CVS version

bbaetz's review comments were nits, but since the person creating the patch
doesn't have checkin privs, I'd like a patch that deals with bbaetz's nits
first so the person doing the checkin doesn't have to try to figure them out or
potentially forget them.
Attachment #131507 - Flags: review?(justdave) → review-

Comment 27

14 years ago
Created attachment 141926 [details] [diff] [review]
Patch with comments
Attachment #131507 - Attachment is obsolete: true

Comment 28

14 years ago
Feel free to r+ my patch as well if you thikn bbaetz's r+ is no longer valid. 
(I only added bbaetz's comments to the previous patch)
Flags: approval?
Attachment #141926 - Flags: review+
Flags: approval? → approval+

Comment 29

14 years ago
Checking in buglist.cgi;
/cvsroot/mozilla/webtools/bugzilla/buglist.cgi,v  <--  buglist.cgi
new revision: 1.245; previous revision: 1.244
done
Status: ASSIGNED → RESOLVED
Last Resolved: 14 years ago
Resolution: --- → FIXED
QA Contact: matty_is_a_geek → default-qa
You need to log in before you can comment on or make changes to this bug.