Closed Bug 192247 Opened 20 years ago Closed 19 years ago
Bugzilla quips are not random
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.0.3705) Build Identifier: Bugzilla Version 2.17.3 I have 100 + quips in my quips table, however the same quip is selected approximately 80% of the time. Having deleted the quip in question, Buzilla now selects another quip around 80% of the time. It seems like the ramdomizer function isn't randomizing very well? Reproducible: Always Steps to Reproduce: 1.Run a query 2.Hit Browser Refresh 3.Hit Browser Refresh Again Actual Results: Same quip is selected and displayed most of the time, although there are over 100 quips in the quip table. Expected Results: A quip to be selected at random each time a query is run.
This is currently using SQL instructions to get the randomness, so it's using whatever random mechanism that MySQL uses. SELECT quip FROM quips ORDER BY RAND() LIMIT 1 It does support passing a seed however... perhaps we can generate a random number in perl to use as a seed?
Status: UNCONFIRMED → NEW
Ever confirmed: true
*** Bug 199869 has been marked as a duplicate of this bug. ***
I don't know if thats portable, either - RAND() isn't deterministic, so if it needs to be evaluated more than once for comparison, you'll get strange results. Maybe LIMIT 1 OFFSET x, where x is a random number (generated in perl) between 1 and SELECT COUNT(*) FROM quips ? Or we could just use a counter, and iterate through it every time that function is called, or something.
The correct way, fwiw, would be something like: SELECT quip FROM (SELECT quip, RAND() as r FROM quips) HAVING r=MIN(r) LIMIT 1
*** Bug 200964 has been marked as a duplicate of this bug. ***
Please also see bug 200964 comment 0 for another possible way to fix this. Michael: do you know what version of MySQL first supported Benchmark()? I hadn't seen it before, want to make sure it's not something new in MySQL 4 first.
This fix came from a mysql developer. It works in 3.23.54 and it's described in the "Core MySQL" book (Prentis Hall 2002) . I would imagine that is MySQL specific, but so is the bug! see the usenet post... http://groups.google.com/groups?selm=b2bsom%242tq8%241%40FreeBSD.csie.NCTU.edu.tw&oe=utf-8&output=gplain With this version of MySQL (3.23.54) on RH8 and RH9, I get the same quip 100% of the time !!
'Eww' I really don't think we want to benchmark it. We could order by rand()*rand(), I guess. Can't we set the rand seed somehow?
The mySQL problem extends to the seed. You could do the following... SendSQL("SELECT COUNT( quipid ) from quips"); my $quip_id = FetchSQLData(); $quip_id = sprintf( "%d", rand( $quip_id ) ); SendSQL("SELECT quip FROM quips where quipid = " . $quip_id ); The only problem might be if a quip is deleted, and rand gives that quip_id. Maybe there is a way to get the nth non null element in mySQL.
Well, we could lock the quips table for read, I guess. We don't have any other locks for buglist.cgi (which is an issue in and of itself), so we can unlock it again immediately afterwards. Or we could file a mysql bug report asking them to call srand at startup.
Strangely enough, when i went in to the MySQL command line and ran the query that we are running here to get the quip i am getting results that appear to be random.
That's because you're still on the same DB connection. Bugzilla is breaking/recreating the connection between each access.
I found that doing this worked. (Saw it on the MySQL site manual for RAND() SendSQL("SELECT quip FROM quips ORDER BY MD5(RAND(NOW())) LIMIT 1"); Now its random for me
FWIW, here's a patch that implements caseyg's suggestion. Here's a reference on MySQL RAND(): http://www.mysql.com/doc/en/Mathematical_functions.html One user suggests using reverse for about a 2x speed-up over MD5. He quotes 9 seconds for MD5 on 10000 rows and 5 for REVERSE, obviously sacrificing some randomness. That said, the MD5 works well for me and isn't noticable on my list of ~50 quips.
Comment on attachment 130428 [details] [diff] [review] patch to use MD5 to really randomize We don't really need cryptographical randomness, and this will mean iterating through all of the entries. Given that COUNT is cheap for myisam tables, can't we select COUNT(*) from quips, and then use perl's rand function via limit + offset? (and possibly ORDER BY quip too)
Bradley, do you mean something like this? Just to clarify, the MD5 version doesn't do MD5(quip) it does MD5 on the random value, which should smaller/faster. Still, the whole ORDER BY RAND() method seems like a bad strategy altogether: >as explain shows, mysql optimizes this VERY badly (or may be better said, >doens't optimize it at all): it uses an temporary table and an extra filesort. So maybe two calls to mysql isn't more expensive, I just wanted to point out that if this bug is just 'fix the randomness' md5 isn't too bad, but 'fix the randomness and fix performance' is good too. Also note this doesn't appear to be portable. Postgres seems to need something like: + SendSQL("SELECT quip FROM quips LIMIT 1 OFFSET $random"); Does anybody knows of a SQL standard way to express this? Anyway, I tested this on my system, and it seems to work OK. It gets both ends of the list, so there probably aren't any fencepost errors.
20 years ago
Attachment #130428 - Attachment is obsolete: true
20 years ago
Attachment #130971 - Flags: review?(bbaetz)
Comment on attachment 130971 [details] [diff] [review] Patch w/ 2 SELECTS & perl random mysql supports what it calls 'postgresql compatible' OFFSET from 4.0.6, although I thought that that syntax was ansi. Doesn't matter for now, though.
Attachment #130971 - Flags: review?(bbaetz) → review+
20 years ago
-> patch author Do you need someone to check this in for you?
Assignee: justdave → bill+mozilla
Flags: approval? → approval+
OS: Windows 2000 → All
Hardware: PC → All
Target Milestone: --- → Bugzilla 2.18
>Do you need someone to check this in for you? Yes, please. Check to make sure I did the patch correctly first (I'm new at this gig).
Status: NEW → ASSIGNED
yeah, the code review's already done, in theory, that's what the review+ on the patch is for :) Except it won't apply, so the reviewer apparently never attempted to apply it. Looks like we're missing the code to deal with moderated quips here. Was this patch against 2.16.3? The patch was generated correctly (other than being against outdated code) so you're doing okay :)
Comment on attachment 130971 [details] [diff] [review] Patch w/ 2 SELECTS & perl random r- per previous comment.
Attachment #130971 - Flags: review-
Previous patch: + my $random = int(rand($count+1)); This patch: + my $random = int(rand($count)); Reason: We want to have values between 0 and count-1 as offset for LIMIT $offset,1 (I did some minimal MySQL tests, but not very thoroughly.)
>We want to have values between 0 and count-1 as offset for LIMIT $offset,1 Yes, I have to agree after re-reading the docs on the mysql website. The strange bit is when I wrote the patch I had read somewhere that the initial row was 1, not 0, but I can't find that reference now. Even stranger is, not really believing it, I reloaded a bug list until it showed me every quip on my list, to make sure there wasn't a fencepost error. Is there a chance this changed in a later version of MySQL? I'm going to change my version to not add the +1 and see if I can get every quip. My patch was against 2.17.3 - I see I missed a release - sorry about that.
I was able to get every quip from my quip list using the non- +1 version, so it must be right. I must have been smokin' crack.
Comment on attachment 131507 [details] [diff] [review] Patch for CVS version This is OK, if you add a comment that COUNT() is quick because its cached for mysql, and that we may want to revisit this when we support other databases.
Attachment #131507 - Flags: review?(bbaetz) → review+
Summary: Buzilla quips are not random → Bugzilla quips are not random
Comment on attachment 131507 [details] [diff] [review] Patch for CVS version bbaetz's review comments were nits, but since the person creating the patch doesn't have checkin privs, I'd like a patch that deals with bbaetz's nits first so the person doing the checkin doesn't have to try to figure them out or potentially forget them.
Attachment #131507 - Flags: review?(justdave) → review-
Feel free to r+ my patch as well if you thikn bbaetz's r+ is no longer valid. (I only added bbaetz's comments to the previous patch)
Checking in buglist.cgi; /cvsroot/mozilla/webtools/bugzilla/buglist.cgi,v <-- buglist.cgi new revision: 1.245; previous revision: 1.244 done
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
QA Contact: matty_is_a_geek → default-qa
You need to log in before you can comment on or make changes to this bug.