Bugzilla

Marc Schumann [:Wurblzap]

Assignee

Comment 3

•

17 years ago

This should work now in Bugzilla 3.0.

Status: NEW → RESOLVED

Closed: 17 years ago

Resolution: --- → WORKSFORME

Comment 4

•

17 years ago

Clicking the links in comment 2 shows it doesn't.

Status: RESOLVED → REOPENED

Resolution: WORKSFORME → ---

Marc Schumann [:Wurblzap]

Assignee

Comment 5

•

17 years ago

Hrm. Maybe with a different MySQL collation this would work properly?

Aleksandr Derevianko

Comment 6

•

17 years ago

Search for UTF-8 text is completely broken. I just installed a clean Bugzilla-3.0.2, and search doesn't work if I search for anything outside the latin1 encoding.

Updated

•

17 years ago

Flags: blocking3.2?

Assignee

Comment 7

•

17 years ago

Okay, this should definitely at least be looked into before 3.2.

Flags: blocking3.2? → blocking3.2+

Updated

•

17 years ago

Status: REOPENED → NEW

Target Milestone: --- → Bugzilla 3.2

Comment 8

•

17 years ago

(In reply to comment #2)
> buglist.cgi?long_desc=%C4%B0&long_desc_type=regexp
> finds the bug.
> buglist.cgi?long_desc=%C4%B0&long_desc_type=allwordssubstr
> doesn't.

Let's add a 3rd query:

buglist.cgi?long_desc_type=casesubstring&long_desc=%C4%B0

Appending &debug=1 to all three queries shows that:

1) the regexp one uses:
   longdescs_.thetext REGEXP 'İ'

2) the allwordssubstr one (case insensitive) uses:
   INSTR(CAST(LOWER(longdescs_.thetext) AS BINARY), CAST('i̇' AS BINARY)) > 0

3) the casesubstring one (case sensitive) uses:
   INSTR(CAST(longdescs_.thetext AS BINARY), CAST('İ' AS BINARY)) > 0

So the problem seems to be that 'i̇' is not seen as the lowercase flavor of 'İ', and so MySQL returns no match.
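The mismatch described in comment 8 can be reproduced outside Bugzilla. The sketch below uses Python rather than Bugzilla's Perl, purely for illustration. It assumes only that the application-side lowercasing follows Unicode's full case mapping, which is what the 'i̇' in the debug output suggests. The helper name `explain_mismatch` is made up for this example.

```python
def explain_mismatch(term: str) -> dict:
    """Show what full Unicode lowercasing does to a search term."""
    lowered = term.lower()
    return {
        # Bytes sent in the URL query string (percent-encoded there).
        "term_bytes": term.encode("utf-8"),
        "lowered": lowered,
        # Bytes that a binary (byte-wise) INSTR would have to find.
        "lowered_bytes": lowered.encode("utf-8"),
        "code_points": [f"U+{ord(ch):04X}" for ch in lowered],
    }


# U+0130 is LATIN CAPITAL LETTER I WITH DOT ABOVE, the %C4%B0 in the queries.
info = explain_mismatch("\u0130")
print(info["term_bytes"])
print(info["code_points"])
print(info["lowered_bytes"])
```

Lowercasing turns the single character into two code points (a plain 'i' followed by a combining dot above). A byte-wise INSTR can then only match if the database's LOWER() produces exactly the same byte sequence for the stored text; if it leaves 'İ' unchanged or maps it to a plain 'i', no row matches.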

Comment 9

•

17 years ago

I tested with PostgreSQL 8.2.6, and it has the same problem.

Comment 10

•

17 years ago

In Search::GetByWordListSubstr(), I tried replacing (using PostgreSQL):

    push(@list, $dbh->sql_position(lc($sql_word), "LOWER($field)") . " > 0");

by:

    push(@list, $dbh->sql_position("LOWER($sql_word)", "LOWER($field)") . " > 0");

but this doesn't help. Instead of 0 bugs, it now returns all bugs.

Comment 11

•

17 years ago

As reported by bbaetz on IRC, there isn't a one-to-one mapping between lowercase and uppercase for Turkish; see http://rt.perl.org/rt3/Public/Bug/Display.html?id=36953 and also perldoc perlunicode /lc:

"Things to do with locales (Lithuanian, Turkish, Azeri) do not work since Perl does not understand the concept of Unicode locales."
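To make the "no one-to-one mapping" point concrete, here is a small sketch in Python (used for illustration only; it is not Bugzilla code). It relies on the language's locale-independent Unicode case mapping, and the helper names are invented for this example.

```python
DOTLESS_SMALL_I = "\u0131"   # ı, Turkish lowercase of I
DOTTED_CAPITAL_I = "\u0130"  # İ, Turkish uppercase of i


def upper_lower_round_trips(ch: str) -> bool:
    """True if uppercasing and then lowercasing returns the input."""
    return ch.upper().lower() == ch


def lower_upper_round_trips(ch: str) -> bool:
    """True if lowercasing and then uppercasing returns the input."""
    return ch.lower().upper() == ch


# Plain ASCII 'i' survives both directions.
print(upper_lower_round_trips("i"), lower_upper_round_trips("I"))

# The Turkish letters do not: ı uppercases to a plain I, which lowercases
# to a plain i, and İ lowercases to i + combining dot above.
print(upper_lower_round_trips(DOTLESS_SMALL_I))
print(lower_upper_round_trips(DOTTED_CAPITAL_I))
```

Because the default mapping is locale-independent, no single lowercasing call in the application can be right for both English and Turkish text, which is why matching case on the application side is fragile here.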

Assignee

Comment 12

•

17 years ago

Okay. So we should find a way to use sql_istrcmp or something like it for case-insensitive substring location, instead of using Perl's lc.

Updated

•

17 years ago

Assignee: query-and-buglist → jjclark1982

Jesse Clark

Comment 13

•

17 years ago

In theory this should work if we replace code like

    $$term = $dbh->sql_position(lc($$q), "LOWER($$ff)") . " > 0";

with

    $$term = $dbh->sql_position($dbh->sql_istring($$q), $dbh->sql_istring($$ff)) . " > 0";

However, I am having a lot of trouble ensuring that the entered value ($$q) is in the correct encoding. encode('utf8',decode('utf8',$$q)) appears to print the correct value, but passing this to mysql does not match correctly.

Assignee

Comment 14

•

17 years ago

(In reply to comment #13)
> However, I am having a lot of trouble ensuring that the entered value ($$q) is
> in the correct encoding. encode('utf8',decode('utf8',$$q)) appears to print the
> correct value, but passing this to mysql does not match correctly.

Oh, don't mess with the encoding of anything--that shouldn't be necessary at all, if this is 3.1.x.

Assignee

Comment 15

•

16 years ago

Hey jjclark, any progress on this? This is one of our few code blockers for 3.2.

Comment 16

•

16 years ago

Attached patch: patch, v1 (obsolete)

Is it as simple as that? I didn't test this patch.

Attachment #327334 - Flags: review?(jjclark1982)

Assignee

Comment 17

•

16 years ago

Comment on attachment 327334: patch, v1

This won't work on MySQL. Our sql_position for MySQL was made case-sensitive:

    INSTR(CAST($text AS BINARY), CAST($fragment AS BINARY))

We could make a sql_iposition, though, which could handle it. It could default to calling istring on both its arguments, and MySQL could have its own version.
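The design proposed in comment 17 can be sketched roughly as follows. This is a Python illustration, not the Perl that Bugzilla uses, and not the patch that was eventually attached. The class names are hypothetical, and the generic `POSITION(... IN ...)` form, the `LOWER()` body of `sql_istring`, and the MySQL override without the BINARY casts are all assumptions; only the case-sensitive MySQL `INSTR(CAST ...)` expression is taken from the comment above.

```python
class GenericDB:
    """Hypothetical stand-in for a database abstraction layer."""

    def sql_istring(self, expr: str) -> str:
        # Assumption: case folding is delegated to the database.
        return f"LOWER({expr})"

    def sql_position(self, fragment: str, text: str) -> str:
        # Assumption: generic drivers use the standard POSITION syntax.
        return f"POSITION({fragment} IN {text})"

    def sql_iposition(self, fragment: str, text: str) -> str:
        # Default: apply istring to both arguments, then locate.
        return self.sql_position(self.sql_istring(fragment),
                                 self.sql_istring(text))


class MysqlDB(GenericDB):
    def sql_position(self, fragment: str, text: str) -> str:
        # Case-sensitive form quoted in comment 17.
        return f"INSTR(CAST({text} AS BINARY), CAST({fragment} AS BINARY))"

    def sql_iposition(self, fragment: str, text: str) -> str:
        # Assumption: dropping the BINARY casts lets the column's
        # collation decide case sensitivity.
        return f"INSTR({text}, {fragment})"


for db in (GenericDB(), MysqlDB()):
    print(db.sql_iposition("'İ'", "longdescs_.thetext") + " > 0")
```

The point of the split is that the search code never lowercases the term itself; each driver decides how a case-insensitive match is expressed, so the term and the stored text are always folded by the same engine.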

Attachment #327334 - Flags: review?(jjclark1982) → review-

Assignee

Comment 18

•

16 years ago

I didn't realize there were so few LOWER/lc calls in Search.pm; I can probably fix this myself.

Assignee: jjclark1982 → mkanat

Assignee

Comment 19

•

16 years ago

Attached patch: v2

I've tested this and it generates the right SQL. So at this point, if it doesn't work, it's a bug in the database, not in Bugzilla. :-)

Attachment #327334 - Attachment is obsolete: true

Attachment #327344 - Flags: review?(LpSolit)

Assignee

Comment 20

•

16 years ago

Comment on attachment 327344: v2

I want to write a more extensive patch for the tip that uses sql_iposition everywhere that we currently use LOWER() in sql_position.

Attachment #327344 - Attachment description: v2 → v2 (3.2)

Assignee

Comment 21

•

16 years ago

Comment on attachment 327344: v2

Actually, I'll just do that in a separate bug.

Attachment #327344 - Attachment description: v2 (3.2) → v2

Assignee

Updated

•

16 years ago

Blocks: 442582

Comment 22

•

16 years ago

Comment on attachment 327344: v2

Looks correct to me, so r=LpSolit. Someone who is used to Turkish characters will have to test it for us after checkin.

Attachment #327344 - Flags: review?(LpSolit) → review+

Assignee

Updated

•

16 years ago

Flags: approval3.2+

Flags: approval+

Assignee

Comment 23

•

16 years ago

tip:

Checking in Bugzilla/DB.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/DB.pm,v  <--  DB.pm
new revision: 1.115; previous revision: 1.114
done
Checking in Bugzilla/Search.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/Search.pm,v  <--  Search.pm
new revision: 1.160; previous revision: 1.159
done
Checking in Bugzilla/DB/Mysql.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/DB/Mysql.pm,v  <--  Mysql.pm
new revision: 1.62; previous revision: 1.61
done

3.2:

Checking in Bugzilla/DB.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/DB.pm,v  <--  DB.pm
new revision: 1.112.2.1; previous revision: 1.112
done
Checking in Bugzilla/Search.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/Search.pm,v  <--  Search.pm
new revision: 1.159.2.1; previous revision: 1.159
done
Checking in Bugzilla/DB/Mysql.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/DB/Mysql.pm,v  <--  Mysql.pm
new revision: 1.60.2.1; previous revision: 1.60
done

Status: NEW → RESOLVED

Closed: 17 years ago → 16 years ago

Resolution: --- → FIXED

Comment 24

•

16 years ago

Will try to get the Pardus team involved.

Comment 25

•

16 years ago

http://bugs.pardus.org.tr/show_bug.cgi?id=7621 filed

Comment 26

•

16 years ago

Right now landfill returns 16 bugs:

http://landfill.bugzilla.org/bugzilla-tip/buglist.cgi?query_format=advanced&short_desc_type=allwordssubstr&short_desc=%C4%B0

The correct test case (http://landfill.bugzilla.org/bugzilla-tip/show_bug.cgi?id=3296) is found, but all accented 'i' variants (í, Î, Ì) are returned as well.
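One possible explanation for the extra matches, not confirmed anywhere in this bug, is that the comparison now runs under a collation that ignores both case and accents. The Python sketch below only imitates that kind of folding with the standard `unicodedata` module; it is not how any particular database implements its collations, and the function name `fold` is made up for this example.

```python
import unicodedata


def fold(text: str) -> str:
    """Crude case- and accent-insensitive key: decompose, drop marks, lowercase."""
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" covers nonspacing combining marks such as accents and dots.
    without_marks = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return without_marks.lower()


# The search term and the accented variants mentioned in comment 26.
variants = ["\u0130", "\u00ed", "\u00ce", "\u00cc"]  # İ, í, Î, Ì
print({v: fold(v) for v in variants})
```

Under such a folding, 'İ' and every accented 'i' collapse to the same key, so a search for 'İ' would be expected to return all of them.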