Closed Bug 554609 Opened 10 years ago Closed 10 years ago

fts3 tokenizer should use bi-gram for Thai, Lao and Khmer

Categories

(MailNews Core :: Database, defect)

Not set

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 3.1b2

People

(Reporter: m_kato, Assigned: m_kato)

Details

Attachments

(1 file)

Thai, Lao and Khmer have complex word-break rules.  These languages should be tokenized with bi-grams, like CJK.

U+0E00-U+0E7F ... Thai
U+0E80-U+0EFF ... Lao
U+1780-U+17FF ... Khmer
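
For illustration only, here is a minimal Python sketch of the idea in this report: code points in the three ranges above are emitted as overlapping bi-grams, and everything else is split into plain alphanumeric words. This is not the code from the attached patch (the real fts3 tokenizer is written in C); the names `BIGRAM_RANGES`, `needs_bigram`, `bigrams` and `tokenize` are hypothetical, and the handling of other scripts is deliberately simplified.

```python
# Hypothetical sketch of bi-gram tokenization for Thai, Lao and Khmer.
# Not the attached patch; names and non-bi-gram handling are assumptions.

BIGRAM_RANGES = (
    (0x0E00, 0x0E7F),  # Thai
    (0x0E80, 0x0EFF),  # Lao
    (0x1780, 0x17FF),  # Khmer
)


def needs_bigram(ch):
    """Return True if the character falls in one of the bi-gram ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in BIGRAM_RANGES)


def bigrams(run):
    """Split a run of characters into overlapping two-character tokens."""
    if len(run) < 2:
        # A lone character is kept as a single token.
        return [run] if run else []
    return [run[i:i + 2] for i in range(len(run) - 1)]


def tokenize(text):
    """Emit bi-grams for Thai/Lao/Khmer runs and whole words otherwise."""
    tokens = []
    run = ""   # current run of bi-gram characters
    word = ""  # current run of other alphanumeric characters
    for ch in text:
        if needs_bigram(ch):
            if word:
                tokens.append(word)
                word = ""
            run += ch
        else:
            if run:
                tokens.extend(bigrams(run))
                run = ""
            if ch.isalnum():
                word += ch
            elif word:
                tokens.append(word)
                word = ""
    if run:
        tokens.extend(bigrams(run))
    if word:
        tokens.append(word)
    return tokens
```

With this sketch, a query for a two-character Thai substring can match inside a longer unsegmented run, which is the behaviour bi-grams already give for CJK text.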
Attached patch: patch
Assignee: nobody → m_kato
Attachment #435554 - Flags: review?(bugmail)
Comment on attachment 435554 (patch)

Looks good.  This can land now but let's not mark a schema change yet.  Testers will need to delete their database manually.
Attachment #435554 - Flags: review?(bugmail) → review+
(In reply to comment #2)
> (From update of attachment 435554)
> Looks good.  This can land now but let's not mark a schema change yet.  Testers
> will need to delete their database manually.

Andrew, which bug should make the schema change?
(In reply to comment #3)
> Andrew, which bug should make the schema change?

A new bug should be filed if explicit tracking is desired.  Since we already have a schema change in the build since beta 1, anyone upgrading from beta 1 to beta 2 when it ships will already have this taken care of for them.  This leaves nightly users, and I'm expecting another schema change to come down the nightly pipe not too long from now, so I wasn't going to worry about it.  Given my understanding of the limited numbers of 3.1 nightly users, it didn't strike me that more than a handful of users would be likely to be affected.
landed
http://hg.mozilla.org/comm-central/rev/f3ae26af4fb8
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 3.1b2