Open Bug 284856 Opened 20 years ago Updated 2 years ago

Search/Filter on IMAP folders are case-sensitive for non-ASCII (Cyrillic/Russian, Greek, accented Latin) keywords

Categories

(MailNews Core :: Filters, defect)

Tracking

(thunderbird112 affected, thunderbird113 affected)

People

(Reporter: jshin1987, Assigned: smontagu)

References

(Blocks 1 open bug)

Details

(Keywords: intl)

Attachments

(1 file)

Follow-up to bug 254740 which fixed the problem for local folders (and possibly POP3 folders)
Blocks: 194514
Summary: Search/Filter on IMAP folders are case-sensitive for non-ASCII keywords → Search/Filter on IMAP folders are case-sensitive for non-ASCII (Cyrillic/Russian, Greek, accented Latin) keywords
*** Bug 285130 has been marked as a duplicate of this bug. ***
Newsgroup search has the same problem. I posted two test cases to n.p.mozilla.test (news://news.mozilla.org/ddnh62$b0q2@ripley.netscape.com and news://news.mozilla.org/ddnh5c$b0q1@ripley.netscape.com). A possible lead for IMAP is http://lxr.mozilla.org/seamonkey/source/mailnews/imap/src/nsImapService.cpp#819 (nsImapService::Search); the NNTP counterpart is http://lxr.mozilla.org/seamonkey/source/mailnews/news/src/nsNntpService.cpp#1607 (nsNntpService::Search). Both are implementations of nsIMsgMessageService::Search. I'll keep looking.
Status: NEW → ASSIGNED
Assignee: jshin1987 → smontagu
Status: ASSIGNED → NEW
QA Contact: filters
Product: Core → MailNews Core
sp3000 reported on IRC that "Ä seems to match ä in body filter" and in the subject filter, but I cannot confirm this. I sent myself a message with "äsubject" in the subject and "äbody" in the message body. Using version 5.0b2pre, I was able to find the message when searching for äsubject and Äsubject, but not when searching for äbody or Äbody. I wonder why sp3000 gets different results. I'm almost certain there are some duplicate bug reports of this issue.
sp3000:
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64

wsm:
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 8bit
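One possible reason the two test messages behave differently is that a body search which compares against the stored bytes, rather than the decoded text, cannot match a base64-encoded part. This is not confirmed anywhere in this bug; the following is only a minimal Python sketch of the effect, using the standard `email` package. The helper names `raw_contains` and `decoded_contains` are hypothetical and do not correspond to anything in the MailNews code.

```python
from email.message import EmailMessage

# Build a message similar to the sp3000 case: UTF-8 body sent as base64.
msg = EmailMessage()
msg["Subject"] = "test"
msg.set_content("äbody", charset="utf-8", cte="base64")

raw = msg.as_bytes()  # the bytes as they would be stored or transferred


def raw_contains(needle: str, raw_bytes: bytes) -> bool:
    """Naive search: look for the UTF-8 bytes of the needle in the raw message."""
    return needle.encode("utf-8") in raw_bytes


def decoded_contains(needle: str, message: EmailMessage) -> bool:
    """Decode the transfer encoding and charset first, then compare case-insensitively."""
    return needle.casefold() in message.get_content().casefold()


print(raw_contains("äbody", raw))      # the base64 text hides the original bytes
print(decoded_contains("Äbody", msg))  # decoding first makes the match possible
```

If the search path skips the decoding step for some encodings, the outcome would depend on how each sender's mail client encoded the body, which would be consistent with the differing reports above.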
John, Rimas, anyone: is this resolved, or does it still reproduce for you?
Flags: needinfo?
Whiteboard: [dupeme?]
> John, Rimas, anyone: is this resolved, or does it still reproduce for you?
I haven't read the referenced bug thoroughly, but quick filtering with upper-case Cyrillic or extended Latin characters works fine for me; that is, it's case-insensitive, at least in the few IMAP folders and newsgroups that I tested. Here's a screenshot.
Flags: needinfo?
Hm, skimming further, it seems that the real problem is not with headers, but with the body. Quick filtering by body still seems to be unreliable for me. I can find many messages containing the word "today" in the body in my spam folder, but I can't find ones containing the Russian word "файле", even though I just copied it from one of those messages. It might not be just a character problem, though, as filtering for "файл" gives me one result, and shortening the filter to "фа" yields even more. I've no idea why.
Flags: needinfo?
Comment 7 sounds like bug 374795 (searching for файл isn't finding файл or vice versa, where one is written with U+0439 CYRILLIC SMALL LETTER SHORT I and the other is written with U+0438 CYRILLIC SMALL LETTER I plus U+0306 COMBINING BREVE)
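The composed/decomposed mismatch described in this comment can be illustrated with a short Python sketch. It is not taken from the Thunderbird source; the `matches` helper is a hypothetical name, and normalizing to NFC before case folding is just one way a search could make the two spellings compare equal.

```python
import unicodedata

# "файл" written with U+0439 CYRILLIC SMALL LETTER SHORT I (composed form)
composed = "\u0444\u0430\u0439\u043b"
# "файл" written with U+0438 CYRILLIC SMALL LETTER I + U+0306 COMBINING BREVE
decomposed = "\u0444\u0430\u0438\u0306\u043b"


def matches(needle: str, haystack: str) -> bool:
    """Substring test after NFC normalization and Unicode case folding."""
    def canon(s: str) -> str:
        return unicodedata.normalize("NFC", s).casefold()
    return canon(needle) in canon(haystack)


# The two strings look identical on screen but differ code point by code point.
print(composed == decomposed)
print(len(composed), len(decomposed))

# After normalization they compare equal, in either direction and in either case.
print(matches(composed, decomposed))
print(matches(composed.upper(), decomposed))
```

A plain code-point comparison treats the two spellings as different strings, which is the behavior bug 374795 describes.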
Flags: needinfo?
Simon, but why would it not find the message I pasted the word from? Is it being (de)normalized when displaying/copying/pasting, or what?
By the way, I've just checked: the message that I don't find when filtering for "файл", and the message I can find, both have the short letter I in its composed form. I can attach both messages if you like.
Is this a failing of mime?
See Also: → 1042681
Severity: normal → S3
See Also: → 506064
See Also: → 1680606
Duplicate of this bug: 506064
See Also: 1042681, 506064
Whiteboard: [dupeme?]
Version: Trunk → unspecified

If I'm understanding this correctly, we're using the search functionality built into IMAP in order to do this. If that's the case, we likely need to implement support for the COMPARATOR command from RFC 5255: Internet Message Access Protocol Internationalization (https://www.rfc-editor.org/rfc/rfc5255.html#section-4.7) and use the i;unicode-casemap comparator when possible.
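As a rough sketch of what that could look like on the client side, the Python below decides whether to issue a COMPARATOR command based on the server's advertised capabilities. It rests on assumptions that should be checked against RFC 5255: that the COMPARATOR command is only available when the server advertises `I18NLEVEL=2`, and that the comparator name is sent as a quoted string. The function name `comparator_command` and the tag `A001` are hypothetical; this is not the MailNews implementation.

```python
from typing import Iterable, Optional

UNICODE_CASEMAP = "i;unicode-casemap"


def comparator_command(tag: str, capabilities: Iterable[str]) -> Optional[str]:
    """Return a COMPARATOR command line, or None if the server does not
    appear to support it.

    Assumption: COMPARATOR requires the I18NLEVEL=2 capability (RFC 5255).
    """
    caps = {cap.upper() for cap in capabilities}
    if "I18NLEVEL=2" in caps:
        return f'{tag} COMPARATOR "{UNICODE_CASEMAP}"'
    # No COMPARATOR support: the client would have to fall back to the
    # server's default search behavior or to searching locally.
    return None


print(comparator_command("A001", ["IMAP4rev1", "I18NLEVEL=2"]))
print(comparator_command("A001", ["IMAP4rev1", "IDLE"]))
```

Servers that do not advertise the extension would still need a fallback, for example downloading message bodies and matching them locally the way local folders are handled.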
