International support for IMAP4 search



MailNews Core
20 years ago
9 years ago


(Reporter: Takayuki Tei, Assigned: nhottanscp)


Windows NT


(Whiteboard: nsbeta2+]Exception Feature)



20 years ago
(This bug was imported from BugSplat, Netscape's internal bugsystem.  It
was known there as bug #88257.
Imported into Bugzilla on 05/04/99 17:49)

The Messenger client should have a fallback mechanism in case the IMAP4 server
doesn't support the charset used with the SEARCH command. For example, when it's
working in a Japanese character encoding, it should work as follows:

     result = SEARCH(UTF8 string with charset "UTF-8");
     if (result == NO) {     // UTF-8 may not be supported.
         result = SEARCH(ISO-2022-JP string with charset "ISO-2022-JP");
         if (result == NO) { // ISO-2022-JP may not be supported.
             result = SEARCH(AS IS without charset);
             if (result == NO)
                 printf("Couldn't match any");
         }
     }

     Note: By checking whether the search string contains only ASCII,
     you can skip the first two SEARCH() calls.  It's up to the implementation.

The Messenger client in Communicator 4.0 doesn't work like this.  It sends the SEARCH
command with the "Shift_JIS" charset, and gives up without retrying if the server
response is "NO".

Comment 1

20 years ago
John, Isn't this on the Gromit "In" list?

Comment 2

20 years ago
Actually, there is one change to the algorithm specified here.

As the very first step, if the search string contains only US-ASCII (regardless of
the encoding of the search UI), then SEARCH with charset=US-ASCII; otherwise continue
as listed here.

Comment 3

20 years ago
Seems more search related than IMAP related. If you disagree, assign back to me.

Comment 4

20 years ago
Yes; it's search related, so it goes to Scott :-)

We'll need to invent some way to allow multiple passes at a single search scope,
which we don't have right now.

To clarify jfriend's point, if the search string contains only US-ASCII, we only
try US-ASCII, and not any i18n charset stuff.

I'd also like to clarify whether the IMAP server must send a NO response
if it doesn't know the charset, or whether it can just search and not find any

Comment 5

20 years ago
Setting TFV to 4.5.

Comment 6

20 years ago
Mass moving bugs from product version 5.0 to 4.5 since that's where the bugs are
now (no change to TFV).

Comment 7

20 years ago
Setting qa assigned to field.

Comment 8

19 years ago
Not a PR1 stopper.

Comment 9

19 years ago
Bulk change: Bug assigned to mail/news engineer but no component specified.
Changed to mail/news component.

Comment 10

19 years ago
<sorry for the bug notification intrusion.  Product version on this bug shows
1.0 (due to a bugsplat bug).  Correcting all mail/news bugs numbered < 90000 to
product version 4.0.   Bulk changing this.>

Comment 11

19 years ago
FYI. is now running MS4.0 Beta which supports various
charset option for IMAP SEARCH.  If you don't have test environment
now, ask for an account.

However, I would recommend having WSU's IMAP4 server as a reference
as well. It also has a good SEARCH implementation.

Comment 12

19 years ago
Phil, wasn't this the I18N bug we were talking about with Naoki and Bob? Were
you going to end up doing this one?

changing QA field to gbush

Comment 13

19 years ago
Bouncing over to Phil.

Comment 14

19 years ago
M15, I hope. I won't get to this for 4.5b2

Comment 15

19 years ago
Later. Too many more serious bugs for 4.5.

Comment 16

19 years ago
How can this remain "latered"?  Negotiation of search
charsets with our own MS4.0 is something most major mail clients
can perform now, e.g. Outlook, WinBiff, etc. We should be
competitive and support the search charset negotiation. Without
this, our IMAP search for Japanese and other non-ASCII languages
would not work. How can we promote our clients to enterprise
customers without this feature working?

MS4.0 is nearing completion. This bug should be a perfect candidate
for 4.51.

Re-opening for consideration in 4.51.

Comment 17

19 years ago
In case we need to review how this functionality should work,
I consulted taka and came up with the following summary of the

** Proposed steps for negotiating down the IMAP search charset. **

0. Check the 'capability' of the IMAP server for UTF-8.

   IMAP4 capability command should return something like the
   following in response to "a capability" command:

     a capability
     a OK Completed

If the return string contains "X-NETSCAPE", we can be assured of UTF-8
search capability with this server.

(Note: If you see X-NETSCAPE in the response to the CAPABILITY command, there's
       a 100% guarantee that the server will recognize the UTF-8 charset.  Do NOT
       rely on the banner message because it's configurable; the user may change
       it to something else. You can always try UTF-8 as the charset whether or not
       it's IMAP4 server (it will fail if the server doesn't know UTF-8). )

1. Determine if the search string contains any 8-bit characters.

   ---> If not (=only 7-bit data), send the search string in ASCII.

2. If 1) is yes, then assume that the search charset is in the System
   Charset (or the global default -- e.g. in 4.5 we use global default
   for LDAP servers so that more than one charset can be used for search.)
   Convert it to UTF-8 and send to the server. If the server accepts it,
   then it should return matches if there are any matches.

3. If the request in 2 is rejected by the server, then, send the string in
   the standard mail charset matching the System (or the global default)
   charset. (For example, iso-2022-jp for the Japanese Win/Mac system
   charset, Shift_JIS.)

4. If the request in 3 is rejected, then send the raw search
   string (as is) without any charset specification.
   And this completes the client's responsibility.

Open issue: Should we use the global default or the system charset
            as the basis for the source charset?  The global default is
            more flexible in that we can input in different charsets
            if proper keyboards or input methods are available as we
            change the global default.
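
The negotiation in steps 2 through 4 above could be sketched roughly as follows.
This is only an illustration in C++, not Messenger code: SearchAttempt,
SendSearch, and NegotiateSearchCharset are hypothetical names, and both the
charset conversion and the actual IMAP transport are assumed to be supplied by
the caller.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Outcome of one SEARCH command as seen by the client.
enum class SearchStatus { Ok, No };

// One candidate encoding of the search key (hypothetical type).
struct SearchAttempt {
    std::string charset;  // empty means "send without a CHARSET argument"
    std::string bytes;    // search key already converted to that charset
};

// Hypothetical transport hook: sends one SEARCH and reports OK or NO.
using SendSearch = std::function<SearchStatus(const SearchAttempt&)>;

// Tries each attempt in order (for example UTF-8, then the mail charset,
// then the raw string) and returns the index of the first attempt the
// server accepts, or -1 if every attempt is rejected.
int NegotiateSearchCharset(const std::vector<SearchAttempt>& attempts,
                           const SendSearch& send) {
    for (std::size_t i = 0; i < attempts.size(); ++i) {
        if (send(attempts[i]) == SearchStatus::Ok) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

A caller would build the list in the order given by steps 2, 3, and 4, and
would skip the whole negotiation for a 7-bit search string as in step 1.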

Comment 18

19 years ago
qa assigned shouldn't be gbush.  Should be someone in msanz's group.

Comment 19

19 years ago
There are two issues,
The pref mailnews.force_ascii_search is set to true.
The second problem is that we need to convert search string to mail charset
which is JIS in case of Japanese. We are currently using the folder csid which
is ShiftJIS or EUC.

Here is a change I applied to my local tree.
Index: search.cpp
RCS file: /m/src/ns/lib/libmsg/search.cpp,v
retrieving revision
diff -c -r1. search.cpp
*** search.cpp	1998/10/01 04:24:55
--- search.cpp	1998/11/10 18:53:45
*** 2182,2188 ****
--- 2182,2192 ----

  		// Ask the newsgroup/folder for its csid.
  		if (m_scope->m_folder)
  			dst_csid = m_scope->m_folder->GetFolderCSID() &
  dst_csid = INTL_DefaultMailCharSetID(dst_csid);


  	// default means that our best guess is to get the default window char set ID

Comment 20

19 years ago
This sounds like a lot of work, so I think we shouldn't commit to doing this
for 4.51, unless a customer escalation comes in which forces us to do it.
Clearing TFV. Please see me before setting the TFV.

BTW, I think Naoki's proposed change above is partial, at best, and defeats the
per-folder CSID that we allow the user to set.

Comment 21

19 years ago
Why can it sound like a lot of work?  Naoki shows everything to fix.
What is wrong with partial solution?  Any serious side effect?

Although I don't mind what TFV it's got, I do care if customers in Japan
find all other IMAP clients work with Messaging Server 4.0, but
only Netscape client (except Messenger Express 4.1) doesn't with
Netscape's own IMAP server.

I've waited almost 10 month. And, seems like I have to keep
waiting more.  Am I expecting too much?

Comment 22

19 years ago
>and defeats the per-folder CSID that we allow the user to set.
That has been true anyway as we restrict to ASCII only. The other issue is that
we only support a single charset inside the search dialog. Another, more complicated
issue is the folder hierarchy, which may have mixed-charset situations.
So, those issues need to be solved in the future. But I am not sure we should
support only ASCII until we solve those issues.

Comment 23

19 years ago
> Why can it sound like a lot of work?

Because none of the other searching code takes more than one attempt at a search
based on the results of previous attempts.

> Naoki shows everything to fix.

That is absolutely not true. Naoki shows how to convert to the mail server's
charset only. That does not implement the algorithm Kat showed his 10/29/98

> I've waited almost 10 month. And, seems like I have to keep waiting more.
> Am I expecting too much?

As I said above, the question of when we add this feature is determined by
customer escalations. There are lots of other features that people have wanted
for longer than 10 months that we're not doing in 4.51.

Comment 24

19 years ago
After discussing various pros and cons, we have decided to
open a new bug for fulfilling a minimum IMAP search
requirement for the Japanese market. A new bug does not
ask for server-client negotiation, and should be handled by
the escalation team.
The new bug is: 334536.

Comment 25

19 years ago
TFV 5.0

Comment 26

19 years ago
I (or someone else) will be moving enhancements, etc, bugs targeted for 5.0 to
bugzilla in the near future.

------- Additional Comments From paulmac  May-04-1999 17:44 -------

Okay, time to close out old bugsplat bugs - Please move to bugzilla if this
one is still relevant or mark won't fix, please.

------- Additional Comments From momoi  May-04-1999 17:49 -------

Well, this is still a valid bug.
Let's move to 5.0 and send it to the Mail/News team.


18 years ago
Target Milestone: M9


18 years ago
Blocks: 7228


18 years ago
Target Milestone: M9 → M13

Comment 27

18 years ago
search is moving out.

Comment 28

18 years ago
Search won't be implemented until after Beta 1, so this bug does not need to be
fixed until after Beta 1


18 years ago
Assignee: phil → mscott
Target Milestone: M13 → M14

Comment 29

18 years ago
mscott owns the search backend, so reassigning to him for M14. Searching is not
a B1 feature.


18 years ago
Target Milestone: M14 → M16

Comment 30

18 years ago
triaging...this is not a beta2 bug.
Target Milestone: M16 → M18

Comment 31

18 years ago
Based on Beta2 Criteria http://client/seamonkey/prd/beta2criteria.html.
This is a beta2 P1 bug; should we add the beta2 keyword to this bug?

Comment 32

18 years ago
Karen, the beta2 doc says we need to implement a search back end, which is a
separate bug. We need the search back end before we can start fixing bugs like
this which have been around since 4.5. =(

I don't see any mention of this bug in the beta2 docs so I'm not sure what you
were looking at or maybe you were thinking about the comment to implement search
for beta2?

Comment 33

18 years ago
I suck. I was only looking under mail, not under mail i18n, in the beta2 docs.

moving back to a beta2 milestone. Thanks for catching my mistake Karen!

I18N, are you guys sure this is a beta2 stopper?
Target Milestone: M18 → M17

Comment 34

18 years ago
4.x didn't do this - I can't believe it would be a beta stopper for 6.0, and we 
could ship with it as well - we always have before.

Comment 35

18 years ago
From Beta2 Criteria http://client/seamonkey/prd/beta2criteria.html.
1) Scroll down to see the Features
2) Select I18N Features.
3) Select Mail I18N
4) Search for Mail/News Tasks - IMAP I18N - IMAP search 5933 - P1

P.S. I don't know what I18N means. Does anybody know?

Comment 36

18 years ago
I18N = Internationalization. I believe that the i18n group says it's a beta 
stopper. I just don't think we're going to have time to do it.

Comment 37

18 years ago
OK. I am just checking & trying to clarify that.
Then the document should be modified!!

Comment 38

18 years ago
This bug was transferred from 4.x bug system.
What we need for beta2 is i18n IMAP search to work. It is working in 4.x.
In 4.x, if the ASCII search fails then it falls back to another query using the
folder charset.
But for Mozilla, it is easier and better to do a UTF-8 query since we have the
query string in Unicode.

Comment 39

18 years ago
This is an IMAP spec.

We made some very hard choices to ship 4.5 and this was one of the features
that was cut at the very end.

The mail server guys have been very adamant that the client needs to support
this and were very disappointed that it fell off the 4.5 list at the end of
that development cycle.

taka and jgmyers can provide more data on what will break for who without this
long awaited feature...

Comment 40

18 years ago
I'd be surprised if we get 80% of the search functionality that was in 4.5 into
6.0 - getting > 100% would be a miracle. If you hadn't noticed, we haven't even
started search yet!

Comment 41

18 years ago
Putting beta2 for i18n beta2 criteria items. Contact bobj for question.
Keywords: beta2

Comment 42

18 years ago
> This is an IMAP spec.

I don't see this mentioned in RFC 2060 or 2683. Please give the spec reference
which supports your claim.


18 years ago
Blocks: 35851


18 years ago
Keywords: nsbeta2

Comment 43

18 years ago
Putting on [nsbeta2-] radar. 
Keywords: beta2
Whiteboard: [nsbeta2-]

Comment 44

18 years ago
As the bug is old and the original comment is not consistent with what we need
for beta2, I am rewriting the i18n requirement for beta2 (which is the same
level of support as the current 4.x client). I also changed the summary.
For beta2, we need US-ASCII search and charset-specified search (i18n search).

Here is how we can do it:
* Apply a 7-bit check to the search string. Assuming the search string is Unicode
(PRUnichar* or UTF-8), we can check each character against < 128.
* If the search string is 7-bit, then do a US-ASCII search (search with no
charset specified).
* If the search string is 8-bit, then get the folder charset, convert the Unicode
string to the folder charset, and specify the charset in the search command.
Summary: IMAP4 search doesn't retry if first attempt fails → International support for IMAP4 search
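
The three steps in this comment can be sketched in C++ as below. This is a
minimal illustration under stated assumptions, not the checked-in code:
IsSevenBit and SearchCommandPrefix are hypothetical names, the search string is
assumed to be UTF-8, and the conversion of the key to the folder charset is
left out. The only protocol detail relied on is that IMAP4 SEARCH accepts an
optional CHARSET argument before the search keys.

```cpp
#include <string>

// Returns true when every byte is below 0x80. For UTF-8 input this is
// equivalent to "the string contains only US-ASCII characters", because
// every non-ASCII character in UTF-8 is encoded with bytes >= 0x80.
bool IsSevenBit(const std::string& utf8) {
    for (unsigned char c : utf8) {
        if (c >= 0x80) return false;
    }
    return true;
}

// Hypothetical helper: chooses how the SEARCH command starts.
// 7-bit keys are sent with no charset; anything else names the folder
// charset, and the caller is expected to convert the key to that charset.
std::string SearchCommandPrefix(const std::string& utf8Key,
                                const std::string& folderCharset) {
    if (IsSevenBit(utf8Key)) {
        return "SEARCH";
    }
    return "SEARCH CHARSET " + folderCharset;
}
```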

Comment 45

18 years ago
clear nsbeta2-
Whiteboard: [nsbeta2-]

Comment 46

18 years ago
ftang, why did you clear nsbeta2-..can you state your case?
Whiteboard: [NEED INFO]

Comment 47

18 years ago
Since search has been an approved feature exception, this goes hand in hand with
that. It basically says make our IMAP search I18N friendly when we implement it =).

Comment 48

18 years ago
On exception list for PR2, removing 5/ [nsbeta2+]Exception Feature 
Whiteboard: [NEED INFO] → nsbeta2+]Exception Feature

Comment 49

18 years ago
It's my understanding that the mail team cut search today.

Comment 50

17 years ago
so, like the last bug, I did a bunch of i18n work yesterday.

And a reality check from everyone: This bug is over 2 years old now, a carryover
from 4.5.. the general i18n-ness of search is already covered in bug 11659..
kinda seems like this should just be a dupe. 

if however this bug is referring to the algorithm described at the top of this
file, I believe it may never have been implemented in 4.x.. and if that's the
case I'm not sure why this would be nsbeta2+

in any case, I think this should either go to bienvenu or myself to lighten
scott's load.

Comment 51

17 years ago
So after your i18n fixes, are we now close to parity
with 4.5 and later? The spec there was described in the 2000-05-01 16:00 comment above.
That should be the minimum -- it has been implemented before
and current users of Communicator will expect as much.

Comment 52

17 years ago
I _think_ so... we won't know for certain until we have a UI.
I haven't seen the equivalent of the algorithm described at the top of this might be there though

Comment 53

17 years ago
The algorithm which retries with a different character set if no hits are found
was not implemented in 4.x. Since that's what this bug was about originally, I'm
guessing that we should separate that issue (which we're not addressing for
seamonkey) from the issue of 4.x parity WRT i18n searching (which we should
address for seamonkey)

Comment 54

17 years ago
>I _think_ so... we won't know for certain until we have a UI.
Do we have a bug for that? As soon as that is resolved iqa can test i18n search.

Comment 55

17 years ago
ok, does anyone object to me marking this a dupe of 11659 (which has been marked
fixed) then? bienvenu has apparently got IMAP search working, and I have
supposedly made the whole search backend i18n friendly...

*** This bug has been marked as a duplicate of 11659 ***
Last Resolved: 17 years ago
Resolution: --- → DUPLICATE

Comment 56

17 years ago
nhotta, see bug 33101 for the search UI frontend bug.
I just added you to the CC

Comment 57

17 years ago
I object, actually. Alecf, this bug refers to a specific algorithm for IMAP4
searching that escalation engineering implemented in 4.6. This bug is to track this
when we implement search for IMAP.

It's separate from the random i18n filter and search bug you marked it a  dup of.
Resolution: DUPLICATE → ---

Comment 58

17 years ago
I don't really see how i18n search can be done, despite what Alec has done. My 
understanding was that the i18n group had to provide us with API's that existed 
in 4.5 but no longer exist in 6.0 in order for i18n search to work. Of course, 
I've been out of it for a long time, but that was my understanding.

Comment 59

17 years ago
The last time, we talked about the NNTP search API and we dropped that from beta2.
For IMAP, I think the things I mentioned in 2000-05-01 16:00 are available in 6.0
(e.g. getting a folder charset, conversion from Unicode to a folder
charset, etc.).

Comment 60

17 years ago
is the latter also true for local? Convert the headers to unicode using the 
charset and do a unicode comparison with the utf8->unicode converted search 
string? What about message bodies? We can't really convert the whole message 
body to unicode in memory, can we?

Comment 61

17 years ago
I believe that some of the search code converts the unicode search term to the
folder's charset, then performs the search with this converted string.

Comment 62

17 years ago
alecf's description is how local searching is supposed to work (and did in 4.x).


Comment 63

17 years ago
For local search, header search requires MIME decoding. Since the MIME decoder
returns Unicode, that can be compared with the search term.
For local body search, I believe we converted the body (not the search term).
Here is the 4.x code. I believe DO_I18N was defined in 4.x (otherwise Japanese
search wouldn't work).

#ifdef DO_I18N
    // In here we do I18N conversion if we get the converter
    char *newBody = nsnull;
    newBody = (char *)INTL_CallCharCodeConverter(conv, (unsigned char *) buf, (int32) PL_strlen(buf));
    if (newBody && (newBody != buf))
    {
        // CharCodeConverter return the char* to the original
        // we don't want to free body in that case
        compare = newBody;
    }
#endif

Comment 64

17 years ago
DO_I18N is not in the 4.5 code; it was added to 6.0 so the code would compile 
because things like INTL_CreateCharCodeConverter don't exist in 6.0 - I think 
this was one area where we need a 6.0 equivalent way of doing this.

Comment 65

17 years ago
you know, it's actually going to be EASIER for me to convert the user-entered
value to the folder's charset and do the body search that way. Anyone object if
I do it that way? It'll be faster too.

Comment 66

17 years ago
I take that back, it's not as simple as I had hoped.. converting the body is the
easy way right now.

Comment 67

17 years ago
Reposting my comment in 2000-05-01 16:00 which contains I18N requirement for 

> As the bug is old and the original comment is not consistent with what we need
> for beta2, I am rewriting the i18n requirement for beta2 (which is the same
> level of support as the current 4.x client). I also changed the summary.
> For beta2, we need US-ASCII search and charset specified search (i18n search).
> Here is how we can do,
> * Apply 7 bit check against search string. Assuming the search string is unicode
> (PRUnichar* or UTF-8), we can check < 128 against the search string.
> * If the search string is 7bit then the do US-ASCII search (search with no
> charset specified).
> * If the search string is 8bit then get the folder charset, convert the unicode
> string to the folder charset and specify the charset in the search command.


Comment 68

17 years ago
Adding and to cc.

Comment 69

17 years ago
Added myself to Cc.

Comment 70

17 years ago
Taka and I started to look at the code. The search criteria string is UTF-8 and
there is also a function to get a folder charset. A 7-bit check can be done
easily against a UTF-8 string. Also, we can convert the string from UTF-8 to a
folder charset.
Taka pointed out that we can use a literal string instead of a quoted string (which
needs escaping for some charsets, e.g. ISO-2022-JP).
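
To illustrate the literal-string point: an IMAP literal is announced as the
octet count in braces followed by CRLF, and the raw octets follow, so nothing
in the search key has to be escaped. The C++ sketch below only formats the two
pieces; ImapLiteral and MakeLiteral are hypothetical names, and it assumes the
ordinary (synchronizing) form where the client waits for the server's "+"
continuation before sending the octets.

```cpp
#include <string>

// The two parts of a literal as they go on the wire (hypothetical type).
struct ImapLiteral {
    std::string header;  // "{N}\r\n", sent first
    std::string octets;  // the N raw bytes, sent after the "+" continuation
};

// Wraps already-converted search key bytes as a literal. The bytes are
// passed through untouched, whatever charset they are in.
ImapLiteral MakeLiteral(const std::string& bytes) {
    return ImapLiteral{"{" + std::to_string(bytes.size()) + "}\r\n", bytes};
}
```

This matters for ISO-2022-JP because its two-byte sequences use the 7-bit
range and can therefore contain the bytes 0x22 (double quote) and 0x5C
(backslash), both of which would need escaping inside a quoted string.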

Comment 71

17 years ago
The patch (hooked up charset conversion) was reviewed. I will probably check in 

Comment 72

17 years ago
Checked in, testable once the UI is functional again.
Last Resolved: 17 years ago
Resolution: --- → FIXED

Comment 73

17 years ago
** Checked with 7/10/2000 Win32 build **

OK, we are finally able to check on this because I can now
see attribute names. 

Here's what works:

1. With the default view charset set to ISO-2022-JP, a single condition
 or "OR" with more than 1 attributes work OK to find relevant messages
 when we input Japanese search keys.

What does not work:

1. Any search after the first one using a Japanese word produces no
   change even if you change an attribute value to another Japanese
   word. Even if you close the Search window and re-open it,
   it does not seem possible to do any search. If you use ASCII values,
   you can do more than 1 search at a time successfully. This problem
   seems to be due to the use of non-ASCII data as search keys.
2. Any change in attribute category, e.g. from Subject to
   Sender, or from "OR" conjunction to "AND" conjunction. This type
   of change forces the server to send an error message saying that
   "Required argument was missing."
   This problem happens regardless of the charset of the attribute values.

There are other problems but I have not sorted them out yet.

For item 2, I'll look for an existing bug. But for Item 1, I need to
re-open this bug.
Resolution: FIXED → ---

Comment 74

17 years ago
Reassigning to me.
Can we file a separate bug for the first problem since the international search 
itself has been enabled?

Assignee: mscott → nhotta

Comment 75

17 years ago
Can we get a bit more analysis before deciding to file another
bug? If we know for sure that it has nothing to do with the 
way non-ASCII was implemented, then let's file a new bug.

Problem #1 makes Search in Japanese very difficult since 
users often try one key and then another in case the first didn't work.
Unless I am mistaken, the user will have to reboot Mozilla to try
the next search. That is really bad. 

Comment 76

17 years ago
I've looked at Problem #1 a bit further and it seems that
the problem is a bit more complex than I had described above.
It seems that if you pick certain Japanese words, you can
do more than 1 search at a time. When you use some other word,
it does not work until you use some other data that do not have
this problem. One example of a problem word is "Ni-hon" (Japan
in Kanji). I have not been able to do any search with it.

Comment 77

17 years ago
I used win32 build ID 2000071008 on WinNT 4 Japanese and I can search Japanese 
strings more than once.
First, I searched "mail" in Japanese then got some results.
Then I searched "homepage" in Japanese then got additional results and they were 
appended to the search result.
And I searched "welcome" in Japanese  then got additional results and they were 
appended to the search result.
So I cannot reproduce the problem. There may be a condition to reproduce this. 
Anyway, I prefer the problem to be filed separately.

Comment 78

17 years ago
Would you try "nihon" in Kanji and see if that works?

Comment 79

17 years ago
There seems to be another problem in search string
formation to send to the server. See the SCOPUS bug
we dealt with for Communicator, Bug ID 343598. The example
string described there, Hiragana "a", causes a server
error in Mozilla.

Comment 80

17 years ago
Please file a separate bug instead of reopening this feature bug. Individual bugs will
help us track different cases.

Comment 81

17 years ago
OK. I'll verify that the basic Intl IMAP functionality is 
There are some misses and they will be filed as separate bugs.
Last Resolved: 17 years ago
Resolution: --- → FIXED

Comment 82

17 years ago
** Checked on 7/10/2000 Win32, Mac, and Linux builds **

On the above builds, the basic non-ASCII search function is now
working as long as the search keys match the default view charset
set in the Preferences dialog.
Marking it verified as fixed.
We will file separate bugs for remaining problems with this new feature.
Product: MailNews → Core
Product: Core → MailNews Core