IMAP search for Japanese fails when search key contains 0x5C or 0x22

VERIFIED FIXED in M18

Status

P2
normal
VERIFIED FIXED
19 years ago
10 years ago

People

(Reporter: momoi, Assigned: mozilla)

Tracking

Trunk
x86
Windows NT

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [nsbeta3+][pdtp2])

(Reporter)

Description

19 years ago
** Observed with 7/10/2000 Win32 build **

This is a left over bug from Bug 5933 where basic IMAP search
for non-ASCII key data has been implemented. Unfortunately
this still leaves a problem with those JPN characters whose
JIS 2nd-byte corresponds to 0x5C (\) or 0x22 (*) which have
special significance in IMAP protocols.

See a SCOPUS Communicator bug 343598 for details, particularly
my comments:

Additional Comments From momoi  05/03/1999 02:51

There I summarize what needs to be looked to delclare that the problem
has been fixed. 

Here are some characters which will fail: 

"niti" (Day) in Kanji, "a" as in Hiragana "a". 

See other comments in Bug 5933.
(Assignee)

Comment 1

19 years ago
We should use literal string over quoted string for IMAP search to avoid this 
sort of problem.

Comment 2

19 years ago
nsbeta3, reassign to taka.
Assignee: nhotta → taka
Keywords: nsbeta3
(Assignee)

Comment 3

19 years ago
accepted
Status: NEW → ASSIGNED

Comment 4

19 years ago
nsbeta3+ per i18n bug review meeting.
Whiteboard: [nsbeta3+]

Comment 5

18 years ago
What happen to English message which have \ and * in it ?
For example, if I have a English message, the subject is "Hello\My Test"
and I search "o\M", can I find it ?
or if I have a message which subject is "Hello*My Test" and I search
for "o*M" , can I find it ?

If the answer is NO, then we have a generic escaping issue that MailNews folks 
didn't do. 
(Reporter)

Comment 6

18 years ago
I did not paste in all the comments from the internal bug database
but in that bug, I stated that these special characters are missed
even in ASCII search.

0x5C (\) -- backslash
0x22 (") -- double-quote   (there was an error in my original bug report above. 0x22 is not (*). )

These 2 are the "quoted specials" in RFC 2060.

So, yes, the complete fix will have to deal with dealing with these
in ASCII, or not using quoted strings in IMAP search.
(Reporter)

Comment 7

18 years ago
Strings such as "hello\kitty" or "common"key"" cannot be searched
in ASCII currently.
I take it that taka will take care of all these problems.
(Assignee)

Comment 8

18 years ago
My plan is to switch to string literal so that we don't have to worry
about escaping all those characters.

Comment 9

18 years ago
The quoted-string code should search for characters not permitted in quoted 
strings, switching to a literal when it finds them.  It should avoid literals in 
other cases as synchronizing literals require an extra round trip.

Comment 10

18 years ago
taka what is the status. Is this on track ? Please consider buffering time for 
code review and different opinion.
(Assignee)

Comment 11

18 years ago
i'm sorry that it's taking long time.  this is my top priority thing to do
now.  i'll update you within a few days.

Comment 12

18 years ago
Mark it as P2
Priority: P3 → P2

Comment 13

18 years ago
taka is testing his approach now. He said he will post his patch shortly today.
Whiteboard: [nsbeta3+] → [nsbeta3+]ETA:9/1
(Assignee)

Comment 14

18 years ago
Per jgmyer's comment and the fact that there is a bug in escaping code,
I kept quoted string code rather than rewriting them for string literal.
Following is the diff from the tip of tree for review:


D:\0826\mozilla\mailnews\base\search\src>cvs diff -u nsMsgSearchAdapter.cpp
Index: nsMsgSearchAdapter.cpp
===================================================================
RCS file: /cvsroot/mozilla/mailnews/base/search/src/nsMsgSearchAdapter.cpp,v
retrieving revision 1.25
diff -u -r1.25 nsMsgSearchAdapter.cpp
--- nsMsgSearchAdapter.cpp      2000/08/20 01:08:29     1.25
+++ nsMsgSearchAdapter.cpp      2000/08/30 12:36:09
@@ -619,19 +619,35 @@

                        useQuotes = !reallyDredd ||
                 (nsAutoString(convertedValue).FindChar((PRUnichar)' ') != -1);
-                       if (useQuotes)
+
+                       // now convert to char* and escape quoted_specials
+                       value = TryToConvertCharset(convertedValue, 
destCharset, reallyDredd);
+                       if (value)
                        {
-                PRUnichar *oldConvertedValue = convertedValue;
-                               convertedValue = 
EscapeQuoteImapSearchProtocol(convertedValue);
-                nsCRT::free(oldConvertedValue);
+                               if (useQuotes)
+                               {
+                                       char *oldValue = value;
+                                       // max escaped length is one extra 
character for every character in the cmd.
+                                       char *newValue = 
(char*)PR_Malloc(sizeof(char) * (2*nsCRT::strlen(value) + 1));
+                                       if (newValue)
+                                       {
+                                               char *p = newValue;
+                                               while (1)
+                                               {
+                                                       char ch = *value++;
+                                                       if (!ch)
+                                                               break;
+                                                       if (ch == '"' || ch == 
'\\')
+                                                               *p++ = '\\';
+                                                       *p++ = ch;
+                                               }
+                                               *p = '\0';
+                                               value = 
nsCRT::strdup(newValue); // realloc down to smaller size
+                                               nsCRT::free(newValue);
+                                       }
+                                       nsCRT::free(oldValue);
+                               }
                        }
-
-            // now convert to char*
-            value = TryToConvertCharset(convertedValue, destCharset, 
reallyDredd);
-            // if couldn't convert, send as is
-                       if (!value)
-                               value = 
NS_ConvertUCS2toUTF8(convertedValue).ToNewCString();
-
                        nsCRT::free(convertedValue);
                        valueWasAllocated = PR_TRUE;


D:\0826\mozilla\mailnews\base\search\src>

Comment 15

18 years ago
this looks ok to me from what I remember after poking around in this code
adding bienvenu in case he wants to review too

Updated

18 years ago
Whiteboard: [nsbeta3+]ETA:9/1 → [nsbeta3+]fix in hand. code reviewed. Wati for more review

Comment 16

18 years ago
I don't really know this code, so I'll just second Alec's comment. It 
looks ok, and performance isn't an issue since this code just executes when 
we're building search terms, which is rare.
(Assignee)

Comment 17

18 years ago
So far, I haven't heard of any complains.  I'll check it in late today (JST).

Updated

18 years ago
Whiteboard: [nsbeta3+]fix in hand. code reviewed. Wati for more review → [nsbeta3+]code reviwed. Wait for tree open to check in.

Comment 18

18 years ago
please check it in.
(Assignee)

Comment 19

18 years ago
checked in. mark as fixed.
Status: ASSIGNED → RESOLVED
Last Resolved: 18 years ago
Resolution: --- → FIXED
(Reporter)

Comment 20

18 years ago
With 9/11/2000 Win32 build,
characters like "hon" as in "nihon" now can be searched.
So also we can search with hiragana "a". 
(Note that I misstated above that "ni" in "nihon" cannot be 
searched. "ni" or "niti" has no problem. It was "hon" that contains
0x5C byte.

However, this is not enough, there are other other characters like
0x5C (\) -- backslash
0x22 (") -- double-quote 

in English/ASCII which will fail. 
I'm out of time for this one and so will pass this on to ji.
Please check out ASCII ones in addition to Japanese ones.
QA Contact: momoi → ji

Comment 21

18 years ago
Checked 2000-09-18-08 win32 build.
When the keystring contains 0x5c characters, like "hon", IMAP search doesn't 
list the mail which matches the search criteria. When the key string contains 
hiragana "a", an error dialog pops up, saying that "The current command didn't 
succeed. The mail server responded:Unexpected extra arguments to Search."

Reopened the bug.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Comment 22

18 years ago
I could really use some help on this one.

Comment 23

18 years ago
Ftang/Nhotta - Can you check to see if Taka can help Alec?

Comment 24

18 years ago
It's actually assign to him.
Taka san, could you take a look at this?
(Assignee)

Comment 25

18 years ago
There is a piece of code in NECKO which replaces '\' with '/'.
I'll try to find a way to fix the problem with minimum impact.

Comment 26

18 years ago
If you generate %5C instead of \ in the mailnews code, will necko change it back 
to \ ? 
Whiteboard: [nsbeta3+]code reviwed. Wait for tree open to check in. → [nsbeta3+]
(Assignee)

Comment 27

18 years ago
didn't work.
(Assignee)

Comment 28

18 years ago
i've found a way to work around NECKO and i'm pretty confident that I can fix 
this by end of tomorrow. i'll post a diff for review once tomorrow.

Comment 29

18 years ago
pdt agrees p2.
Whiteboard: [nsbeta3+] → [nsbeta3+][pdtp2]
Target Milestone: --- → M18
(Assignee)

Comment 30

18 years ago
D:\0919\mozilla\mailnews\base\search\src>cvs diff -u nsMsgSearchAdapter.cpp
Index: nsMsgSearchAdapter.cpp
===================================================================
RCS file: /cvsroot/mozilla/mailnews/base/search/src/nsMsgSearchAdapter.cpp,v
retrieving revision 1.26
diff -u -r1.26 nsMsgSearchAdapter.cpp
--- nsMsgSearchAdapter.cpp      2000/09/05 09:07:24     1.26
+++ nsMsgSearchAdapter.cpp      2000/09/21 15:07:20
@@ -217,8 +217,16 @@
        return result;
 }

+/*
+   09/21/2000 - taka@netscape.com
+   This method is bogus. Escape must be done against char * not PRUnichar *
+   should be rewritten later.
+   for now, just duplicate the string.
+*/
 PRUnichar *nsMsgSearchAdapter::EscapeSearchUrl (const PRUnichar *nntpCommand)
 {
+       return nsCRT::strdup(nntpCommand);
+#if 0
        PRUnichar *result = nsnull;
        // max escaped length is two extra characters for every character in 
the cmd.
   PRUnichar *scratchBuf = (PRUnichar*) PR_Malloc(sizeof(PRUnichar) * 
(3*nsCRT::strlen(nntpCommand) + 1));
@@ -246,11 +254,20 @@
         nsCRT::free (scratchBuf);
        }
        return result;
+#endif
 }

+/*
+   09/21/2000 - taka@netscape.com
+   This method is bogus. Escape must be done against char * not PRUnichar *
+   should be rewritten later.
+   for now, just duplicate the string.
+*/
 PRUnichar *
 nsMsgSearchAdapter::EscapeImapSearchProtocol(const PRUnichar *imapCommand)
 {
+       return nsCRT::strdup(imapCommand);
+#if 0
        PRUnichar *result = nsnull;
        // max escaped length is one extra character for every character in the 
cmd.
     PRUnichar *scratchBuf =
@@ -276,11 +293,20 @@
         nsCRT::free (scratchBuf);
        }
        return result;
+#endif
 }

+/*
+   09/21/2000 - taka@netscape.com
+   This method is bogus. Escape must be done against char * not PRUnichar *
+   should be rewritten later.
+   for now, just duplicate the string.
+*/
 PRUnichar *
 nsMsgSearchAdapter::EscapeQuoteImapSearchProtocol(const PRUnichar 
*imapCommand)
 {
+       return nsCRT::strdup(imapCommand);
+#if 0
        PRUnichar *result = nsnull;
        // max escaped length is one extra character for every character in the 
cmd.
     PRUnichar *scratchBuf =
@@ -306,6 +332,7 @@
     nsCRT::free (scratchBuf);
        }
        return result;
+#endif
 }


@@ -616,37 +643,32 @@
             // do all sorts of crazy escaping
                        convertedValue = reallyDredd ? EscapeSearchUrl 
(searchTermValue) :
                 EscapeImapSearchProtocol(searchTermValue);
-
                        useQuotes = !reallyDredd ||
                 (nsAutoString(convertedValue).FindChar((PRUnichar)' ') != -1);
-
                        // now convert to char* and escape quoted_specials
                        value = TryToConvertCharset(convertedValue, 
destCharset, reallyDredd);
                        if (value)
                        {
-                               if (useQuotes)
+                               char *oldValue = value;
+                               // max escaped length is one extra character 
for every character in the cmd.
+                               char *newValue = (char*)PR_Malloc(sizeof(char) 
* (2*nsCRT::strlen(value) + 1));
+                               if (newValue)
                                {
-                                       char *oldValue = value;
-                                       // max escaped length is one extra 
character for every character in the cmd.
-                                       char *newValue = 
(char*)PR_Malloc(sizeof(char) * (2*nsCRT::strlen(value) + 1));
-                                       if (newValue)
+                                       char *p = newValue;
+                                       while (1)
                                        {
-                                               char *p = newValue;
-                                               while (1)
-                                               {
-                                                       char ch = *value++;
-                                                       if (!ch)
-                                                               break;
-                                                       if (ch == '"' || ch == 
'\\')
-                                                               *p++ = '\\';
-                                                       *p++ = ch;
-                                               }
-                                               *p = '\0';
-                                               value = 
nsCRT::strdup(newValue); // realloc down to smaller size
-                                               nsCRT::free(newValue);
+                                               char ch = *value++;
+                                               if (!ch)
+                                                       break;
+                                               if ((useQuotes ? ch == '"' : 0) 
|| ch == '\\')
+                                                       *p++ = '\\';
+                                               *p++ = ch;
                                        }
-                                       nsCRT::free(oldValue);
+                                       *p = '\0';
+                                       value = nsCRT::strdup(newValue); // 
realloc down to smaller size
+                                       nsCRT::free(newValue);
                                }
+                               nsCRT::free(oldValue);
                        }
                        nsCRT::free(convertedValue);
                        valueWasAllocated = PR_TRUE;

D:\0919\mozilla\mailnews\base\search\src>
(Assignee)

Comment 31

18 years ago
D:\0919\mozilla\mailnews\imap\src>cvs diff -u nsImapService.cpp
Index: nsImapService.cpp
===================================================================
RCS file: /cvsroot/mozilla/mailnews/imap/src/nsImapService.cpp,v
retrieving revision 1.159
diff -u -r1.159 nsImapService.cpp
--- nsImapService.cpp   2000/09/20 12:23:42     1.159
+++ nsImapService.cpp   2000/09/21 15:03:48
@@ -772,7 +772,12 @@
        urlSpec.AppendWithConversion(hierarchySeparator);
     urlSpec.Append((const char *) folderName);
     urlSpec.Append('>');
-    urlSpec.Append(aSearchUri);
+    // escape aSearchUri so that IMAP special characters (i.e. '\')
+    // won't be replaced with '/' in NECKO.
+    // it will be unescaped in nsImapUrl::ParseUrl().
+    char *search_cmd = nsEscape((char *)aSearchUri, url_XAlphas);
+    urlSpec.Append(search_cmd);
+    nsCRT::free(search_cmd);
     rv = mailNewsUrl->SetSpec((char *) urlSpec.GetBuffer());
     if (NS_SUCCEEDED(rv))
     {


D:\0919\mozilla\mailnews\imap\src>
(Assignee)

Comment 32

18 years ago
two files need to be updated.  search criteria will be URL escaped so that
IMAP special character '\' won't be replaced with '/' in NECKO.  search
criteria string will be unescaped later before being sent to server.

a couple of methods in nsMsgSearchAdapter class need to be rewritten.
escaping must be done after converting from UCS2.  i'll revisit
these after M18.

Comment 33

18 years ago
Can you make sure that searching for things like %, /, and " still work?

The escaping that is done on PRUnichar* was done to avoid excess conversions (if
you try to change them back to char*, you'll see what I'm talking about) - it
seems like it should be perfectly valid to escape/unescape PRUnichar* stuff.

After you've verified that the above search strings work, then r=alecf.

Comment 34

18 years ago
>it seems like it should be perfectly valid to escape/unescape PRUnichar* stuff.
char is 8 bits, so you escape it into %xx
PRUnichar is 16 bits, how can you escape / unescape it correctly ? 

Comment 35

18 years ago
alec- taka hack from Japan, in a different time zone from us. His AIM is 
mozippe. Sometime I saw him on line after 5:00PM PST. 

Comment 36

18 years ago
well, I assume / would become PRUnichar('%') plus the unicode equivalent of the
ASCII charcode for /.

Comment 37

18 years ago
There are several characters other than U+002F which use a 2F octet in their 
ISO-2022-* representation.  Since a U+0025 character will cause an ISO-2022-* 
encoder to shift into ASCII mode, how can you possibly escape in a (PRUnicode *) 
a non-ASCII character which uses a 2F octet in is ISO-2022-* representation?
(Assignee)

Comment 38

18 years ago
double checked that all ASCII range characters including ", \, and % can be 
searched correctly against nsmail-1.mcom.com.

Alec, please review the code above.  I'll check in as soon as tree becomes
open and you give me a go sign.
(Assignee)

Comment 39

18 years ago
checked in.  r=alecf, a=alecf
mark as fixed.
Status: REOPENED → RESOLVED
Last Resolved: 18 years ago18 years ago
Resolution: --- → FIXED

Comment 40

18 years ago
 As long as block bug 53736 is resolved, I'll verify this.

Comment 41

18 years ago
Verified with win32 2000092409 build. It is fixed.
Status: RESOLVED → VERIFIED
Product: MailNews → Core
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.