Closed Bug 354951 Opened 18 years ago Closed 14 years ago

ICS import fails in ip_parseFromStream [in nsIScriptableUnicodeConverter.convertFromByteArray]

Categories

(Calendar :: Import and Export, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 334657

People

(Reporter: mozilla, Unassigned)

Attachments

(3 files)

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a1) Gecko/20060930 Sunbird/0.3
and
Mozilla/5.0 (OS/2; U; Warp 4.5; en-US; rv:1.9a1) Gecko/20060930 Calendar/0.3

Even after the fixes for bug 315672 and bug 352842 importing my .ics file from Evolution does not work. The error message is identical on Linux and OS/2 (apart from the paths involved) and looks like this (copied from Error Console):

Error: [Exception... "Component returned failure code: 0x8000ffff (NS_ERROR_UNEXPECTED) [nsIScriptableUnicodeConverter.convertFromByteArray]"  nsresult: "0x8000ffff (NS_ERROR_UNEXPECTED)"  location: "JS frame :: file:///M:/SUNBIRD/js/calIcsImportExport.js :: ics_importFromStream :: line 98"  data: no]
Source File: file:///M:/SUNBIRD/js/calIcsImportExport.js
Line: 98
There isn't much we can do here without a testcase... Can you attach the file that gives problems? If possible, create a new evolution calendar, with as few items as possible, and without any private data.
Yes, I know. I really didn't want to attach my whole appointments list, I am working on reducing it to a testcase. So far I can only say that a simple Evolution .ics file as you suggest doesn't cause the problem, while my full .ics file does.
Attached file ICS test file
The problem seems to be that convertFromByteArray() stops processing as soon as it encounters an octet in the stream that is > 200 or so (maybe even >127?). This file contains one of those in the SUMMARY line. (I edited it by hand so it may not be a valid .ics file any more but it shows the bug.)
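A minimal sketch of why a single high octet is enough to trigger the failure. It assumes a modern JavaScript runtime with the standard `TextDecoder`, used here only as a stand-in for strict UTF-8 decoding; it is not the `nsIScriptableUnicodeConverter` code path, and `isValidUtf8` and the sample byte arrays are hypothetical names for illustration.

```javascript
// Sketch only: TextDecoder stands in for a strict UTF-8 converter.
// It is NOT the XPCOM converter that Sunbird calls.
function isValidUtf8(bytes) {
  try {
    // fatal: true makes decode() throw on any malformed sequence
    new TextDecoder("utf-8", { fatal: true }).decode(Uint8Array.from(bytes));
    return true;
  } catch (e) {
    return false;
  }
}

// "SUM": pure ASCII, every octet is below 0x80
const asciiBytes = [0x53, 0x55, 0x4d];
// "Mül" encoded as ISO-8859-1: the lone octet 0xFC is never valid in UTF-8
const latin1Bytes = [0x4d, 0xfc, 0x6c];
// "Mül" encoded as UTF-8: the umlaut is the two-octet sequence 0xC3 0xBC
const utf8Bytes = [0x4d, 0xc3, 0xbc, 0x6c];

console.log(isValidUtf8(asciiBytes));  // true
console.log(isValidUtf8(latin1Bytes)); // false
console.log(isValidUtf8(utf8Bytes));   // true
```

Octets above 127 are therefore not a problem in themselves; they only fail when they do not form a well-formed UTF-8 multi-byte sequence.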
Not sure if this is what you want to do (although from bug 315672#c18 and following comments I understood that you originally wanted to do char replacement) but it works for me.
(In reply to comment #4)
This is not a solution. See Bug 315672 Comment #3 and Bug 315672 Comment #4.
Yeah, I have seen that, but the same person who minused the first patch there later argued that it should be done (in the comment that I pointed to).
Oh, and if "silent dataloss" is not wanted, then just show a warning alert.
This bug seems invalid to me. Evolution exports invalid ics file (not utf8), so sunbird can't import it. Nothing can be done about invalid files.
Huh? The patch _shows_ that something can be done.

Put up an alert saying: 
   The ICS file that you are trying to import contains invalid characters.
   Do you want me to replace them with question marks?
     [ Yes, continue ] [ No, stop import ]
If the user presses Yes, apply the changes as in the patch.

Btw, it may not really be Evolution's fault; those chars could have been imported from gnome-calendar, which I was using before.
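The confirm-then-replace flow suggested above could be sketched roughly as follows. This is not the attached patch: it assumes the standard `TextDecoder` instead of the XPCOM converter, and `decodeIcsWithConfirmation` and the `askUser` callback are hypothetical names.

```javascript
// Hypothetical helper. askUser is an assumed callback that shows the prompt
// and returns true for "Yes, continue" or false for "No, stop import".
function decodeIcsWithConfirmation(bytes, askUser) {
  const data = Uint8Array.from(bytes);
  try {
    // Strict pass: throws if the data is not well-formed UTF-8
    return new TextDecoder("utf-8", { fatal: true }).decode(data);
  } catch (e) {
    const proceed = askUser(
      "The ICS file that you are trying to import contains invalid characters.\n" +
      "Do you want me to replace them with question marks?"
    );
    if (!proceed) {
      return null; // import stopped, nothing is altered silently
    }
    // Lenient pass: malformed sequences become U+FFFD, which is mapped to "?"
    return new TextDecoder("utf-8").decode(data).replace(/\uFFFD/g, "?");
  }
}

// "Mül" in ISO-8859-1; 0xFC is not valid UTF-8
const sample = [0x4d, 0xfc, 0x6c];
console.log(decodeIcsWithConfirmation(sample, () => true));  // "M?l"
console.log(decodeIcsWithConfirmation(sample, () => false)); // null
```

One caveat of this sketch: a file that was valid apart from one bad sequence but already contained a literal U+FFFD would have that character replaced as well.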
(In reply to comment #8)
> This bug seems invalid to me. Evolution exports invalid ics file (not utf8), so
> sunbird can't import it. Nothing can be done about invalid files.
> 
This bug is valid to me. I have the same error, which prevents me from importing my calendars from Sunbird 0.2 into Sunbird 0.3. I have never used Evolution, only the Calendar extension and then Sunbird (for some years). In any case, throwing an error (exception) cannot be a good solution for any software.
Status: UNCONFIRMED → NEW
Ever confirmed: true
blocking 0.5?
Flags: blocking-calendar0.5?
Retest with Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.1.2pre) Gecko/20070225 Calendar/0.4a1: Import fails with message

Error: [Exception... "Component returned failure code: 0x8000ffff (NS_ERROR_UNEXPECTED) [nsIScriptableUnicodeConverter.convertFromByteArray]"  nsresult: "0x8000ffff (NS_ERROR_UNEXPECTED)"  location: "JS frame :: file:///D:/sunbird/js/calIcsParser.js :: ip_parseFromStream :: line 189"  data: no]
Source File: file:///D:/sunbird/js/calIcsParser.js
Line: 189
Summary: ICS import fails in ics_importFromStream → ICS import fails in ip_parseFromStream [in nsIScriptableUnicodeConverter.convertFromByteArray]
This doesn't block, but we wouldn't turn down patches.
Flags: blocking-calendar0.5? → blocking-calendar0.5-
I think before you invite patches you should first clarify what an acceptable patch should do. The ones that I have created were obviously not what you wanted.
I think this is a worthwhile change, especially if you want to interoperate with Outlook (perhaps via the CalDav plugins for Outlook): when getting raw data out of Outlook (either via MAPI or OOM), it is very common to get OEM characters that will cause this same sort of issue.
I suggest blocking 0.5 until at least a replace with other chars (and a warning) has landed in the builds. Imho, we want to prevent dataloss and catch errors where we can, especially given ctalbert's remark: we'll be getting more of these errors with a growing user base.
Attached file Sample ICS file
I get this failure with this file
This happens to me quite frequently and (at least for me) doesn't appear to be related to non-UTF8 characters. I have been toying around with a sample ICS file, trying to isolate the problem. Sometimes it is just a trailing space at the end of a line. It seems very random.

I have attached a sample file which is small and is currently failing.
Attachment #257840 - Attachment mime type: text/calendar → text/plain
I found this great website for validating iCal files:
http://severinghaus.org/projects/icv/
Related to, and maybe a dupe of, bug 334657. IMO the inherent problem is whether we want to modify the ics data and replace those characters, which may cause dataloss. IMO the user should at least confirm that (comment #9).
Flags: blocking-calendar0.7?
Version: Trunk → unspecified
helping to mass reassign from requesting blocking‑calendar0.7 to wanted‑calendar0.8.
Flags: blocking-calendar0.7? → wanted-calendar0.8?
It would be good to have this in 0.8
Flags: wanted-calendar0.8?
Flags: wanted-calendar0.8+
Flags: blocking-calendar0.5-
Not going to happen for 0.8.
Flags: wanted-calendar0.8+ → wanted-calendar0.8-
I tried to import an .ics file that was generated by a website, and the [nsIScriptableUnicodeConverter.convertFromByteArray] error occurred there as well. My problem is that I cannot influence the format of the ics file, but if there are "Umlaute" (special characters in the German language) in the file, the parser throws the error.

My private solution was a small change to the source code in the file calIcsParser.js, in the function ip_parseFromStream:

original:
-------------------------------------------------------------------
......

    // Interpret the byte-array as a UTF8-string, and convert into a
    // javascript string.
    var unicodeConverter = Components.classes["@mozilla.org/intl/scriptableunicodeconverter"]
                                     .createInstance(Components.interfaces.nsIScriptableUnicodeConverter);
    // ICS files are always UTF8
    unicodeConverter.charset = "UTF-8";
    var str = unicodeConverter.convertFromByteArray(octetArray, octetArray.length);
    return this.parseString(str, aTzProvider);
-------------------------------------------------------------------

changed to:
-------------------------------------------------------------------
......

    // Interpret the byte-array as a UTF8-string, and convert into a
    // javascript string.
    var unicodeConverter = Components.classes["@mozilla.org/intl/scriptableunicodeconverter"]
                                     .createInstance(Components.interfaces.nsIScriptableUnicodeConverter);
    var str;
    try {
        // ICS files are supposed to be UTF8, so try that first
        unicodeConverter.charset = "UTF-8";
        str = unicodeConverter.convertFromByteArray(octetArray, octetArray.length);
    } catch (err) {
        // Conversion failed: retry, treating the bytes as ISO-8859-1
        unicodeConverter.charset = "ISO-8859-1";
        str = unicodeConverter.convertFromByteArray(octetArray, octetArray.length);
    }

    return this.parseString(str, aTzProvider);
-------------------------------------------------------------------
I know it is not the best solution for every problem described above, but it works in my case, so I thought I would share how I fixed the problem.

I hope my English is understandable; I know it should be better...
best regards
Thomas
Thomas, your English is fine :-)

As I've heard, ICS that is not UTF-8 is invalid, but of course it might be a good idea to be forgiving. I'm not sure falling back to iso-8859-1 is the right solution though. That might work out fine for German and such, but other character sets might also be wanted.
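A rough sketch of why the iso-8859-1 fallback is only a partial answer. It assumes the standard `TextDecoder` for the strict UTF-8 pass and a hand-written byte-to-code-point mapping for ISO-8859-1; `decodeLatin1`, `decodeUtf8OrLatin1` and the sample byte arrays are hypothetical names for illustration.

```javascript
// ISO-8859-1 maps every byte 0x00-0xFF to the code point with the same value,
// so this decode can never fail, whatever the bytes really are.
function decodeLatin1(bytes) {
  return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
}

// Try strict UTF-8 first; on failure, fall back to ISO-8859-1.
function decodeUtf8OrLatin1(bytes) {
  try {
    return new TextDecoder("utf-8", { fatal: true }).decode(Uint8Array.from(bytes));
  } catch (e) {
    return decodeLatin1(bytes);
  }
}

// "Müll" in ISO-8859-1: the fallback recovers the intended text
const germanBytes = [0x4d, 0xfc, 0x6c, 0x6c];
// "При" in windows-1251: the fallback succeeds but yields the wrong characters
const cyrillicBytes = [0xcf, 0xf0, 0xe8];

console.log(decodeUtf8OrLatin1(germanBytes));   // "Müll"
console.log(decodeUtf8OrLatin1(cyrillicBytes)); // "Ïðè"
```

Because the ISO-8859-1 pass accepts any byte sequence, the import would no longer throw, but text in any other legacy encoding would be imported as mojibake without warning.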

It might be an interesting idea to use [1] to detect the character set and then import it using that set.

[1] http://lxr.mozilla.org/mozilla1.8/source/extensions/universalchardet/

Thomas, would you be interested in creating a patch using this extension? We can give you guidance on doing so (also in German, if I guessed your native language correctly), if you like.

Others, do you think this would be a worthwhile approach?
IMO forgiving is good, a charset detection would be even better... FWIW: The unichar-stream-loader already replaces wrong UTF-8 sequences.
Philip, thx for your answer :)

I think the character detection (universalchardetector) is an interesting way to deal with this problem, but I've never implemented an extension before, so I'm not familiar with XPCOM. I searched the internet and found that there is a component "@mozilla.org/intl/charsetdetect;1?type=universal_charset_detector", which I tried to instantiate. But in the interface descriptions I found only the nsISupports interface, so I'm not sure whether I'm on the right track or how I can call the methods described in the C code.

PS: yes, you're right about my native language - I'm from Austria.


Taking a second look, it seems the character set detection has no public interfaces. Either this extension needs to be modified to expose public interfaces that can be called from JS, or we need to write a wrapper component with our own interface definition, in case checking in to that extension is restricted or impossible.

Something like this should do it, although the observer needs some modification to not use a C++ enum, but idl constants. See http://www.mozilla.org/scriptable/faq.html question 2 for more information.

[scriptable, uuid(....)]
interface nsICharsetDetectionObserver : nsISupports
{

  const unsigned long eNoAnswerYet = 0;
  const unsigned long eBestAnswer = 1;
  const unsigned long eSureAnswer = 2;
  const unsigned long eNoAnswerMatch = 3;
  void Notify(in string charset, in PRUint32 confident);
};

[scriptable, uuid(....)]
interface nsICharsetDetector : nsISupports
{
  void Init(in nsICharsetDetectionObserver aObserver);

  boolean DoIt(in string aBytesArray, in PRUint32 len);
};

This might not be the best way to start off with extensions and calendar development, but if you are brave, go ahead! I can guide you through it if you encounter problems. If you are otherwise interested in calendar development, please contact me by email; I'll get you started and can point you to some easier-to-fix bugs.


With the interfaces written (and placed somewhere in /intl/chardet/public, correctly registered), you could then instantiate it with

var f = Components.classes["@mozilla.org/intl/charsetdetect;1?type=universal_charset_detector"].createInstance(Components.interfaces.nsICharsetDetector);
var obs = {
  Notify: function obs_Notify(aCharset, aConfident) {
    ...
  }
};
f.Init(obs);
f.DoIt(octetArray, octetArray.length);


I think for now the easier solution would be either to use unichar-stream-loader, as Daniel noted, or to just be lazy and fall back to iso-8859-1.
I have the same problem with build 2008081318

and this ical file

http://programm.froscon.org/2008/schedule.en.ics
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE