Closed Bug 426899 Opened 17 years ago Closed 17 years ago

All webservice methods fail if a string has upper-ASCII characters in it

Categories

(Bugzilla :: WebService, defect)

Version: 3.1.3
Type: defect
Priority: Not set
Severity: major


RESOLVED FIXED
Bugzilla 3.2

People

(Reporter: LpSolit, Assigned: mkanat)

Attachments

(1 file)

+++ This bug was initially created as a clone of Bug #415796 +++

I just tested the existing Bug.get and the new User.get webservice methods in 3.1.3+, and they both fail if a string has upper-ASCII characters in it (such as "Frédéric"). Although the returned XML starts with <?xml version="1.0" encoding="UTF-8"?>, such strings are mangled, generating the following error:

not well-formed (invalid token) at line 1, column 203, byte 203 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux/XML/Parser.pm line 187

So for some reason, the strings we return are not valid UTF-8. Did we forget to turn on a parameter somewhere?

Granting blocking, as non-US installations use a lot of such upper-ASCII characters in bug summaries, product names, real names, etc., which makes the webservice unusable.
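A minimal sketch of the symptom described above, using only core Perl modules. It is an illustration, not the Bugzilla code: the helper name is_valid_utf8 is hypothetical. It shows that "Frédéric" written with single-byte é (0xE9) is not well-formed UTF-8, which is the kind of input a parser reading a document declared as UTF-8 would reject.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: true if the byte string is well-formed UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode with a CHECK flag may modify its argument
    return eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;
}

my $latin1_bytes = "Fr\xE9d\xE9ric";            # é as the single byte 0xE9
my $utf8_bytes   = "Fr\xC3\xA9d\xC3\xA9ric";    # é as the two bytes 0xC3 0xA9

printf "single-byte form valid UTF-8? %s\n", is_valid_utf8($latin1_bytes) ? 'yes' : 'no';
printf "two-byte form valid UTF-8?    %s\n", is_valid_utf8($utf8_bytes)   ? 'yes' : 'no';
```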
Flags: blocking3.2+
I've been looking this over for several hours now, and I have absolutely no idea why this happens. Part of the problem is that SOAP::Lite is really huge overkill for what we need to do. I'm starting to think we should be using RPC::XML or XML::RPC.
Assignee: webservice → mkanat
Attached patch v1
Ha, take that, SOAP::Lite! This works now with upper-ASCII characters and multi-byte characters. contrib/bz_webservice_demo doesn't always display things correctly if you have just upper ASCII characters, but that's some problem with that script, not with the webservice.
Attachment #317759 - Flags: review?(LpSolit)
Status: NEW → ASSIGNED
Comment on attachment 317759 (v1)

A bug summary with only upper-ASCII characters is badly displayed by bz_webservice_demo.pl. Add a Unicode character, and the whole bug summary is displayed correctly. Crazy! But at least this patch fixes the crash, so r=LpSolit

Max, any idea how to fix bz_webservice_demo.pl so that it displays upper-ASCII characters correctly in all cases?
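One possible explanation for that display behaviour, offered as an assumption rather than a diagnosis of bz_webservice_demo.pl: when Perl prints to a handle that has no encoding layer, a string whose characters all fit in one byte is written as single bytes, while a string containing any character above 0xFF is written as UTF-8 in its entirety. The sketch below captures the raw bytes in both cases; the helper name raw_output is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: print $text to an in-memory handle that has no
# encoding layer and return the raw bytes that were written.
sub raw_output {
    my ($text) = @_;
    my $buf = '';
    open(my $fh, '>', \$buf) or die "cannot open in-memory handle: $!";
    {
        no warnings 'utf8';    # silence "Wide character in print"
        print {$fh} $text;
    }
    close($fh);
    return $buf;
}

my $upper_ascii_only = "Fr\x{e9}d\x{e9}ric";
my $with_wide_char   = "Fr\x{e9}d\x{e9}ric \x{263A}";

# Hex dumps make the difference in how é is written visible.
print unpack('H*', raw_output($upper_ascii_only)), "\n";
print unpack('H*', raw_output($with_wide_char)),   "\n";
```

A terminal expecting UTF-8 would show the first form incorrectly and the second correctly, which matches the observation in this comment.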
Attachment #317759 - Flags: review?(LpSolit) → review+
Flags: approval+
(In reply to comment #3)
> Max, any idea how to fix bz_webservice_demo.pl so that it displays upper-ASCII
> characters correctly in all cases?

No, no idea yet. I think we have to set binmode STDOUT, ':utf8', but it's impossible to know if the Bugzilla we're accessing is utf8-enabled. We'd also have to deal with the same sort of recoding that checksetup.pl does for the console.
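A rough sketch of the binmode idea, under stated assumptions: the $server_is_utf8 flag is hypothetical (as noted above, the script has no way to learn this), and an in-memory handle stands in for STDOUT so the bytes can be inspected. It does not cover the console recoding that checksetup.pl performs.

```perl
use strict;
use warnings;

# Hypothetical helper: write $text the way the demo script might, applying
# the ':utf8' layer only when the caller believes the server is utf8-enabled.
sub render {
    my ($text, $server_is_utf8) = @_;
    my $buf = '';
    open(my $out, '>', \$buf) or die "cannot open in-memory handle: $!";
    binmode($out, ':utf8') if $server_is_utf8;    # the script would use STDOUT here
    print {$out} $text;
    close($out);
    return $buf;
}

my $name = "Fr\x{e9}d\x{e9}ric";
print unpack('H*', render($name, 1)), "\n";    # with the layer
print unpack('H*', render($name, 0)), "\n";    # without the layer
```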
Checking in Bugzilla/WebService.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/WebService.pm,v  <--  WebService.pm
new revision: 1.9; previous revision: 1.8
done
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
This patch does not seem to fix the problem. I was getting the same error message with characters like ┌─, and it was solved by changing the code as follows. It uses decode instead of encode; also, in my case utf8::is_utf8 always returns false, so I think the condition should be 'OR' and not 'AND'. I may be wrong, but it works here:

Index: Bugzilla/WebService.pm
===================================================================
--- Bugzilla/WebService.pm (revision 551)
+++ Bugzilla/WebService.pm (working copy)
@@ -141,8 +141,9 @@
     my ($value) = @_;
     # Something weird happens with XML::Parser when we have upper-ASCII
     # characters encoded as UTF-8, and this fixes it.
-    utf8::encode($value) if utf8::is_utf8($value)
-        && $value =~ /^[\x00-\xff]+$/;
+    if (utf8::is_utf8($value) || $value =~ /^[\x00-\xff]+$/) {
+        utf8::decode($value);
+    }
     return $self->SUPER::as_string($value);
 }
Hey Tiago. No, that wouldn't make sense, because you'd be running utf8::encode (I assume putting ::decode there is a typo) on a string that isn't utf8. I think the actual problem you might be running into is one that we fixed (or was opened and is still open) later, which is that SOAP::Lite, in modern versions, no longer needs some of our utf8-encoding magic.
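For readers unfamiliar with the two functions being discussed, here is a small sketch of what utf8::encode and utf8::decode do to a string. It uses only Perl builtins; the wrapper names are hypothetical, and it says nothing about how SOAP::Lite treats the result.

```perl
use strict;
use warnings;

# Hypothetical wrappers that work on a copy, since utf8::encode and
# utf8::decode modify their argument in place.
sub to_utf8_bytes {
    my ($chars) = @_;
    utf8::encode($chars);    # characters -> UTF-8 bytes, utf8 flag off
    return $chars;
}

sub from_utf8_bytes {
    my ($bytes) = @_;
    my $ok = utf8::decode($bytes);    # false if input is not well-formed UTF-8
    return ($ok, $bytes);
}

my $chars = "Fr\x{e9}d\x{e9}ric";        # 8 characters
my $bytes = to_utf8_bytes($chars);       # 10 bytes
printf "characters: %d, bytes: %d, flagged: %s\n",
    length($chars), length($bytes), utf8::is_utf8($bytes) ? 'yes' : 'no';

my ($ok, $round_trip) = from_utf8_bytes($bytes);
printf "decode ok: %s, characters: %d\n", $ok ? 'yes' : 'no', length($round_trip);

# Decoding single-byte upper-ASCII input fails and leaves the string unchanged.
my ($ok_latin1) = from_utf8_bytes("Fr\xE9d\xE9ric");
printf "decode of single-byte input ok: %s\n", $ok_latin1 ? 'yes' : 'no';
```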
Well, after hours of looking for a solution, I see that we can use as_base64 without breaking it, as described in the documentation: http://cpan.uwinnipeg.ca/htdocs/SOAP-Lite/SOAP/Lite.html#Processing_of_XML_encoded_fragments. So I changed the type from as_string to as_base64, and that fixed my problem.
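A sketch of why a base64 type can sidestep the problem, assuming the value is encoded to UTF-8 bytes first: the payload that reaches the XML is plain ASCII, so no upper-ASCII or multi-byte character is left for the parser to trip over. This uses the core MIME::Base64 and Encode modules rather than SOAP::Lite itself, and the helper names are hypothetical. The trade-off is that the client has to decode the value again.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use MIME::Base64 qw(encode_base64 decode_base64);

# Hypothetical helper: character string -> ASCII-only base64 payload.
sub to_base64_payload {
    my ($text) = @_;
    return encode_base64(encode('UTF-8', $text), '');    # '' = no line breaks
}

# Hypothetical helper: what a client would do to get the text back.
sub from_base64_payload {
    my ($payload) = @_;
    return decode('UTF-8', decode_base64($payload));
}

my $summary = "\x{250C}\x{2500} box-drawing characters";
my $payload = to_base64_payload($summary);

print "payload: $payload\n";
print "ASCII only: ", ($payload =~ /^[A-Za-z0-9+\/=]*$/ ? 'yes' : 'no'), "\n";
print "round trip intact: ", (from_base64_payload($payload) eq $summary ? 'yes' : 'no'), "\n";
```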