Closed Bug 839023 Opened 11 years ago Closed 5 years ago

xmlrpc interface doesn't correctly escape data in response, resulting in invalid xml

Categories

(Bugzilla :: WebService, defect)

Version: 4.0.9
Type: defect
Priority: Not set
Severity: normal

RESOLVED WONTFIX

People

(Reporter: benjamin, Unassigned)

Attachments

(2 files)

https://api-dev.bugzilla.mozilla.org/1.2/bug?id=4843&include_fields=_default,history consistently shows me a "Please come back later" message. Viewing the bug from b.m.o works, and most other bugs don't have this problem.

There may be a few other bugs which have a similar problem; I'm diagnosing errors retrieving about 60 bugs total.
Assignee: nobody → gerv
Component: Extensions: REST → BzAPI
Product: bugzilla.mozilla.org → Webtools
Version: Production → other
This is a problem with Bugzilla's XML-RPC "history" method returning invalid XML character data. Punting over to the bugzilla.mozilla.org component for glob to take a look at, although we may ultimately find it's an upstream bug.

Gerv
Assignee: gerv → nobody
Component: BzAPI → General
Product: Webtools → bugzilla.mozilla.org
Blocks: 839047
Attached file generated xml
Here's the XML that Bugzilla generates as a response.
At character 421 there's an unescaped char(3).
Assignee: nobody → webservice
Component: General → WebService
OS: Windows 7 → All
Product: bugzilla.mozilla.org → Bugzilla
QA Contact: default-qa
Hardware: x86_64 → All
Summary: BzAPI: Error accessing history of bug 4843 → xmlrpc interface doesn't correctly escape data in response, resulting in invalid xml
Version: other → 4.0.9
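For anyone who wants to locate such a character in a saved response, here is a minimal sketch. It assumes the response has already been read into a Perl string; the function name first_illegal_xml_char and the sample string are made up for illustration and are not part of Bugzilla.

```perl
use strict;
use warnings;

# Return the 0-based offset and the code point of the first character
# that XML 1.0 does not allow, or an empty list if the string is clean.
# (first_illegal_xml_char is a hypothetical helper, not a Bugzilla function.)
sub first_illegal_xml_char {
    my ($xml) = @_;
    # The negated class lists the XML 1.0 "Char" production:
    # HT, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF.
    if ($xml =~ /([^\x09\x0A\x0D\x20-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}])/) {
        return ($-[1], ord $1);
    }
    return;
}

# Made-up sample standing in for the attached response.
my $sample = "<value>abc\x03def</value>";
my ($offset, $code) = first_illegal_xml_char($sample);
printf "illegal U+%04X at offset %d\n", $code, $offset if defined $offset;
```

The offset is counted in characters if the string has been decoded, and in bytes otherwise.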
So the issue here is that this char should have been replaced by an &#3; or something like that? I'd expect our XML encoding library, whatever it is, to handle that stuff...

Gerv
U+0000-U+001F are illegal in HTML 4.0 and XML 1.0 (except the characters HT, LF and CR). And it's not permitted to use numeric character references such as &#3; either (although that is permitted in XML 1.1, except for NUL): http://www.w3.org/International/questions/qa-controls

So what should we do?

1a) Silently filter out such characters whenever data is submitted?
1b) Throw an error whenever data is submitted with such characters?
2) Filter out such characters whenever data is displayed?
3) Both 1) and 2)?
4) Filter on submission, and do a one-time DB scan to remove any existing ones?
5) Nothing?

Gerv
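To make options 1a and 1b concrete, here is a rough sketch of what each could look like at submission time. The function names strip_controls and reject_controls are hypothetical, and where such a check would hook into Bugzilla's input handling is left open.

```perl
use strict;
use warnings;

# C0 controls that XML 1.0 forbids: everything below U+0020
# except HT (U+0009), LF (U+000A) and CR (U+000D).
my $ILLEGAL_C0 = qr/[\x00-\x08\x0B\x0C\x0E-\x1F]/;

# Option 1a (hypothetical helper): silently drop the characters.
sub strip_controls {
    my ($text) = @_;
    (my $clean = $text) =~ s/$ILLEGAL_C0//g;
    return $clean;
}

# Option 1b (hypothetical helper): refuse the submission instead.
sub reject_controls {
    my ($text) = @_;
    if ($text =~ /($ILLEGAL_C0)/) {
        die sprintf "control character U+%04X at offset %d is not allowed\n",
            ord($1), $-[1];
    }
    return $text;
}
```

Neither option helps with data that is already in the database, which is what options 2 and 4 are about.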
Related: bug 538946.

Gerv
Given that we already have such data in the DB, why not just filter on output and replace them with U+FFFD (the Unicode replacement character)?
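A minimal sketch of that output filter, assuming the value is a decoded Perl character string and the response is sent as UTF-8. The name replace_illegal_for_xml is made up for illustration.

```perl
use strict;
use warnings;

# Replace every C0 control that XML 1.0 forbids with U+FFFD.
# HT, LF and CR are legal and are left untouched.
# (replace_illegal_for_xml is a hypothetical helper name.)
sub replace_illegal_for_xml {
    my ($text) = @_;
    return $text unless defined $text;
    (my $out = $text) =~ s/[\x00-\x08\x0B\x0C\x0E-\x1F]/\x{FFFD}/g;
    return $out;
}

# U+FFFD is outside Latin-1, so the output layer has to encode it.
binmode STDOUT, ':encoding(UTF-8)';
print replace_illegal_for_xml("crash at \x03 step"), "\n";
```

Unlike stripping, this keeps the string length unchanged and leaves a visible marker where data was lost.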
LpSolit: if we wanted to do what Ben suggests in comment 8, where in the code do you think we could put such a filter? There may be quite a few fields and data items which need filtering...

Gerv
I needed to fix this problem on a Bugzilla server (version 4.0.2) which I maintain where I don't care about the content in the XML-RPC response that contains the data. Inside Bugzilla/WebService/Server/XMLRPC.pm, in _strip_undefs, at the end of the function (around line 250):

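    # Only plain scalars are rewritten; references are left alone.
    if (ref $initial eq '')
    {
        # Escape the control characters XML 1.0 forbids (all of \x00-\x1f
        # except tab, LF and CR) as literal ASCII text of the form \x##.
        $initial =~ s/([\x00-\x08\x0b\x0c\x0e-\x1f])/sprintf "\\x%02x", ord($1)/ge;
    }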

This doesn't replace them with the 'unknown character', but with plain ascii in the form '\x##'. This may not be sufficient for some users, but it ensures that my xmlrpc client doesn't fail on the reading of the XML-RPC result.
(In reply to Justin Fletcher from comment #10)
> where I don't care about the content in the XML-RPC response that contains the data.
>     if (ref $initial eq '')
>     {
>         $initial =~ s/([\x00-\x08\x0b\x0c\x0e-\x1f])/sprintf "\\x%02x", ord($1)/ge;
>     }

For anybody else trying this workaround: it works around the problem for comment text but obviously damages binary attachment content. I wonder whether there's a way to apply it only to comment text.
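One possible way to limit the escaping to text fields is to walk the response structure and rewrite only values stored under selected hash keys. This is an untested sketch: the key names text and summary, the key data used in the example, and the function name escape_text_fields are assumptions for illustration, and the real response layout would need checking.

```perl
use strict;
use warnings;

# Hypothetical list of hash keys whose values are free text.
my %TEXT_KEYS = map { $_ => 1 } qw(text summary);

# Return a copy of $data in which only scalars stored under one of
# %TEXT_KEYS have their illegal control characters escaped as \x##.
# Blessed objects and all other scalars are returned unchanged.
sub escape_text_fields {
    my ($data, $key) = @_;
    if (ref $data eq 'HASH') {
        my %copy;
        $copy{$_} = escape_text_fields($data->{$_}, $_) for keys %$data;
        return \%copy;
    }
    if (ref $data eq 'ARRAY') {
        # Array elements inherit the key of the array that holds them.
        return [ map { escape_text_fields($_, $key) } @$data ];
    }
    if (!ref $data && defined $data && defined $key && $TEXT_KEYS{$key}) {
        (my $copy = $data)
            =~ s/([\x00-\x08\x0b\x0c\x0e-\x1f])/sprintf "\\x%02x", ord($1)/ge;
        return $copy;
    }
    return $data;
}

# Example: the value under "text" is escaped, the one under "data" is not.
my $result = escape_text_fields(
    { comments => [ { text => "a\x03b", data => "a\x03b" } ] }
);
```

Returning a copy rather than editing in place keeps the original structure available in case other code still needs the raw values.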
What happens now? This bug still persists. Is it ever going to be fixed?
@Ahmed: No-one can predict the future. Generally speaking, bugs get fixed faster when someone interested in fixing provides a code patch for review. Would you fancy doing that? See https://wiki.mozilla.org/Bugzilla:Patches for more information if you are interested.
Thanks a lot. I will look into that.

The JSON-RPC and XML-RPC APIs are deprecated and will no longer be updated. Feel free to file a new bug if the same issue can be reproduced with the REST API.

https://bugzilla.readthedocs.io/en/latest/integrating/apis.html

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX