
xmlrpc interface doesn't correctly escape data in response, resulting in invalid xml

NEW
Unassigned

Bugzilla
WebService
5 years ago
a month ago

People

(Reporter: Benjamin Smedberg, Unassigned)

Attachments

(2 attachments)

(Reporter)

Description

5 years ago
https://api-dev.bugzilla.mozilla.org/1.2/bug?id=4843&include_fields=_default,history consistently shows me a "Please come back later" message. Viewing the bug from b.m.o works, and most other bugs don't have this problem.

There may be a few other bugs which have a similar problem; I'm diagnosing errors retrieving about 60 bugs total.
Assignee: nobody → gerv
Component: Extensions: REST → BzAPI
Product: bugzilla.mozilla.org → Webtools
Version: Production → other
Created attachment 711306 [details]
XML RPC test program showing bug (need to insert username and password)
This is a problem with Bugzilla's XML-RPC "history" method returning invalid XML character data. Punting over to the bugzilla.mozilla.org component for glob to take a look at, although we may ultimately find it's an upstream bug.

Gerv
Assignee: gerv → nobody
Component: BzAPI → General
Product: Webtools → bugzilla.mozilla.org
(Reporter)

Comment 3

5 years ago
Also affected:

bug 13867
bug 140803
bug 148674
bug 15236
bug 188847
bug 191053
bug 23120
bug 298693
bug 30090
bug 313030
bug 51322
bug 67914
Blocks: 839047
Created attachment 711309 [details]
generated xml

This is the XML that Bugzilla generates as a response.
At character 421 there is an unescaped char(3).
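For anyone who wants to check a saved response for the same problem, here is a minimal Python sketch that reports where forbidden control characters occur. The `find_illegal_chars` helper and the sample string are hypothetical illustrations, not taken from the attachment, and the pattern covers only the C0 control range discussed in this bug.

```python
import re

# C0 control characters that XML 1.0 forbids in character data:
# everything below 0x20 except tab (0x09), LF (0x0A) and CR (0x0D).
ILLEGAL_XML10 = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def find_illegal_chars(xml_text):
    """Return (offset, code point) pairs for each forbidden character."""
    return [(m.start(), ord(m.group())) for m in ILLEGAL_XML10.finditer(xml_text)]


# Hypothetical response fragment, not the real attachment content.
sample = "<value><string>foo\x03bar</string></value>"
print(find_illegal_chars(sample))  # [(18, 3)]
```

Offsets are zero-based and counted in characters of the decoded text, so they may not line up exactly with a byte offset reported by another tool.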
Assignee: nobody → webservice
Component: General → WebService
OS: Windows 7 → All
Product: bugzilla.mozilla.org → Bugzilla
QA Contact: default-qa
Hardware: x86_64 → All
Summary: BzAPI: Error accessing history of bug 4843 → xmlrpc interface doesn't correctly escape data in response, resulting in invalid xml
Version: other → 4.0.9
So the issue here is that this char should have been replaced by a character reference such as &#3; or something like that? I'd expect our XML encoding library, whatever it is, to handle that stuff...

Gerv
U+0000-U+001F are illegal in HTML 4.0 and XML 1.0 (except the characters HT, LF and CR). And it's not permitted to use numeric character references such as &#3; for them either (although that is permitted in XML 1.1, except for NUL): http://www.w3.org/International/questions/qa-controls
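The rule can be checked quickly with Python's standard-library XML 1.0 parser. This is only an illustrative sketch: the `parses` helper is a hypothetical name, and the behaviour shown is that of Python's expat-based parser, not of Bugzilla's own XML code.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if the standard-library XML 1.0 parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(parses("<a>\t</a>"))    # True: a raw tab is allowed
print(parses("<a>&#9;</a>"))  # True: so is a reference to it
print(parses("<a>\x03</a>"))  # False: raw U+0003 is rejected
print(parses("<a>&#3;</a>"))  # False: a reference to U+0003 is rejected too
```

So escaping the character as a reference does not help an XML 1.0 client; the character has to be removed or replaced before serialization.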

So what should we do?

1a) Silently filter out such characters whenever data is submitted?
1b) Throw an error whenever data is submitted with such characters?
2) Filter out such characters whenever data is displayed?
3) Both 1) and 2)?
4) Filter on submission, and do a one-time DB scan to remove any existing ones?
5) Nothing?
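
To make the difference between options 1a and 1b concrete, here is a rough Python sketch of both. The function names are hypothetical, and this is not how Bugzilla (which is written in Perl) validates input; it only shows the two behaviours side by side.

```python
import re

# C0 control characters forbidden by XML 1.0 (tab, LF and CR are allowed).
ILLEGAL_XML10 = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_on_submit(text):
    """Option 1a: silently drop forbidden characters."""
    return ILLEGAL_XML10.sub("", text)


def reject_on_submit(text):
    """Option 1b: refuse the submission and say where the problem is."""
    match = ILLEGAL_XML10.search(text)
    if match:
        raise ValueError(
            "illegal character U+%04X at offset %d"
            % (ord(match.group()), match.start())
        )
    return text
```

Neither option does anything about data already stored in the database, which is why options 2 to 4 are listed separately.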

Gerv
Related: bug 538946.

Gerv
(Reporter)

Comment 8

4 years ago
Given that we already have such data in the DB, why not just filter on output and replace them with U+FFFD (the Unicode replacement character)?
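A minimal Python sketch of that output filter, assuming it is applied to each text value just before it is serialized. The `replace_for_output` name is hypothetical, and where such a hook would live in Bugzilla is exactly the open question in this bug.

```python
import re

# C0 control characters forbidden by XML 1.0 (tab, LF and CR are allowed).
ILLEGAL_XML10 = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def replace_for_output(text):
    """Swap each forbidden character for U+FFFD, leaving stored data untouched."""
    return ILLEGAL_XML10.sub("\ufffd", text)


print(repr(replace_for_output("foo\x03bar")))  # 'foo\ufffdbar'
```

Unlike stripping, this keeps the string length unchanged and leaves a visible marker where the original character was.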
LpSolit: if we wanted to do what Ben suggests in comment 8, where in the code do you think we could put such a filter? There may be quite a few fields and data items which need filtering...

Gerv

Comment 10

4 years ago
I needed to fix this problem on a Bugzilla server (version 4.0.2) which I maintain, where I don't care about the content in the XML-RPC response that contains the data. I added the following to Bugzilla/WebService/Server/XMLRPC.pm, at the end of the _strip_undefs function (around line 250):

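    if (ref $initial eq '')
    {
        # Escape every C0 control character that XML 1.0 forbids
        # (everything below 0x20 except tab, LF and CR) as a literal \x## sequence.
        $initial =~ s/([\x00-\x08\x0b\x0c\x0e-\x1f])/sprintf "\\x%02x", ord($1)/ge;
    }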

This doesn't replace them with the 'unknown character', but with plain ASCII in the form '\x##'. This may not be sufficient for some users, but it ensures that my XML-RPC client doesn't fail when reading the XML-RPC result.
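
For readers who want to apply the same idea on the client side, or just see what the substitution produces, here is a rough Python rendering. The `hex_escape` name is hypothetical, and the pattern here covers the full set of C0 controls that XML 1.0 forbids.

```python
import re

# C0 control characters forbidden by XML 1.0 (tab, LF and CR are allowed).
ILLEGAL_XML10 = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def hex_escape(text):
    """Replace each forbidden character with a literal backslash-x escape."""
    return ILLEGAL_XML10.sub(lambda m: "\\x%02x" % ord(m.group()), text)


print(hex_escape("foo\x03bar"))  # foo\x03bar (backslash, 'x', '0', '3' as text)
```

The result is not reversible in general, because a comment that already contained the literal text `\x03` becomes indistinguishable from an escaped control character.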

Updated

3 years ago
Duplicate of this bug: 1055629

Comment 12

3 years ago
(In reply to Justin Fletcher from comment #10)
> where I don't care about the content in the XML-RPC response that contains the data.
>     if (ref $initial eq '')
>     {
>         $initial =~ s/([\x00-\x08\x0b\x0c\x0e-\x1f])/sprintf "\\x%02x", ord($1)/ge;
>     }

For anybody else trying this workaround: it works around the problem for comment text but obviously damages binary attachment content. I wonder whether there's a way to apply this only to comment text.
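One possible approach, sketched in Python rather than in Bugzilla's Perl, is to walk the result structure and filter by type, so only text values are touched. This assumes binary values can be told apart from text by their type (here `bytes` or `xmlrpc.client.Binary`); whether the data inside `_strip_undefs` is already marked that way is an assumption I have not verified. The `sanitize` name and the sample result are hypothetical.

```python
import re
import xmlrpc.client

# C0 control characters forbidden by XML 1.0 (tab, LF and CR are allowed).
ILLEGAL_XML10 = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def sanitize(value):
    """Recursively clean text values; leave binary and other types untouched."""
    if isinstance(value, str):
        return ILLEGAL_XML10.sub("\ufffd", value)
    if isinstance(value, dict):
        return {key: sanitize(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize(item) for item in value]
    return value  # bytes, xmlrpc.client.Binary, numbers, booleans, ...


# Hypothetical result: one comment with a stray U+0003, one binary attachment.
result = {
    "text": "foo\x03bar",
    "data": xmlrpc.client.Binary(b"\x00\x01\x03"),
}
clean = sanitize(result)
xml = xmlrpc.client.dumps((clean,), methodresponse=True)
```

Because the binary value is base64-encoded during serialization, it never needs filtering, and the resulting XML can be read back with `xmlrpc.client.loads(xml)`.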
Duplicate of this bug: 1176452

Updated

a year ago
Duplicate of this bug: 1274989

Comment 15

a year ago
What happens now? This bug still persists. Is it ever going to be fixed?

Comment 16

a year ago
@Ahmed: No-one can predict the future. Generally speaking, bugs get fixed faster when someone interested in fixing provides a code patch for review. Would you fancy doing that? See https://wiki.mozilla.org/Bugzilla:Patches for more information if you are interested.

Comment 17

a year ago
Thanks a lot. I will look into that.