Last Comment Bug 313441 - [SECURITY] Query RSS should HTML-escape summary in <title>
: [SECURITY] Query RSS should HTML-escape summary in <title>
[doesn't affect 2.16][doesn't affect ...
Product: Bugzilla
Classification: Server Software
Component: Query/Bug List (show other bugs)
: 2.20
: All All
: -- critical (vote)
: Bugzilla 2.20
Assigned To: Phil Ringnalda (:philor)
: default-qa
Depends on:
Blocks: 314659 320312 367674
  Show dependency treegraph
Reported: 2005-10-22 22:59 PDT by Phil Ringnalda (:philor)
Modified: 2007-01-21 08:25 PST (History)
8 users (show)
justdave: approval+
justdave: blocking2.22+
justdave: approval2.20+
justdave: blocking2.20.1+
See Also:
QA Whiteboard:
Iteration: ---
Points: ---

One filter ain't enough (1.31 KB, patch)
2005-10-23 20:38 PDT, Phil Ringnalda (:philor)
no flags Details | Diff | Review
First cut work in progress (8.47 KB, patch)
2005-12-05 03:48 PST, Phil Ringnalda (:philor)
no flags Details | Diff | Review
Like so? (10.86 KB, patch)
2005-12-05 17:22 PST, Phil Ringnalda (:philor)
myk: review-
Details | Diff | Review
last, not pop (14.70 KB, patch)
2005-12-08 21:17 PST, Phil Ringnalda (:philor)
myk: review+
Details | Diff | Review
For 2_20-Branch (14.30 KB, patch)
2005-12-10 18:20 PST, Phil Ringnalda (:philor)
myk: review+
LpSolit: review-
Details | Diff | Review
2_20-Branch v2 (14.26 KB, patch)
2006-01-12 01:32 PST, Phil Ringnalda (:philor)
myk: review+
LpSolit: review+
Details | Diff | Review
docs patch for tip v1 about my comment (803 bytes, patch)
2006-02-28 00:49 PST, victory <never@receive.bug.mails.i.hate.spammer>
timeless: review+
Details | Diff | Review
other docs patch for 2.20 (1.29 KB, patch)
2006-04-20 00:23 PDT, victory <never@receive.bug.mails.i.hate.spammer>
mkanat: review+
Details | Diff | Review

Description Phil Ringnalda (:philor) 2005-10-22 22:59:59 PDT
While it's formally correct and to-spec to have an RSS 1.0 <title> that includes &lt; when it isn't starting HTML markup, since the spec calls for plain text, practically speaking the title element has been so polluted with HTML that virtually all consumers treat it as HTML, and a literal < has to be double-escaped as &amp;lt; to not be treated as markup, and either filtered out or interpreted.

With <title>Wrong DOM-tree with &lt;form&gt; and &lt;button&gt; elements.</title>, my cautious RSS reader and Google Reader both display "Wrong DOM-tree with and elements.", but the very popular Bloglines reader treats them both as HTML, and displays an actual button with the text "elements." Switching to "FILTER html" instead of "FILTER xml" to get &amp;lt;form&amp;gt; would wind up displaying "&lt;form&gt;" in strict constructionist RSS readers, but anyone using them is surely used to that by now.
Comment 1 Dave Miller [:justdave] ( 2005-10-22 23:05:15 PDT
XSS anyone?

Though it seems weird since we're already encoding them...  couldn't that be considered an XSS bug in said readers?
Comment 2 Phil Ringnalda (:philor) 2005-10-23 00:31:54 PDT
Bloglines must have broken their XSS filter: it's currently happy to display old favorites like |&lt;img src="..." onmouseover="this.src=''+document.cookie">|, and since its whole interface is done in JavaScript... ouch. I don't *think* they've always been that insecure.

Me, I just want to be able to read the whole summary of bugs without missing chunks.
Comment 3 Phil Ringnalda (:philor) 2005-10-23 20:38:26 PDT
Created attachment 200571 [details] [diff] [review]
One filter ain't enough

The fact that we call the desired state, &amp;lt; or &amp;amp;#161;, "double escaped" should have tipped me off that one FILTER wouldn't be enough. Static copy of sample output at works fine in Bloglines and two other browser-based readers, and three desktop readers. Google Reader, though, is insane beyond belief, and double-decodes it and renders the HTML.

I should know better than to mention this sort of alternative when I want a patch reviewed, but: this ambiguity is one of the three reasons that Atom exists. <atom:title type="text">&lt;form&gt;</atom:title> has exactly one possible meaning, and no question whatsoever about who is wrong if it isn't displayed as "<form>".
Comment 4 Phil Ringnalda (:philor) 2005-10-24 20:03:37 PDT
Meh. Google contacted. Bloglines contacted. NewsGator Online contacted.

If we're going to leave this security-sensitive until every browser-based RSS reader is free of XSS foolishness, we might want to clone an innocuous version to check in from. The first really wide-spread PoC I remember was the precursor to which was just a style="position:absolute;..." attribute. It's been very widely known in the RSS world that either you only allow through those elements and attributes you can trust, or you break and spew secrets, for more than two years, but it only took me half an hour to find XSS holes in the biggest three free online readers.
Comment 5 Phil Ringnalda (:philor) 2005-10-28 21:00:51 PDT
Google Reader: fixed. NewsGator Online: fixed. Bloglines: strongly not recommended to my friends and family anymore.
Comment 6 Phil Ringnalda (:philor) 2005-11-08 01:20:24 PST
Comment on attachment 200571 [details] [diff] [review]
One filter ain't enough

My blessedly spotty memory made me forget the details of this impossible situation: RSS 1.0 clearly specifies that title is plain text, the author of RSS 2.0 says that because the spec says that description may contain entity-encoded HTML but it doesn't say that title may, the content model for RSS 2.0 title is plain text, so the feed validator warns on any feed which contains anything which might be escaped HTML, so the current RSS is absolutely unquestionably correct, and fails in virtually any client (and those like Firefox where it doesn't have bugs filed on them to make them fail).

If you need to use the character "<" in a feed title, which Bugzilla absolutely does, you have exactly three choices: be invalid and work, be valid and fail, or, the *only* real choice, use Atom instead.
Comment 7 Frédéric Buclin 2005-11-08 01:43:55 PST
If I understand correctly, what we actually do is correct. And even if some tags are interpreted in the bug summary, I really don't see how they could use this exploit to get sensitive information.

I'm suggesting WONTFIX.
Comment 8 Dave Miller [:justdave] ( 2005-11-08 10:39:28 PST
Jesse, Dan, Brad, opinions?  Or Myk even (forumzilla) ?
Comment 9 Dave Miller [:justdave] ( 2005-11-08 11:18:18 PST
The official RSS spec at
says that the <title> element contains #PCDATA, it does not say that it contains plain text.

According to W3Schools (

PCDATA means parsed character data.

Think of character data as the text found between the start tag and the end tag of an XML element.

PCDATA is text that will be parsed by a parser. Tags inside the text will be treated as markup and entities will be expanded.

This means if we follow the spec to the letter, the contents of the title CAN contain <, >, and &, but they need to be escaped.  ONCE.  Any that are not escaped will be treated as tags and parsed.

This is what we do now for title.  So Bugzilla is fine.  If we follow the spec.

However, looking at the source of our RSS, we appear to be escaping the HTML in the <description> field, also, which is also declared to be #PCDATA.

        &lt;td&gt;Dave Miller&lt;/td&gt;
        &lt;td&gt;Severity &lt;/td&gt;
        &lt;td&gt;Mon 07:01&lt;/td&gt;

If I'm reading the spec correctly, this is over-escaped.  Yet it looks fine in my RSS reader (NetNewsWire Lite).
Comment 10 Phil Ringnalda (:philor) 2005-11-08 12:06:04 PST
What the RSS 1.0 spec says matters formally, but not practically, because by far the bulk of consumers just parse everything into uniform buckets and treat the contents as escaped soup even if they didn't come from RSS 2.0. See, e.g., and up the thread where RSS-DEV working group members say, roughly, "RSS 2.0 has polluted the content model, it's now escaped HTML."

If you want to be formally correct with content, while still passing HTML, you can use the <content:encoded> element for escaped HTML, but you still have the same problem with the required <description> that you have with <title>: it cannot contain HTML, but < passed correctly for plain text as &lt; will be interpreted as markup. Even Thunderbird, which survives the unescaped titles, interprets markup in description.

If you insist on not switching to Atom, and insist on not following the lead of virtually every consumer and the advice of the WG, there is one other option: replace every "<" in titles and plain text descriptions with a roughly similar looking character, say a &#x2039;. It looks awful in most fonts, but at least it's unambiguously not markup.
Comment 11 Dave Miller [:justdave] ( 2005-11-15 12:27:48 PST
OK, basically there's no sane way to fix this while continuing to use RSS 1.0.  If we intend to have HTML markup (or characters that could be used to create HTML markup) then we need to be using either RSS 2.0 or Atom.
Comment 12 Phil Ringnalda (:philor) 2005-11-15 13:30:53 PST
(In reply to comment #11)
> we need to be using either RSS 2.0 or Atom.

RSS 2.0 only solves the problem for <description>, and is even more ambiguous for instances of "&lt;" or "&amp;lt;" in <title>s, since the only points of reference for whether it's HTML or plain text are a weblog post by the spec author which explicitly says "this is not spec text" and each individual's interpretation of the meaning of the spec saying of <description> that it "may contain entity-encoded HTML" and saying nothing either way about <title>.

Even with Atom, there are still likely to be consumers which will misrender it; the difference is that for any misrendering, there is no doubt whatsoever whether it's Bugzilla's bug or the consumer's bug.
Comment 13 Myk Melez [:myk] [@mykmelez] 2005-11-18 17:27:27 PST
RSS 2.0 has numerous problems besides just the inconsistency Phil pointed out.  RSS 1.0 is better, but work on it seems to have died out years ago, and it hasn't kept up with the times.  Atom is both extant and sensible.  We should switch to it.
Comment 14 Frédéric Buclin 2005-11-30 14:24:22 PST
This may help us writing Atom feeds:

Also: "2005-11-21: W3C is pleased to launch the W3C Feed Validation Service, a free online tool open to creators of syndication feeds in formats such as RSS and Atom." The URL is:
Comment 15 Frédéric Buclin 2005-12-04 09:35:49 PST
Phil, are you going to implement Atom yourself? This bug is assigned to you and we plan to release 2.20.1 and 2.21.2 very soon now.
Comment 16 Phil Ringnalda (:philor) 2005-12-04 13:09:50 PST
Gah, you're right, I did forget it was assigned to me. I'm working on it now, but anyone should feel free to steal it if I screw up again. Speaking of which: anyone who can, which doesn't seem to include me, should make it public: the last need for it to be security-sensitive, Bloglines' XSS holes, mostly got fixed after I published them, and then I screwed up and syndicated that post to planet.m.o, so the whole Moz community already knows about their vulnerability.

(BTW: W3C's feed validator is a branch, currently from mid-October though they're going to try to get closer to tip; is at most a few hours behind tip.)
Comment 17 Phil Ringnalda (:philor) 2005-12-05 03:48:07 PST
Created attachment 205023 [details] [diff] [review]
First cut work in progress

Close as I think I can get with read-only access: I don't know how to fake a cvs remove of list.rss.tmpl.

The code to get the feed/updated date could use a more experienced eyeballing: it seems to work with my test data, but I haven't ever done much with Template Toolkit to know what's right and wrong.

Both sorts of <id>s make me nervous: the feed/id with lots of URL-encoding and XML-escaping is at risk of misinterpretation by buggy software, though there's really no alternative, and the entry/ids quite possibly should be specific to a particular feed, and not the id that an entry containing the whole content of the bug would use, but minting a feed-specific tag: URI isn't easy.

Then there's the choice of which sort of content escaping to use, which is the biggest reason I'm not really ready for actual review: even though Atom makes it perfectly clear who is at fault, there's at least one widely-used reader which will misinterpret any given type. For the otherwise-best choice, type="html", Firefox is one of the ones misinterpreting it, but with Firefox's only workable choice, type="text", we're back to comment 0 - Bloglines (clearly incorrectly, in type="text" Atom) interprets what they should be escaping, though with a few less XSS holes than a month ago.
Comment 18 Frédéric Buclin 2005-12-05 04:02:39 PST
Comment on attachment 205023 [details] [diff] [review]
First cut work in progress

>Index: webtools/bugzilla/buglist.cgi

>-if ($format->{'extension'} eq 'rss') {
>+if ($format->{'extension'} eq 'rss' ||
>+    $format->{'extension'} eq 'atom') {

Any reason to keep 'rss' here?
Comment 19 Phil Ringnalda (:philor) 2005-12-05 11:54:43 PST
(In reply to comment #18)
> Any reason to keep 'rss' here?

An abortive attempt at not breaking existing users, and installs that want to use RSS or both (client support for Atom isn't perfect and complete yet, and there are people with political preferences for one over the other).

Advice welcome: should I just intercept ctype=rss and force it to ctype=atom at some point?
Comment 20 Myk Melez [:myk] [@mykmelez] 2005-12-05 12:30:37 PST
> Advice welcome: should I just intercept ctype=rss and force it to ctype=atom at
> some point?

I think that's the best behavior.  Feed readers generally grep for feed type, so they should be able to handle the type changing out from under them, and if we're not going to have an RSS feed anymore we're better off directing users to the Atom feed than telling them "document not found".
Comment 21 Phil Ringnalda (:philor) 2005-12-05 17:22:30 PST
Created attachment 205101 [details] [diff] [review]
Like so?

Okay, with a cvs remove list.rss.tmpl I think this wipes it out completely.
Comment 22 Frédéric Buclin 2005-12-05 17:28:04 PST
(In reply to comment #21)
> Okay, with a cvs remove list.rss.tmpl I think this wipes it out completely.

cvsdo remove template/en/default/list/list.rss.html

/usr/bin/cvsdo is in the cvsutils package. And it doesn't require any write access privilege. ;)
Comment 23 Phil Ringnalda (:philor) 2005-12-05 17:35:31 PST
cc:ing some Atomistas, if the security flag lets me: comment 17 and sample output at if you'd be so kind.
Comment 24 Frédéric Buclin 2005-12-06 09:22:19 PST
Looks like Firefox doesn't like Atom. Any suggestion for a good Firefox extension?
Comment 25 Phil Ringnalda (:philor) 2005-12-06 09:50:51 PST
Firefox WFM (at least, with type="text", it's bugged for type="html"): bookmarks manager, File, New Live Bookmark, add the sample URL and it works. How is it failing you?

The usual suspects for content, not just titles, are and though says that both are buggy with type="text" titles.
Comment 26 Frédéric Buclin 2005-12-06 09:55:48 PST
(In reply to comment #25)
> How is it failing you?

Clicking the link in comment 23 makes Firefox to download it, and dragging it in my Personal bar doesn't help. /me has Fx 1.5
Comment 27 Phil Ringnalda (:philor) 2005-12-06 11:25:06 PST
(In reply to comment #26)
> Clicking the link in comment 23 makes Firefox to download it

Oh, bug 155730. You aren't allowed to see application/atom+xml, or any */*+xml. Add it as a Live Bookmark (self-plug: or download it and fool Firefox with a local file.
Comment 28 Frédéric Buclin 2005-12-07 12:21:54 PST
RFC 4287: The Atom Syndication Format
Comment 29 Myk Melez [:myk] [@mykmelez] 2005-12-08 19:44:58 PST
Comment on attachment 205101 [details] [diff] [review]
Like so?

There's only one problem with this patch: the feed for a bug list with one bug contains no entry for that bug, just a description of the feed itself, as if the search had returned no results.  The RSS version displays the single item correctly, so this is a regression.

Fix that, and I'll grant review.

Note that if the list contains no bugs, complains that the feed's author element is missing.  But I think feedvalidator is in the wrong here, since the rule from the RFC is:

   o  atom:feed elements MUST contain one or more atom:author elements,
      unless all of the atom:feed element's child atom:entry elements
      contain at least one atom:author element.

Per this rule, it seems to me that if there are no entries, you don't have to add an author to the feed itself, since all of the entry elements do contain an author element, even if "all" here is zero. also complains that "title should not contain HTML unless declared in the type attribute" for bugs whose titles reference entities (f.e. &#1503;).  But the validator is assuming incorrectly that the title contains HTML, when in fact it does not (the titles are plain text, and any HTML-like constructs are meant to be displayed literally).
Comment 30 Phil Ringnalda (:philor) 2005-12-08 21:13:15 PST
(In reply to comment #29)
> There's only one problem with this patch: the feed for a bug list with one bug
> contains no entry for that bug

I'm astonished that it actually gave them all when n > 1 - I was foolishly popping off the latest bug to get the feed-level updated, I don't see why that doesn't take one off when there's more than one. If I understood Template Toolkit, I might say that was a bug in it.

> Note that if the list contains no bugs, complains that the
> feed's author element is missing.  But I think feedvalidator is in the wrong
> here

Agreed, I'll fix it or file it. And since a Zarro Boogs feed can be useful, I'll file a bug on Bugzilla to move the "Feed" link out of the bugs.size > 0 section once this one's fixed.
Comment 31 Myk Melez [:myk] [@mykmelez] 2005-12-08 21:15:15 PST
>Agreed, I'll fix it or file it.

For what it's worth, the error suggested I mail the feedvalidator-users list if I disagreed with the message, so I have done so.
Comment 32 Phil Ringnalda (:philor) 2005-12-08 21:17:15 PST
Created attachment 205369 [details] [diff] [review]
last, not pop

.last instead of .pop

Have to hope there aren't any clients that will be bothered by having the current date for feed/updated in an empty feed, since there's no way we can give the date when the contents of an empty feed actually changed.
Comment 33 Phil Ringnalda (:philor) 2005-12-08 21:20:12 PST
The "Internal Server Error" I've gotten both times I added an attachment with a review flag set is a known bug with security flagged bugs, isn't it?
Comment 34 Myk Melez [:myk] [@mykmelez] 2005-12-08 21:31:05 PST
Comment on attachment 205369 [details] [diff] [review]
last, not pop

Comment 35 Frédéric Buclin 2005-12-09 03:26:19 PST
Phil, does your patch apply on both the tip and 2.20? If yes, can you request approval for both tip and 2.20 please and add [ready for 2.20.1][ready for 2.21.2] in the status whiteboard? Else a backport for 2.20 is required.
Comment 36 Phil Ringnalda (:philor) 2005-12-10 18:20:34 PST
Created attachment 205513 [details] [diff] [review]
For 2_20-Branch

Just some fuzz in the context.
Comment 37 Frédéric Buclin 2005-12-11 06:49:31 PST
Comment on attachment 205513 [details] [diff] [review]
For 2_20-Branch

backports need to be reviewed too.
Comment 38 Frédéric Buclin 2006-01-11 07:03:56 PST
This patch is now live on our QA installations, e.g. but neither Fx nor Tb can read the generated Atom feed, and it seems that FeedValidator doesn't accept that long uri.

Phil, could you do some tests at the URL above and let us know what's going on? I'm using Fx and I could click the icon in the address bar and successfully add it to my bookmarks, but when opening this feed, I get the message: "Live Bookmark Feed failed to load".
Comment 39 Frédéric Buclin 2006-01-11 07:31:10 PST
Comment on attachment 205513 [details] [diff] [review]
For 2_20-Branch

>Index: template/en/default/list/list.atom.tmpl

>+<?xml version="1.0"[% IF Param('utf8') %] encoding="UTF-8"[% END %]?>

Here is the reason why it doesn't work: utf8 doesn't exist on 2.20:

"undef error - Can't find param named utf8 at Bugzilla/ line 151."
Comment 40 Frédéric Buclin 2006-01-11 07:43:14 PST
(In reply to comment #39)
> >+<?xml version="1.0"[% IF Param('utf8') %] encoding="UTF-8"[% END %]?>

Replacing this line by: <?xml version="1.0"?> fixes the problem. This change is now applied on our QA installation for 2.20.

Phil, please post a new patch with this change.
Comment 41 Phil Ringnalda (:philor) 2006-01-11 09:11:45 PST
Arrgh! Next problem: the very first feed I validated with real data (try validating searches from "Find a Specific Bug", they're shorter URLs) gives us a warning that "name should not be blank" from a reporter without a realname. What do we want to use in that case? Is the username from the email that's used for flags already available to me there? And do we already have a routine to strip empty parameters from query URLs that we could use on the feed URL? The validator's far from the only thing that enforces a needless 255 character limit on URLs.
Comment 42 Phil Ringnalda (:philor) 2006-01-12 01:32:54 PST
Created attachment 208257 [details] [diff] [review]
2_20-Branch v2

Any client which treats an XML decl without an encoding when the file was sent as application/atom+xml as any different than having a hardcoded |encoding="UTF-8"| is utterly broken, anyway.
Comment 43 Frédéric Buclin 2006-01-12 01:38:56 PST
Comment on attachment 208257 [details] [diff] [review]
2_20-Branch v2

This change is already applied on our QA installation for 2.20 and works with Fx and Tb. r=LpSolit
Comment 44 Frédéric Buclin 2006-02-20 16:13:08 PST

Checking in buglist.cgi;
/cvsroot/mozilla/webtools/bugzilla/buglist.cgi,v  <--  buglist.cgi
new revision: 1.325; previous revision: 1.324
Checking in Bugzilla/;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/,v  <--
new revision: 1.34; previous revision: 1.33
Checking in template/en/default/global/header.html.tmpl;
/cvsroot/mozilla/webtools/bugzilla/template/en/default/global/header.html.tmpl,v  <--  header.html.tmpl
new revision: 1.40; previous revision: 1.39
RCS file: /cvsroot/mozilla/webtools/bugzilla/template/en/default/list/list.atom.tmpl,v
Checking in template/en/default/list/list.atom.tmpl;
/cvsroot/mozilla/webtools/bugzilla/template/en/default/list/list.atom.tmpl,v  <--  list.atom.tmpl
initial revision: 1.1
Checking in template/en/default/list/list.html.tmpl;
/cvsroot/mozilla/webtools/bugzilla/template/en/default/list/list.html.tmpl,v  <--  list.html.tmpl
new revision: 1.46; previous revision: 1.45
Removing template/en/default/list/list.rss.tmpl;
/cvsroot/mozilla/webtools/bugzilla/template/en/default/list/list.rss.tmpl,v  <--  list.rss.tmpl
new revision: delete; previous revision: 1.5


Checking in buglist.cgi;
/cvsroot/mozilla/webtools/bugzilla/buglist.cgi,v  <--  buglist.cgi
new revision:; previous revision:
Checking in Bugzilla/;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/,v  <--
new revision:; previous revision:
Checking in template/en/default/global/header.html.tmpl;
/cvsroot/mozilla/webtools/bugzilla/template/en/default/global/header.html.tmpl,v  <--  header.html.tmpl
new revision:; previous revision: 1.39
Checking in template/en/default/list/list.atom.tmpl;
/cvsroot/mozilla/webtools/bugzilla/template/en/default/list/list.atom.tmpl,v  <--  list.atom.tmpl
new revision:; previous revision:
Checking in template/en/default/list/list.html.tmpl;
/cvsroot/mozilla/webtools/bugzilla/template/en/default/list/list.html.tmpl,v  <--  list.html.tmpl
new revision:; previous revision:
Removing template/en/default/list/list.rss.tmpl;
/cvsroot/mozilla/webtools/bugzilla/template/en/default/list/Attic/list.rss.tmpl,v  <--  list.rss.tmpl
new revision: delete; previous revision:
Comment 45 Max Kanat-Alexander 2006-02-21 00:20:00 PST
Security Advisory posted; unlocking bug.
Comment 46 victory <never@receive.bug.mails.i.hate.spammer> 2006-02-28 00:49:27 PST
Created attachment 213434 [details] [diff] [review]
docs patch for tip v1 about my comment

now Bugzilla feed is Atom, not RSS 1.0
Comment 47 victory <never@receive.bug.mails.i.hate.spammer> 2006-04-20 00:23:40 PDT
Created attachment 219117 [details] [diff] [review]
other docs patch for 2.20

Note You need to log in before you can comment on or make changes to this bug.