Closed Bug 1575686 Opened 5 years ago Closed 5 years ago

pushlog ATOM feed does not escape filenames

Categories

(Developer Services :: Mercurial: hg.mozilla.org, defect)

defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: jorgk-bmo, Assigned: sheehan)

Details

Attachments

(1 file)

Usually I used https://hg.mozilla.org/mozilla-central/pushlog as feed address, but that stopped working today.

So I visited https://hg.mozilla.org/mozilla-central/pushlog and noticed that the feed had moved to https://hg.mozilla.org/mozilla-central/rss-log. So I reconfigured my Thunderbird client and it's working again.

But it only pulls the latest 20 articles. That's pretty bad. For effective Thunderbird sheriffing, we need an overview of all the changesets that landed, since we inspect them for certain changes we need to adapt to straight away, for example the package manifests or any TaskCluster changes.

Can you please reinstate the previous state?

Alta88, could you also take a look? This is vital to have.

Flags: needinfo?(alta88)

Hmm, https://hg.mozilla.org/mozilla-central/pushlog now provides one article?

Well, Thunderbird says:
2019-08-22 00:39:58 Feeds INFO FeedParser.parseFeed: - XML Parsing Error: not well-formed
Location: https://hg.mozilla.org/mozilla-central/pushlog
Line Number 32, Column 2276:
<ul class="filelist"><li class="file"> [lots of stuff deleted here]

2019-08-22 00:39:58 Feeds WARN downloaded: updates disabled due to error, check the url - http://hg.mozilla.org/mozilla-central/pushlog

2019-08-22 00:39:58 Feeds INFO downloaded: Update: Blogs & News Feeds/mozilla-central Pushlog -> http://hg.mozilla.org/mozilla-central/pushlog is not a valid feed.

Hmm, I found this (out of date) page: https://developer.mozilla.org/en-US/docs/Mercurial/Using_Mercurial/Filter_a_Mercurial_Changelog_feed_by_Pushlog_directory_paths
Based on that, the rss-log always supplied 20 articles, but the pushlog feed provided more. So that's just broken then?

So I'm changing the summary.

Summary: RSS log at https://hg.mozilla.org/mozilla-central/rss-log only supplies 20 articles → RSS log at https://hg.mozilla.org/comm-central/pushlog broken
Summary: RSS log at https://hg.mozilla.org/comm-central/pushlog broken → RSS log at https://hg.mozilla.org/mozilla-central/pushlog broken

The publisher has to provide a valid file (that's a gecko error btw). There are easy tools in the Tb subscribe UI to validate and check urls. Why, you can add a cert exception right there if necessary.

There is even support for automatically updating feed urls seamlessly, if the publisher so chooses. Just takes a click on the Learn More link to get to the guide and read all about it. Publishers can even make related feed items thread if they want. That's sort of useful.

Some publishers have years' worth of entries; some choose to age entries away intraday. Sometimes systems that create feed xml files just break.

Flags: needinfo?(alta88)
Summary: RSS log at https://hg.mozilla.org/mozilla-central/pushlog broken → RSS log at https://hg.mozilla.org/mozilla-central/rss-log only supplies 20 articles
Summary: RSS log at https://hg.mozilla.org/mozilla-central/rss-log only supplies 20 articles → RSS log at https://hg.mozilla.org/mozilla-central/pushlog broken

The last deploy to hg.mo was July 31st. Even before that, the most recent change that could possibly have impacted RSS feeds in pushlog was on June 6 (likely deployed shortly after that - I'd need to dig through IRC logs to confirm). Nothing has changed on hgmo recently, and nothing has changed that could have impacted RSS feeds in quite some time.

Is it possible something changed with the XML parsing code? If you're using Nightly Thunderbird which changes at least once a day, and hgmo hasn't changed in 3 weeks, my instincts say this is an issue with the RSS client instead of the server.

If I can do anything to help debug, let me know.

The last changeset that worked to show in https://hg.mozilla.org/mozilla-central/pushlog was changeset 33a280baf6f4955a82e449b1e5dfb355da3a21a6 (not that this is the trigger). https://hg.mozilla.org/comm-central/pushlog still works for me.

Let's rephrase this: Richard is saying that the last article with a changeset he could pull from the feed was:
33a280baf6f4955a82e449b1e5dfb355da3a21a6 - Daniel Varga — Backed out changeset e533d2907a31 (bug 1564252) for build bustage at /lib/gcc/x86_64-unknown-linux-gnu/6.4.0. a=backout
at Wed Aug 21 11:35:40 2019 +0000 yesterday.

I clicked on https://hg.mozilla.org/comm-central/pushlog at the time of reporting the bug last night and saw an HTML page with one article, later no articles. Now I see the usual page with plenty of articles of 22 Aug 2019, down to
Changeset 835d0d9cdae81b3ab583d4ccf7459d00b041fad6 21 August 2019, 23:55
which, from memory, was the only one showing last night.

Sadly, when trying to add the feed to TB I get a feed not valid error, and in another profile where the feed is already added, I get:

2019-08-22 09:28:32 Feeds INFO FeedParser.parseFeed: - XML Parsing Error: not well-formed
Location: https://hg.mozilla.org/mozilla-central/pushlog
Line Number 1264, Column 2276:
<ul class="filelist"><li class="file">browser/base/content/browser-places.js</li><li class="file">browser/components/customizableui/test/browser_973641_button_addon.js</li><li class="file">browser/components/newtab/bin/vendor.js</li><li class="file">browser/components/preferences/in-content/sync.inc.xul</li><li class="file">browser/components/preferences/in-content/tests/siteData/offline/offline.html</li><li class="file">browser/components/preferences/in-content/tests/siteData/service_worker_test.html</li><li class="file">browser/components/preferences/in-content/tests/siteData/site_data_test.html</li><li class="file">browser/components/sessionstore/test/browser_frame_history_a.html</li><li class="file">browser/components/sessionstore/test/browser_frame_history_b.html</li><li class="file">browser/components…

2019-08-22 09:28:32 Feeds WARN downloaded: updates disabled due to error, check the url - http://hg.mozilla.org/mozilla-central/pushlog

2019-08-22 09:28:32 Feeds INFO downloaded: Update: Blogs & News Feeds/mozilla-central Pushlog -> http://hg.mozilla.org/mozilla-central/pushlog is not a valid feed.

So the bug is still valid, the feed is broken now. Nothing changed in the client, I've been using the exact same binary of TB 68 ESR.

You can also use Firefox 60.8 ESR which still has the RSS support and add the feed. You get the error Live Bookmark failed to load.

The last experiment is using FF 68 with the Livemarks extension. That says: Failed to fetch feed

So we have three clients rejecting the feed. Is that enough to debug?

Summary: RSS log at https://hg.mozilla.org/mozilla-central/pushlog broken → RSS log at https://hg.mozilla.org/mozilla-central/pushlog broken: Failing in Thunderbird, FF 60.8 RSS and FF 68 with Livemarks add-on

I assume there is a bug in HG RSS support and it has a hiccup on
Changeset 835d0d9cdae81b3ab583d4ccf7459d00b041fad6 21 August 2019, 23:55

When HG doesn't supply that any more, it will magically fix itself.

Maybe it's possible to "reset" that feed or remove that offending article.

For TB sheriffing it's very important to have this information, otherwise we operate on "trial and error". The bustage here:
https://treeherder.mozilla.org/#/jobs?repo=comm-central&revision=2e6b9d58dad5295da25ba5680ec86919d4009d98
would have been avoided had I had the information from the feed since we scan for any M-C TaskCluster changes.

I took a look and I think I found the problem.

Pulling down the RSS feed with Python and passing it to xml.dom.minidom causes a hiccup on a given line and column. Manipulating the object, we can see line 1768 is the offender. :glob took a look as well and managed to get this error:

Error on line 1768: The reference to entity "affiliate" must end with the ';' delimiter. You most likely forgot to escape '&' into '&amp;'

It appears the parser is choking on filenames like mobile/android/tests/browser/chrome/tp5/163.com/g.163.com/r@site=netease&affiliate=homepage&cat=homepage&type=column360x100&location=3.html because the string is not escaped before being emitted as XML.
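The failure can be reproduced in isolation. Below is a minimal sketch, assuming a made-up three-line fragment standing in for one feed entry. Only the filename comes from this bug; `first_parse_error` and the surrounding markup are hypothetical, not code from hgmo.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Filename quoted in this bug; everything around it is an illustrative stand-in.
FILENAME = (
    "mobile/android/tests/browser/chrome/tp5/163.com/g.163.com/"
    "r@site=netease&affiliate=homepage&cat=homepage"
    "&type=column360x100&location=3.html"
)


def first_parse_error(xml_text):
    """Return (line, column, message) for the first XML error, or None if well-formed."""
    try:
        minidom.parseString(xml_text)
    except ExpatError as exc:
        # ExpatError carries the position of the offending token.
        return exc.lineno, exc.offset, str(exc)
    return None


# The raw '&' starts an entity reference that never gets its ';', so parsing fails.
fragment = '<ul class="filelist">\n<li class="file">' + FILENAME + "</li>\n</ul>"
error = first_parse_error(fragment)
print(error)
```

Run against the downloaded feed text instead of `fragment`, the reported line number is the one to inspect.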

Looking into the code, I found this line, which is likely the offender. We likely need to change that template expression from {name} to {name|escape}. The annotate indicates this is the change that introduced the unescaped file names. Note the date of March 10, 2010. So, congrats on uncovering an almost decade-old bug! :D
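To illustrate what the escaping buys us, here is a rough Python sketch. It uses `xml.sax.saxutils.escape` as a stand-in for the template's `escape` filter. That is an assumption: the two are not the same code, they just both turn `&` into `&amp;`. The helper `file_item` and the shortened filename are hypothetical.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape


def file_item(name):
    """Render one file list entry with the filename escaped for XML."""
    return '<li class="file">' + escape(name) + "</li>"


# Shortened, illustrative filename containing raw '&' characters.
name = "r@site=netease&affiliate=homepage&location=3.html"
item = file_item(name)

# The escaped entry is well-formed, and a parser hands back the original name.
doc = minidom.parseString(item)
recovered = "".join(node.data for node in doc.documentElement.childNodes)
print(item)
print(recovered == name)
```

Feed readers unescape the entities when parsing, so subscribers still see the original filename.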

I'm testing the patch for this now.

Assignee: nobody → sheehan

A few of the Thunderbird devs noticed the ATOM pushlog feed was busted
recently. Upon further inspection it appears we have not been escaping
filenames in the ATOM feed since its inception (~10 years ago!).

This commit passes the filename through the escape template filter.
We also add a new push/changeset to the test setup, with a filename that
includes an & character, to confirm the results are escaped in
test-hgweb-atom.t. This causes a few other tests to change as well,
since the new push will be included in output of those tests.

Pushed by cosheehan@mozilla.com:
https://hg.mozilla.org/hgcustom/version-control-tools/rev/bcbca3117cf7
hgtemplates: escape filenames in pushlog ATOM feed r=glob

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED

This is deployed.

Summary: RSS log at https://hg.mozilla.org/mozilla-central/pushlog broken: Failing in Thunderbird, FF 60.8 RSS and FF 68 with Livemarks add-on → pushlog ATOM feed does not escape filenames

Wow, this works again, thank you so much for tracking it down and fixing it so quickly, very much appreciated. It makes life sheriffing Thunderbird so much easier (in fact, the two bustages we fixed today on comm-central would have been detected using the "feed analysis" where we look for M-C changes that will likely affect Thunderbird).

You are quite welcome. :)

Status: RESOLVED → VERIFIED