Closed Bug 1534163 Opened 1 year ago Closed 5 months ago

Remove nsIRDF* use from feeds code

Categories

(MailNews Core :: Feed Reader, task)

task
Not set

Tracking

(thunderbird68 affected, thunderbird69 affected)

RESOLVED FIXED
Thunderbird 71.0
Tracking Status
thunderbird68 --- affected
thunderbird69 --- affected

People

(Reporter: benc, Assigned: benc)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 19 obsolete files)

33.24 KB, patch
benc
: review+
Details | Diff | Splinter Review
69.20 KB, patch
mkmelin
: review+
Details | Diff | Splinter Review
25.78 KB, patch
benc
: review+
Details | Diff | Splinter Review

The feeds code (under mailnews/extensions/newsblog) is the last main user of the existing rdf code.

I've been looking through the feed code and here are my notes and plans for de-nsIRDF-ing it.

There seem to be three code uses of the nsIRDF code in feeds:

  1. to maintain a database of subscribed feeds (backed by feeds.rdf). Individual Feed objects are defined in terms of nsIRDF objects.

  2. to maintain a database of feeditems (backed by feeditems.rdf). My take is that this holds extra feeds-specific data, outside the normal message storage system (each item has: last-seen, valid, stored).

  3. to parse RSS1 feeds (see Bug 1532177).

So here are my proposed changes:

  • alter all the feeds code to use plain javascript objects rather than the XPCOM nsIRDF classes (ie string rather than nsIRDFResource). This'll be tedious, but not too hard as the vast bulk of the nsIRDF use is simply to store plain datatypes (string/int/bool/date).

  • switch to using a javascript-native RDFXML parser for parsing feeds.rdf, feeditems.rdf and RSS1 feeds. It doesn't need to cover the entire RDFXML spec - just the bits we need. I've already started on this and have a working parser (it produces RDF triples from an XMLDocument - either from DOMParser or from an XMLHttpRequest).

  • update the RSS1 parser to use the javascript-native parser. RSS1 Feeds out in the wild are likely to exercise the parser more than our own feeds.rdf/feeditems.rdf, so there might be a few more bits of the RDFXML spec to implement.

  • decide on how to serialise feeds.rdf and feeditems.rdf back to disk. We could switch to .json easily enough. We still have to be able to read existing rdf files no matter what, to convert them to .json. But if keeping the old file formats is preferred, then writing an RDFXML serialiser isn't tooooo bad (it's a lot easier than writing a parser!).

  • write more feed tests! (eg my patch in Bug 1527764 broke feeds, but no tests picked it up). Both xpcshell and integration (mochi? mozmill?). There were some tests in firefox before they removed feed support which might be relevant.

Because of the way the nsIRDF classes are embedded everywhere, I'm not sure there any nice clean refactoring steps to do this stuff piecemeal :-(

Blocks: mail-killrdf
Blocks: 1527764
Depends on: 1532177

(In reply to Ben Campbell from comment #0)

  • decide on how to serialise feeds.rdf and feeditems.rdf back to disk.
    We could switch to .json easily enough. We still have to be able to read
    existing rdf files no matter what, to convert them to .json. But if
    keeping the old file formats is preferred, then writing an RDFXML serialiser
    isn't tooooo bad (it's a lot easier than writing a parser!).

Sounds like a one-time migration to .json would be the way to go. See MailMigrator.jsm

Would this be better sitting under MailNews:Feed Reader, MailNews: Backend or something?

Absolutely.

Component: General → Feed Reader
Product: Thunderbird → MailNews Core
Attached patch update-rss1-parsing-wip.patch (obsolete) — Splinter Review

Just a work-in-progress update.
This adds a new (lightweight) RDF parser and switches over the RSS1 parsing to use it instead of the heavyweight XPCOM-based rdf code.
Also adds some structure for unit tests. Works for the noddy w3.org rss1 example, but I need to try it out on some more feeds in the wild.

This needs cleaning up, but it'll be the first of (I think) two patches.
The next one will replace the rdf use in feeds.rdf (the subscriptions list) and feeditems.rdf (downloaded news items). I've got a good chunk of the second patch, but it's not yet runable - the nsIRDF* use is pretty pervasive. I managed to slice out the RSS1 parsing into this first patch, but the rest will be a all-at-once change.

Assignee: nobody → benc
Comment on attachment 9051614 [details] [diff] [review]
update-rss1-parsing-wip.patch

Review of attachment 9051614 [details] [diff] [review]:
-----------------------------------------------------------------

Instead of string concatenation, I would suggest to allow using the normal namespace declarations with defaults, so instead of

FeedUtils.RDF_SYNTAX_NS + "type"

the code would just expand a passed in "rdf:type"

@mkmelin: not sure who's area this is, so you're my default r? :-)

This patch is the first of two and it:

  • adds a simple JS-based RDFXML parser.
  • uses it to replaces the nsRDF* parser for RSS1 feeds.
  • adds boilerplate for xpcshell unit tests, along with some test data and tests.

(Second patch should be a lot smaller, and deals with the RDF parsing for feeds.rdf and feeditems.rdf)

Attachment #9051614 - Attachment is obsolete: true
Attachment #9052784 - Flags: review?(mkmelin+mozilla)

(In reply to Magnus Melin [:mkmelin] from comment #5)

Instead of string concatenation, I would suggest to allow using the normal
namespace declarations with defaults, so instead of

FeedUtils.RDF_SYNTAX_NS + "type"

the code would just expand a passed in "rdf:type"

I umm-ed and ahh-ed about this for a while. But I decided to keep it simple and ugly for now.
I was thinking of having provision in RDFData for a prefix table, used to expand any namespace prefixes during lookup.
But it gets more confusing when you're reading values back out - should they be returned in some expanded/shortened canonical form? Mainly a documentation issue, but lots of potential for confusion.

I did implement something similar locally in test_rdfparser.js for describing test data, which helped a lot, so I'll probably revisit it after I finish up on the second patch. There'll be some tidying of RDF uri symbols in that patch anyway - the get RSS_CHANNEL() nsIRDFResource stuff.

Comment on attachment 9052784 [details] [diff] [review]
part-one--update-rss1-parsing-1.patch

If in doubt, refer to:
https://wiki.mozilla.org/Modules/MailNews_Core#Feeds
Attachment #9052784 - Flags: review?(alta88)
Comment on attachment 9052784 [details] [diff] [review]
part-one--update-rss1-parsing-1.patch

Review of attachment 9052784 [details] [diff] [review]:
-----------------------------------------------------------------

Looks ok to me. At least it works for a test case of https://websitesetup.org/feed/rdf/

Wet's see what alta88 thinks

::: mailnews/extensions/newsblog/content/FeedUtils.jsm
@@ +21,1 @@
>  

for another bug, but I'd think all these above should be modules

::: mailnews/extensions/newsblog/content/RDFParser.js
@@ +150,5 @@
> +  },
> +
> +  /**
> +   * Fetch a list of all the subjects in the RDFData object.
> +   * @returns {Array} array of strings containing the subject URIs

@returns {String[]}

@@ +168,5 @@
> +   * but we can't because we don't currently know (or care) if
> +   * objects are literals or not.
> +   * So for now just use a pseudo-N-Triples formatting.
> +   *
> +   * @returns {undefined}

I think you should leave out the @return for voids.

@@ +227,5 @@
> +  },
> +
> +
> +  /**
> +   * internal use only - add a triple to the current RDFData being built.

nit: capital I

For an internal function, perhaps make it _emit

::: mailnews/extensions/newsblog/test/unit/test_feedparser.js
@@ +10,5 @@
> +    }
> +  }
> +}
> +
> +

please remove the double blank lines

::: mailnews/extensions/newsblog/test/unit/xpcshell.ini
@@ +1,2 @@
> +[DEFAULT]
> +head = head_feeds.js 

trailing whitespace
Attachment #9052784 - Flags: review?(mkmelin+mozilla) → review+

(In reply to Jorg K (GMT+1) from comment #8)

Comment on attachment 9052784 [details] [diff] [review]
part-one--update-rss1-parsing-1.patch

If in doubt, refer to:
https://wiki.mozilla.org/Modules/MailNews_Core#Feeds

The blame on just about every line is a hint too..

@@ +168,5 @@

    • but we can't because we don't currently know (or care) if
    • objects are literals or not.
    • So for now just use a pseudo-N-Triples formatting.
    • @returns {undefined}

I think you should leave out the @return for voids.

The linter will complain if the rule is prefers returns. It was annoying at first, but now I think extreme strictness in linting is a Good Thing, especially if used to make up for a language with no typing.

The string sanitization in the getRDFTargetValue* functions has been lost, which regresses a CVE level vulnerability fix (hovering some of that code in searchfox links to the bug). Also, if those functions are obsoleted, they should be removed.

A good torture test rdf feed is https://www.livejournal.com/stats/latest-rss.bml, please verify it works the same before/after.
Many academic journals use rdf: https://journals.sagepub.com/action/showFeed?ui=0&mi=ehikzz&ai=2b4&jc=acrc&type=etoc&feed=rss

RDF resource has a property |valueUTF8|, will IDN and l10n urls continue to work? Meaning when viewing the message, the website header should pretty show the link and it should open correctly in a browser. I don't have an rdf site for this anymore, but you can create a file:// url feed and simulate it. To enable subscribing to local files, issue FeedUtils._validSchemes.push("file") in the console.

I'll look again in greater detail once these are addressed.

Attachment #9052784 - Flags: review?(alta88) → review-

Thanks for the review. Could you please provide such a feed in a file.

(In reply to alta88 from comment #12)

The string sanitization in the getRDFTargetValue* functions has been lost, which regresses a CVE level vulnerability fix (hovering some of that code in searchfox links to the bug). Also, if those functions are obsoleted, they should be removed.

Thanks! That's an excellent point - I'll track it down.

A good torture test rdf feed is https://www.livejournal.com/stats/latest-rss.bml, please verify it works the same before/after.
Many academic journals use rdf: https://journals.sagepub.com/action/showFeed?ui=0&mi=ehikzz&ai=2b4&jc=acrc&type=etoc&feed=rss

They look like great tests (particularly the livejournal one, with Cyrillic).
I'd love to include something like these in the unit test data, but not sure what the copyright implications are.
Anyone know what policy is on this, or alternative feed suggestions without copyright issues?

RDF resource has a property |valueUTF8|, will IDN and l10n urls continue to work? Meaning when viewing the message, the website header should pretty show the link and it should open correctly in a browser. I don't have an rdf site for this anymore, but you can create a file:// url feed and simulate it. To enable subscribing to local files, issue FeedUtils._validSchemes.push("file") in the console.

The new rdfxml parser uses DOMParser to handle the actual parsing (it reads from an already-parsed XMLDocument rather than a byte stream). So I think all strings should come out just fine, whatever their encoding. I perform any further string processing (beyond prepending namespaces), so I think it should be fine.

I'm not currently handling lang=... attributes in the rdf data, but I'm not sure if any rss1 feeds are likely to make use of it (ie multiple languages in the same feed).

In any case it'd be good to add some unit tests to ensure that non-latin encodings are correctly handled (incidentally, the test code stands up a local http server, to get around the validSchemes filter for testing).

(In reply to Ben Campbell from comment #14)

(In reply to alta88 from comment #12)

The string sanitization in the getRDFTargetValue* functions has been lost, which regresses a CVE level vulnerability fix (hovering some of that code in searchfox links to the bug). Also, if those functions are obsoleted, they should be removed.

Thanks! That's an excellent point - I'll track it down.

A good torture test rdf feed is https://www.livejournal.com/stats/latest-rss.bml, please verify it works the same before/after.
Many academic journals use rdf: https://journals.sagepub.com/action/showFeed?ui=0&mi=ehikzz&ai=2b4&jc=acrc&type=etoc&feed=rss

They look like great tests (particularly the livejournal one, with Cyrillic).
I'd love to include something like these in the unit test data, but not sure what the copyright implications are.
Anyone know what policy is on this, or alternative feed suggestions without copyright issues?

All you need is to copy an <item> and just use 1 word in the title and description. The url doesn't have to be real either, so use тест.рф/testitem1 or such as a url.

The important thing is to maintain CDATA if any and so forth, not the full content. (Although US Fair Use laws would likely allow such testing, it's irrelevant as this isn't US origin content).

RDF resource has a property |valueUTF8|, will IDN and l10n urls continue to work? Meaning when viewing the message, the website header should pretty show the link and it should open correctly in a browser. I don't have an rdf site for this anymore, but you can create a file:// url feed and simulate it. To enable subscribing to local files, issue FeedUtils._validSchemes.push("file") in the console.

The new rdfxml parser uses DOMParser to handle the actual parsing (it reads from an already-parsed XMLDocument rather than a byte stream). So I think all strings should come out just fine, whatever their encoding. I perform any further string processing (beyond prepending namespaces), so I think it should be fine.

Ok. I bring it up because it had to be fixed once long ago.

I'm not currently handling lang=... attributes in the rdf data, but I'm not sure if any rss1 feeds are likely to make use of it (ie multiple languages in the same feed).

lang is not used in any parser and there has never been a bug for it, so this can be skipped.

In any case it'd be good to add some unit tests to ensure that non-latin encodings are correctly handled (incidentally, the test code stands up a local http server, to get around the validSchemes filter for testing).

Yes, non latin is very important.

I do not think the complexity of an http server should be added to parser tests. An xhr returns the same document from a local file as from a (local) http served file. The _validSchemes was specifically implemented to allow testing from file, and anything testing document parsing onwards should get the feed from a file://.

So you're regressing a change made to how rdf <item> tags are collected in parseAsRSS1(). If you hover the comment

// Ignore the <items> list and just get the <item>s.

you will see the bug that changed this in 2013. I know you're just following the rdf spec for getting the 'chapter' from the 'table of contents' (<item> from the <items> list), but the reason that way of lookup was removed back then was that the <items> did not always contain a reference to an existing <item>, meaning it wasn't found. Better to just get all the <item>s directly.

I should have removed rdf in parseAsRSS1() in that change. What it means is there's no reason to use an rdf parser here, just get the <item> tags like in the other parsers, and getAttribute() for the rdf:about urn value (itemURI). What aceman alluded to in Bug 1532177. What do you think?

Changes for this new patch:

  • ignores the <items> sequence in RDF RSS1 feeds and just looks for <item>s directly (as per fix for Bug 476641). Still using RDF otherwise, but there's no more structural stuff involved, so should be OK I think. Added a unit test RSS1 feed file with a sabotaged <items> node.

  • added a couple more tests in, mainly to make sure non-latin character encodings aren't being dropped.

  • RSS1: re-added code to filter title and author names to single line (and with trimmed leading/trailing whitespace).

  • deleted the now-unused getRDFTargetValue* functions (in feed-parser.js)

Attachment #9052784 - Attachment is obsolete: true
Attachment #9053181 - Flags: review?(alta88)

(In reply to alta88 from comment #16)

I know you're just following the rdf spec for getting the 'chapter' from the 'table of contents' (<item> from the <items> list), but the reason that way of lookup was removed back then was that the <items> did not always contain a reference to an existing <item>, meaning it wasn't found. Better to just get all the <item>s directly.

Heh - I should have anticipated that this would be the case. Nobody ever manages to write out RDFXML files properly! And nobody writing feed reader software these days is going to bother using an RDFXML parser just to handle RSS1 (other than us!).

Part two - removes the nsIRDF* use from the rest of the feeds code. In particular, replaces feeds.rdf and feeditems.rdf with json equivalents.

Sorry it took so long, but little bugs kept popping up all over the place.

Still not happy with this patch, in particular, the test coverage is still pretty slim. Some parts are pretty tricky to unit test, but adding some mochi tests would help a lot.

Also the feeds/feeditems store management is a direct translation from the RDF version, and as such "smells" a little off now. In the interests of not introducing any more regressions I'll leave it like that for now, but I'll probably file another bug to suggest some cleanups (which would also make unit tests a lot easier).

Attachment #9054781 - Flags: review?(alta88)

(In reply to Ben Campbell from comment #17)

Created attachment 9053181 [details] [diff] [review]
part-one--update-rss1-parsing-2.patch

Changes for this new patch:

  • ignores the <items> sequence in RDF RSS1 feeds and just looks for <item>s directly (as per fix for Bug 476641). Still using RDF otherwise, but there's no more structural stuff involved, so should be OK I think. Added a unit test RSS1 feed file with a sabotaged <items> node.

I'd really prefer to use xml document methods and not a special parser to get nodes and attributes and values. Why wouldn't we do that? Let's join the 'nobody .. RDFXML parser' club.

  • added a couple more tests in, mainly to make sure non-latin character encodings aren't being dropped.

Who needs tests. It's some kind of miracle this component has no tests and no bugs..

No, tests are great! There's even Bug 692073 for tests. Perhaps administratively they could be added there? Up to you.

But, again, why are we using an http server rather than file:// urls? The test just needs to push "file" to _validSchemes.

  • RSS1: re-added code to filter title and author names to single line (and with trimmed leading/trailing whitespace).

  • deleted the now-unused getRDFTargetValue* functions (in feed-parser.js)

(In reply to alta88 from comment #20)

I'd really prefer to use xml document methods and not a special parser to get nodes and attributes and values. Why wouldn't we do that? Let's join the 'nobody .. RDFXML parser' club.

I've just spent the last couple of hours doing this... but then realised that there were loads of RDFXML rules I'd have to re-implement (eg literal values in RDFXML can be either nodes or attributes). So I gave up on it.
The RDFXML parser I wrote already handles most of this stuff (it takes an already-parsed XMLDocument and applies the RDFXML rules to spit out RDF triples, so it's never doing any character-by-character parsing). I think we're safer using it rather than re-implementing a bunch of the RDF parsing rules again just for RSS1.

  • added a couple more tests in, mainly to make sure non-latin character encodings aren't being dropped.

Who needs tests. It's some kind of miracle this component has no tests and no bugs..

The test coverage is still pretty low, but it's a start. At least it's a place to hang future tests if anyone is fixing other stuff in here.

No, tests are great! There's even Bug 692073 for tests. Perhaps administratively they could be added there? Up to you.

Ahh, cool! Hadn't spotted that bug.
The tests are split up across these two patches - I can split them out onto that bug if it's easier, but they'll have to land in the right ordering.

But, again, why are we using an http server rather than file:// urls? The test just needs to push "file" to _validSchemes.

I left the httpd setup in there because it's only a few lines and it'd be nice to add some HTTP-specific tests ("If-Modified-Since" header etc) one day. I can ditch it if you feel strongly about it.

(In reply to Ben Campbell from comment #21)

...but then realised that there were loads of RDFXML rules I'd have to re-implement (eg literal values in RDFXML can be either nodes or attributes). So I gave up on it.

Just to clarify, I've never seen any RSS1 feeds which do store literals as attributes rather than nodes, but it is valid RDFXML. I can easily imagine there are RSS1 feeds out there generated by some RDF library which does use attributes for simple string literals...
So if we need to follow RDFXML rules, may as well use what we've already got rather than re-implementing a subset just to handle rss1.

(In reply to Ben Campbell from comment #22)

(In reply to Ben Campbell from comment #21)

...but then realised that there were loads of RDFXML rules I'd have to re-implement (eg literal values in RDFXML can be either nodes or attributes). So I gave up on it.

Just to clarify, I've never seen any RSS1 feeds which do store literals as attributes rather than nodes, but it is valid RDFXML. I can easily imagine there are RSS1 feeds out there generated by some RDF library which does use attributes for simple string literals...
So if we need to follow RDFXML rules, may as well use what we've already got rather than re-implementing a subset just to handle rss1.

Neither have I and rss rdf in practical usage is a subset of the spec. What if attributes, per spec, were assumed away.

If you look at getRDFTargetValueRaw(), are you convinced it's even possible to get an attribute? From ds.GetTarget the result is QIed to Ci.nsIRDFLiteral to get Value property. If you're saying the result can be either a node or an attribute, then we'll go with the parser.

I've looked at other readers that claim rss rdf support and don't see a specialized rdf parser, and am trying to keep this as simple as possible, if possible.

(In reply to Ben Campbell from comment #21)

(In reply to alta88 from comment #20)

The test coverage is still pretty low, but it's a start. At least it's a place to hang future tests if anyone is fixing other stuff in here.

No, tests are great! There's even Bug 692073 for tests. Perhaps administratively they could be added there? Up to you.

Ahh, cool! Hadn't spotted that bug.
The tests are split up across these two patches - I can split them out onto that bug if it's easier, but they'll have to land in the right ordering.

It's up to you, makework isn't a goal. That bug can be left open though, and you just add whatever tests are ready.

But, again, why are we using an http server rather than file:// urls? The test just needs to push "file" to _validSchemes.

I left the httpd setup in there because it's only a few lines and it'd be nice to add some HTTP-specific tests ("If-Modified-Since" header etc) one day. I can ditch it if you feel strongly about it.

Yes, tests that require an actual http response need a fakeserver (like testing a response header or certs or things). But nothing yet needs that, so I'd implement it only when needed. Partially for simplicity, partially for history. Sometimes things land and then aren't used and someone years later cleans it out..

Re using http for testing, I'd leave that in. That's the real world scenario, and it's preferable to test something that will be used rather than something that is test-only (like feeds through file://)

Status: NEW → ASSIGNED

(In reply to alta88 from comment #23)

Neither have I and rss rdf in practical usage is a subset of the spec. What if attributes, per spec, were assumed away.

If you look at getRDFTargetValueRaw(), are you convinced it's even possible to get an attribute? From ds.GetTarget the result is QIed to Ci.nsIRDFLiteral to get Value property. If you're saying the result can be either a node or an attribute, then we'll go with the parser.

Well, the following are equivalent:

This (all values as nodes):

  <item rdf:about="http://xml.com/pub/2000/08/09/xslt/xslt.html">
    <title>Processing Inclusions with XSLT</title>
    <link>http://xml.com/pub/2000/08/09/xslt/xslt.html</link>
    <description>
      Processing document inclusions with general XML tools can be 
      problematic. This article proposes a way of preserving inclusion 
      information through SAX-based processing.
    </description>
  </item>

and this (some values in attributes):

 <item rdf:about="http://xml.com/pub/2000/08/09/xslt/xslt.html" title="Processing Inclusions with XSLT" link="http://xml.com/pub/2000/08/09/xslt/xslt.html">
    <description>
      Processing document inclusions with general XML tools can be 
      problematic. This article proposes a way of preserving inclusion 
      information through SAX-based processing.
    </description>
  </item>

And while a custom rss1-specific feed parser can be made to deal with it (and other such rules), by that point it seems like we may as well use the RDF parser, as we need it anyway.

Blocks: 692073

(In reply to Ben Campbell from comment #26)

(In reply to alta88 from comment #23)

Neither have I and rss rdf in practical usage is a subset of the spec. What if attributes, per spec, were assumed away.

If you look at getRDFTargetValueRaw(), are you convinced it's even possible to get an attribute? From ds.GetTarget the result is QIed to Ci.nsIRDFLiteral to get Value property. If you're saying the result can be either a node or an attribute, then we'll go with the parser.

Well, the following are equivalent:

Nope, as far as the rdf parser is concerned, they are not. If you put them into a file and subscribe, the first will result in a message being stored and the second will fail on syntax. As I've already implied, you need to demonstrate why xml doc dom methods, as used by the other parsers, can't be used to get the handful of tag values, in order to justify using a special parser.

In fact, I did a fast check with this:

diff -r 81f6ceef61c7 mailnews/extensions/newsblog/content/feed-parser.js
--- a/mailnews/extensions/newsblog/content/feed-parser.js	Fri Feb 15 18:50:20 2019 -0700
+++ b/mailnews/extensions/newsblog/content/feed-parser.js	Mon Apr 08 07:55:10 2019 -0600
@@ -53,18 +53,19 @@ FeedParser.prototype = {
       aFeed.mFeedType = "RSS_1.xRDF";
       FeedUtils.log.debug("FeedParser.parseFeed: type:url - " +
                           aFeed.mFeedType + " : " + aFeed.url);
       // aSource can be misencoded (XMLHttpRequest converts to UTF-8 by default),
       // but the DOM is almost always right because it uses the hints in the
       // XML file.  This is slower, but not noticeably so.  Mozilla doesn't have
       // the XMLHttpRequest.responseBody property that IE has, which provides
       // access to the unencoded response.
-      let xmlString = this.mSerializer.serializeToString(doc);
-      return this.parseAsRSS1(aFeed, xmlString, aFeed.request.channel.URI);
+//      let xmlString = this.mSerializer.serializeToString(doc);
+//      return this.parseAsRSS1(aFeed, xmlString, aFeed.request.channel.URI);
+      return this.parseAsRSS2(aFeed, aDOM);
     } else if (doc.namespaceURI == FeedUtils.ATOM_03_NS) {
       aFeed.mFeedType = "ATOM_0.3";
       FeedUtils.log.debug("FeedParser.parseFeed: type:url - " +
                           aFeed.mFeedType + " : " + aFeed.url);
       return this.parseAsAtom(aFeed, aDOM);
     } else if (doc.namespaceURI == FeedUtils.ATOM_IETF_NS) {
       aFeed.mFeedType = "ATOM_IETF";
       FeedUtils.log.debug("FeedParser.parseFeed: type:url - " +

And it worked. What's more, both your test messages were stored, though with different results (livejournal also works). So I would like to proceed based on that.

And while a custom rss1-specific feed parser can be made to deal with it (and other such rules), by that point it seems like we may as well use the RDF parser

That is a non reason, and practically an anti reason to do anything, sorry. And I see no point swapping one rdf parser (ancient, feature complete, tried and tested) with another (light, handrolled, single purpose, untested), when we can get by with neither.

, as we need it anyway.

For what? If you might mean getting the 2 dbs converted, I doubt it.

Attachment #9053181 - Flags: review?(alta88) → review-

Thanks alta - that's good feedback and the fact we're debating this at all is a good argument for not adding a new lightweight RDFXML parser :-)

I'll ditch the new rdf parser and work up a new patch which:

  1. parses RSS1 using plain XMLDocument DOM methods (like the other feed parsers do).
  2. does the same (XML doc DOM) for parsing the old feeds.rdf and feeditems.rdf
Attachment #9054781 - Flags: review?(alta88)

The patches do not apply trivially at a few lines.
I also get this:
Feeds ERROR FeedUtils.getSubscriptions(): error reading feeds.json (or legacy feeds.rdf): Not enough arguments [nsIFile.remove]
log4moz.js:680
CApp_append resource:///modules/gloda/log4moz.js:680
Logger_log resource:///modules/gloda/log4moz.js:374
Logger_error resource:///modules/gloda/log4moz.js:382
getSubscriptions resource:///modules/FeedUtils.jsm:1399
getFeedUrlsInFolder resource:///modules/FeedUtils.jsm:622
getFolderProperties resource:///modules/FeedUtils.jsm:680
getProperties chrome://messenger/content/folderPane.js:2409
getRowProperties chrome://messenger/content/folderPane.js:993

I can't say how to reproduce as I wasn't clicking any RSS folder, the RSS account just came into view in the folder pane.
But at least I do not see the errors reported in bug 1527764. Viewing RSS messages now works without console errors.

OK, new RDFParser-less version of the RSS1 parsing.

Also added support for multiple authors (the old rss1 parser only took the first author). Good for feeds like the SAGE journals one, where papers usually have a whole bunch of authors.

Attachment #9053181 - Attachment is obsolete: true
Attachment #9058182 - Flags: review?(alta88)

...and corresponding RDFParser-less changes to handle legacy feeds.rdf and feeditems.rdf files.

Attachment #9054781 - Attachment is obsolete: true
Attachment #9058184 - Flags: review?(alta88)

(In reply to :aceman from comment #29)
Thanks for giving them a try!

The patches do not apply trivially at a few lines.

Hmm... could be the tweaks in the C++ code to add feeds.json/feeditems.json to the list of files-which-are-definitely-not-mboxes. The other patches in the RDF-removal work might be interfering there - these feed patches are in isolation, not on top of the other RDF patches. I might have to rebase/merge those patches once this work is landed.

In any case, I've just reworked the feed patches (again, on the live tree, not on top of the other RDF-removal work)

The missing nsIFile::remove() parameter should be fixed now - just a stupid mistake on my part (requires a bool param to control recursive delete).

Duplicate of this bug: 1532177
Comment on attachment 9058182 [details] [diff] [review]
part-one--update-rss1-parsing-3.patch


Ok, this is on the right track. However, I'd like the changes to very strictly follow only what the original parser did. For example, you've made it much stricter regarding what is mandatory. The title doesn't only come from the feed, it is also what the user sets in Subscribe, thus feed.title already exists for updates, and the feed may then consist of nothing other than a link and should not be rejected.

Also, I know you mean well, but adding multiple author support (for rdf and rss2) is a different bug with plenty of issues (I have a wip for it) and additional features should not be added in this sort of major conversion patch. And, you've changed what happens to author if there is no author tag (author becomes the title). Please follow the existing flow and var assignments in exact existing order.
```
>diff --git a/mailnews/extensions/newsblog/content/feed-parser.js b/mailnews/extensions/newsblog/content/feed-parser.js
>--- a/mailnews/extensions/newsblog/content/feed-parser.js
>+++ b/mailnews/extensions/newsblog/content/feed-parser.js
>@@ -311,118 +306,138 @@ FeedParser.prototype = {
>       }
> 
>       this.parsedItems.push(item);
>     }
> 
>     return this.parsedItems;
>   },
> 
>-  parseAsRSS1(aFeed, aSource, aBaseURI) {
>-    // RSS 1.0 is valid RDF, so use the RDF parser/service to extract data.
>-    // Create a new RDF data source and parse the feed into it.
>-    let ds = Cc["@mozilla.org/rdf/datasource;1?name=in-memory-datasource"].
>-             createInstance(Ci.nsIRDFDataSource);
>-
>-    let rdfparser = Cc["@mozilla.org/rdf/xml-parser;1"].
>-                    createInstance(Ci.nsIRDFXMLParser);
>-    rdfparser.parseString(ds, aBaseURI, aSource);
>-
>-    // Get information about the feed as a whole.
>-    let channel = ds.GetSource(FeedUtils.RDF_TYPE, FeedUtils.RSS_CHANNEL, true);
>+  parseAsRSS1(feed, doc) {
```
Please add some jsdoc here.
```
>+    // Get the first channel (assuming there is only one per RSS File).
>+    let channel = doc.querySelector("channel");
>     if (!channel) {
>-      aFeed.onParseError(aFeed);
>+      feed.onParseError(feed);
>       return [];
>     }
> 
>-    if (this.isPermanentRedirect(aFeed, null, channel, ds)) {
>+    if (this.isPermanentRedirect(feed, null, channel)) {
>+      return [];
>+    }
>+    let titleNode = this.childByTagNameNS(channel, FeedUtils.RSS_NS, "title");
>+    let linkNode = this.childByTagNameNS(channel, FeedUtils.RSS_NS, "link");
>+    let descNode = this.childByTagNameNS(channel, FeedUtils.RSS_NS, "description");
>+
>+    // Forget the RSS1 spec. Feeds in the wild are always broken.
>+    // So we'll tolerate some missing fields.
>+    if (!titleNode && !descNode && !linkNode) {
>+      FeedUtils.log.error("FeedParser.parseAsRSS1: <channel is missing " +
>+                          "<title>, <description> and <link>");
>+      feed.onParseError(feed);
>       return [];
>     }
> 
>-    aFeed.title = aFeed.title ||
>-                  this.getRDFTargetValue(ds, channel, FeedUtils.RSS_TITLE) ||
>-                  aFeed.url;
>-    aFeed.description = this.getRDFTargetValueFormatted(ds, channel, FeedUtils.RSS_DESCRIPTION) ||
>-                        "";
>-    aFeed.link = this.validLink(this.getRDFTargetValue(ds, channel, FeedUtils.RSS_LINK)) ||
>-                 aFeed.url;
>+    feed.link = this.validLink(this.getNodeValue(linkNode)) || feed.url;
>+    feed.title = feed.title || this.getNodeValue(titleNode) || feed.link;
>+    feed.description = this.getNodeValueFormatted(descNode);
> 
>-    if (!(aFeed.title || aFeed.description) || !aFeed.link) {
>-      FeedUtils.log.error("FeedParser.parseAsRSS1: missing mandatory element " +
>-                          "<title> and <description>, or <link>");
>-      aFeed.onParseError(aFeed);
>-      return [];
>-    }
>-
```
The original code didn't reject that way, please restore the original if() logic and debug statement.
```
>-    if (!aFeed.parseItems) {
>+    // If we're only interested in the overall feed description, we're done.
>+    if (!feed.parseItems) {
>       return [];
>     }
> 
>-    this.findSyUpdateTags(aFeed, channel, ds);
>+    this.findSyUpdateTags(feed, channel);
>+
>+    feed.invalidateItems();
> 
>-    aFeed.invalidateItems();
>+    // Now process all the individual items in the feed.
>+    let itemNodes = doc.getElementsByTagNameNS(FeedUtils.RSS_NS, "item");
>+    itemNodes = itemNodes ? itemNodes : [];
> 
>-    // Ignore the <items> list and just get the <item>s.
>-    let items = ds.GetSources(FeedUtils.RDF_TYPE, FeedUtils.RSS_ITEM, true);
>+    for (let itemNode of itemNodes) {
>+      if (!itemNode.childElementCount) {
>+        continue;
>+      }
>+
>+      let itemURI = itemNode.getAttribute("about");
> 
>-    while (items.hasMoreElements()) {
>-      let itemResource = items.getNext().QueryInterface(Ci.nsIRDFResource);
>-      let item = new FeedItem();
>-      item.feed = aFeed;
>+      let titleNode = this.childByTagNameNS(itemNode, FeedUtils.RSS_NS, "title") ||
>+                      this.childByTagNameNS(itemNode, FeedUtils.DC_NS, "title") ||
>+                      this.childByTagNameNS(itemNode, FeedUtils.DC_NS, "subject");
>+      let linkNode = this.childByTagNameNS(itemNode, FeedUtils.RSS_NS, "link");
>+      let descNode = this.childByTagNameNS(itemNode, FeedUtils.RSS_NS, "description");
>+
>+      let dateNode = this.childByTagNameNS(itemNode, FeedUtils.DC_NS, "date");
>+      let date = this.getNodeValue(dateNode);
> 
>-      // Prefer the value of the link tag to the item URI since the URI could be
>-      // a relative URN.
```
Please restore the comment.
```
>-      let uri = itemResource.ValueUTF8;
>-      let link = this.validLink(this.getRDFTargetValue(ds, itemResource, FeedUtils.RSS_LINK));
>-      item.url = link || uri;
>-      item.description = this.getRDFTargetValueFormatted(ds, itemResource,
>-                                                         FeedUtils.RSS_DESCRIPTION);
>-      item.title = this.getRDFTargetValue(ds, itemResource, FeedUtils.RSS_TITLE) ||
>-                   this.getRDFTargetValue(ds, itemResource, FeedUtils.DC_SUBJECT) ||
>-                   (item.description ?
>-                     (this.stripTags(item.description).substr(0, 150)) : null);
>+      // Check for authors (might be more than one)
>+      let authors = [];
>+      let authorNodes = itemNode.getElementsByTagNameNS(FeedUtils.DC_NS, "creator");
>+      for (let authorNode of authorNodes) {
>+        let name = this.getNodeValue(authorNode);
>+        name = this.cleanAuthorName(name);
>+        if (name) {
>+          authors.push("<" + name + ">");
>+        }
>+      }
>+      if (authors.length == 0) {
>+        // No authors. Check for publisher as a fallback.
>+        let pubNode = this.childByTagNameNS(channel, FeedUtils.DC_NS, "publisher");
>+        let name = this.getNodeValue(pubNode);
>+        name = this.cleanAuthorName(name);
>+        if (name) {
>+          authors.push("<" + name + ">");
>+        }
>+      }
>+
```
Please just convert author but not add multiple support here or add any features not currently found. 
```
>+      let item = new FeedItem();
>+      item.feed = feed;
>+      item.enclosures = [];
>+      item.keywords = [];
```
These two are in the constructor, are they necessary here? These are not currently supported for rdf.
```
>+      item.description = this.getNodeValueFormatted(descNode);
>+      if (date) {
>+        item.date = date; // NOTE: FeedItem breaks if date is null
>+      }
>+      item.author = authors ? authors : null;
>+
>+      let link = this.getNodeValue(linkNode);
>+      item.url = this.validLink(link) || this.validLink(itemURI);
>+      item.id = item.url;
>+
>+      let title = this.getNodeValue(titleNode);
>+      if (title) {
>+        item.title = item.htmlEscape(title);
```
The reason title isn't escaped in the original was that it could be used for content, if there were no content tag. (This is done in rss2 but not here, so it doesn't matter but also isn't needed). And for this reason, title was originally cleaned up for storage as subject in feedItem.js, ie late in the flow. There are many such easy to regress things in feeds code..
```
>+      } else if (item.description) {
>+        item.title = this.stripTags(item.description).substr(0, 150).trim();
>+      } else {
>+        item.title = null;
>+      }
>+
>       if (!item.url || !item.title) {
>         FeedUtils.log.info("FeedParser.parseAsRSS1: <item> missing mandatory " +
>-                           "element <item rdf:about> and <link>, or <title> and " +
>+                           "rdf:about, <link>, or <title> and " +
>                            "no <description>; skipping");
```
So the phrase was written to indicate the item must have (rdf:about or link) *and* (title or desc). Please restore.
```
>         continue;
>       }
>-
>-      item.id = item.url;
>-      item.url = this.validLink(item.url);
>-
>-      let author = this.getRDFTargetValue(ds, itemResource, FeedUtils.DC_CREATOR) ||
>-                   this.getRDFTargetValue(ds, channel, FeedUtils.DC_CREATOR) ||
>-                   aFeed.title;
>-      author = this.cleanAuthorName(author);
>-      item.author = author ? ["<" + author + ">"] : item.author;
>-
>-      item.date = this.getRDFTargetValue(ds, itemResource, FeedUtils.DC_DATE) ||
>-                  item.date;
>-      item.content = this.getRDFTargetValueFormatted(ds, itemResource,
>-                                                     FeedUtils.RSS_CONTENT_ENCODED);
>-
```
Please restore the original assignment of variables order.
```
>       this.parsedItems.push(item);
>     }
>-    FeedUtils.log.debug("FeedParser.parseAsRSS1: items parsed - " +
>-                        this.parsedItems.length);
```
Why is this debug removed, please restore.


The tests look good on first glance, I'll look in detail once the parser is finished.

The second patch is a giant change and will require a lot of review time to do properly. I'll try to have a first cut within a week but this is on free time and I'm traveling atm etc.
Attachment #9058182 - Flags: review?(alta88) → feedback+

Thanks, alta88 - you're absolutely right. I should have stuck with what was there already, warts and all. This new patch follows the original rss1 parsing much more closely.

Attachment #9058182 - Attachment is obsolete: true
Attachment #9059999 - Flags: review?(alta88)

The second part. No changes, just rebased to remove some minor merge clashes from changes in the first part.

Attachment #9058184 - Attachment is obsolete: true
Attachment #9058184 - Flags: review?(alta88)
Attachment #9060000 - Flags: review?(alta88)
Comment on attachment 9059999 [details] [diff] [review]
part-one--update-rss1-parsing-4.patch

r+ with the items below.

```
>
>diff --git a/mailnews/extensions/newsblog/content/feed-parser.js b/mailnews/extensions/newsblog/content/feed-parser.js
>+    let descNode = this.childByTagNameNS(channel, FeedUtils.RSS_NS, "description");
>+    feed.description = this.getNodeValue(descNode) || "";
```
This should still be getNodeValueFormatted().
```
>+      let linkNode = this.childByTagNameNS(itemNode, FeedUtils.RSS_NS, "link");
>+      item.url = this.getNodeValue(linkNode) || itemNode.getAttribute("about");
```
linkNode value going through validLink() got dropped :/
As for the about: uri value, it could be an urn: value. But it still needs to be spoof proofed, so please do this:
```this.removeUnprintableASCII(abouturi.trim());```
on the about: attribute uri value rather than adding urn: (or whatever else the rdf spec may say) to validLink() and polluting it.

>--- /dev/null
>+++ b/mailnews/extensions/newsblog/test/unit/test_feedparser.js
>@@ -0,0 +1,119 @@
>+// see https://www.w3.org/2000/10/rdf-tests/ for test files (including rss1)
>+// - test examples for all feed types
>+// - test items in test_download()
>+// - test rss1 feed with itunes `new-feed-url` redirect
>+// - test rss1 feed with RSS syndication extension tags (updatePeriod et al)
>+// - test multiple/missing authors (with fallback to feed title)
>+// - test missing dates
>+// - test content formatting
>+
>+// Some RSS1 feeds in the wild:
>+// https://www.livejournal.com/stats/latest-rss.bml
>+// https://journals.sagepub.com/action/showFeed?ui=0&mi=ehikzz&ai=2b4&jc=acrc&type=etoc&feed=rss
>+

nice, please add some more:
https://www.revolutionspodcast.com/index.rdf
https://www.tandfonline.com/feed/rss/uasa20
http://export.arxiv.org/rss/astro-ph

The tests properly differentiate between unit and end-to-end now. aceman, could you take a look at the tests as well please, to make sure they're how you like?
Attachment #9059999 - Flags: review?(alta88)
Attachment #9059999 - Flags: review+
Attachment #9059999 - Flags: feedback?(acelists)

summary of changes:

  • feed description now uses GetNodeValueFormatted() (oops :-)
  • the item url was being passed through validUrl(), but it happened further down and wasn't very clear. I've moved it up for clarity.
  • the item about uri is now passed through removeUnprintableASCII() to sanitise it.
  • added the extra rss1 feed examples as comments in the feedparser test.

(I'll wait for aceman's feedback on the tests, then we should be able to land it)

Attachment #9059999 - Attachment is obsolete: true
Attachment #9059999 - Flags: feedback?(acelists)
Attachment #9060288 - Flags: review+
Attachment #9060288 - Flags: feedback?(acelists)
Comment on attachment 9060288 [details] [diff] [review]
part-one--update-rss1-parsing-5.patch

Review of attachment 9060288 [details] [diff] [review]:
-----------------------------------------------------------------

Seems fine to me and the test passes on Linux.

::: mailnews/extensions/newsblog/test/unit/resources/README.md
@@ +1,4 @@
> +Data files for unit testing the feeds code.
> +
> +- `rss_7_1.rdf`        - a rss1.0 feed (with a search widget we ignore)
> +- `rss_7_1_BORKED.rdf` - a rss1.0 feed with a bad <items> list, pointing

Does the 7_1 mean anything, while it is a RSS1.0 version? Also should the RSS be capitalized as in the case RSS2 below?

::: mailnews/extensions/newsblog/test/unit/resources/rss_7_1.rdf
@@ +4,5 @@
> +     http://groups.yahoo.com/group/rss-dev/files/specification.html
> +     Section 7
> +  -->
> +
> +<rdf:RDF 

Many trailing spaces in this file.

::: mailnews/extensions/newsblog/test/unit/resources/rss_7_1_BORKED.rdf
@@ +4,5 @@
> +     http://groups.yahoo.com/group/rss-dev/files/specification.html
> +     Section 7
> +  -->
> +
> +<rdf:RDF 

Many trailing spaces in this file.

::: mailnews/extensions/newsblog/test/unit/test_feedparser.js
@@ +18,5 @@
> +// Helper to compare feeditems.
> +function assertItemsEqual(got, expected) {
> +  Assert.equal(got.length, expected.length);
> +  for (let i = 0; i < expected.length; i++) {
> +    // only check fields in expected. Means testdata can exclude "description" and other bulky fields.

Only.
Attachment #9060288 - Flags: feedback?(acelists) → feedback+

(In reply to :aceman from comment #39)

::: mailnews/extensions/newsblog/test/unit/resources/README.md
@@ +1,4 @@

+Data files for unit testing the feeds code.
+
+- rss_7_1.rdf - a rss1.0 feed (with a search widget we ignore)
+- rss_7_1_BORKED.rdf - a rss1.0 feed with a bad <items> list, pointing

Does the 7_1 mean anything, while it is a RSS1.0 version? Also should the
RSS be capitalized as in the case RSS2 below?

rss_7_1.rdf is taken directly from the W3C rdf examples. I've updated the README.
It came with all the trailing spaces, and I just included it verbatim.

Attachment #9060288 - Attachment is obsolete: true
Attachment #9060566 - Flags: review+

Rebased to cope with the big google-style code reformatting.

Attachment #9060000 - Attachment is obsolete: true
Attachment #9060000 - Flags: review?(alta88)
Attachment #9060568 - Flags: review?(alta88)
Keywords: checkin-needed
Keywords: leave-open

Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/d33c9556acd4
Parse RSS1 using custom XMLDocument-based parser, rather than RDF. r=alta88

Keywords: checkin-needed
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/ce4e95d15d05
Follow-up: Fix linting error. rs=bustage-fix DONTBUILD
Comment on attachment 9060568 [details] [diff] [review]
part-two--remove-nsirdf-use-4.patch

There are some things that should have been architected out first.

1. The JSONFile async api should be used here, no wheel reinvention. See SessionStoreManager.jsm for a Tb implementation. 

2. There is already a server uri keyed cache for feed url status on the FeedUtils object, and the JSONFile store should be a child object, ie _storeSubscriptions and _storeItems, just like is currently done with the ..DS files. For regular polling, any changes that need to be persisted can be applied using saveSoon() and in cases of user updates that need to be sync, await _save() can be used, else a short delay when JSONFile is contructored.

3. Direct xpcom instantiation should be avoided, there are almost certainly wrapped methods in toolkit for most things; can't OS.File be used to read in a raw file here?
+  loadXML(file) {
+    // Read in file contents.

4. Please do not remove comments unless they no longer apply. Do not remove debug either. Don't change function names gratuitously, ie getSubscriptionsList() doesn't need to change to getSubscriptions(). The goal in a large conversion patch like this is to *create as few diff lines as possible*.

5. Please do not change args (feed from aFeed) in this kind of patch (I let it go before). If you want whitespace/format type changes, it should be done in a separate patch, either before or after the main substance, across the whole file to be consistent, and should not alter blame or pollute history. In the case here, you're touching so many functions that it should all just be converted to the new policy for arg names.

6. Did you smoketest this patch? For example, in rdfToItems() there's a typo assigning from "testContent" which can't have worked. Minimally, you need to ensure the below work before asking for review.
- Create a feed account
- Subscribe a feed, ensure no dupes later
- Delete a feed and check that the files are clean (this was a big fail in the original code)
- Test with utf8 stuff (urls and content)
- Rename folders, have a 2-3 level subfolder and rename the parent, etc
- Edit all options for a feed in Subscribe dialog
- Export to and and reimport from an ompl file
- Forward a feed message in compose
- Change global summary/webpage options and ensure display is as expected

7. Another way of converting feeds.rdf, rather than directly, is to export it to opml and then reimport into the json file. The advantage is that the latter is going to be the way going forward, while rdf to json is one time use dead code. Thoughts?
Attachment #9060568 - Flags: review?(alta88)

Wow, looks like this is the last issue standing in the way of killing RDF. Is this targeted for 68?

Flags: needinfo?(benc)

Yes, it is. We don't want to carry around the RDF code in comm/rdf/ any more. Sadly there has been no progress for a month now :-(

Here is a new patch (Finally! Sorry it's taken so long, but for various reasons I've been mostly away over the last few weeks and I'll be traveling for the next few weeks. But I will be checking in on this bug regularly - this bug is the logjam holding up the final rdf code removal).

This one should be a lot more readable - It's still pretty big, but I've gone through and removed a lot of the noise. I've split out the unit tests into a separate patch.

Attachment #9060568 - Attachment is obsolete: true
Flags: needinfo?(benc)
Attachment #9069270 - Flags: review?(alta88)

Just to address this stuff as it applies to the new patch:

(In reply to alta88 from comment #44)

  1. The JSONFile async api should be used here, no wheel reinvention. See
    SessionStoreManager.jsm for a Tb implementation.

Fantastic! Had no idea this existed, but it's exactly what I wanted. New patch uses it.

  1. There is already a server uri keyed cache for feed url status on the
    FeedUtils object, and the JSONFile store should be a child object, ie
    _storeSubscriptions and _storeItems, just like is currently done with the
    ..DS files.

I'm a little nervous how the FeedUtils object has random URLs as top-level attribute names. But leaving it as it reduces a bit of diff noise, and it can be finessed in a followup if needed.

  1. Direct xpcom instantiation should be avoided, there are almost certainly
    wrapped methods in toolkit for most things; can't OS.File be used to read in
    a raw file here?
  • loadXML(file) {

I originally wanted to use OS.File, but we're in synchronous code (I don't think await helps here, as the events aren't serviced until after the function returns, but I could be missing some JS knowledge here).
In the end I switched to using the same idiom JSONFile.jsm uses for synchronous loading. It's still essentially direct xpcom instantiation, but with a slight veneer of respectability :-)

  1. Please do not remove comments unless they no longer apply. Do not remove
    debug either.

This patch should be a lot better on that front.

  1. Did you smoketest this patch? For example, in rdfToItems() there's a typo
    assigning from "testContent" which can't have worked. Minimally, you need to
    ensure the below work before asking for review.
  • Create a feed account
  • Subscribe a feed, ensure no dupes later
  • Delete a feed and check that the files are clean (this was a big fail in
    the original code)
  • Test with utf8 stuff (urls and content)
  • Rename folders, have a 2-3 level subfolder and rename the parent, etc
  • Edit all options for a feed in Subscribe dialog
  • Export to and and reimport from an ompl file
  • Forward a feed message in compose
  • Change global summary/webpage options and ensure display is as expected

This is a good list - I used it as a starting point for manually testing this patch, and it'd be a good roadmap for further unit tests too.
I did find a couple of bugs, but they already occur without this patch, so I'll write them up separately.

  1. Another way of converting feeds.rdf, rather than directly, is to export
    it to opml and then reimport into the json file. The advantage is that the
    latter is going to be the way going forward, while rdf to json is one time
    use dead code. Thoughts?

I think I'm missing something here. We still need something to parse legacy feeds.rdf and feeditems.rdf. I don't see how the opml step helps with this.
Anyway in this new patch there's a lot less file-handling code (thanks, JSONFIle.jsm!).

Hardly comprehensive, but a start at least. Mainly just checks up on the feeds/feeditem parsing.

Attachment #9069272 - Flags: review?(alta88)

This looks like it's on a much better track. Unfortunately, I'm away (eu) for 5 weeks and don't have a rig with a dev/source environment so can't apply the patch/build. I'll add some more specific comments soon.

Type: enhancement → task

(In reply to Ben Campbell from comment #48)

Just to address this stuff as it applies to the new patch:

(In reply to alta88 from comment #44)

  1. There is already a server uri keyed cache for feed url status on the
    FeedUtils object, and the JSONFile store should be a child object, ie
    _storeSubscriptions and _storeItems, just like is currently done with the
    ..DS files.

I'm a little nervous how the FeedUtils object has random URLs as top-level attribute names. But leaving it as it reduces a bit of diff noise, and it can be finessed in a followup if needed.

Yes, better for noise reduction but also because there's no technical reason it matters. However, I do agree it is better organization so a followup would be great.

  1. Direct xpcom instantiation should be avoided, there are almost certainly
    wrapped methods in toolkit for most things; can't OS.File be used to read in
    a raw file here?
  • loadXML(file) {

I originally wanted to use OS.File, but we're in synchronous code (I don't think await helps here, as the events aren't serviced until after the function returns, but I could be missing some JS knowledge here).
In the end I switched to using the same idiom JSONFile.jsm uses for synchronous loading. It's still essentially direct xpcom instantiation, but with a slight veneer of respectability :-)

But still not nearly enough ;) There's no reason not to future proof xpcom by killing our direct use of it. Like with the existing method in getItemsFile(aServer) using xhr (the third arg of .open() of |false| means be sync). Or even better use await fetch() similar to the implementation here: https://searchfox.org/comm-central/source/mail/base/content/msgHdrView.js#1806.

alta88: any expectation on when you can finish up the review?
We're not going to keep rdf for the 68 cycle - and intend to land the removal on 68. We want to get as much beta testing as possible, so would like to get it killed asap.

Severity: normal → blocker
Flags: needinfo?(alta88)
Version: unspecified → Trunk

This bug is not a blocker. No bug that depends on it is a blocker, including Bug 420506. This bug and kill-rdf is "nice to have", period.

Severity: blocker → normal
Flags: needinfo?(alta88)
Comment on attachment 9069270 [details] [diff] [review]
part-two--remove-nsirdf-use-5.patch

Please submit a patch addressing comment 51.
Attachment #9069270 - Flags: review?(alta88) → review-
Attachment #9069272 - Flags: review?(alta88)

Please note "We're not going to keep rdf for the 68 cycle" = blocker, and must be addressed during the beta period ASAP

As I'm pretty sure you know, that's not the definition of a blocker. I'm also going to assume you're not intentionally making the mistake of attempting to use that flag as a pressure tactic.

Just for clarity, I'm not going to rubber stamp an incorrect, suboptimal, non future proof, untested, regression fodder patch just to rush something like this in. There's too much of that going on already. It's best we discuss anything further to do with the management of this patch offline.

And see comment 50.

Whilst I generally agree with alta88 that it's more like "nice to have", we suspect that bug 1543183 might be fixed once RDF is out of the way.

Getting too late to take this for 68.

(In reply to alta88 from comment #51)

But still not nearly enough ;) There's no reason not to future proof xpcom by killing our direct use of it. Like with the existing method in getItemsFile(aServer) using xhr (the third arg of .open() of |false| means be sync). Or even better use await fetch() similar to the implementation here: https://searchfox.org/comm-central/source/mail/base/content/msgHdrView.js#1806.

(apologies for the slow response - I ended up on the road and mostly offline for the last few weeks)

I had another go at using OS.FIle, but I couldn't get it working. I think you just can't use await/async in non-async code (at least, not without some serious event-loop servicing magic). It'd be a matter of changing the feeds code to use async functions all the way up, and that is way beyond the scope of this work.

So I went with the XMLHttpRequest approach. Less code than the xpcom version. I think it's the best we can do now, without doing an entire feeds overhaul using async.

Attachment #9069270 - Attachment is obsolete: true
Attachment #9080849 - Flags: review?(alta88)
Attachment #9069272 - Attachment is obsolete: true

xhr is good enough. Now to migration issues:

  1. Once the conversion code runs, that's it for the 2 dbs, and there's no going back. So there should be a big relnote for people who might try to use older Tb versions on an updated profile, even with this disallowed now - their feeds will all be gone.
  2. The conversion code is pretty intertwined right now with code that will read the json dbs going forward. We don't want that, we want to rip out the legacy db code (and rdf) after a cycle. So updates should waterfall to 76 (?), which does the conversion, and then in 82 (?) esr it should be removed. This single use code should run in migrateAtProfileStartup() and be self contained for easy removal later. Thoughts?

Most things look ok, closer look is upcoming.

(In reply to alta88 from comment #61)

xhr is good enough. Now to migration issues:

  1. Once the conversion code runs, that's it for the 2 dbs, and there's no going back. So there should be a big relnote for people who might try to use older Tb versions on an updated profile, even with this disallowed now - their feeds will all be gone.

Good point. By relNote, do you mean the TB release notes?
My gut feeling (ie not backed up by any evidence ;-) is that transporting profiles back to an earlier version of TB is likely to be a pretty rare occurrence, so as long as it's documented somewhere that's probably enough.

  1. The conversion code is pretty intertwined right now with code that will read the json dbs going forward. We don't want that, we want to rip out the legacy db code (and rdf) after a cycle. So updates should waterfall to 76 (?), which does the conversion, and then in 82 (?) esr it should be removed. This single use code should run in migrateAtProfileStartup() and be self contained for easy removal later. Thoughts?

a) I'd agree that making the migration a more self-contained step would be nicer. I shall look into it.
b) Will we really be able to drop support for the rdf->json migration at some point? I'm thinking there are likely to be a lot of people (and/or companies) who might go 5 or 10 years between upgrades... (again, something it'd be nice to collect some evidence about)

Well, the goal is to drop rdf sooner. They can't migrate to the rdf free version until they upgrade to the conversion version first, no jumping that. It's called waterfall, for the linear software development process. It will be necessary to get wsm involved in the details; a relatively recent release was done this way I believe.

I think you mean watershed, not fall.
Anyway, which migration code are we talking about?

Updating more than one major release at the time is... if not completely unsupported, then at least discouraged. There have been incompatible changes over the years so you do risk messing up your data.

Yes, watershed, I'm glad you know about it. Otherwise I don't know what the rest of the comment is supposed to mean or what conclusion to draw from it.

You can't go from Fx17 to Fx68 in one update, the path is 17 -> 45.02 -> 57.04 -> 68.01. If you don't manage it that way, you can't ever remove rdf.

This one moves all the feeds.rdf/feeditems.rdf migration out into MailMigrator.jsm.
No major functional changes from the previous patch.
I don't have a strong opinion either way here. On one hand it's nice moving all the one-shot migration code out, but on the other hand I found there is potential for tricky timing issues if you're not careful (some of the feeds code is invoked before the migration check is run, so if you write out json files immediately, the migration will get confused).
Anyway, let me know what you think.

Attachment #9080849 - Attachment is obsolete: true
Attachment #9080850 - Attachment is obsolete: true
Attachment #9080849 - Flags: review?(alta88)
Attachment #9084588 - Flags: review?(alta88)

Yes, very nice to easily dump that migration code. The biff for feeds to download does not start for 1 minute after startup, so that should not be an issue. What is requesting the feeds.rdf file is favicon resolution, and the code is designed to lazily create the file (now feeds.json). If the migrator does not block the UI paint (maybe it should), then favicon resolution will fail if there's an empty feeds.json. So what migration should do, when it finishes, is delete the favicon cache and invalidate() the folder pane tree, so favicons can be fetched again.

I've been trying to replicate the timing issue I thought I had saw, and I now I can't. The migration always completes waaay before any feeds code is invoked. So I'm rather thinking now that I was confusing it with something else.

It looks like MailMigrator is invoked via the final-ui-startup notification, which is triggered just before the first window for the application is displayed. So I don't think we have to worry about UI paint.
Which is what you'd expect - profile migration should be handled before any UI stuff happens.
So with that in mind, unless we can establish a code path where the favicon might get fetched before the migration runs I think we should leave the favicon cache alone.

I'm pretty hazy on the UI side of startup, so maybe there are some paths via which favicons get fetched before the app window opens? What do you think?

So, alta88, when do you expect to get to the review?

Attachment #9084588 - Flags: review?(alta88) → review?(mkmelin+mozilla)
Comment on attachment 9084588 [details] [diff] [review]
part-two--remove-nsirdf-use-7.patch

Review of attachment 9084588 [details] [diff] [review]:
-----------------------------------------------------------------

I'll check it more later once these comments have been addressed

::: mail/base/modules/MailMigrator.jsm
@@ +340,5 @@
>    /* eslint-enable complexity */
>  
>    /**
> +   * RSS subscriptions and items used to be stored in .rdf files,
> +   * but now we've changed to use JSON files instead.

nit: please use the full 80ch for comments

@@ +358,5 @@
> +      }
> +    }
> +
> +    // For each one...
> +    for (let i = 0; i < rssServers.length; i++) {

can just use for (let server of rssServers)

@@ +361,5 @@
> +    // For each one...
> +    for (let i = 0; i < rssServers.length; i++) {
> +      let server = rssServers[i];   // nsIMsgIncomingServer
> +      let rssServer = server;
> +      rssServer.QueryInterface(Ci.nsIRssIncomingServer);

please assigned the QI result at the same time. I think just doing QI is not guaranteed to change the current object in some edge cases.

@@ +366,5 @@
> +
> +      // Convert feeds.rdf to feeds.json (if needed).
> +      let feedsFile = rssServer.subscriptionsPath;
> +      let legacyFeedsFile = server.localPath;
> +      legacyFeedsFile.append("feeds.rdf");

Briefer code is esier to read, so

let legacyFeedsFile = server.localPath.append("feeds.rdf");

@@ +377,5 @@
> +      }
> +      // Convert feeditems.rdf to feeditems.json (if needed).
> +      let itemsFile = rssServer.feedItemsPath;
> +      let legacyItemsFile = server.localPath;
> +      legacyItemsFile.append("feeditems.rdf");

same thing here

@@ +413,5 @@
> +
> +    let feeds = [];
> +    // Skip the fz:root->fz:feeds->etc structure. Just grab fz:feed nodes.
> +    let feedNodes = doc.documentElement.getElementsByTagNameNS(this.FZ_NS, "feed");
> +    if (feedNodes.length < 1) {

== 0
But, is no subscriptions an error? What if you had subscriptions but removed them all?

@@ +419,5 @@
> +    }
> +
> +    let toBool = function(val) { return (val == "true"); };
> +
> +    // map RDF feed property names to js.

Map

@@ +474,5 @@
> +   *
> +   * @param {nsIFile} legacyFile  - Location of the rdf file.
> +   * @param {nsIFile} jsonFile    - Location for the output JSON file.
> +   * @returns {void}
> +   * @throws exception if conversion fails.

js only ever throws Error

@@ +505,5 @@
> +      return Number.isNaN(t) ? 0 : t;
> +    };
> +
> +    let itemNodes = doc.documentElement.getElementsByTagNameNS(this.RDF_SYNTAX_NS, "Description");
> +    if (itemNodes.length < 1) {

== 0
But here too, is that an error?

@@ +570,5 @@
>     * been loaded.
>     */
>    migrateAtProfileStartup() {
>      this._migrateUI();
> +    this._migrateRSS();

please make sure to only migrate once (move it into the _migrateUI and bump the ui version)

::: mailnews/extensions/newsblog/content/Feed.js
@@ +402,4 @@
>    invalidateItems() {
>      let ds = FeedUtils.getItemsDS(this.server);
> +    for (let id in ds.data) {
> +      let item = ds.data[id];

for (let item of ds.data) ?

@@ +415,5 @@
> +   * Discards invalid items (in the feed item store) associated with the
> +   * feed. There's a delay - invalid items are kept around for a set time
> +   * before being purged.
> +   *
> +   * @param {Boolean} aDeleteFeed - is the feed being deleted (bypasses

boolean

@@ +427,3 @@
>      let currentTime = new Date().getTime();
> +    for (let id in ds.data) {
> +      let item = ds.data[id];

for (let item of ds.data) ?

@@ +432,3 @@
>          continue;
>        }
> +      let lastSeenTime = item.lastSeenTime;

let lastSeenTime = item.lastSeenTime || 0;

::: mailnews/extensions/newsblog/content/FeedUtils.jsm
@@ +18,5 @@
>  Services.scriptloader.loadSubScript("chrome://messenger-newsblog/content/FeedItem.js");
>  Services.scriptloader.loadSubScript("chrome://messenger-newsblog/content/feed-parser.js");
>  
>  var FeedUtils = {
> +  Items: {},          // The feed items store (indexed by server URI)

Why is this capitalized. I don't see it used either?

@@ +994,5 @@
>  
> +    // Get affected feed subscriptions - all the ones in the original folder.
> +    let origDS = this.getSubscriptionsDS(aOrigFolder.server);
> +    let affectedSubs = origDS.data.filter(sub => sub.destFolder == origURI);
> +    if (!affectedSubs.length) {

if (affectedSubs.length == 0)

@@ +1306,5 @@
> +   *
> +   * @param {String} feedURL              - URL of the feed.
> +   * @param {nsIMsgIncomingServer} server - server holding subscription.
> +   * @param {String} attrName             - Name of attribute to fetch.
> +   * @param {Undefined} value             - Value to store.

lover case string and undefined

::: mailnews/extensions/newsblog/content/newsblogOverlay.js
@@ +145,5 @@
>          // one on the path to bliss.
> +        let folder = aMsgHdr.folder;
> +        showSummary = true;
> +        const ds = FeedUtils.getSubscriptionsDS(folder.server);
> +        for (let sub in ds.data) {

of?
Attachment #9084588 - Flags: review?(mkmelin+mozilla)

New patch, rebased, adapted to new eslint rules and miscellaneous fixups.

A few things I didn't do:

  • one or two for...in loops not changed to to for...of because I need the key as well as the value.
  • I left {String} and {Boolean} jsdoc types as uppercase because eslint complained when I lowercased them.

But the biggie is the Migration handling. It didn't seem right to tie the RSS migration to the UI version number. I don't know the conditions under which the UI version numbers are increased, but more importantly, I think it'd be a little dangerous from a failure-recovery point of view (ie if either the RSS or the UI migration fail, the next run will try to migrate an already partly-migrated account).
I think the "correct" thing would be to add a version number specifically for the RSS migration and handle it separately.
What's there already 'latches' the migration using the existence of the old feed.rdf/feeditems.rdf files (removing them once the migration is successful and making sure to never overwrite an existing .json file). So it's a couple of extra stat() per Feeds account during startup, but I don't think that's too significant from a performance point of view. I think it's fine as is, but adding a version number is more explicit I guess. I'm open to suggestions.

Attachment #9084588 - Attachment is obsolete: true
Attachment #9091332 - Flags: feedback?(mkmelin+mozilla)
Comment on attachment 9091332 [details] [diff] [review]
part-two--remove-nsirdf-use-8.patch

Review of attachment 9091332 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry, doesn't work.
JavaScript error: resource:///modules/MailMigrator.jsm, line 440: TypeError: legacyFeedsFile is undefined

::: mail/base/modules/MailMigrator.jsm
@@ +435,5 @@
> +      let rssServer = server.QueryInterface(Ci.nsIRssIncomingServer);
> +
> +      // Convert feeds.rdf to feeds.json (if needed).
> +      let feedsFile = rssServer.subscriptionsPath;
> +      let legacyFeedsFile = server.localPath.append("feeds.rdf");

append doesn't return anything....

@@ +556,5 @@
> +   * Convert a legacy feeditems.rdf file into feeditems.json.
> +   * If the conversion is successful, the legacy file will be removed.
> +   *
> +   * @param {nsIFile} legacyFile  - Location of the rdf file.
> +   * @param {nsIFile} jsonFile    - Location for the output JSON file.

nit: don't need the extra indention.  That would make it all off if a third param is added in the future. 
 @param {nsIFile} jsonFile - Location for the output JSON file.

::: mailnews/extensions/newsblog/content/FeedUtils.jsm
@@ +1410,5 @@
> +   *
> +   * @param {nsIMsgIncomingServer} aServer - server to fetch item data for.
> +   * @returns {JSONFile}                   - JSONFile, holding array of
> +   *                                         feed subscriptions in its
> +   *                                         data field.

fix this spacing too, and the other documentation comments

@@ +1422,5 @@
> +    let rssServer = aServer;
> +    rssServer.QueryInterface(Ci.nsIRssIncomingServer);
> +    let feedsFile = rssServer.subscriptionsPath; // Path to feeds.json
> +    let exists = feedsFile.exists();
> +    ds = new JSONFile({ path: feedsFile.path });

let ds here instead of above

@@ +1495,5 @@
> +    let rssServer = aServer;
> +    rssServer.QueryInterface(Ci.nsIRssIncomingServer);
> +    let itemsFile = rssServer.feedItemsPath; // Path to feeditems.json
> +    let exists = itemsFile.exists();
> +    ds = new JSONFile({ path: itemsFile.path });

let ds

::: mailnews/local/public/nsIRssIncomingServer.idl
@@ +7,5 @@
>  interface nsIFile;
>  
>  [scriptable, uuid(6d744e7f-2218-45c6-8734-998a56cb3c6d)]
>  interface nsIRssIncomingServer : nsISupports {
> +  // Path to the subscriptions file for this RSS server

add . at end of sentence

@@ +13,2 @@
>  
> +  // Path to the feed items file for this RSS server

and here
Attachment #9091332 - Flags: feedback?(mkmelin+mozilla) → feedback-
Attachment #9091332 - Attachment is obsolete: true
Attachment #9093516 - Flags: review?(mkmelin+mozilla)
Attachment #9060566 - Attachment description: part-one--update-rss1-parsing-6.patch → [checked in] part-one--update-rss1-parsing-6.patch
Comment on attachment 9093516 [details] [diff] [review]
part-two--remove-nsirdf-use-9.patch

Review of attachment 9093516 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/extensions/newsblog/content/FeedUtils.jsm
@@ -1606,4 @@
>      }
> -    if (!this[aServer.serverURI]) {
> -      this[aServer.serverURI] = {};
> -    }

you need this part too. I get an error now about "this[aServer.serverURI] is undefined" for setting .FeedsDS on it later. Apparently for the top level folder.

Not sure how badly the error affects anything.

@@ +1488,5 @@
>        return this[aServer.serverURI].FeedItemsDS;
>      }
>  
> +    let rssServer = aServer;
> +    rssServer.QueryInterface(Ci.nsIRssIncomingServer);

Please assign to the QI result. (Usually works without it, but there are edge cases where it doesn't and it's also very unintuitive in javascript that it would change the object internally.)

let rssServer = aServer.QueryInterface(Ci.nsIRssIncomingServer);

@@ -1669,5 @@
> -    // its "loaded" property to be sure.  You can also attach an observer
> -    // which will get notified when the load is complete.
> -    if (!this[aServer.serverURI]) {
> -      this[aServer.serverURI] = {};
> -    }

didn't get an error from this removal. but might be worth keeping the check still?
Attachment #9093516 - Flags: review?(mkmelin+mozilla) → review-

(In reply to Magnus Melin [:mkmelin] from comment #74)

Comment on attachment 9093516 [details] [diff] [review]
part-two--remove-nsirdf-use-9.patch

  • if (!this[aServer.serverURI]) {
  •  this[aServer.serverURI] = {};
    
  • }
    you need this part too. I get an error now about "this[aServer.serverURI] is
    undefined" for setting .FeedsDS on it later. Apparently for the top level
    folder.
    Not sure how badly the error affects anything.

Oh, man. That is an embarrassing error :-(
Oddly, I never encountered it. On my system the getStatus() method always gets called first, and that does fill in the blank object, so the error never triggered for me.
Fixed in next patch.

  • let rssServer = aServer;
  • rssServer.QueryInterface(Ci.nsIRssIncomingServer);

Please assign to the QI result. (Usually works without it, but there are
edge cases where it doesn't and it's also very unintuitive in javascript
that it would change the object internally.)

I find QI really confusing in javascript - all the information I can find implies that the QI call changes the object in-place... yet when I change it to your suggestion:

let rssServer = aServer.QueryInterface(Ci.nsIRssIncomingServer);

aServer still functions as an nsIMsgIncomingServer. I'm pretty sure there's no way for a method to act differently based on how the return value is used, so there's some other magic going on. I'm guessing QI applies the members of the new interface on top of whatever's already there.
(nsIRssIncomingServer is not an nsIMsgIncomingServer, it's derived direcly from nsISupports).

Anyway, it works and reads better, so changes made in next patch!

Attachment #9093516 - Attachment is obsolete: true
Attachment #9093749 - Flags: review?(mkmelin+mozilla)
Comment on attachment 9093749 [details] [diff] [review]
part-two--remove-nsirdf-use-10.patch

Review of attachment 9093749 [details] [diff] [review]:
-----------------------------------------------------------------

Bitrotted.
Unbitrotted and tried it out, but at least some feeds are not migrated. I'll send you my testcase feeds.rdf
Attachment #9093749 - Flags: review?(mkmelin+mozilla)

Thanks for the feeds file Magnus.
This latest patch should fix the issues you had. The migration was being too pedantic about which values it required - two of your feeds had no lastModified timestamp set.

You're welcome to review it again, but all these little issues cropping up are really concerning me. I'm going to step back and try to devise some properly-torturous unit tests.

Attachment #9093749 - Attachment is obsolete: true

Some migration tests. Covers most of the issues the migrations part has encountered so far.
Still worried about the lack of tests on the feeds code proper, but I think we've just got to press on and if issues crop up I'll try and add tests for them. So I shall set r? on the previous patch now. Not reviewing this one yet - I'll wait and see if there's any more to add to it first.

Attachment #9094103 - Flags: review?(mkmelin+mozilla)
Comment on attachment 9094339 [details] [diff] [review]
part-three--add-migration-tests-1.patch

Review of attachment 9094339 [details] [diff] [review]:
-----------------------------------------------------------------

I think the test files are missing a license boilerplate at the top. Otherwise looks ok to me
Attachment #9094339 - Flags: review+
Comment on attachment 9094103 [details] [diff] [review]
part-two--remove-nsirdf-use-11.patch

Review of attachment 9094103 [details] [diff] [review]:
-----------------------------------------------------------------

I think this is ready to land now! r=mkmelin
Attachment #9094103 - Flags: review?(mkmelin+mozilla) → review+
Attachment #9094339 - Attachment is obsolete: true
Attachment #9094774 - Flags: review+
Keywords: checkin-needed

Added license lines to the new unit test, so both patches ready to land now.

Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/4c02ab544b65
Remove nsIRDF* use from newsfeed support. r=mkmelin
https://hg.mozilla.org/comm-central/rev/d8d02940a5f2
Add unit test to cover feeds config migration. r=mkmelin

Keywords: checkin-needed

It looks like closing this is the right thing to do (that is, the leave-open wasn't intended for now) so that's what I'm doing.

Status: ASSIGNED → RESOLVED
Closed: 5 months ago
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 71.0

Anyone noticed the test failure on Windows :-(
TEST-UNEXPECTED-FAIL | comm/mailnews/extensions/newsblog/test/unit/test_rdfmigration.js | xpcshell return code: 0

I'll take a look since you're not on Windows. Maybe something easy to do with LF vs CRLF or / vs \

Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/c5a673e520e1
Fix test_rdfmigration.js on Windows. rs=bustage-fix DONTBUILD

OK, it was a / vs \ mistake. do_get_file() takes a Linux path with / on all platforms, so using OS.Path.join() introduced \ and it failed.

(In reply to Jorg K (GMT+2) from comment #88)

OK, it was a / vs \ mistake. do_get_file() takes a Linux path with / on all platforms, so using OS.Path.join() introduced \ and it failed.

Oof. Sorry about that one. And I'd specifically changed it to use OS.Path.join() instead :-(

Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/e0bb49101c46
Follow-up: Reformat. rs=reformat
You need to log in before you can comment on or make changes to this bug.