Closed Bug 786638 Opened 7 years ago Closed 5 years ago

Implement testing infrastructure for reader mode

Categories

(Firefox for Android :: Reader View, defect, P1)

All
Android
defect

Tracking

()

RESOLVED FIXED
Firefox 36

People

(Reporter: lucasr, Assigned: Margaret)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file, 13 obsolete files)

77.45 KB, patch
bnicholson
: review+
Details | Diff | Splinter Review
We should at least cover a list of pages and check if reader mode enabled or not as expected.
Priority: -- → P1
OS: Linux → Android
Hardware: x86 → All
Attachment #665359 - Flags: review?(gbrown)
Attachment #665360 - Flags: review?(mark.finkle)
Attachment #665363 - Flags: review?(mark.finkle)
These patches add all the necessary bits to run reader mode tests on top of a local pageset. The reader_mode_sites.json (the "manifest" for reader mode tests) in the patch is just a sample. I'm working on the final pageset zip file (removing all remote access from each page) and manifest. It should be possible to review the patches without them anyway.

My assumption here is that the test setup routine will download and unzip the pageset zip into objdir/_test/testing/mochitests/tests/robocop/sites/. I really want to avoid pushing this zip to m-c as it might get a bit too big. jmaher, geoffbrown, let me know what you think.
Comment on attachment 665359 [details] [diff] [review]
Add getVisibility() method to FennecNativeElement

Review of attachment 665359 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM
Attachment #665359 - Flags: review?(gbrown) → review+
Comment on attachment 665363 [details] [diff] [review]
Add testReaderMode conformance test for Reader Mode

Review of attachment 665363 [details] [diff] [review]:
-----------------------------------------------------------------

Some drive-by review comments....

::: mobile/android/base/tests/testReaderMode.java.in
@@ +17,5 @@
> +
> +import org.json.JSONArray;
> +import org.json.JSONException;
> +import org.json.JSONObject;
> +

Please add a brief header comment here -- similar to the one in, say, testAboutPage.

@@ +19,5 @@
> +import org.json.JSONException;
> +import org.json.JSONObject;
> +
> +public class testReaderMode extends PixelTest {
> +    private static String MANIFEST_PATH = "/mnt/sdcard/reader_mode_sites.json";

Oh darn, hard-coded /sdcard is back -- is there any chance of using deviceroot instead?

@@ +65,5 @@
> +
> +            for (int i = 0; i < readable.length(); i++) {
> +                String url = readable.getString(i);
> +                loadUrl(getAbsoluteUrl(PAGE_PREFIX + url));
> +                mActions.expectGeckoEvent("Content:ReaderEnabled").blockForEvent();

There's a race here: Better to call expectGeckoEvent before loadUrl, then call blockForEvent after.

@@ +74,5 @@
> +
> +            for (int i = 0; i < unreadable.length(); i++) {
> +                String url = unreadable.getString(i);
> +                loadUrl(getAbsoluteUrl(PAGE_PREFIX + url));
> +                mActions.expectGeckoEvent("Content:ReaderDisabled").blockForEvent();

...and here too.
Here's the patch with the real reader_mode_sites.json file.
Attachment #665362 - Attachment is obsolete: true
Attachment #665362 - Flags: review?(jmaher)
Attachment #665916 - Flags: review?(jmaher)
I removed all page assets (images, js, css, etc) because they are not relevant when testing reader mode bits. I tweaked all pages to avoid access to any remote resource.
Ooops, this one properly uses deviceroot in the script.
Attachment #665916 - Attachment is obsolete: true
Attachment #665916 - Flags: review?(jmaher)
Attachment #665938 - Flags: review?(jmaher)
Here's a new version of the patch covering gbrown's review.
Attachment #665363 - Attachment is obsolete: true
Attachment #665363 - Flags: review?(mark.finkle)
Attachment #665939 - Flags: review?(mark.finkle)
Attachment #665939 - Flags: feedback?(gbrown)
Attachment #665360 - Flags: review?(mark.finkle) → review+
Comment on attachment 665939 [details] [diff] [review]
Add testReaderMode conformance test for Reader Mode

Looks good to me. Those seem like the right events to block on.
Attachment #665939 - Flags: review?(mark.finkle) → review+
Comment on attachment 665939 [details] [diff] [review]
Add testReaderMode conformance test for Reader Mode

Review of attachment 665939 [details] [diff] [review]:
-----------------------------------------------------------------

This looks great now.

::: mobile/android/base/tests/testReaderMode.java.in
@@ +44,5 @@
> +        return json;
> +    }
> +
> +    private void testOneUrl(String url, String expectedEvent, boolean expectedVisibility) {
> +        Log.d(this.getClass().getSimpleName(), "Testing page: " + url);

If you want to see this in the test output (as well as logcat) you can use mAsserter.dumpLog() -- but it's fine as it is too.
Attachment #665939 - Flags: feedback?(gbrown) → feedback+
Comment on attachment 665938 [details] [diff] [review]
Install reader_mode_sites.json file to be used by reader mode tests

Review of attachment 665938 [details] [diff] [review]:
-----------------------------------------------------------------

this is good and simple.
Attachment #665938 - Flags: review?(jmaher) → review+
Depends on: 796068
FYI: filed bug 796068 to track the needed test infra work to land these patches.
Lucas, what happened to this bug? Are there any parts we can land?
Blocks: readerv2
Flags: needinfo?(lucasr.at.mozilla)
(In reply to :Margaret Leibovic from comment #16)
> Lucas, what happened to this bug? Are there any parts we can land?

IIRC, this is blocked on testing infrastructure (see bug 796068). These tests need a zip file with the local pages to be deployed before the test runs.
Flags: needinfo?(lucasr.at.mozilla)
I can take this over to try to land something. The world has changed since these tests were written, and I think we could write a test purely in JS, similar to what we're already doing in testReadingListCache.
Assignee: lucasr.at.mozilla → margaret.leibovic
Instead of storing test sites in a zip file, which is what caused us to be blocked by bug 796068, this patch just adds the sites to a directory in our tests directory.

If we are concerned about the size of these files, we can reduce the set of sites and try to get a zip file solution working, but I feel like it would be better to get a small test suite running in the tree, rather than a larger test suite hanging out in a patch.
Attachment #665359 - Attachment is obsolete: true
Attachment #665360 - Attachment is obsolete: true
Attachment #665918 - Attachment is obsolete: true
Attachment #665938 - Attachment is obsolete: true
Attachment #665939 - Attachment is obsolete: true
Attachment #8524984 - Flags: review?(nalexander)
This updates testReadingListCache to test the set of test articles. 

I removed the arstechnica.com and www.grantland.com pages from the JSON file because they were causing test failures (the former was bug 875076), so I'll need to investigate those problems to see what's going on.

This is passing locally, but here's a try push:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=a332f7baa739
Attachment #8524988 - Flags: review?(nalexander)
Attachment #8524988 - Flags: review?(bnicholson)
(Oops, wrong patch)
Attachment #8524988 - Attachment is obsolete: true
Attachment #8524988 - Flags: review?(nalexander)
Attachment #8524988 - Flags: review?(bnicholson)
Attachment #8524990 - Flags: review?(nalexander)
Attachment #8524990 - Flags: review?(bnicholson)
Comment on attachment 8524984 [details] [diff] [review]
Add test reader mode sites to tree

Review of attachment 8524984 [details] [diff] [review]:
-----------------------------------------------------------------

Technically, this needs a little build system work.

gps raises the question of whether it's legal to distribute this content.  My guess is no, it's not; but we should discuss and get legal input about this.

::: build/mobile/robocop/Makefile.in
@@ +64,5 @@
>    $(wildcard $(TESTPATH)/*.ogg) \
>    $(wildcard $(TESTPATH)/*.mp4) \
>    $(wildcard $(TESTPATH)/*.webm) \
>    $(wildcard $(TESTPATH)/*.swf) \
> +  $(wildcard $(TESTPATH)/reader_mode_sites) \

I'm surprised this works; I doubt it actually does.  wildcard only goes one deep, IIRC; so you're probably getting just the first layer of your tree.  I think we want to use TEST_HARNESS_FILES here, and this is easy; I'll post a patch for you to test and r?.
Attachment #8524984 - Flags: review?(nalexander) → feedback+
Comment on attachment 8524990 [details] [diff] [review]
Test reader mode parsing of test sites

Review of attachment 8524990 [details] [diff] [review]:
-----------------------------------------------------------------

This looks fine, but consider:

The readable/unreadable dichotomy doesn't allow for much future expansion.  If you went with a list of sites, like:

[
  { url: ..., article: ... },
]

you might be able to bake details about the site in to the schema and test more dynamically.  (For example, you could verify that you get a valid title, or you could include the reading time in the future.)

Food for thought.

::: mobile/android/base/tests/reader_mode_sites.json
@@ +1,3 @@
> +{
> +  "readable": [
> +    "/www.wired.com/wiredscience/2012/08/erez-lieberman-aiden/index.html",

This makes it clear we're not fetching from the 'net, which is good; can you think of a way to make it more clear?  Perhaps replace www.wired.com with www_wired_com or mirror/www.wired.com or similar?

::: mobile/android/base/tests/testReadingListCache.js
@@ +62,5 @@
> +
> +  let readable = json.readable;
> +  do_check_neq(readable, null);
> +  for (let path of readable) {
> +    let url = "http://mochi.test:8888/tests/robocop/reader_mode_sites" + path;

Maybe save yourself some future pain and handle leading and trailing / characters explicitly.

@@ +66,5 @@
> +    let url = "http://mochi.test:8888/tests/robocop/reader_mode_sites" + path;
> +    let article = yield Reader._downloadAndParseDocument(url);
> +    do_check_neq(article, null);
> +    do_check_neq(article.content, null);
> +    do_check_eq(article.url, url);

No title extraction?

@@ +82,5 @@
> +function promiseXhr(url) {
> +  return new Promise((resolve, reject) => {
> +    let xhr = Cc["@mozilla.org/xmlextras/xmlhttprequest;1"].createInstance(Ci.nsIXMLHttpRequest);
> +    xhr.open("GET", url, true);
> +    xhr.onerror = e => reject(e);

nit: Prefer including parens around arguments always, like:

(e) => reject(e) ...

It's more consistent with zero argument functions () and two argument functions (e, f).

@@ +89,5 @@
> +        reject(xhr.status)
> +      }
> +      resolve(xhr.responseText);
> +    };
> +    xhr.send(null);

Is there a reason to include null?  This is a GET request.
Attachment #8524990 - Flags: review?(nalexander) → review+
(In reply to Nick Alexander :nalexander from comment #22)
> gps raises the question of whether it's legal to distribute this content. 
> My guess is no, it's not; but we should discuss and get legal input about
> this.

Yeah, IIRC, this was the reason we needed to zip and download the sites separately in bug 796068.
(In reply to Brian Nicholson (:bnicholson) from comment #24)
> (In reply to Nick Alexander :nalexander from comment #22)
> > gps raises the question of whether it's legal to distribute this content. 
> > My guess is no, it's not; but we should discuss and get legal input about
> > this.
> 
> Yeah, IIRC, this was the reason we needed to zip and download the sites
> separately in bug 796068.

Hmm.  Can we find a corpus of CC0 or PD documents that are interesting tests?  Can we obfuscate the existing tests in a reasonable way?  Checking in to the tree is just so much easier than doing the releng distribution dance.
Comment on attachment 8524990 [details] [diff] [review]
Test reader mode parsing of test sites

Review of attachment 8524990 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM assuming we're allowed to distribute these.
Attachment #8524990 - Flags: review?(bnicholson) → review+
(In reply to Nick Alexander :nalexander from comment #25)
> Hmm.  Can we find a corpus of CC0 or PD documents that are interesting
> tests?  Can we obfuscate the existing tests in a reasonable way?  Checking
> in to the tree is just so much easier than doing the releng distribution
> dance.

Regarding obfuscation, the reader parsing algorithm is deeply tied to the page's structure/tags/content. By the time the page is no longer considered copyright infringement, the parsing algorithm will likely output completely different results.

Finding a bunch of PD documents sounds like a good way to go if they're varied enough. I remember hand-picking several of these because each one presented a different issue at some point, so we'd unfortunately lose some thoroughness if we changed the test set.
(In reply to Nick Alexander :nalexander from comment #22)
> Comment on attachment 8524984 [details] [diff] [review]
> Add test reader mode sites to tree
> 
> Review of attachment 8524984 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Technically, this needs a little build system work.
> 
> gps raises the question of whether it's legal to distribute this content. 
> My guess is no, it's not; but we should discuss and get legal input about
> this.
> 
> ::: build/mobile/robocop/Makefile.in
> @@ +64,5 @@
> >    $(wildcard $(TESTPATH)/*.ogg) \
> >    $(wildcard $(TESTPATH)/*.mp4) \
> >    $(wildcard $(TESTPATH)/*.webm) \
> >    $(wildcard $(TESTPATH)/*.swf) \
> > +  $(wildcard $(TESTPATH)/reader_mode_sites) \
> 
> I'm surprised this works; I doubt it actually does.  wildcard only goes one
> deep, IIRC; so you're probably getting just the first layer of your tree.  I
> think we want to use TEST_HARNESS_FILES here, and this is easy; I'll post a
> patch for you to test and r?.

I found that this was indeed working with multi-level directories. But I'm open to suggestions if there's a better way to do this (I just copy/paste whenever I'm working in a Makefile :).

However, I'm going to change up the approach in this bug to just create our own minimal set of HTML testcases that we can for sure land in the tree. I'll start by trying to sanitize/generalize the test sites that we currently have here, keeping the structure/tags/content that Readability.js cares about. And then I want to change the structure of the JSON file we're using here to include details about expected content. This way we can make sure that in addition to finding an article, our algorithm actually finds the right bits inside it.

My goal is to make it easy to add new testcases in the future, so as we fix bugs we can add regression tests for them. I'm not going to try to cover everything in these testcases, since that would likely take too much time and effort :)
Changing my approach here. The first step is to get a basic framework in place that lets us stick an HTML file in the tree and declare what we expect the reader mode article result to be. This patch just keeps the same basic article we're already using in testReadingListCahce, but refactors the test to make it more extensible.
Attachment #8524984 - Attachment is obsolete: true
Attachment #8524990 - Attachment is obsolete: true
Attachment #8525709 - Flags: review?(nalexander)
Attachment #8525709 - Flags: review?(bnicholson)
Comment on attachment 8525709 [details] [diff] [review]
Create basic framework for testing reading list pages

Review of attachment 8525709 [details] [diff] [review]:
-----------------------------------------------------------------

Looks like this needs a rev, based on IRC.

::: mobile/android/base/tests/testReadingListCache.js
@@ +26,2 @@
>  
> +XPCOMUtils.defineLazyGetter(this, "BASIC_ARTICLE", function () {

Why is this required?
Attachment #8525709 - Flags: review?(nalexander)
Attachment #8525709 - Flags: review?(bnicholson)
After spending 5 minutes looking at the testcases, I decided that it would be too much work to try to sanitize these by hand in a way that preserves the HTML structure, since they're either big and messy, or minimized and messy. I think we should try to find some CC/public domain content to test, or just add testcases ourselves as we find bugs. As I mentioned previously, we've been living a long time without these tests, and I don't think this area is prone to regressions unless we start updating Readability.js (which we should actually try to improve over time).

This just adds the addons.mozilla.org and developer.mozilla.org test sites to the tree. Both of them are CC-BY-SA, and I'm hoping that basically having the original URL/content in the tree is attribution enough (also, really, this is just a test, I don't think anyone is going to get mad at us about this).
Attachment #8525711 - Flags: review?(nalexander)
Attachment #8525711 - Flags: review?(bnicholson)
Comment on attachment 8525711 [details] [diff] [review]
(Part 2) Add mozilla.org TLD test sites

Review of attachment 8525711 [details] [diff] [review]:
-----------------------------------------------------------------

Sure.  I scanned and it looks like there are no remote URIs opened at load time, but it's hard to say.  Try build?

::: mobile/android/base/tests/testReadingListCache.js
@@ +22,5 @@
>        content: "Lorem ipsum" // This is only used for test_store_article.
>      }
> +  },
> +  {
> +    url: "http://mochi.test:8888/tests/robocop/reader_mode_pages/addons.mozilla.org/en-US/firefox/index.html",

Let's lift this common prefix to a constant or a function.

@@ +26,5 @@
> +    url: "http://mochi.test:8888/tests/robocop/reader_mode_pages/addons.mozilla.org/en-US/firefox/index.html",
> +    expected: null
> +  },
> +  {
> +    url: "http://mochi.test:8888/tests/robocop/reader_mode_pages/developer.mozilla.org/en/XULRunner/Build_Instructions.html",

Will url != expected.url in test data.  Seems unlikely.  Maybe be less verbose here and just verify you get back the original url?
Attachment #8525711 - Flags: review?(nalexander) → review+
I ended up making all my changes with both patches applies, and I feel too lazy to split them up again, since there's not really that much going on in them :)

Try push:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=e02db7cf4aa8
Attachment #8525709 - Attachment is obsolete: true
Attachment #8525711 - Attachment is obsolete: true
Attachment #8525711 - Flags: review?(bnicholson)
Attachment #8525723 - Flags: review?(nalexander)
Attachment #8525723 - Flags: review?(bnicholson)
Comment on attachment 8525723 [details] [diff] [review]
Create basic framework for testing reading list pages

Review of attachment 8525723 [details] [diff] [review]:
-----------------------------------------------------------------

Looks fine to me!
Attachment #8525723 - Flags: review?(bnicholson) → review+
https://hg.mozilla.org/integration/fx-team/rev/9b831c52c6fa

I filed bug 1102450 to track incremental improvements we can make here.
https://hg.mozilla.org/mozilla-central/rev/9b831c52c6fa
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 36
You need to log in before you can comment on or make changes to this bug.