Closed Bug 911307 Opened 11 years ago Closed 10 years ago

Changes to top sites aren't immediately reflected in about:newtab

Categories

(Firefox :: General, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
Firefox 31

People

(Reporter: adw, Assigned: adw)

References

Details

(Whiteboard: [tiles] p=8 s=it-31c-30a-29b.2 [qa!])

Attachments

(3 files, 16 obsolete files)

32.49 KB, patch
adw
: review+
Details | Diff | Splinter Review
1.92 KB, patch
Details | Diff | Splinter Review
36.51 KB, patch
Details | Diff | Splinter Review
If you visit a page and that page becomes a top site as a result, it doesn't force about:newtab to recalculate its top sites.  So for example, it won't appear if you open about:newtab immediately afterward.  This doesn't have anything to do with thumbnailing, and it was the case even before background thumbnailing as far as I can tell, but I haven't tested.

You might think bg thumbnailing would mean that if the thumbnail doesn't appear, the bg service would fetch it, create the file, which would end up notifying site.js to refresh its thumbnail.  But here the thumbnail file exists (it was captured by gBrowserThumbnails), it just doesn't appear because the page is not actually a top site on the grid yet.

Filed due to bug 870100 comment 44.  Once your top sites become stable, this bug isn't a big deal, but it may be for new users and anyone else with a clean history.
Yeah, that's especially annoying if you have a new profile and start browsing. OTOH we shouldn't fetch top sites all the time. We could do it more often though.
(In reply to Tim Taubert [:ttaubert] from comment #2)
> Yeah, that's especially annoying if you have a new profile and start
> browsing. OTOH we shouldn't fetch top sites all the time. We could do it
> more often though.

I wonder if a reasonable hack would be to fetch top-sites more regularly only when we don't have enough top-sites to fill the grid?  It wouldn't be perfect, but would probably prevent people noticing and would prevent overhead in the common case?
(In reply to Mark Hammond (:markh) from comment #3)
> I wonder if a reasonable hack would be to fetch top-sites more regularly
> only when we don't have enough top-sites to fill the grid?

That sounds reasonable.  As long as there's a blank spot in the grid, we could keep a history observer that executes the top-sites query every time a URL not already in the grid is visited.
Sounds like a good solution to me.
(In reply to Drew Willcoxon :adw from comment #4)

> That sounds reasonable.  As long as there's a blank spot in the grid, we
> could keep a history observer that executes the top-sites query every time a
> URL not already in the grid is visited.

An alternative, with its own pros and cons, is to kick off the query as about:newtab is opened and fill the missing sites asynchronously.
This bug doesn't block those other two.  It doesn't even depend on them really.
No longer blocks: 870100, 927688
I set it as a blocker just because they are all related and for a better access from the other bugs.
Mark's idea in comment 6 is simpler than my idea in comment 4, so this patch does that.  I understand if populating the link cache and updating the grid at the time that newtab becomes visible runs counter to the point of preloading, but since I don't actually know the rationale for preloading, I thought I'd r? this first.
Attachment #832721 - Flags: review?(ttaubert)
Comment on attachment 832721 [details] [diff] [review]
repopulate top sites when newtab becomes visible and there are blank top sites

Tim, could you please look at this soon?
Attachment #832721 - Flags: review?(mhammond)
Comment on attachment 832721 [details] [diff] [review]
repopulate top sites when newtab becomes visible and there are blank top sites

Review of attachment 832721 [details] [diff] [review]:
-----------------------------------------------------------------

This looks fine to me, although I think we need to test the actual bug here (ie, that new top-sites magically appear on about:newtab).

::: browser/base/content/newtab/page.js
@@ +85,5 @@
> +      if (this.isVisible) {
> +        let hasBlanks = gGrid.sites.some(site => !site);
> +        if (hasBlanks) {
> +          // update() triggers background capture of missing thumbnails.
> +          gLinks.populateCache(() => this.update(), true);

can't you just specify this.update directly as the callback?

::: browser/base/content/test/newtab/browser_newtab_background_captures.js
@@ +57,1 @@
>          "Pre-loaded docshell just synchronously swapped, so background " +

this message should probably be changed to something like "so it should not yet be visible"

@@ +57,4 @@
>          "Pre-loaded docshell just synchronously swapped, so background " +
>          "captures should not be allowed yet");
>  
>    // Enable captures.

it seems as though we should be able to remove this pref from here, and simply enable b/g captures as the test starts?  ie, the test should work correctly without previously disabling captures via this pref and re-enabling here?

(Not that I see anything going wrong when doing this, it just seems better to not explicitly have the capturing disabled before this point, as nothing should go wrong if it is enabled from the start)

@@ +65,1 @@
>    let allowBackgroundCaptures = false;

this variable should probably be renamed to something like isNowVisible?

@@ +87,5 @@
>      file.remove(false);
>      TestRunner.next();
>    }, "page-thumbnail:create", false);
>  
> +  info("Waiting for visible change");

technically we are waiting for the notification of the background capture completing (ie, the comment was slightly misleading before as well)
Attachment #832721 - Flags: review?(mhammond) → feedback+
I'll work on a test.

(In reply to Mark Hammond [:markh] from comment #11)
> > +          gLinks.populateCache(() => this.update(), true);
> 
> can't you just specify this.update directly as the callback?

That would call update without being bound to `this`.  We could do populateCache(this.update.bind(this)) if you want.  update doesn't actually use `this`, so right now we could actually pass this.update directly, but that's just sloppy and there's no way I'm doing that.

> it seems as though we should be able to remove this pref from here, and
> simply enable b/g captures as the test starts?  ie, the test should work
> correctly without previously disabling captures via this pref and
> re-enabling here?

The test needs to prevent the about:newtab it opens while trying to get a handle to a hidden, pre-loaded about:newtab from triggering a capture of the site it added.

> technically we are waiting for the notification of the background capture
> completing (ie, the comment was slightly misleading before as well)

No, we're waiting first for the visible attribute to be changed.  After that, the background capture happens.
Assignee: nobody → adw
Status: NEW → ASSIGNED
(In reply to Drew Willcoxon :adw from comment #12)
> That would call update without being bound to `this`.  We could do
> populateCache(this.update.bind(this)) if you want.  update doesn't actually
> use `this`, so right now we could actually pass this.update directly, but
> that's just sloppy and there's no way I'm doing that.

Doh, yes, fair enough

> > it seems as though we should be able to remove this pref from here, and
> > simply enable b/g captures as the test starts?  ie, the test should work
> > correctly without previously disabling captures via this pref and
> > re-enabling here?
> 
> The test needs to prevent the about:newtab it opens while trying to get a
> handle to a hidden, pre-loaded about:newtab from triggering a capture of the
> site it added.

Yep, ok.

> > technically we are waiting for the notification of the background capture
> > completing (ie, the comment was slightly misleading before as well)
> 
> No, we're waiting first for the visible attribute to be changed.  After
> that, the background capture happens.

It's the observer notification that causes the test to continue (ie, it has the .next call).  So while the code under test is indeed waiting for the visible attribute, the test itself is waiting for the observer.
Attached patch with test (obsolete) — Splinter Review
Attachment #8347020 - Flags: review?(mhammond)
Attachment #832721 - Flags: review?(ttaubert) → review+
Comment on attachment 832721 [details] [diff] [review]
repopulate top sites when newtab becomes visible and there are blank top sites

doh - wrong tab :)
Attachment #832721 - Flags: review+
Attachment #8347020 - Flags: review?(mhammond) → review+
Comment on attachment 8347020 [details] [diff] [review]
with test

Review of attachment 8347020 [details] [diff] [review]:
-----------------------------------------------------------------

I like the move to .isVisible instead of .allowBackgroundCaptures. I don't like that for new profiles this would cause us to issue an SQLite query and re-render about:newtab whenever opening a new tab. This would basically render newtab page preloading useless as long as you have blank spots and regress the tabopen animation esp. for the first use experience.
Attachment #8347020 - Flags: review-
Thanks, Tim, it's very frustrating that you just didn't say that a long time ago when I mentioned that possibility in comment 9.

If the problem (In reply to Tim Taubert [:ttaubert] from comment #16)
> esp. for the first use experience.

We ought to just shut this whole bug down, then, because the first-run case of a completely empty about:newtab is what made me file this bug.  If we can't run a Places query the first time you open about:newtab, then this bug is a wontfix, because doing it any other time is worse.
Flags: needinfo?(ttaubert)
Attached patch history observer (obsolete) — Splinter Review
This uses a history observer to add newly visited pages to the grid when there are blank tiles.  It's OK but not great.

History observers aren't told the frecency of visited pages, so there's no way to insert new pages in the correct order.  This patch keeps a new _observedLinks list alongside _links and concats the two in getLinks().  It stops observing visits once the grid is filled, which means that the grid mysteriously stops updating.  (A problem with the previous patch, too, but more noticeable here since all new pages are added to the grid, regardless of frecency.)  Maybe it should keep observing as long as _links is empty, which is the entire first-run browser session and any other session where you start with no history.

History observers are notified of all visits, even redirects, so to keep redirects from filling up the grid, the patch ignores them.  That's different behavior from PlacesProvider.getLinks(), which seems to ignore all but the first and last pages in a redirect sequence due to how the history service works.  There's no way to do that here without bookkeeping, so I just don't do it.  One consequence, besides the final redirect not appearing in the grid, is that sites can get stuck without proper titles, since the first page in a redirect sequence may not have a title.

Maybe some of this could be fixed by improving history observers.  But it's definitely more complex than the previous patch, whose I/O is off the main thread and only happens when there are blank tiles, and whose extra work you only pay for when you use the feature, not whenever you visit a new page.
Attachment #8347574 - Flags: review?(mhammond)
(In reply to Drew Willcoxon :adw from comment #17)
> If the problem (In reply to Tim Taubert [:ttaubert] from comment #16)
> > esp. for the first use experience.
> 
> We ought to just shut this whole bug down, then, because the first-run case
> of a completely empty about:newtab is what made me file this bug.  If we
> can't run a Places query the first time you open about:newtab, then this bug
> is a wontfix, because doing it any other time is worse.

Not sure I agree with "any other time is worse". This is a valid bug esp. for the first-run case (as you said) but we should try to come up with another solution that doesn't regress anything.
Flags: needinfo?(ttaubert)
IIUC based on a chat with Tim, the issue with this patch isn't actually the SQL query (as it is async), but the fact that the query may complete before the tab animation has completed, and thus we may be updating the DOM during the animation which could introduce jank.

At first glance, this implies to me that simply doing this on a setTimeout(..., 250) would work OK (IIUC, the longest tab transition is 200ms).  On IRC, Tim said this would be "awkward", to which I countered that we are already in awkward territory :)  A 'transitionend' event handler might also be reasonable (ie, always add a transitionend event listener and add a little complexity so this new work is only done when *both* the places query and this event have fired).

Either way, I hope we can find a solution which isn't quite as intrusive as the new patch attached to this bug, but we really do need Tim's advice on what such a solution would look like.
Flags: needinfo?(ttaubert)
The BrowserNewTabPreloader starts preloading the next tab 600ms after opening a new tab. This okay to do as that is a hidden action. Updating the currently shown tab *might* be ok if all we do is append a site at the end of the grid. This might get a little awkward when sites are shuffled around or it takes a little longer to complete the places query.

We could try to call populateCache() right after a new tab has been opened. The SQLite query is executed off the main thread and shouldn't hurt the tabopen animation. Now at "tabopen + 500ms" or something we could then actually update pages.

In case newtab preloading is disabled we probably shouldn't do anything other than call populateCache(force=true) from page.js if there are blank spots in the grid, taking the cached links into account. This would also be the default case should we one day be able to remove newtab preloading i.e. when chrome and content run in separate processes.

Another idea (very similar to what Drew had in mind) is a little more complex unfortunately. If we would maintain a list of top sites ordered by frecency (along with their frecency values) we could update the grid as soon as the frecency of a page changes. This would then also need to batch-update everything should the frecency decay trigger set in and change all frecencies.

The first proposal would be a lot easier to implement but is obviously not very elegant and would stop working once all spots of the grid have been filled. The second proposal would update newtab pages whenever something changes the order of items or the items in the grid and would work all the time. This might be as easy as extending the history observer and firing a notification for UpdateFrecency() [1]. A global notification could be fired when the decay trigger sets in [2].

[1] http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/History.cpp#1008
[2] http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/nsPlacesTriggers.h#134
Flags: needinfo?(ttaubert)
Attached patch add frecency observers WIP (obsolete) — Splinter Review
I think observing frecency changes and updating the grid accordingly might be better than re-running the top sites query, too.

This WIP adds two new methods to nsINavHistoryObserver:

  void onFrecencyChanged(in nsIURI aURI,
                         in long aNewFrecency,
                         in ACString aGUID);
  void onFrecencyDecayed(in double aRate);

onFrecencyDecayed is to avoid calling onFrecencyChanged on every single moz_place when frecencies are decayed.  The idea is that the caller would adjust its own cached frecency values according to the given rate.  In this bug's case, the number of cached frecencies would be small and they'd be in memory, so updating them would be easy.

I haven't actually tested this and I'd like to get feedback first.
Attachment #8358193 - Flags: feedback?(mak77)
As discussed yesterday, my only concern is that I have some performance doubts/fears about this.
first, there are many points where frecency can change, in many cases we just invalidate it and wait for the first occasion to recalculate it. Also invalidated frecencies change ordering. So, all of the points should notify.
This causes my fear for performances, since in some cases there may be hundreds/thousands notifications going out, and it would be enough one single slow consumer to hang up everything.
Moreover I'm not sure you care about all of the changes, maybe you only care about the most common actions causing a change, adding visits, adding/removing bookmark, tags... and for these operations we already have notifications.
The other thing is that, as you know, once we update frecency, we don't get back the calculated value, since for performance reasons we do the calculation in a SQL function (mostly off the main thread). Adding a query to recollect this value would basically nullify that improvement, and may not be cheaper than just requerying for the top 10 frecency items when needed.
Thanks, Marco.

(In reply to Marco Bonardo [:mak] from comment #23)
> and may not be cheaper than just requerying for the top 10 frecency
> items when needed.

The problem is we can't precisely know what "when needed" is.  As Tim says, we don't want to re-run the query and update the page when it becomes visible since that would defeat the point of preloading it.  We could do it when the page is loaded by the preloader and hidden, but that would miss changes in top sites that occur between the time it's preloaded and the time it becomes visible.  We could do it on every page visit while it's hidden, but that's unnecessarily expensive.  We could do it on every page visit while it's hidden and there are blank spots in the grid, but that's not very satisfying, even though it might be the best option right now.

That's why I think we need some sort of frecency notification, or a notification of changes in the relative ranking of pages according to frecency.  I understand your concerns about performance, though.
Once you get notified of frecency changes, you must look at all of them, you can't filter on uri (since an unknown uri may surpass one you have in the list), though you can filter on the frecency value.
If we find a cheap way to notify frecency changes without adding an additional query to fetch the value, we may evaluate that patch, but I'd prefer not to increase the db exchanges.
So first, would be nice to compile a list of points where frecency changes, and see how often they may be called. Then to figure the cost to notify the value from those points (needs a db access? can get it for free?). Then there should be enough data to evaluate how to proceed.
Comment on attachment 8358193 [details] [diff] [review]
add frecency observers WIP

I'm resetting the flag since we discussed it in person and lokos like you are proceeding with an experimental patch
Attachment #8358193 - Flags: feedback?(mak77)
Attached patch frecency observers WIP 2 (obsolete) — Splinter Review
This patch has front-end and Places changes.  It works -- sites are added to and rearranged on newtab as their frecencies change -- but I'm only asking for feedback on the Places parts.

This is similar to the last patch in that it adds two new methods to nsINavHistoryObserver:

  void onFrecencyChanged(in nsIURI aURI,
                         in long aNewFrecency,
                         in ACString aGUID,
                         in AString aTitle,
                         in boolean aHidden,
                         in PRTime aVisitDate);
  void onManyFrecenciesChanged();

onManyFrecenciesChanged is called when many (maybe all) frecencies are updated without respect to any particular places IDs, which happens when all frecencies are invalidated and decayed.  The idea is that it's then up to the consumer to requery as needed.

Most of onFrecencyChanged's parameters are needed by newtab, which is why they're there.  aHidden and aVisitDate are necessary because newtab's initial query excludes hidden places and places with null visit dates, so incremental updates must use the same criteria.  newtab doesn't need aGUID, but it seems like it should be there.  (I wish that all these observer methods were passed a single object like a mozIPlaceInfo.)

To avoid requerying the DB for new frecencies, the patch uses a notify_frecency SQL function that calls out to nsNavHistory observers.  I use it instead of a trigger because I want to avoid notifying in some cases, i.e., when all frecencies are changed.  In every other case where frecency is updated I want to notify, so all those sites use the function.
Attachment #8358193 - Attachment is obsolete: true
Attachment #8361142 - Flags: feedback?(mak77)
Comment on attachment 8361142 [details] [diff] [review]
frecency observers WIP 2

Review of attachment 8361142 [details] [diff] [review]:
-----------------------------------------------------------------

I think this may work, I'm still a little bit scared by the notifications overhead but I don't think there's any better solution as of now.

::: toolkit/components/places/History.cpp
@@ +1178,1 @@
>    nsresult UpdateFrecency(const VisitData& aPlace)

I think there's a point you missed in History.cpp, when we insert a new place in
History::InsertPlace we may get a "fancy" frecency value from the outside, then you may want to notify

::: toolkit/components/places/SQLFunctions.cpp
@@ +850,5 @@
> +    PRTime lastVisitDate = static_cast<PRTime>(aArgs->AsInt64(5));
> +
> +    // This is a const version of the history object for thread-safety.
> +    const nsNavHistory* history = nsNavHistory::GetConstHistoryService();
> +    NS_ENSURE_STATE(history);

doesn't look like you are using it for anything, is this just a sanity check?

::: toolkit/components/places/nsNavHistory.cpp
@@ +999,5 @@
> +        "WHEN url BETWEEN 'place:' AND 'place;' "
> +        "THEN 0 "
> +        "ELSE -1 "
> +        "END) "
> +      "WHERE frecency > 0 "

I think you can still use a SQL fragment out of the if/else here

@@ +3162,5 @@
> +
> +  NS_IMETHOD HandleCompletion(uint16_t aReason)
> +  {
> +    nsresult rv = AsyncStatementTelemetryTimer::HandleCompletion(aReason);
> +    NS_ENSURE_SUCCESS(rv, rv);

I think you may just ignore errors here and continue, so you will still notify, telemetry shouldn't block you
Attachment #8361142 - Flags: feedback?(mak77) → feedback+
(In reply to Marco Bonardo [:mak] from comment #28)
> I think there's a point you missed in History.cpp

Thanks!

> > +    // This is a const version of the history object for thread-safety.
> > +    const nsNavHistory* history = nsNavHistory::GetConstHistoryService();
> > +    NS_ENSURE_STATE(history);
> 
> doesn't look like you are using it for anything, is this just a sanity check?

Didn't mean for that to be there.  It was left over from an earlier version.

> > +        "WHEN url BETWEEN 'place:' AND 'place;' "
> > +        "THEN 0 "
> > +        "ELSE -1 "
> > +        "END) "
> > +      "WHERE frecency > 0 "
> 
> I think you can still use a SQL fragment out of the if/else here

What do you mean exactly?  Build up the query string piece by piece instead of repeating most of it in each branch?
Attachment #8347574 - Flags: review?(mhammond)
Whiteboard: [triage]
Whiteboard: [triage]
Whiteboard: p=0
These are the front-end changes and are more or less ready for review, but they probably need tests, so I'm asking for feedback first.  Later I'll post the places patch that this depends on.

Ed and his team are probably going to build another link provider in bug 965437 for directory tiles.  They need a more generic provider interface as part of that since a directory tile provider would not really be based on Places.  Meanwhile in this bug, I need to expand the provider interface to support link changes.  So this patch was written with both things in mind.

PlacesProvider now receives three kinds of notifications from the history service: a single frecency changed, many changed, and a title changed.  It can have observers, which it notifies on changes.

The Links object still caches links from its provider, but now it's a little more sophisticated due to the single-link-change notification it receives from the provider.  When that happens, it updates the position of the changed link.  When the provider notifies about many links changing at once, Links force-repopulates its cache altogether.  In either case, it notifies its observers, which here is page.js, which then updates its grid.
Attachment #8376673 - Flags: feedback?(ttaubert)
Comment on attachment 8376673 [details] [diff] [review]
front-end frecency observer patch v1

Review of attachment 8376673 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM. I think the page.js changes suggested below would certainly simplify things further. NewTabUtils.jsm has certainly grown to a size where we should think about splitting it into multiple modules some time in the future.

::: browser/base/content/newtab/page.js
@@ +132,5 @@
> +   * Many individual link changes may happen in a small amount of time.  This
> +   * method coalesces changes by waiting a small amount of time for more before
> +   * updating the grid.
> +   */
> +  _scheduleUpdate: function () {

I'd rather not have the page know too much about the implementation of things on the module side. Can't we just schedule and handle updates in the JSM and then call AllPages.update() like we do already in other places? I think we wouldn't need to touch page.js at all here. We would also get Metro support for free using the existing mechanisms.

We could of course add an AllPages.scheduleUpdate() method to make this easier to implement.

@@ +138,5 @@
> +      clearTimeout(this._scheduleUpdateTimeout);
> +    this._scheduleUpdateTimeout = setTimeout(() => {
> +      if (!this.allowBackgroundCaptures)
> +        this.update();
> +    }, 1000);

If this is meant to catch frecency updates when the DB trigger is activated, do we need to rather huge value of 1000ms here? Wouldn't even updating on the next tick do?

::: toolkit/modules/BinarySearch.jsm
@@ +9,5 @@
> +this.BinarySearch = {
> +
> +  indexOf: function (array, target, comparator) {
> +    let [found, idx] = this.search(array, target, comparator);
> +    return found ? idx : -1;

Why does search() return an array? Can't we just use -1 to indicate the value wasn't found?

::: toolkit/modules/NewTabUtils.jsm
@@ +893,5 @@
>      if (AllPages.length && AllPages.enabled)
>        this.populateCache(function () { AllPages.update() }, true);
> +    else {
> +      this._sortedLinks = null;
> +      this._linkMap = null;

Nit: maybe change this to call .resetCache()?
Attachment #8376673 - Flags: feedback?(ttaubert) → feedback+
Making sure Boriss has a look at this.
Do I need to update all JS observer implementations in the tree to avoid method-not-defined errors?  I remember that was a problem.  I ran some existing unit tests but didn't see any errors, so I'm hoping not.
Attachment #832721 - Attachment is obsolete: true
Attachment #8347020 - Attachment is obsolete: true
Attachment #8347574 - Attachment is obsolete: true
Attachment #8361142 - Attachment is obsolete: true
Attachment #8380452 - Flags: review?(mak77)
This isn't too different from the previous patch.  It has a test and addresses your comments.

(In reply to Tim Taubert [:ttaubert] from comment #31)
> NewTabUtils.jsm has certainly grown to a size where we should think about
> splitting it into multiple modules some time in the future.

I think so, too.

> > +  _scheduleUpdate: function () {
> 
> I'd rather not have the page know too much about the implementation of
> things on the module side. Can't we just schedule and handle updates in the
> JSM and then call AllPages.update() like we do already in other places?

Cool, overlooked that, thanks.

> @@ +138,5 @@
> > +      clearTimeout(this._scheduleUpdateTimeout);
> > +    this._scheduleUpdateTimeout = setTimeout(() => {
> > +      if (!this.allowBackgroundCaptures)
> > +        this.update();
> > +    }, 1000);
> 
> If this is meant to catch frecency updates when the DB trigger is activated,
> do we need to rather huge value of 1000ms here? Wouldn't even updating on
> the next tick do?

The way the places patch works is that each individual frecency change causes a runnable to be posted to the main thread that actually notifies history observers of the change.  (That's opposed to situations where we know that a potentially large number of frecencies will change in batch, like on decay, which are covered by a different history observer method.)  In some cases we change some relatively small number of individual frecencies at once, like when enumerating an array of new places, which causes N runnables to be posted.  Waiting one turn of the event loop would only coalesce a couple of the notifications, which is why the patch waits longer.  One second is kind of arbitrary, but it seems reasonable.  Maybe could be lower.

> ::: toolkit/modules/BinarySearch.jsm
> @@ +9,5 @@
> > +this.BinarySearch = {
> > +
> > +  indexOf: function (array, target, comparator) {
> > +    let [found, idx] = this.search(array, target, comparator);
> > +    return found ? idx : -1;
> 
> Why does search() return an array? Can't we just use -1 to indicate the
> value wasn't found?

Returning two values -- a found boolean and an index -- allows search() to be used for both finding a given item and for finding where an item should be inserted.  If it returned -1 on not found, it couldn't be used to insert new items, so there'd need to be another binary search implementation just for that case.  As you can see, there are indexOf and insertionIndexOf convenience functions that call search.

I'm open to changing how these two values are returned.  Would an object be better than an array?

> ::: toolkit/modules/NewTabUtils.jsm
> @@ +893,5 @@
> >      if (AllPages.length && AllPages.enabled)
> >        this.populateCache(function () { AllPages.update() }, true);
> > +    else {
> > +      this._sortedLinks = null;
> > +      this._linkMap = null;
> 
> Nit: maybe change this to call .resetCache()?

Nice, thanks.
Attachment #8376673 - Attachment is obsolete: true
Attachment #8380505 - Flags: review?(ttaubert)
Blocks: 975228
(In reply to Drew Willcoxon :adw from comment #33)
> Do I need to update all JS observer implementations in the tree to avoid
> method-not-defined errors?  I remember that was a problem.  I ran some
> existing unit tests but didn't see any errors, so I'm hoping not.

I don't know what's the current stats of that, I also remember it was a problem, but many things changed from then. I guess we could just ignore it and we will discover if it's still an issue.
Comment on attachment 8380452 [details] [diff] [review]
for-review Places patch (frecency observers)

Review of attachment 8380452 [details] [diff] [review]:
-----------------------------------------------------------------

I think there's an issue with "hidden" and the notification may probably lose aPageId

::: toolkit/components/places/History.cpp
@@ +2074,5 @@
> +  // Post an onFrecencyChanged observer notification.
> +  nsCOMPtr<mozIStorageStatement> placeIdStmt = GetStatement(
> +    "SELECT id "
> +    "FROM moz_places "
> +    "WHERE rowid = last_insert_rowid() "

we should not use last_insert_rowid, cause we may add pages from multiple threads, you may select by url or by guid.

Even if, actually, I'm guessing if we could just not pass the placeID to the observer... we are already evaluating to slowly remove it from the interfaces, and make consumers rely on GUID. I think that for your use-case you may rely just on the guid.
That would save this additional query, I think.

::: toolkit/components/places/nsINavHistoryService.idl
@@ +682,5 @@
> +   * large operation where some large or unknown number of frecencies change at
> +   * once.  Use onManyFrecenciesChanged to detect such changes.
> +   *
> +   * @param aPlaceID
> +   *        The page's Places ID.

so, I'd remove this.

@@ +690,5 @@
> +   *        The page's new frecency.
> +   * @param aGUID
> +   *        The page's GUID.
> +   * @param aHidden
> +   *        True if the page is marked as hidden.

I think the value here may be bogus due to this:
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/History.cpp#1212

we may unhide the page after updating frecency...
is hidden important for the use-case here?

::: toolkit/components/places/nsNavHistory.cpp
@@ +557,5 @@
> +                   nsINavHistoryObserver,
> +                   OnManyFrecenciesChanged());
> +}
> +
> +class FrecencyNotification : public nsRunnable

this class should be in an anonymous namespace (or in mozilla::places if needed)

@@ +578,5 @@
> +
> +  NS_IMETHOD Run()
> +  {
> +    MOZ_ASSERT(NS_IsMainThread(), "Must be called on the main thread");
> +    nsNavHistory* navHistory = nsNavHistory::GetHistoryService();

you may use the const version of the service here

@@ +598,5 @@
> +  PRTime mLastVisitDate;
> +};
> +
> +void
> +nsNavHistory::PostFrecencyChangedNotification(int64_t aPlaceId,

s/Post/Dispatch/

@@ +1005,5 @@
>    return NS_OK;
>  }
>  
>  
> +class InvalidateAllFrecenciesCallback : public AsyncStatementCallback

ditto on namespacing

@@ +1015,5 @@
> +
> +  NS_IMETHOD HandleCompletion(uint16_t aReason)
> +  {
> +    if (aReason == REASON_FINISHED) {
> +      nsNavHistory *navHistory = nsNavHistory::GetHistoryService();

ditto on const version

@@ +1027,5 @@
>  nsresult
>  nsNavHistory::invalidateFrecencies(const nsCString& aPlaceIdsQueryString)
>  {
>    // Exclude place: queries by setting their frecency to zero.
> +  nsAutoCString invalidFrecenciesSQLFragment(

this probably has no advantage in being an autostring, may just be an nsCString

@@ +1031,5 @@
> +  nsAutoCString invalidFrecenciesSQLFragment(
> +    "UPDATE moz_places SET frecency = "
> +  );
> +  if (!aPlaceIdsQueryString.IsEmpty())
> +    invalidFrecenciesSQLFragment.AppendLiteral("NOTIFY_FRECENCY(");

where is this parenthesis being closed?

@@ +1048,4 @@
>    }
> +  nsRefPtr<InvalidateAllFrecenciesCallback> cb =
> +    aPlaceIdsQueryString.IsEmpty() ? new InvalidateAllFrecenciesCallback() :
> +    nullptr;

please reindent as cond ? one
                        : two;

@@ +3180,5 @@
>    return NS_OK;
>  }
>  
>  
> +class DecayFrecencyCallback : public AsyncStatementTelemetryTimer

ditto on namespace

@@ +3192,5 @@
> +  NS_IMETHOD HandleCompletion(uint16_t aReason)
> +  {
> +    (void)AsyncStatementTelemetryTimer::HandleCompletion(aReason);
> +    if (aReason == REASON_FINISHED) {
> +      nsNavHistory *navHistory = nsNavHistory::GetHistoryService();

ditto on const

@@ +4468,5 @@
>    return NS_OK;
>  }
>  
>  
> +class FixInvalidFrecenciesCallback : public AsyncStatementCallbackNotifier

ditto

@@ +4481,5 @@
> +  {
> +    nsresult rv = AsyncStatementCallbackNotifier::HandleCompletion(aReason);
> +    NS_ENSURE_SUCCESS(rv, rv);
> +    if (aReason == REASON_FINISHED) {
> +      nsNavHistory *navHistory = nsNavHistory::GetHistoryService();

ditto
Attachment #8380452 - Flags: review?(mak77) → feedback+
Whiteboard: p=0 → [tiles] p=0
Comment on attachment 8380505 [details] [diff] [review]
for-review front-end patch (frecency observers)

Review of attachment 8380505 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/base/content/newtab/page.js
@@ +68,5 @@
>     */
>    update: function Page_update() {
>      // The grid might not be ready yet as we initialize it asynchronously.
> +    // And don't update if the page is visible.
> +    if (gGrid.ready && !this.allowBackgroundCaptures) {

This change breaks a whole lot of tests because we also use this mechanism to update about:newtab instances in background tabs. If you want to exclude the currently shown tab to not confuse users we would probably need bug 972341 and change this to |... && document.hidden) {|

::: browser/base/content/test/newtab/browser_newtab_update.js
@@ +9,5 @@
> +  // First, start with an empty page.
> +  yield setLinks([]);
> +
> +  // Strategy: Add some visits, open a new page, check the grid, repeat.
> +  fillHistory([link(1)], function () {});

You might want to make the callback argument optional if you don't need it.

::: browser/base/content/test/newtab/head.js
@@ +159,5 @@
>  
>  function fillHistory(aLinks, aCallback) {
>    let numLinks = aLinks.length;
> +  if (!numLinks)
> +    setTimeout(aCallback, 0);

Nit: executeSoon(aCallback);

Also we should probably just add a return statement after that so that it's a little more explicit that nothing will happen here.

::: toolkit/modules/NewTabUtils.jsm
@@ +261,5 @@
> +   * a small amount of time for more updating all pages.
> +   */
> +  scheduleUpdate: function AllPages_scheduleUpdate() {
> +    if (this._scheduleUpdateTimer)
> +      this._scheduleUpdateTimer.cancel();

With a constant stream of updates this would never fire. Don't we rather want to bail out if the update timer is already active? So we would at least have the site updated timely.

@@ +264,5 @@
> +    if (this._scheduleUpdateTimer)
> +      this._scheduleUpdateTimer.cancel();
> +    this._scheduleUpdateTimer = Cc["@mozilla.org/timer;1"].
> +                                createInstance(Ci.nsITimer);
> +    this._scheduleUpdateTimer.initWithCallback(() => {

Nit: Using Timer.jsm might be nicer.

@@ +267,5 @@
> +                                createInstance(Ci.nsITimer);
> +    this._scheduleUpdateTimer.initWithCallback(() => {
> +      delete this._scheduleUpdateTimer;
> +      this.update();
> +    }, 1000, Ci.nsITimer.TYPE_ONE_SHOT);

Please put the 1000 into a constant defined at the top of the file.

@@ +620,5 @@
> +    // The implementation of the query in getLinks orders by place ID descending
> +    // when two places have the same frecency, so it's important to do the same
> +    // here.
> +    let cmp = aLink2.rank.frecency - aLink1.rank.frecency;
> +    return cmp != 0 ? cmp : aLink2.rank.placeID - aLink1.rank.placeID;

Nit: return cmp || aLink2.rank.placeID - aLink1.rank.placeID;

or

return rank2.frecency - rank1.frecency || rank2.placeID - rank1.placeID;
Attachment #8380505 - Flags: review?(ttaubert) → feedback+
Whiteboard: [tiles] p=0 → [tiles] p=8 s=it-30c-29a-28b.3
QA Contact: twalker
Whiteboard: [tiles] p=8 s=it-30c-29a-28b.3 → [tiles] p=8 s=it-30c-29a-28b.3 [qa+]
Whiteboard: [tiles] p=8 s=it-30c-29a-28b.3 [qa+] → [tiles] p=8 [qa+]
Attached patch for-review Places patch v2 (obsolete) — Splinter Review
This is the same as the previous patch but addresses your comments.

(In reply to Marco Bonardo [:mak] from comment #36)
> Even if, actually, I'm guessing if we could just not pass the placeID to the
> observer...

We talked about this on IRC and agreed to do without it.

(This is more relevant to the front-end patch, but: I was saying that the consequence for newtab is that pages with the same frecency might be reordered after restart, since different tie-breaking criteria are used, but it's actually worse than that.  newtab does binary search on the list returned by the places query, so its sort criteria must exactly match the critera used internally by the places query.  I worked around this by simply resorting the list returned by places when necessary.)

> @@ +690,5 @@
> > +   *        The page's new frecency.
> > +   * @param aGUID
> > +   *        The page's GUID.
> > +   * @param aHidden
> > +   *        True if the page is marked as hidden.
> 
> I think the value here may be bogus due to this:
> http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/
> History.cpp#1212
> 
> we may unhide the page after updating frecency...
> is hidden important for the use-case here?

The reason for including hidden in the notification is that hidden pages are excluded from newtab's places query, so I want to exclude them when updating the page.  Even some pages with non-zero frecency are marked as hidden, so I can't rely on frecency telling me anything about a page being hidden.

The site you mention where hidden is updated isn't a big problem.  It just means that sometimes newtab will ignore pages that it shouldn't, but that's better than ignoring all pages like it does now (since it doesn't dynamically update the page at all).  And if such pages are subsequently visited or otherwise have their frecencies changed, then they'll be included at that time.

> @@ +578,5 @@
> > +
> > +  NS_IMETHOD Run()
> > +  {
> > +    MOZ_ASSERT(NS_IsMainThread(), "Must be called on the main thread");
> > +    nsNavHistory* navHistory = nsNavHistory::GetHistoryService();
> 
> you may use the const version of the service here

For that to work, I would need to make nsCategoryCache::GetEntries const, because GetEntries is called (via a macro) in NotifyFrecencyChanged, so I just left it using the non-const service.  Same thing elsewhere.

> @@ +1031,5 @@
> > +  nsAutoCString invalidFrecenciesSQLFragment(
> > +    "UPDATE moz_places SET frecency = "
> > +  );
> > +  if (!aPlaceIdsQueryString.IsEmpty())
> > +    invalidFrecenciesSQLFragment.AppendLiteral("NOTIFY_FRECENCY(");
> 
> where is this parenthesis being closed?

Yikes.  The path where this is a problem wasn't covered by the test, so I added one to cover it.  Thanks!  (This is why I duplicated the query string in the two cases originally. :-))
Attachment #8380452 - Attachment is obsolete: true
Attachment #8390344 - Flags: review?(mak77)
Attached patch for-review front-end patch v2 (obsolete) — Splinter Review
This has some significant changes from the previous patch.  Ed and I talked more about his requirements for the directory tiles work.  We decided to implement a new directory tiles provider and have Links accept multiple providers and merge their links.  (See bug 975228 comment 6.)  I went ahead and implemented the merge logic here, since this work blocks his.  I added NewTabUtils.initWithoutProviders so I could write an xpcshell test without worrying about setting up Places.

Other changes address your comments.

(In reply to Tim Taubert [:ttaubert] from comment #37)
> ::: browser/base/content/newtab/page.js
> @@ +68,5 @@
> >     */
> >    update: function Page_update() {
> >      // The grid might not be ready yet as we initialize it asynchronously.
> > +    // And don't update if the page is visible.
> > +    if (gGrid.ready && !this.allowBackgroundCaptures) {
> 
> This change breaks a whole lot of tests because we also use this mechanism
> to update about:newtab instances in background tabs. If you want to exclude
> the currently shown tab to not confuse users we would probably need bug
> 972341 and change this to |... && document.hidden) {|

I made update() take an aOnlyIfHidden parameter that defaults to false.  NewTabUtils.scheduleUpdate passes true.

> ::: toolkit/modules/NewTabUtils.jsm
> @@ +261,5 @@
> > +   * a small amount of time for more updating all pages.
> > +   */
> > +  scheduleUpdate: function AllPages_scheduleUpdate() {
> > +    if (this._scheduleUpdateTimer)
> > +      this._scheduleUpdateTimer.cancel();
> 
> With a constant stream of updates this would never fire. Don't we rather
> want to bail out if the update timer is already active? So we would at least
> have the site updated timely.

True, right.
Attachment #8380505 - Attachment is obsolete: true
Attachment #8390345 - Flags: review?(ttaubert)
Comment on attachment 8390344 [details] [diff] [review]
for-review Places patch v2

Review of attachment 8390344 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/places/nsINavHistoryService.idl
@@ +27,1 @@
>  interface nsINavHistoryResultNode : nsISupports

why this uuid change?

::: toolkit/components/places/nsNavHistory.cpp
@@ +607,5 @@
> +{
> +  nsCOMPtr<nsIRunnable> notif = new FrecencyNotification(aSpec, aNewFrecency,
> +                                                         aGUID, aHidden,
> +                                                         aLastVisitDate);
> +  (void)NS_DispatchToMainThread(notif);

The Sqlite function may also be invoked on the main-thread (we don't want to and we don't, but just for correctness), in such a case we should just notify immediately, without dispatching.
Attachment #8390344 - Flags: review?(mak77) → review+
Comment on attachment 8390345 [details] [diff] [review]
for-review front-end patch v2

This patch needs a couple of small changes, one to correct an error, the other to make some tests pass.  I'll batch them along with whatever changes are prompted by review.

> diff --git a/browser/base/content/test/newtab/head.js b/browser/base/content/test/newtab/head.js
> --- a/browser/base/content/test/newtab/head.js
> +++ b/browser/base/content/test/newtab/head.js
> @@ -511,11 +511,13 @@ function createDragEvent(aEventType, aDa
>   * Resumes testing when all pages have been updated.
>   */
>  function whenPagesUpdated(aCallback) {
>    let page = {
> -    update: function () {
> -      NewTabUtils.allPages.unregister(this);
> -      executeSoon(aCallback || TestRunner.next);
> +    update: function (aOnlyIfHidden=false) {
> +      if (!aOnlyIfHidden) {
> +        NewTabUtils.allPages.unregister(this);
> +        executeSoon(aCallback || TestRunner.next);
> +      }
>      }
>    };
> 
>    NewTabUtils.allPages.register(page);
> diff --git a/toolkit/modules/NewTabUtils.jsm b/toolkit/modules/NewTabUtils.jsm
> --- a/toolkit/modules/NewTabUtils.jsm
> +++ b/toolkit/modules/NewTabUtils.jsm
> @@ -771,9 +771,9 @@ let Links = {
>      let pinnedLinks = Array.slice(PinnedLinks.links);
>      let links = this._mergeProviderLinks();
> 
>      // Filter blocked and pinned links.
> -    let links = links.filter(function (link) {
> +    links = links.filter(function (link) {
>        return !BlockedLinks.isBlocked(link) && !PinnedLinks.isPinned(link);
>      });
> 
>      // Try to fill the gaps between pinned links.
(In reply to Drew Willcoxon :adw from comment #41)
> > diff --git a/browser/base/content/test/newtab/head.js b/browser/base/content/test/newtab/head.js
> > --- a/browser/base/content/test/newtab/head.js
> > +++ b/browser/base/content/test/newtab/head.js
> > @@ -511,11 +511,13 @@ function createDragEvent(aEventType, aDa
> >   * Resumes testing when all pages have been updated.
> >   */
> >  function whenPagesUpdated(aCallback) {
> >    let page = {
> > -    update: function () {
> > -      NewTabUtils.allPages.unregister(this);
> > -      executeSoon(aCallback || TestRunner.next);
> > +    update: function (aOnlyIfHidden=false) {
> > +      if (!aOnlyIfHidden) {
> > +        NewTabUtils.allPages.unregister(this);
> > +        executeSoon(aCallback || TestRunner.next);
> > +      }
> >      }
> >    };

Actually, whenPagesUpdated needs to take an aOnlyIfHidden too, and then the conditional inside the update() here should compare that to the aOnlyIfHidden passed to it, so like:

  function whenPagesUpdated(aCallback, aOnlyIfHidden=false) {
    let page = {
      update: function (onlyIfHidden=false) {
        if (onlyIfHidden == aOnlyIfHidden) {

The dynamic update test passes true for aOnlyIfHidden, which should maybe be renamed aIsDynamicUpdate.

Also, I'm not calling Links._addObserver anywhere.  Links.populateCache does it now when it's forced to call getLinks on the provider, so it could probably still call it and do so unconditionally, since _addObserver is a no-op after the first call.

With those two additional changes, all newtab tests pass for me locally.

Finally, there should probably be a NewTabUtils.uninit so test_NewTabUtils.js can return it to a clean state for the front-end newtab tests.
Whiteboard: [tiles] p=8 [qa+] → [tiles] p=8 s=it-31c-30a-29b.1 [qa+]
Attached patch for-review front-end patch v3 (obsolete) — Splinter Review
This is updated with the changes I mentioned earlier, plus a BinarySearch.jsm test.

Marco, since Tim is out this week, do you think you'd be able to look at this before next week?  Not much time, I know.
Attachment #8390345 - Attachment is obsolete: true
Attachment #8390345 - Flags: review?(ttaubert)
Attachment #8393889 - Flags: review?(mak77)
I may look at this tomorrow, if that works for you.
That would be great, thanks.  If you can't, I'll probably flag Tim again early next week.
Comment on attachment 8393889 [details] [diff] [review]
for-review front-end patch v3

Review of attachment 8393889 [details] [diff] [review]:
-----------------------------------------------------------------

so, while I tried to go through this and examine it thoroughly, I honestly don't feel like I know the surrounding area well enough to give the final stamp to this.
I would probably do if it was matter of hours/days, but I think Tim will be back on Monday and I hope my first pass will help him spending less time on it.
Let me know if I can do more.

::: browser/base/content/newtab/page.js
@@ +72,2 @@
>      // The grid might not be ready yet as we initialize it asynchronously.
> +    if (gGrid.ready && (!aOnlyIfHidden || !this.allowBackgroundCaptures)) {

let skipUpdate = aOnlyIfHidden && this.allowBackgroundCaptures;
if (grid.ready && !skipUpdate)

::: toolkit/modules/BinarySearch.jsm
@@ +3,5 @@
> + * You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +"use strict";
> +
> +this.EXPORTED_SYMBOLS = ["BinarySearch"];

just out of curiosity, I found another implementation in the codebase
http://mxr.mozilla.org/mozilla-central/source/toolkit/devtools/sourcemap/source-map.js#1482 (binary-search)
though it works a little bit different than yours... it may still be worth to have a bug to merge them

@@ +5,5 @@
> +"use strict";
> +
> +this.EXPORTED_SYMBOLS = ["BinarySearch"];
> +
> +this.BinarySearch = {

please Object.freeze (since we are moving towards having more js and less xpcom, we should pay more attention to unwanted third party changes, since we don't have anymore the xpcom boundary)

@@ +39,5 @@
> +   *
> +   * @param  array
> +   *         An array whose elements are ordered by `comparator`.
> +   * @param  target
> +   *         The value to search for in the array.

nit: I think "in the array" is redundant here, it's already in the method description and I don't see where other places you may search :)

@@ +51,5 @@
> +   * @return An array with two elements.  If `target` is found, the first
> +   *         element is true, and the second element is its index in the array.
> +   *         If `target` is not found, the first element is false, and the
> +   *         second element is the index where it may be inserted to keep the
> +   *         array ordered.

I was reading the fact Tim didn't like this returning an array with 2 elements.  As you said, it's needed to be able to return either an existing index or an insertion index.
Just an idea out of my mind, since that's similar to what we did for frecency. Return positive numbers for found indices, negative numbers for insertion indices. If you always return a valid insertion index this should work.
Otherwise I also think probably a simple { index:, found: } object would be better than an array (the caller can deconstruct objects too).

::: toolkit/modules/NewTabUtils.jsm
@@ +23,5 @@
> +  "resource://gre/modules/BinarySearch.jsm");
> +
> +XPCOMUtils.defineLazyGetter(this, "Timer", () => {
> +  let Timer = {};
> +  Cu.import("resource://gre/modules/Timer.jsm", Timer);

Why not adding (in addition to the existing ones) an exported Timer object inside Timer.jsm and here just defineLazyModuleGetter that? Sounds more useful than having to mock around the module fanciness

@@ +62,4 @@
>  // The gather telemetry topic.
>  const TOPIC_GATHER_TELEMETRY = "gather-telemetry";
>  
> +// The timeout period used in scheduleUpdateForHiddenPages.

the comment is a bit cryptic, would be nice to understand how changing this will influence the behavior.

@@ +62,5 @@
>  // The gather telemetry topic.
>  const TOPIC_GATHER_TELEMETRY = "gather-telemetry";
>  
> +// The timeout period used in scheduleUpdateForHiddenPages.
> +const SCHEDULE_UPDATE_TIMEOUT = 1000;

nit: as usual I always suggest to suffix the metric unit, like SCHEDULE_UPDATE_TIME_MS

@@ +264,2 @@
>     */
> +  update: function AllPages_update(aExceptPage, aHiddenPagesOnly) {

both should be marked as [optional] in the javadoc.
Just for clarify, I'd also provide a default value to aHiddenPagesOnly.

@@ +271,5 @@
>  
>    /**
> +   * Many individual link changes may happen in a small amount of time over
> +   * multiple turns of the event loop.  This method coalesces updates by waiting
> +   * a small amount of time for more before updating hidden pages.

s/for more//

@@ +640,5 @@
> +
> +  /**
> +   * Called by the history service.
> +   */
> +  onFrecencyChanged: function PlacesProvider_onFrecencyChanged(aURI, aNewFrecency, aGUID, aHidden, aLastVisitDate) {

nit: 80 chars limit warning!

fwiw, there should be no more need to name functions now, the stack can properly fetch the method name.

@@ +671,5 @@
> +  },
> +
> +  _callObservers: function PlacesProvider__callObservers(aMethodName, aArg) {
> +    for (let obs of this._observers) {
> +      if (typeof(obs[aMethodName]) == "function") {

what else it may be? if that's a coding error it should not be swallowed

@@ +709,2 @@
>     */
> +  _providers: [],

looks like this may be better handled by a Set

@@ +722,5 @@
> +  _sortProperties: [
> +    "frecency",
> +    "lastVisitDate",
> +    "url",
> +  ],

doesn't looks like this should be a property, rather a const out of this object.

@@ +747,5 @@
> +    let idx = this._providers.indexOf(aProvider);
> +    if (idx < 0)
> +      throw new Error("Unknown provider");
> +    this._providers.splice(idx, 1);
> +    this._providerLinks.delete(aProvider);

shouldn't this remove "this" from the provider observers arrays? or addObserver should use a WeakSet...

@@ +845,5 @@
> +   * Calls getLinks on the given provider and populates our cache for it.
> +   * @param aProvider The provider whose cache will be populated.
> +   * @param aForce When true, populates the provider's cache even when it's
> +   *               already filled.
> +   * @param aCallback The callback to call when finished.

the order of @params documentation doesn't respect the order defined in the method

@@ +853,5 @@
> +      aCallback();
> +    } else {
> +      aProvider.getLinks((links) => {
> +        // Filter out null and undefined links so we don't have to deal with
> +        // them in getLinks when merging links from providers.

when are providers returning null or undefined links? That seems a bit weird... why are those accepted by the provider?

@@ +860,5 @@
> +          sortedLinks: links,
> +          linkMap: links.reduce((map, link) => {
> +            map.set(link.url, link);
> +            return map;
> +          }, new Map()),

I think this code would win a bit of readability if at least the reduce would happen outside of the set call and be cached in a well-named local var

@@ +904,5 @@
> +    }
> +
> +    let links = [];
> +    let nextLink = null;
> +    while (links.length < this.maxNumLinks && (nextLink = getNextLink())) {

I really don't like assignments in conditions, too much error-prone. what about something more classical?
for (let nextLink = getNextLink();
     links.length < this.maxNumLinks && nextLink;
     nextLink = getNextLink()) {

@@ +925,5 @@
> +
> +    let links = this._providerLinks.get(aProvider);
> +    if (!links)
> +      // The provider notified us of a link before we ever called getLinks on
> +      // it.

again, if it's a coding error, it should not be swallowed

@@ +1140,5 @@
>  
>    init: function NewTabUtils_init() {
> +    if (this.initWithoutProviders()) {
> +      PlacesProvider.init();
> +      Links.addProvider(PlacesProvider);

if all providers must be init-ed, may make sense to have addProvider do that. Is there a reason to add a provider multiple times? you may protect from that into addProvider itself.

::: toolkit/modules/tests/xpcshell/test_NewTabUtils.js
@@ +9,5 @@
> +function run_test() {
> +  for (test of tests) {
> +    test();
> +  }
> +}

I'm honestly unsure why this test is not using the common add_test boilerplate

@@ +57,5 @@
> +      do_check_eq(link.frecency, frecency);
> +    }
> +
> +    NewTabUtils.links.removeProvider(evenFrecencyProvider);
> +    NewTabUtils.links.removeProvider(oddFrecencyProvider);

maybe it's me, but this doesn't seem to test the linkChanged paths, that is a large part of the changes. That scares me a little bit...
Attachment #8393889 - Flags: review?(mak77)
A heads up: I pushed a tryserver build that included this patch (with a bunch of other Directory Tiles patches) that has xpcshell tests failing:

https://tbpl.mozilla.org/?tree=Try&rev=5c91e82ec061

10:36:35     INFO -  TEST-INFO | /builds/slave/talos-slave/test/build/tests/xpcshell/tests/toolkit/components/places/tests/unit/test_frecency_observers.js | Starting test_InsertVisitedURIs_UpdateFrecency_and_History_InsertPlace
10:36:35     INFO -  TEST-INFO | (xpcshell/head.js) | test test_InsertVisitedURIs_UpdateFrecency_and_History_InsertPlace pending (2)
10:36:35     INFO -  TEST-INFO | (xpcshell/head.js) | test run_next_test 0 finished (2)
10:36:35     INFO -  TEST-PASS | /builds/slave/talos-slave/test/build/tests/xpcshell/tests/toolkit/components/places/tests/unit/test_frecency_observers.js | [onFrecencyChanged/obs.onFrecencyChanged : 69] true == true
10:36:35     INFO -  TEST-PASS | /builds/slave/talos-slave/test/build/tests/xpcshell/tests/toolkit/components/places/tests/unit/test_frecency_observers.js | [onFrecencyChanged/obs.onFrecencyChanged : 70] true == true
10:36:35     INFO -  <<<<<<<
10:36:35     INFO -  mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/try-builds/edward.lee@engineering.uiuc.edu-5c91e82ec061/try-macosx64/firefox-31.0a1.en-US.mac.crashreporter-symbols.zip
10:36:56  WARNING -  PROCESS-CRASH | /builds/slave/talos-slave/test/build/tests/xpcshell/tests/toolkit/components/places/tests/unit/test_frecency_observers.js | application crashed [@ libsystem_kernel.dylib + 0x10686]
10:36:56     INFO -  Crash reason:  EXC_SOFTWARE / SIGABRT

I'm starting to debug whether these are related to the directory tiles, although those patches don't touch places.
Attached patch for-review front-end patch v4 (obsolete) — Splinter Review
Thanks, Marco.  This addresses some of your comments.

(In reply to Marco Bonardo [:mak] from comment #46)
> I was reading the fact Tim didn't like this returning an array with 2
> elements.  As you said, it's needed to be able to return either an existing
> index or an insertion index.
> Just an idea out of my mind, since that's similar to what we did for
> frecency. Return positive numbers for found indices, negative numbers for
> insertion indices. If you always return a valid insertion index this should
> work.

The problem with that is that zero can be an insertion index.  We could still return a negative number to mean an insertion index, but the caller would have to do -index - 1 to get the true insertion index, which is error prone.

> Otherwise I also think probably a simple { index:, found: } object would be
> better than an array (the caller can deconstruct objects too).

The array is less typing for the caller, which I prefer, so I left it for now.

> fwiw, there should be no more need to name functions now, the stack can
> properly fetch the method name.

Yeah, I agree, I'm just matching the style in this file.

> @@ +747,5 @@
> > +    let idx = this._providers.indexOf(aProvider);
> > +    if (idx < 0)
> > +      throw new Error("Unknown provider");
> > +    this._providers.splice(idx, 1);
> > +    this._providerLinks.delete(aProvider);
> 
> shouldn't this remove "this" from the provider observers arrays? or
> addObserver should use a WeakSet...

The idea is that the provider object will no longer be used by anyone at all once it's removed.

> @@ +853,5 @@
> > +      aCallback();
> > +    } else {
> > +      aProvider.getLinks((links) => {
> > +        // Filter out null and undefined links so we don't have to deal with
> > +        // them in getLinks when merging links from providers.
> 
> when are providers returning null or undefined links? That seems a bit
> weird... why are those accepted by the provider?

Yeah, I recall the PlacesProvider returning null links when I was working on a different bug awhile ago, and I had to add similar checking then, but maybe it's not actually necessary.

> @@ +925,5 @@
> > +
> > +    let links = this._providerLinks.get(aProvider);
> > +    if (!links)
> > +      // The provider notified us of a link before we ever called getLinks on
> > +      // it.
> 
> again, if it's a coding error, it should not be swallowed

It's not an error in this case, it's just that, e.g., the PlacesProvider notifies us of a frecency change between the time the provider was added and the time the front-end calls getLinks.

> @@ +1140,5 @@
> >  
> >    init: function NewTabUtils_init() {
> > +    if (this.initWithoutProviders()) {
> > +      PlacesProvider.init();
> > +      Links.addProvider(PlacesProvider);
> 
> if all providers must be init-ed, may make sense to have addProvider do
> that.

I don't want init() to be part of the provider interface.

> ::: toolkit/modules/tests/xpcshell/test_NewTabUtils.js
> @@ +9,5 @@
> > +function run_test() {
> > +  for (test of tests) {
> > +    test();
> > +  }
> > +}
> 
> I'm honestly unsure why this test is not using the common add_test
> boilerplate

These tests are not async.

> @@ +57,5 @@
> > +      do_check_eq(link.frecency, frecency);
> > +    }
> > +
> > +    NewTabUtils.links.removeProvider(evenFrecencyProvider);
> > +    NewTabUtils.links.removeProvider(oddFrecencyProvider);
> 
> maybe it's me, but this doesn't seem to test the linkChanged paths, that is
> a large part of the changes. That scares me a little bit...

D'oh, true, I'll add a test.  I don't expect Tim to r+ this patch yet, so I'll go ahead and r? it in the meantime and include the new test in the next patch.
Attachment #8393889 - Attachment is obsolete: true
Attachment #8396393 - Flags: review?(ttaubert)
(In reply to Drew Willcoxon :adw from comment #48)
> > @@ +853,5 @@
> > > +      aCallback();
> > > +    } else {
> > > +      aProvider.getLinks((links) => {
> > > +        // Filter out null and undefined links so we don't have to deal with
> > > +        // them in getLinks when merging links from providers.
> > 
> > when are providers returning null or undefined links? That seems a bit
> > weird... why are those accepted by the provider?
> 
> Yeah, I recall the PlacesProvider returning null links when I was working on
> a different bug awhile ago, and I had to add similar checking then, but
> maybe it's not actually necessary.

This sounds like something that should be automatically tested :/ if the pseudo-"API" doesn't tell whether it can return null link (And mostly why), sounds like a bug that should be filed.

> > @@ +925,5 @@
> > > +
> > > +    let links = this._providerLinks.get(aProvider);
> > > +    if (!links)
> > > +      // The provider notified us of a link before we ever called getLinks on
> > > +      // it.
> > 
> > again, if it's a coding error, it should not be swallowed
> 
> It's not an error in this case, it's just that, e.g., the PlacesProvider
> notifies us of a frecency change between the time the provider was added and
> the time the front-end calls getLinks.

Ok, the comment sounded like "we don't expect this to happen, but just in case...", while it was a "this may happen". Maybe it could be clarified a little bit. fwiw, your explanation here sounds already better than the comment there.

> > ::: toolkit/modules/tests/xpcshell/test_NewTabUtils.js
> > @@ +9,5 @@
> > > +function run_test() {
> > > +  for (test of tests) {
> > > +    test();
> > > +  }
> > > +}
> > 
> > I'm honestly unsure why this test is not using the common add_test
> > boilerplate
> 
> These tests are not async.

add_test is not async, I didn't mean add_task :)
Attached patch for-review front-end patch v4.1 (obsolete) — Splinter Review
This adds a test to test_NewTabUtils.js that checks dynamic changes to links.  The only other change from the previous patch is this small change to NewTabUtils.links.onLinkChanged:

diff --git a/toolkit/modules/NewTabUtils.jsm b/toolkit/modules/NewTabUtils.jsm
--- a/toolkit/modules/NewTabUtils.jsm
+++ b/toolkit/modules/NewTabUtils.jsm
@@ -970,10 +970,15 @@ let Links = {
         for (let prop of this._sortProperties) {
           if (!(prop in aLink))
             throw new Error("New link missing required sort property: " + prop);
         }
-        link = aLink;
-        linkMap.set(aLink.url, aLink);
+        // Copy the link object so that if the caller changes it, it doesn't
+        // screw up our bookkeeping.
+        link = {};
+        for (let [prop, val] of Iterator(aLink)) {
+          link[prop] = val;
+        }
+        linkMap.set(link.url, link);
       }
       let idx = this._insertionIndexOf(sortedLinks, link);
       sortedLinks.splice(idx, 0, link);
       if (sortedLinks.length > aProvider.maxNumLinks) {
Attachment #8396393 - Attachment is obsolete: true
Attachment #8396393 - Flags: review?(ttaubert)
Attachment #8396599 - Flags: review?(ttaubert)
(In reply to Ed Lee :Mardak from comment #47)
> A heads up: I pushed a tryserver build that included this patch (with a
> bunch of other Directory Tiles patches) that has xpcshell tests failing:

Thanks, I see this, too.  I don't know when it started failing because it was working before.  I don't know what the crashes are about, but this is actually a timeout due to the way the test uses promises.  It was doing this in a couple of places:

  yield onFrecencyChanged(uri);
  yield onFrecencyChanged(uri);

The problem is that the notification for which line 2 sets up an observer is sent before line 2 gets a chance to run, so the yield on line 2 waits forever.  This patch fixes that by doing this instead:

  yield Promise.all([onFrecencyChanged(uri), onFrecencyChanged(uri)]);

Since this worked before, something changed in the timing here, but the fix is definitely more correct and robust than the previous code, so all right.
Attachment #8390344 - Attachment is obsolete: true
Attachment #8396637 - Flags: review+
No longer blocks: 975211
Comment on attachment 8396599 [details] [diff] [review]
for-review front-end patch v4.1

Bug 975228 comment 27 points out the need to handle duplicate URLs across providers, which this patch doesn't do, so I'll need to post another new patch.  I'll leave the r? on the current one because there's a lot else there to be commented on.
Blocks: 988447
Er, no, I'm wrong, duplicates are handled just fine, nevermind. :-/  Meaning they'll be sorted correctly, etc.  Ed filed bug 988447 for different behavior, what seems to amount to preferring one provider over another in the case of dupes, but we shouldn't let that block this bug.
browser_newtab_update.js seems to have intermittent mochitest failures on Linux Opt:

fail: https://tbpl.mozilla.org/?tree=Try&rev=79a2687c728d
pass: https://tbpl.mozilla.org/?tree=Try&rev=994434a02f47

fail w/ directory patches: https://tbpl.mozilla.org/?tree=Try&rev=7293638bed6c
pass w/ directory patches: https://tbpl.mozilla.org/?tree=Try&rev=d23c2b35f47b
Comment on attachment 8396599 [details] [diff] [review]
for-review front-end patch v4.1

Review of attachment 8396599 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good in general. f+ not r+ only because of the sortedFirstLinks array question where I think we can do this more efficiently.

::: browser/base/content/newtab/page.js
@@ +72,2 @@
>      // The grid might not be ready yet as we initialize it asynchronously.
> +    if (gGrid.ready && (!aOnlyIfHidden || !this.allowBackgroundCaptures)) {

I like Marco's suggestion here:

let skipUpdate = aOnlyIfHidden && this.allowBackgroundCaptures;
if (gGrid.ready && !skipUpdate) {

::: toolkit/modules/NewTabUtils.jsm
@@ +24,5 @@
> +
> +XPCOMUtils.defineLazyGetter(this, "Timer", () => {
> +  let Timer = {};
> +  Cu.import("resource://gre/modules/Timer.jsm", Timer);
> +  return Timer;

Shouldn't this just work?

return Cu.import("resource://gre/modules/Timer.jsm", {});

@@ +745,5 @@
> +   */
> +  removeProvider: function Links_removeProvider(aProvider) {
> +    if (!this._providers.delete(aProvider))
> +      throw new Error("Unknown provider");
> +    this._providerLinks.delete(aProvider);

Did you mean to write if (!this._providers.has(...))? According to MDN .delete() always returns false.

@@ +782,5 @@
> +    for (let provider of this._providers) {
> +      this._populateProviderCache(provider, () => {
> +        if (--numProvidersRemaining == 0)
> +          executeCallbacks();
> +      }, aForce);

I think this could be much nicer with promises... but we don't use them here at all yet. Stuff for a follow-up.

@@ +847,5 @@
> +   *               already filled.
> +   */
> +  _populateProviderCache: function Links_populateProviderCache(aProvider, aCallback, aForce) {
> +    if (this._providerLinks.has(aProvider) && !aForce) {
> +      aCallback();

Nit: setTimeout(aCallback) would be better to always let that be "async" i.e. called earliest on the next tick. I should have done the same for populateCache() I guess.

@@ +849,5 @@
> +  _populateProviderCache: function Links_populateProviderCache(aProvider, aCallback, aForce) {
> +    if (this._providerLinks.has(aProvider) && !aForce) {
> +      aCallback();
> +    } else {
> +      aProvider.getLinks((links) => {

Nit: aProvider.getLinks(links => {

@@ +869,5 @@
> +  /**
> +   * Merges the cached lists of links from all providers whose lists are cached.
> +   * @return The merged list.
> +   */
> +  _mergeProviderLinks: function Links__mergeProviderLinks() {

Nit: _mergeProviderLinks() sound like something that would modify state without returning anything. Maybe _getMergedProviderLinks()?

@@ +877,5 @@
> +      // is the next link.
> +      let sortedFirstLinks = [];
> +      for (let [provider, sortedLinks] of sortedLinksPerProvider) {
> +        if (sortedLinks.length) {
> +          let link = { link: sortedLinks[0], provider: provider };

Shouldn't we only construct this list of links once? It seems expensive to do that every time getNextLink() is called. Alternatively you could make this a generator so you don't have to track state.

@@ +898,5 @@
> +    // Build a mapping from each provider to a copy of its sortedLinks list.
> +    let sortedLinksPerProvider = new Map();
> +    for (let [provider, links] of this._providerLinks) {
> +      sortedLinksPerProvider.set(provider, links.sortedLinks.slice());
> +    }

Nit: Can you please do all this at the top of the function? I was temporarily confused where the variable is coming from in getNextLink().

@@ +925,5 @@
> +    let links = this._providerLinks.get(aProvider);
> +    if (!links)
> +      // The provider notified us of a link before we ever called getLinks on
> +      // it.
> +      return;

Nit: please add a pair of brackets here.
Attachment #8396599 - Flags: review?(ttaubert) → feedback+
(In reply to Tim Taubert [:ttaubert] from comment #55)
> > +      let sortedFirstLinks = [];
> > +      for (let [provider, sortedLinks] of sortedLinksPerProvider) {
> > +        if (sortedLinks.length) {
> > +          let link = { link: sortedLinks[0], provider: provider };
> Shouldn't we only construct this list of links once? It seems expensive to
> do that every time getNextLink() is called. Alternatively you could make
> this a generator so you don't have to track state.
The only re-construction that happens should be the first element of each provider where initially there's only two providers. Arguably sortedFirstLinks doesn't need to be maintained in the first place as we only care about the #1 item.

let firstLink = provider[0][0];
for (i = 1; i < providers.length; i++)
  (firstLink, ignoreLink) = compare(firstLink, providers[i][0]);

The generator idea does make sense if we want to maintain a reference to first link of each provider, but maybe the above code is simple enough?
You're right, we can actually leave that as is. I read it wrong because it just looks so complicated - there is definitely value in cleaning this up and making it easier to read. I wouldn't want to block important work on that, so if we can do this in a follow-up that would be sufficient. The module in its current state needs some cleanup anyway as it's become really messy.
Comment on attachment 8396599 [details] [diff] [review]
for-review front-end patch v4.1

r=me with all the feedback addressed.
Attachment #8396599 - Flags: feedback+ → review+
nextLinks keeps a reference to the links array of the best provider's of sortedLinksPerProvider. (So we can refer to its next/best link as well as slice from the live array!)
Attachment #8398047 - Flags: feedback?(adw)
(In reply to Tim Taubert [:ttaubert] from comment #55)
> Shouldn't this just work?
> 
> return Cu.import("resource://gre/modules/Timer.jsm", {});

I think you need:

return Cu.import("resource://gre/modules/Timer.jsm", {}).Timer;
Oh, nevermind, I forgot what Timer.jsm's EXPORTED_SYMBOLS were.
Comment on attachment 8398047 [details] [diff] [review]
simplify.getNextLink.patch

Review of attachment 8398047 [details] [diff] [review]:
-----------------------------------------------------------------

Oh yeah!  *music from Ferris Bueller starts playing*
Attachment #8398047 - Flags: feedback?(adw) → feedback+
(In reply to Ed Lee :Mardak from comment #54)
> browser_newtab_update.js seems to have intermittent mochitest failures on
> Linux Opt:

Thanks, Ed!  After I saw your comment I retriggered a bunch of bc tests on a try run I had already done, and now I see this, too.  The test clears history, adds visits, updates pages, and waits for all of those to happen, so it's definitely plausible that there are async errors in there that pop up due to different timings.

I think I found the problem and have fixed it.  I pushed another try run to see.  Hopefully this can land tomorrow.

https://tbpl.mozilla.org/?tree=Try&rev=0fbb7e333e50

> diff --git a/browser/base/content/test/newtab/browser_newtab_update.js b/browser/base/content/test/newtab/browser_newtab_update.js
> --- a/browser/base/content/test/newtab/browser_newtab_update.js
> +++ b/browser/base/content/test/newtab/browser_newtab_update.js
> @@ -5,10 +5,21 @@
>   * Checks that newtab is updated as its links change.
>   */
> 
>  function runTests() {
> -  // First, start with an empty page.
> -  yield setLinks([]);
> +  // First, start with an empty page.  setLinks will trigger a hidden page
> +  // update because it calls clearHistory.  We need to wait for that update to
> +  // happen so that the next time we wait for a page update below, we catch the
> +  // right update and not the one triggered by setLinks.
> +  //
> +  // Why this weird way of yielding?  First, these two functions don't return
> +  // promises, they call TestRunner.next when done.  Second, the point at which
> +  // setLinks is done is independent of when the page update will happen, so
> +  // calling whenPagesUpdated cannot wait until that time.
> +  setLinks([]);
> +  whenPagesUpdated(null, true);
> +  yield;
> +  yield;
> 
>    // Strategy: Add some visits, open a new page, check the grid, repeat.
>    fillHistory([link(1)]);
>    yield whenPagesUpdated(null, true);
Attached patch for-check-in front-end patch (obsolete) — Splinter Review
Thanks, Tim!  This addresses your comments, incorporates Ed's patch, adds a test to the xpcshell test to test the new async callback behavior when _populateProviderCache is called on a cached provider, refactors some of the xpcshell test, and hopefully fixes the bc test failure that Ed noticed.

As I mentioned above, I'm going to wait until the results of the try run come in before landing.

(In reply to Tim Taubert [:ttaubert] from comment #55)
> @@ +745,5 @@
> > +   */
> > +  removeProvider: function Links_removeProvider(aProvider) {
> > +    if (!this._providers.delete(aProvider))
> > +      throw new Error("Unknown provider");
> > +    this._providerLinks.delete(aProvider);
> 
> Did you mean to write if (!this._providers.has(...))? According to MDN
> .delete() always returns false.

MDN says, "Returns true if an element in the Set object has been removed successfully; otherwise false."  The xpcshell test would be failing if this were a problem, but I double checked in a JS shell, and sure enough, delete returns true if the item was in the Set and false otherwise.
Attachment #8396599 - Attachment is obsolete: true
(In reply to Drew Willcoxon :adw from comment #64)
> MDN says, "Returns true if an element in the Set object has been removed
> successfully; otherwise false."  The xpcshell test would be failing if this
> were a problem, but I double checked in a JS shell, and sure enough, delete
> returns true if the item was in the Set and false otherwise.

You're right, I misread the comment on the Set's overview page. Carry on!
(In reply to :Gavin Sharp (email gavin@gavinsharp.com) from comment #61)
> Oh, nevermind, I forgot what Timer.jsm's EXPORTED_SYMBOLS were.

yes, I suggested to add a Timer wrapper to Timer.jsm, to make it possible to defineLazyModuleGetter it. Maybe I should file a separate bug for that.
(In reply to Marco Bonardo [:mak] from comment #66)
> (In reply to :Gavin Sharp (email gavin@gavinsharp.com) from comment #61)
> > Oh, nevermind, I forgot what Timer.jsm's EXPORTED_SYMBOLS were.
> 
> yes, I suggested to add a Timer wrapper to Timer.jsm, to make it possible to
> defineLazyModuleGetter it. Maybe I should file a separate bug for that.

+1. I hate that this module wants to be special :)
There was a persistent error thrown in the try run (https://tbpl.mozilla.org/?tree=Try&rev=0fbb7e333e50) that screwed up a bunch of the newtab bc tests.  One error, but the test it happened in is different depending on the run.  That didn't happen in a try run I did with the previous patch.  I can reproduce locally.  The problem seems to be the change to address this comment:

(In reply to Tim Taubert [:ttaubert] from comment #55)
> @@ +847,5 @@
> > +   *               already filled.
> > +   */
> > +  _populateProviderCache: function Links_populateProviderCache(aProvider, aCallback, aForce) {
> > +    if (this._providerLinks.has(aProvider) && !aForce) {
> > +      aCallback();
> 
> Nit: setTimeout(aCallback) would be better to always let that be "async"
> i.e. called earliest on the next tick. I should have done the same for
> populateCache() I guess.

If aCallback is called asyncly, the error happens, if syncly, no error.  I pushed a new run with that change: https://tbpl.mozilla.org/?tree=Try&rev=1006f849a247

I'm still trying to understand why it happens.
Attached patch for-check-in front-end patch v2 (obsolete) — Splinter Review
Latest try run: https://tbpl.mozilla.org/?tree=Try&rev=b205883b96f9

The earlier patch didn't fix the failure Ed reported, so I had to figure that out.  I'm pretty sure that patch did fix a potential problem, though.  I finally fixed the failure by waiting at the beginning of the test for dynamic updates triggered by the previous test to finish.

Then there were other failures, only on Windows XP.  Those failures were due to calling Date.now inside a loop while creating visits in head.js's fillHistory, which the patch picked up because it now secondarily sorts links on visit date instead of Place ID.  I moved the now() to outside.  Working on this made me notice that fillHistory needed to be updated to account for the new sort criteria, so I did that, too.

Finally, the failures due to _populateProviderCache's calling its callback asyncly that I mentioned in my last comment -- I'm just going to punt on those and revert back to making it sync.  I'm pretty sure it has to do with a race between head.js's addNewTabPageTab and page.js's gPage._init.  The error is due to the grid not being initialized; the grid is initialized by gPage._init inside a populateCache callback.  addNewTabPageTab does not wait for the grid to be initialized before calling TestRunner.next, so sometimes when the test is continued the grid is not initialized.  But I had trouble confirming this hypothesis, I think because preloading got in my way, and by that point I had invested enough time.
Attachment #8398285 - Attachment is obsolete: true
https://hg.mozilla.org/mozilla-central/rev/3951f7e82b4c
https://hg.mozilla.org/mozilla-central/rev/70638b7c97f1
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 31
Hi Tracy, the current iteration ends on Monday.  Will you have time then for testing or should I move this to the next iteration for verification?
Flags: needinfo?(twalker)
Backed out in https://hg.mozilla.org/mozilla-central/rev/0e19655e93df for frequent timeouts (starred as bug 963048, though a new regression from this patch, it seems).
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
You can ignore my previous 'need info' request Tracy - bug has been reopened.
Flags: needinfo?(twalker)
Attached patch for-check-in front-end patch v3 (obsolete) — Splinter Review
I can reproduce locally on an opt build.  I can't reproduce locally anymore with this patch.  I'll land again tonight PDT if this try run is OK: https://tbpl.mozilla.org/?tree=Try&rev=8865f6adcf6c

The failing PB test calls head.js's setLinks, which clears history, which triggers a dynamic onManyLinksChanged update (introduced by the patch here), which triggers a grid update for every non-visible page.  The test is failing because its grid is being updated as a result of this, while it is running, even though the grid's page is visible and should therefore not be updated.

So the problem is that the page thinks it's hidden when in fact it's visible.  It thinks it's hidden because Page.allowBackgroundCaptures is broken for pages in PB windows.  allowBackgroundCaptures is set on pages that are created by the preloader -- by BrowserNewTabPreloader.newTab.  But the preloader is skipped altogether by tabbrowser when the window is a PB window: http://mxr.mozilla.org/mozilla-central/source/browser/base/content/tabbrowser.xml#1597  (BrowserNewTabPreloader.newTab is used even when the preloader is disabled, so this bug should not manifest when the preloader is disabled.)

I've fixed this by making allowBackgroundCaptures return true if inPrivateBrowsingMode().  This isn't great because it doesn't capture other possible scenarios where BrowserNewTabPreloader.newTab is bypassed.  Right now newtab uses the presence of an attribute to mean allowBackgroundCaptures; ideally that would be inverted so there would be a prohibitBackgroundCaptures attribute.  Then pages created by the preloader would have that attribute set (while they are preloaded), and pages created outside the preloader wouldn't, so they could assume they're visible.  But the preloader uses a frame script to asyncly talk to the page, so there'd be some small window where prohibitBackgroundCaptures would not be set, so that's not easily doable.  Making this visibility/allowBackgroundCaptures mechanism nicer would be a good follow-up, except Tim's bug 972341 should make it unnecessary altogether.
Attachment #8398890 - Attachment is obsolete: true
(In reply to Drew Willcoxon :adw from comment #75)
> The failing PB test calls head.js's setLinks, which clears history, which
> triggers a dynamic onManyLinksChanged update (introduced by the patch here),
> which triggers a grid update for every non-visible page.  The test is
> failing because its grid is being updated as a result of this, while it is
> running, even though the grid's page is visible and should therefore not be
> updated.
I wonder if this is related to the async issues from setTimeout(aCallback)? Where there's unexpected updates triggering whenPagesUpdated at the wrong time?

I started digging into browser_newtab_perwindow_private_browsing.js earlier, and I believe I tracked it down to some unexpected update causing  yield blockCell(1);  (in pb mode) to wait forever.
(In reply to Ed Lee :Mardak from comment #76)
> I wonder if this is related to the async issues from setTimeout(aCallback)?
> Where there's unexpected updates triggering whenPagesUpdated at the wrong
> time?

Which setTimeout(aCallback) are you referring to?  If it's the one Tim suggested for _populateProviderCache, the final patch didn't have that.  It just calls aCallback directly.

The problem was the page updating itself when it shouldn't have.  Visible pages are supposed to just ignore updates altogether, so "unexpected" updates shouldn't matter.

> I started digging into browser_newtab_perwindow_private_browsing.js earlier,
> and I believe I tracked it down to some unexpected update causing  yield
> blockCell(1);  (in pb mode) to wait forever.

That's what I saw, too.  Blocking and rearranging sites triggers CSS transitions, and transformations.js would end up waiting for a transitionend on a node that had been removed by the update, end everything was blocked on that endless wait.  In debugging, I removed the transitions, which resulted in unexpected errors related to DOM nodes, which is how I tracked this down.
Backed out in https://hg.mozilla.org/integration/fx-team/rev/05906fafbb0f for intermittent Win8 debug failures in browser_newtab_update.js (https://tbpl.mozilla.org/php/getParsedLog.php?id=36957575&tree=Fx-Team, https://tbpl.mozilla.org/php/getParsedLog.php?id=36955999&tree=Fx-Team) like the one you had on try (https://tbpl.mozilla.org/php/getParsedLog.php?id=36953938&tree=Try). Don't know if it's actually only on Win8 debug, or just that that's what I retriggered the most since you did fail there on try.

I'm also deeply suspicious that you're spewing and slowing things down, since the number of "Output exceeded 52428800 bytes" and "command timed out: 12000 seconds elapsed" failures seems excessive, but browser-chrome is so incredibly broken anyway that it's very difficult to tell more-broken from same-broken.
From the test logs:

grid status = 2,1,3,4,,,,, - Got 2,1,3,,,,,,, expected 2,1,3,4,,,,,
grid status = 2,1,3,4,,,,, - Got 2,1,3,,,,,,, expected 2,1,3,4,,,,,
grid status = 2,1,3,4,,,,, - Got 2,1,,,,,,,, expected 2,1,3,4,,,,,

This comes after a successful checkGrid("1,2,,,,,,,");

So it seems like fillHistory([link(2), link(3), link(4)]); might be triggering multiple updates. Should this fix things?

yield fillHistory([link(2), link(3), link(4)], TestRunner.next);

Or maybe

yield fillHistory([link(2), link(3)], TestRunner.next);
fillHistory([link(4)]);
(In reply to Ed Lee :Mardak from comment #80)
> grid status = 2,1,3,4,,,,, - Got 2,1,3,,,,,,, expected 2,1,3,4,,,,,
> grid status = 2,1,3,4,,,,, - Got 2,1,3,,,,,,, expected 2,1,3,4,,,,,
> grid status = 2,1,3,4,,,,, - Got 2,1,,,,,,,, expected 2,1,3,4,,,,,
To be clear, these are 3 separate test runs' logs.

Wait for all 3 links to be added before waiting for update:
+  yield fillHistory([link(2), link(3), link(4)], TestRunner.next);
https://tbpl.mozilla.org/?tree=Try&rev=3dfce54dd936
https://hg.mozilla.org/try/rev/41262fe7f6fd

Add 2 links (in expected descending date order) then add link 2 which will be first anyway in the same pattern only single links were added for previous checkGrid()s:
+  yield fillHistory([link(3), link(4)], TestRunner.next);
+  fillHistory([link(2)]);
https://tbpl.mozilla.org/?tree=Try&rev=8bf217945adb
https://hg.mozilla.org/try/rev/3e5655cb4c66
Thank you for spending your Sunday relanding this, Ed.
Whiteboard: [tiles] p=8 s=it-31c-30a-29b.1 [qa+] → [tiles] p=8 s=it-31c-30a-29b.2 [qa+]
verified with todays Nightly build 20140401
Status: RESOLVED → VERIFIED
Whiteboard: [tiles] p=8 s=it-31c-30a-29b.2 [qa+] → [tiles] p=8 s=it-31c-30a-29b.2 [qa!]
No longer blocks: fxdesktopbacklog
Flags: firefox-backlog+
Blocks: 1036280
Blocks: 1060742
Blocks: 1071448
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: