Closed Bug 1409711 Opened 7 years ago Closed 3 years ago

[API] Establish a good practice for paginating results.

Categories

(Webtools Graveyard :: Pontoon, enhancement, P3)

enhancement

Tracking

(Not tracked)

RESOLVED MOVED

People

(Reporter: stas, Unassigned)

References

Details

Let's use this bug to discuss the naming scheme for paginated API fields and their shapes.

Currently with have two types of relation fields:

  - singular: project(slug), locale(code),
  - plural: projects, locales, localizations.

The plural fields return a full list of all available items.

For building UIs, paginated fields will be useful and more efficient. I imagine something like the following:

  {
    projectPages(pageNumber: 1, pageSize: 20) {
      totalCount
      pageInfo {
        pageNumber
        pageSize
        hasNext
        hasPrevious
      }
      items {
        name
        slug
      }
    }
  }

The field naming scheme would be: the noun in singular + "Pages". So: projectPages, localePages, localizationPages, resourcePages, entityPages, translationPages etc.
It's worth mentioning that for rapidly changing data GraphQL recommends Relay and the cursor pagination: https://facebook.github.io/relay/graphql/connections.htm. I think Pontoon's data doesn't change as often so it's likely that Relay would be too much for our use-case.
I would recommend to take a look at what github does, and consider it is probably the right thing to do. :)
They use Relay. Relay has been designed to solve the problem of paginating over a set which changes while you paginate. Django's Paginator might return the same items twice or skip a few. Relay won't, because it anchors each page to a cursor pointing at a specific item.

I'm pretty sure we don't need Relay for projects, locales, resources or entities. We might need for suggestions and the accessing the history of translations for an entity.
…and for user notifications.
Priority: -- → P2
To illustrate the difference between Django Paginator and Relay, here are two example queries which perform the same query.

Paginator:

  {
    projectPages(pageNumber: 2, pageSize: 20) {
      totalCount
      pageInfo {
        hasNext
      }
      items {
        name
        missingStrings
      }
    }
  }

Relay:

  {
    projectStream(first: 20, after: "opaqueCursor") {
      pageInfo {
        hasNextPage
      }
      edges {
        cursor
        node {
          name
          missingStrings
        }
      }
    }
  }

The main difference is how you specify the slice: in Relay you use an opaque cursor returned by the previous query. The cursor is generated by Relay and returned as a field on the "edge" shape which in turn requires that the actual list item be wrapped in the "node" shape.
Perhaps we could go for the simpler pagination with Django Paginator now and add Relay for suggestions and notifications later? We could use names like projectPages, localePages etc. for the Paginator and notificationStream for Relay.

graphene-django seems to have improved the Relay implementation in the upcoming 2.0 version. The last time I looked at the 1.x branch I found the implementation a bit lacking. Namely there were issues with Relay's ConnectionFields removing all prefetch_related and select_related optimizations from querysets. I'll verify if these problems are still present on the 1.x branch. If so, it might make sense to wait  with using Relay until 2.0 is released.
I wouldn't be surprised if 2.0 was much better in this regard :-)

Also, flod mentioned yesterday to me that he already suffers a bit from the depth of the returned json, and it seems like projectStream is even deeper nested than projectPages? Not just in the query, but also the result?
Yes, it introduces another level of nesting by means of the edges -> node relationship. This is due to the fact that the "cursor" must be defined on the edge rather than the node.

Note that for building reports I'd imaging the non-paginated fields to be easier to work with. I see pagination as something useful to front-end mostly: a way to not send the whole dataset at once to the user.
Depends on: 1411955
Priority: P2 → P3

Hello, I am new to contributing to Bugzilla can you assigned me to this issue so that I can learn to fix these bugs and also suggest to me how can Is start working on this bug to fix it.

Hi Falguni Islam, let's sync on chat.mozilla.org first.

*This bug has been moved to GitHub.*

*Please check it out on https://github.com/mozilla/pontoon/issues.*
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → MOVED
Product: Webtools → Webtools Graveyard
You need to log in before you can comment on or make changes to this bug.