Closed Bug 1905958 Opened 1 year ago Closed 6 months ago

Create a JSON schema to describe the graph that is serialized for session restore

Status:

RESOLVED FIXED

Milestone:

140 Branch

Tracking Flags:

Tracking

Status

firefox140

---

fixed

People

(Reporter: sfoster, Assigned: sfoster)

References

Details

(Whiteboard: [fidefe-session-restore])

Attachments

(1 file)

Bug 1905958 - Add a JSON schema for browser session state and a way to validate against it. r?#sessionstore-reviewers (6 months ago, Sam Foster [:sfoster] (he/him), 48 bytes, text/x-phabricator-request)

Sam Foster [:sfoster] (he/him)

Assignee

Description

•

1 year ago

In bug 1849393 we identified a need for a formal schema that defines what goes into a session file. The session state is an in-memory JavaScript graph that gets serialized to JSON, compressed, and written to disk as a .jsonlz4 file, allowing a user to seamlessly resume a session after a restart or crash.

The basic structure looks something like this:

{
  "version": [
    "sessionrestore",
    1
  ],
  "windows": [
    {
      "tabs": [
        {
          "entries": [
            {
              "url": "https://example.com/",
              "title": "Page title",
              "hasUserInteraction": false,
              "triggeringPrincipal_base64": "{\"3\":{}}"
            },
            ...
          ],
          "lastAccessed": 1697149705740,
          "hidden": false,
          "searchMode": null,
          "userContextId": 0,
          "attributes": {},
          "index": 3,
          "requestedIndex": 0,
          "image": "data:image/x-icon;base64,etc.."
        },
        ...
      ],
      "_closedTabs": [
        {
          "state": {
            "entries": [
              {
                "url": "https://elsewhere.com/",
                "title": "Page title",
                "resultPrincipalURI": null,
                "principalToInherit_base64": "{\"0\":{\"0\":\"moz-nullprincipal:{b8140753-f226-4c5d-9749-fc3eea899d9f}\"}}",
                "hasUserInteraction": true,
                "triggeringPrincipal_base64": "{\"3\":{}}",
                "persist": true
              }
            ],
            "lastAccessed": 1697142677640,
            "hidden": false,
            "searchMode": null,
            "userContextId": 0,
            "attributes": {},
            "index": 1,
            "requestedIndex": 0,
            "image": "..."
          },
          "title": "Page title",
          "image": "...",
          "pos": 0,
          "closedAt": 1697147353955,
          "closedInGroup": false,
          "removeAfterRestore": true,
          "closedId": 3,
          "sourceWindowId": "window0"
        },
        ...
      ]
    },
    ...
  ],
  "selectedWindow": 0,
  "_closedWindows": [
     ...
  ],

  "session": {
    "lastUpdate": 1697149716870,
    "startTime": 1697149627011,
    "recentCrashes": 1
  },
  "global": {},
  "cookies": [
    {
      "host": "example.com",
      "value": "af6e3...",
      "path": "/",
      "name": "someName",
      "secure": true,
      "httponly": true,
      "expiry": 1697142669447,
      "originAttributes": {
        "firstPartyDomain": "",
        "geckoViewSessionContextId": "",
        "inIsolatedMozBrowser": false,
        "partitionKey": "",
        "privateBrowsingId": 0,
        "userContextId": 3
      },
      "sameSite": 1,
      "schemeMap": 2
    },
    ...
  ]
}

...but there is quite a lot of detail down at the individual entries level for each tab.

Having a schema would be useful in tests. It would also provide a known quantity and a jumping-off point for any future optimization or re-architecting of how session (re)store works.
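As a rough illustration of what such a schema could look like at the top level, here is a minimal draft-07 sketch written as a JavaScript object. The property names are taken from the example structure above; the types, the `required` list, and the helper `missingTopLevelKeys` are assumptions made for illustration and are not the schema that eventually landed.

```javascript
// Hypothetical draft-07 sketch of the top-level session object.
// Property names come from the example structure in this bug; the
// types and the `required` list are assumptions.
const sessionSchemaSketch = {
  $schema: "http://json-schema.org/draft-07/schema#",
  type: "object",
  properties: {
    version: { type: "array" },
    windows: { type: "array", items: { type: "object" } },
    selectedWindow: { type: "integer" },
    _closedWindows: { type: "array" },
    session: {
      type: "object",
      properties: {
        lastUpdate: { type: "integer" },
        startTime: { type: "integer" },
        recentCrashes: { type: "integer" },
      },
    },
    global: { type: "object" },
    cookies: { type: "array", items: { type: "object" } },
  },
  required: ["version", "windows"], // assumed, not confirmed by the bug
};

// Illustration only: lists the required top-level keys that are absent.
// A real JSON Schema validator checks far more than this.
function missingTopLevelKeys(doc, schema) {
  return schema.required.filter(key => !(key in doc));
}
```

A test could feed a parsed session document to a real validator together with a schema like this; the helper above only shows the kind of question the schema is meant to answer.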

Sam Foster [:sfoster] (he/him)

Assignee

Updated

•

1 year ago

Comment 1

•

1 year ago

•

Edited

I have a work-in-progress JSON schema at https://github.com/sfoster/moz-sessionrestore-tools; the draft schema itself is at session-schema.json. Once that is closer to done and we have figured out where in the tree this should live, I'll get a patch on here. PRs welcome in the meantime.

Sam Foster [:sfoster] (he/him)

Assignee

Updated

•

1 year ago

Assignee: nobody → sfoster

Status: NEW → ASSIGNED

Sam Foster [:sfoster] (he/him)

Assignee

Comment 2

•

10 months ago

:adw it looks like maybe you or :mak might be able to answer this question. When validating an array, JsonSchemaValidator seems to assume that all items in the array have the same type? That's not how I understand the spec though: for draft-07 at least, each item in an array can have its own schema to validate it.

A concrete example: to validate the version property of the session restore document I have:

"version": {
  "type": "array",
  "items": [
    {"type": "string"},
    {"type": "integer"}
  ]
}

and the example input looks like:

  "version": [
    "sessionrestore",
    1
  ],

Did I misread this (the code or the spec) or do we need a patch on the validator implementation to support this?
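For reference, the draft-07 behavior being described (an `items` array applied position by position) can be sketched in plain JavaScript. This is only an illustration of the spec semantics, not code from JsonSchemaValidator; the function names are hypothetical and only the "string" and "integer" type keywords are handled.

```javascript
// Minimal illustration of draft-07 positional ("tuple") validation.
// Only the "string" and "integer" type keywords are supported here.
function matchesType(value, type) {
  if (type === "string") return typeof value === "string";
  if (type === "integer") return Number.isInteger(value);
  throw new Error(`unsupported type in this sketch: ${type}`);
}

function validateTupleItems(value, itemSchemas) {
  if (!Array.isArray(value)) return false;
  // Element i is checked against itemSchemas[i]. Elements beyond the
  // schema list are left unchecked, as when additionalItems is absent,
  // and a shorter array is not an error unless minItems says so.
  return itemSchemas.every(
    (schema, i) => i >= value.length || matchesType(value[i], schema.type)
  );
}
```

With the schema fragment above, `["sessionrestore", 1]` would pass and `[1, "sessionrestore"]` would fail, which is the distinction a single shared item type cannot express.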

Flags: needinfo?(adw)

Dimitrios Apostolou

Comment 3

•

10 months ago

Some information that might be useful in the schema is the importance of each field. For example:

Opening a new tab is of high importance, and it should be synced to disk immediately.
But the position of a page is not that important. In fact, syncing to disk every time the user scrolls is quite wasteful. I assume there are more fields like this.

Sam Foster [:sfoster] (he/him)

Assignee

Comment 4

•

10 months ago

(In reply to Dimitrios Apostolou from comment #3)

Some information that might be useful in the schema is the importance of each field. For example:

Opening a new tab is of high importance, and it should be synced to disk immediately.

But the position of a page is not that important. In fact, syncing to disk every time the user scrolls is quite wasteful. I assume there are more fields like this.

We are planning on eventually moving to an incremental write model, where the cost of a single property update will be greatly reduced. If there's still a need for tracking what kinds of changes have been made, these kinds of weighting values could potentially live in the schema or just in the code. Similar heuristics elsewhere are typically just implemented in code.

Drew Willcoxon :adw

Comment 5

•

10 months ago

Sorry for the delay, I was out until today. JsonSchemaValidator isn't quite standard and we should probably replace it with a standardized one at some point.

items can't be an array but type can, so this schema fragment should work for you:

  {
    type: "array",
    items: {
      type: ["string", "integer"],
    },
  },

Here's a fuller example:

JsonSchemaValidator.validate(
  {
    version: [
      "sessionrestore",
      1
    ],
  },
  {
    type: "object",
    properties: {
      version: {
        type: "array",
        items: {
          type: ["string", "integer"],
        },
      },
    },
  }
);

Flags: needinfo?(adw)

Drew Willcoxon :adw

Comment 6

•

10 months ago

But there's no way to say "the first element must be a string and the second element must be an int," so if you need that, you'll have to do some post-validation validation on your own. In that case I would suggest not using an array at all but an object instead.
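A sketch of what that post-validation step might look like for the version property follows. The function name and the exact-length check are assumptions for illustration; the schema itself would still only guarantee "an array of strings and integers".

```javascript
// Hypothetical post-validation check for the `version` tuple.
// Assumes the tuple has exactly two elements: a name and a number.
function isValidVersionTuple(version) {
  return (
    Array.isArray(version) &&
    version.length === 2 &&
    typeof version[0] === "string" &&
    Number.isInteger(version[1])
  );
}

// The object-based alternative would carry the same data with named
// keys, for example { name: "sessionrestore", number: 1 } (key names
// are hypothetical), so a plain `properties` schema could validate it.
```

Changing the on-disk shape from an array to an object would affect existing session files, so the extra check may be the less disruptive option.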

Sarah Clements [:sclements]

Updated

•

10 months ago

Whiteboard: [fidefe-session-restore]

Jira Integration Bot

Updated

•

10 months ago

See Also: → https://mozilla-hub.atlassian.net/browse/FIDEFE-6117

Andreas Farre [:farre]

Comment 7

•

7 months ago

(In reply to Sam Foster [:sfoster] (he/him) from comment #4)

We are planning on eventually moving to an incremental write model, where the cost of a single property update will be greatly reduced. If there's still a need for tracking what kinds of changes have been made, these kinds of weighting values could potentially live in the schema or just in the code. Similar heuristics elsewhere are typically just implemented in code.

When we start going forward with this, we might want to revisit how core-session-restore integrates, since data collection from content documents is very incremental. Is there a bug for the incremental work?

Sam Foster [:sfoster] (he/him)

Assignee

Comment 8

•

7 months ago

(In reply to Andreas Farre [:farre] from comment #7)

When we start going forward with this, we might want to revisit how core-session-restore integrates, since data collection from content documents is very incremental. Is there a bug for the incremental work?

There's no bug yet, just the discussion in bug 1849393 in which incremental updates are identified as a solution to the problem.

Sam Foster [:sfoster] (he/him)

Assignee

Comment 9

•

6 months ago

Attached file: Bug 1905958 - Add a JSON schema for browser session state and a way to validate against it. r?#sessionstore-reviewers

Sam Foster [:sfoster] (he/him)

Assignee

Comment 10

•

6 months ago

I'm attaching a snapshot of the WIP of this patch: this is an xpcshell test that takes the draft schema and validates a session data file against it, passing if it validates, failing if not.

I'm using Validator in the JsonSchema module, rather than JsonSchemaValidator.sys.mjs. That follows $refs properly and has better exceptions.
The structure of the entities and the references and where all that lives is work-in-progress. We may want to run validation at runtime when some debug pref is enabled, so the test directory is not the final home for the schema, though it may be good enough for now for this particular bug/patch.
And we likely want separate files for some of these entities that have meaning and value outside this particular use case, like history entries, sidebar properties, cookies, etc.

So far, it's loading up the right documents and validating them, and validation fails as expected if you put invalid data in the session data. But it's not yet flagging missing data correctly. I think there are a few cases where data can be wholly missing, but if it's there it should be valid, and some where it must be there (required).
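The distinction between "optional but must be valid when present" and "required" maps onto the JSON Schema `properties` and `required` keywords. The sketch below illustrates it with a hypothetical tab schema fragment and a deliberately tiny checker; which tab properties are actually required is an assumption here, and the checker is not the Validator from the JsonSchema module.

```javascript
// Hypothetical schema fragment for a tab. Treating `entries` as
// required and the rest as optional is an assumption for illustration.
const tabSchemaSketch = {
  type: "object",
  properties: {
    entries: { type: "array" },
    lastAccessed: { type: "integer" },
    image: { type: "string" },
  },
  required: ["entries"],
};

// Type checks supported by this sketch.
const typeChecks = {
  array: Array.isArray,
  integer: Number.isInteger,
  string: v => typeof v === "string",
};

// Returns a list of error strings: one per missing required property,
// and one per present property whose value has the wrong type.
function checkObject(obj, schema) {
  const errors = [];
  for (const key of schema.required ?? []) {
    if (!(key in obj)) errors.push(`missing required property: ${key}`);
  }
  for (const [key, sub] of Object.entries(schema.properties ?? {})) {
    if (key in obj && !typeChecks[sub.type](obj[key])) {
      errors.push(`invalid ${key}: expected ${sub.type}`);
    }
  }
  return errors;
}
```

In this sketch a tab without `image` passes, a tab whose `image` is a number fails, and a tab without `entries` fails regardless of what else it contains.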

Phabricator Automation

Updated

•

6 months ago

Attachment #9480168 - Attachment description: WIP: Bug 1905958 - Smoke test the schema by validating a sample session restore document → Bug 1905958 - Add a JSON schema for browser session state and a way to validate against it. r?#sessionstore-reviewers

Pulsebot

Comment 11

•

6 months ago

Pushed by sfoster@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/371df68a102a Add a JSON schema for browser session state and a way to validate against it. r=sessionstore-reviewers,sidebar-reviewers,nsharpley,dwalker

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 12

•

6 months ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/371df68a102a

Status: ASSIGNED → RESOLVED

Closed: 6 months ago

status-firefox140: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 140 Branch

Camelia Badau [:cbadau], Desktop Test Engineering

Updated

•

5 months ago

QA Whiteboard: [qa-triage-done-c141/b140]
