Closed Bug 1612174 Opened 6 months ago Closed 4 months ago

Emit "init" lifecycleEvent in Runtime.evaluate when document.open() is called (to fix page.setContent in Puppeteer)

Categories

(Remote Protocol :: Runtime, task, P1)

Tracking

(firefox76 fixed)

RESOLVED FIXED
Firefox 76
Tracking Status
firefox76 --- fixed

People

(Reporter: whimboo, Assigned: maja_zf)

References

(Blocks 1 open bug)

Details

(Whiteboard: [puppeteer-beta-mvp])

Attachments

(2 files)

Puppeteer relies on an init lifecycleEvent when document.open() gets called:

https://searchfox.org/mozilla-central/rev/ad6234148892bc29bf08d736604ac71925138040/remote/test/puppeteer/lib/DOMWorld.js#204-215

This is how it has been implemented in Chromium:

https://chromium.googlesource.com/chromium/src/+/16de2eabee54981bf2f6b74c45d876bb47264885%5E%21/

I asked Olli in how we could determine in Gecko that document.open has been called. He proposed to use a readystatechange listener, and then check when the document's state changed back to READYSTATE_UNINITIALIZED. When this happened we could emit the init lifecycleEvent via the Page domain.

Priority: -- → P3
Whiteboard: [pup

This is needed for Page.setContent for Puppeteer, and it's used a lot in their unit tests. Perhaps we should change the tests to use data URIs if setContent isn't popular outside of Puppeteer unit tests.

Todo -- push to try with data URI returned by setContent to see how many tests we'll pass with this change.

Flags: needinfo?(mjzffr)
Flags: needinfo?(mjzffr)
Whiteboard: [pup

Replacing Puppeteer's Page.setContent implementation with just a navigation to a data URL works well for the vast majority of Puppeteer unit tests. When running this change against Chrome, test failures appear in navigation.spec.js, page.spec.js (actual setContent tests) and one in requestinterception.spec.js. These failures show what other behaviour is expected from setContent -- we still have to support that.

Given that more than 60 additional Puppeteer unit tests are passing this with, and that this will be pretty helpful for us to verify that our implementations work, I will try to get this fixed.

Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Priority: P3 → P1
Whiteboard: [puppeteer-beta-mvp]

Actually we do not really hang in setContent given that the lifecycle watcher is fine with the DOMContentLoaded lifecycle event. So emitting init might be optional.

Assignee: hskupin → nobody
Status: ASSIGNED → NEW
Priority: P1 → P3
Whiteboard: [puppeteer-beta-mvp]

There's a race. I'm investigating. Here are two log excerpts, with newlines added for highlight. Using Puppeteer at 0b1a9ce (latest master).

PASS

  puppeteer:protocol SEND ► {"sessionId":1,"method":"Runtime.callFunctionOn","params":{"functionDeclaration":"() => {\n      let retVal = '';\n      if (document.doctype)\n        retVal = new XMLSerializer().serializeToString(document.doctype);\n      if (document.documentElement)\n        retVal += document.documentElement.outerHTML;\n      return retVal;\n    }\n//# sourceURL=__puppeteer_evaluation_script__\n","executionContextId":2,"arguments":[],"returnByValue":true,"awaitPromise":true,"userGesture":true},"id":18} +2ms

1583179174320   RemoteAgent TRACE   <-(connection {5d7fc80e-5369-fb43-87b8-460b4e186241}) {"sessionId":1,"method":"Page.lifecycleEvent","params":{"frameId":"11","loaderId":null,"name":"DOMContentLoaded","timestamp":1583179174291}}
  puppeteer:protocol ◀ RECV {"sessionId":1,"method":"Page.domContentEventFired","params":{"timestamp":1583179174291}} +19ms
  puppeteer:protocol ◀ RECV {"sessionId":1,"method":"Page.lifecycleEvent","params":{"frameId":"11","loaderId":null,"name":"DOMContentLoaded","timestamp":1583179174291}} +2ms

1583179174333   RemoteAgent TRACE   (connection {5d7fc80e-5369-fb43-87b8-460b4e186241})-> {"sessionId":1,"method":"Runtime.callFunctionOn","params":{"functionDeclaration":"() => {\n      let retVal = '';\n      if (document.doctype)\n        retVal = new XMLSerializer().serializeToString(document.doctype);\n      if (document.documentElement)\n        retVal += document.documentElement.outerHTML;\n      return retVal;\n    }\n//# sourceURL=__puppeteer_evaluation_script__\n","executionContextId":2,"arguments":[],"returnByValue":true,"awaitPromise":true,"userGesture":true},"id":18}
1583179174340   RemoteAgent TRACE   <-(connection {5d7fc80e-5369-fb43-87b8-460b4e186241}) {"sessionId":1,"id":18,"result":{"result":{"type":"string","value":"<html><head></head><body><div>hello</div></body></html>","description":"<html><head></head><body><div>hello</div></body></html>"}}}
  puppeteer:protocol ◀ RECV {"sessionId":1,"id":18,"result":{"result":{"type":"string","value":"<html><head></head><body><div>hello</div></body></html>","description":"<html><head></head><body><div>hello</div></body></html>"}}} +20ms
  
  puppeteer:protocol SEND ► {"method":"Target.disposeBrowserContext","params":{"browserContextId":6},"id":19} +1ms
1583179174353   RemoteAgent TRACE   (connection {5d7fc80e-5369-fb43-87b8-460b4e186241})-> {"method":"Target.disposeBrowserContext","params":{"browserContextId":6},"id":19}
1583179174411   RemoteAgent TRACE   <-(connection {5d7fc80e-5369-fb43-87b8-460b4e186241}) {"method":"Target.targetDestroyed","params":{"targetId":11}}
  puppeteer:protocol ◀ RECV {"method":"Target.targetDestroyed","params":{"targetId":11}} +70ms
1583179174427   RemoteAgent TRACE   <-(connection {5d7fc80e-5369-fb43-87b8-460b4e186241}) {"id":19}
  puppeteer:protocol ◀ RECV {"id":19} +17ms
1) [ OK ] Firefox Browser Page Page.setContent should work (page.spec.js:715:5)

FAIL

  puppeteer:protocol SEND ► {"sessionId":1,"method":"Runtime.callFunctionOn","params":{"functionDeclaration":"html => {\n      document.open();\n      document.write(html);\n      document.close();\n    }\n//# sourceURL=__puppeteer_evaluation_script__\n","executionContextId":2,"arguments":[{"value":"<div>hello</div>"}],"returnByValue":true,"awaitPromise":true,"userGesture":true},"id":17} +2ms
1583179615485   RemoteAgent TRACE   (connection {2a905a15-88d7-934b-8b34-3087fb72c635})-> {"sessionId":1,"method":"Runtime.callFunctionOn","params":{"functionDeclaration":"html => {\n      document.open();\n      document.write(html);\n      document.close();\n    }\n//# sourceURL=__puppeteer_evaluation_script__\n","executionContextId":2,"arguments":[{"value":"<div>hello</div>"}],"returnByValue":true,"awaitPromise":true,"userGesture":true},"id":17}
1583179615496   RemoteAgent TRACE   <-(connection {2a905a15-88d7-934b-8b34-3087fb72c635}) {"sessionId":1,"id":17,"result":{"result":{"type":"undefined"}}}
1583179615496   RemoteAgent TRACE   <-(connection {2a905a15-88d7-934b-8b34-3087fb72c635}) {"sessionId":1,"method":"Page.domContentEventFired","params":{"timestamp":1583179615490}}
  puppeteer:protocol ◀ RECV {"sessionId":1,"id":17,"result":{"result":{"type":"undefined"}}} +28ms

  puppeteer:protocol ◀ RECV {"sessionId":1,"method":"Page.domContentEventFired","params":{"timestamp":1583179615490}} +11ms
1583179615518   RemoteAgent TRACE   <-(connection {2a905a15-88d7-934b-8b34-3087fb72c635}) {"sessionId":1,"method":"Page.lifecycleEvent","params":{"frameId":"11","loaderId":null,"name":"DOMContentLoaded","timestamp":1583179615490}}
  puppeteer:protocol ◀ RECV {"sessionId":1,"method":"Page.lifecycleEvent","params":{"frameId":"11","loaderId":null,"name":"DOMContentLoaded","timestamp":1583179615490}} +13ms
  
  puppeteer:protocol SEND ► {"method":"Target.disposeBrowserContext","params":{"browserContextId":6},"id":18} +15s
1583179630477   RemoteAgent TRACE   (connection {2a905a15-88d7-934b-8b34-3087fb72c635})-> {"method":"Target.disposeBrowserContext","params":{"browserContextId":6},"id":18}
1583179630504   RemoteAgent TRACE   <-(connection {2a905a15-88d7-934b-8b34-3087fb72c635}) {"method":"Target.targetDestroyed","params":{"targetId":11}}
  puppeteer:protocol ◀ RECV {"method":"Target.targetDestroyed","params":{"targetId":11}} +32ms
1583179630528   RemoteAgent TRACE   <-(connection {2a905a15-88d7-934b-8b34-3087fb72c635}) {"id":18}
  puppeteer:protocol ◀ RECV {"id":18} +26ms
1) [TIME] Firefox Browser Page Page.setContent should work (page.spec.js:715:5)
  Message:
    Timeout Exceeded 15000ms

Assignee: nobody → mjzffr
Status: NEW → ASSIGNED
Priority: P3 → P1
Whiteboard: [puppeteer-beta-mvp]

Failures increase when SLOW_MO=80.

As Henrik says, this bug might be a symptom of Bug 1599257

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #6)

Actually we do not really hang in setContent given that the lifecycle watcher is fine with the DOMContentLoaded lifecycle event. So emitting init might be optional.

We do need init, actually.

Puppeteer tracks life cycle events internally in a way that doesn't correspond exactly to the events received from the browser. For example, it records "load" and "DOMContentLoaded" when it sees "Page.frameStoppedLoading". It's that internal state that is checked by the life cycle watcher for setContent, and it's looking for "load".

Due to the race mentioned in Comment 7, and due to the missing init event which is needed to clear the internal life cycle state, the life cycle watcher finds a 'load' from the initial navigation to about:blank and that's what resolves the Promise for setContent.

So we need to emit an init Page.lifeCycleEvent (based on readystate as Olli suggested) as well as a load Page.lifeCycleEvent which we should get from the document.write/close in the setContent script.

However, I can't capture either of those events from the Page domain. The only events we get via the domain's chromeEventHandler are DOMContentLoaded, pagehide, pageshow. I've tried listening for other events but I get nothing. I also tried listening on this.content, this.content.document, etc.

Henrik, who would be a good person to ask about this? Could you forward the need-info while I'm away the next days?

Flags: needinfo?(hskupin)

Olli, do you have any idea why the chromeEventHandler for readystatechange doesn't fire when it is registered? Maybe we use the wrong node or even a wrong method? See Maja's comment from above. Thanks!

Flags: needinfo?(hskupin) → needinfo?(bugs)

Are you perhaps using a listener in bubble phase? readystatechange doesn't bubble, so need to use capturing listener.

Flags: needinfo?(bugs)

Many thanks!

So with capture: true we also see load and unload in addition to readystatechange on chromeEventHandler. We were really missing these on the Page domain, so expect the way we emit life cycle events can improve a lot now.

As an aside, I also looked at what we get from the webProgress onStateChange handler when a document.open/write/close occurs: an about:document-onload-blocker request starts at that point, so that might be an extra source of info in case we don't get what we need from readystatechange.

Evaluating document.open + write + close triggers the readyState
to change to "loading". Puppeteer, for example, relies on a seeing an
"init" life cycle event in that case for their Page.setContent method.

Also move "Page.loadEventFired" to be triggered by an actual "load"
instead of "pagehide", which we don't see after a document.write.

Attachment #9134821 - Attachment description: Bug 1612174 - Emit "init" lifecycleEvent when readyState is "loading" → Bug 1612174 - [remote] Emit "init" lifecycleEvent when readyState is "loading"

Note that this patch reduces the test job duration from about 30 minutes to only ~14 minutes! This contributes well to ci costs. CC'ing Joel for info. It also means soon we can run this test job more often again.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=cf42dfbf5515b1d9db8a402a0f47dcee40c9c1b9&selectedJob=294472949

Pushed by mjzffr@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/6a8ca36f31e8
[remote] Emit "init" lifecycleEvent when readyState is "loading" r=remote-protocol-reviewers,whimboo
https://hg.mozilla.org/integration/autoland/rev/06b9dd699708
[remote] Update puppeteer test expectations r=remote-protocol-reviewers,whimboo
Status: ASSIGNED → RESOLVED
Closed: 4 months ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 76
You need to log in before you can comment on or make changes to this bug.