1487100 - Add API to specify preferred alt-data type by contentType

Assignee

Description

•

6 years ago

Right now consumers may specify a preferred alt-data type before async open - if that type isn't available, we will still keep a reference to the cache entry so that we may write some alt-data to it. From what I understand, the requirements for WASM machine code caching would be slighly different. For one we need a way to make the preferred data type conditional on the content type. For example: if contentType == "application/wasm" then: mPreferredCachedAltDataType = "wasm_type1" else: mPreferredCachedAltDataType = "" // don't do anything for types such as JSON/HTML/etc and release the cache entry. I don't know the API will look. We could make preferAlternativeDataType take a dictionary<contentType,altDataType>, or we could allow multiple calls to preferAlternativeDataType(contentType,altDataType) and the channel would keep track of them. Feedback welcome.

Luke Wagner [:luke]

Comment 1

•

6 years ago

As a bit of background/motivation: we're trying to use alt-data to cache wasm machine code compiled by: WebAssembly.compileStreaming(fetch('foo.wasm')) and the problem is that we don't know at the point of the AsyncOpen() performed by fetch() whether the Response is destined for wasm or (more likely) something else. compileStreaming() does however require the mime type 'application/wasm', which tells us a little earlier than when the Response is consumed: https://webassembly.github.io/spec/web-api/index.html#compile-a-potential-webassembly-response

Luke Wagner [:luke]

Updated

•

6 years ago

Blocks: 1487113

Nicolas B. Pierron [:nbp]

Comment 2

•

6 years ago

So if I understand correctly the idea would be to instrument fetch with a list of alt-data mapping per mime type. Therefore some requests would by default receive the alt-data instead of the default stream? Wouldn't this cause issues when JS request the to manipulate the stream, such as applying patches or displaying the WASM module in the debugger? Another idea would be to make fetch collect all the Response meta-data about the content type, including the cache entry. Then delegate the opening of either content or alt-data of the cache entry to the response.getReader.read() function call? This would cause one extra IPC, but it would limit the amount of streamed data.

Honza Bambas (:mayhemer)

Comment 3

•

6 years ago

ok, so one loader, but may accept two different alt-types, right? like: "I will accept js_byte_code but also wasm_type1, give me what you have. but if the response content-type is NOT 'application/wasm' just give me the raw content and NOT the alt-data" is that it? (In reply to Nicolas B. Pierron [:nbp] from comment #2) > Wouldn't this > cause issues when JS request the to manipulate the stream, such as applying > patches or displaying the WASM module in the debugger? can you please rephrase this question? it's not clear to me. > > Another idea would be to make fetch collect all the Response meta-data about > the content type, including the cache entry. Then delegate the opening of > either content or alt-data of the cache entry to the > response.getReader.read() function call? This would cause one extra IPC, > but it would limit the amount of streamed data. no, this would mean to involve an IPC loop, and to actually restart the load. not good for performance at all. I think the preconditions can be specified with a well designed API. I think preferAlternativeDataType(contentType,altDataType) called multiple times is a good approach.

Honza Bambas (:mayhemer)

Comment 4

•

6 years ago

(In reply to Honza Bambas (:mayhemer) from comment #3) > I think preferAlternativeDataType(contentType,altDataType) called multiple > times is a good approach. but note that this is very much tight to this one use case only.

Luke Wagner [:luke]

Comment 5

•

6 years ago

You're right that, even if the mime type is 'application/wasm', the Response may be read with .arrayBuffer() in which case we simply need the bytecode. I had assumed there was some way to say "actually, we need the raw response after all", but I guess that doesn't exist now? But maybe you're right and it would be simpler / more symmetric to have the two-step approach as you suggest (but only when the content-type was in the (currently singleton) list of mime types for which we *might* want alt-data).

Luke Wagner [:luke]

Comment 6

•

6 years ago

(I'm replying to comment 2 in comment 5.) Replying to comment 3: is there a way to just *delay* the decision to grab raw contents or alt-data until (when the content-type is 'application/wasm') we see the body consumed by compileStreaming() vs. .arrayBuffer() et al?

Nicolas B. Pierron [:nbp]

Comment 7

•

6 years ago

(In reply to Honza Bambas (:mayhemer) from comment #3) > (In reply to Nicolas B. Pierron [:nbp] from comment #2) > > Wouldn't this > > cause issues when JS request the to manipulate the stream, such as applying > > patches or displaying the WASM module in the debugger? > > can you please rephrase this question? it's not clear to me. The `fetch(…)` function returns a Response object. This Response object can then be consumed by `compileStreaming()` or with `response.getReader().read()`. The first case would expect to read the alternate data (assembly) if any, while the second would expect to read the raw content (bytecode), and if I read the MDN documentation correctly potentially both. The problem is that at the time when `fetch(…)` is executed, we have no idea where the response is flowing into, and as such we do not know whether we need to load the raw or the alternate data. (In reply to Honza Bambas (:mayhemer) from comment #3) > (In reply to Nicolas B. Pierron [:nbp] from comment #2) > > Another idea would be to make fetch collect all the Response meta-data about > > the content type, including the cache entry. Then delegate the opening of > > either content or alt-data of the cache entry to the > > response.getReader.read() function call? This would cause one extra IPC, > > but it would limit the amount of streamed data. > > no, this would mean to involve an IPC loop, and to actually restart the > load. not good for performance at all. I think the preconditions can be > specified with a well designed API. I will note that this is the strategy which is currently used in the ScriptLoader when we fail to match the SRI hash. We do make a second load to get the raw content instead of the alternate data. However, the procedure used in the ScriptLoader cannot be used here, as we have to guarantee that the raw content matches the alternate data in the case where both can be loaded from the Response.

Valentin Gosu [:valentin] (he/him)

Assignee

Comment 8

•

6 years ago

This is the IRC chat we had about alt-data: https://gist.github.com/valenting/7cdedffd75c078a9b0f31297864670ff The TL;DR is that we could add/change the behaviour so that both the input streams for cached server response and the saved alt-data are delivered to the consumer. I assume that the server response is delivered through the regular onDataAvailable, while for alt-data we notify the listener at onStartRequest that we have some alt-data and it may decide to open the alt-data stream. I hope I understood this correctly.

Honza Bambas (:mayhemer)

Comment 9

•

6 years ago

(In reply to Valentin Gosu [:valentin] from comment #8) > This is the IRC chat we had about alt-data: > https://gist.github.com/valenting/7cdedffd75c078a9b0f31297864670ff Thanks Valentin, now it's way more understandable what the goal is. > > The TL;DR is that we could add/change the behaviour so that both the input > streams for cached server response and the saved alt-data are delivered to > the consumer. I assume that the server response is delivered through the > regular onDataAvailable, while for alt-data we notify the listener at > onStartRequest that we have some alt-data and it may decide to open the > alt-data stream. I presume the decision is made on the content process, right? Note that if the nsHttpChannel (on the parent) keeps the entry alive (referred) then it can open the alt data stream at any time. How we deliver it then is another questions, tho. > > I hope I understood this correctly. Only thing I'm not sure is if we still need the alt-data by mime-type API.

Nicolas B. Pierron [:nbp]

Comment 10

•

6 years ago

(In reply to Honza Bambas (:mayhemer) from comment #9) > Only thing I'm not sure is if we still need the alt-data by mime-type API. The ScriptLoader uses the mime-type as a way to discard cache entries from a different build dates. This is used to discard the alternate data on updates, in case of any format update. I presume WASM would be looking for the same kind of feature.

Valentin Gosu [:valentin] (he/him)

Assignee

Comment 11

•

6 years ago

(In reply to Honza Bambas (:mayhemer) from comment #9) > (In reply to Valentin Gosu [:valentin] from comment #8) > > This is the IRC chat we had about alt-data: > > https://gist.github.com/valenting/7cdedffd75c078a9b0f31297864670ff > > Thanks Valentin, now it's way more understandable what the goal is. > > > > > The TL;DR is that we could add/change the behaviour so that both the input > > streams for cached server response and the saved alt-data are delivered to > > the consumer. I assume that the server response is delivered through the > > regular onDataAvailable, while for alt-data we notify the listener at > > onStartRequest that we have some alt-data and it may decide to open the > > alt-data stream. > > I presume the decision is made on the content process, right? Note that if > the nsHttpChannel (on the parent) keeps the entry alive (referred) then it > can open the alt data stream at any time. How we deliver it then is another > questions, tho. > > I hope I understood this correctly. > > Only thing I'm not sure is if we still need the alt-data by mime-type API. That depends on how the consumer will consume the content. We could have a byMIMEtype API that opens the alt-data stream as well and delivers it in onStartRequest so we save an IPC round-trip.

Luke Wagner [:luke]

Comment 12

•

6 years ago

As an update, I finished the SpiderMonkey refactorings (bug 1330661, bug 1487475) to switch from the (now-dead) IDB serialization scheme to something that would fit into what we're talking about here. So re-reading the IRC logs, it seems like the key constraint is that we must always have some sort of fallback in cases where a Response is .buffer()ed or .clone()ed. Considering the various alternatives and trying to minimize both (1) holding cache entries alive unnecessarily, (2) wasting time streaming the "wrong" thing from parent->child, (3) performing extra RTTs from content->parent, how about: 1. We extend preferAlternativeDataType() as suggested in comment 0 2. But we generalize the meaning of this preferred alt data type so that, even if you preferred and got the alt-data, you can always (at the cost of a RTT) ask for the original resource contents My only concern is that if step 1 means building and passing a little { application/wasm -> wasm_buildid } dictionary to AsyncOpen (and from child to parent process), this will add a bit of overhead on *all* fetch()s. Really, it seems like we just need a single bit saying "the consumer is fetch()" and there would be a single hard-coded "fetch policy" which could be extended directly if we get other Response-consumer+altdata-producer cases. How does that sound?

Nicolas B. Pierron [:nbp]

Comment 13

•

6 years ago

(In reply to Luke Wagner [:luke] from comment #12) > My only concern is that if step 1 means building and passing a little { > application/wasm -> wasm_buildid } dictionary to AsyncOpen (and from child > to parent process), this will add a bit of overhead on *all* fetch()s. > Really, it seems like we just need a single bit saying "the consumer is > fetch()" and there would be a single hard-coded "fetch policy" which could > be extended directly if we get other Response-consumer+altdata-producer > cases. As long as we have a single alt-data this should be fine to have a single bit for the policy. Later on, if we start storing multiple alt-data then we would be faced with more complex cases. I guess we can have a limited set of alt-data types, which can be transmitted with a bitset. The property such as whether the alt-data should be ignored based on the build-id is a property which seems to be part of the alt-data type, and as such does not need to go through IPC as done today. > How does that sound? To me this sounds reasonable.

Valentin Gosu [:valentin] (he/him)

Assignee

Comment 14

•

6 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=8570aaaed695a42bbedcefd92fc1e9b0186e9dbe

Valentin Gosu [:valentin] (he/him)

Assignee

Comment 15

•

6 years ago

Attached file Bug 1487100 - Allow calling nsICacheInfoChannel.preferAlternativeDataType(altDataType, contentType) multiple times r=michal! — Details

Valentin Gosu [:valentin] (he/him)

Assignee

Comment 16

•

6 years ago

Attached file Bug 1487100 - Allow opening the input stream for original content when alt-data is available r=michal! — Details

Depends on D8071

Valentin Gosu [:valentin] (he/him)

Assignee

Comment 17

•

6 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=41bf73fcbfe39a11359e8d7e658df2678efeb89a

Valentin Gosu [:valentin] (he/him)

Assignee

Comment 18

•

6 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=eeb263bf12d3c093663cb8125a00f8003f6bde9b

Valentin Gosu [:valentin] (he/him)

Assignee

Comment 19

•

6 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=fe108a912391101ae4362aa11713dbc81bb6f9fb

Valentin Gosu [:valentin] (he/him)

Assignee

Comment 20

•

6 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=30dda5686621284f0d9df8b8fbe8c30e2ea919e9

Pulsebot

Comment 21

•

6 years ago

Pushed by valentin.gosu@gmail.com: https://hg.mozilla.org/integration/autoland/rev/dd1c31ea78c2 Allow calling nsICacheInfoChannel.preferAlternativeDataType(altDataType, contentType) multiple times r=michal,luke https://hg.mozilla.org/integration/autoland/rev/7f9d03c29a6f Allow opening the input stream for original content when alt-data is available r=michal,luke

Dorel Luca [:dluca]

Comment 22

•

6 years ago

Backed out 2 changesets (bug 1487100) for XPCShell failures in netwerk/test/unit_ipc/test_alt-data_simple_wrap.js Log: https://treeherder.mozilla.org/logviewer.html#?job_id=206275630&repo=autoland&lineNumber=10293 INFO - (xpcshell/head.js) | test MAIN run_test pending (1) 02:37:02 INFO - (xpcshell/head.js) | test pending (2) 02:37:02 INFO - (xpcshell/head.js) | test run in child pending (3) 02:37:02 INFO - (xpcshell/head.js) | test MAIN run_test finished (3) 02:37:02 INFO - running event loop 02:37:02 INFO - CHILD-TEST-STARTED 02:37:02 INFO - (xpcshell/head.js) | test MAIN run_test pending (1) 02:37:02 INFO - Ignoring profile creation from child process. 02:37:02 INFO - (xpcshell/head.js) | test pending (2) 02:37:02 INFO - (xpcshell/head.js) | test MAIN run_test finished (2) 02:37:02 INFO - running event loop 02:37:02 INFO - "CONSOLE_MESSAGE: (warn) [JavaScript Warning: "Use of nsIFile in content process is deprecated." {file: "Z:/task_1539827455/build/tests/xpcshell/head.js" line: 345}]" 02:37:02 INFO - "CONSOLE_MESSAGE: (warn) [JavaScript Warning: "Use of nsIFile in content process is deprecated." {file: "Z:/task_1539827455/build/tests/xpcshell/head.js" line: 345}]" 02:37:02 INFO - TEST-PASS | netwerk/test/unit_ipc/test_alt-data_simple_wrap.js | undefined assertion name - 13 == 13 02:37:02 INFO - TEST-PASS | netwerk/test/unit_ipc/test_alt-data_simple_wrap.js | readServerContent - [readServerContent : 103] "response body" == "response body" 02:37:02 INFO - TEST-PASS | netwerk/test/unit_ipc/test_alt-data_simple_wrap.js | readServerContent - [readServerContent : 104] "" == "" 02:37:02 INFO - (xpcshell/head.js) | test pending (2) 02:37:02 INFO - (xpcshell/head.js) | test flushAndOpenAltChannel pending (3) 02:37:02 INFO - (xpcshell/head.js) | test finished (3) 02:37:02 INFO - (xpcshell/head.js) | test finished (2) 02:37:02 INFO - (xpcshell/head.js) | test pending (3) 02:37:02 INFO - (xpcshell/head.js) | test flushAndOpenAltChannel finished (3) 02:37:02 INFO - (xpcshell/head.js) | test finished (2) 02:37:02 INFO - TEST-PASS | netwerk/test/unit_ipc/test_alt-data_simple_wrap.js | undefined assertion name - 10 == 10 02:37:02 INFO - TEST-PASS | netwerk/test/unit_ipc/test_alt-data_simple_wrap.js | readAltContent - [readAltContent : 150] true == true 02:37:02 INFO - TEST-PASS | netwerk/test/unit_ipc/test_alt-data_simple_wrap.js | readAltContent - [readAltContent : 151] "text/binary" == "text/binary" 02:37:02 INFO - TEST-PASS | netwerk/test/unit_ipc/test_alt-data_simple_wrap.js | readAltContent - [readAltContent : 152] "!@#$%^&*()" == "!@#$%^&*()" 02:37:02 INFO - (xpcshell/head.js) | test pending (2) 02:37:02 WARNING - TEST-UNEXPECTED-FAIL | netwerk/test/unit_ipc/test_alt-data_simple_wrap.js | onInputStreamReady/< - [onInputStreamReady/< : 159] "" == "response body" 02:37:02 INFO - Z:/task_1539827455/build/tests/xpcshell/tests/netwerk/test/unit/test_alt-data_simple.js:onInputStreamReady/<:159 02:37:02 INFO - Z:/task_1539827455/build/tests/xpcshell/head.js:run:692 02:37:02 INFO - Z:/task_1539827455/build/tests/xpcshell/head.js:_do_main:219 02:37:02 INFO - Z:/task_1539827455/build/tests/xpcshell/head.js:_execute_test:533 02:37:02 INFO - typein:null:0 02:37:02 INFO - exiting test 02:37:02 INFO - (xpcshell/head.js) | test finished (2) 02:37:02 INFO - CHILD-TEST-COMPLETED 02:37:02 INFO - (xpcshell/head.js) | test finished (1) 02:37:02 INFO - exiting test 02:37:02 INFO - PID 10596 | [Parent 10596, Gecko_IOThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346 02:37:02 INFO - PID 10596 | [Child 7240, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346 02:37:02 INFO - <<<<<<< 02:37:03 INFO - TEST-START | toolkit/components/extensions/test/xpcshell/test_ext_native_messaging.js Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=adb07f2a1331142569039567d81f5168ada17198 Backout: https://hg.mozilla.org/integration/autoland/rev/856c528a098312ad0ee77b5c6960993974ed19f4

Flags: needinfo?(valentin.gosu)

Pulsebot

Comment 23

•

6 years ago

Pushed by dluca@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e37adb23fd48 Allow calling nsICacheInfoChannel.preferAlternativeDataType(altDataType, contentType) multiple times r=michal,luke https://hg.mozilla.org/integration/autoland/rev/42319047f3d9 Allow opening the input stream for original content when alt-data is available r=michal,luke

Dorel Luca [:dluca]

Comment 24

•

6 years ago

This proved to be intermittent, as such, I relaned bug 1487100 and reopened bug 1499917 for the intermittent failures.

Flags: needinfo?(valentin.gosu)

Eliza Balazs [:ebalazs_]

Comment 25

•

6 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/e37adb23fd48 https://hg.mozilla.org/mozilla-central/rev/42319047f3d9

Status: NEW → RESOLVED

Closed: 6 years ago

status-firefox64: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla64

Valentin Gosu [:valentin] (he/him)

Assignee

Updated

•

6 years ago

Depends on: 1500594

Bug 1487100 - Allow calling nsICacheInfoChannel.preferAlternativeDataType(altDataType, contentType) multiple times r=michal! 6 years ago Valentin Gosu [:valentin] (he/him) 46 bytes, text/x-phabricator-request		Details \| Review
Bug 1487100 - Allow opening the input stream for original content when alt-data is available r=michal! 6 years ago Valentin Gosu [:valentin] (he/him) 46 bytes, text/x-phabricator-request		Details \| Review