Closed
Bug 1487100
Opened 6 years ago
Closed 6 years ago
Add API to specify preferred alt-data type by contentType
Categories
(Core :: Networking: Cache, enhancement, P3)
Tracking
()
RESOLVED
FIXED
mozilla64
Tracking | Status | |
---|---|---|
firefox64 | --- | fixed |
People
(Reporter: valentin, Assigned: valentin)
References
Details
(Whiteboard: [necko-triaged])
Attachments
(2 files)
Right now consumers may specify a preferred alt-data type before async open - if that type isn't available, we will still keep a reference to the cache entry so that we may write some alt-data to it.
From what I understand, the requirements for WASM machine code caching would be slighly different. For one we need a way to make the preferred data type conditional on the content type. For example:
if contentType == "application/wasm"
then: mPreferredCachedAltDataType = "wasm_type1"
else: mPreferredCachedAltDataType = ""
// don't do anything for types such as JSON/HTML/etc and release the cache entry.
I don't know the API will look. We could make preferAlternativeDataType take a dictionary<contentType,altDataType>, or we could allow multiple calls to preferAlternativeDataType(contentType,altDataType) and the channel would keep track of them.
Feedback welcome.
Comment 1•6 years ago
|
||
As a bit of background/motivation: we're trying to use alt-data to cache wasm machine code compiled by:
WebAssembly.compileStreaming(fetch('foo.wasm'))
and the problem is that we don't know at the point of the AsyncOpen() performed by fetch() whether the Response is destined for wasm or (more likely) something else.
compileStreaming() does however require the mime type 'application/wasm', which tells us a little earlier than when the Response is consumed:
https://webassembly.github.io/spec/web-api/index.html#compile-a-potential-webassembly-response
Comment 2•6 years ago
|
||
So if I understand correctly the idea would be to instrument fetch with a list of alt-data mapping per mime type. Therefore some requests would by default receive the alt-data instead of the default stream? Wouldn't this cause issues when JS request the to manipulate the stream, such as applying patches or displaying the WASM module in the debugger?
Another idea would be to make fetch collect all the Response meta-data about the content type, including the cache entry. Then delegate the opening of either content or alt-data of the cache entry to the response.getReader.read() function call? This would cause one extra IPC, but it would limit the amount of streamed data.
Comment 3•6 years ago
|
||
ok, so one loader, but may accept two different alt-types, right? like:
"I will accept js_byte_code but also wasm_type1, give me what you have. but if the response content-type is NOT 'application/wasm' just give me the raw content and NOT the alt-data"
is that it?
(In reply to Nicolas B. Pierron [:nbp] from comment #2)
> Wouldn't this
> cause issues when JS request the to manipulate the stream, such as applying
> patches or displaying the WASM module in the debugger?
can you please rephrase this question? it's not clear to me.
>
> Another idea would be to make fetch collect all the Response meta-data about
> the content type, including the cache entry. Then delegate the opening of
> either content or alt-data of the cache entry to the
> response.getReader.read() function call? This would cause one extra IPC,
> but it would limit the amount of streamed data.
no, this would mean to involve an IPC loop, and to actually restart the load. not good for performance at all. I think the preconditions can be specified with a well designed API.
I think preferAlternativeDataType(contentType,altDataType) called multiple times is a good approach.
Comment 4•6 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #3)
> I think preferAlternativeDataType(contentType,altDataType) called multiple
> times is a good approach.
but note that this is very much tight to this one use case only.
Comment 5•6 years ago
|
||
You're right that, even if the mime type is 'application/wasm', the Response may be read with .arrayBuffer() in which case we simply need the bytecode. I had assumed there was some way to say "actually, we need the raw response after all", but I guess that doesn't exist now?
But maybe you're right and it would be simpler / more symmetric to have the two-step approach as you suggest (but only when the content-type was in the (currently singleton) list of mime types for which we *might* want alt-data).
Comment 6•6 years ago
|
||
Comment 7•6 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #3)
> (In reply to Nicolas B. Pierron [:nbp] from comment #2)
> > Wouldn't this
> > cause issues when JS request the to manipulate the stream, such as applying
> > patches or displaying the WASM module in the debugger?
>
> can you please rephrase this question? it's not clear to me.
The `fetch(…)` function returns a Response object. This Response object can then be consumed by `compileStreaming()` or with `response.getReader().read()`. The first case would expect to read the alternate data (assembly) if any, while the second would expect to read the raw content (bytecode), and if I read the MDN documentation correctly potentially both.
The problem is that at the time when `fetch(…)` is executed, we have no idea where the response is flowing into, and as such we do not know whether we need to load the raw or the alternate data.
(In reply to Honza Bambas (:mayhemer) from comment #3)
> (In reply to Nicolas B. Pierron [:nbp] from comment #2)
> > Another idea would be to make fetch collect all the Response meta-data about
> > the content type, including the cache entry. Then delegate the opening of
> > either content or alt-data of the cache entry to the
> > response.getReader.read() function call? This would cause one extra IPC,
> > but it would limit the amount of streamed data.
>
> no, this would mean to involve an IPC loop, and to actually restart the
> load. not good for performance at all. I think the preconditions can be
> specified with a well designed API.
I will note that this is the strategy which is currently used in the ScriptLoader when we fail to match the SRI hash. We do make a second load to get the raw content instead of the alternate data. However, the procedure used in the ScriptLoader cannot be used here, as we have to guarantee that the raw content matches the alternate data in the case where both can be loaded from the Response.
Assignee | ||
Comment 8•6 years ago
|
||
This is the IRC chat we had about alt-data:
https://gist.github.com/valenting/7cdedffd75c078a9b0f31297864670ff
The TL;DR is that we could add/change the behaviour so that both the input streams for cached server response and the saved alt-data are delivered to the consumer. I assume that the server response is delivered through the regular onDataAvailable, while for alt-data we notify the listener at onStartRequest that we have some alt-data and it may decide to open the alt-data stream.
I hope I understood this correctly.
Comment 9•6 years ago
|
||
(In reply to Valentin Gosu [:valentin] from comment #8)
> This is the IRC chat we had about alt-data:
> https://gist.github.com/valenting/7cdedffd75c078a9b0f31297864670ff
Thanks Valentin, now it's way more understandable what the goal is.
>
> The TL;DR is that we could add/change the behaviour so that both the input
> streams for cached server response and the saved alt-data are delivered to
> the consumer. I assume that the server response is delivered through the
> regular onDataAvailable, while for alt-data we notify the listener at
> onStartRequest that we have some alt-data and it may decide to open the
> alt-data stream.
I presume the decision is made on the content process, right? Note that if the nsHttpChannel (on the parent) keeps the entry alive (referred) then it can open the alt data stream at any time. How we deliver it then is another questions, tho.
>
> I hope I understood this correctly.
Only thing I'm not sure is if we still need the alt-data by mime-type API.
Comment 10•6 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #9)
> Only thing I'm not sure is if we still need the alt-data by mime-type API.
The ScriptLoader uses the mime-type as a way to discard cache entries from a different build dates. This is used to discard the alternate data on updates, in case of any format update. I presume WASM would be looking for the same kind of feature.
Assignee | ||
Comment 11•6 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #9)
> (In reply to Valentin Gosu [:valentin] from comment #8)
> > This is the IRC chat we had about alt-data:
> > https://gist.github.com/valenting/7cdedffd75c078a9b0f31297864670ff
>
> Thanks Valentin, now it's way more understandable what the goal is.
>
> >
> > The TL;DR is that we could add/change the behaviour so that both the input
> > streams for cached server response and the saved alt-data are delivered to
> > the consumer. I assume that the server response is delivered through the
> > regular onDataAvailable, while for alt-data we notify the listener at
> > onStartRequest that we have some alt-data and it may decide to open the
> > alt-data stream.
>
> I presume the decision is made on the content process, right? Note that if
> the nsHttpChannel (on the parent) keeps the entry alive (referred) then it
> can open the alt data stream at any time. How we deliver it then is another
> questions, tho.
> > I hope I understood this correctly.
>
> Only thing I'm not sure is if we still need the alt-data by mime-type API.
That depends on how the consumer will consume the content.
We could have a byMIMEtype API that opens the alt-data stream as well and delivers it in onStartRequest so we save an IPC round-trip.
Comment 12•6 years ago
|
||
As an update, I finished the SpiderMonkey refactorings (bug 1330661, bug 1487475) to switch from the (now-dead) IDB serialization scheme to something that would fit into what we're talking about here.
So re-reading the IRC logs, it seems like the key constraint is that we must always have some sort of fallback in cases where a Response is .buffer()ed or .clone()ed.
Considering the various alternatives and trying to minimize both (1) holding cache entries alive unnecessarily, (2) wasting time streaming the "wrong" thing from parent->child, (3) performing extra RTTs from content->parent, how about:
1. We extend preferAlternativeDataType() as suggested in comment 0
2. But we generalize the meaning of this preferred alt data type so that, even if you preferred and got the alt-data, you can always (at the cost of a RTT) ask for the original resource contents
My only concern is that if step 1 means building and passing a little { application/wasm -> wasm_buildid } dictionary to AsyncOpen (and from child to parent process), this will add a bit of overhead on *all* fetch()s. Really, it seems like we just need a single bit saying "the consumer is fetch()" and there would be a single hard-coded "fetch policy" which could be extended directly if we get other Response-consumer+altdata-producer cases.
How does that sound?
Comment 13•6 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #12)
> My only concern is that if step 1 means building and passing a little {
> application/wasm -> wasm_buildid } dictionary to AsyncOpen (and from child
> to parent process), this will add a bit of overhead on *all* fetch()s.
> Really, it seems like we just need a single bit saying "the consumer is
> fetch()" and there would be a single hard-coded "fetch policy" which could
> be extended directly if we get other Response-consumer+altdata-producer
> cases.
As long as we have a single alt-data this should be fine to have a single bit for the policy.
Later on, if we start storing multiple alt-data then we would be faced with more complex cases.
I guess we can have a limited set of alt-data types, which can be transmitted with a bitset.
The property such as whether the alt-data should be ignored based on the build-id is a property which seems to be part of the alt-data type, and as such does not need to go through IPC as done today.
> How does that sound?
To me this sounds reasonable.
Assignee | ||
Comment 14•6 years ago
|
||
Assignee | ||
Comment 15•6 years ago
|
||
Assignee | ||
Comment 16•6 years ago
|
||
Depends on D8071
Assignee | ||
Comment 17•6 years ago
|
||
Assignee | ||
Comment 18•6 years ago
|
||
Assignee | ||
Comment 19•6 years ago
|
||
Assignee | ||
Comment 20•6 years ago
|
||
Comment 21•6 years ago
|
||
Pushed by valentin.gosu@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/dd1c31ea78c2
Allow calling nsICacheInfoChannel.preferAlternativeDataType(altDataType, contentType) multiple times r=michal,luke
https://hg.mozilla.org/integration/autoland/rev/7f9d03c29a6f
Allow opening the input stream for original content when alt-data is available r=michal,luke
Comment 22•6 years ago
|
||
Backed out 2 changesets (bug 1487100) for XPCShell failures in netwerk/test/unit_ipc/test_alt-data_simple_wrap.js
Log:
https://treeherder.mozilla.org/logviewer.html#?job_id=206275630&repo=autoland&lineNumber=10293
INFO - (xpcshell/head.js) | test MAIN run_test pending (1)
02:37:02 INFO - (xpcshell/head.js) | test pending (2)
02:37:02 INFO - (xpcshell/head.js) | test run in child pending (3)
02:37:02 INFO - (xpcshell/head.js) | test MAIN run_test finished (3)
02:37:02 INFO - running event loop
02:37:02 INFO - CHILD-TEST-STARTED
02:37:02 INFO - (xpcshell/head.js) | test MAIN run_test pending (1)
02:37:02 INFO - Ignoring profile creation from child process.
02:37:02 INFO - (xpcshell/head.js) | test pending (2)
02:37:02 INFO - (xpcshell/head.js) | test MAIN run_test finished (2)
02:37:02 INFO - running event loop
02:37:02 INFO - "CONSOLE_MESSAGE: (warn) [JavaScript Warning: "Use of nsIFile in content process is deprecated." {file: "Z:/task_1539827455/build/tests/xpcshell/head.js" line: 345}]"
02:37:02 INFO - "CONSOLE_MESSAGE: (warn) [JavaScript Warning: "Use of nsIFile in content process is deprecated." {file: "Z:/task_1539827455/build/tests/xpcshell/head.js" line: 345}]"
02:37:02 INFO - TEST-PASS | netwerk/test/unit_ipc/test_alt-data_simple_wrap.js | undefined assertion name - 13 == 13
02:37:02 INFO - TEST-PASS | netwerk/test/unit_ipc/test_alt-data_simple_wrap.js | readServerContent - [readServerContent : 103] "response body" == "response body"
02:37:02 INFO - TEST-PASS | netwerk/test/unit_ipc/test_alt-data_simple_wrap.js | readServerContent - [readServerContent : 104] "" == ""
02:37:02 INFO - (xpcshell/head.js) | test pending (2)
02:37:02 INFO - (xpcshell/head.js) | test flushAndOpenAltChannel pending (3)
02:37:02 INFO - (xpcshell/head.js) | test finished (3)
02:37:02 INFO - (xpcshell/head.js) | test finished (2)
02:37:02 INFO - (xpcshell/head.js) | test pending (3)
02:37:02 INFO - (xpcshell/head.js) | test flushAndOpenAltChannel finished (3)
02:37:02 INFO - (xpcshell/head.js) | test finished (2)
02:37:02 INFO - TEST-PASS | netwerk/test/unit_ipc/test_alt-data_simple_wrap.js | undefined assertion name - 10 == 10
02:37:02 INFO - TEST-PASS | netwerk/test/unit_ipc/test_alt-data_simple_wrap.js | readAltContent - [readAltContent : 150] true == true
02:37:02 INFO - TEST-PASS | netwerk/test/unit_ipc/test_alt-data_simple_wrap.js | readAltContent - [readAltContent : 151] "text/binary" == "text/binary"
02:37:02 INFO - TEST-PASS | netwerk/test/unit_ipc/test_alt-data_simple_wrap.js | readAltContent - [readAltContent : 152] "!@#$%^&*()" == "!@#$%^&*()"
02:37:02 INFO - (xpcshell/head.js) | test pending (2)
02:37:02 WARNING - TEST-UNEXPECTED-FAIL | netwerk/test/unit_ipc/test_alt-data_simple_wrap.js | onInputStreamReady/< - [onInputStreamReady/< : 159] "" == "response body"
02:37:02 INFO - Z:/task_1539827455/build/tests/xpcshell/tests/netwerk/test/unit/test_alt-data_simple.js:onInputStreamReady/<:159
02:37:02 INFO - Z:/task_1539827455/build/tests/xpcshell/head.js:run:692
02:37:02 INFO - Z:/task_1539827455/build/tests/xpcshell/head.js:_do_main:219
02:37:02 INFO - Z:/task_1539827455/build/tests/xpcshell/head.js:_execute_test:533
02:37:02 INFO - typein:null:0
02:37:02 INFO - exiting test
02:37:02 INFO - (xpcshell/head.js) | test finished (2)
02:37:02 INFO - CHILD-TEST-COMPLETED
02:37:02 INFO - (xpcshell/head.js) | test finished (1)
02:37:02 INFO - exiting test
02:37:02 INFO - PID 10596 | [Parent 10596, Gecko_IOThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
02:37:02 INFO - PID 10596 | [Child 7240, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
02:37:02 INFO - <<<<<<<
02:37:03 INFO - TEST-START | toolkit/components/extensions/test/xpcshell/test_ext_native_messaging.js
Push with failures:
https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=adb07f2a1331142569039567d81f5168ada17198
Backout:
https://hg.mozilla.org/integration/autoland/rev/856c528a098312ad0ee77b5c6960993974ed19f4
Flags: needinfo?(valentin.gosu)
Comment 23•6 years ago
|
||
Pushed by dluca@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e37adb23fd48
Allow calling nsICacheInfoChannel.preferAlternativeDataType(altDataType, contentType) multiple times r=michal,luke
https://hg.mozilla.org/integration/autoland/rev/42319047f3d9
Allow opening the input stream for original content when alt-data is available r=michal,luke
Comment 24•6 years ago
|
||
This proved to be intermittent, as such, I relaned bug 1487100 and reopened bug 1499917 for the intermittent failures.
Flags: needinfo?(valentin.gosu)
Comment 25•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/e37adb23fd48
https://hg.mozilla.org/mozilla-central/rev/42319047f3d9
Status: NEW → RESOLVED
Closed: 6 years ago
status-firefox64:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
You need to log in
before you can comment on or make changes to this bug.
Description
•