Closed Bug 744713 Opened 13 years ago Closed 9 years ago

Add support for putting "total size" into appcache manifest

Categories

(Core :: Networking: Cache, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX
blocking-kilimanjaro +
blocking-basecamp -

People

(Reporter: sicking, Unassigned)

References

Details

(Whiteboard: [tech-bug])

In order to display a progress bar as an app is being installed, we need to know how many bytes the app is, and how many bytes have been downloaded. Bug 744710 is for letting us know how many bytes have been downloaded. This bug is for letting us know how many bytes the total is. To do this we should add support for a new section in the appcache manifest (probably "moz" prefixing the name of the section) where the author can put the total application size. We should then expose whatever value the author put in there through nsIOfflineCacheUpdate.
We may also use preflight HEAD requests (in parallel bursts). I don't think developers will add information about total size to the manifest. It may be hard to count. I cannot imagine updating the field in the manifest myself every time I update the app.
You have to update the manifest any time you update any of your files, otherwise browsers won't check for new versions of the files. But I agree it'll be problematic to get people to put the right information in here. But I'm concerned that HEAD requests would significantly increase download times.
Over to Honza in hopes that he can help out here.
Assignee: nobody → honzab.moz
I personally feel that requiring authors to maintain an accurate size in the manifest file is bound to be troublesome. I'd much prefer a small update time penalty to figure out the size ourselves over assuming that people will consistently get the size in the manifest updated correctly.
Having a fallback which uses HEAD requests to figure out sizes certainly seems like a good idea (even if the "fallback" ends up being the common case).
This blocks our webapps story, so blocking kilimanjaro.
blocking-kilimanjaro: --- → +
App marketplaces can offer this information themselves, through internal store APIs. See bug 751758 for calculating the size based on server-reported resource sizes.
What's the actual technical proposal for that? Should the size be passed to the install function? It seems strange to me that we would pull *all* other information about an application's name/icon/resources/securitypolicy etc from the app developer, but get the size from the store. It feels very random and inconsistent.
The store tracks the total size internally, and the PHP that generates the app-store page does Install size: <?php echo get_size();?>MB or something to that effect. Putting the size in the manifest itself would be OK as long as the manifest is trusted, but when it's not, then developers can say their install is 1KB when it's actually 100MB.
(In reply to Jonas Sicking (:sicking) from comment #9)
> What's the actual technical proposal for that? Should the size be passed to
> the install function? It seems strange to me that we would pull *all* other
> information about an application's name/icon/resources/securitypolicy etc
> from the app developer, but get the size from the store. It feels very
> random and inconsistent.
The developer will have to submit their files to the store (usually), so asking them to record the size is redundant. The store can figure that out on its own.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #10)
> Putting the size in the manifest itself would
> be OK as long as the manifest is trusted, but when it's not, then developers
> can say their install is 1KB when it's actually 100MB.
We're getting all other data from the manifest, so we'd better ensure that it's trustworthy.
> The developer will have to submit their files to the store (usually)
What are you basing this statement on?
(In reply to Jonas Sicking (:sicking) from comment #11)
> (In reply to Chris Jones [:cjones] [:warhammer] from comment #10)
> > Putting the size in the manifest itself would
> > be OK as long as the manifest is trusted, but when it's not, then developers
> > can say their install is 1KB when it's actually 100MB.
>
> We're getting all other data from the manifest, so we better ensure that
> it's trustworthy.
Trusted stores can sanitize manifests. Arbitrary manifests are untrusted; we can't believe their size if they declare it.
> > The developer will have to submit their files to the store (usually)
>
> What are you basing this statement on?
If a store vouches for code in any way, it has to host the code (or delegate to a trusted third party, or use code signing with a PKI).
(In reply to Chris Jones [:cjones] [:warhammer] from comment #12)
> > > The developer will have to submit their files to the store (usually)
> >
> > What are you basing this statement on?
>
> If a store vouches for code in any way, it has to host the code (or delegate
> to a trusted third party, or use code signing with a PKI).
I think most apps are going to fall into the "untrusted" bucket. See Lucas' emails regarding the 4 types of content. This is a much safer (for users) and lower-cost (for developers and the store) model, and so hopefully one that more developers will choose. For trusted content, we haven't yet decided how it will be implemented, but using PKI and signatures is definitely a viable option. So I don't think that we can say that files will usually, if ever, come from the store at this point.
(In reply to Jonas Sicking (:sicking) from comment #13)
> (In reply to Chris Jones [:cjones] [:warhammer] from comment #12)
> > > > The developer will have to submit their files to the store (usually)
> > >
> > > What are you basing this statement on?
> >
> > If a store vouches for code in any way, it has to host the code (or delegate
> > to a trusted third party, or use code signing with a PKI).
>
> I think most apps are going to fall into the "untrusted" bucket.
> So I don't think that we can say that files will usually, if ever, come from
> the store at this point.
Eventually, yes. Initially, no.
Idea:
- write and publicly distribute a PHP script that web admins may host
- the script sums the size of the resources in the manifest on the server side
- it would be used as "http://example.com/cachesize.php?manifest.txt" and would return the summed cache size in the content or in a header
- we may cache the result and return 304 when the manifest has not been changed on the server (the ETag or Last-Modified may be the same as the ETag and Last-Modified of the manifest, for instance)
- have a new section of the manifest like "SIZE:"
- have an item in the section like "http://example.com/cachesize.php", simply a URL pointing to the script (under the same-origin-with-the-manifest rule)
- during the update, first make a request for "http://example.com/cachesize.php?the-manifest-URL" and get the size
- browsers that don't support the SIZE section will ignore it and just cache the cachesize.php resource as an explicit item (have to double check this)
This way we don't need to do HEAD request bursts, we are not dependent on hosting anything on our marketplace, and we don't put the effort of calculating the size of all files on web admins every time they deploy; they only need to host the script file (upload to FTP) and refer to the same script URL in all their hosted manifests.
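The summing step of this idea can be sketched as follows. The thread proposes PHP; this is a Python sketch for illustration only, not the script that was planned. The names `cached_urls`, `total_size` and the `size_of` callback are hypothetical, and the parser is a simplification that assumes the standard appcache sections (CACHE, NETWORK, FALLBACK): entries before any section header are explicit, NETWORK entries are not stored, and only the second URL of a FALLBACK entry is stored.

```python
from typing import Callable, List


def cached_urls(manifest_text: str) -> List[str]:
    """Return the URLs an appcache manifest asks the browser to store."""
    section = "CACHE"  # entries before any section header are explicit entries
    urls: List[str] = []
    for raw in manifest_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line == "CACHE MANIFEST":
            continue
        if line.endswith(":"):
            section = line[:-1]  # e.g. "NETWORK:" switches to the NETWORK section
            continue
        parts = line.split()
        if section == "CACHE":
            urls.append(parts[0])
        elif section == "FALLBACK" and len(parts) == 2:
            urls.append(parts[1])  # only the fallback resource is stored
        # NETWORK and unknown sections (such as a proposed SIZE) are skipped
    return urls


def total_size(manifest_text: str, size_of: Callable[[str], int]) -> int:
    """Sum the sizes of all distinct cached URLs; size_of is supplied by the caller."""
    distinct = dict.fromkeys(cached_urls(manifest_text))  # drop repeated URLs
    return sum(size_of(url) for url in distinct)
```

On a server, `size_of` might stat local files or issue requests with selected headers forwarded; that choice is left open here, as it is in the thread.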
If no one objects to the suggestion from comment 15, I'll start to write the PHP script.
Have we already rejected the idea of solving app delivery problems by packaging the apps in (optionally signed) archives? Because, packaging the apps in archives would give us this for free, in the Content-Length of the response containing the archive. I think it would be way too error-prone for web developers to be able to keep details like sizes and signatures in the AppCache manifest in sync with the resources on their servers, especially in the face of caching and rewriting/transcoding HTTP proxies that will break the signatures and change the sizes of all the resources mentioned in the manifest (and even the manifest itself), but also even in the face of caching done by their own server infrastructure. It is not uncommon for such intermediate proxies to ignore HTTP caching directives, and such proxies are commonly deployed by mobile phone operators. Also, for similar reasons, we cannot expect any app store to receive the same responses from a website that the user would receive from that same site even at the same point in time. So, we can put stuff in the manifest files, but we shouldn't rely on it being correct.
(In reply to Brian Smith (:bsmith) from comment #17)
> delivery problems by
> packaging the apps in (optionally signed) archives?
How would app cache then work with it? Is there a bug or thread about this?
> I think it would be way too error-prone for web developers to be able
Agree.
> So, we can put stuff in the manifest files, but we shouldn't rely on it
> being correct.
And what about the PHP script idea? I'm not sure how well it can count the size of script responses, though (I'm no deep PHP expert).
Rather than using a server-side solution to generate the size from a separate url, why not use a server-side solution to generate the manifest? I.e. create a php script which, given a cache manifest, generates a new manifest which includes a SIZE section which includes the actual size. So something like http://example.com/cachesize.php?manifest.txt would return the manifest itself, modified to include the total size of all resources inside a SIZE section. I definitely think that we should look into packaging in case we don't think that people can produce reliable information in the manifest. However that's very much out of scope for this bug. We are very aware that the appcache feature has big problems, but it's the only thing we have to work with right now.
(In reply to Jonas Sicking (:sicking) from comment #19)
> Rather than using a server-side solution to generate the size from a
> separate url, why not use a server-side solution to generate the manifest?
One important note: the content of files, and even of the manifest itself, may depend on headers sent to the server during the update process, e.g. Accept-Language or User-Agent. So we should generate the size at runtime (and cache it) by calling get_headers(), passing selected headers that may influence the content size via stream_context_set_default(). But I don't know how well get_headers() works when accessing files on the same server.
> I.e. create a php script which given a cache manifest generates a new manifest
> which includes a SIZE section which includes the actual size.
> So something like http://example.com/cachesize.php?manifest.txt would return the
> manifest itself, modified to include the total size of all resources inside a
> SIZE section.
Assuming that http://example.com/cachesize.php?manifest.txt would be referenced with the manifest= attribute, that is not much different from what I propose, except we don't need to do any extra request. I like it, good idea.
Yes, that's exactly what I was thinking. Sounds great!
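The manifest-rewriting variant agreed on above could look roughly like this. It is a Python sketch rather than the proposed PHP, and it rests on an assumption: the thread never settles the layout of the SIZE section, so the single line holding a byte count is hypothetical, as is the function name `with_size_section`.

```python
def with_size_section(manifest_text: str, total_bytes: int) -> str:
    """Return the manifest with a trailing SIZE: section (hypothetical format).

    Any SIZE: section already present is dropped first, so the function can be
    re-run on its own output without stacking sections.
    """
    kept = []
    skipping = False
    for line in manifest_text.splitlines():
        stripped = line.strip()
        if stripped == "SIZE:":
            skipping = True  # start of a stale SIZE section
            continue
        if skipping and stripped.endswith(":"):
            skipping = False  # the next section header ends the stale section
        if not skipping:
            kept.append(line)
    while kept and not kept[-1].strip():
        kept.pop()  # trim trailing blank lines before appending
    return "\n".join(kept + ["", "SIZE:", str(total_bytes), ""])
```

Because the size is served inside the manifest itself, the browser needs no extra request, which is the advantage noted in the comments above.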
(In reply to Jonas Sicking (:sicking) from comment #19)
> I definitely think that we should look into packaging in case we don't think
> that people can produce reliable information in the manifest. However that's
> very much out of scope for this bug. We are very aware that the appcache
> feature has big problems, but it's the only thing we have to work with
> right now.
The important thing is that we should treat the size only as an estimate or advice, and we should expect it to be wrong many times. In particular, we shouldn't fail because it is wrong, and any UI needs to handle the case where the size doesn't match what we actually download.
Basically I think we should fire progress notifications as if the data is correct, but if we reach the number of bytes that was in the manifest, while still having resources to download, we should simply switch to acting as if the manifest didn't include a size estimate at all. Similarly, if we reach the end of all resources even if the manifest indicated that there are additional bytes to download, we should fire whatever "done" notifications we have just as normal.
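The behaviour described in the comment above can be sketched as a small state tracker. This is illustrative Python only, not Gecko code; the class `InstallProgress` and its method names are hypothetical, and `None` stands in for "no size estimate available".

```python
from typing import Optional


class InstallProgress:
    """Track download progress against a possibly wrong declared size."""

    def __init__(self, declared_total: Optional[int]) -> None:
        self.declared_total = declared_total  # None means no estimate was given
        self.received = 0
        self.done = False

    def on_data(self, nbytes: int, more_resources_pending: bool) -> None:
        self.received += nbytes
        if (self.declared_total is not None and more_resources_pending
                and self.received >= self.declared_total):
            # The estimate was too low: act as if no size had been declared.
            self.declared_total = None

    def on_all_resources_done(self) -> None:
        # Fire "done" as normal, even if the estimate promised more bytes.
        self.done = True

    def fraction(self) -> Optional[float]:
        """Return progress in [0, 1], or None when it cannot be determined."""
        if self.done:
            return 1.0
        if self.declared_total is None or self.declared_total <= 0:
            return None
        return min(self.received / self.declared_total, 1.0)
```

A UI would draw a determinate bar while `fraction()` returns a number and switch to an indeterminate one when it returns `None`.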
blocking-basecamp: --- → -
Every other app marketplace tells you how large the app you're about to download is, and the lack of this is especially serious in markets with limited bandwidth. Bad user-visible problem.
Whiteboard: [tech-bug]
I don't think this can be solved in a good way, except when sending a packaged app over the wire (Content-Length is the correct source of information here). There are just too many variables, e.g. UA-, locale-/region- or user-dependent modifications as well as encoding, to name a few. Imagine a total size of 5kB is assumed, but the app is actually 5MB: the user will feel misguided, e.g. if the progress bar changes from 100% to an indeterminate state and stays that way for several minutes without any indication of what is going on (except "downloading"). The user may also believe the installer stalled (or something similar). I'd rather use an indeterminate state and, as soon as we have a pretty *accurate* estimate of how much data there is left to download, show a progress bar with a determinate state. This estimate can be provided by a marketplace or the Content-Length property. The only good source for the size information of unpackaged apps is each file's Content-Length. Then either
1) do a preflight HEAD for all resources prior to downloading, or
2) start downloading (preferably several files in parallel) and wait for the last resource to send a Content-Length header, or
3) start downloading and simultaneously HEAD all remaining resources to be able to determine the total file size as quickly as possible.
I prefer option 3 as it is a compromise between speed, accuracy and best UX. Any thoughts?
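Whichever of the three options is used, the rule for when the bar can become determinate is the same: the total is known only once every resource has reported a Content-Length. A minimal Python sketch of that rule, assuming a hypothetical mapping from URL to the length seen so far (`None` until a GET or HEAD response supplies it):

```python
from typing import Dict, Optional


def total_if_known(content_lengths: Dict[str, Optional[int]]) -> Optional[int]:
    """Return the summed size, or None while any resource's length is unknown."""
    lengths = list(content_lengths.values())
    if any(length is None for length in lengths):
        return None  # keep the progress bar indeterminate
    return sum(length for length in lengths if length is not None)
```

With option 3, the HEAD responses fill in the missing entries while the first downloads are already running, so this function would start returning a number sooner than with option 2.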
I wouldn't worry too much about getting an exact number here. I think the idea is to simply allow the app author to provide a number in the manifest (since it's so much work to try to get it otherwise). If the author is wrong, then that's their fault. And they don't need to be exactly right--this is more for users to eyeball when they're deliberating whether to download over a non-wifi connection, etc. See comment 22 and 23.
Actually, reading over the earlier bug comments, we seemed to be going with the idea of the marketplace-based script that calculates the length in advance and then automatically writes it into the manifest. I can't tell from comment 20 whether we're planning on changing that approach because of possible length differences due to locale. Do we really expect length to vary by such a significant amount because of language that we can't use the result as a user-facing estimate? It seems better to get a rough estimate in advance that may be slightly off for some languages than to wait for an estimate at load time just to get more accurate numbers. Honza, which approach are you planning to take?
Do we really need to write this into the manifest? It shouldn't be there (due to the facts stated above). Instead, I'd rather see a marketplace API for obtaining a size estimate, and for non-marketplace appcached web apps, option 3 that I explained above.
Create a Mozilla web tool that takes the manifest URL, downloads all the files with the request headers that might vary forwarded, and caches the size. Until the manifest changes (md5 sum), provide the cached size for the same UA/lang/accept/whatever we choose as varying headers. The result can be a json file like {"manifest":"http://...", "byte-size":89394}. We can use it to display the size on the marketplace as well as to allow web developers to add this number to their manifests as a cached quick hint. We can introduce a SIZE: section, for instance (have to think more about the manifest format changes).
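The caching rule in this proposal can be sketched in Python as below. No such tool is described beyond this comment, so everything here is an assumption for illustration: the class `SizeCache`, the `measure` callback that would download the resources and count bytes, and the choice of `User-Agent` and `Accept-Language` as the varying headers. Only the md5-keyed invalidation and the JSON field names come from the comment.

```python
import hashlib
import json
from typing import Callable, Dict, Tuple

# Assumption: which request headers are treated as varying is not decided in the thread.
VARYING_HEADERS = ("User-Agent", "Accept-Language")


class SizeCache:
    """Cache measured sizes until the manifest body or a varying header changes."""

    def __init__(self, measure: Callable[[str, Dict[str, str]], int]) -> None:
        self._measure = measure  # hypothetical: downloads resources, returns bytes
        self._cache: Dict[Tuple[str, ...], int] = {}

    def report(self, manifest_url: str, manifest_body: bytes,
               headers: Dict[str, str]) -> str:
        digest = hashlib.md5(manifest_body).hexdigest()  # change detection only
        varying = {name: headers.get(name, "") for name in VARYING_HEADERS}
        key = (manifest_url, digest) + tuple(varying.values())
        if key not in self._cache:
            self._cache[key] = self._measure(manifest_url, varying)
        return json.dumps({"manifest": manifest_url,
                           "byte-size": self._cache[key]})
```

The md5 digest is used purely to notice that the manifest changed, as the comment suggests, not for any security purpose.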
Assignee: honzab.moz → nobody
app cache is going away
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX