Closed Bug 12579 Opened 25 years ago Closed 25 years ago

Implement jar: protocol (tracking bug)

Categories

(Core :: Networking, defect, P3)

All
Other
defect

VERIFIED FIXED

People

(Reporter: warrensomebody, Assigned: warrensomebody)

Details

(Keywords: perf, Whiteboard: [pdt+])

We need jar: protocol support for several reasons:
- performance (accessing lots of small files from a jar should be faster)
- downloadable chrome
- general product packaging

There are a couple of pieces to this:

- implementing jar: URLs (page 16 of the JavaHelp spec
<http://java.sun.com/products/javahelp/spec-1.0.pdf>)

- implementing the jar protocol handler (to read and write entries -- might
look a lot like the file: protocol)

- upgrading the current tree to make use of jar files, fixing xul, makefiles,
etc. (need a separate bug for tracking this)
Target Milestone: M11
Blocks: 12833
Whiteboard: [Perf]
Putting on [Perf] radar.
Blocks: 12838
No longer blocks: 12833
*** Bug 4707 has been marked as a duplicate of this bug. ***
Target Milestone: M11 → M13
Assignee: valeski → mstoltz
At this point, I'm writing the protocol handler and Gayatri is writing the
jar: URL parser.
Here are some thoughts on how I think some of this should look...

URL parsing -- The jar: URLs should probably have 2 instance variables that
point to other URLs, i.e.:

class nsJARURL : public nsIURL {
...
  nsCOMPtr<nsIURI> mJARPath;
  nsCOMPtr<nsIURI> mRelativePath;
};

where the syntax is "jar:<mJARPath>!/<mRelativePath>". In other words -- we
shouldn't need to write a complicated parser here, just something that looks
for "jar:" and "!/" and delegates the rest to nsStdURL.
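
The "look for jar: and !/" idea might be sketched roughly as below. This is only an illustration in standard C++, not the eventual nsJARURL code: `JarParts` and `SplitJarSpec` are hypothetical names, plain strings stand in for the two nsIURI members, and handing each half to nsStdURL is left out. It splits at the first "!/", which is enough for a single, non-nested archive.

```cpp
#include <cassert>
#include <string>

// Hypothetical holder for the two halves of a jar: URL; the field names
// echo mJARPath / mRelativePath from the class outline above.
struct JarParts {
    std::string jarPath;       // URL of the archive itself
    std::string relativePath;  // entry inside the archive
};

// Splits "jar:<jarPath>!/<relativePath>" at the first "!/".
// Returns false (leaving *out untouched) when the "jar:" prefix or the
// "!/" separator is missing.
bool SplitJarSpec(const std::string& spec, JarParts* out) {
    assert(out != nullptr);
    const std::string scheme = "jar:";
    const std::string separator = "!/";
    if (spec.compare(0, scheme.size(), scheme) != 0) {
        return false;  // not a jar: URL
    }
    const std::string::size_type sep = spec.find(separator, scheme.size());
    if (sep == std::string::npos) {
        return false;  // no archive/entry separator
    }
    out->jarPath = spec.substr(scheme.size(), sep - scheme.size());
    out->relativePath = spec.substr(sep + separator.size());
    return true;
}
```

Each half would then be parsed by the ordinary URL code, so nothing jar-specific needs to understand hosts, ports or paths.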


JAR protocol handler -- I'm about to check in changes to nsFileTransport that
abstract out the part that really opens/closes/reads/writes files from the
state machine that deals with asynchronous activity, suspend/resume/cancel,
etc. You should be able to use this to implement the JAR protocol.

Basically, there will be a new interface, nsIFileSystem that you'll have to map
to libjar. Once you do that, I think it will just work. I'll let you know when
it goes in (today hopefully).

Please come by if either of you have any questions. I don't want us to work in
a vacuum on this stuff. We should try to keep in close contact to make the
quickest progress.

Thanks for doing this!

Warren
Just checking, but is <mJARPath> a URL in its own right? i.e.
"jar:http://foo.com/my.jar!/some/stuff.html"?

The original Sun spec we saw on this could handle archives embedded in other
archives. We may not want to figure that out now, but let's use a non-greedy
search for !/ so we can support more later.

Moving resources into archives may be a space win (especially on FAT drives,
even without compression), but it's not going to be a performance win if for
each requested resource we open the archive, parse the directory, and *then*
serve the data. We need some way to keep these archives open, and I don't know
how we can do that using stateless URL syntax.

Maybe this could be part of the magic of the "chrome:" protocol. Rather than
have to change all existing chrome: URLs we just add to the magic where chrome:
already adds "default" or local directories, it could open archives and
eventually close them when the chrome system shuts down. Not sure how that
plays with the jar: protocol idea, though.
Adding this functionality to chrome: is a less general solution, since other code
besides UI could make use of a jar: protocol. In particular, I need it to
implement signed scripts. Since in the first iteration, using libjar as is, we
need to download a jar file to disk before extracting files from it, there should
be a way to keep ahold of that file on disk for subsequent calls to that URL.
This will improve performance.
Sorry, I didn't mean we should give up on the "jar:" protocol -- I really want
that to try some cool ideas I have floating around.  I just meant that a "jar:"
protocol *by itself* is not going to solve the perceived chrome performance
problem which I assumed was the impetus for this bug given the [perf] status.

Maybe part of that solution would be for libjar (at a level below the protocol
handler) to keep a refcounted table of open archives, and return references to
them.  That way, for example, simultaneous browser windows wouldn't cause the
main resource .jar to be opened multiple times with the associated memory
cost to hold the directory structure.  Then the chrome/XUL system could open
and keep a reference to its main archives (using nsIZip directly), and then the
jar: protocol will find the archives already opened, saving much time.
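
A refcounted table of open archives could look something like the sketch below. It is an assumption-laden illustration in standard C++ rather than libjar code: `OpenArchive` and `ArchiveTable` are hypothetical names, constructing an `OpenArchive` stands in for opening the file and parsing its directory, and no locking is shown.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for an opened archive; a real one would hold the
// parsed zip directory.
struct OpenArchive {
    explicit OpenArchive(const std::string& p) : path(p) {}
    std::string path;
};

// Hands out shared references to open archives, keyed by path. An archive
// stays open while anyone holds a reference; after the last reference is
// dropped, the next request opens it again.
class ArchiveTable {
public:
    std::shared_ptr<OpenArchive> Get(const std::string& path) {
        assert(!path.empty());  // an empty path is a caller bug in this sketch
        std::shared_ptr<OpenArchive> archive = mOpen[path].lock();
        if (!archive) {
            archive = std::make_shared<OpenArchive>(path);  // the "open" step
            mOpen[path] = archive;
            ++mOpenCount;
        }
        return archive;
    }

    // Number of times an archive actually had to be opened.
    int OpenCount() const { return mOpenCount; }

private:
    std::map<std::string, std::weak_ptr<OpenArchive> > mOpen;
    int mOpenCount = 0;
};
```

With a table like this, the chrome system would hold one reference to its main archive for as long as it runs, and every jar: request for that archive would reuse the open copy instead of reparsing the directory.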

For other uses jar: still works, just isn't as optimized.
> Just checking, but is <mJARPath> a URL in its own right? i.e.
> "jar:http://foo.com/my.jar!/some/stuff.html"?

Yes, and I assume you can have cases like this too:

  jar:jar:http://foo.com/my.jar!/other.jar!/stuff.html

This would imply to me that we have to look for the "!/" starting from the end
and working backwards.
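
To illustrate why the search has to run backwards: splitting the nested example at the first "!/" would leave "jar:http://foo.com/my.jar" as the archive part, which is itself an incomplete jar: URL. A rough sketch of the backwards approach follows, in standard C++ with hypothetical names (`JarChain`, `UnwrapJarSpec`) and plain strings in place of nsIURI objects.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result type: the innermost non-jar URL plus the chain of
// archive entries to open, starting with the entry in the outermost archive.
struct JarChain {
    std::string baseUrl;
    std::vector<std::string> entries;
};

// Peels off one jar: layer per iteration by splitting at the LAST "!/".
// Returns false if any jar: layer lacks its "!/" separator.
bool UnwrapJarSpec(std::string spec, JarChain* out) {
    assert(out != nullptr);
    const std::string scheme = "jar:";
    const std::string separator = "!/";
    std::vector<std::string> innermostFirst;
    while (spec.compare(0, scheme.size(), scheme) == 0) {
        const std::string::size_type sep = spec.rfind(separator);
        if (sep == std::string::npos) {
            return false;
        }
        innermostFirst.push_back(spec.substr(sep + separator.size()));
        spec = spec.substr(scheme.size(), sep - scheme.size());
    }
    out->baseUrl = spec;
    // Entries were collected from the innermost entry outwards; reverse them
    // so they can be opened in order.
    out->entries.assign(innermostFirst.rbegin(), innermostFirst.rend());
    return true;
}
```

For the nested URL above this yields the base URL http://foo.com/my.jar and the entries other.jar, then stuff.html.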

Also note that we'll most likely need to pull the jar file down to the local
drive so that we can access it with our libjar code. We can short-circuit this
for the case where the "mJARPath" part is a file: URL, but in general this means
some sort of cache management code to clean up jar files after they're no longer
needed (whenever that is). Whoever does this should sync up with Scott Furman to
determine whether any of his network cache services will be useful in
implementing this.

Re. chrome URLs: I have this idea in the back of my mind that they can go away,
being replaced by a more general search-path-based approach, probably subsumed
by some change to resource: URLs. E.g. chrome://foo/bar.xul might become
something like resource://ChromeDirs/foo/bar.xul. Obviously, jar files are one
place on the search path you might want to look, so this resource: URL could
again get translated into jar:file://<exe-dir>/chrome.jar!/foo/bar.xul.
Status: NEW → ASSIGNED
Blocks: 16654
Summary: Implement jar: protocol → [dogfood]Implement jar: protocol
Whiteboard: [Perf] → [Perf][PDT+]
Added dogfood to label along with PDT+ annotation. We noted that it was lower
level infrastructure, and that it was listed as a high priority bug in a status
report.  If we are confused about this being dogfood, please email the PDT
alias.
Thanks.
I do not believe this is required for dogfood at all.  Chrome will work fine
without needing to be in jars.  We may want it for beta, but we certainly don't
need it for dogfood.

I'd go so far as to say it may not even be needed in the shipping product for
chrome... we should do it if we think it's a performance win, but if we get fast
enough without it, then I wouldn't advocate doing the extra work just for
chrome.

If there are other reasons why we need it, then that's cool.  I just wanted to
give the chrome perspective.
Note that jar: URLs are needed for smart update. The resource/chrome URL idea is
separate. However, I think that this sort of architectural detail should be
ironed out now before we get too far down the road and entrenched with chrome
URLs. I intend to investigate to see if there's something better/more general
that we can do here.
Apparently, the large number of small files causes massive disk bloat on the Mac
because the large filesystem block size is inefficient for small files.  It's
been reported that a Mac Mozilla install requires 220M on disk.  So, even for
chrome, I would hazard that jar support is a requirement in the shipping
product.  I don't see why it should be considered necessary for dogfood,
however.
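
The block-size effect is simple rounding: every file occupies a whole number of allocation blocks, however small it is. A minimal sketch of the arithmetic follows; the function name `AllocatedBytes` and the sizes in the note after it are illustrative assumptions, not measurements from any install.

```cpp
#include <cassert>
#include <cstdint>

// Bytes a file occupies on disk when space is handed out in whole blocks.
std::uint64_t AllocatedBytes(std::uint64_t fileSize, std::uint64_t blockSize) {
    assert(blockSize > 0);
    const std::uint64_t blocks = (fileSize + blockSize - 1) / blockSize;  // round up
    return blocks * blockSize;
}
```

With an assumed 32 KB allocation block, a 1 KB file still takes 32 KB on disk, so a tree of many small files can occupy many times its nominal size. Packed into one archive, the same data wastes at most one partial block.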
I agree that it's not dogfood, but I do think it's a
porkjockeys/architectural/beta feature.
Yeah, sounds like we need it for the shipping product to reduce bloat at the
very least.
Summary: [dogfood]Implement jar: protocol → Implement jar: protocol
I second that. We really don't want to ship a product with hundreds of .xul
files, that's sloppy. I am removing the dogfood label, since this is not crucial
for dogfood.
Whiteboard: [Perf][PDT+] → [Perf] needed for Beta
jar: URLs are not required for XPInstall (smartupdate). We need ZIP archive
support, but we have enough in nsIZip already.

Clearing the PDT status to match the clearing of Dogfood.
Blocks: 16950
Assignee: mstoltz → gayatrib
Status: ASSIGNED → NEW
Reassigning to myself.
Status: NEW → ASSIGNED
Blocks: 17432
Blocks: 17907
Blocks: 18433
Depends on: 18434, 18435
Blocks: 18471
Blocks: 18951
Blocks: 20203
Blocks: 21564
Bulk move of all Necko (to be deleted component) bugs to new Networking
component.
Keywords: perf
Bulk add of "perf" to new keyword field.  This will replace the [PERF] we were
using in the Status Summary field.
Gayatri...is this feature stable enough to declare it FIXED? What else needs to
be done?
Target Milestone: M13 → M14
Moving milestone to M14. I need to get some testing done.
Dan Veditz needs to let me know some information regarding cancelling
a jar installation while extraction is going on. So I will file that
as a separate bug and try to close the jar protocol asap.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
This still misses the cancel method implementation. Opening that as a
separate bug. The rest of the functionality is fine.

To test it:
1) Please note that when you use local jar files, you need to type
file:/// and not file:// in the browser window. An example local
jar url would be something like:
jar:file:///h:/testHtml.zip!/hello1.html

2) Example urls to test http and ftp downloads are:
jar:http://www.relisoft.com/source.zip!/readme.txt
jar:ftp://ftp.mall.net/wham131.zip!/wham.txt

These are small files and can be easily downloaded and tested.
No longer blocks: 18433
Depends on: 18433, 24338
No longer depends on: 18433
Blocks: 18433
This really isn't done until all the dependencies are done. Reopening.
Status: RESOLVED → REOPENED
Depends on: 24765
No longer depends on: 24765
Depends on: 24765
Depends on: 24764
Clearing FIXED resolution due to reopen.
Resolution: FIXED → ---
On behalf of PorkJockeys: putting on beta1 radar, per beta criteria priority #2 
- performance (esp. file I/O on Mac) is not within beta metrics. removing 
extraneous tags, cc waterson
Keywords: beta1
Whiteboard: [Perf] needed for Beta
what parts of jar protocol are working, and which ones not?  does brutal sharing 
help?
Per warren:
"Look at the dependencies, or more importantly, the dependencies of
http://bugzilla.mozilla.org/showdependencytree.cgi?id=18433."
Per phil:
"Warren, I thought part of the performance win was that we wouldn't need to open 
and close a zillion little xul, js, and dtd files in order to paint the chrome. 
But now that we have brutal sharing, we probably only do that once, which means 
less performance benefit to jar files. Or am I off base?" 
I don't know about a performance benefit, but a Nav-only install on a FAT drive 
takes over 50 MB because of all the small files.
Mac (HFS) has the same problem. 

Additionally, this feature has regressed: jar:http no longer works. I see 
'Shortcut+' in the debug output, and then nothing happens, it fails silently. 
jar:file still works on Linux.
Whiteboard: [pdt+]
Clarification: as of today, jar:http fails silently on all platforms. jar:file 
works on all platforms. 
Works for me. I tried this:

jar:http://www.boulderdesign.com/Bin.zip!/chrome/global/content/default/about.html

The initial try takes a long time to download (because it's big), the second 
one is really fast because it comes out of the jar cache. Mitch: Are you sure 
you were giving a valid jar: URL?
Warren, I tried this. It does not work for me as of now. I am trying to debug 
into it. What happens is that it goes into http::OpenInputStream(), it then 
comes into nsJARDownloadObserver::OnStopRequest(), and returns from there. Then 
the browser url becomes the http site without the jar part and that's about it. 
Nothing happens after that. I am trying to debug this--just keeping you posted.
I tried both with your url and also with:
jar:http://www.relisoft.com/source.zip!/readme.txt
(this is only a 28kb file--so the download should be pretty fast)
Works for me (jar:http://www.relisoft.com/source.zip!/readme.txt), on both my 
machines. What are you doing differently? Are you up to date?
This has become a tracking bug, the fix was already checked in.  Moving to 
Warren.  Selmer from Gayatri's machine.
Assignee: gayatrib → warren
Status: REOPENED → NEW
Summary: Implement jar: protocol → Implement jar: protocol (tracking bug)
I believe the actual jar: protocol is done enough for beta1, and keeping this 
as a tracking bug has become confusing so I'm going to close it.

The remaining tasks are to use the cache manager to manage downloaded jar files 
(bug 24765) which depends on stream-as-file (bug 21250), and also implementing 
an open jar file cache for performance (bug 24764).
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
tracking bug, marking verified
Status: RESOLVED → VERIFIED
No longer blocks: 17432
No longer blocks: 17907
No longer blocks: 18471
No longer blocks: 18951
No longer blocks: 20203
No longer blocks: 21564