Closed Bug 840094 Opened 11 years ago Closed 11 years ago

Change how nsZipArchive logging works

Categories

(Core :: Networking: JAR, defect)

Priority: Not set
Severity: normal

Tracking

Status: RESOLVED FIXED
Target Milestone: mozilla22
Tracking Status
firefox21 --- fixed
firefox22 --- fixed

People

(Reporter: glandium, Assigned: glandium)

References

(Depends on 1 open bug)

Details

Attachments

(4 files, 1 obsolete file)

Currently, when the MOZ_JAR_LOG_DIR environment variable is set, you get one .log file per opened zip archive in the directory given as the variable's value, named after the archive's leaf name. So if /foo/bar/omni.ja is opened, then /foo/bar/baz/omni.ja, both will have their accesses logged in the same $MOZ_JAR_LOG_DIR/omni.ja.log file.
With this change, everything is instead logged in a single file given by the MOZ_JAR_LOG_FILE environment variable. Each log entry contains the URI of the zip archive, followed by the path within the archive.

This will require changes to the packager to handle the resulting log.
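For illustration, entries in such a log look something like the following (the paths and entry names here are made up):

file:///usr/local/firefox/omni.ja greprefs.js
file:///usr/local/firefox/browser/omni.ja chrome/browser/content/browser/browser.xul
jar:file:///usr/local/firefox/omni.ja!/chrome/en-US.jar locale/en-US/global/intl.properties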
Attachment #712516 - Flags: review?(taras.mozilla)
Comment on attachment 712516 [details] [diff] [review]
Change how nsZipArchive logging works

+      rv = logFile->OpenNSPRFileDesc(PR_WRONLY|PR_CREATE_FILE|PR_APPEND, 0644, &file);

Why would you ever want to append to an existing file? This seems wrong.

This will misbehave in an e10s world with multiple writers to the same file.

Why not keep the existing code, but make the log filename be jarfile.md5sum_of_path.log? Seems simpler
Attachment #712516 - Flags: review?(taras.mozilla) → review-
(In reply to Taras Glek (:taras) from comment #2)
> Comment on attachment 712516 [details] [diff] [review]
> Change how nsZipArchive logging works
> 
> +      rv = logFile->OpenNSPRFileDesc(PR_WRONLY|PR_CREATE_FILE|PR_APPEND,
> 0644, &file);
> 
> Why would you ever want to append to an existing file? This seems wrong

Because we sometimes OpenArchive() then CloseArchive() and OpenArchive() again, which opens, closes and re-opens the log. The current code ends up doing the same, too.

> This will misbehave in an e10s world with multiple writers to same file.

If that is indeed a problem, it already is a problem with the current code. We can deal with that in a follow-up.

> Why not keep the existing code, but make the log filename be
> jarfile.md5sum_of_path.log? Seems simpler

Simpler said than done. We'd need to ensure the md5sum can be computed properly from Python, where the logs are going to be used, and that all openings of the same file lead to the same md5sum. It's also not very human-friendly. (As a side note, how many md5 implementations do we have in the tree? I've already spotted four.)
Append mode actually solves the e10s problem by having every write() seek to the end of the file.
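To illustrate what append mode buys here, a minimal POSIX-level sketch (illustrative only; the actual patch goes through nsIFile::OpenNSPRFileDesc and NSPR, and the function and paths below are hypothetical):

#include <fcntl.h>
#include <string>
#include <unistd.h>

// With O_APPEND, the kernel positions each write() at the current end of
// file as part of the write itself, so concurrent appenders never overwrite
// each other's data. Writing a whole line with a single write() also keeps
// lines from being split by another writer's output.
void LogJarAccess(const char* aLogPath, const std::string& aEntry)
{
  int fd = open(aLogPath, O_WRONLY | O_CREAT | O_APPEND, 0644);
  if (fd < 0) {
    return;
  }
  std::string line = aEntry + "\n";
  ssize_t written = write(fd, line.data(), line.size());
  (void)written;  // error handling elided in this sketch
  close(fd);
}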
(In reply to Mike Hommey [:glandium] from comment #4)
> Append mode actually solves the e10s problem by having every write() seek to
> the end of the file.

Except NSPR doesn't do the right thing on Windows; it does it the racy way... sigh.
Depends on: 840435
(In reply to Mike Hommey [:glandium] from comment #4)
> Append mode actually solves the e10s problem by having every write() seek to
> the end of the file.

I'm pretty sure it does not. I used similar logging code elsewhere and multiple threads ended up causing new lines to be inserted midline. The problem arises because data isn't flushed line by line.

If you don't want to use md5, you can s/\//_/ on the filename to use the full path as a filename. e10s problem can be solved by using pid or some other identifier in the log.
(In reply to Taras Glek (:taras) from comment #6)
> (In reply to Mike Hommey [:glandium] from comment #4)
> > Append mode actually solves the e10s problem by having every write() seek to
> > the end of the file.
> 
> I'm pretty sure it does not. I used similar logging code elsewhere and
> multiple threads ended up causing new lines to be inserted midline. The
> problem arises because data isn't flushed line by line.

Are you sure you were doing that with write() and not printf()? AFAIK, Apache httpd just does that for its logs, and I never heard anyone complain about the logs having lines inserted midline.

> If you don't want to use md5, you can s/\//_/ on the filename to use the
> full path as a filename. e10s problem can be solved by using pid or some
> other identifier in the log.

That adds another problem: when the subprocess accesses the same zip before the main process does, how do you know the best order to use?
(In reply to Mike Hommey [:glandium] from comment #7)
> (In reply to Taras Glek (:taras) from comment #6)
> > (In reply to Mike Hommey [:glandium] from comment #4)
> > > Append mode actually solves the e10s problem by having every write() seek to
> > > the end of the file.
> > 
> > I'm pretty sure it does not. I used similar logging code elsewhere and
> > multiple threads ended up causing new lines to be inserted midline. The
> > problem arises because data isn't flushed line by line.
> 
> Are you sure you were doing that with write() and not printf()? AFAIK,
> Apache httpd just does that for its logs, and I never heard anyone complain
> about the logs having lines inserted midline.

It was using printf.

Can try this with atomic writes.
> 
> > If you don't want to use md5, you can s/\//_/ on the filename to use the
> > full path as a filename. e10s problem can be solved by using pid or some
> > other identifier in the log.
> 
> That adds another problem: when the subprocess accesses the same zip
> before the main process does, how do you know the best order to use?
Well, you could add timestamps to resolve such ordering conflicts. But I would argue that having hot data together is more relevant than exact order.
Comment on attachment 712516 [details] [diff] [review]
Change how nsZipArchive logging works

The r+ is based on IRC claims of atomic writes. It's on you to fix bug 840435 or use exclusive mode on Windows.
Attachment #712516 - Flags: review- → review+
Same patch as before, but with a workaround for bug 840435. Note that we already do an even more elaborate CreateFileW/PR_ImportFile dance in nsLocalFileWin, so this isn't actually something new. Stress-testing with multiple threads and multiple processes shows this works as expected. In the process of stress-testing, I was also able to validate that PR_Write is atomic on all tier-1 platforms, but the PR_Seek64 behavior on Windows does create problems: with threads, about 4% of the writes end up overwriting previous lines, and with processes it gets to more than 40%. Note that with how things work in NSPR, using PR_ImportFile means NSPR never calls PR_Seek64 on the file descriptor, which is exactly what we want (FILE_APPEND_DATA does the right thing; it can't be used in NSPR itself because of other problems, but those are not a concern for our use of the file descriptor in nsZipArchive).
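For reference, a minimal sketch of the kind of CreateFileW/PR_ImportFile dance in question (a hypothetical helper, not the exact code from the patch):

#include <windows.h>
#include "prio.h"           // PRFileDesc
#include "private/pprio.h"  // PR_ImportFile

// Open the log for appending through the native Win32 API, then hand the
// handle to NSPR. FILE_APPEND_DATA makes the kernel position every write at
// the end of the file, and since NSPR merely imports the handle, it never
// calls PR_Seek64 on it; that is the racy part on Windows.
static PRFileDesc* OpenAppendLog(const wchar_t* aPath)
{
  HANDLE handle = CreateFileW(aPath, FILE_APPEND_DATA,
                              FILE_SHARE_READ | FILE_SHARE_WRITE, nullptr,
                              OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
  if (handle == INVALID_HANDLE_VALUE) {
    return nullptr;
  }
  return PR_ImportFile((PROsfd)handle);
}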
Attachment #713348 - Flags: review?(taras.mozilla)
Attachment #712516 - Attachment is obsolete: true
Comment on attachment 713348 [details] [diff] [review]
Change how nsZipArchive logging works

fugly
Attachment #713348 - Flags: review?(taras.mozilla) → review+
Bug 638375 comment 0 has links to some of the things we ran into with multi-process NSPR logging in necko. We wound up supporting either logging to separate files or interleaving within one file (I suspect using printf under the covers, so...), depending on the use case (some people really want their logs interleaved).
Note there are also a few unrelated cleanups in there.
Attachment #714028 - Flags: review?(gps)
So, I've done some
Blocks: metro-build
(In reply to Mike Hommey [:glandium] from comment #14)
> So, I've done some

Forget this, it was just a leftover from an older session.
Comment on attachment 714028 [details] [diff] [review]
Corresponding packager changes

Review of attachment 714028 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM. Some minor Python nits.

::: python/mozbuild/mozpack/mozjar.py
@@ +757,5 @@
> +        for line in fileobj:
> +            url, path = line.strip().split(None, 1)
> +            if not url or not path:
> +                continue
> +            if not url in urlmap:

Nit: if url not in urlmap:

While equivalent, "not in" is an operator and should be used.

@@ +763,5 @@
> +            jar = urlmap[url]
> +            if not jar in self:
> +                self[jar] = []
> +            if not path in self[jar]:
> +                self[jar].append(path)

entry = self.setdefault(jar, [])
if path not in entry:
    entry.append(path)

@@ +774,5 @@
> +        canonicalized corresponding path, but it can also be jar: urls, in
> +        which case the canonicalized key is the tuple (jarpath, subpath) when
> +        the url is jar:file://jarpath!/subpath. In case the jar is even more
> +        nested (like jar:jar:file://jarpath!/subpath1!/subpath2), the returned
> +        value is a "flat" tuple, like (jarpath, subpath1, subpath2).

While I enjoy good English prose, perhaps this would be clearer as a bullet or ordered list? e.g.

The following formats in the MOZ_JAR_LOG_FILE are parsed:

  1) file:///<something> --> Blah blah.
  2) jar:file://jarpath!/subpath -> Foo bar.
  3) ...

@@ +778,5 @@
> +        value is a "flat" tuple, like (jarpath, subpath1, subpath2).
> +        '''
> +        if not isinstance(url, ParseResult):
> +            # Assume that if it doesn't start with jar: or file:, it's a path.
> +            if not url.startswith('jar:') and not url.startswith('file:'):

I'm pretty sure startswith() can take a tuple as its argument and all values in the tuple will be tested.

@@ +781,5 @@
> +            # Assume that if it doesn't start with jar: or file:, it's a path.
> +            if not url.startswith('jar:') and not url.startswith('file:'):
> +                url = 'file:///' + os.path.abspath(url)
> +            url = urlparse(url)
> +        assert url.scheme

How about asserting url.scheme in ('jar', 'file') just in case?
Attachment #714028 - Flags: review?(gps) → review+
Landed in one piece:
https://hg.mozilla.org/mozilla-central/rev/0577eb1893c4
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
Backed out for mass debug orange.
https://hg.mozilla.org/mozilla-central/rev/a0c57dffd179
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla21 → ---
The oranges are due to nsZipArchive::CloseArchive being called twice: once from nsJAR::Close (itself called from ~nsJAR), and once from ~nsZipArchive.
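For reference, one way to make the log handling robust against such a double close is to null out the descriptor after the first close (a sketch only; mLog is a hypothetical PRFileDesc* member standing in for whatever the patch actually uses, and the attached fix may differ):

// In nsZipArchive::CloseArchive(): close the log at most once, so a second
// call (e.g. from the destructor after nsJAR::Close) is a no-op.
if (mLog) {
  PR_Close(mLog);
  mLog = nullptr;
}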
Attachment #715547 - Flags: review?(taras.mozilla) → review+
https://hg.mozilla.org/mozilla-central/rev/c60c5b83e9bb
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
Comment on attachment 718345 [details] [diff] [review]
Change how nsZipArchive logging works, as landed.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 755724
User impact if declined: Possible startup time regression due to the fact that the browser omni.ja is not reordered. Before bug 755724, both GRE and browser data were in a single omni.ja, which was ordered according to access patterns observed when running Firefox. With bug 755724, the data is split between omni.ja and browser/omni.ja; the former is ordered, while the latter is not. This bug makes both ordered.
Testing completed (on m-c, etc.): Has been on m-c for a week.
Risk to taking this patch (and alternatives if risky): Low
String or UUID changes made by this patch: None
Attachment #718345 - Flags: approval-mozilla-aurora?
Attachment #718345 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Depends on: 849635
No longer depends on: 849635
Depends on: 844387
Depends on: 873640
No longer depends on: 873640
How do I generate the log file? I set the environment variable MOZ_JAR_LOG_DIR=C:\ but get no log file when starting Firefox. I'm building a custom Firefox install and this used to work for me.
Blocks: 1195331