Closed
Bug 840094
Opened 11 years ago
Closed 11 years ago
Change how nsZipArchive logging works
Categories
(Core :: Networking: JAR, defect)
Core
Networking: JAR
Tracking
()
RESOLVED
FIXED
mozilla22
People
(Reporter: glandium, Assigned: glandium)
References
(Depends on 1 open bug)
Details
Attachments
(4 files, 1 obsolete file)
6.79 KB,
patch
|
taras.mozilla
:
review+
|
Details | Diff | Splinter Review |
15.54 KB,
patch
|
gps
:
review+
|
Details | Diff | Splinter Review |
2.48 KB,
patch
|
taras.mozilla
:
review+
|
Details | Diff | Splinter Review |
22.06 KB,
patch
|
bajaj
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
Currently, when setting the MOZ_JAR_LOG_DIR environment variable, you get .log files in the directory given as the env variable value corresponding to the leaf name of zip archives being opened. So if /foo/bar/omni.ja is opened, then /foo/bar/baz/omni.ja, both will have their accesses logged in $MOZ_JAR_LOG_DIR/omni.ja.
Assignee | ||
Comment 1•11 years ago
|
||
Now log in a single file given by the MOZ_JAR_LOG_FILE environment variable. Log entries contain the URI of the Zip archive, followed by the path in the archive. This will require changes to the packager to handle the resulting log.
Attachment #712516 -
Flags: review?(taras.mozilla)
Comment 2•11 years ago
|
||
Comment on attachment 712516 [details] [diff] [review] Change how nsZipArchive logging works + rv = logFile->OpenNSPRFileDesc(PR_WRONLY|PR_CREATE_FILE|PR_APPEND, 0644, &file); Why would you ever want to append to an existing file? This seems wrong This will misbehave in an e10s world with multiple writers to same file. Why not keep the existing code, but make the log filename be jarfile.md5sum_of_path.log? Seems simpler
Attachment #712516 -
Flags: review?(taras.mozilla) → review-
Assignee | ||
Comment 3•11 years ago
|
||
(In reply to Taras Glek (:taras) from comment #2) > Comment on attachment 712516 [details] [diff] [review] > Change how nsZipArchive logging works > > + rv = logFile->OpenNSPRFileDesc(PR_WRONLY|PR_CREATE_FILE|PR_APPEND, > 0644, &file); > > Why would you ever want to append to an existing file? This seems wrong Because we sometimes OpenArchive() then CloseArchive() and OpenArchive() again, which opens, closes and re-opens the log. The current code ends up doing the same, too. > This will misbehave in an e10s world with multiple writers to same file. If that is indeed a problem, it already is a problem with the current code. We can care about that in a followup. > Why not keep the existing code, but make the log filename be > jarfile.md5sum_of_path.log? Seems simpler Seems simpler said than done. We'd need to ensure the md5sum can properly be computed from python, where the logs are going to be used, and that all openings of the same file will lead to the same md5sum. It's also not very human friendly. (as a side note, how many md5 implementations do we have in the tree? I already spotted four)
Assignee | ||
Comment 4•11 years ago
|
||
Append mode actually solves the e10s problem by having every write() seek at the end of the file.
Assignee | ||
Comment 5•11 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #4) > Append mode actually solves the e10s problem by having every write() seek at > the end of the file. Except NSPR doesn't do the right thing on windows, it does it the racy way... sigh.
Comment 6•11 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #4) > Append mode actually solves the e10s problem by having every write() seek at > the end of the file. I'm pretty sure it does not. I used similar logging code elsewhere and multiple threads ended up causing new lines to be inserted midline. Problem arises because data isn't flushed in lines. If you don't want to use md5, you can s/\//_/ on the filename to use the full path as a filename. e10s problem can be solved by using pid or some other identifier in the log.
Assignee | ||
Comment 7•11 years ago
|
||
(In reply to Taras Glek (:taras) from comment #6) > (In reply to Mike Hommey [:glandium] from comment #4) > > Append mode actually solves the e10s problem by having every write() seek at > > the end of the file. > > I'm pretty sure it does not. I used similar logging code elsewhere and > multiple threads ended up causing new lines to be inserted midline. Problem > arises because data isn't flushed in lines. Are you sure you were doing that with write() and not printf()? AFAIK, Apache httpd just does that for its logs, and i never heard anyone complain about the logs having lines inserted midline. > If you don't want to use md5, you can s/\//_/ on the filename to use the > full path as a filename. e10s problem can be solved by using pid or some > other identifier in the log. That adds another problem: when it's the subprocess that does accesses before the main process to the same zip, how do you know the best order to use?
Comment 8•11 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #7) > (In reply to Taras Glek (:taras) from comment #6) > > (In reply to Mike Hommey [:glandium] from comment #4) > > > Append mode actually solves the e10s problem by having every write() seek at > > > the end of the file. > > > > I'm pretty sure it does not. I used similar logging code elsewhere and > > multiple threads ended up causing new lines to be inserted midline. Problem > > arises because data isn't flushed in lines. > > Are you sure you were doing that with write() and not printf()? AFAIK, > Apache httpd just does that for its logs, and i never heard anyone complain > about the logs having lines inserted midline. it was using printf. can try this with atomic writes. > > > If you don't want to use md5, you can s/\//_/ on the filename to use the > > full path as a filename. e10s problem can be solved by using pid or some > > other identifier in the log. > > That adds another problem: when it's the subprocess that does accesses > before the main process to the same zip, how do you know the best order to > use? Well, you could add timestamps to resolve such deadlocks. But I would argue that having hot data together is more relevant than exact order.
Comment 9•11 years ago
|
||
Comment on attachment 712516 [details] [diff] [review] Change how nsZipArchive logging works r+ is based on irc claims of atomic writes. On you to fix bug 840435 or use exclusive mode on windows.
Attachment #712516 -
Flags: review- → review+
Assignee | ||
Comment 10•11 years ago
|
||
Same patch as before, but with a work around for bug 840435. Note we're doing an even more elaborate CreateFileW/PR_ImportFile dance in nsLocalFileWin, so this isn't actually something new. Stress-testing with multiple threads and multiple processes shows this works as expected. In the process of stress testing, i was also able to validate that PR_Write is atomic on all tier-1 platforms, but the PR_Seek64 thing on windows does create problems: With threads, about 4% of the writes are actually overwriting previous lines, but with processes, it gets to more than 40%. Note that with how things work in NSPR, using PR_ImportFile makes NSPR never call PR_Seek64 for the file descriptor, which is exactly what we want (FILE_APPEND_DATA does the right thing ; it can't however be used in NSPR itself because of other problems, but they are not a concern for our use of the file descriptor in nsZipArchive).
Attachment #713348 -
Flags: review?(taras.mozilla)
Assignee | ||
Updated•11 years ago
|
Attachment #712516 -
Attachment is obsolete: true
Comment 11•11 years ago
|
||
Comment on attachment 713348 [details] [diff] [review] Change how nsZipArchive logging works fugly
Attachment #713348 -
Flags: review?(taras.mozilla) → review+
Comment 12•11 years ago
|
||
bug 638375 comment 0 has links to some of the things we ran into with multi-process NSPR logging in necko. We wound up supporting either logging to separate files, or interleaved within one file (I suspect using printf under the covers, so...) depending on the use case (some people really want their logs interleaved).
Assignee | ||
Comment 13•11 years ago
|
||
Note there are also a few unrelated cleanups in there.
Attachment #714028 -
Flags: review?(gps)
Assignee | ||
Comment 15•11 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #14) > So, I've done some Forget this, it was just old remains from an older session.
Comment 16•11 years ago
|
||
Comment on attachment 714028 [details] [diff] [review] Corresponding packager changes Review of attachment 714028 [details] [diff] [review]: ----------------------------------------------------------------- LGTM. Some minor Python nits. ::: python/mozbuild/mozpack/mozjar.py @@ +757,5 @@ > + for line in fileobj: > + url, path = line.strip().split(None, 1) > + if not url or not path: > + continue > + if not url in urlmap: Nit: if url not in urlmap: While equivalent, "not in" is an operator and should be used. @@ +763,5 @@ > + jar = urlmap[url] > + if not jar in self: > + self[jar] = [] > + if not path in self[jar]: > + self[jar].append(path) entry = self.setdefault(jar, []) if path not in entry: entry.append(path) @@ +774,5 @@ > + canonicalized corresponding path, but it can also be jar: urls, in > + which case the canonicalized key is the tuple (jarpath, subpath) when > + the url is jar:file://jarpath!/subpath. In case the jar is even more > + nested (like jar:jar:file://jarpath!/subpath1!/subpath2), the returned > + value is a "flat" tuple, like (jarpath, subpath1, subpath2). While I enjoy good English prose, perhaps this would be clearer as a bullet or ordered list? e.g. The following formats in the MOZ_JAR_LOG_FILE are parsed: 1) file:///<something> --> Blah blah. 2) jar:file://jarpath!/subpath -> Foo bar. 3) ... @@ +778,5 @@ > + value is a "flat" tuple, like (jarpath, subpath1, subpath2). > + ''' > + if not isinstance(url, ParseResult): > + # Assume that if it doesn't start with jar: or file:, it's a path. > + if not url.startswith('jar:') and not url.startswith('file:'): I'm pretty sure startswith() can take a tuple as its argument and all values in the tuple will be tested. @@ +781,5 @@ > + # Assume that if it doesn't start with jar: or file:, it's a path. > + if not url.startswith('jar:') and not url.startswith('file:'): > + url = 'file:///' + os.path.abspath(url) > + url = urlparse(url) > + assert url.scheme How about asserting url.scheme in ('jar', 'file') just in case?
Attachment #714028 -
Flags: review?(gps) → review+
Assignee | ||
Comment 17•11 years ago
|
||
Landed in one piece: https://hg.mozilla.org/mozilla-central/rev/0577eb1893c4
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
Comment 18•11 years ago
|
||
Backed out for mass debug orange. https://hg.mozilla.org/mozilla-central/rev/a0c57dffd179
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla21 → ---
Assignee | ||
Comment 19•11 years ago
|
||
The oranges are due to nsZipArchive::CloseArchive being called twice: once from nsJAR::Close (itself called from ~nsJAR), and once from ~nsZipArchive.
Assignee | ||
Comment 20•11 years ago
|
||
Attachment #715547 -
Flags: review?(taras.mozilla)
Updated•11 years ago
|
Attachment #715547 -
Flags: review?(taras.mozilla) → review+
Assignee | ||
Comment 21•11 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/c60c5b83e9bb
Comment 22•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/c60c5b83e9bb
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
Assignee | ||
Comment 23•11 years ago
|
||
Assignee | ||
Comment 24•11 years ago
|
||
Comment on attachment 718345 [details] [diff] [review] Change how nsZipArchive logging works, as landed. [Approval Request Comment] Bug caused by (feature/regressing bug #): bug 755724 User impact if declined: Possible startup time regression due to the fact that the browser omni.ja is not reordered. Before bug 755724 both GRE and browser data were in a single omni.ja which was ordered according to access patterns observed when running firefox. With bug 755724, the data is split between omni.ja and browser/omni.ja. The former is ordered, while the latter is not. This bug makes both ordered. Testing completed (on m-c, etc.): Has been on m-c for a week. Risk to taking this patch (and alternatives if risky): Low String or UUID changes made by this patch: None
Attachment #718345 -
Flags: approval-mozilla-aurora?
Updated•11 years ago
|
Attachment #718345 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 25•11 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/16d994cc562c
status-firefox21:
--- → fixed
status-firefox22:
--- → fixed
Comment 26•11 years ago
|
||
How do I generate the log file? I set environment variable MOZ_JAR_LOG_DIR=C:\ but get no log file when starting firefox. I'm building a custom firefox install and this used to work for me.
You need to log in
before you can comment on or make changes to this bug.
Description
•