Closed Bug 244478 Opened 21 years ago Closed 17 years ago

Logging truncates lines at 512 chars

Categories

(NSPR :: NSPR, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dwitte, Assigned: dwitte)

References

Details

Attachments

(1 file, 6 obsolete files)

So NSPR logging has this hardcoded line length, beyond which it simply truncates data... we ran into this problem in bug 236772, where the log simply doesn't contain the data we require, making the log useless for debugging purposes. bug 46630 increased the limit from 200 to 512 chars, but I think we need to move to an allocated buffer here. larryh made the argument in that bug that doing so would incur malloc overhead for every PR_LOG call, but I don't see why that's a problem. Logging is generally only enabled when people want to debug things, and under that circumstance perf is not critical. Cookie code can pass very long lines to PR_LOG... there's really no limit, since cookies can be up to 4kb in length each, and we can have multiple cookies in a single header. Henrik Gemal also points out that IMAP code can have very long lines. I think there's enough motivation here to bite the bullet and fix PR_LOG to be sane with its buffer handling, which is certainly better than forcing consumers to feed nspr bite-size chunks. The code in question is at: http://lxr.mozilla.org/seamonkey/source/nsprpub/pr/src/io/prlog.c#436 (As a side note, there's a bug at http://lxr.mozilla.org/seamonkey/source/nsprpub/pr/src/io/prlog.c#256 - the sense of that |if| check needs to be reversed.)
What about allocating a single, static buffer at logging initalisation and only realloc it (=grow it) when the current buffer size is too small ? That would avoid the malloc overhead for each call (which is a BAD idea for cases like when something fails due memory shortage...) ...
Or, perhaps just keep the 512 buffer, and if a PR_LOG call needs more than that, do the malloc for that call only.
Darin Fisher wrote: > Or, perhaps just keep the 512 buffer, and if a PR_LOG call needs more than > that, do the malloc for that call only. If you have many of these long lines you'll always run |malloc()| for them. And running into problems on memory shorage is much more likely with that design than growing the buffer (maybe we should even think about rounding the buffer size up to the value of |MIN(16384, getpagesize())-1| (e.g. round up to system pagesize but clamp the value at 16k (since newer Solaris can have variable pagesizes ranging from 8KB-16GB) ) :)
Comment 0 says "Logging is generally only enabled when people want to debug things, and under that circumstance perf is not critical." That's false in general. Logging is useful to debug race conditions, and any serialization on a global malloc heap lock will frustrate such use-cases. Logging also may help find memory bugs, and malloc'ing will interfere with memory allocation. In law school they teach that hard cases make bad law. How many logging use-cases need malloc'd dynamic range of buffer size? Only a few, I bet. That favors the approach darin suggests in comment 2. /be
Brendan Eich wrote: > Logging is useful to debug race conditions, and any serialization on a > global malloc heap lock will frustrate such use-cases. Who cares ? Writes to a |FILE *| are serialised, too. Either we have the lock at the |malloc()| or later when we write to the |FILE *| - that doesn't matter much. :) One alternative may be to allocate the buffer from stack up to |getpagesize()-1| bytes and use |malloc()| ONLY beyond that value.
gisburn: http://lxr.mozilla.org/mozilla/source/nsprpub/pr/src/io/prlog.c#117. There are several options. Also, you missed my point about serialized writes. prlog.c itself supports that, of course -- otherwise several threads logging to the same file might find their log entries mixed up or overwritten in a buffer. Also, stdio does not incur a malloc on ever fwrite (or whatever) call. So you are not being accurate. The serialization I want to avoid is the one exactly implied by malloc'ing on every call, which comment 0 asserted was not a problem. /be
brendan: cookie and imap cases as listed already are going to happen a lot. for bonus points, imap at least in theory could happen from multiple threads (i'm not so sure about cookie, perhaps it lives on the main thread, perhaps it lives on the http thread, in the former case it can contend w/ imap, otherwise it'll have to contend w/ itself). i've hit this limit with my logging too, which I use a lot (although nowhere near as much as imap would if it has to actually log my inbox, which it does when i use mozillamail). atm my consumers pretty much have to use alternate logging systems (mostly jsconsole or console), and then my coworkers get annoyed by the debug spew/churn. as for file locking, um... nspr logging already tramples itself if two nspr apps use the same log file and it isn't WinDebug. and wrt file locking, i think i've had spidermonkey crash because it wasn't properly locking (f)printf. anyway, at a certain point, nspr logging does get a lock. It happens to grab the lock, conveniently enough, in PR_LogPrint. Note that right now nspr is not particularly good at deciding if a printf will fit (see bug 229662).
What was unclear in what I wrote in comment 6? I'm against adding a malloc per log call. If the existing #ifdef'd locking is not correct, or not enabled where it should be, file other bugs. SpiderMonkey doesn't use fprintf or printf except for debugging to stdout/err. Are those kinds of calls not threadsafe by default on Linux? /be
it looked like the msvcrt's functions aren't threadsafe by default (this would make sense, it's the performant way to do things, you can probably ask for threadsafe versions, but doing the locking yourself is probably better overall), but i'm not certain, it was a random hit among my many many crashes :). anyway, your argument should be against comment 2 since for imap/cookies that is a malloc for every call instead of comment 1 which is only a malloc on size misses. there's one minor problem, which is that right now, the locking is only on the actual io, if you try to share a buffer, then you're locking a formatting call too.
Attached patch implement approach in comment 2 (obsolete) — Splinter Review
so, per darin and brendan's advice, here's a patch that implements their suggestions. it's a tad ugly, because of what we have to do: (a) do the sprintf into the fixed buffer first, to see if it might've overflowed; (b) if it did, redo the sprintf w/ malloc; (c) malloc a new buffer big enough to hold the whole string (thread prefix + log line); (d) copy the string data. this patch also fixes the logic bug i mentioned in comment 0: - PR_SetLogBuffering(isSync ? bufSize : 0); + PR_SetLogBuffering(isSync ? 0: bufSize); it's been tested and seems to work fine. darin, if you've got time, any chance you could take a look?
Assignee: wchang0222 → dwitte
Status: NEW → ASSIGNED
Attachment #151260 - Flags: review?(darin)
Attachment #151260 - Flags: review?(darin)
Attached patch slightly better patch (obsolete) — Splinter Review
uses realloc/memmove instead of allocing a second buffer... probably a better way to go.
Attachment #151260 - Attachment is obsolete: true
Attachment #151262 - Flags: review?(darin)
Comment on attachment 151262 [details] [diff] [review] slightly better patch looks ok to me. wtc might prefer different style or naming conventions, so asking him for sr=
Attachment #151262 - Flags: superreview?(wchang0222)
Attachment #151262 - Flags: review?(darin)
Attachment #151262 - Flags: review+
Comment on attachment 151262 [details] [diff] [review] slightly better patch + line_long = (char*) PR_REALLOC(line_long, nb_thread+nb_long+1); + if (line_long) { I'm fairly certain you're totally botching realloc. This is from msdn, but you can get similar docs from man realloc on solaris or linux (included) or macosx (included): Reallocate memory blocks. void *realloc( void *memblock, size_t size ); Parameters memblock Pointer to previously allocated memory block. size New size in bytes. Return Value realloc returns a void pointer to the reallocated (and possibly moved) memory block. The return value is NULL if the size is zero and the buffer argument is not NULL, or if there is not enough available memory to expand the block to the given size. In the first case, the original block is freed. In the second, the original block is unchanged. The return value points to a storage space that is guaranteed to be suitably aligned for storage of any type of object. To get a pointer to a type other than void, use a type cast on the return value. -- realloc() returns a pointer to the newly allocated memory, which is suitably aligned for any kind of variable and may be different from ptr, or NULL if the request fails. If size was equal to 0, either NULL or a pointer suitable to be passed to free() is returned. If realloc() fails the original block is left untouched - it is not freed or moved. GNU 1993-04-04 MALLOC(3) -- The realloc() function tries to change the size of the allocation pointed to by ptr to size, and return ptr. If there is not enough room to enlarge the memory allocation pointed to by ptr, realloc() creates a new allocation, copies as much of the old data pointed to by ptr as will fit to the new allocation, frees the old allocation, and returns a pointer to the allocated memory. realloc() returns a NULL pointer if there is an error, and the allocation pointed to by ptr is still valid. BSD November 21, 2002 BSD -- in short: *never* do: foo = realloc(foo, ...) You could do it if you stored a copy of the address, but you don't (and most people don't). The correct way is something like: foo2 = realloc(foo, ...) if (!foo2) { handle(foo); /*perhaps you want to free foo instead of leaking?*/ } if (foo2) { foo=foo2; /*because foo is nolonger valid*/ } -- while i'm poking. are you sure that the function you're calling will actually return a friendly error or size warning if the string doesn't fit?
Attachment #151262 - Flags: superreview?(wchang0222)
Attachment #151262 - Flags: review-
Attachment #151262 - Flags: review+
Attached patch v3 (obsolete) — Splinter Review
thanks, nice catch timeless. all fixed now. ;) carrying forward r=darin from previous... wtc, would you mind taking a look at this? thanks!
Attachment #151262 - Attachment is obsolete: true
Attachment #151590 - Flags: superreview?(wchang0222)
Attachment #151590 - Flags: review+
Comment on attachment 151590 [details] [diff] [review] v3 no word from wtc... bryner, would you like to sr this? let me know if not and i'll kick it to someone else. (it has r=darin already)
Attachment #151590 - Flags: superreview?(wtchang) → superreview?(bryner)
Comment on attachment 151590 [details] [diff] [review] v3 Dan, sorry about the lack of response. It wasn't intentional. I believe this patch may cause an overflow of logBuf buffer. Consider the following code in PR_LogPrint: _PR_LOCK_LOG(); if (logBuf == 0) { _PUT_LOG(logFile, line, nb); } else { if (logp + nb > logEndp) { _PUT_LOG(logFile, logBuf, logp - logBuf); logp = logBuf; } memcpy(logp, line, nb); logp += nb; } _PR_UNLOCK_LOG(); The memcpy call requires that nb be <= the size of the logBuf buffer. The size of the logBuf buffer is >= LINE_BUF_SIZE (see PR_SetLogBuffering). In the current code, nb is <= LINE_BUF_SIZE, so the requirement holds. With your patch applied, nb can be arbitrarily large. So the requirement may not hold. The patch has two other minor problems. 1. The #include "plstr.h" is not necessary. 2. It is a shame that we have to call PR_vsmprint, PR_REALLOC, and memmove. I hope we can eliminate that by calling _PUT_LOG twice.
Attachment #151590 - Flags: superreview?(bryner) → superreview-
dan: did you address Wan-Teh Chang comments?
not yet; i don't think i'll be able to get to it for at least a month.
Wan-Teh Chang: Any chance you could have a look at the patch and adress your issues?
I just spent half an hour looking at this bug. I found that prlog.c seems to have some off-by-one bugs that will be manifested when the log input is as long as the log 'line' buffer. Since prlog.c is a short file, I recommend that someone do a thorough code review while implementing a fix for this bug. Unfortunately I don't have time to work on this bug now. Any volunteer? Why was there a renewed interest in this bug around 2005-12-21?
(In reply to comment #20) > > Why was there a renewed interest in this bug around > 2005-12-21? > once again because we need to log some long IMAP commands (to debug a IMAP problem) and they get truncated in the log.
Also because of the work in bug 320192.
QA Contact: wtchang → nspr
Attached patch v4 (obsolete) — Splinter Review
updated per comment 16 and comment 20. i changed things around so we _PUT_LOG() directly in the malloced line case, to avoid the shuffling around. also, i audited all the code in prlog.c and fixed the off-by-one errors wtc mentioned, for the case where nb is close to the line length.
Attachment #151590 - Attachment is obsolete: true
Attachment #268332 - Flags: review?(wtc)
Attached patch v5 (obsolete) — Splinter Review
Dan, thanks for the v4 patch and sorry that it fell through the cracks. Your v4 patch has only one problem -- in the malloc case (line == line_long), it is illegal to add '\' at the end of the line_long string returned by PR_vsmprintf because there is no room at the end of the malloc'ed string. The solution I came up with is a separate _PUT_LOG call to write out "\n" if necessary. I also performed several transformations to 1. free the line_long string after unlocking, 2. use PR_smprintf_free to free the line_long string (this is the documented function), 3. shorten nb_thread to nb_tid, and 4. just use the original 'line' as the name of the stack line buffer, rather than making 'line' pointing to either 'line_buf' or 'line_long'. (We don't need this flexible 'line' because we can't add '\n' at the end in the 'line_long' case.) The result is this v5 patch. I haven't compiled or tested it yet. Please review it. If you can test it, that'll be even better. Thanks.
Attachment #268332 - Attachment is obsolete: true
Attachment #279335 - Flags: review?
Attachment #268332 - Flags: review?(wtc)
Comment on attachment 279335 [details] [diff] [review] v5 I forgot to mention that - I deleted the obsolete XP_MAC (classic Mac OS) code, and - I am not sure if we can pass the va_list variable 'ap' to other functions twice. I am worried that those functions may change the internal state of 'ap', for example, the next argument pointer.
Attachment #279335 - Flags: review?
Attached patch v6 (obsolete) — Splinter Review
thanks wtc - i reviewed your patch and it looks good, apart from the va_list issue you mentioned. i read the documentation and you're right, it needs to be reinitialized before reuse. this patch (v6) wraps both PR_vs*printf calls with their own va_start and va_end, which is legal and should fix the problem. i've tested this patch on linux and it seems to work fine.
Attachment #279335 - Attachment is obsolete: true
Attachment #279423 - Flags: review?
Attachment #279423 - Flags: review? → review?(wtc)
(the patch is otherwise identical to your v5 patch.)
Attached patch v7 (checked in)Splinter Review
Good idea, Dan. I checked in your v6 patch with one change: in the line_long case, when we ensure a trailing newline, I changed + if (nb && (line_long[nb-1] != '\n')) { to + if (!nb || (line_long[nb-1] != '\n')) { because if 'nb' is 0, we already printed the thread id, so we should also add a newline. Checking in prlog.c; /cvsroot/mozilla/nsprpub/pr/src/io/prlog.c,v <-- prlog.c new revision: 3.38; previous revision: 3.37 done
Attachment #279423 - Attachment is obsolete: true
Attachment #279463 - Flags: review?
Attachment #279423 - Flags: review?(wtc)
Fixed on the NSPR trunk (NSPR 4.7).
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Target Milestone: --- → 4.7
In the normal case (!line_long), nb is guaranteed to be > 0 (because nb >= nb_tid, which is > 0), so the expression: if (nb && (line[nb-1] != '\n')) { could be simplified to if (line[nb-1] != '\n') { Should we make this change?
yes, that seems correct. also, as an unrelated thing, i noticed that we call PR_LogFlush() unconditionally at the end of the function - which flushes out everything in logBuf. this makes logBuf kind of pointless, since it never gets a chance to accumulate any strings beyond the immediate logging call. i'm not sure why this call was added, but blame isn't any help since it's been there since rev 1.1. do we need to explicitly flush on every logging call?
I found that in the line_long case, the expression + if (!nb || (line_long[nb-1] != '\n')) { can also be simplified + if (line_long[nb-1] != '\n') { because nb must be > 0 to overflow the 'line' buffer. I agree that it is wrong to call PR_LogFlush() at the end of the PR_LogPrint function. Perhaps log buffering has never worked -- the logic error of isSync since rev. 1.1 was another evidence. For backward compatibility we probably should not fix this bug.
Attachment #279463 - Flags: review?
Blocks: 399647
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: