Closed Bug 244478 Opened 20 years ago Closed 17 years ago

Logging truncates lines at 512 chars

Categories

(NSPR :: NSPR, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dwitte, Assigned: dwitte)

References

Details

Attachments

(1 file, 6 obsolete files)

So NSPR logging has this hardcoded line length, beyond which it simply truncates
data... we ran into this problem in bug 236772, where the log simply doesn't
contain the data we require, making the log useless for debugging purposes.

bug 46630 increased the limit from 200 to 512 chars, but I think we need to move
to an allocated buffer here. larryh made the argument in that bug that doing so
would incur malloc overhead for every PR_LOG call, but I don't see why that's a
problem. Logging is generally only enabled when people want to debug things, and
under that circumstance perf is not critical.

Cookie code can pass very long lines to PR_LOG... there's really no limit, since
cookies can be up to 4kb in length each, and we can have multiple cookies in a
single header. Henrik Gemal also points out that IMAP code can have very long
lines. I think there's enough motivation here to bite the bullet and fix PR_LOG
to be sane with its buffer handling, which is certainly better than forcing
consumers to feed nspr bite-size chunks.

The code in question is at:
http://lxr.mozilla.org/seamonkey/source/nsprpub/pr/src/io/prlog.c#436

(As a side note, there's a bug at
http://lxr.mozilla.org/seamonkey/source/nsprpub/pr/src/io/prlog.c#256 - the
sense of that |if| check needs to be reversed.)
What about allocating a single, static buffer at logging initalisation and only
realloc it (=grow it) when the current buffer size is too small ? That would
avoid the malloc overhead for each call (which is a BAD idea for cases like when
something fails due memory shortage...) ...
Or, perhaps just keep the 512 buffer, and if a PR_LOG call needs more than that,
do the malloc for that call only.
Darin Fisher wrote:
> Or, perhaps just keep the 512 buffer, and if a PR_LOG call needs more than 
> that, do the malloc for that call only.

If you have many of these long lines you'll always run |malloc()| for them. And
running into problems on memory shorage is much more likely with that design
than growing the buffer (maybe we should even think about rounding the buffer
size up to the value of |MIN(16384, getpagesize())-1| (e.g. round up to system
pagesize but clamp the value at 16k (since newer Solaris can have variable
pagesizes ranging from 8KB-16GB) ) :)
Comment 0 says "Logging is generally only enabled when people want to debug
things, and under that circumstance perf is not critical."  That's false in
general.  Logging is useful to debug race conditions, and any serialization on a
global malloc heap lock will frustrate such use-cases.  Logging also may help
find memory bugs, and malloc'ing will interfere with memory allocation.

In law school they teach that hard cases make bad law.  How many logging
use-cases need malloc'd dynamic range of buffer size?  Only a few, I bet.  That
favors the approach darin suggests in comment 2.

/be
Brendan Eich wrote:
> Logging is useful to debug race conditions, and any serialization on a
> global malloc heap lock will frustrate such use-cases.

Who cares ? Writes to a |FILE *| are serialised, too. Either we have the lock at
the |malloc()| or later when we write to the |FILE *| - that doesn't matter
much. :)
One alternative may be to allocate the buffer from stack up to |getpagesize()-1|
bytes and use |malloc()| ONLY beyond that value.
gisburn: http://lxr.mozilla.org/mozilla/source/nsprpub/pr/src/io/prlog.c#117. 
There are several options.  Also, you missed my point about serialized writes. 
prlog.c itself supports that, of course -- otherwise several threads logging to
the same file might find their log entries mixed up or overwritten in a buffer.
 Also, stdio does not incur a malloc on ever fwrite (or whatever) call.  So you
are not being accurate.

The serialization I want to avoid is the one exactly implied by malloc'ing on
every call, which comment 0 asserted was not a problem.

/be
brendan: cookie and imap cases as listed already are going to happen a lot. for 
bonus points, imap at least in theory could happen from multiple threads (i'm 
not so sure about cookie, perhaps it lives on the main thread, perhaps it lives 
on the http thread, in the former case it can contend w/ imap, otherwise it'll 
have to contend w/ itself).

i've hit this limit with my logging too, which I use a lot (although nowhere 
near as much as imap would if it has to actually log my inbox, which it does 
when i use mozillamail). atm my consumers pretty much have to use alternate 
logging systems (mostly jsconsole or console), and then my coworkers get 
annoyed by the debug spew/churn.

as for file locking, um... nspr logging already tramples itself if two nspr 
apps use the same log file and it isn't WinDebug.  and wrt file locking, i 
think i've had spidermonkey crash because it wasn't properly locking (f)printf.

anyway, at a certain point, nspr logging does get a lock. It happens to grab 
the lock, conveniently enough, in PR_LogPrint.

Note that right now nspr is not particularly good at deciding if a printf will 
fit (see bug 229662).
What was unclear in what I wrote in comment 6?  I'm against adding a malloc per
log call.  If the existing #ifdef'd locking is not correct, or not enabled where
it should be, file other bugs.

SpiderMonkey doesn't use fprintf or printf except for debugging to stdout/err. 
Are those kinds of calls not threadsafe by default on Linux?

/be
it looked like the msvcrt's functions aren't threadsafe by default (this would 
make sense, it's the performant way to do things, you can probably ask for 
threadsafe versions, but doing the locking yourself is probably better 
overall), but i'm not certain, it was a random hit among my many many 
crashes :).

anyway, your argument should be against comment 2 since for imap/cookies that 
is a malloc for every call instead of comment 1 which is only a malloc on size 
misses.

there's one minor problem, which is that right now, the locking is only on the 
actual io, if you try to share a buffer, then you're locking a formatting call 
too.
Attached patch implement approach in comment 2 (obsolete) — Splinter Review
so, per darin and brendan's advice, here's a patch that implements their
suggestions. it's a tad ugly, because of what we have to do:
(a) do the sprintf into the fixed buffer first, to see if it might've
overflowed;
(b) if it did, redo the sprintf w/ malloc;
(c) malloc a new buffer big enough to hold the whole string (thread prefix +
log line);
(d) copy the string data.

this patch also fixes the logic bug i mentioned in comment 0:
-	 PR_SetLogBuffering(isSync ? bufSize : 0);
+	 PR_SetLogBuffering(isSync ? 0: bufSize);

it's been tested and seems to work fine. darin, if you've got time, any chance
you could take a look?
Assignee: wchang0222 → dwitte
Status: NEW → ASSIGNED
Attachment #151260 - Flags: review?(darin)
Attachment #151260 - Flags: review?(darin)
Attached patch slightly better patch (obsolete) — Splinter Review
uses realloc/memmove instead of allocing a second buffer... probably a better
way to go.
Attachment #151260 - Attachment is obsolete: true
Attachment #151262 - Flags: review?(darin)
Comment on attachment 151262 [details] [diff] [review]
slightly better patch

looks ok to me.  wtc might prefer different style or naming conventions, so
asking him for sr=
Attachment #151262 - Flags: superreview?(wchang0222)
Attachment #151262 - Flags: review?(darin)
Attachment #151262 - Flags: review+
Comment on attachment 151262 [details] [diff] [review]
slightly better patch

+	     line_long = (char*) PR_REALLOC(line_long, nb_thread+nb_long+1);
+	     if (line_long) {

I'm fairly certain you're totally botching realloc.

This is from msdn, but you can get similar docs from man realloc on solaris or
linux (included) or macosx (included):

Reallocate memory blocks.

void *realloc(
   void *memblock,
   size_t size 
);
Parameters
memblock 
Pointer to previously allocated memory block. 
size 
New size in bytes. 
Return Value
realloc returns a void pointer to the reallocated (and possibly moved) memory
block. The return value is NULL if the size is zero and the buffer argument is
not NULL, or if there is not enough available memory to expand the block to the
given size. In the first case, the original block is freed. In the second, the
original block is unchanged. The return value points to a storage space that is
guaranteed to be suitably aligned for storage of any type of object. To get a
pointer to a type other than void, use a type cast on the return value.

--
       realloc() returns a pointer to the newly allocated memory,
       which is suitably aligned for any kind of variable and may
       be different from ptr, or NULL if the  request  fails.  If
       size  was equal to 0, either NULL or a pointer suitable to
       be passed to free() is returned.  If realloc()  fails  the
       original  block	is  left  untouched  - it is not freed or
       moved.
GNU			    1993-04-04			MALLOC(3)

--
     The realloc() function tries to change the size of the allocation pointed
     to by ptr to size, and return ptr.  If there is not enough room to
     enlarge the memory allocation pointed to by ptr, realloc() creates a new
     allocation, copies as much of the old data pointed to by ptr as will fit
     to the new allocation, frees the old allocation, and returns a pointer to
     the allocated memory.  realloc() returns a NULL pointer if there is an
     error, and the allocation pointed to by ptr is still valid.
BSD			       November 21, 2002			   BSD

--

in short:
*never* do:
foo = realloc(foo, ...)

You could do it if you stored a copy of the address, but you don't (and most
people don't).

The correct way is something like:

foo2 = realloc(foo, ...)
if (!foo2) {
handle(foo); /*perhaps you want to free foo instead of leaking?*/
}
if (foo2) {
foo=foo2; /*because foo is nolonger valid*/
}

--
while i'm poking. are you sure that the function you're calling will actually
return a friendly error or size warning if the string doesn't fit?
Attachment #151262 - Flags: superreview?(wchang0222)
Attachment #151262 - Flags: review-
Attachment #151262 - Flags: review+
Attached patch v3 (obsolete) — Splinter Review
thanks, nice catch timeless. all fixed now. ;)

carrying forward r=darin from previous... wtc, would you mind taking a look at
this? thanks!
Attachment #151262 - Attachment is obsolete: true
Attachment #151590 - Flags: superreview?(wchang0222)
Attachment #151590 - Flags: review+
Comment on attachment 151590 [details] [diff] [review]
v3

no word from wtc... bryner, would you like to sr this? let me know if not and
i'll kick it to someone else. (it has r=darin already)
Attachment #151590 - Flags: superreview?(wtchang) → superreview?(bryner)
Comment on attachment 151590 [details] [diff] [review]
v3

Dan, sorry about the lack of response.	It wasn't
intentional.

I believe this patch may cause an overflow of
logBuf buffer.	Consider the following code in
PR_LogPrint:

    _PR_LOCK_LOG();
    if (logBuf == 0) {
	_PUT_LOG(logFile, line, nb);
    } else {
	if (logp + nb > logEndp) {
	    _PUT_LOG(logFile, logBuf, logp - logBuf);
	    logp = logBuf;
	}
	memcpy(logp, line, nb);
	logp += nb;
    }
    _PR_UNLOCK_LOG();

The memcpy call requires that nb be <= the size of
the logBuf buffer.  The size of the logBuf buffer
is >= LINE_BUF_SIZE (see PR_SetLogBuffering).

In the current code, nb is <= LINE_BUF_SIZE, so
the requirement holds.

With your patch applied, nb can be arbitrarily large.
So the requirement may not hold.

The patch has two other minor problems.
1. The #include "plstr.h" is not necessary.
2. It is a shame that we have to call PR_vsmprint,
PR_REALLOC, and memmove.  I hope we can eliminate
that by calling _PUT_LOG twice.
Attachment #151590 - Flags: superreview?(bryner) → superreview-
dan: did you address Wan-Teh Chang comments?
not yet; i don't think i'll be able to get to it for at least a month.
Wan-Teh Chang: Any chance you could have a look at the patch and adress your issues?
I just spent half an hour looking at this bug.
I found that prlog.c seems to have some off-by-one
bugs that will be manifested when the log input is
as long as the log 'line' buffer.  Since prlog.c
is a short file, I recommend that someone do a
thorough code review while implementing a fix for
this bug.  Unfortunately I don't have time to work
on this bug now.  Any volunteer?

Why was there a renewed interest in this bug around
2005-12-21?
(In reply to comment #20)
> 
> Why was there a renewed interest in this bug around
> 2005-12-21?
> 

once again because we need to log some long IMAP commands (to debug a IMAP problem) and they get truncated in the log.
QA Contact: wtchang → nspr
Attached patch v4 (obsolete) — Splinter Review
updated per comment 16 and comment 20. i changed things around so we _PUT_LOG() directly in the malloced line case, to avoid the shuffling around. also, i audited all the code in prlog.c and fixed the off-by-one errors wtc mentioned, for the case where nb is close to the line length.
Attachment #151590 - Attachment is obsolete: true
Attachment #268332 - Flags: review?(wtc)
Attached patch v5 (obsolete) — Splinter Review
Dan, thanks for the v4 patch and sorry that it fell through
the cracks.

Your v4 patch has only one problem -- in the malloc case
(line == line_long), it is illegal to add '\' at the end
of the line_long string returned by PR_vsmprintf because
there is no room at the end of the malloc'ed string.

The solution I came up with is a separate _PUT_LOG call
to write out "\n" if necessary.  I also performed several
transformations to
1. free the line_long string after unlocking,
2. use PR_smprintf_free to free the line_long string
   (this is the documented function),
3. shorten nb_thread to nb_tid, and
4. just use the original 'line' as the name of the stack
   line buffer, rather than making 'line' pointing to
   either 'line_buf' or 'line_long'.  (We don't need
   this flexible 'line' because we can't add '\n' at
   the end in the 'line_long' case.)

The result is this v5 patch.  I haven't compiled or
tested it yet.  Please review it.  If you can test it,
that'll be even better.  Thanks.
Attachment #268332 - Attachment is obsolete: true
Attachment #279335 - Flags: review?
Attachment #268332 - Flags: review?(wtc)
Comment on attachment 279335 [details] [diff] [review]
v5

I forgot to mention that
- I deleted the obsolete XP_MAC (classic Mac OS) code, and
- I am not sure if we can pass the va_list variable 'ap'
  to other functions twice.  I am worried that those functions
  may change the internal state of 'ap', for example, the
  next argument pointer.
Attachment #279335 - Flags: review?
Attached patch v6 (obsolete) — Splinter Review
thanks wtc - i reviewed your patch and it looks good, apart from the va_list issue you mentioned. i read the documentation and you're right, it needs to be reinitialized before reuse. this patch (v6) wraps both PR_vs*printf calls with their own va_start and va_end, which is legal and should fix the problem.

i've tested this patch on linux and it seems to work fine.
Attachment #279335 - Attachment is obsolete: true
Attachment #279423 - Flags: review?
Attachment #279423 - Flags: review? → review?(wtc)
(the patch is otherwise identical to your v5 patch.)
Attached patch v7 (checked in)Splinter Review
Good idea, Dan.  I checked in your v6 patch with one change:
in the line_long case, when we ensure a trailing newline,
I changed

+        if (nb && (line_long[nb-1] != '\n')) {

to

+        if (!nb || (line_long[nb-1] != '\n')) {

because if 'nb' is 0, we already printed the thread id,
so we should also add a newline.

Checking in prlog.c;
/cvsroot/mozilla/nsprpub/pr/src/io/prlog.c,v  <--  prlog.c
new revision: 3.38; previous revision: 3.37
done
Attachment #279423 - Attachment is obsolete: true
Attachment #279463 - Flags: review?
Attachment #279423 - Flags: review?(wtc)
Fixed on the NSPR trunk (NSPR 4.7).
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Target Milestone: --- → 4.7
In the normal case (!line_long), nb is guaranteed to be > 0
(because nb >= nb_tid, which is > 0), so the expression:

        if (nb && (line[nb-1] != '\n')) {

could be simplified to

        if (line[nb-1] != '\n') {

Should we make this change?
yes, that seems correct. also, as an unrelated thing, i noticed that we call PR_LogFlush() unconditionally at the end of the function - which flushes out everything in logBuf. this makes logBuf kind of pointless, since it never gets a chance to accumulate any strings beyond the immediate logging call. i'm not sure why this call was added, but blame isn't any help since it's been there since rev 1.1. do we need to explicitly flush on every logging call?
I found that in the line_long case, the expression

+        if (!nb || (line_long[nb-1] != '\n')) {

can also be simplified

+        if (line_long[nb-1] != '\n') {

because nb must be > 0 to overflow the 'line' buffer.

I agree that it is wrong to call PR_LogFlush() at the
end of the PR_LogPrint function.  Perhaps log buffering
has never worked -- the logic error of isSync since
rev. 1.1 was another evidence.  For backward compatibility
we probably should not fix this bug.
Attachment #279463 - Flags: review?
Blocks: 399647
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: