Closed
Bug 11981
Opened 25 years ago
Closed 25 years ago
nsStr/nsString off-by-N bugs
Categories
(Core :: XPCOM, defect, P3)
Core
XPCOM
Tracking
()
VERIFIED
FIXED
People
(Reporter: brendan, Assigned: brendan)
Details
(Whiteboard: [09271999]waiting to hear from reporter)
Attachments
(3 files)
6.20 KB,
patch
|
Details | Diff | Splinter Review | |
995 bytes,
patch
|
Details | Diff | Splinter Review | |
877 bytes,
patch
|
Details | Diff | Splinter Review |
While implementing javascript: URLs, I found a couple of bugs in xpcom/ds/nsStr* and made the fixes yesterday, in the heat of stomach flu fever and in order to get javascript: working for M9 (see bug 1646). Here's my analysis and summary of changes. Attached unified diffs (ignoring whitespace changes) of the files themselves. Rick, could you bless these or let me know if I've missed something subtle? With your and jevering's approval I'd like to get these in before the M9 branch is created. Thanks, /be ----- Bugfixes: - Size classification and log2 problems: nsPoolingMemoryAgent::Alloc computed the ceiling-log2 of the string length, then multiplied by 1 or 2 depending on mCharSize, then subtracted 4 (the log2 of the smallest pool's string length). But nsPoolingMemoryAgent::Free did not scale by mCharSize before subtracting 4. What's more, once you're in log2-space, law of exponents says to add 0 or 1, not multiply by 1 or 2, to scale for ASCII or UCS2. So an allocation for a 33 character Unicode string would pull from the pool at index 6*2-4 or 8, while a Free on that string would push its storage in the pool at index 6*1-4 or 2. Fixed to compute ceiling-log2 correctly, subtract 4 from the resulting log2, and add 0 or 1 (add mCharSize, which is encoded as 0 for ASCII, 1 for UCS2; neat!) to find the right byte-size-classified pool. For the 33-character Unicode string example, ceiling-log2 of 33 is 6, +1 for mCharSize, -4 for the log2 of the minimum pooled length, gives index 3 as the pool to use for Alloc and Free. - Net (string length excluding NUL) vs. gross (buffer size) capacity: nsString.cpp's nsCAutoString::nsCAutoString was calling nsStr::Initialize with aBuffer.mCapacity, but nsStr mCapacity is net, not gross. Fixed to pass aBuffer.mCapacity-1. The same problem existed in nsAutoString::nsAutoString. It's not clear that the same problem doesn't arise in both the PRUnichar* and char* flavors of nsSubsumeCStr::nsSubsumeCStr -- they compute strlen if given aLength==-1, otherwise they set mLength and mCapacity to aLength-1. That says that any non-negative value for aLength should be taken as a gross rather than net capacity, as with nsString::ToCString's aBufLength argument. I can't find any callers who pass the third aLength parameter, so it always defaults to -1 in currently exercised code. This net vs. gross issue could be resolved by using "aBufLength" or "aBufCapacity" or some such to suggest that the argument is gross, not net. CBufDescriptor::mCapacity might also be renamed to better spell out what's going on. Cleanups: - Add to comment in nsStr.h about mCapacity for an nsStr being in character count units, and not counting the allocated space for the final NUL char; whereas mCapacity for a CBufDescriptor is a gross capacity, not net of NUL. - Define kNumPools and use it consistently to declare mPools and bound loops over, and indexes into, that array. Previous revision used 16, 10, and 12 respectively. - Rename eDefaultSize eMinLength (it's a net string length or character count, not a "sizeof" byte size or buffer capacity), derived from eMinLengthLog2. Use eMinLengthLog2 consistently instead of magic 4. - Use PR_CeilingLog2 instead of ad-hoc exp/log2-computing loops. I changed nsMemoryAgent::Alloc to use this NSPR function too. - Make comments about "given amount in charunits... have to scale by charsize" apply to the scaling code, and to the particular amount variable. - Defer theNewCapacity++ in nsPoolingMemoryAgent::Alloc until we enter the losing case for sure, where we need to call 'new char[theSize]' -- and then avoid gratuitous ++ by adding 1 when computing theSize. - Expand tabs to 2-space-equivalent (actually, Emacs does this based on the magic comment at the top of the file -- it says indent-tabs-mode: nil).
Comment 1•25 years ago
|
||
Assignee | ||
Updated•25 years ago
|
Status: NEW → ASSIGNED
Comment 2•25 years ago
|
||
Sorry, I'm just flat wrong about the second point (net vs. gross capacity): CBufDescriptor.mCapacity is net excluding the terminating null character, just as nsStr.mCapacity is. I don't know what made me think otherwise in the heat of debugging yesterday. I noticed that nsString2.cpp's nsString::ToCString is doing two copies of the string, one 2to1 and the next 1to1 (char-size-wise, that is). The first is from a copy-constructor for nsCString called to convert *this from nsString type to nsCString type, which is the only matching type here for the thrice-overloaded nsCString::Assign's first argument. This is all hidden within the line: temp.Assign(*this); around line 604 of nsString2.cpp. Was "2to1" Assign removed from the nsString package? It seems just as bad to have a "2to1" copy-constructor, in terms of risk of Unicode data loss. In any case, this code needs a nsCString::Assign(const nsString& aString...) in order to avoid a gratuitous copy. /be
Comment 3•25 years ago
|
||
Comment 4•25 years ago
|
||
So that assertion ( + NS_ASSERTION(temp.mStr==aBuf, "lost buffer"); ) that I added to nsString::ToCString is botching here: nsDebug::Assertion(const char * 0x100740d8, const char * 0x100740c8, const char * 0x100740a0, int 0x0000025d) line 176 + 13 bytes nsString::ToCString(char * 0x006af328, unsigned int 0x00000100, unsigned int 0x00000000) line 605 + 33 bytes CToken::DebugDumpSource(ostream & {...}) line 142 + 29 bytes WriteTokenToLog(CToken * 0x01ffbe50) line 1050 CNavDTD::HandleStartToken(CToken * 0x01ffbe50) line 1250 + 9 bytes NavDispatchTokenHandler(CToken * 0x01ffbe50, nsIDTD * 0x02a09d20) line 241 + 12 bytes CTokenHandler::operator()(CToken * 0x01ffbe50, nsIDTD * 0x02a09d20) line 80 + 14 bytes CNavDTD::HandleToken(CNavDTD * const 0x02a09d20, CToken * 0x01ffbe50, nsIParser * 0x02b427e0) line 731 + 18 bytes CNavDTD::BuildModel(CNavDTD * const 0x02a09d20, nsIParser * 0x02b427e0, nsITokenizer * 0x02a0bb30, nsITokenObserver * 0x00000000, nsIContentSink * 0x02b425a0) line 551 + 20 bytes nsParser::BuildModel() line 941 + 34 bytes nsParser::ResumeParse(nsIDTD * 0x00000000, int 0x00000000) line 886 + 11 bytes nsParser::OnDataAvailable(nsParser * const 0x02b427e4, nsIChannel * 0x02b37310, nsISupports * 0x00000000, nsIInputStream * 0x02b40300, unsigned int 0x00000000, unsigned int 0x000005c8) line 1168 + 19 bytes nsDocumentBindInfo::OnDataAvailable(nsDocumentBindInfo * const 0x02b323f0, nsIChannel * 0x02b37310, nsISupports * 0x00000000, nsIInputStream * 0x02b40300, unsigned int 0x00000000, unsigned int 0x000005c8) line 2093 + 32 bytes nsHTTPResponseListener::OnDataAvailable(nsHTTPResponseListener * const 0x02b40170, nsIChannel * 0x024bf040, nsISupports * 0x02b37310, nsIInputStream * 0x02b40300, unsigned int 0x00001158, unsigned int 0x000005c8) line 163 + 47 bytes nsOnDataAvailableEvent::HandleEvent(nsOnDataAvailableEvent * const 0x02b43f70) line 350 nsStreamListenerEvent::HandlePLEvent(PLEvent * 0x02b43f74) line 149 + 12 bytes PL_HandleEvent(PLEvent * 0x02b43f74) line 509 + 10 bytes PL_ProcessPendingEvents(PLEventQueue * 0x00abd640) line 470 + 9 bytes _md_EventReceiverProc(HWND__ * 0x00000c40, unsigned int 0x0000d32a, unsigned int 0x00000000, long 0x00abd640) line 932 + 9 bytes KERNEL32! bff7363b() It looks like ToCString ignores its char* buffer argument and converts into an auto-grown malloc buffer, which it then leaks. Also, the anOffset parameter seems to be ignored, although I didn't remember seeing anyone use it in the calls I surveyed. Attached is my attempt at a fix, which also fixes the double-copy hiding in the unintended copy-constructor. /be
Comment 5•25 years ago
|
||
I'll review the patch, but first a few comments. I did in fact remove the 2to1 assignment operator in an attempt to make things safer, but have changed my mind. Look for it to reappear real soon. I'm not clear on the issue you're raising with toCString (I'll read your comments more closely), but let's make sure of one thing: if you give me a buffer, I assume I have to reduce it to account for a null terminator. It's possible therefore for me to auto-alloc a heap buffer if you give me a length of N, rather than N+1.
Comment 7•25 years ago
|
||
ToCString's interface does not provide for auto-alloc, because it returns a char* to the same char* buffer passed in. If it sometimes were to return the auto-grown buffer, the caller would have to test (retptr != buffer) and if true call free or delete. But today, per the assertion and the backtrace involving CToken::DebugDumpSource, the pointer to the original buffer is returned and the autogrown buffer is leaked. It seems better to me to truncate the string to fit the user's C buffer than not to use the buffer at all. Adding back 2to1 will still result in a copy-construct and extra copy -- is my patch to use nsStr::Assign ok? Feel free to take this bug away from me! /be
I've fixed the major problems cited here: ToCString() logic error, the extra copy construction, and I've improved the docs. Feel free to address the remaining issues (which I hope are largely cosmetic) yourself. Thanks for the bug report.
Assignee | ||
Updated•25 years ago
|
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Comment 9•25 years ago
|
||
Thanks for fixing this stuff. /be
Updated•25 years ago
|
Whiteboard: [09271999]waiting to hear from reporter
Updated•25 years ago
|
Status: RESOLVED → VERIFIED
Comment 10•25 years ago
|
||
marking verified
Comment hidden (collapsed) |
You need to log in
before you can comment on or make changes to this bug.
Description
•