Closed
Bug 139649
Opened 22 years ago
Closed 22 years ago
Fix string code to use IsDependentOn instead of depending on nsAPromiseString type
Categories
(Core :: XPCOM, defect, P3)
Core
XPCOM
Tracking
()
RESOLVED
FIXED
mozilla1.0
People
(Reporter: jag+mozilla, Assigned: jag+mozilla)
References
Details
(Keywords: topembed+, Whiteboard: [driver:scc])
Attachments
(4 files, 4 obsolete files)
43.60 KB,
patch
|
scc
:
review+
darin.moz
:
superreview+
|
Details | Diff | Splinter Review |
6.71 KB,
patch
|
alecf
:
review+
scc
:
superreview+
|
Details | Diff | Splinter Review |
9.91 KB,
text/plain
|
Details | |
7.64 KB,
text/plain
|
Details |
Consider the code: nsString A; //... A = Substring(A, 2, 3); Normally when you assign something into A you'll clear A's buffer, then copy the string you want to assign into it. That creates a problem for this code because the (sub)string we want to copy from depends on the string we want to assign into, so when we cleared A's buffer we thus lost the characters we wanted to copy. The solution is to make a copy of the substring first, then clear A's buffer, and then store the copy in A. We're currently doing this by looking at the type of the string we want to copy from, and if it's an nsAPromiseString, we know we need to check whether there is a dependency on the string we're assigning into. The problem with this approach is that we can't depend on the string's type, since someone might cast that type information away, for example: void foopy(const nsAString& inStr, nsAString& outStr) { outStr = inStr; // very silly example } //... nsString A; foopy(Substring(A, 2, 3), A); Here the type information is cast away, so we don't do the dependency check, and A isn't what it's supposed to be. The real solution is to always check for such dependencies when modifying a string (Assign, Append, Insert, ...). This will allow us to get rid of (the now defunct) |nsAPromiseString|. We might also be able to special case a few assignments (e.g. from PRUnichar*) where we know there is no dependency.
Updated•22 years ago
|
Status: NEW → ASSIGNED
Keywords: mozilla1.0,
topembed
Priority: -- → P3
Target Milestone: --- → mozilla1.0
Comment 1•22 years ago
|
||
it just occured to me that there still isn't anything we can do about this broken pattern: { nsString foopy("loves bar"); PRUnichar *p = foopy.get(); foopy = p; } as well as all the variations on this that might accidentally show up in real code. should we add an IsDependentOn that takes a raw pointer? or is that just overkill?
Assignee | ||
Comment 2•22 years ago
|
||
A pointer to a buffer inside a string is only valid for as long as you know that the string doesn't change. I think that clearly covers this case as "don't do that".
Comment 3•22 years ago
|
||
i guess, but conceptually it's not that much different then A = Substring(A,x,y); especially given the example you mentioned. i could just as easily replace inStr with a |const PRUnichar *| value. users might fall into the same trap, right? at any rate, i don't feel that strongly about this issue... i just wanted to point out that there is a bit of bias toward objects for no apparent reason :-/
Updated•22 years ago
|
Assignee: darin → jaggernaut
Status: ASSIGNED → NEW
Comment 4•22 years ago
|
||
-> jag
Assignee | ||
Comment 5•22 years ago
|
||
Assignee | ||
Comment 6•22 years ago
|
||
So darin measured the effect of this patch on Tp and Ts and it doesn't have a negative impact on either.
Assignee | ||
Comment 7•22 years ago
|
||
Ignore the assertions I added in nsDependentSubstring, they're something I was playing with.
Comment 8•22 years ago
|
||
i ran the perf numbers on a 733Mhz redhat 7.2 box w/ 256 Mb RAM and an IDE drive. compiled with these options: ac_add_options --enable-crypto ac_add_options --enable-xterm-updates ac_add_options --enable-extensions ac_add_options --enable-svg mk_add_options MOZ_INTERNAL_LIBART_LGPL=1 MOZ_INTERNAL_LIBART_LGPL=1 ac_add_options --enable-optimize=-O2 ac_add_options --disable-debug ac_add_options --enable-strip
No longer blocks: 138000
Comment 9•22 years ago
|
||
bug 128288 probably depends on this or is a dupe of this...
Comment 10•22 years ago
|
||
Comment on attachment 82053 [details] [diff] [review] Fix sr=darin (nice!)
Attachment #82053 -
Flags: superreview+
Comment 11•22 years ago
|
||
Comment on attachment 82053 [details] [diff] [review] Fix hmmm, I don't think the |IsDependentOn| test in the |nsDependentSubstring| classes is strong enough. Shouldn't it be return mString->IsDependentOn(aString); ? In any case where there is a more complex underlying string, or the possibility of one characterized by a reference to an abstract type, then we have to forward the |IsDependentOn| call into it. Only in simple types can you answer the question directly. Is this not true? Or am I confused? All classes that directly own the data can answer the question with a simple test. All classes that only refer to the data owned by some other class must forward the |IsDependentOn| call. Hmmm, maybe I'm reading this wrong, but it looks like one could circumvent the |IsDependentOn| test by just assigning from an element pointer and length that happen to point into ones self (e.g., trying to trim both ends of a string in a single operation, not using |SubString|). Similarly with other element pointer based operations. Other than those points, I like this patch. If I'm right about the points above, however, those would be killers. I'm marking this 'needs-work' so you can determine if my observations are valid. If they aren't, I'll update that to an r/sr.
Attachment #82053 -
Flags: needs-work+
Assignee | ||
Comment 12•22 years ago
|
||
(IsDependentOn in nsDependentSubstring should indeed forward the request) scc and I just discussed adding support for detecting char iterator based substring dependencies, which if you look at it slightly differently means you'll just check for overlap of buffers in all cases, which in turn means IsDependentOn can be (protected) a non-virtual function on nsAString. I'll come up with a patch that does this in a m*n (where m is always 1 because you can only write into single fragment strings, and n is most likely 1 or close to it) fashion, scc is looking into an m+n algorithm.
Updated•22 years ago
|
Assignee | ||
Comment 14•22 years ago
|
||
Assignee | ||
Updated•22 years ago
|
Attachment #82053 -
Attachment is obsolete: true
Assignee | ||
Comment 15•22 years ago
|
||
Attachment #83396 -
Attachment is obsolete: true
Comment 16•22 years ago
|
||
This is an outline of the faster algorithm. Like the brute force approach it is O(m) when the number of source buffers is 1. Using a algorithm designed to handle two muli-fragment strings from the start would allow us to make |IsDependentOn| a non-virtual function in |nsAC?String| --- even if we didn't turn on the ability to do multi-fragment sources.
Comment 17•22 years ago
|
||
1) are we really concerned about performance w.r.t. the number of string fragments? I mean how often are we talking about strings with more than say 4 string fragments? The only way I even know how to create a multi-fragment string is via the concatentation operator. 2) do we really need the faster (and significantly more complex!) algorithm to freeze strings?
Comment 18•22 years ago
|
||
i have to agree with alecf... the current patch seems like the right one for the 1.0 branch (it is much simpler). can we defer the optimization work until after mozilla 1.0?
Comment 19•22 years ago
|
||
I'm fine with that, since at the moment, we only do Mx1 comparisons. My algorithm _looks_ a lot more complicated; but most of that is comments. The straight-line flow for an Mx1 case is actually simpler than jag's algorithm _except_ for my binary search, which looks bulky, but in the one pair case only expends an add, a shift, and a couple of (register) assigments over jag's. I'm not disagreeing; mine does have bigger brain-print, and _currently_ jag's fills the bill. However, the test in jag's |IsDependentOn| is not strong enough. It currently reads if ( f2.mStart >= f1.mStart && f2.mStart < f1.mEnd ) it should read // if it _isn't_ the case that // one fragment starts after the other ends, // or ends before the other starts, // then, they conflict if ( !(f2.mStart>=f1.mEnd || f2.mEnd<=f1.mStart) ) simplifying we get if ( f2.mStart<f1.mEnd && f2.mEnd>f1.mStart ) Let's get jag's patch in as soon as possible (like tonight, according to drivers). We can upgrade to my algorithm post 1.0 as this doesn't change the API.
Comment 20•22 years ago
|
||
jag explained to me that his check is strong enough since |this| is always a single fragment string (we should assert this) and any overlap would mean that f2.mStart >= f1.mStart (given that we only call IsDependentOn before writing to a string). that said, there's no runtime difference between what scott suggests and what jag has already implemented, so why not use what scott suggests? it certainly seems clearer and less fragile (just count the number of conditionals in my first paragraph!).
Comment 21•22 years ago
|
||
The address of a string buffer is completely unrelated to the address of any other independent buffer. The buffers of a single string are not necessarily in the same order in memory. Nor is a buffer of one string guaranteed to start later than a buffer of another string. So nothing about strings themselves forces |f2| to be after f1. I don't see anything in the algorithm that explictly assigns |f2| to be the later of the two fragments. I must be missing something, because I don't see how jag's test can catch f1 |----------| f2 |--------| Perhaps by coincidence, because currently our writable strings all totally own the characters they manage, and thus, own them from the beginning of the buffer. This would be invalidated, however, by several ideas that easily fit into the scheme we have now, e.g., writable substrings and strings with smarter buffer strategies (where the string start might not be the buffer start). Besides, we don't know what needs people have to determine string intersection in the future. It may be that future clients what to see if two strings interfere with each other for reasons totally unrelated to assigning into one of them. This broken test compromises that notion, and makes the function itself, er, uh, well, false. The function claims to test if two strings of any number of fragments intersect. With this test, it doesn't really do that. With the UNSTATED coincidence mentioned above, I concede his test will work for a little while :-) The fact that this assumption is not obvious to an ordinary reader (it had to be explained to you, Darin), and is easily invalidated, makes this test weaker IMHO that my test of equivalent effort. Jag: you need better comments (er, _any_ comments in this function :-) I'm not trying to be hard on you Jag, don't take these comments personally or anything like that; it's just a test with one too many assumptions.
Comment 22•22 years ago
|
||
scc: agreed.
Comment 23•22 years ago
|
||
drivers are drooling. how much testing have seen with this? Can we fix the test and check it in tonight? Drivers are saying "directly into the branch". Can I get some driver approval for that? Jag, Darin, do you feel it's ready for the branch. If we don't get it in now, and I mean _right_now_ ... it might not make 1.0.
Comment 24•22 years ago
|
||
i've tested the patch with a trunk build (under linux), and everything seems fine. i've run Ts and Tp performance tests (under linux), and there don't appear to be any performance regressions. in short, i'd say this patch is ready!
Assignee | ||
Comment 25•22 years ago
|
||
Attachment #83449 -
Attachment is obsolete: true
Comment 26•22 years ago
|
||
adding adt1.0.0+ pending r=/sr= on the final patch. Please get drivers approval before checking into the 1.0 branch.
Keywords: adt1.0.0+
Updated•22 years ago
|
Attachment #83613 -
Flags: superreview+
Comment 27•22 years ago
|
||
Comment on attachment 83613 [details] [diff] [review] Using a better check (thanks scc). r/sr=darin
Comment 28•22 years ago
|
||
Comment on attachment 83613 [details] [diff] [review] Using a better check (thanks scc). r=scc
Attachment #83613 -
Flags: review+
Assignee | ||
Comment 30•22 years ago
|
||
Minor change.
Comment 31•22 years ago
|
||
Comment on attachment 83714 [details] [diff] [review] I knew I had forgotten something. sr=scc
Attachment #83714 -
Flags: superreview+
Comment 32•22 years ago
|
||
Comment on attachment 83714 [details] [diff] [review] I knew I had forgotten something. r=alecf
Attachment #83714 -
Flags: review+
Assignee | ||
Comment 33•22 years ago
|
||
Apparently Tstartup went up a little as a result of my first checkin, see bug 144863.
Depends on: 144863
Assignee | ||
Comment 34•22 years ago
|
||
Fix has been checked in. I'll deal with Tstart issues in bug 144863.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Comment 35•22 years ago
|
||
Use |GetReadableFragment| instead of iterators (I should have remembered that would be faster. Re-ordered some tests, where putting the more likely test first can short-circuit the whole expression. Fixed one test that needed to be a little stronger. Added some optional optimizations (intent indicated with |#if|), that could be measured. With respect to the start-up slow down reported (see bug 144863), I don't think sticking this algorithm in will help --- except for one thing easily inserted into jag's patch: the |Empty()| tests at the top. It seems likely to me that during startup, a lot of string assignment is into empty strings. This will short circuit that case. It would be easy to intrument this in a debug build by having |IsDependentOn| just count the total number of times it's called, and the total number of times it's called on empty destination. If the latter dominates other uses of |IsDependentOn|, add the |Empty| test.
Attachment #83551 -
Attachment is obsolete: true
What about a string that's a dependent concatenation of multiple empty strings?
Comment 37•22 years ago
|
||
Here's working code demonstrating the heap sort and overlap resolution phase of the algorithm above
Updated•3 years ago
|
Component: String → XPCOM
You need to log in
before you can comment on or make changes to this bug.
Description
•