Closed Bug 140070 Opened 22 years ago Closed 22 years ago

JS_GetStringBytes returns the wrong string

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla1.0

People

(Reporter: jband_mozilla, Assigned: brendan)

References

Details

(Keywords: js1.5)

Attachments

(1 file)

from: news://news.mozilla.org/aa8kiq$l5h1@ripley.netscape.com

Michael Schneider wrote:
> Hey all!
> 
> 
> 
> I'm having (once again ;-) very strange troubles:
> 
> 
> 
> In some scripts which do a lot of string manipulation, I get in a native
> function (e.g. write) the jsval for the string, on which I call
> JS_ValueToString()... The 16bit string which is contained in the returned
> JSString structure looks fine... But when I call JS_GetStringBytes(), then
> sometimes, not really reproducible, I get a string which is not related with
> the 16 bit string... It's always a string that has been used before in that
> script.... I tracked it down to the deflated_string_cache hash table, which
> contained the wrong string... Of course manually jump over the if (he)
> {bytes = (char *) he->value} statement (jsstr.c:2792) and executing instead
> the js_DeflateString() statement returns the correct string!
> 
> 
> 
> So, nows the question, why does the hash table contain that entry... And I
> have no idea... ;) We are using SM 1.5 RC4, in a multithreaded environment,
> but of course compiled with JS_THREADSAFE... But I've had the error even
> when just making serialized requests to the engine... The garbage collector
> does not necessarily run while the error occurs... So, I wonder HOW can
> there be the same hash key twice, since the hash key is the string pointer
> and the garbage collector does not run??! Another strange observation: once
> I've got this error, it happens more often, but sometimes I can't reproduce
> it, no matter how hard I try... but when it happens, then it normally does
> that not only once, it happens for say 20 string operations like 10 times,
> but then again for 1000 without a problem before it starts again....
> 
> 
> 
> And now comes my biggest problem: I can't reproduce it in the jsshell
> application ;-).... but I hope that some of you have some ideas what could
> cause this strange behavior... The only thing I've to add before I start
> debugging again is that the script uses nothing else than core JS functions
> (beside the write, which is more or less implemented like Print() in
> js.c)... And that the only difference in the runtime environment is that we
> have a resolve handler installed on the global object, but it only returns
> JS_TRUE when it is called (in that script) and should have no effect... But
> I'm still not 100% sure about that!
> 
> 
> 
> Any ideas and hints are greatly appreciated!
> 
> Regards,
> 
> Mike

It sounds to me like there are probably case where str pointers are not being
properly removed from the deflated string cache when they are are collected and
then the same pointer is being reused by the heap (as should happen) but the
stale cache entry remains.

I can think of one place that we screwed this up... External strings. We have
external string finalizers but we do not seem to have a mechanism for removing
the deflated string cache entry that might be present for a given external
string. I think we should fix this to be transparent; i.e. call the external
finalizer and then have the engine automatically clean up the entry (rather than
exposing a function that the external finalizer is expected to call).

The above is certainly a problem that needs to be fixed. I'm also thinking that
someone should take a close look to see that the depende string changes do not
mess anything up. I'm not seeing anything, but it is worth some thought.
Reading Michael's comments closer (esp. related to the question of when and if
the gc runs), I wonder if the external string problem I'v identified is really
the source of what he sees. Perhaps something else is lurking here.
Reassigning to Kenton 
Assignee: rogerl → khanson
Keywords: js1.5
There was a bug in RC4 where dependent strings could be finalized after their
bases, leading to crashes or even leading to failure to remove the deflated
string cache entry for the dependent string.  Fixed in RC4a.

The external string problem remains.  Patch in a bit.

/be
Assignee: khanson → brendan
Keywords: mozilla1.0+
Target Milestone: --- → mozilla1.0
Looking for fast r= and sr= from you-know-who, for 1.0RC2.

/be
Comment on attachment 81028 [details] [diff] [review]
proposed fix for external string deflation

Looks OK, though I wonder if maybe we shouldn't be trying to use the same path
for internal strings as well, by broadening the check in the finalization pass:
do externals follow the same rules for validity (non-NULL str->chars) and such?
Broadening the test just slows things down, because the GCX_STRING type index is
not adjacent to GCX_MUTABLE_STRING and the GCX_EXTERNAL_STRING types.

Really, if we are to clean this up, I think we ought to get rid of the type test
altogether and reminimize to the gc_finalizers dispatch table only.  I.e.,
somehow wrap all string finalizers with a wrapper that calls
js_PurgeDeflatedStringCache, then calls the real finalizer -- but that's slow
too.  So I'm happy for now to penalize the uncommon case (compilers for modern
superscalar architectures will select a branch-likely around the then part,
optimizing for the internal gc-thing type case).

/be
Comment on attachment 81028 [details] [diff] [review]
proposed fix for external string deflation

r=jband.

A couple things though...

1) I trust that you got the deflated_string_cache_bytes #ifdef 
right to not break release builds :)

2) Why call the finalizer *after* calling js_PurgeDeflatedStringCache?
Does this protect from anything I can't think of? This opens
a *small* hole where finalizer implementors might re-prime 
the cache by calling JS_GetStringBtyes (for some reason I can't
think of but might make sense to them).
Attachment #81028 - Flags: review+
To answer shaver's final question: yes, external strings must conform to the
rules for a JSString (dependent or independent).  The GC thing type has to do
with storage class, but the access to that storage is determined independently
by the contents of the JSString struct (flag bits and all).

jband #1: Yes, I tested both (and built the js shell both ways).

jband #2: See the use of JSSTRING_LENGTH in js_PurgeDeflatedStringCache.  I
argue the case you worry about is a "don't do that" hazard we can document in
the API docs (which have yet to be written for external string finalizers!).

/be
Nominating for 1.0, asking for a= from a driver.  This is a potential dataloss
bug in JS engine embeddings, if the wrong decimated-to-ISO-Latin-1 string is
returned for a given JS string.

/be
Blocks: 138000
Comment on attachment 81028 [details] [diff] [review]
proposed fix for external string deflation

a=rjesup@wgate.com.  Please check into branch and trunk
Attachment #81028 - Flags: approval+
Fixed in branch and trunk.

/be
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Michael, can you confirm that this fixes your problem? If so, please
mark this bug "Verified", otherwise, you can reopen it; thanks -
I've updated to rc4a and applied the patch.. This seems to fix my problems! :)

But I don't have enough rights to change the status of this bug, so please 
someone else could set it to "Verified"? 

Thanks a lot!
Michael: thank you. I'll mark this one Verified, then -
Status: RESOLVED → VERIFIED
adding branch resolution keyword.
Keywords: fixed1.0.0
Patch checkin verified on MOZILLA_1_0_BRANCH; adding verified1.0.0 keyword
Keywords: verified1.0.0
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: