Closed Bug 224703 Opened 21 years ago Closed 4 years ago

document.lastModified : zero-pad the date in the encoding of the current locale

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: jshin1987, Assigned: jshin1987)

References

()

Details

(Keywords: intl)

Attachments

(1 file)

How to reproduce: 1. Launch mozilla under ko_KR.UTF-8 locale on linux 2. Go to the above URL which imports the following javascript: (http://www.yale.edu/lawweb/avalon/diana/js/footer.js) document.write("<DIV CLASS=\"copyright\">") document.write("<SPAN CLASS=\"copytitle\">" + document.title + "</SPAN><BR>") document.write("is located at : " + window.location + ";<BR><BR>") document.write("This document was last updated on: " + document.lastModified + "<BR><BR>") document.write("&copy; 1996-2003 Project DIANA at Yale Law School.<BR>") document.write("The Lillian Goldman Law Library in Memory of Sol Goldman.<BR><BR>") document.write("</DIV>") Expected result: This document was last updated on: 11/05/2003 03:41:21 AM [1] Actual result: The date/time in Korean '2003년 11월 5일 오전 3시 41분 21초' with each byte of Korean characters in UTF-8 zero-padded comes up. For instance, '년' in UTF-8 is 0xeb 0x85 0x84. In its place, U+00EB, U+0085 and U+0084 show up. There are two bugs here: 1. document.lastModified should return the result in the language (or the locale corresponding to the current character encoding [2] if there's no lang specified) of the current document. Instead, Mozilla's lastModified returns the date/time in the language of the current locale under which it's launched. 2. Across 'xpconnect' boundary, UTF-8 byte sequences are zero-paaded,
Cc'ing Brendan, tor@acm.org, jst - Will the issues above be addressed by the fix to bug 153586, "Date.prototype.toLocaleString ( ) not implemented correctly"? Or is this a DOM issue, having specifically to do with the way |document.lastModified| is implemented?
This is only happening because the server isn't sending back a last-modified header, so we're faking one from the current time. This eventually hits strftime with %c (locale-specific format) and we treat this as an ascii string. Changing the time format string is probably the appropriate fix.
Regarding the first point - for servers that return last-modified, we pass that string end-to-end without any parsing or reconstruction, so the result will be exactly what the server sent. It appears IE does parse the last-modified string and thus can return localized strings.
fwiw: servers sometimes send a Last-modified header with a value that does not conform to the HTTP-date type specified in RFC 2616. this is not a big deal, but the code should take care to handle this error case. i guess it can just fallback on fabricating a value for document.lastModified. see: http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.3 for more info on the format of the HTTP-date string.
>this is not a big deal i meant: this is not a big deal to necko since it will ignore a malformed Last-modified header in cache expiration time calculations, but will send the Last-modified header value (in the original format) as the argument to an If-modified-since request header (if doing a cache validation request).
Do I understand correctly, then, this is not a JS Engine issue? If so, what is the right component? Thanks -
this should probably be reassigned to one of the DOM components... but, i don't know which one. jst?
I say DOM level 0. /be
Assignee: general → dom_bugs
Component: JavaScript Engine → DOM Level 0
Setting default QA -
QA Contact: PhilSchwartau → ian
> This eventually hits strftime with %c (locale-specific format) and if/when possible, nsIScriptableDateFormat::FormatDateTime (or nsIDateTimeFormat::FormatTime) should be called with the locale corresponding to the language and the encoding of the current document instead of the current locale. > we treat this as an ascii string. and treated as such when converted to Unicode. Hmm, on Unix, nsDateTimeFormatUnix::FormatTMTime is doing the right thing as far as the conversion is concerned.... where does that zero-padding happen, then?
It turns out that strftime() is called directly in prmjtime.c http://lxr.mozilla.org/seamonkey/source/js/src/prmjtime.c#470 Because it's in C (unless there's a wrapper-function), nsIDateTimeFormat can't be made use of, can it?
jshin: No, and even if it were in C++, JS is built early and cannot depend on more derived files. Isn't this off-topic for this bug, though? See bug 153586. /be
I think it's on-topic because 1) the summary line has 'in the encoding of the current locale' 2) (more specifically) my point #1 in comment #0 depends on whether it's possible or not. So, it's absolutely impossible to implement #1? bug 153586 is rather different. It's for honoring the locale under which Mozilla is launched. I have some reservation about that because there are a few different locales involved with Mozilla(system locale, application locale). What I'm after here is that datetime format (when rendered in the document rendering area) should follow the format suitable for the current language and encoding of the document being rendered instead of the format of the current (system and application) locale.
If it's not possible to fix this as I suggested (on modern unix, using nl_langinfo() and iconv(3) would work most of time. on win32 we can use MultiByteToWideChar(?) with ACP, etc, but my point #1 in comment #0 wouldn't be fixed), we'd better use the lang/locale neutral format YYYY-MM-DD hh:mm:ss (per ISO 8601). off-topic: ECMAscript appears to need some locale-related enhancement.
see also bug 107445
A possible way to fix this (comment #0, point #1) is to override JS engine's lastModified in DOM (the way escape() and unescape() used to be overriden before bug 44272 was fixed) so that it's rendered in the locale (and encoding) corresponding to the current document lang/encoding.
There's nothing to override in the JS engine: lastModified is a DOM-implemented property of document. tor, what's going on with bug 153586? Wouldn't that help here? /be
Sorry, I keep forgetting comment 13. Still, there's no JS engine property to override for document.lastModified. The value faked up when the server sends no corresponding response header is from the DOM code, right? /be
Thanks for the pointer. I was misled by comment #2 to assume that it's 'strftime' in JS engine (I ran 'lxr' search on seamonkey which doesn't include NSPR). It turned out that it's 'strftime' in NSPR. http://lxr.mozilla.org/mozilla/source/content/base/src/nsDocument.cpp#4071 http://lxr.mozilla.org/mozilla/source/nsprpub/pr/src/misc/prtime.c#1675 Now that I know where the action is, it's rather easy to fix.
There's an issue to resolve here, though. 'lang' is not usually available to nsDocument while the character encoding is. However, it's rather hard (if not impossible) to 'infer' the locale from the document encoding. For instance, 'iso-8859-1' can be used for any number of locales, let alone 'UTF-8'. I can/have to split the task to two chunks. In the first stage, I can just fix the conversion issue (that is, make sure the return value of 'strftime' in the locale encoding is properly converted to UTF-16/UTF-8). In the second stage, I'll devise a heuristic to map 'the document' encoding to 'the locale' (I can make use of the current locale, Accept-Lang list, etc). I'm assigning this to myself.
Assignee: dom_bugs → jshin
This is stage #1 fix.
Comment on attachment 137704 [details] [diff] [review] a patch to fix the conversion problem >+ // some servers send last-modified header in non-ascii violating >+ // RFC 2616 (bug 224703 comment #3 and #4) >+ if (NS_FAILED(rv) && !IsASCII(lastModified)) { > mLastModified.Truncate(); It's '||' instead of '&&' if (NS_FAILED(rv) || !IsASCII(lastModified)) {
Comment on attachment 137704 [details] [diff] [review] a patch to fix the conversion problem Um, no. see the comment in @@ 776,11 it says: /// ... fall back to what NS4.x returned. and what are you doing? oho, you're making it do something else. Anyway, if you want an lxr which actually mostly works, try http://landfill.mozilla.org/mxr-test (or /mxr if you don't want to risk my mxr patches) Trees are updated when i get around to them. eventually i'll probably cron it to match lxr.mozilla.org
Attachment #137704 - Flags: review-
Why is it important to preserve the NS 4.x behavior? (The inconsistency between the comment and the actual code should be removed, of course).
*** Bug 131006 has been marked as a duplicate of this bug. ***
i don't think i said anything specifically about one format being bad or more important. i know that i clearly said you made the code comment lie about what the code did. that's unacceptable. if you're going to change the code to do something completely different then you must fix the comment to accurate indicate what in the world the code is going to do. no one debugging the code should be mislead to believe that the code that follows as a fallback actually does what the old comment says it does if in fact it does something completely different. is that clear? as for whether this is the right or wrong change (if properly implemented), i really don't have time to form an opinion.
QA Contact: ian → general

This might have been fixed by bug 99224.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: