Consider using ICU to compute time zone information for all Date methods

RESOLVED FIXED in Firefox 63

Status

()

enhancement
P5
normal
RESOLVED FIXED
3 years ago
4 months ago

People

(Reporter: anba, Assigned: anba)

Tracking

(Depends on 1 bug, Regressed 1 bug)

Trunk
mozilla63
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox63 fixed)

Details

Attachments

(7 attachments, 8 obsolete attachments)

4.61 MB, patch
Waldo
: review+
Details | Diff | Splinter Review
4.00 KB, patch
anba
: review+
Details | Diff | Splinter Review
16.27 KB, patch
anba
: review+
Details | Diff | Splinter Review
6.67 KB, patch
anba
: review+
Details | Diff | Splinter Review
60.27 KB, patch
anba
: review+
Details | Diff | Splinter Review
129.35 KB, patch
anba
: review+
Details | Diff | Splinter Review
789 bytes, patch
anba
: review+
Details | Diff | Splinter Review
We're currently using OS-dependent functions to retrieve time zone information for the Date object. This has led and is still leading to various compatibility and user issues, so we may want to consider switching to ICU to compute the time zone infos. ICU provides a richer and more robust API for time zones, which should help us to resolve the following issues:

1. Support "historic" dates (historic means any date earlier than 1 January 1970, start of the Unix epoch)
Example bugs: 437772, 487897, 637244, 704486, 718175, 802627, 935909, 937261, 1118690, 1122571, 1143398, 1254041

2. Support for future dates (where future means any date later than 19 January 2038, end of the Unix epoch)
Example bugs: 1029923

3. We no longer need to worry about OS-specific time zones issues. For example on Windows, the DST rules for the current year are applied to all years.
Example bugs: 1233809, 1304774, 1335818

4. Fix discrepancies between Date (time zone info from OS) and Intl.DateTimeFormat (time zone info from ICU/tzdata) methods (e.g. when the tzdata versions don't match)
Example bugs: 1147872, 1246930
Depends on: 1343826
You may run into an issue here that I ran into when implementing the same thing in V8: ICU doesn't have an API for getting the same timezone string that the OS will give you. See the code review discussion for https://codereview.chromium.org/2724373002/ for more information.

I'm wondering, should we just delete the named timezone string, if it's already omitted in some contexts?
Blocks: 1385643
Priority: -- → P5
Ensure getHours/getMinutes/getSeconds/getMilliseconds never return negative zero for dates before 1970.

A test in a later part revealed this bug.
Assignee: nobody → andrebargull
Status: NEW → ASSIGNED
Attachment #8994557 - Flags: review?(jwalden+bmo)
Split a private js::ResetTimeZone function from JS::ResetTimeZone, so that internal users can call js::ResetTimeZone when we only want to check the current time zone [1], but external users always get a new time zone when they call JS::ResetTimeZone. This plus a missing tzset() call in nsRFPService.cpp is probably the cause of bug 1470161.

Also removed some unnecessary #includes and moved |DateConstructor| into jsdate.cpp as a static function.


[1] Which can all go away when bug 1343826 gets implemented, but that probably never happens... :-/
Attachment #8994560 - Flags: review?(jwalden+bmo)
Only refresh the ICU time zone when we actually observed a time zone change. When want to rely on ICU for all time zone computations, we should try to reduce the number of times ICU requeries the host system time zone.

(This can lead to incorrect results in some edge cases because of bug 1300110, but I don't think we need to worry about this given that it is a pre-existing issue. More reasons bug 1343826 is useful to have. :-)
Attachment #8994561 - Flags: review?(jwalden+bmo)
Prepare DateTimeInfo and jsdate.cpp for the next patches. (This patch doesn't really make sense on its own, but it'll help when applying the ICU time zone patch.)


DateTimeInfo
- Add the "_" suffix to all fields.
- Change getDSTOffsetMilliseconds() to return an in32_t value instead of an int64_t. All possible dst offset values fit into int32_t, so this change with the change for localTZA_ allow us to pack both values more efficiently in DateTimeInfo.
- Change localTZA_ from double to in32_t, because neither operating system functions nor ICU support fractional milliseconds TZA and all possible TZA values fit into int32_t. 
- Move the cached range fields into a separate inner class |RangeCache|, so the ICU patch can reuse the logic to cache other values. And, yes, we still need to caches, even when using ICU. For example this µ-benchmark:
```
var t = dateNow();
for (var i = 0; i < 500000; ++i) {
    new Date(2018,7-1,21,0,0,(i&31)).getHours();
}
print(dateNow() - t);
```
Currently needs 320ms for, with the ICU patch without caching, it needs 500ms, but when additional caching is added, the µ-benchmark needs only 240ms with ICU.
- The caching should also help to reduce the time needed to hold the DateTimeInfo lock and that way reduce possible thread contention. 
- Renamed |MaxUnixTimeT| to |MaxTimeT| because the ICU patch will define a different limit. Also changed the Unix limit to be UTC instead of PST8PDT based.


vm/Time.{h,cpp}:
- Change the buffer limit parameter to size_t, so it matches std::strftime.


jsdate.cpp:
- Split the time zone comment computation into a separate function.
- The result string is now computed using js::ConcatStrings, because that's what we'll need for the ICU patch.
- And changed FormatDate to reduce the level of indentation.


js/src/tests/non262/Date:
- Move some helper methods into shell.js, because we'll end up needing the same helpers for the ICU patch and I don't want to copy them another time.
Attachment #8994566 - Flags: review?(jwalden+bmo)
Only remove exemplar city names from the time zone name data in ICU. This change is needed so that the long descriptive name of, for example, UTC is "Coordinated Universal Time" instead of "GMT" (in English).

I've changed the sed arguments to allow any number of leading whitespace instead of requiring exactly eight whitespace characters, because it makes the regular expression easier to read.

Oh, and ICU is now hosted on GitHub, so the update script needed to be changed to use the new location.
Attachment #8994568 - Flags: review?(jwalden+bmo)
The actual patch to use ICU for all time zone computations.

ICU doesn't provide a good C-API for time zone calculations, so we need to use the C++-API, which probably means we can only use it when we don't use the system ICU, because there's no stable C++ ABI.

The patch uses the |icu::TimeZone::LONG| format, because that (mostly) matches the current format on Windows and because it matches what V8 does. 

Also I'm not sure errors can actually happen when calling the icu::TimeZone::getOffset/getOffsetFromLocal methods (except maybe for OOM in ICU?), therefore the patch simply returns zero on error. If errors can actually happen, we'll need to change various DateTimeInfo interfaces to propagate error codes.

And I needed to change the Mutex-ID for IcuTimeZoneStateMutex from 500 to 600, so it's possible to acquire the IcuTimeZoneState mutex while already holding the DateTimeInfo mutex. (This happens when calling js::DateTimeInfo::timeZone().)

Plus lots of regression tests and imported some existing time zones tests from <https://github.com/anba/es6draft/tree/master/src/test/scripts/suite/objects/Date>.
Attachment #8994578 - Flags: review?(jwalden+bmo)
Clobber because the ICU data files was changed.
Attachment #8994579 - Flags: review+
Blocks: 1470161
Comment on attachment 8994557 [details] [diff] [review]
bug1346211-part1-negative-time-components.patch

Review of attachment 8994557 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsdate.cpp
@@ +125,5 @@
> +// 5.2.5 Mathematical Operations
> +static inline double
> +modulo(double dividend, double divisor)
> +{
> +    MOZ_ASSERT(divisor != 0 && IsFinite(divisor));

Two separate asserts.

This definition isn't correct if |divisor < 0|.  |5 modulo -4| is supposed to be -3 (such that |5 - -3 == -2 * -4|), but fmod(5, -4) is supposed to be |5 - -1*-4 == 1| that would in this use be directly returned.

Every caller in this file passes a positive value, so this isn't a problem -- for here.  It's also not a problem in ECMAScript overall, where every modulo is passed a positive |y|.  But I'm leery of defining this in a way inconsistent with the spec, just in case it grows new users.

Could you make this function assert |divisor > 0| and rename this function to PositiveModulo to clarify that a positive value is always returned, not a value with sign consistent with one or the other operand?

::: js/src/tests/non262/Date/time-components-negative-zero.js
@@ +11,5 @@
> +assertEq(utc.getTime() < 0, true);
> +assertEq(utc.getUTCHours(), 0);
> +assertEq(utc.getUTCMinutes(), 0);
> +assertEq(utc.getUTCSeconds(), 0);
> +assertEq(utc.getUTCMilliseconds(), 0);

I might make all these +0 for rhetorical emphasis, even if it doesn't change any semantics.
Attachment #8994557 - Flags: review?(jwalden+bmo) → review+
Stare as I might, I don't understand the part-2 patch.  The change breaks the documentation on JS::ResetTimeZone, which claims that function is periodically called inside the JS engine, except with this change that's no longer the case.  If there *is* a visible difference, then documentation needs to be updated.  If there *isn't* a visible difference...well, then, why is it important to bifurcate?

So then: how is it visible that js::ResetTimeZone(true) is called, versus js::ResetTimeZone(false)?  Or, after future patching, how does this change induce a difference *then*?  Also that bool should be an enum class with well-chosen enum and initializer names.

Currently this all just looks like complexity in pursuit of achieving no particular aim at all.  Help me out here!  I don't get it At All.
Flags: needinfo?(andrebargull)
(In reply to Jeff Walden [:Waldo] from comment #10)
> Stare as I might, I don't understand the part-2 patch.  The change breaks
> the documentation on JS::ResetTimeZone, which claims that function is
> periodically called inside the JS engine, except with this change that's no
> longer the case.  If there *is* a visible difference, then documentation
> needs to be updated.  If there *isn't* a visible difference...well, then,
> why is it important to bifurcate?

Apparently some users try to switch resist-fingerprinting on and off dynamically (bug 1470161), which means the time zone is switched to and from Etc/UTC. But because of bug 1300110 this doesn't really work if the system time zone is for example Europe/London, because Europe/London and Etc/UTC have the same time zone offset, so the time zone cache ignores the request to reset the time zone. For the occasional time zone reset from Realm::init() we still need to stick with our current broken behaviour for same offset time zones, because of... performance reasons. :-/

I've intentionally not updated the documentation for JS::ResetTimeZone, because I didn't want to add details about the internal time zone caching and by that overcomplicate the function documentation. But I guess I could simply remove mentioning that SpiderMonkey directly calls JS::ResetTimeZone. So change the docs from:

   Left to its own devices, SpiderMonkey itself may occasionally call this
   method to attempt to keep up with system time changes.

to:

   Left to its own devices, SpiderMonkey itself may occasionally try to detect
   system time changes.

I'll also add a test case for the Europe/London <-> UTC switch.


> So then: how is it visible that js::ResetTimeZone(true) is called, versus
> js::ResetTimeZone(false)?  Or, after future patching, how does this change
> induce a difference *then*?  Also that bool should be an enum class with
> well-chosen enum and initializer names.

Does this work for you or is it too specific?

enum class ResetTimeZoneMode : bool {
  DontResetIfOffsetUnchanged,
  ResetEvenIfOffsetUnchaged,
};
Flags: needinfo?(andrebargull)
Okay.  So what we *really* want, then, is rather than to key time zone/LocalTZA offset behavior off of a single number, to instead key things off "what's the current time zone rules in use".  That would not-equate Etc/UTC with Europe/London, and so a switch either direction would cause a full cache invalidation.  But if Realm::init saw the same key in a second check, it could not change anything and we would have consistent perf.  Right?

So, second best is what we're stuck with right now, until we implement something like the above.  (And maybe get such functionality into ICU, if that's needed first.  And any such work certainly in a separate patch and bug for certain.)  Bleh.
(In reply to Jeff Walden [:Waldo] from comment #12)
> Okay.  So what we *really* want, then, is rather than to key time
> zone/LocalTZA offset behavior off of a single number, to instead key things
> off "what's the current time zone rules in use".  That would not-equate
> Etc/UTC with Europe/London, and so a switch either direction would cause a
> full cache invalidation.  But if Realm::init saw the same key in a second
> check, it could not change anything and we would have consistent perf. 
> Right?

Sounds right.

> 
> So, second best is what we're stuck with right now, until we implement
> something like the above.  (And maybe get such functionality into ICU, if
> that's needed first.  And any such work certainly in a separate patch and
> bug for certain.)  Bleh.

Maybe someday someone will implement bug 1343826, which we'll allow us to remove the ResetTimeZone call from Realm::init.
Comment on attachment 8994560 [details] [diff] [review]
bug1346211-part2-reset-timezone-strict.patch

Review of attachment 8994560 [details] [diff] [review]:
-----------------------------------------------------------------

I'd kind of like if js::ResetTimeZone were renamed to Internal before or after, rather than have to think about overloading and goo.

::: js/src/vm/DateTime.cpp
@@ +336,3 @@
>  
>  #if ENABLE_INTL_API && defined(ICU_TZ_HAS_RECREATE_DEFAULT)
>      js::IcuTimeZoneState->lock().get() = js::IcuTimeZoneStatus::NeedsUpdate;

While you're here, could you assign |lock()| into a block-scoped local, whose lifetime clearly dominates the assignment into |get()|?  I'm 99.9% sure the lifetime of the |lock()| temporary is required to exist past that assignment happening, but I am not absolutely certain, and I can imagine others being not absolutely certain, and it seems best to model the unequivocally safe approach so that people are somewhat likelier to do it as a matter of course.

::: js/src/vm/DateTime.h
@@ +49,5 @@
> + * Engine-internal variant of JS::ResetTimeZone with an additional flag to
> + * control whether to forcibly reset all time zone data (this is the default
> + * behavior when calling JS::ResetTimeZone) or to try to reuse the previous
> + * time zone data.
> + */

Your suggested updated docs on JS::ResetTimeZone WFM.

@@ +51,5 @@
> + * behavior when calling JS::ResetTimeZone) or to try to reuse the previous
> + * time zone data.
> + */
> +extern void
> +ResetTimeZone(bool force);

Also the enum names are okay too, or at least not entirely awful.
Attachment #8994560 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8994561 [details] [diff] [review]
bug1346211-part3-fewer-icu-timezone-refreshes.patch

Review of attachment 8994561 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/DateTime.cpp
@@ +359,4 @@
>  
>  #if ENABLE_INTL_API && defined(ICU_TZ_HAS_RECREATE_DEFAULT)
> +    if (needsUpdate)
> +        js::IcuTimeZoneState->lock().get() = js::IcuTimeZoneStatus::NeedsUpdate;

As before I'd also prefer a separate local specifically for the lock object.

::: js/src/vm/DateTime.h
@@ +138,5 @@
>      // zone change (missing the necessary poking of ICU as well), so ensure
>      // only js::ResetTimeZone() can call this via access restrictions.
>      friend void js::ResetTimeZone(bool);
>  
> +    static bool updateTimeZoneAdjustment(bool force) {

Put a comment by this function and the other like "Returns true iff the internal DST offset cache was purged", or something.
Attachment #8994561 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8994566 [details] [diff] [review]
bug1346211-part4-preparation-cleanup.patch

Review of attachment 8994566 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsdate.cpp
@@ +2741,2 @@
>      if (!IsFinite(utcTime)) {
> +        JSString* str = NewStringCopyZ<CanGC>(cx, js_NaN_date_str);

Probably should just add an "Invalid Date" atom and use that.  Could be in an rs=me followup patch if you prefer.

::: js/src/tests/non262/Date/shell.js
@@ +2,5 @@
> + * Date functions used by tests in Date suite
> + */
> +(function(global) {
> +    const msPerDay = 86400000; // 1000 * 60 * 60 * 24
> +    const msPerHour = 3600000; // 1000 * 60 * 60

Maybe just write out the maths literally, IMO, if we're touching this?

::: js/src/vm/DateTime.cpp
@@ +229,5 @@
> +
> +    if (range.oldStartSeconds <= seconds && seconds <= range.oldEndSeconds)
> +        return range.oldOffsetMilliseconds;
> +
> +    range.oldOffsetMilliseconds = range.offsetMilliseconds;

Maybe add a

    auto checkSanity = MakeScopeExit([this]() {
        this->range.sanityCheck();
    });

so every path through here gets a double-check?

::: js/src/vm/DateTime.h
@@ +21,5 @@
> +constexpr double MinutesPerHour = 60;
> +constexpr double SecondsPerMinute = 60;
> +constexpr double msPerSecond = 1000;
> +constexpr double msPerMinute = msPerSecond * SecondsPerMinute;
> +constexpr double msPerHour = msPerMinute * MinutesPerHour;

*Technically* there are no precision requirements on compile-time floating-point math at least as of C++11 (dunno if fixt later), so these could all miscompute, but let's set aside C++'s shortcomings here.  :-)

::: js/src/vm/Time.h
@@ +56,5 @@
>  #endif
>  
>  /* Format a time value into a buffer. Same semantics as strftime() */
>  extern size_t
> +PRMJ_FormatTime(char* buf, size_t buflen, const char* fmt, const PRMJTime* tm,

Yesssssssssssssssssss.
Attachment #8994566 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8994568 [details] [diff] [review]
bug1346211-part5-missing-tz-names.patch

Review of attachment 8994568 [details] [diff] [review]:
-----------------------------------------------------------------

::: intl/icu/GIT-INFO
@@ +3,5 @@
> +Date:   Wed Jun 20 05:34:56 2018 +0000
> +
> +    ICU-13823 Merged #13840 number parser memory overflow fix (r41541) to maint-62 for 62.1 GA.
> +    
> +    X-SVN-Rev: 41542

GIT-INFO file contents are so much worse and less-informative trash than SVN-INFO.  :-(

Searches turn up elaborate scripts to produce better info, but this is basically good enough sort of kind of, and it feels like too much pain to even think about what licensing would be on some of those scripts (even if some come off Stack Overflow which I think is all MIT-licensed or some specific very-open thing).
Attachment #8994568 - Flags: review?(jwalden+bmo) → review+
(In reply to Jeff Walden [:Waldo] from comment #14)
> I'd kind of like if js::ResetTimeZone were renamed to Internal before or
> after, rather than have to think about overloading and goo.

So in addition to the parameter which makes it different from JS:ResetTimeZone, also name it js::ResetTimeZoneInternal? (Just want to double check to make sure I didn't misunderstand the comment.)

> @@ +51,5 @@
> > + * behavior when calling JS::ResetTimeZone) or to try to reuse the previous
> > + * time zone data.
> > + */
> > +extern void
> > +ResetTimeZone(bool force);
> 
> Also the enum names are okay too, or at least not entirely awful.

Not entirely awful works for me. :-)
Comment on attachment 8994578 [details] [diff] [review]
bug1346211-part6-switch-to-icu-timezone.patch

Review of attachment 8994578 [details] [diff] [review]:
-----------------------------------------------------------------

I am not particularly excited about time zone strings changing everywhere in format stuff, but it *is* what the specs demand, so... *wheeeeeeeeeeeeeee*

::: js/src/builtin/TestingFunctions.cpp
@@ +4972,5 @@
> +    }
> +
> +    UniqueChars locale = JS_GetDefaultLocale(cx);
> +    if (!locale) {
> +        JS_ReportErrorNumberASCII(cx, GetErrorMessage, nullptr, JSMSG_DEFAULT_LOCALE_ERROR);

If I read JS_GetDefaultLocale correctly, it's inconsistent about whether it reports OOM.  :-|  DuplicateString(cx, ...) will report OOM, but JSRuntime::getDefaultLocale() returning null will not.  Fix in a followup?

@@ +6059,5 @@
> +JS_FN_HELP("setDefaultLocale", SetDefaultLocale, 1, 0,
> +"setDefaultLocale(locale)",
> +"  Set the runtime default locale to the given value.\n"
> +"  An empty string or undefined resets the runtime locale to its default value.\n"
> +"  NOTE: The input string is not validated, it must be a valid BCP-47 language tag."),

Could we at least do a super-basic sanity check like

  IsAsciiAlpha(chars[0]) &&
  std::all_of(chars, chars + length,
              [](JS::Latin1Char c) { return IsAsciiAlpha(c) || c == '=' || IsAsciiDigit(c); });

to catch simple lapses?  And then make this say "not fully validated" to give wiggle room to complain sometimes without requiring it all times.

::: js/src/jsdate.cpp
@@ +570,5 @@
>      // Spec bug: https://bugs.ecmascript.org/show_bug.cgi?id=4007
>  
>      return t - AdjustTime(t - DateTimeInfo::localTZA() - msPerHour);
>  }
> +#endif /* ENABLE_INTL_API && !MOZ_SYSTEM_ICU */

That's some big #ifdef there.

It might be nice, in a followup, to put all these functions as statics inside a class, so that the numerous helper functions could be private and clearly *only* used as helper function inside the three public static signatures actually used by the rest of the file.  (And there could be standalone helper functions to call the class-statics, so as to avoid touching the world to rename things.)

@@ +1864,5 @@
> + *
> + * The 'tz' database explicitly does not support fractional-second time zones.
> + * For example the Netherlands observed Amsterdam Mean Time, estimated to be
> + * UT +00:19:32.13, from 1909 to 1937, but in tzdata AMT is defined as exactly
> + * UT +00:19:32.

I definitely remember seeing this when looking at time zone info import goo, and I thought we had a comment about this in this code that explicitly pointed out this case, but I guess it wasn't being pointed out here.  Weird.  Maybe my comment in https://bugzilla.mozilla.org/show_bug.cgi?id=837961 never actually did make its way into source code somehow, more's the pity.

@@ +2718,5 @@
> +    char16_t tzbuf[100] = {0};
> +    size_t tzbuflen = mozilla::ArrayLength(tzbuf) - 3; // Excluding space for " (" and ")".
> +
> +    const char* locale = cx->runtime()->getDefaultLocale();
> +    if (!locale) {

You should do all this stuff *before* declaring the buffer or doing anything in it, for readability/simplicity.

@@ +2722,5 @@
> +    if (!locale) {
> +        JS_ReportErrorNumberASCII(cx, GetErrorMessage, nullptr, JSMSG_DEFAULT_LOCALE_ERROR);
> +        return nullptr;
> +    }
> +

IMO you should have this:

    char16_t tzbuf[100];
    tzbuf[0] = ' ';
    tzbuf[1] = '(';

and then

    int64_t utcMilliseconds = static_cast<int64_t>(utcTime);

    char16_t* timeZoneStart = tzbuf + 2;
    constexpr size_t remainingSpace = mozilla::ArrayLength(tzbuf) - 2 - 1; // for the trailing ')'
    if (!DateTimeInfo::timeZoneDisplayName(timeZoneStart, remainingSpace, utcMilliseconds, locale)) {
        ....
    }

so that you're logically filling in the buffer from start to end, not filling in the time zone part, then bumping back to fill in the start characters, then bumping after to fill in ')' -- and also avoiding zeroing a buffer for no reason.

Happily and unusually, none of the code in this function requires null-termination for a change, so zeroing the array is completely wasted work.

@@ +2730,5 @@
> +        return nullptr;
> +    }
> +
> +    // Reject if the result string is empty.
> +    size_t len = js_strlen(tzbuf + 2);

js_strlen(timeZoneStart) is a lot nicer than repeating a + 2 here.

@@ +2737,5 @@
> +
> +    // Parenthesize the returned display name.
> +    tzbuf[0] = ' ';
> +    tzbuf[1] = '(';
> +    tzbuf[len + 2] = ')';

And then with |timeZoneStart| existing, these can reduce to simply

    timeZoneStart[len] = ')';

without another 2 showing up explicitly.

@@ +2739,5 @@
> +    tzbuf[0] = ' ';
> +    tzbuf[1] = '(';
> +    tzbuf[len + 2] = ')';
> +
> +    return NewStringCopyN<CanGC>(cx, tzbuf, len + 3);

Could write that as |2 + len + 1| to align more closely with the memory layout of the various parts.

@@ +2810,5 @@
>      }
>  
>      return cx->names().empty;
>  }
> +#endif /* ENABLE_INTL_API && !MOZ_SYSTEM_ICU */

Again with the long #ifdef that it'd be nice to see which functions in it are actually used, and which are only helpers.

@@ +2855,5 @@
> +        //
> +        // When ICU is used to retrieve the time zone string, the localized
> +        // 'long' name format from CLDR is used. For example when the default
> +        // locale is English, PST is displayed as 'Pacific Standard Time', but
> +        // when it is Russian, 'Тихоокеанское стандартное время' is used. This

Might want to say "en-US" and "ru" for these, for concision and precision.

::: js/src/tests/non262/Date/time-zones-historic.js
@@ +109,5 @@
> +    assertEq(dt.getHours(), 0);
> +    assertEq(dt.getMinutes(), 0);
> +    assertEq(dt.getSeconds(), 0);
> +    assertEq(dt.getMilliseconds(), 0);
> +    assertEq(dt.getTimezoneOffset(), -120);    

Buzzsaw out all the trailing WS before landing.

::: js/src/vm/DateTime.cpp
@@ +169,5 @@
> +        // Tell the analysis the |pFree| function pointer called by uprv_free
> +        // cannot GC.
> +        JS::AutoSuppressGCAnalysis nogc;
> +
> +        timeZone_.reset();

I'd prefer just assigning nullptr here.

@@ +187,5 @@
>  {
>      internalUpdateTimeZoneAdjustment(true);
>  }
>  
> +js::DateTimeInfo::~DateTimeInfo() = default;

It doesn't seem useful or desirable to declare a destructor, then later define it as defaulted, when the implicitly generated default destructor is perfectly adequate.  I would remove both.

@@ +212,5 @@
> +#if ENABLE_INTL_API && !MOZ_SYSTEM_ICU
> +    int32_t rawOffset, dstOffset;
> +    UErrorCode status = U_ZERO_ERROR;
> +
> +    timeZone()->getOffset(UDate(utcSeconds * msPerSecond), false, rawOffset, dstOffset, status);

constexpr bool dateIsLocalTime = false;

and then pass that to this call.

Also: feh, reference outparams.

@@ +214,5 @@
> +    UErrorCode status = U_ZERO_ERROR;
> +
> +    timeZone()->getOffset(UDate(utcSeconds * msPerSecond), false, rawOffset, dstOffset, status);
> +    if (U_FAILURE(status))
> +        return 0;

Mm, the failure-modes are starting to pile up here.  :-\  It may be time soon to make this function truly fallible (and propagate that fallibility outward into a moderate number of callers).

@@ +359,5 @@
> +    // Once <https://unicode-org.atlassian.net/browse/ICU-13705> is fixed we
> +    // can remove this extra cast.
> +    auto* basicTz = static_cast<icu::BasicTimeZone*>(timeZone());
> +    basicTz->getOffsetFromLocal(UDate(localSeconds * msPerSecond), icu::BasicTimeZone::kFormer,
> +                                icu::BasicTimeZone::kFormer, rawOffset, dstOffset, status);

Given the absence of any documentation anywhere in ICU source for what the two kFormer arguments are intended to mean, some comments and better-descriptive constexpr variable names would be nice for them.

@@ +375,5 @@
> +
> +    int32_t rawOffset, dstOffset;
> +    UErrorCode status = U_ZERO_ERROR;
> +
> +    timeZone()->getOffset(UDate(utcSeconds * msPerSecond), false, rawOffset, dstOffset, status);

Named constexpr for false.

@@ +388,5 @@
> +{
> +    int64_t seconds = toClampedSeconds(milliseconds);
> +    return isUTC
> +           ? getOrComputeValue(localRange_, seconds, &DateTimeInfo::computeLocalOffsetMilliseconds)
> +           : getOrComputeValue(utcRange_, seconds, &DateTimeInfo::computeUTCOffsetMilliseconds);

Can't say I'm super-happy about passing around function pointers and having indirect function calls this way, but maybe we can refactor somehow to eliminate this, after this all lands.

@@ +436,5 @@
> +
> +    // Return an empty string if the display name doesn't fit into the buffer.
> +    size_t length = js_strlen(cachedName.get());
> +    if (length < buflen)
> +        mozilla::PodCopy(buf, cachedName.get(), length);

std::copy(cachedName.get(), cachedName.get() + length, buf); and #include <algorithm> instead of using PodOperations.h.

@@ +441,5 @@
> +    else
> +        length = 0;
> +
> +    buf[length] = '\0';
> +    return buf;

return true; (probably this is a compile warning/error with some compiler?)

::: js/src/vm/DateTime.h
@@ +146,5 @@
> +    /**
> +     * Return the time zone offset, including DST, in milliseconds at the
> +     * given time. The input time can be either at UTC or at local time.
> +     */
> +    static int32_t getOffsetMilliseconds(int64_t milliseconds, bool isUTC) {

Make this an enum with descriptive initializers, maybe

  enum class TimeStampFormat { UTC, Local };

@@ +220,5 @@
> +    // to only use ICU when we use our in-tree ICU copy.
> +
> +    // Use the full date-time range when we can use ICU's TimeZone support.
> +    static const int64_t MinTimeT = static_cast<int64_t>(StartOfTime / msPerSecond);
> +    static const int64_t MaxTimeT = static_cast<int64_t>(EndOfTime / msPerSecond);

constexpr for these?

@@ +229,5 @@
> +     */
> +    mozilla::UniquePtr<icu::TimeZone> timeZone_;
> +
> +    RangeCache utcRange_; // localtime-based ranges
> +    RangeCache localRange_; // UTC-based ranges

Hoo boy.  This gon' be gud.

@@ +246,5 @@
>      // 32-bit time_t, e.g. the ARM simulator on 32-bit Linux (bug 1406993).
>      // Bug 1406992 explores to use 64-bit time_t when supported by the
>      // underlying operating system.
>      static const int64_t MinTimeT = 0; /* time_t 01/01/1970 */
>      static const int64_t MaxTimeT = 2145830400; /* time_t 12/31/2037 */

constexpr too
Attachment #8994578 - Flags: review?(jwalden+bmo) → review+
(In reply to Jeff Walden [:Waldo] from comment #19)
> I am not particularly excited about time zone strings changing everywhere in
> format stuff, but it *is* what the specs demand, so... *wheeeeeeeeeeeeeee*

<https://tc39.github.io/ecma262/#sec-timezoneestring> actually doesn't give any restrictions or requirements for the time zone string, so we could also use IANA time zone abbreviations for non-Windows platforms and only CLDR names on Windows. This configuration would match more or less our current output for Date.prototype.toString without this patch. But having the same output on all platforms seemed like a better choice to me. :-)


> 
> @@ +187,5 @@
> >  {
> >      internalUpdateTimeZoneAdjustment(true);
> >  }
> >  
> > +js::DateTimeInfo::~DateTimeInfo() = default;
> 
> It doesn't seem useful or desirable to declare a destructor, then later
> define it as defaulted, when the implicitly generated default destructor is
> perfectly adequate.  I would remove both.
> 

Oops, I actually wanted to replace the "unicode/timezone.h" include in vm/DateTime.h with a forward declaration for icu::TimeZone, but forgot to add this in the uploaded patch. (And with the default inline destructor, we'd need to full declaration for icu::TimeZone because of the mozilla::UniquePtr<icu::TimeZone> member in DateTimeInfo.)
And I also thought about the implications of having an inline destructor when multiple members are now destroyed in the destructor and what that means for the destructor size. So basically a hint for the compiler not to worry about potentially inlining the destructor for DateTimeInfo. But that's probably not a real issue for DateTimeInfo, given that the destructor will only be called in two places (js::InitDateTimeState and js::FinishDateTimeState) at most.
Updated part 1 per review comments, carrying r+.
Attachment #8994557 - Attachment is obsolete: true
Attachment #9001692 - Flags: review+
Updated part 2 per review comments, carrying r+.

Also added a test case.
Attachment #8994560 - Attachment is obsolete: true
Attachment #9001695 - Flags: review+
Updated part 3 per review comments, carrying r+.
Attachment #8994561 - Attachment is obsolete: true
Attachment #9001697 - Flags: review+
Updated part 4 per review comments (*), carrying r+.

(*) Including the suggestion to add an "Invalid Date" atom.

Also changed |SecondsPerHour| and |SecondsPerDay| from |const| to |constexpr|, so that all constants in this file use |constexpr|.
Attachment #8994566 - Attachment is obsolete: true
Attachment #9001701 - Flags: review+
(In reply to Jeff Walden [:Waldo] from comment #19)
> ::: js/src/builtin/TestingFunctions.cpp
> @@ +4972,5 @@
> > +    }
> > +
> > +    UniqueChars locale = JS_GetDefaultLocale(cx);
> > +    if (!locale) {
> > +        JS_ReportErrorNumberASCII(cx, GetErrorMessage, nullptr, JSMSG_DEFAULT_LOCALE_ERROR);
> 
> If I read JS_GetDefaultLocale correctly, it's inconsistent about whether it
> reports OOM.  :-|  DuplicateString(cx, ...) will report OOM, but
> JSRuntime::getDefaultLocale() returning null will not.  Fix in a followup?
> 

Filed bug 1483961.

> ::: js/src/jsdate.cpp
> @@ +570,5 @@
> >      // Spec bug: https://bugs.ecmascript.org/show_bug.cgi?id=4007
> >  
> >      return t - AdjustTime(t - DateTimeInfo::localTZA() - msPerHour);
> >  }
> > +#endif /* ENABLE_INTL_API && !MOZ_SYSTEM_ICU */
> 
> That's some big #ifdef there.
> 
> It might be nice, in a followup, to put all these functions as statics
> inside a class, so that the numerous helper functions could be private and
> clearly *only* used as helper function inside the three public static
> signatures actually used by the rest of the file.  (And there could be
> standalone helper functions to call the class-statics, so as to avoid
> touching the world to rename things.)
> 

Filed bug 1483962.

> @@ +359,5 @@
> > +    // Once <https://unicode-org.atlassian.net/browse/ICU-13705> is fixed we
> > +    // can remove this extra cast.
> > +    auto* basicTz = static_cast<icu::BasicTimeZone*>(timeZone());
> > +    basicTz->getOffsetFromLocal(UDate(localSeconds * msPerSecond), icu::BasicTimeZone::kFormer,
> > +                                icu::BasicTimeZone::kFormer, rawOffset, dstOffset, status);
> 
> Given the absence of any documentation anywhere in ICU source for what the
> two kFormer arguments are intended to mean, some comments and
> better-descriptive constexpr variable names would be nice for them.
> 

Changed to: 

+    // ES2019 draft rev 0ceb728a1adbffe42b26972a6541fd7f398b1557
+    //
+    // 20.3.1.7 LocalTZA
+    //
+    // If |localSeconds| represents either a skipped (at a positive time zone
+    // transition) or repeated (at a negative time zone transition) locale
+    // time, it must be interpreted as a time value before the transition.
+    constexpr int32_t skippedTime = icu::BasicTimeZone::kFormer;
+    constexpr int32_t repeatedTime = icu::BasicTimeZone::kFormer;


The ICU arguments are called |nonExistingTimeOpt| and |duplicatedTimeOpt|, but I went with |skippedTime| and |repeatedTime| instead to match the language in the ES spec.


> @@ +388,5 @@
> > +{
> > +    int64_t seconds = toClampedSeconds(milliseconds);
> > +    return isUTC
> > +           ? getOrComputeValue(localRange_, seconds, &DateTimeInfo::computeLocalOffsetMilliseconds)
> > +           : getOrComputeValue(utcRange_, seconds, &DateTimeInfo::computeUTCOffsetMilliseconds);
> 
> Can't say I'm super-happy about passing around function pointers and having
> indirect function calls this way, but maybe we can refactor somehow to
> eliminate this, after this all lands.

Maybe lambdas or std::bind?

> ::: js/src/vm/DateTime.h
> @@ +146,5 @@
> > +    /**
> > +     * Return the time zone offset, including DST, in milliseconds at the
> > +     * given time. The input time can be either at UTC or at local time.
> > +     */
> > +    static int32_t getOffsetMilliseconds(int64_t milliseconds, bool isUTC) {
> 
> Make this an enum with descriptive initializers, maybe
> 
>   enum class TimeStampFormat { UTC, Local };

Went with |TimeZoneOffset|, because |TimeStampFormat| sounded too much like a thing related to string formatting.
Updated part 6 per review comments (*), carrying r+.

(*) Except the default destructor in DateTime.cpp, but this time the patch includes only a forward declaration for |icu::TimeZone|, which makes it necessary to have a non-inline destructor (comment #20).
Attachment #8994578 - Attachment is obsolete: true
Attachment #9001715 - Flags: review+
Updated clobber.
Attachment #8994579 - Attachment is obsolete: true
Attachment #9001716 - Flags: review+
The input validation for the "setDefaultLocale" testing function revealed that the LC_TIME/LC_ALL setting in js/src/tests/lib/tests.py is apparently ignored on Windows (*). Therefore added a check to shell.js so we only call "setDefaultLocale" when the input looks like BCP 47 language tag. 

(*) When tested locally it still returned the OS defaults, e.g. getDefaultLocale() returned "English-United States", which, when passed back into setDefaultLocale(), will throw an error by the basic input validation in setDefaultLocale().
Attachment #9001715 - Attachment is obsolete: true
Attachment #9002106 - Flags: review+
Forgot to qref patch for clobber file.
Attachment #9001716 - Attachment is obsolete: true
Attachment #9002114 - Flags: review+
Pushed by dvarga@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b5c5bbed871f
Part 1: Avoid returning negative zero for time components with dates before 1970. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/62d58886e8d7
Part 2: Split JS::ResetTimeZone into an external and internal implementation. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/7cad19d2d2d9
Part 3: Only refresh ICU default time zone after time zone change was observed. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/fe6637fcb770
Part 4: Split functions, create additional abstractions, and cleanup in preparation for ICU time zone changes. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/a40001643f53
Part 5: Only remove 'exemplar city' names from ICU time zone names data files. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/7b1a3a49547d
Part 6: Use ICU for all time zone computations when available. r=Waldo
https://hg.mozilla.org/integration/mozilla-inbound/rev/fd8d17192696
Part 7: Clobber after modifying ICU data file. r=clobber
Keywords: checkin-needed
Depends on: 1484535
Duplicate of this bug: 1300197
Duplicate of this bug: 1470161
Duplicate of this bug: 487897
Duplicate of this bug: 637244
Duplicate of this bug: 704486
Duplicate of this bug: 935909
Duplicate of this bug: 937261
Duplicate of this bug: 1079720
Duplicate of this bug: 1107837
Duplicate of this bug: 1122571
Duplicate of this bug: 1143398
Duplicate of this bug: 1233809
Duplicate of this bug: 1254041
Duplicate of this bug: 1304774
Duplicate of this bug: 1330307
Duplicate of this bug: 1335818
Duplicate of this bug: 1342853
Duplicate of this bug: 1365192
Duplicate of this bug: 1385643
Duplicate of this bug: 1401696
Duplicate of this bug: 1459842
Duplicate of this bug: 1514677
Regressions: 1534160
You need to log in before you can comment on or make changes to this bug.