Closed Bug 1068014 Opened 6 years ago Closed 6 years ago

skip strptime() in datetime_from() if the date is in a standard format

Categories

(Bugzilla :: Bugzilla-General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Bugzilla 5.0

People

(Reporter: glob, Assigned: glob)

Details

Attachments

(1 file, 1 obsolete file)

Bugzilla::Util::datetime_from() always calls strptime to parse the date.

strptime is a generic date string parser, however most of the dates that we parse are strings that have been read from the database and are in a known format.

date parsing is about 8 times quicker if we use a regex for the known formats first, and fall back to strptime if it doesn't match.
Attached patch 1068014_1.patch (obsolete) — Splinter Review
Attachment #8490073 - Flags: review?(dylan)
Comment on attachment 8490073 [details] [diff] [review]
1068014_1.patch

>+    if ($date =~ /^(\d\d\d\d)[\.-](\d\d)[\.-](\d\d) (\d\d):(\d\d):(\d\d)$/) {

Could you improve your regexp to also accept dates in the YYYY-MM-DD format, i.e. without a time specified? Your regexp would become

  /^(\d\d\d\d)[\.-](\d\d)[\.-](\d\d)(?: (\d\d):(\d\d):(\d\d))?$/

and would behave exactly the same way as strptime().
Summary: skip strptime im datetime_from if the date is in a standard format → skip strptime() in datetime_from() if the date is in a standard format
Comment on attachment 8490073 [details] [diff] [review]
1068014_1.patch

Review of attachment 8490073 [details] [diff] [review]:
-----------------------------------------------------------------

I think the regexp only matching date + time is acceptable with the justification that full datetime strings are much more common, and as an optimization we want to target the common case.

::: Bugzilla/Util.pm
@@ +554,5 @@
>  
> +    my @time;
> +    # Most dates will be in this format, avoid strptime's generic parser
> +    if ($date =~ /^(\d\d\d\d)[\.-](\d\d)[\.-](\d\d) (\d\d):(\d\d):(\d\d)$/) {
> +        @time = ($6, $5, $4, $3, $2 - 1, $1 - 1900, undef);

I would consider using \d{4} and \d{2}, which allow the regex engine to perform the match in fewer operations.

Given correct input, /^(\d{4})[\.-](\d{2})[\.-](\d{2}) (\d{2}):(\d{2}):(\d{2})$/ will match in just 39 steps,
but /^(\d\d\d\d)[\.-](\d\d)[\.-](\d\d) (\d\d):(\d\d):(\d\d)$/ will take 55 steps.
For non-matching strings, the difference is greater.
Attachment #8490073 - Flags: review?(dylan) → review+
Is it still acceptable to set the milestone for this to 5.0?
Flags: approval?
(In reply to Dylan William Hardison [:dylan] from comment #3)
> I think the regexp only matching date + time is acceptable with the
> justification that full datetime strings are much more common

My suggestion is so trivial to implement that it would be a shame to not do it. We have date-only fields (deadline + custom fields) which call datetime_from(), see bug 1046168, and so you get some perf win here too.
Attached patch 1068014_2.patchSplinter Review
given how easy it is, i think it's worth supporting timeless dates.

here's a revised patch which incorporates that, as well as the regex repetition change.
Attachment #8490073 - Attachment is obsolete: true
Attachment #8490816 - Flags: review?(dylan)
Comment on attachment 8490816 [details] [diff] [review]
1068014_2.patch

Review of attachment 8490816 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good

r=dylan
Attachment #8490816 - Flags: review?(dylan) → review+
To ssh://gitolite3@git.mozilla.org/bugzilla/bugzilla.git
   1bead98..af46080  master -> master

To ssh://gitolite3@git.mozilla.org/webtools/bmo/bugzilla.git
   c8c2d8e..4c16c87  master -> master
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: approval? → approval+
Resolution: --- → FIXED
Target Milestone: --- → Bugzilla 5.0
You need to log in before you can comment on or make changes to this bug.