Lift expensive Regex construction from DateFormat method body.
Constructing the Regex touched in this commit can represent a
significant fraction (e.g. half or better) of the runtime of
the DateFormat method touched in this commit. To make this
DateFormat method more efficient, let's lift that Regex
construction out of that method body.
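
As a rough way to see the cost described above, the following sketch compares constructing the Regex on every call against reusing one built once. It is an illustration under stated assumptions, not part of the commit: `count_fresh`, `count_cached`, `LETTERS`, `PATTERN`, and `CACHED_REGEX` are hypothetical names, and it reads the unexported `Dates.CONVERSION_SPECIFIERS` table, which is an implementation detail. Timings will vary by machine and Julia version, so none are quoted here.

```julia
using Dates

# Build the pattern the same way the commit does. `Dates.CONVERSION_SPECIFIERS`
# is an internal (unexported) table, so this relies on an implementation detail.
const LETTERS = String(collect(keys(Dates.CONVERSION_SPECIFIERS)))
const PATTERN = "(?<!\\\\)([\\Q$LETTERS\\E])\\1*"

# Hypothetical helper: construct (and compile) a fresh Regex on every call.
count_fresh(f) = count(_ -> true, eachmatch(Regex(PATTERN), f))

# Hypothetical helper: reuse a Regex that was constructed once.
const CACHED_REGEX = Regex(PATTERN)
count_cached(f) = count(_ -> true, eachmatch(CACHED_REGEX, f))

f = "yyyy-mm-dd HH:MM:SS"

# Both variants must find the same runs of format specifiers.
@assert count_fresh(f) == count_cached(f)

# Warm up so that compiling the helpers is not included in the timings.
count_fresh(f); count_cached(f)

@time for _ in 1:10_000; count_fresh(f); end
@time for _ in 1:10_000; count_cached(f); end
```

The sketch isolates only the Regex construction; the commit additionally guards the cached Regex with a lock and a hash check so that it can be rebuilt when `CONVERSION_SPECIFIERS` changes.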
Sacha0 committed Jan 4, 2022
1 parent a3c2798 commit 7fb5bb8
Showing 1 changed file with 31 additions and 2 deletions.
33 changes: 31 additions & 2 deletions stdlib/Dates/src/io.jl
@@ -332,6 +332,23 @@ const CONVERSION_TRANSLATIONS = IdDict{Type, Any}(
     Time => (Hour, Minute, Second, Millisecond, Microsecond, Nanosecond, AMPM),
 )
 
+# The `DateFormat(format, locale)` method just below consumes the following Regex.
+# Constructing this Regex is fairly expensive; doing so in the method itself can
+# consume half or better of `DateFormat(format, locale)`'s runtime. So instead we
+# construct and cache it outside the method body. Note, however, that when
+# `keys(CONVERSION_SPECIFIERS)` changes, the cached Regex must be updated accordingly;
+# hence the mutability (Ref-ness) of the cache, the helper method with which to populate
+# the cache, the cache of the hash of `keys(CONVERSION_SPECIFIERS)` (to facilitate checking
+# for changes), and the lock (to maintain consistency of these objects across threads when
+# threads simultaneously modify `CONVERSION_SPECIFIERS` and construct `DateFormat`s).
+function compute_dateformat_regex(conversion_specifiers)
+    letters = String(collect(keys(conversion_specifiers)))
+    return Regex("(?<!\\\\)([\\Q$letters\\E])\\1*")
+end
+const DATEFORMAT_REGEX_LOCK = ReentrantLock()
+const DATEFORMAT_REGEX_HASH = Ref(hash(keys(CONVERSION_SPECIFIERS)))
+const DATEFORMAT_REGEX_CACHE = Ref(compute_dateformat_regex(CONVERSION_SPECIFIERS))
+
 """
     DateFormat(format::AbstractString, locale="english") -> DateFormat
@@ -379,8 +396,20 @@ function DateFormat(f::AbstractString, locale::DateLocale=ENGLISH)
     prev = ()
     prev_offset = 1
 
-    letters = String(collect(keys(CONVERSION_SPECIFIERS)))
-    for m in eachmatch(Regex("(?<!\\\\)([\\Q$letters\\E])\\1*"), f)
+    # To understand this block, please see the comments attached to the definitions of
+    # DATEFORMAT_REGEX_LOCK, DATEFORMAT_REGEX_HASH, and DATEFORMAT_REGEX_CACHE.
+    lock(DATEFORMAT_REGEX_LOCK)
+    try
+        dateformat_regex_hash = hash(keys(CONVERSION_SPECIFIERS))
+        if dateformat_regex_hash != DATEFORMAT_REGEX_HASH[]
+            DATEFORMAT_REGEX_HASH[] = dateformat_regex_hash
+            DATEFORMAT_REGEX_CACHE[] = compute_dateformat_regex(CONVERSION_SPECIFIERS)
+        end
+    finally
+        unlock(DATEFORMAT_REGEX_LOCK)
+    end
+
+    for m in eachmatch(DATEFORMAT_REGEX_CACHE[], f)
         tran = replace(f[prev_offset:prevind(f, m.offset)], r"\\(.)" => s"\1")
 
         if !isempty(prev)