-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Conversation
We would like to be able to link against versions of ICU installed as a "operating system level library" which means we can't take a dependency on any C++ APIs. This change moves away from icu::Calendar in favor of UCalendar. I also introduce a small helper template to manage the lifetime of ICU resources.
Part of the effort to remove our usage of C++ ICU APIs. The major issue here was that the C++ API used char*'s for some things whereas the C API used UChar*'s so we needed to define our own copies of some constants. We also need to manage a buffer ourselves, instead of being able to use the underlying buffer of a retured UnicodeString.
Remove uses of the C++ DateFormat and SimpleDateFormat classes in favor of UDateFormat. As part of this change, it was easier to move some of the code that converts an ICU format string to a .NET Style format string from native code up to managed code. This code used UnicodeString and we'll need to move away from that as well as we remove all the C++ usage.
This change removes NumberFormat in favor of UNumberFormat. There is a bit of work that needs to happen in order to keep the normalization code we use to convert an ICU pattern so to something we can match against working. Instead of UnicodeStrings, the input to the normalization function is now a UChar* and we build up a std::string during normalization. This allows us to also skip a conversion from UChar* back to char* so we can find the correct pattern in our collection of patterns to examine.
Getting the regular eras is straight forward, we can do the thing we do for other locale data and just ask ICU using a specific UDateFormatSymbolType. For abbreviated eras, there's no C API, but we can try to just read the data from ICU resources and fall back to the standard width eras if that doesn't work.
To prepare for removing icu::Locale in favor of just using the id directly, remove all the uses of Locale methods except for .getName(). We now use GetLocale to create a Locale but then turn it into a char* for all the helper methods. After this change, we can update GetLocale to do locale parsing into a char buffer and remove all the locale.getName() calls with just `locale'.
Remove all the uses of the icu::Locale type in favor of just using a char* which is the raw locale id (which is exactly what all the ICU C apis use for a locale). The meat of this change si in locale.cpp to actually handle doing the conversion from UChar* to char*. The rest of the places are dealing with the fallout (GetLocale now has a different signiture and the .getName() dance is no longer needed as we have a raw locale name all the time now).
Use "= delete" syntax to make it clear the IcuHolder copy constructor and assignment opperators are removed. Remove superfluous "public" modifier on the struct closers used by the IcuHolders.
OSX ships with a copy of ICU (their globalization APIs are built on top of it). Since we only use stable c based APIs, we can link against it using the methods described in using a System ICU in the ICU User's Guide (basically we disable function renaming, don't use C++ and only use stable APIs). The ICU headers are not part of the SDK, so we continue to need ICU installed via Homebrew as a build time dependency. Fixes dotnet/corefx#3849
typedef IcuHolder<UDateFormat, UDateFormatCloser> UDateFormatHolder; | ||
typedef IcuHolder<UNumberFormat, UNumberFormatCloser> UNumberFormatHolder; | ||
typedef IcuHolder<ULocaleDisplayNames, ULocaleDisplayNamesCloser> ULocaleDisplayNamesHolder; | ||
typedef IcuHolder<UResourceBundle, UResourceBundleCloser> UResourceBundleHolder; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, you could cut down on the ceremony by just making the closers overloads of an IcuClose function. That would eliminate the extra template arg on IcuHolder, which you could then use directly as IcuHolder<T>
without the typedefs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I will do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I am not sure if this will work. All of these Uxxx things are typedefs of void*, so I don't think we can overload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yuck (talking to ICU, not you). Ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code did look nicer, so maybe partial template specialization is the way to go in the future. I doubt I will make any changes for RC1 however, what we have works even if it is ugly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could just be trading one set of ceremony for another.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how specialization would work any better than overloading if these are all really the same type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, I was not thinking clearly.
This matches what we do in other places in calendarData.cpp, the RAII pattern will make it easier to not leak memory.
Looks great, Matt. |
cc: @eerhardt |
Ugh, tagged the wrong Eric at first. Feel free to ignore this @ericeil. |
{ | ||
void operator()(UNumberFormat* pNumberFormat) const | ||
{ | ||
udat_close(pNumberFormat); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does NumberFormat have the same close as DateFormat or is this an all too easy mistake to make with the "everything is really void*" silliness?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Will fix.
message(FATAL_ERROR "Cannot find libicui18n, try installing libicu-dev (or the appropriate package for your platform)") | ||
return() | ||
if(NOT CLR_CMAKE_PLATFORM_DARWIN) | ||
find_library(ICUUC NAMES icuuc PATHS ${ICU_HOMEBREW_LIB_PATH}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to have "HOMEBREW_LIB_PATH" for non-Darwin platforms?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you feel about switching this to use find_package(ICU REQUIRED), and then use the ${ICU_LIBRARIES} in target_link_libraries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed the HOMEBREW_LIB_PATH thing. I will look to into find_package
in a follow-up to this PR.
case AbbrevMonthNames: | ||
return EnumMonths( | ||
locale, calendarId, DateFormatSymbols::STANDALONE, DateFormatSymbols::ABBREVIATED, callback, context); | ||
return EnumSymbols(locale, calendarId, UDAT_STANDALONE_SHORT_MONTHS, 0, callback, context); | ||
case SuperShortDayNames: | ||
#ifdef HAVE_DTWIDTHTYPE_SHORT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check in the cmake file is out of date with what is being used in code here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
LGTM - I just had a few minor questions. |
There were a few problems that needed to be addressed: - Our detection logic around testing if ICU supported a feature was still checking for C++ stuff instead of the coresponding C code (which we ended up using). - There was some cleanup we could do now that the OSX and other Unix builds were split apart
3eac0d9
to
392f5f4
Compare
Remove OSX Homebrew ICU dependency
👏 |
…-plus Remove OSX Homebrew ICU dependency Commit migrated from dotnet/coreclr@e93beb1
This change converts the System.Globalization.Native shim to use the ICU C API (instead of C++). Once that is done, we are able to link against the version of ICU that OSX uses for all of their locale APIs.
Please take a look @stephentoub @ericeil
The individual commits have some duplicated work, I did it this way so I could make progress incrementally, continue to test and slowly make progress driving towards C. I'm not sure if it is easier to review the 8 files or the 15 commits.
I ran all of the corefx tests locally on both Ubuntu and OSX.