-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[RF] Modernise RooFit's Category Classes #5502
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Starting build on |
|
Build failed on ROOT-fedora31/noimt. Warnings:
|
|
Starting build on |
|
Build failed on ROOT-fedora30/cxx14. Failing tests: |
|
Build failed on mac1015/cxx17. Failing tests:
|
|
Build failed on ROOT-debian10-i386/cxx14. Failing tests: |
|
Hi, I'll take a look today, probably in the afternoon. @hageboeck looks like there are conflicts |
|
Due to Oksana's update of gtest. Rebasing gave me the opportunity to add a fix and squash commits. Pushing soon. |
ec65c0c to
e18fc1f
Compare
|
Starting build on |
|
Build failed on mac1015/cxx17. Warnings:
|
e18fc1f to
d49f044
Compare
|
Starting build on |
1 similar comment
|
Starting build on |
|
Build failed on ROOT-fedora29/python3. Failing tests: |
|
Build failed on ROOT-performance-centos7-multicore/default. Failing tests: |
eguiraud
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell, all good. Unfortunately the important commit contains way too many changes for a RooFit-newbie to digest in a reasonable time. I appreciate the addition of tests and the verbose commit messages though 👍
| _insertionOrder.push_back(catType->getVal());\ | ||
| } }"; | ||
| #pragma read sourceClass="RooAbsCategory" targetClass="RooAbsCategory" version="[1-2]" include="RooFitLegacy/RooCatTypeLegacy.h" \ | ||
| source="RooCatType _value" target="_currentIndex" code="{ _currentIndex = onfile._value.getVal(); }" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just curious..what on earth does this do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's schema evolution. These are read rules telling ROOT how to read an old class version on file that stores category states in the 284-bytes-too-large RooCatType. You basically say:
Get me the RooCatType from file, and I do with it what's necessary.
@eguiraud
lmoneta
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me ! Very nice improvement !
RooFit categories were using the class RooCatType (: public TObject)
instead of an integer to represent category states.
This class contained an integer and a char[256] to store a category index
and a (possibly truncated) state name. This prevented fast batch access,
and increased the overhead for data storage and data loading.
From now on, categories are represented as integers.
Every state can be denoted by index or name, but instead of storing
both in each row of the dataset, only the index is stored, and the
mapping from index to state name is stored in the category instance.
In more detail:
- Replace RooCatType by an integer.
- Replace names saved in each row of dataset with RooCatType by a map in
RooAbsCategory that maps state names to numbers. This saves 64 bytes
+ the size of TObject for each event, and makes loading strings in each
event unnecessary.
- Provide legacy iterator for iterating through states instances of
RooCatType, which will be created on the fly.
- Make RooVectorDataStore store integers instead of RooCatType instances.
- With this, also reduce the amount of unnecessary members in RooVectorDataStore.
- Outline functions that polluted the header of RooVectorDataStore.
Updates for category interfaces:
- Provide new interfaces for categories to work without RooCatType.
- Add bracket operator and constructor arguments to define multiple
category states at once or using category["stateName"] = 1.
- Mark legacy interfaces exposing RooCatType as deprecated in doxygen.
- Stop including RooCatType when R__LESS_INCLUDES is set.
- Flag legacy interfaces with R__SUGGEST_ALTERNATIVE.
- Don't create RooCatType instances, unless users continue to use
the legacy interface.
- Provide functions hasIndex() and hasLabel() to check validity of
index/label.
- Provide function states() with direct access to map of category names
to category numbers.
- Add functions getCurrent{Index|Label} to make clear that they access
class state.
- Prevent clearing of shapeDirty in RooAbsCategory::evaluate(), since
derived category classes might need to recompute their shapes if one
of their input categories change.
Changes in special category classes:
- Consolidate RooMultiCategory and RooSuperCategory. These share almost
all the code, so RooSuperCategory will now use a RooMultiCategory for
state definition and retrieval. The only additional functionality for
RooSuperCategory is that it can assign states to its subcategories.
- Clean up header of RooMappedCategory. Outline as much as possible.
Move several legacy category classes to
roofitcore/{inc,src}/RooFitLegacy. These are only needed for reading old
files, and shouldn't be used actively.
To make clear that these functions retrieve the current state, the functions have been renamed to getCurrentIndex() and getCurrentState(). To not break old code, aliases with the old names have been defined, but their doxygen documentation will be listed in the section "Legacy interface". In this way, both functions can be used, but only the ones with the more expressive names show up in the main documentation.
stateNames() is supposed to be an access function for category classes. It was public, though. Users should access this map only when using RooCategory, the fundamental category type. Further, remove two unimplemented functions from RooAbsRealLValue, and fix whitespace errors.
RooCategory instances can share range definitions with clones of themselves. This is now achieved with shared_ptr, but these also had to be restored when reading current and old categories from files.
- Add test for reading v6.20 RooFit proxies.
When inserting category states using the newly added operator[], the insertion order would not be saved. That lead to a crash when generating random states from the category. (To remain backward-compatible, the insertion order is needed when randomising states. Otherwise, the order of generated states is different.)
57ca8c2 to
baef1f1
Compare
|
Starting build on |
|
Build failed on ROOT-fedora29/python3. Failing tests: |
In the previous commit, the wrong old-style implementation of `RooMultiCategory::evaluate()` was removed. Now, the only place where the `RooMultiCatIter` is still used is the `RooSuperCategory::MakeIterator()` function, which is already deprecated since the RooFit category rewrite for ROOT 6.22 (root-project#5502). It's not worth to carry around the `RooMultiCatIter` just for that, so this commit suggests to remove the deprecated `RooSuperCategory::MakeIterator()` function and get rid of the RooMultiCatIter.
In the previous commit, the wrong old-style implementation of `RooMultiCategory::evaluate()` was removed. Now, the only place where the `RooMultiCatIter` is still used is the `RooSuperCategory::MakeIterator()` function, which is already deprecated since the RooFit category rewrite for ROOT 6.22 (#5502). It's not worth to carry around the `RooMultiCatIter` just for that, so this commit suggests to remove the deprecated `RooSuperCategory::MakeIterator()` function and get rid of the RooMultiCatIter.
In the `RooSimGenContex` and `RooSimSplitGenContex` classes, there is a place where the servers of a derived category are redirected. The `mustReplaceAll` flag was set to true here. This made sense when the internal categories were still the only servers of the `RooSuperCategory`, but since the refactor in root-project#5502 this is not the case anymore. Therefore, a harmless error is printed now, which should be avoided by setting the `mustReplaceAll` flag to `false`.
In the `RooSimGenContex` and `RooSimSplitGenContex` classes, there is a place where the servers of a derived category are redirected. The `mustReplaceAll` flag was set to true here. This made sense when the internal categories were still the only servers of the `RooSuperCategory`, but since the refactor in root-project#5502 this is not the case anymore. Therefore, a harmless error is printed now, which should be avoided by setting the `mustReplaceAll` flag to `false`.
In the `RooSimGenContex` and `RooSimSplitGenContex` classes, there is a place where the servers of a derived category are redirected. The `mustReplaceAll` flag was set to true here. This made sense when the internal categories were still the only servers of the `RooSuperCategory`, but since the refactor in root-project#5502 this is not the case anymore. Therefore, a harmless error is printed now, which should be avoided by setting the `mustReplaceAll` flag to `false`.
In the `RooSimGenContex` and `RooSimSplitGenContex` classes, there is a place where the servers of a derived category are redirected. The `mustReplaceAll` flag was set to true here. This made sense when the internal categories were still the only servers of the `RooSuperCategory`, but since the refactor in #5502 this is not the case anymore. Therefore, a harmless error is printed now, which should be avoided by setting the `mustReplaceAll` flag to `false`.
In the `RooSimGenContex` and `RooSimSplitGenContex` classes, there is a place where the servers of a derived category are redirected. The `mustReplaceAll` flag was set to true here. This made sense when the internal categories were still the only servers of the `RooSuperCategory`, but since the refactor in root-project#5502 this is not the case anymore. Therefore, a harmless error is printed now, which should be avoided by setting the `mustReplaceAll` flag to `false`.
In the `RooSimGenContex` and `RooSimSplitGenContex` classes, there is a place where the servers of a derived category are redirected. The `mustReplaceAll` flag was set to true here. This made sense when the internal categories were still the only servers of the `RooSuperCategory`, but since the refactor in root-project#5502 this is not the case anymore. Therefore, a harmless error is printed now, which should be avoided by setting the `mustReplaceAll` flag to `false`.
In the `RooSimGenContex` and `RooSimSplitGenContex` classes, there is a place where the servers of a derived category are redirected. The `mustReplaceAll` flag was set to true here. This made sense when the internal categories were still the only servers of the `RooSuperCategory`, but since the refactor in #5502 this is not the case anymore. Therefore, a harmless error is printed now, which should be avoided by setting the `mustReplaceAll` flag to `false`.
In the `RooSimGenContex` and `RooSimSplitGenContex` classes, there is a place where the servers of a derived category are redirected. The `mustReplaceAll` flag was set to true here. This made sense when the internal categories were still the only servers of the `RooSuperCategory`, but since the refactor in root-project#5502 this is not the case anymore. Therefore, a harmless error is printed now, which should be avoided by setting the `mustReplaceAll` flag to `false`.
For reviewers:
The most important commit is the first, and specifically the files
RooAbsCategory.h, RooCategory.hThe former is the base class of all categories, the latter the user-facing implementation.
Here's what happened:
RooFit categories were using the class RooCatType (: public TObject)
instead of an integer to represent category states.
This class contained an integer and a char[256] to store a category index
and a (possibly truncated) state name. This prevented fast batch access,
and increased the overhead for data storage and data loading.
From now on, categories are represented as integers.
Every state can be denoted by index or name, but instead of storing
both packed into an object in each row of the dataset, only the index is stored, and the
mapping from index to state name is stored in the category instance.
Since all the category classes were happily using the RooCatType, several
parts had to be touched to convert everything to using integers. This feature
has been cooking since March, and the interface for category classes was
discussed in the PPP, but nevertheless, have a double look at the above-mentioned
headers.
In more detail:
RooAbsCategory that maps state names to numbers. This saves 64 bytes
event unnecessary.
RooCatType, which will be created on the fly.
Updates for category interfaces:
Provide new interfaces for categories to work without RooCatType.
Add bracket operator and constructor arguments to define multiple
category states at once or using category["stateName"] = 1.
Mark legacy interfaces exposing RooCatType as deprecated in doxygen.
Stop including RooCatType when R__LESS_INCLUDES is set.
Flag legacy interfaces with R__SUGGEST_ALTERNATIVE.
Don't create RooCatType instances, unless users continue to use
the legacy interface.
Provide functions hasIndex() and hasLabel() to check validity of
index/label.
Provide function states() with direct access to map of category names
to category numbers.
Add functions getCurrent{Index|Label} to make clear that they access
class state.
Prevent clearing of shapeDirty in RooAbsCategory::evaluate(), since
derived category classes might need to recompute their shapes if one
of their input categories change.
Changes in special category classes:
all the code, so RooSuperCategory will now use a RooMultiCategory for
state definition and retrieval. The only additional functionality for
RooSuperCategory is that it can assign states to its subcategories.