Skip to content

Conversation

@hageboeck
Copy link
Member

For reviewers:
The most important commit is the first, and specifically the files
RooAbsCategory.h, RooCategory.h
The former is the base class of all categories, the latter the user-facing implementation.

Here's what happened:
RooFit categories were using the class RooCatType (: public TObject)
instead of an integer to represent category states.
This class contained an integer and a char[256] to store a category index
and a (possibly truncated) state name. This prevented fast batch access,
and increased the overhead for data storage and data loading.

From now on, categories are represented as integers.
Every state can be denoted by index or name, but instead of storing
both packed into an object in each row of the dataset, only the index is stored, and the
mapping from index to state name is stored in the category instance.

Since all the category classes were happily using the RooCatType, several
parts had to be touched to convert everything to using integers. This feature
has been cooking since March, and the interface for category classes was
discussed in the PPP, but nevertheless, have a double look at the above-mentioned
headers.

In more detail:

  • Replace RooCatType by an integer.
  • Replace names saved in each row of dataset with RooCatType by a map in
    RooAbsCategory that maps state names to numbers. This saves 64 bytes
    • the size of TObject for each event, and makes loading strings in each
      event unnecessary.
  • Provide legacy iterator for iterating through states instances of
    RooCatType, which will be created on the fly.
  • Make RooVectorDataStore store integers instead of RooCatType instances.
    • With this, also reduce the amount of unnecessary members in RooVectorDataStore.
    • Outline functions that polluted the header of RooVectorDataStore.

Updates for category interfaces:

  • Provide new interfaces for categories to work without RooCatType.

  • Add bracket operator and constructor arguments to define multiple
    category states at once or using category["stateName"] = 1.

  • Mark legacy interfaces exposing RooCatType as deprecated in doxygen.

  • Stop including RooCatType when R__LESS_INCLUDES is set.

  • Flag legacy interfaces with R__SUGGEST_ALTERNATIVE.

  • Don't create RooCatType instances, unless users continue to use
    the legacy interface.

  • Provide functions hasIndex() and hasLabel() to check validity of
    index/label.

  • Provide function states() with direct access to map of category names
    to category numbers.

  • Add functions getCurrent{Index|Label} to make clear that they access
    class state.

  • Prevent clearing of shapeDirty in RooAbsCategory::evaluate(), since
    derived category classes might need to recompute their shapes if one
    of their input categories change.

Changes in special category classes:

  • Consolidate RooMultiCategory and RooSuperCategory. These share almost
    all the code, so RooSuperCategory will now use a RooMultiCategory for
    state definition and retrieval. The only additional functionality for
    RooSuperCategory is that it can assign states to its subcategories.
  • Clean up header of RooMappedCategory. Outline as much as possible.

@hageboeck hageboeck requested review from eguiraud and lmoneta April 29, 2020 18:20
@hageboeck hageboeck self-assigned this Apr 29, 2020
@phsft-bot
Copy link

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos7-multicore/default, ROOT-fedora29/python3, ROOT-fedora30/cxx14, ROOT-fedora31/noimt, ROOT-ubuntu16/nortcxxmod, mac1015/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Copy link

Build failed on ROOT-fedora31/noimt.
Running on root-fedora-31-1.cern.ch:/build/workspace/root-pullrequests-build
See cdash.
See console output.

Warnings:

  • [2020-04-29T18:25:25.802Z] /build/workspace/root-pullrequests-build/root/roofit/roofitcore/src/RooCategory.cxx:491:123: warning: declaration of ‘const std::vector<int>& b’ shadows a parameter [-Wshadow]
  • [2020-04-29T18:25:25.802Z] /build/workspace/root-pullrequests-build/root/roofit/roofitcore/src/RooCategory.cxx:491:123: warning: declaration of ‘const std::vector<int>& a’ shadows a parameter [-Wshadow]

@phsft-bot
Copy link

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos7-multicore/default, ROOT-fedora29/python3, ROOT-fedora30/cxx14, ROOT-fedora31/noimt, ROOT-ubuntu16/nortcxxmod, mac1015/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Copy link

Build failed on ROOT-fedora30/cxx14.
Running on root-fedora30-2.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See cdash.
See console output.

Failing tests:

@phsft-bot
Copy link

Build failed on ROOT-debian10-i386/cxx14.
Running on pcepsft10.dyndns.cern.ch:/build/workspace/root-pullrequests-build
See cdash.
See console output.

Failing tests:

@eguiraud
Copy link
Contributor

eguiraud commented May 4, 2020

Hi, I'll take a look today, probably in the afternoon. @hageboeck looks like there are conflicts

@hageboeck
Copy link
Member Author

Due to Oksana's update of gtest. Rebasing gave me the opportunity to add a fix and squash commits. Pushing soon.

@hageboeck hageboeck force-pushed the reworkCategories branch from ec65c0c to e18fc1f Compare May 4, 2020 08:54
@phsft-bot
Copy link

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos7-multicore/default, ROOT-fedora29/python3, ROOT-fedora30/cxx14, ROOT-fedora31/noimt, ROOT-ubuntu16/nortcxxmod, mac1015/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Copy link

Build failed on mac1015/cxx17.
Running on macphsft18.dyndns.cern.ch:/Users/sftnight/build/jenkins/workspace/root-pullrequests-build
See cdash.
See console output.

Warnings:

  • [2020-05-04T09:46:08.673Z] /Users/sftnight/build/jenkins/workspace/root-pullrequests-build/root/roofit/roofitcore/test/testProxiesAndCategories.cxx:116:1: warning: 'InstantiateTestCase_P_IsDeprecated' is deprecated: INSTANTIATE_TEST_CASE_P is deprecated, please use INSTANTIATE_TEST_SUITE_P [-Wdeprecated-declarations]

@hageboeck hageboeck force-pushed the reworkCategories branch from e18fc1f to d49f044 Compare May 4, 2020 11:09
@phsft-bot
Copy link

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos7-multicore/default, ROOT-fedora29/python3, ROOT-fedora30/cxx14, ROOT-fedora31/noimt, ROOT-ubuntu16/nortcxxmod, mac1015/cxx17, windows10/cxx14
How to customize builds

1 similar comment
@phsft-bot
Copy link

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos7-multicore/default, ROOT-fedora29/python3, ROOT-fedora30/cxx14, ROOT-fedora31/noimt, ROOT-ubuntu16/nortcxxmod, mac1015/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Copy link

Build failed on ROOT-fedora29/python3.
Running on root-fedora29-3.cern.ch:/build/workspace/root-pullrequests-build
See cdash.
See console output.

Failing tests:

@phsft-bot
Copy link

Build failed on ROOT-performance-centos7-multicore/default.
Running on olhswep22.cern.ch:/data/sftnight/workspace/root-pullrequests-build
See cdash.
See console output.

Failing tests:

Copy link
Contributor

@eguiraud eguiraud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, all good. Unfortunately the important commit contains way too many changes for a RooFit-newbie to digest in a reasonable time. I appreciate the addition of tests and the verbose commit messages though 👍

_insertionOrder.push_back(catType->getVal());\
} }";
#pragma read sourceClass="RooAbsCategory" targetClass="RooAbsCategory" version="[1-2]" include="RooFitLegacy/RooCatTypeLegacy.h" \
source="RooCatType _value" target="_currentIndex" code="{ _currentIndex = onfile._value.getVal(); }"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just curious..what on earth does this do?

Copy link
Member Author

@hageboeck hageboeck May 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's schema evolution. These are read rules telling ROOT how to read an old class version on file that stores category states in the 284-bytes-too-large RooCatType. You basically say:
Get me the RooCatType from file, and I do with it what's necessary.
@eguiraud

Copy link
Member

@lmoneta lmoneta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me ! Very nice improvement !

@hageboeck hageboeck added this to the 6.22 milestone May 4, 2020
hageboeck added 5 commits May 4, 2020 18:53
RooFit categories were using the class RooCatType (: public TObject)
instead of an integer to represent category states.
This class contained an integer and a char[256] to store a category index
and a (possibly truncated) state name. This prevented fast batch access,
and increased the overhead for data storage and data loading.

From now on, categories are represented as integers.
Every state can be denoted by index or name, but instead of storing
both in each row of the dataset, only the index is stored, and the
mapping from index to state name is stored in the category instance.

In more detail:
- Replace RooCatType by an integer.
- Replace names saved in each row of dataset with RooCatType by a map in
  RooAbsCategory that maps state names to numbers. This saves 64 bytes
  + the size of TObject for each event, and makes loading strings in each
  event unnecessary.
- Provide legacy iterator for iterating through states instances of
  RooCatType, which will be created on the fly.
- Make RooVectorDataStore store integers instead of RooCatType instances.
  - With this, also reduce the amount of unnecessary members in RooVectorDataStore.
  - Outline functions that polluted the header of RooVectorDataStore.

Updates for category interfaces:
- Provide new interfaces for categories to work without RooCatType.
- Add bracket operator and constructor arguments to define multiple
  category states at once or using category["stateName"] = 1.
- Mark legacy interfaces exposing RooCatType as deprecated in doxygen.
- Stop including RooCatType when R__LESS_INCLUDES is set.
- Flag legacy interfaces with R__SUGGEST_ALTERNATIVE.
- Don't create RooCatType instances, unless users continue to use
  the legacy interface.
- Provide functions hasIndex() and hasLabel() to check validity of
  index/label.
- Provide function states() with direct access to map of category names
  to category numbers.
- Add functions getCurrent{Index|Label} to make clear that they access
  class state.

- Prevent clearing of shapeDirty in RooAbsCategory::evaluate(), since
  derived category classes might need to recompute their shapes if one
  of their input categories change.

Changes in special category classes:
- Consolidate RooMultiCategory and RooSuperCategory. These share almost
  all the code, so RooSuperCategory will now use a RooMultiCategory for
  state definition and retrieval. The only additional functionality for
  RooSuperCategory is that it can assign states to its subcategories.
- Clean up header of RooMappedCategory. Outline as much as possible.
Move several legacy category classes to
roofitcore/{inc,src}/RooFitLegacy. These are only needed for reading old
files, and shouldn't be used actively.
To make clear that these functions retrieve the current state, the
functions have been renamed to getCurrentIndex() and getCurrentState().
To not break old code, aliases with the old names have been defined, but
their doxygen documentation will be listed in the section "Legacy
interface". In this way, both functions can be used, but only the ones
with the more expressive names show up in the main documentation.
hageboeck added 7 commits May 4, 2020 19:16
stateNames() is supposed to be an access function for category classes.
It was public, though.
Users should access this map only when using RooCategory, the fundamental
category type.

Further, remove two unimplemented functions from RooAbsRealLValue, and
fix whitespace errors.
RooCategory instances can share range definitions with clones of
themselves. This is now achieved with shared_ptr, but these also had to
be restored when reading current and old categories from files.
- Add test for reading v6.20 RooFit proxies.
When inserting category states using the newly added operator[], the
insertion order would not be saved. That lead to a crash when generating
random states from the category.
(To remain backward-compatible, the insertion order is needed when
randomising states. Otherwise, the order of generated states is
different.)
@hageboeck hageboeck force-pushed the reworkCategories branch from 57ca8c2 to baef1f1 Compare May 4, 2020 17:40
@phsft-bot
Copy link

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos7-multicore/default, ROOT-fedora29/python3, ROOT-fedora30/cxx14, ROOT-fedora31/noimt, ROOT-ubuntu16/nortcxxmod, mac1015/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Copy link

Build failed on ROOT-fedora29/python3.
Running on root-fedora29-2.cern.ch:/build/workspace/root-pullrequests-build
See cdash.
See console output.

Failing tests:

@hageboeck hageboeck merged commit cad7217 into root-project:master May 5, 2020
@hageboeck hageboeck deleted the reworkCategories branch May 5, 2020 06:02
guitargeek added a commit to guitargeek/root that referenced this pull request Aug 4, 2022
In the previous commit, the wrong old-style implementation of
`RooMultiCategory::evaluate()` was removed.

Now, the only place where the `RooMultiCatIter` is still used is the
`RooSuperCategory::MakeIterator()` function, which is already deprecated
since the RooFit category rewrite for ROOT 6.22 (root-project#5502).

It's not worth to carry around the `RooMultiCatIter` just for that, so
this commit suggests to remove the deprecated
`RooSuperCategory::MakeIterator()` function and get rid of the
RooMultiCatIter.
guitargeek added a commit that referenced this pull request Aug 5, 2022
In the previous commit, the wrong old-style implementation of
`RooMultiCategory::evaluate()` was removed.

Now, the only place where the `RooMultiCatIter` is still used is the
`RooSuperCategory::MakeIterator()` function, which is already deprecated
since the RooFit category rewrite for ROOT 6.22 (#5502).

It's not worth to carry around the `RooMultiCatIter` just for that, so
this commit suggests to remove the deprecated
`RooSuperCategory::MakeIterator()` function and get rid of the
RooMultiCatIter.
guitargeek added a commit to guitargeek/root that referenced this pull request Jan 12, 2023
In the `RooSimGenContex` and `RooSimSplitGenContex` classes, there is a
place where the servers of a derived category are redirected. The
`mustReplaceAll` flag was set to true here. This made sense when the
internal categories were still the only servers of the
`RooSuperCategory`, but since the refactor in root-project#5502 this is not the case
anymore. Therefore, a harmless error is printed now, which should be
avoided by setting the `mustReplaceAll` flag to `false`.
guitargeek added a commit to guitargeek/root that referenced this pull request Jan 12, 2023
In the `RooSimGenContex` and `RooSimSplitGenContex` classes, there is a
place where the servers of a derived category are redirected. The
`mustReplaceAll` flag was set to true here. This made sense when the
internal categories were still the only servers of the
`RooSuperCategory`, but since the refactor in root-project#5502 this is not the case
anymore. Therefore, a harmless error is printed now, which should be
avoided by setting the `mustReplaceAll` flag to `false`.
guitargeek added a commit to guitargeek/root that referenced this pull request Jan 13, 2023
In the `RooSimGenContex` and `RooSimSplitGenContex` classes, there is a
place where the servers of a derived category are redirected. The
`mustReplaceAll` flag was set to true here. This made sense when the
internal categories were still the only servers of the
`RooSuperCategory`, but since the refactor in root-project#5502 this is not the case
anymore. Therefore, a harmless error is printed now, which should be
avoided by setting the `mustReplaceAll` flag to `false`.
guitargeek added a commit that referenced this pull request Jan 13, 2023
In the `RooSimGenContex` and `RooSimSplitGenContex` classes, there is a
place where the servers of a derived category are redirected. The
`mustReplaceAll` flag was set to true here. This made sense when the
internal categories were still the only servers of the
`RooSuperCategory`, but since the refactor in #5502 this is not the case
anymore. Therefore, a harmless error is printed now, which should be
avoided by setting the `mustReplaceAll` flag to `false`.
guitargeek added a commit to guitargeek/root that referenced this pull request Jan 18, 2023
In the `RooSimGenContex` and `RooSimSplitGenContex` classes, there is a
place where the servers of a derived category are redirected. The
`mustReplaceAll` flag was set to true here. This made sense when the
internal categories were still the only servers of the
`RooSuperCategory`, but since the refactor in root-project#5502 this is not the case
anymore. Therefore, a harmless error is printed now, which should be
avoided by setting the `mustReplaceAll` flag to `false`.
guitargeek added a commit to guitargeek/root that referenced this pull request Jan 18, 2023
In the `RooSimGenContex` and `RooSimSplitGenContex` classes, there is a
place where the servers of a derived category are redirected. The
`mustReplaceAll` flag was set to true here. This made sense when the
internal categories were still the only servers of the
`RooSuperCategory`, but since the refactor in root-project#5502 this is not the case
anymore. Therefore, a harmless error is printed now, which should be
avoided by setting the `mustReplaceAll` flag to `false`.
guitargeek added a commit that referenced this pull request Jan 19, 2023
In the `RooSimGenContex` and `RooSimSplitGenContex` classes, there is a
place where the servers of a derived category are redirected. The
`mustReplaceAll` flag was set to true here. This made sense when the
internal categories were still the only servers of the
`RooSuperCategory`, but since the refactor in #5502 this is not the case
anymore. Therefore, a harmless error is printed now, which should be
avoided by setting the `mustReplaceAll` flag to `false`.
omazapa pushed a commit to omazapa/root that referenced this pull request Apr 13, 2023
In the `RooSimGenContex` and `RooSimSplitGenContex` classes, there is a
place where the servers of a derived category are redirected. The
`mustReplaceAll` flag was set to true here. This made sense when the
internal categories were still the only servers of the
`RooSuperCategory`, but since the refactor in root-project#5502 this is not the case
anymore. Therefore, a harmless error is printed now, which should be
avoided by setting the `mustReplaceAll` flag to `false`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants