Skip to content

Conversation

@guitargeek
Copy link
Contributor

@guitargeek guitargeek commented Jan 23, 2023

This is a backport of all the relevant RooFit PRs that were recently merged to master to v6-28-00-patches (in the right order, to not have the commit history diverge too much).

  1. [math] KahanSum release notes for 6.28 #12089
  2. [RF] Fix problems with dataset generation from a nested RooSimultaneous with proto data #12022
    Commits 3 and 4 that have not been backported yet, as the next PR in this list depends on that change (everything is backported from that PR now except the last commit)
  3. [RF] New way to fix compute graph with RooAbsArg::compileForNormSet() #12079
  4. [RF] Improve implementation of RooAbsArg::redirectServers() #12040

Related to #11856.

@phsft-bot
Copy link

Starting build on ROOT-debian10-i386/soversion, ROOT-performance-centos8-multicore/cxx17, ROOT-ubuntu18.04/nortcxxmod, ROOT-ubuntu2004/python3, mac12/noimt, mac11/cxx14, windows10/cxx14
How to customize builds

@phsft-bot
Copy link

Build failed on ROOT-ubuntu18.04/nortcxxmod.
Running on sft-ubuntu-1804-2.cern.ch:/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2023-01-23T14:51:01.592Z] FAILED: roofit/roofitcore/CMakeFiles/RooFitCore.dir/src/RooSimSplitGenContext.cxx.o
  • [2023-01-23T14:51:02.520Z] /mnt/build/workspace/root-pullrequests-build/root/roofit/roofitcore/src/RooSimSplitGenContext.cxx:116:59: error: no matching function for call to ‘RooArgSet::snapshot(RooArgSet*&, bool)’
  • [2023-01-23T14:51:02.520Z] /mnt/build/workspace/root-pullrequests-build/root/roofit/roofitcore/src/RooSimSplitGenContext.cxx:120:59: error: request for member ‘find’ in ‘((RooSimSplitGenContext*)this)->RooSimSplitGenContext::_idxCatSet’, which is of pointer type ‘RooArgSet*’ (maybe you meant to use ‘->’ ?)

@phsft-bot
Copy link

Build failed on mac12/noimt.
Running on macphsft17.dyndns.cern.ch:/Users/sftnight/build/jenkins/workspace/root-pullrequests-build
See console output.

Errors:

  • [2023-01-23T14:57:43.014Z] FAILED: roofit/roofitcore/CMakeFiles/RooFitCore.dir/src/RooSimSplitGenContext.cxx.o
  • [2023-01-23T14:57:44.380Z] /Users/sftnight/build/jenkins/workspace/root-pullrequests-build/root/roofit/roofitcore/src/RooSimSplitGenContext.cxx:116:34: error: no matching member function for call to 'snapshot'
  • [2023-01-23T14:57:44.380Z] /Users/sftnight/build/jenkins/workspace/root-pullrequests-build/root/roofit/roofitcore/src/RooSimSplitGenContext.cxx:120:58: error: member reference type 'RooArgSet *' is a pointer; did you mean to use '->'?

@phsft-bot
Copy link

Build failed on ROOT-ubuntu2004/python3.
Running on root-ubuntu-2004-1.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2023-01-23T15:04:02.957Z] FAILED: roofit/roofitcore/CMakeFiles/RooFitCore.dir/src/RooSimSplitGenContext.cxx.o
  • [2023-01-23T15:04:03.580Z] /home/sftnight/build/workspace/root-pullrequests-build/root/roofit/roofitcore/src/RooSimSplitGenContext.cxx:116:59: error: no matching function for call to ‘RooArgSet::snapshot(RooArgSet*&, bool)’
  • [2023-01-23T15:04:03.581Z] /home/sftnight/build/workspace/root-pullrequests-build/root/roofit/roofitcore/src/RooSimSplitGenContext.cxx:120:59: error: request for member ‘find’ in ‘((RooSimSplitGenContext*)this)->RooSimSplitGenContext::_idxCatSet’, which is of pointer type ‘RooArgSet*’ (maybe you meant to use ‘->’ ?)

@phsft-bot
Copy link

Build failed on windows10/cxx14.
Running on null:C:\build\workspace\root-pullrequests-build
See console output.

Errors:

  • [2023-01-23T16:21:45.717Z] C:\build\workspace\root-pullrequests-build\root\roofit\roofitcore\src\RooSimSplitGenContext.cxx(116,5): error C2664: 'bool RooArgSet::snapshot(RooAbsCollection &,bool) const': cannot convert argument 1 from 'RooArgSet *' to 'RooAbsCollection &' [C:\build\workspace\root-pullrequests-build\build\roofit\roofitcore\RooFitCore.vcxproj]

guitargeek and others added 5 commits January 23, 2023 18:01
Plus some code modernization of the gen context implementations
In the `RooSimultaneous::genContext` function, the logic that figures
out which `RooAbsGenContext` implementation to instantiate needs to know
the list of all category components. The way to figure this out depends
on the type of the index category, and this logic is also reused in the
`RooAbsGenContext` implementations for the RooSimultaneous. That's why
it is now factored out into a protected function called
`RooSimultaneous::flattenCatList()`.
There was an obvious place in a unit test where `BatchMode("off")` was
required in createNLL, because the created NLL was compared to another
one with `BatchMode("cpu")`.
For the new RooFit BatchMode, the model pdf had to be "compiled" for a
fixed normalization set before being evaluated by the RooFit driver.

This step was implemented in a rather hacky way, because it only became
clear gradually how important this step is, as many changes need to be
done to the computation graph before it can be used by the BatchMode.

It started by "unrolling" the normalization integrals such that they are
separate nodes that can be evaluated by the RooFit driver.

Then, it also covered the transformation of every RooProdPdf into a new
object that exposes the full intenal computation graph that is cached by
the RooProdPdf for a given norm set.

These two changes where done in separate passes through the compute
graph, and there was also a third pass to figure out dependencies and
normalization sets in the graph, and some more recursive passes the
attach new servers.

This all became very compilcated and hard to debug, as the separate
passes through the graph interfered with each other as well.

With all the lessons learned, this commit suggests a new more general
way to compile a model for a given normalization set.

It is now done in a **single recursive pass through the graph** via the
newly-proposed function
`RooAbsArg::compileForNormSet(RooAbsArg const& normSet, context)`.

It's job is to make a clone of the RooAbsArg where the normalization set
is fixed to `normSet`, given some context.

Now, the details of unrolling the integrals of a RooAbsPdf and replacing
the RooProdPdfs can be implemented in the overload for this function of
the specific class. There is even an overload for the RooSimultaneous
now, which contains all the logic of making the RooSimultaenous ready
for the NLL creation.

There are still some rough edges in the new interface, but it is already
a big step forward that will make debugging much easier because it's now
much less compilcated to understand how the computation graph is set up
by the BatchMode.

Also, this makes the speed of NLL creation with the BatchMode on par
with the speed of `createNLL()` without BatchMode.

Finally, this is also the groundwork for some future RooFit
developments, like the analytical convolution support for the GPU. To
implement this, it will also be beneficial to compile the analytical
convolution classes for a given normalization set, which can now be done
in an easy modular way.
@phsft-bot
Copy link

Build failed on ROOT-performance-centos8-multicore/cxx17.
Running on olbdw-01.cern.ch:/data/sftnight/workspace/root-pullrequests-build
See console output.

Errors:

  • [2023-01-23T17:03:32.700Z] /data/sftnight/workspace/root-pullrequests-build/root/roofit/roofitcore/src/RooSimSplitGenContext.cxx:116:59: error: no matching function for call to ‘RooArgSet::snapshot(RooArgSet*&, bool)’
  • [2023-01-23T17:03:32.700Z] /data/sftnight/workspace/root-pullrequests-build/root/roofit/roofitcore/src/RooSimSplitGenContext.cxx:120:59: error: request for member ‘find’ in ‘((RooSimSplitGenContext*)this)->RooSimSplitGenContext::_idxCatSet’, which is of pointer type ‘RooArgSet*’ (maybe you meant to use ‘->’ ?)

@guitargeek guitargeek force-pushed the v6-28-00-patches_roofit_backports_7 branch from 61b5714 to 41fbe20 Compare January 23, 2023 17:08
@phsft-bot
Copy link

Starting build on ROOT-debian10-i386/soversion, ROOT-performance-centos8-multicore/cxx17, ROOT-ubuntu18.04/nortcxxmod, ROOT-ubuntu2004/python3, mac12/noimt, mac11/cxx14, windows10/cxx14
How to customize builds

@phsft-bot
Copy link

Build failed on ROOT-ubuntu18.04/nortcxxmod.
Running on sft-ubuntu-1804-2.cern.ch:/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link

Build failed on mac12/noimt.
Running on macphsft18.dyndns.cern.ch:/Users/sftnight/build/jenkins/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link

Build failed on ROOT-ubuntu2004/python3.
Running on root-ubuntu-2004-3.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

The `RooAbsArg::replaceServer()` function is quite dangerous to use,
because it leaves your RooFit objects in an invalid state. See for
example this code:

```
   RooRealVar x("x", "x", -10, 10);
   RooRealVar mean("mean", "mean of gaussian", 1, -10, 10);
   RooRealVar sigma("sigma", "width of gaussian", 1, 0.1, 10);

   RooGaussian gauss("gauss", "gaussian PDF", x, mean, sigma);

   RooRealVar mean2(mean);

   gauss.replaceServer(mean, mean2, true, false);
   gauss.Print("v");

   std::cout << "x    : " << &gauss.getX() << std::endl;
   std::cout << "mean : " << &gauss.getMean() << std::endl;
   std::cout << "sigma: " << &gauss.getSigma() << std::endl;
```

Here, the proxy for `mean` will still point to the original `mean`, but
the server was redirected to the copy `mean2`. This is dangerous, and
desyncing of the proxy and server list are actually the underlying
reason for a set of RooFit problems.

The safter `RooAbsArg::redirectServers()` should always be used,
becauese that one is also updating the proxies. Therefore, the
`replaceServer()` interface is now marked as dangerous everywhere
possible: in a printout when you use it, in the docs, and with the
`R__SUGGEST_ALTERNATIVE` macro.

Internally, the `replaceServer()` was also used in `redirectServers()`.
But this was also causing problems: `replaceServer()` always adds the
new server at the end of the server list, which means the list gets
reordered. This can confuse usercode that rely on the server list being
ordered (yes, that's not a good idea anyway, but there are many codes
that do this). This reordering can also be seein in the example code
above.

Therefore, the `redirectServers()` function is now rewritten to replace
the server without changing its position in the server list. This also
means that the original server list doesn't need to be copied, as not
iterators are invalidated.

Furthermore, the `redirectServers()` is more optimized now. Before, it
redundantly figured out whether a server was a value and/or shape
server. Now, this is figured out only once when removing the original
server from the client.

In summary: this PR makes RooFit code safer and faster by changing
`RooAbsArg::redirectServers()`.
It doesn't make sense to clear the servers in the beginning of the
constructor when no servers were adeed yet.
@phsft-bot
Copy link

Starting build on ROOT-debian10-i386/soversion, ROOT-performance-centos8-multicore/cxx17, ROOT-ubuntu18.04/nortcxxmod, ROOT-ubuntu2004/python3, mac12/noimt, mac11/cxx14, windows10/cxx14
How to customize builds

@guitargeek guitargeek merged commit 6ab1508 into root-project:v6-28-00-patches Jan 24, 2023
@guitargeek guitargeek deleted the v6-28-00-patches_roofit_backports_7 branch January 24, 2023 07:00
@phsft-bot
Copy link

Build failed on mac11/cxx14.
Running on macphsft23.dyndns.cern.ch:/Users/sftnight/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2023-01-24T11:31:50.768Z] CMake Error at /Users/sftnight/build/workspace/root-pullrequests-build/rootspi/jenkins/root-build.cmake:1138 (message):

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants