Split ergm into a meta package and modules? #186

`ergm` is a big package. Based on the rough count in `NAMESPACE`, it currently exports 86 functions and declares 118 S3 methods, as well as implementing at least 205 terms, proposals, constraints, and references. `R CMD check` complains about its code size.

This means that the package takes a long time to build and test, no matter how small the change. Releasing a fix or enhancement for any part of `ergm` to CRAN requires testing the whole of `ergm` and its reverse dependencies. Continuous integration does not provide immediate feedback, and every release to CRAN is a big to-do.

In light of this, we may want to consider splitting `ergm` into hierarchically dependent components. Based on some discussions, here is one possible split. In the following list, each later package would Depend on, Import from, and/or be LinkingTo the earlier packages.

1. `ergm.core`: Core functions of `ergm`, including `ergm()` itself, `simulate()`, and the functions they need to run. `ergm.core` may be further split into two packages:
    1. `ergm.core.api`: The C API and the low-level R functions needed to initialise models and proposals and call the C code, such as the terms API, `ergm_model()`, `ergm.pl()`, `ergm_MCMC_sample()`, as well as the nodal attributes API.
    1. `ergm.core.ui`: Core front-end functions such as `ergm()` and `simulate()`, as well as functions involved in estimation.
1. `ergm.terms.core`: The terms, proposals, constraints, and references currently in `ergm`. (Basically, a big `ergm.userterms` package.)
1. `ergm.post`: Utilities for postprocessing and diagnostics, such as `mcmc.diagnostics()`, `gof()`, `predict.ergm()`, and perhaps `godfather()`.
1. `ergm`: A metapackage that Depends on the latest version of all of the above and contains few or no functions of its own but houses all of the vignettes. A typical end-user would still type `library(ergm)`.

The datasets can be housed in any of these, though `ergm` seems like a natural place.

Notably, while circular Depends and Imports are a problem, a package (e.g., `ergm.core`) can Suggest a package that Depends on it (e.g., `ergm.terms.core`), which it can load for the purposes of testing. For example, `ergm` currently Suggests `ergm.count`, which it uses to test the valued userterms API.
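To illustrate the Suggests pattern, here is a minimal sketch of how a test in `ergm.core` could guard code that needs a package it only Suggests. The helper `with_suggested()` is hypothetical and written only for this example; `requireNamespace()` is base R. In a `testthat` suite, `skip_if_not_installed()` would probably serve the same purpose.

```r
# Hypothetical helper (not part of ergm): evaluate `code` only if the
# Suggested package `pkg` is installed; otherwise skip with a message.
with_suggested <- function(pkg, code) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Skipping: package '", pkg, "' is not installed.")
    return(invisible(FALSE))
  }
  force(code)  # `code` is a lazy promise, so it is evaluated only here
  invisible(TRUE)
}

# Usage sketch: tests in ergm.core that need concrete terms would be
# wrapped like this, and silently skipped when ergm.terms.core is absent.
with_suggested("ergm.terms.core", {
  # ... tests that rely on terms from the Suggested package ...
})
```

Because the package is only loaded conditionally at test time, no circular Depends or Imports relationship is introduced.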

The actual process of splitting up a package is not especially difficult, particularly with Roxygen managing the namespace and the documentation files, though it can be tedious. It consists of copying the `ergm` repository (with full history) and deleting the functions that do not belong in the particular subpackage. This is how `tergm` was split out of `ergm` and `rle` out of `statnet.common`.

One interesting question is whether `ergm` should reexport functions from the packages it Depends on. From the point of view of the end-user, it makes no difference; but from the point of view of a developer depending on `ergm`, it does. The advantage of reexporting is that a developer can import from `ergm` without worrying about where the function actually lives. This is not elegant, but it would certainly smooth the transition and make developers' lives easier. A disadvantage is that it commits us to maintaining up-to-date reexports, though it may be possible to generate the code automatically by scanning the `NAMESPACE` files of the Imported packages. Also, it is probably not practical to similarly "reexport" the C API.
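As a rough sketch of what such automatic generation might look like, the following reads `export()` directives from the text of a `NAMESPACE` file and emits Roxygen reexport stubs. The function names `exported_names()` and `reexport_code()` are hypothetical, and the sample `NAMESPACE` lines are made up for illustration. The sketch handles only plain `export()` directives, not `exportPattern()` or S3 method registrations; base R's `parseNamespaceFile()` might be a more robust starting point.

```r
# Hypothetical helper: pull the names out of export() directives,
# given the lines of a NAMESPACE file as a character vector.
exported_names <- function(namespace_lines) {
  hits <- grep("^\\s*export\\(", namespace_lines, value = TRUE)
  inner <- sub("^\\s*export\\((.*)\\)\\s*$", "\\1", hits)
  nms <- trimws(unlist(strsplit(inner, ",", fixed = TRUE)))
  gsub("^[\"'`]|[\"'`]$", "", nms)  # drop surrounding quotes, if any
}

# Hypothetical helper: build one Roxygen reexport stub per name.
reexport_code <- function(pkg, nms) {
  sprintf("#' @importFrom %s %s\n#' @export\n%s::`%s`", pkg, nms, pkg, nms)
}

# Made-up NAMESPACE content standing in for a subpackage such as ergm.core.
ns <- c(
  "export(ergm)",
  "export(simulate_formula, ergm_model)",
  "S3method(print, ergm)",
  "importFrom(stats, simulate)"
)
nms <- exported_names(ns)
cat(reexport_code("ergm.core", nms), sep = "\n\n")
```

In practice, the lines would come from something like `readLines()` on each Imported package's installed `NAMESPACE` file, and the generated stubs would be written to a file in the metapackage's `R/` directory before running Roxygen.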

This issue is not a high priority, but I believe it to be something worth doing in the long run, and so I am opening this ticket to flag the issue and record my current thoughts on the matter.
