80 changes: 80 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,80 @@
# Contributing to `simChef`

<!-- This CONTRIBUTING.md is adapted from https://gist.github.com/peterdesmet/e90a1b0dc17af6c12daf6e8b2f044e7c -->

Thank you for considering contributing to `simChef`!

[repo]: https://github.com/Yu-Group/simChef
[issues]: https://github.com/Yu-Group/simChef/issues
[new_issue]: https://github.com/Yu-Group/simChef/issues/new
[website]: https://yu-group.github.io/simChef/
<!-- [citation]: https://our_org.github.io/`simChef`/authors.html -->
[email]: mailto:ttang4@nd.edu

## How you can contribute

There are several ways you can contribute to this project. If you want to know more about why and how to contribute to open source projects like this one, see this [Open Source Guide](https://opensource.guide/how-to-contribute/).

<!-- ### Share the love ❤️

Think `simChef` is useful? Let others discover it, by telling them in person, via Twitter or a blog post.

Using `simChef` for a paper you are writing? Consider [citing it][citation]. -->

### Ask a question ⁉️

Using `simChef` and got stuck? Browse the [documentation][website] to see if you can find a solution. Still stuck? Post your question as an [issue on GitHub][new_issue]. While we cannot offer user support, we'll do our best to address it, as questions often lead to better documentation or the discovery of bugs.

Want to ask a question in private? Contact the package maintainer by [email][email].

### Propose an idea 💡

Have an idea for a new `simChef` feature? Check the [documentation][website] and [issue list][issues] to see whether it has already been included or suggested. If not, suggest your idea as an [issue on GitHub][new_issue]. While we can't promise to implement your idea, it helps to:

* Explain in detail how it would work.
* Keep the scope as narrow as possible.

See below if you want to contribute code for your idea as well.

### Report a bug 🐛

Using `simChef` and discovered a bug? That's annoying! Spare others the same experience by reporting it as an [issue on GitHub][new_issue] so we can fix it. A good bug report makes that easier, so please include:

* Your operating system name and version (e.g. macOS 10.13.6).
* Any details about your local setup that might be helpful in troubleshooting.
* Detailed steps to reproduce the bug.

### Improve the documentation 📖

Noticed a typo on the website? Think a function could use a better example? Good documentation makes all the difference, so your help to improve it is very welcome!

#### The website

[This website][website] is generated with [`pkgdown`](http://pkgdown.r-lib.org/). That means we don't have to write any HTML: content is pulled together from documentation in the code, vignettes, [Markdown](https://guides.github.com/features/mastering-markdown/) files, the package `DESCRIPTION` and `_pkgdown.yml` settings. If you know your way around `pkgdown`, you can [propose a file change](https://help.github.com/articles/editing-files-in-another-user-s-repository/) to improve documentation. If not, [report an issue][new_issue] and we can point you in the right direction.

#### Function documentation

Functions are described as comments near their code and translated to documentation using [`roxygen2`](https://klutometis.github.io/roxygen/). If you want to improve a function description:

1. Go to the `R/` directory in the [code repository][repo].
2. Look for the file with the name of the function.
3. [Propose a file change](https://help.github.com/articles/editing-files-in-another-user-s-repository/) to update the function documentation in the roxygen comments (starting with `#'`).
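
As a rough illustration of what such roxygen comments look like, here is a minimal sketch. The function `add_noise()` is hypothetical and is not part of `simChef`; it only shows the shape of the `#'` comment block (title, `@param`, `@return`, `@examples`) that sits directly above a function definition.

```r
# Hypothetical example function; not part of simChef.

#' Add Gaussian noise to a numeric vector
#'
#' @param x A numeric vector.
#' @param sd Standard deviation of the noise. Defaults to 1.
#'
#' @return A numeric vector of the same length as `x`.
#'
#' @examples
#' add_noise(c(1, 2, 3), sd = 0.1)
add_noise <- function(x, sd = 1) {
  # one independent noise draw per element of x
  x + rnorm(length(x), sd = sd)
}

noisy <- add_noise(c(1, 2, 3), sd = 0.1)
length(noisy)
```

Improving a function description usually means editing only the `#'` lines; the function body below them stays unchanged.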

### Contribute code 📝

Care to fix bugs or implement new functionality for `simChef`? Awesome! 👏 Have a look at the [issue list][issues] and leave a comment on the things you want to work on. See also the development guidelines below.

## Development guidelines

We try to follow the [GitHub flow](https://guides.github.com/introduction/flow/) for development.

1. Fork [this repo][repo] and clone it to your computer. To learn more about this process, see [this guide](https://guides.github.com/activities/forking/).
2. If you have forked and cloned the project before and it has been a while since you worked on it, [pull changes from the original repo](https://help.github.com/articles/merging-an-upstream-repository-into-your-fork/) to your clone by using `git pull upstream main`.
3. Open the RStudio project file (`.Rproj`).
4. Make your changes:
   * Write your code.
   * Test your code (bonus points for adding unit tests).
   * Document your code (see function documentation above).
   * Check your code with `devtools::check()` and aim for 0 errors and warnings.
5. Commit and push your changes.
6. Submit a [pull request](https://guides.github.com/activities/forking/#making-a-pull-request).
8 changes: 4 additions & 4 deletions DESCRIPTION
@@ -20,7 +20,7 @@ URL: https://yu-group.github.io/simChef
BugReports: https://github.com/Yu-Group/simChef/issues
Imports:
data.table,
dplyr,
dplyr (>= 1.1.0),
future,
future.apply,
knitr,
@@ -68,8 +68,8 @@ Remotes:
Yu-Group/vthemes
Config/testthat/edition: 3
Encoding: UTF-8
Roxygen: list(markdown = TRUE, r6 = FALSE)
RoxygenNote: 7.2.3
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.1
Collate:
'dgp.R'
'docs.R'
@@ -85,10 +85,10 @@ Collate:
'globals.R'
'init-dir.R'
'method.R'
'reexport-magrittr.R'
'run-tests.R'
'signals.R'
'use_templates.R'
'utils-pipe.R'
'utils-rmd.R'
'utils-templates.R'
'visualizer-lib-feature-selection.R'
175 changes: 123 additions & 52 deletions R/dgp.R
@@ -1,22 +1,126 @@
#' \code{R6} class representing a data-generating process.
# NOTE: R6 methods can't use the `@inheritParams` tag. If you update the
# `@param` tags below then be sure to manually replace the corresponding tags
# above `DGP$initialize()`.

#' Create a new `DGP` (data-generating process)
#'
#' @name create_dgp
#'
#' @description Create a [DGP] which can `generate()` data in an [Experiment].
#'
#' @param .dgp_fun The user-defined data-generating process function.
#' @param .name (Optional) A name for the `DGP`, helpful for later
#' identification.
#' @param ... User-defined default arguments to pass to `.dgp_fun()` when
#' `DGP$generate()` is called.
#'
#' @return A new [DGP] object.
#'
#' @examples
#' # create an example DGP function
#' dgp_fun <- function(n, beta, rho, sigma) {
#' cov_mat <- matrix(c(1, rho, rho, 1), byrow = TRUE, nrow = 2, ncol = 2)
#' X <- MASS::mvrnorm(n = n, mu = rep(0, 2), Sigma = cov_mat)
#' y <- X %*% beta + rnorm(n, sd = sigma)
#' return(list(X = X, y = y))
#' }
#'
#' # create DGP (with uncorrelated features)
#' dgp <- create_dgp(.dgp_fun = dgp_fun,
#' .name = "Linear Gaussian DGP",
#' # additional named parameters to pass to dgp_fun() by default
#' n = 50, beta = c(1, 0), rho = 0, sigma = 1)
#'
#' print(dgp)
#'
#' data_uncorr <- dgp$generate()
#' cor(data_uncorr$X)
#'
#' data_corr <- dgp$generate(rho = 0.7)
#' cor(data_corr$X)
#'
#' @export
create_dgp <- function(.dgp_fun, .name = NULL, ...) {
  DGP$new(.dgp_fun, .name, ...)
}

#' `R6` class representing a data-generating process
#'
#' @name DGP
#'
#' @docType class
#'
#' @description A data-generating process which will be used in the
#' \code{Experiment} to **generate** data.
#' @description `DGP`, a data-generating process which can `generate()` data in
#' an [Experiment].
#'
#' Generally speaking, users won't directly interact with the `DGP` R6
#' class, but instead indirectly through [create_dgp()] and the
#' following `Experiment` helpers:
#'
#' - [add_dgp()]
#' - [update_dgp()]
#' - [remove_dgp()]
#' - [get_dgps()]
#' - [generate_data()]
#'
#' @seealso [create_dgp]
#'
#' @examples
#' # create an example DGP function
#' dgp_fun <- function(n, beta, rho, sigma) {
#' cov_mat <- matrix(c(1, rho, rho, 1), byrow = TRUE, nrow = 2, ncol = 2)
#' X <- MASS::mvrnorm(n = n, mu = rep(0, 2), Sigma = cov_mat)
#' y <- X %*% beta + rnorm(n, sd = sigma)
#' return(list(X = X, y = y))
#' }
#'
#' # create DGP (with uncorrelated features)
#' dgp <- DGP$new(.dgp_fun = dgp_fun,
#' .name = "Linear Gaussian DGP",
#' # additional named parameters to pass to dgp_fun() by default
#' n = 50, beta = c(1, 0), rho = 0, sigma = 1)
#'
#' print(dgp)
#'
#' @template dgp-template
#' data_uncorr <- dgp$generate()
#' cor(data_uncorr$X)
#'
#' data_corr <- dgp$generate(rho = 0.7)
#' cor(data_corr$X)
#'
#' @export
DGP <- R6::R6Class(
classname = 'DGP',

private = list(
.dgp_fun_formals = NULL
),

public = list(

#' @field name The name of the `DGP`.
name = NULL,

#' @field dgp_fun The user-defined data-generating process function.
dgp_fun = NULL,

#' @field dgp_params A (named) list of user-defined default arguments
#' to input into the data-generating process function.
dgp_params = NULL,

# NOTE: R6 methods can't use the `@inheritParams` tag. If you want to update
# the `@param` tags below, do so in the `create_dgp()` docs above and
# then copy-paste the corresponding `@param` tags below.

#' @description Initialize a new `DGP` object.
#'
#' @param .dgp_fun The user-defined data-generating process function.
#' @param .name (Optional) A name for the `DGP`, helpful for later
#' identification.
#' @param ... User-defined default arguments to pass to `.dgp_fun()` when
#' `DGP$generate()` is called.
#'
#' @return A new instance of `DGP`.
initialize = function(.dgp_fun, .name = NULL, ...) {
self$dgp_fun <- .dgp_fun
self$name <- .name
@@ -27,15 +131,16 @@ DGP <- R6::R6Class(
}
private$.dgp_fun_formals <- formalArgs(self$dgp_fun)
},
# @description Generate data from a \code{DGP} with the provided \code{DGP}
# parameters.
#
# @param ... Arguments to pass into \code{dgp_fun()} that will overwrite
# the initialized \code{DGP} parameters. If no additional arguments are
# provided, data will be generated using \code{dgp_fun()} with the
# parameters that were set when \code{DGP$new()} was called.
#
# @return Result of \code{dgp_fun()}.

#' @description Generate data from a `DGP`.
#'
#' @param ... User-defined arguments to pass into `DGP$dgp_fun()` that will
#' overwrite the initialized `DGP` parameters. If no additional arguments
#' are provided, data will be generated using `DGP$dgp_fun()` with the
#' parameters that were set when `DGP$new()` was called.
#'
#' @return Result of `DGP$dgp_fun()`. If the result is not a list,
#' it will be coerced to a list.
generate = function(...) {
dgp_params <- self$dgp_params
new_dgp_params <- rlang::list2(...)
@@ -68,10 +173,11 @@ DGP <- R6::R6Class(

return(data_list)
},
# @description Print a \code{DGP} in a nice format, showing the
# \code{DGP}'s name, function, and parameters.
#
# @return The original \code{DGP} object.

#' @description Print a `DGP` in a nice format, showing the
#' `DGP`'s name, function, and parameters.
#'
#' @return The original `DGP` object, invisibly.
print = function() {
if (is.null(self$name)) {
cat("DGP Name: NULL \n")
@@ -87,38 +193,3 @@ DGP <- R6::R6Class(
}
)
)

#' Create a new \code{DGP} (data-generating process).
#'
#' @name create_dgp
#'
#' @param .dgp_fun The data-generating process function.
#' @param .name (Optional) The name of the \code{DGP}.
#' @param ... Arguments to pass into \code{.dgp_fun()}.
#'
#' @return A new instance of \code{DGP}.
#'
#' @examples
#' # create an example DGP function
#' dgp_fun <- function(n, beta, rho, sigma) {
#' cov_mat <- matrix(c(1, rho, rho, 1), byrow = T, nrow = 2, ncol = 2)
#' X <- MASS::mvrnorm(n = n, mu = rep(0, 2), Sigma = cov_mat)
#' y <- X %*% beta + rnorm(n, sd = sigma)
#' return(list(X = X, y = y))
#' }
#'
#' # create DGP (with uncorrelated features)
#' dgp_uncorr <- create_dgp(.dgp_fun = dgp_fun,
#' .name = "Uncorrelated Linear Gaussian DGP",
#' # additional named parameters to pass to dgp_fun()
#' n = 200, beta = c(1, 0), rho = 0, sigma = 1)
#' # create DGP (with correlated features)
#' dgp_corr <- create_dgp(.dgp_fun = dgp_fun,
#' .name = "Correlated Linear Gaussian DGP",
#' # additional named parameters to pass to dgp_fun()
#' n = 200, beta = c(1, 0), rho = 0.7, sigma = 1)
#'
#' @export
create_dgp <- function(.dgp_fun, .name = NULL, ...) {
  DGP$new(.dgp_fun, .name, ...)
}
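
The new `generate()` documentation in this diff says that arguments passed at call time overwrite the defaults set at construction, and that a non-list result is coerced to a list. The base-R sketch below illustrates that idea only; it is not the `simChef` implementation, whose body is mostly elided in this diff. The names `merge_params()`, `toy_dgp_fun()`, and `toy_generate()` are hypothetical, and wrapping with `list()` is an assumed stand-in for the coercion step.

```r
# Illustrative sketch only: NOT simChef's implementation.
# All helper names below are hypothetical.

# Entries in `overrides` replace same-named entries in `defaults`.
merge_params <- function(defaults, overrides) {
  utils::modifyList(defaults, overrides)
}

# A toy data-generating function returning a list.
toy_dgp_fun <- function(n, sigma) {
  list(y = rnorm(n, sd = sigma))
}

# Call `dgp_fun` with the defaults, letting `...` override them.
toy_generate <- function(dgp_fun, defaults, ...) {
  params <- merge_params(defaults, list(...))
  out <- do.call(dgp_fun, params)
  # Assumed stand-in for "coerced to a list": wrap non-list results.
  if (!is.list(out)) out <- list(out)
  out
}

defaults <- list(n = 50, sigma = 1)
data_default <- toy_generate(toy_dgp_fun, defaults)         # uses n = 50
data_small   <- toy_generate(toy_dgp_fun, defaults, n = 5)  # overrides n
length(data_default$y)
length(data_small$y)
```

One caveat of this sketch: `utils::modifyList()` drops an entry when the override value is `NULL`, which may or may not match how the package treats `NULL` arguments.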