169 changes: 169 additions & 0 deletions .github/copilot-instructions.md
@@ -0,0 +1,169 @@
# Copilot Instructions for RobinCar2

## Project Overview

`RobinCar2` is an R package for **robust covariate adjustment** in randomized clinical trials. It provides methods for estimating and inferring treatment effects under stratified randomization, aligned with the [FDA's final guidance on covariate adjustment](https://www.regulations.gov/docket/FDA-2019-D-0934).

### Core Functionality

| Function | Purpose |
|----------|---------|
| `robin_lm()` | Linear model covariate adjustment (ANCOVA) |
| `robin_glm()` | GLM-based covariate adjustment (logistic, Poisson, negative binomial) |
| `robin_surv()` | Survival analysis with covariate adjustment |

### Key Concepts

- **Treatment formulas**: Use `treatment ~ strata` syntax (e.g., `treatment ~ s1` or `treatment ~ pb(s1)` for permuted-block)
- **Randomization schemas**: `sr` (simple), `pb` (permuted-block), `ps` (Pocock-Simon)
- **Variance estimators**: `vcovG` (default, robust) and `vcovHC` (Huber-White, limited use cases)
- **Contrasts**: `"difference"`, `"risk_ratio"`, `"odds_ratio"`, `"log_risk_ratio"`, `"log_odds_ratio"`

## Project Structure

```
R/ # Source code
├── robin_lm.R # Linear model adjustment
├── robin_glm.R # GLM adjustment
├── survival.R # Survival analysis methods
├── utils.R # Helper functions (h_* prefix)
├── variance_*.R # Variance estimators
└── treatment_effect.R # Treatment effect calculations

tests/testthat/ # Unit tests (testthat v3)
data-raw/ # Scripts to generate package datasets
design/ # Design documents and specifications
vignettes/ # User-facing tutorials
```

## Code Style & Conventions

### Formatting (air.toml)

- **Line width**: 120 characters
- **Indentation**: 2 spaces
- **Use `air` formatter** for R code formatting

### Function Naming

- **Public API**: `robin_*()` for main user-facing functions
- **Internal helpers**: `h_*()` prefix (e.g., `h_get_vars()`, `h_interaction()`)
- **S3 methods**: Standard naming (e.g., `confint.robin_output`)

### Documentation (roxygen2)

```r
#' Title (one line)
#'
#' @param name (`type`) Description.
#' @param formula (`formula`) A formula of analysis.
#' @return Description of return value.
#' @export
#' @examples
#' robin_lm(y ~ treatment * s1, data = glm_data, treatment = treatment ~ s1)
```

- Use `@keywords internal` for non-exported helper functions
- Include type annotations in param descriptions: `` (`type`) ``
- Add `@export` tag for public functions

### Assertions

Use `checkmate` for input validation:

```r
assert_formula(formula)
assert_subset(all.vars(formula), names(data))
assert_function(contrast, args = c("x", "y"))
```
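
A minimal sketch of how these assertions might be combined at the top of a function. The helper name `h_check_inputs()` and the toy data are hypothetical and are not part of the package; only the `checkmate` calls are real.

```r
library(checkmate)

# Hypothetical helper (not part of RobinCar2): validate the inputs of a
# robin_*()-style function before any model fitting happens.
h_check_inputs <- function(formula, data, treatment, contrast) {
  assert_formula(formula)
  assert_formula(treatment)
  assert_data_frame(data)
  # Every variable named in either formula must be a column of `data`.
  assert_subset(all.vars(formula), names(data))
  assert_subset(all.vars(treatment), names(data))
  # A custom contrast must be a function with arguments `x` and `y`.
  assert_function(contrast, args = c("x", "y"))
  invisible(TRUE)
}

# Toy data for illustration only.
toy <- data.frame(
  y = c(1.2, 0.7, 2.1),
  treatment = factor(c("a", "b", "a")),
  s1 = factor(c("u", "v", "u"))
)
diff_contrast <- function(x, y) x - y

ok <- h_check_inputs(y ~ treatment * s1, toy, treatment ~ pb(s1), diff_contrast)

# A formula that references a missing column (`age`) fails with an error.
bad <- tryCatch(
  h_check_inputs(y ~ treatment + age, toy, treatment ~ s1, diff_contrast),
  error = function(e) conditionMessage(e)
)
```

Note that `all.vars()` ignores the schema wrapper, so `treatment ~ pb(s1)` contributes only `treatment` and `s1` to the column check.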

### Testing (testthat v3)

- Test file naming: `test-<function_name>.R`
- Use `expect_silent()`, `expect_error()`, `expect_warning()` for behavior tests
- Use snapshot tests in `tests/testthat/_snaps/` for complex outputs

```r
test_that("robin_glm works correctly", {
  expect_silent(
    robin_glm(y ~ treatment * s1, data = glm_data, treatment = treatment ~ s1, contrast = "difference")
  )
})
```

## Developer Workflows

### Common Commands

```bash
# Run tests
Rscript -e "devtools::test()"

# Update documentation (regenerates man/ and NAMESPACE)
Rscript -e "devtools::document()"

# Build and check package
R CMD build .
R CMD check RobinCar2_*.tar.gz

# Build pkgdown site
Rscript -e "pkgdown::build_site()"

# Format code with air
air format R/
```

### Adding a New Feature

1. **Design first**: Review relevant documents in `design/` for architectural context
2. **Implement**: Create/modify files in `R/`, following existing patterns
3. **Document**: Add roxygen2 comments; run `devtools::document()`
4. **Test**: Add tests in `tests/testthat/test-<name>.R`
5. **Validate**: Run `devtools::test()` and `R CMD check`

### Key Dependencies

From `DESCRIPTION`:
- `checkmate` - Input validation
- `numDeriv` - Numerical derivatives (Jacobian)
- `MASS` - Negative binomial GLM
- `sandwich` - Robust variance estimation
- `survival` - Survival analysis

## Important Patterns

### Treatment Formula Parsing

The `h_get_vars()` function extracts treatment, strata, and randomization schema:

```r
# Input: treatment ~ pb(s1, s2)
# Output: list(treatment = "treatment", schema = "pb", strata = c("s1", "s2"))
```
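
For orientation, the parsing idea can be sketched in base R. This is an illustrative re-implementation under stated assumptions, not the package's code: the function name `parse_treatment_formula()` is hypothetical, and treating a bare right-hand side as `sr` is an assumption that the real `h_get_vars()` may not share.

```r
# Hypothetical sketch; the real h_get_vars() may differ in details.
parse_treatment_formula <- function(treatment) {
  stopifnot(inherits(treatment, "formula"), length(treatment) == 3L)
  lhs <- treatment[[2]]
  rhs <- treatment[[3]]
  schemas <- c("sr", "pb", "ps")
  # A schema wrapper appears as a call whose function name is sr, pb or ps.
  is_wrapped <- is.call(rhs) && is.name(rhs[[1]]) && as.character(rhs[[1]]) %in% schemas
  list(
    treatment = as.character(lhs),
    # Assumption: no wrapper is treated as simple randomization here.
    schema = if (is_wrapped) as.character(rhs[[1]]) else "sr",
    strata = all.vars(rhs)
  )
}

parsed <- parse_treatment_formula(treatment ~ pb(s1, s2))
str(parsed)
```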

### Return Objects

Main functions return `robin_output` objects with:
- `marginal_mean`: Counterfactual predictions
- `contrast`: Treatment effect estimates

### Variance Estimation

- `vcovG`: General robust variance (recommended)
- `vcovHC`: Huber-White (only for linear models without interactions, difference contrast)

## Common Pitfalls

- **Do NOT** use `vcovHC` with GLM or treatment-covariate interactions
- **Edit `README.Rmd`**, not `README.md` (regenerate with knitr)
- **Run `devtools::document()`** after modifying roxygen2 comments
- **Add new dependencies** to `DESCRIPTION` Imports/Suggests sections

## References

Key papers (cited in FDA guidance):
- Tsiatis et al. (2008) - ANCOVA theory
- Wang et al. (2021) - Model-robust inference
- Ye et al. (2022, 2023) - Stratified randomization
- Bannick et al. (2024) - Survival covariate adjustment
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,8 +1,8 @@
Type: Package
Package: RobinCar2
Title: ROBust INference for Covariate Adjustment in Randomized Clinical Trials
-Version: 0.2.2
-Date: 2026-01-09
+Version: 0.2.2.9000
+Date: 2026-02-12
Authors@R:
c(
person("Liming", "Li", , "liming.li1@astrazeneca.com", role = c("aut", "cre"), comment = c(ORCID = "0009-0008-6870-0878")),
6 changes: 6 additions & 0 deletions NEWS.md
@@ -1,3 +1,9 @@
# RobinCar2 0.2.2.9000

### Misc

* Added Biometric Bulletin vignette article.

# RobinCar2 0.2.2

### New features
147 changes: 147 additions & 0 deletions vignettes/articles/biometric_bulletin.Rmd
@@ -0,0 +1,147 @@
---
title: "RobinCar2: ROBust INference for Covariate Adjustment in Randomized Clinical Trials"
author: |
  | Dong Xi^1^, Marlena Bannick^2^, Gregory Chen^3^, Liming Li^4^,
  | Daniel Sabanés Bové^5^, Ting Ye^2^, Yanyao Yi^6^
  |
  | ^1^Gilead Sciences, ^2^University of Washington, ^3^MSD, ^4^AstraZeneca,
  | ^5^RCONIS, ^6^Eli Lilly and Company
date: "`r format(Sys.Date(), '%B %d, %Y')`"
output:
  html_document:
    toc: no
    toc_float: true
    number_sections: true
  pdf_document:
    toc: no
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  echo = TRUE,
  message = FALSE,
  warning = FALSE,
  fig.align = "center",
  fig.width = 7,
  fig.height = 5
)
```

# Introduction

Covariate adjustment is a powerful statistical technique that can increase efficiency in randomized clinical trials (RCTs) by reducing variability in treatment effect estimates. The U.S. Food and Drug Administration (FDA) has finalized its guidance on covariate adjustment, providing recommendations and best practices for using these methods in drug development (FDA, 2023). However, a gap has existed between the extensive statistical literature on covariate adjustment and software that is easy to use and follows these best practices.

The **RobinCar2** R package, which stands for **ROB**ust **IN**ference for **C**ovariate **A**djustment in **R**andomized clinical trials, addresses this gap. It is a streamlined version of the original **RobinCar** package (Bannick et al., 2026a), designed with minimal dependencies and extensive validation for use in drug development, particularly for Good Practice (GxP) purposes. The package is supported by the ASA Biopharmaceutical Section Covariate Adjustment Scientific Working Group Software Subteam.

This paper introduces **RobinCar2**, covering its core functionality for three common outcome types in clinical trials: continuous, binary, and time-to-event outcomes. It also describes best practices for using the **RobinCar** and **RobinCar2** packages for covariate adjustment.

# Covariate Adjustment in RCTs

Covariate adjustment leverages baseline variables to improve the precision of treatment effect estimates. Unlike traditional regression interpretations, covariate-adjusted estimators in RCTs target the same unconditional (marginal) treatment effect as unadjusted analyses, but with potentially smaller variance. This is because randomization ensures treatment assignment is independent of baseline covariates, allowing model-assisted approaches that are robust to model misspecification.
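
The efficiency argument can be illustrated with a small base R simulation. This is a sketch under simplifying assumptions (simple 1:1 randomization, one prognostic covariate, a correctly specified linear model, and a true marginal effect of 1); it does not use **RobinCar2** and all names in it are illustrative.

```r
set.seed(2026)

# One simulated trial with simple 1:1 randomization; true marginal effect = 1.
simulate_once <- function(n = 200) {
  x <- rnorm(n)                         # baseline covariate
  a <- rbinom(n, size = 1, prob = 0.5)  # treatment, independent of x
  y <- 1 * a + 2 * x + rnorm(n)         # outcome depends strongly on x
  c(
    unadjusted = mean(y[a == 1]) - mean(y[a == 0]),
    adjusted = unname(coef(lm(y ~ a + x))["a"])  # ANCOVA estimate
  )
}

estimates <- replicate(2000, simulate_once())

# Both estimators are centered on the same marginal effect ...
est_mean <- rowMeans(estimates)
# ... but the adjusted estimator varies less across repeated trials.
est_sd <- apply(estimates, 1, sd)

print(round(rbind(mean = est_mean, sd = est_sd), 3))
```

Both rows of the printed summary refer to the same estimand; only the spread of the estimator changes with adjustment.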

**RobinCar2** supports three randomization schemes, two of which (`pb` and `ps`) are covariate-adaptive:

- Simple randomization (`sr`): Subjects are randomly assigned to treatment groups without stratification.
- Permuted-block randomization (`pb`): Treatment assignments are balanced within blocks defined by stratification factors.
- Pocock-Simon minimization (`ps`): An adaptive method that minimizes imbalance across multiple stratification factors.
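
To make the permuted-block scheme concrete, the following base R sketch assigns treatments within each stratum using blocks of size 4. The helper `permuted_block()` and the toy strata are hypothetical; **RobinCar2** analyzes data from such designs but this code is not taken from the package.

```r
# Hypothetical helper (not a RobinCar2 function): assign `n` subjects of one
# stratum to arms using randomly permuted blocks of size `block_size`.
permuted_block <- function(n, arms = c("control", "active"), block_size = 4) {
  stopifnot(block_size %% length(arms) == 0)
  n_blocks <- ceiling(n / block_size)
  one_block <- rep(arms, each = block_size / length(arms))
  # Shuffle each block independently, then keep the first n assignments.
  assignments <- unlist(lapply(seq_len(n_blocks), function(i) sample(one_block)))
  assignments[seq_len(n)]
}

set.seed(1)
stratum <- rep(c("low", "high"), times = c(10, 14))
arm <- character(length(stratum))
for (s in unique(stratum)) {
  idx <- which(stratum == s)
  arm[idx] <- permuted_block(length(idx))
}

# Within each stratum the arm counts differ by at most block_size / 2.
balance <- table(stratum, factor(arm, levels = c("control", "active")))
print(balance)
```

Because balance is enforced within strata, treatment assignments are no longer independent across subjects, which is why the analysis needs to know the scheme.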

The package provides two variance estimation approaches:

- `vcovG`: The default heteroskedasticity-consistent variance estimator that accounts for covariate-adaptive randomization.
- `vcovHC`: The Huber-White sandwich estimator, which is appropriate for linear covariate adjustment and only when treatment-by-covariate interactions are not included in the model.

# Analysis of Continuous Outcomes

For continuous outcomes, **RobinCar2** provides the `robin_lm()` function, which fits a linear model and returns covariate-adjusted treatment effect estimates with robust inference. The following code fits an ANHECOVA (Analysis of Heterogeneous Covariance) model (Ye et al., 2023a), which includes the treatment assignment (`treatment`), the stratification factor (`s1`), the treatment-by-stratification interaction (`treatment * s1`), and a continuous covariate (`covar`). The randomization scheme is permuted-block randomization stratified by `s1`, specified as `treatment ~ pb(s1)`. The variance estimation method is `vcovG`.

```{r}
library(RobinCar2)

result_lm <- robin_lm(
  y ~ treatment * s1 + covar,
  data = glm_data,
  treatment = treatment ~ pb(s1),
  vcov = "vcovG"
)

print(result_lm)
```

The output has three main parts. The first part reiterates the input parameters and the model specification. The second part provides results on **Marginal Mean**, which includes estimated response means for each treatment group with standard errors and confidence intervals. The last part provides the **Contrast** results, including pairwise treatment comparisons (the difference in means) with test statistics and p-values. The Huber-White variance estimator can be applied when the linear model does not include treatment-by-stratification/covariate interactions, by specifying `vcov = "vcovHC"` (Rosenblum and van der Laan, 2009; Lin, 2013). Confidence intervals for the treatment effect contrasts can be obtained via the `confint()` function.
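
As a sketch of the Huber-White option, the call below drops the interaction term and requests `vcov = "vcovHC"`, then extracts confidence intervals. The main-effects formula and the unwrapped treatment formula `treatment ~ s1` are choices made for this illustration; whether `vcovHC` can be combined with other randomization schemes is not assumed here.

```r
library(RobinCar2)

# Main effects only: no treatment-by-covariate interaction in the model,
# which is the setting in which the Huber-White estimator is described above.
result_hc <- robin_lm(
  y ~ treatment + s1 + covar,
  data = glm_data,
  treatment = treatment ~ s1,
  vcov = "vcovHC"
)

print(result_hc)

# Confidence intervals for the treatment effect contrasts.
ci_hc <- confint(result_hc)
print(ci_hc)
```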

# Analysis of Binary and Count Outcomes

For binary and count outcomes, **RobinCar2** provides `robin_glm()`, which extends the framework to generalized linear models (Ye et al., 2023b; Bannick et al., 2025). The following code fits a logistic model (`family = binomial(link = "logit")`), which includes the treatment assignment (`treatment`), the stratification factor (`s1`), the treatment-by-stratification interaction (`treatment * s1`), and a continuous covariate (`covar`). The randomization scheme is permuted-block randomization stratified by `s1`, specified as `treatment ~ pb(s1)`. Currently, `vcovG` is the only supported method for variance estimation in generalized linear models.

```{r}
result_binary <- robin_glm(
  y_b ~ treatment * s1 + covar,
  data = glm_data,
  treatment = treatment ~ pb(s1),
  family = binomial(link = "logit"),
  contrast = "difference"
)

print(result_binary)
```

The output of `robin_glm()` has a similar structure to that of `robin_lm()`. The default contrast for binary outcomes is the difference in probabilities (`contrast = "difference"`). **RobinCar2** supports other contrast functions, including risk ratio, odds ratio, and their log transformations (`"log_risk_ratio"`, `"log_odds_ratio"`). Confidence intervals for the treatment effect contrasts can be obtained via the `confint()` function. Any family argument handled by `glm()` can be used with `robin_glm()`.
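
For illustration, the same logistic working model can be summarized on a different scale by changing only the `contrast` argument. The sketch below reuses the model from the previous chunk with `"log_risk_ratio"`; the object names are arbitrary.

```r
library(RobinCar2)

# Same working model as above; only the summary measure changes.
result_lrr <- robin_glm(
  y_b ~ treatment * s1 + covar,
  data = glm_data,
  treatment = treatment ~ pb(s1),
  family = binomial(link = "logit"),
  contrast = "log_risk_ratio"
)

print(result_lrr)

# Confidence intervals for the chosen contrast.
ci_lrr <- confint(result_lrr)
print(ci_lrr)
```

The marginal means are unaffected by this choice, since the contrast is computed from them after model fitting.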

# Analysis of Time-to-Event Outcomes

For survival outcomes, **RobinCar2** provides `robin_surv()`, which implements stratified and covariate-adjusted log-rank tests and hazard ratio estimation (Ye et al., 2024). The following code performs a log-rank test and estimates hazard ratios, both stratified by the factor `strata`. The treatment variable (`sex`) is specified via the `treatment` formula. The randomization scheme is permuted-block randomization stratified by `strata`, specified as `sex ~ pb(strata)`.

```{r}
result_tte <- robin_surv(
  Surv(time, status) ~ 1 + strata(strata),
  data = surv_data,
  treatment = sex ~ pb(strata)
)

print(result_tte)
```

The output of `robin_surv()` has a similar structure to that of `robin_lm()` and `robin_glm()`. The **Contrast** section provides unconditional (marginal) hazard ratios for all pairwise treatment comparisons. The **Test** section provides the log-rank test results. Confidence intervals for the hazard ratios can be obtained via the `confint()` function.
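
As a point of comparison only, the classical stratified log-rank test is available in the **survival** package through `survdiff()`. The sketch below uses that package's `lung` dataset with `ph.ecog` as the stratification factor, both chosen for illustration; it is not the `robin_surv()` method and does not account for the randomization scheme.

```r
library(survival)

# Classical stratified log-rank test: compare `sex` groups within strata
# defined by `ph.ecog` (rows with a missing stratum are dropped by default).
fit <- survdiff(Surv(time, status) ~ sex + strata(ph.ecog), data = lung)
print(fit)

# Two groups give one degree of freedom for the chi-squared statistic.
p_value <- pchisq(fit$chisq, df = 1, lower.tail = FALSE)
```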

# Best Practices for Using **RobinCar** and **RobinCar2**

**RobinCar2** is a streamlined version of the original **RobinCar** package with the following characteristics (Bannick et al., 2026b):

| Feature | **RobinCar** | **RobinCar2** |
|---------|----------|-----------|
| Dependencies | More extensive | Minimal |
| Validation | Standard | GxP-ready |
| Methods | Comprehensive | Curated subset |
| Support | Research community | ASA BIOP Covariate Adjustment Working Group |

Methods included in **RobinCar2** have undergone additional validation and are recommended for interactions with regulatory agencies. Previous users of **RobinCar** should therefore consider transitioning to **RobinCar2** when possible.

# Conclusions

**RobinCar2** provides a validated, user-friendly implementation of covariate adjustment methods for randomized clinical trials. Its simple interface, based on familiar R formula syntax, makes it accessible to clinical trial statisticians while ensuring robust inference aligned with the FDA guidance. The package covers the most common outcome types encountered in clinical trials: continuous, binary, and time-to-event outcomes. Future development of **RobinCar2** may include additional variance estimation methods, the Mantel-Haenszel risk difference estimator (currently available in **RobinCar**), and bootstrap methods.

For more information, including additional vignettes and documentation, visit the package GitHub repository at [github.com/openpharma/RobinCar2](https://github.com/openpharma/RobinCar2).

# Acknowledgements

This package is supported by the [ASA Biopharmaceutical Section Covariate Adjustment Scientific Working Group Software Subteam](https://carswg.github.io/subteam_software.html).

# References

Bannick M, Shao J, Liu J, Du Y, Yi Y, Ye T (2025). A General Form of Covariate Adjustment in Randomized Clinical Trials. *Biometrika*, 112(3):asaf029.

Bannick M, Qian Y, Ye T, Yi Y, Bian F (2026a). RobinCar: Robust Inference for Covariate Adjustment in Randomized Clinical Trials. R package version 1.1.0, https://CRAN.R-project.org/package=RobinCar.

Bannick M, Bian Y, Chen G, Li L, Qian Y, Sabanés Bové D, Xi D, Ye T, Yi Y (2026b). The RobinCar Family: R Tools for Robust Covariate Adjustment in Randomized Clinical Trials. *arXiv preprint*, arXiv:2601.14498.

Food and Drug Administration (2023). Adjusting for Covariates in Randomized Clinical Trials for Drugs and Biological Products: Final Guidance for Industry, https://www.fda.gov/media/148910/download.

Lin W (2013). Agnostic Notes on Regression Adjustments to Experimental Data: Reexamining Freedman's Critique. *Annals of Applied Statistics*, 7(1):295-318.

Rosenblum M, van der Laan MJ (2009). Using Regression Models to Analyze Randomized Trials: Asymptotically Valid Hypothesis Tests Despite Incorrectly Specified Models. *Biometrics*, 65(3):937-945.

Ye T, Shao J, Yi Y, Zhao Q (2023a). Toward Better Practice of Covariate Adjustment in Analyzing Randomized Clinical Trials. *Journal of the American Statistical Association*, 118(544):2370-2381.

Ye T, Bannick M, Yi Y, Shao J (2023b). Robust Variance Estimation for Covariate-Adjusted Unconditional Treatment Effect in Randomized Clinical Trials with Binary Outcomes. *Statistical Theory and Related Fields*, 7(2):159-163.

Ye T, Shao J, Yi Y (2024). Covariate-Adjusted Log-Rank Test: Guaranteed Efficiency Gain and Universal Applicability. *Biometrika*, 111(2):691-705.