Unnecessary overwriting in `mice.impute.2l.bin()`?

Forgive me if I am missing something, but I think I found a very minor issue in the following section of `mice.impute.2l.bin()`:
```
    clmis <- x[wy, clust]
    for (i in clmis) {
        bi.star <- t(MASS::mvrnorm(n = 1L, mu = rep(0, nrow(psi.star)), 
            Sigma = psi.star))
        idx <- wy & (x[, clust] == i)
        logit <- X[idx, , drop = FALSE] %*% beta.star + Z[idx, 
            , drop = FALSE] %*% matrix(bi.star, ncol = 1)
        vec <- rbinom(nrow(logit), 1, as.vector(1/(1 + exp(-logit))))
        if (is.factor(y)) {
            vec <- factor(vec, c(0, 1), levels(y))
        }
        y[idx] <- vec
    }
```
Each `i` in the `for` loop is a level-1 observations value of the level-2 cluster variable `clust`. However, the result of each iteration is to assign a vector of imputed values `vec` to `y[idx]`, which contains every level-1 observation in cluster `i`. If there are, say, 1000 observations in `i`, then `y[idx]` is unnecessarily overwritten 999 times. I think this could be avoided by changing the first line to `clmis <- unique(x[wy, clust])`. Not a big deal, but perhaps a little more efficient.

As an aside, I noticed this issue because I rewrote this section of `mice.impute.2l.bin()` so that instead of drawing each cluster intercept `bi.star` anew from its marginal distribution, it only does so for clusters that were dropped from the imputation model because they have no observed values of `y`. For clusters with observed values, the posterior prediction conditions on the random effect estimate from the imputation model. This is contrary to what is described in [this paper](https://doi.org/10.1214/18-STS646) (p. 163), which does not include random effects among the parameters conditioned on. However, it struck me as more natural than acting as if the missing observations in a partially observed cluster were independent of the observed observations in the same cluster. Of course, if this idea is wrong-headed, I'd be happy to be corrected. Here's the relevant code implementing my approach:
```
    for (i in clmis) {
        ri <- rownames(rancoef) == i    # Which random coefficient is cluster i's?
        if (any(ri)) {                  # If any coefficient is i's...
            bi.star = t(rancoef[ri, ])  # ...use it for imputation.
        } else {                        # If not...
            bi.star <-                  # ...draw a new one.
                t(MASS::mvrnorm(n = 1L, mu = rep(0, nrow(psi.star)), 
                                Sigma = psi.star))
        }
```

Thanks,
Devin


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unnecessary overwriting in `mice.impute.2l.bin()`? #686

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Unnecessary overwriting in mice.impute.2l.bin()? #686

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Unnecessary overwriting in `mice.impute.2l.bin()`? #686