Skip to content

Finding the Value of a Mult-arm Policy Tree and Gain (HTE) of Each Node #175

Open
@shafayetShafee

Description

@shafayetShafee

Hello, GRF labs team, thank you for building and maintaining ML tools for causal inference. I have recently started using the packages - {grf}, {policytree} and {maq} for a multi-arm treatment data. However, I need guidance on,

  1. How can I measure the benefit of a multi-arm policy tree, that is, what is the expected uplift if I allocate the treatments as suggested by the policy tree?

  2. How can I calculate the HTE (or uplift) for each suggested leaf of the multi-arm policytree?

Consider the following reprex from the {policytree} docs,

library(grf)
library(DiagrammeR)
library(policytree)

set.seed(1344)

n <- 5000 # number of subjects
p <- 10 # number of pre-treatment covariates
data <- gen_data_mapl(n, p)
head(data.frame(data)[1:6])
  action          Y        X.1        X.2          X.3        X.4
1      1  2.0905685 0.46864412 0.22784132 0.6269076453 0.71743214
2      1 -0.3635110 0.42637207 0.74486955 0.2168520605 0.25096641
3      2  1.1952524 0.70800587 0.03636718 0.0007285422 0.01940693
4      2  1.8925130 0.05426782 0.97814957 0.5647130643 0.45693730
5      2 -0.4199647 0.95205213 0.53340813 0.3522940478 0.44700733
6      0  2.6126626 0.69570475 0.77983017 0.9636860283 0.72605917
X <- data$X
Y <- data$Y
W <- as.factor(data$action)

multi.forest <- multi_arm_causal_forest(X, Y, W, seed = 1344)

Gamma.matrix <- double_robust_scores(multi.forest)
head(Gamma.matrix)
             0          1          2
[1,]  2.640066  2.1543402  1.2385549
[2,]  1.365158 -2.1944712  1.4777855
[3,] -1.276021  0.6792919  0.8427064
[4,]  2.511125  1.8532989  4.4776106
[5,]  2.708066  2.1123636 -6.4939299
[6,]  9.127422  1.8865874  1.4614259
train <- sample(1:n, 3500)
opt.tree <- policy_tree(X[train, ], Gamma.matrix[train, ], depth = 2)
opt.tree
policy_tree object 
Tree depth:  2 
Actions:  1: 0 2: 1 3: 2 
Variable splits: 
(1) split_variable: X5  split_value: 0.600704 
  (2) split_variable: X7  split_value: 0.334204 
    (4) * action: 3 
    (5) * action: 1 
  (3) split_variable: X7  split_value: 0.784358 
    (6) * action: 2 
    (7) * action: 3 

Get the policy for the held out data,

X.test <- X[-train, ]
pi_hat <- predict(opt.tree, X.test)
head(pi_hat)
[1] 3 1 2 2 2 2

Determining the value of the Policy Tree

Following the eqn. (11) from this paper, value of the policy, $\widehat V(\pi) = \frac{1}{n} \sum_{i=1}^{n}\langle \hat \pi(X_i), \widehat \Gamma_i \rangle$ where the policy $\pi(X_i)$ is a $K$-dimensional vector where the $k^{th}$ element is equal to 1 if arm $k$ is assigned, and 0 otherwise and $\widehat \Gamma_i$ is the estimated DR score (eqn. 13).

# creating the pi hat matrix
pi_mat <- matrix(
  data = 0, 
  nrow = length(pi_hat), 
  ncol = length(unique(pi_hat)) - 1
)

pi_mat[cbind(1:length(pi_hat), pi_hat - 1)] <- 1
colnames(pi_mat) <- c("action-1", "action-2")
head(pi_mat)
     action-1 action-2
[1,]        0        1
[2,]        0        0
[3,]        1        0
[4,]        1        0
[5,]        1        0
[6,]        1        0
# getting the DR scores
gamma_diff <- grf::get_scores(multi.forest)[, ,]
head(gamma_diff)
          1 - 0      2 - 0
[1,] -0.4857258 -1.4015111
[2,] -3.5596292  0.1126275
[3,]  1.9553132  2.1187276
[4,] -0.6578262  1.9664855
[5,] -0.5957024 -9.2019960
[6,] -7.2408342 -7.6659957
# calculating value
mean(pi_mat * gamma_diff[-train, ])
[1] 0.5233645
  • So can I say, “Expected benefit per subject of the above treatment allocation would be 0.52 units and total benefit would be 1000 * 0.52 = 520 units” ??

  • Am I on the right track to answering my first question?

  • Also, How can I reach the answer to my second question?

Metadata

Metadata

Assignees

No one assigned

    Labels

    questionFurther information is requested

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions