Management of binary outcome and missing data using interaction terms in glm_weightit #74

kgkirgkiris · 2024-10-20T14:53:31Z

Hi Noah,

I want to start by expressing my sincere thanks, not only for this incredible package but also for everything you have done to make propensity score weighting and matching both accessible and easy to interpret. I have come across countless answers from you on StackExchange and GitHub, and I have learned so much from them. Your contribution has been invaluable. I am sure many others feel the same way. Thank you.

My questions concern estimating effects after weighting. I have a continuous treatment variable, several covariates, and a binary outcome. Can the proposed code for fitting the outcome model:

fit <- lm_weightit(Y_C ~ splines::ns(Ac, df = 4) *
                     (X1 + X2 + X3 + X4 + X5 +
                        X6 + X7 + X8 + X9),
                   data = d, weightit = W)

be modified as follows to handle the binary outcome (Y_B), so that the model works with predicted probabilities of the outcome?

fit <- glm_weightit(Y_B ~ splines::ns(Ac, df = 4) *
                      (X1 + X2 + X3 + X4 + X5 +
                         X6 + X7 + X8 + X9),
                    data = d, weightit = W, family = binomial)

Are the above modifications enough to continue with the rest of the analysis, or is my approach wrong?

I also run into an error when using the default "ind" way of dealing with missing data. When I include the interaction term in my outcome model, I get this:

Warning: (from glm()) glm.fit: fitted probabilities numerically 0 or 1 occurred
Error in cbind(psi_out(Bout, w, Y, Xout, SW, offset), psi_treat(Btreat, :
number of rows of matrices must match (see arg 2)

This also happens when I use a binary treatment variable. The error does not seem to occur when I remove the interaction term along with the covariates.

Thank you in advance for your time and support.
I am genuinely looking forward to your response.
I would also like to apologize if any of my questions come across as overly basic or elementary.

Kind regards,
Kostas


ngreifer · 2024-10-20T15:12:04Z

Hi Kostas,

Thank you so much for the kind words about my packages and writing! I'm glad they have been helpful.

Your modification for the binary outcome is correct. Note that your confidence intervals might be outside of [0, 1]; there are ways to prevent this but they are a bit involved, so let me know if that's an issue for you.

Unfortunately, I have not thoroughly tested the performance of glm_weightit() with missing data. Because it calls glm(), it just deletes any missing data, which causes the problems you observed. You should not include any covariate with missingness in the outcome model. Even if that covariate is not part of the interaction, it will still cause your observations to be dropped, which may not be apparent in the output.
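To see the row-dropping behavior described above, here is a minimal sketch using only base R. The data, variable names, and formula are made up for illustration, and plain glm() stands in for glm_weightit(). It counts missing values in each variable used by the outcome model and compares the number of rows supplied with the number actually used in the fit.

```r
# Toy data; all names here are illustrative, not from the original analysis
set.seed(1)
n <- 200
d <- data.frame(A = rnorm(n), X1 = rnorm(n), X2 = rnorm(n))
d$Y <- rbinom(n, 1, plogis(0.3 * d$A + 0.2 * d$X1))
d$X2[sample(n, 30)] <- NA  # introduce missingness in one covariate

# Count missing values in every variable the outcome model would use
f <- Y ~ A * (X1 + X2)
model_vars <- all.vars(f)
colSums(is.na(d[model_vars]))

# With the default na.action (na.omit), glm() silently drops incomplete rows
fit <- glm(f, data = d, family = binomial)
nrow(d)    # 200 rows supplied
nobs(fit)  # 170 rows actually used in the fit
```

If the two counts differ, the outcome model was fit on fewer rows than the weights were estimated on, which is consistent with the "number of rows of matrices must match" error reported above.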

Noah

kgkirgkiris · 2024-10-20T15:43:39Z

Thank you very much for your kind and prompt response, and for your helpful insights.

Regarding the confidence intervals, your guidance on how to prevent them from falling outside the [0,1] range would be really helpful, especially since my dataset contains small percentages. I would appreciate any advice or methods you could share for addressing this issue.

Kostas

ngreifer · 2024-10-20T23:21:01Z

The code will look a bit esoteric, but here is how you would do it:

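# `values` is a vector of treatment values at which to evaluate predictions,
# defined beforehand; avg_predictions() is from the marginaleffects package
p <- avg_predictions(fit,
                     variables = list(Ac = values),
                     # average the predictions, then map to an unbounded scale
                     byfun = function(...) qnorm(mean(...)),
                     # map estimates and interval limits back to probabilities
                     transform = pnorm)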

This first puts the average predicted probabilities on an unbounded scale, on which standard errors and confidence intervals are estimated, and then transforms the estimates and confidence intervals back to the probability scale. You can replace qnorm() and pnorm() with qlogis() and plogis(), respectively. This may be a bit foreign to some audiences, but it has the nice feature of ensuring the confidence intervals stay within [0, 1]. The intervals are symmetric around the estimates on the unbounded scale rather than on the probability scale. Otherwise, the estimates should be identical to the untransformed ones, and the confidence intervals have the usual interpretation.
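The idea of building the interval on an unbounded scale and transforming back can be sketched in base R with made-up numbers. The estimate and standard error below are purely illustrative, and the delta-method step shows the general technique with the logit link; it is not a description of how marginaleffects computes things internally.

```r
# Illustrative numbers only: a small average predicted probability and its SE
p_hat <- 0.03
se_p  <- 0.02
z     <- qnorm(0.975)

# A Wald interval on the probability scale can cross 0
ci_prob <- p_hat + c(-1, 1) * z * se_p

# Delta method: the SE of qlogis(p) is approximately se_p / (p * (1 - p))
se_logit <- se_p / (p_hat * (1 - p_hat))
ci_logit <- qlogis(p_hat) + c(-1, 1) * z * se_logit

# Back-transform the interval limits to the probability scale
ci_back <- plogis(ci_logit)

ci_prob  # lower limit is negative
ci_back  # both limits lie strictly between 0 and 1
```

The point estimate is unchanged because plogis(qlogis(p_hat)) returns p_hat; only the interval limits differ, and they are no longer symmetric around the estimate on the probability scale.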

kgkirgkiris changed the title ~~Management of binary outcome and missing data using interaction terms in the glm_weightit~~ Management of binary outcome and missing data using interaction terms in glm_weightit Oct 20, 2024