
Weighting vs. Matching #76

Open
rduesing opened this issue Nov 18, 2024 · 2 comments

Comments

@rduesing

Dear Dr. Greifer and all!

I was trying to learn more about PS weighting (with the WeightIt package) and matching (with the MatchIt package), but I am puzzled by a result where I tried to estimate the ATE with both approaches. I hope you can help me out.

As far as I understood the references of the MatchIt package, the option "full matching" uses all cases and assigns weights to each individual. Since all subjects are used, only weights are assigned, and it is explicitly possible to calculate the ATE, I thought the results should be very similar to ATE weighting with the WeightIt package. The authors of the MatchIt package also say: "...and these weights then function like propensity score weights and can be used to estimate a weighted treatment effect...".

Now I tried both approaches with the lalonde dataset from the cobalt package, but the results turned out to be quite different.

Please see the code attached:

library(MatchIt)
library(WeightIt)
library(cobalt)
library(tidyverse)
library(marginaleffects)


data("lalonde")


# Full PS matching on a logit PS
PSM_ATE <- matchit(
  treat ~ age + educ + race + married +
    nodegree + re74 + re75,
  data = lalonde,
  method = "full",
  estimand = "ATE",
  distance = "glm",
  link = "logit")
PSM_ATE

dat_PSM_ATE <- match.data(PSM_ATE)

# Full Matching but ATE estimand different weights
lm_PSM_ATE <- lm(re78 ~ treat, 
                 data = dat_PSM_ATE, 
                 weights = weights)

avg_comparisons(lm_PSM_ATE,
                variables = "treat",
                vcov = ~subclass)

avg_predictions(lm_PSM_ATE,
                variables = "treat",
                vcov = ~subclass)


###########################################
# PS weighting for the ATE with a logistic regression PS
W_ATE <- weightit(
  treat ~ age + educ + race + married +
    nodegree + re74 + re75,
  data = lalonde,
  method = "glm", 
  estimand = "ATE")

# Linear model with covariates
fit_ATE <- lm_weightit(re78 ~ treat,
                       data = lalonde, 
                       weightit = W_ATE)

avg_comparisons(fit_ATE, variables = "treat")
avg_predictions(fit_ATE, variables = "treat")

The estimate for the ATE with the PSM method is:

 Estimate Std. Error     z Pr(>|z|)   S 2.5 % 97.5 %
     -253       1204 -0.21    0.834 0.3 -2612   2106

 treat Estimate Std. Error     z Pr(>|z|)     S 2.5 % 97.5 %
     0     6228        402 15.51   <0.001 177.8  5441   7016
     1     5976        985  6.07   <0.001  29.5  4046   7906

but the ATE with the PS weighting method is:

 Estimate Std. Error     z Pr(>|z|)   S 2.5 % 97.5 %
      386       3138 0.123    0.902 0.1 -5763   6536

 treat Estimate Std. Error     z Pr(>|z|)    S 2.5 % 97.5 %
     0     6418        434 14.79   <0.001 162.0  5567   7268
     1     6804       3165  2.15   0.0316   5.0   600  13008

I hope you can help me out. What am I missing or doing wrong here? What did I misunderstand?
Many thanks in advance for any help!!
Rainer

@ngreifer
Owner

First, I'm unable to replicate your results using the most recent versions of each package. See my reprex below (in the future, please use the reprex package to help format your code):

library(MatchIt)
library(WeightIt)
library(cobalt)
#>  cobalt (Version 4.5.5, Build Date: 2024-04-02)
library(marginaleffects)


data("lalonde")


# Full PS matching on a logit PS
PSM_ATE <- matchit(
  treat ~ age + educ + race + married +
    nodegree + re74 + re75,
  data = lalonde,
  method = "full",
  estimand = "ATE",
  distance = "glm",
  link = "logit")
PSM_ATE
#> A `matchit` object
#>  - method: Optimal full matching
#>  - distance: Propensity score             - estimated with logistic regression
#>  - number of obs.: 614 (original), 614 (matched)
#>  - target estimand: ATE
#>  - covariates: age, educ, race, married, nodegree, re74, re75

dat_PSM_ATE <- match.data(PSM_ATE)

# Full Matching but ATE estimand different weights
lm_PSM_ATE <- lm(re78 ~ treat, 
                 data = dat_PSM_ATE, 
                 weights = weights)

avg_comparisons(lm_PSM_ATE,
                variables = "treat",
                vcov = ~subclass)
#> 
#>  Estimate Std. Error      z Pr(>|z|)   S 2.5 % 97.5 %
#>      -202       1901 -0.106    0.915 0.1 -3928   3524
#> 
#> Term: treat
#> Type:  response 
#> Comparison: mean(1) - mean(0)
#> Columns: term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted

avg_predictions(lm_PSM_ATE,
                variables = "treat",
                vcov = ~subclass)
#> 
#>  treat Estimate Std. Error    z Pr(>|z|)    S 2.5 % 97.5 %
#>      0     6288        879 7.15   <0.001 40.1  4565   8011
#>      1     6086       1155 5.27   <0.001 22.8  3821   8350
#> 
#> Type:  response 
#> Columns: treat, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high


###########################################
# PS weighting for the ATE with a logistic regression PS
W_ATE <- weightit(
  treat ~ age + educ + race + married +
    nodegree + re74 + re75,
  data = lalonde,
  method = "glm", 
  estimand = "ATE")

# Linear model with covariates
fit_ATE <- lm_weightit(re78 ~ treat,
                       data = lalonde, 
                       weightit = W_ATE)

avg_comparisons(fit_ATE, variables = "treat")
#> 
#>  Estimate Std. Error     z Pr(>|z|)   S 2.5 % 97.5 %
#>       225        876 0.256    0.798 0.3 -1493   1942
#> 
#> Term: treat
#> Type:  probs 
#> Comparison: mean(1) - mean(0)
#> Columns: term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted
avg_predictions(fit_ATE, variables = "treat")
#> 
#>  treat Estimate Std. Error     z Pr(>|z|)     S 2.5 % 97.5 %
#>      0     6423        353 18.18   <0.001 242.8  5730   7115
#>      1     6648        813  8.18   <0.001  51.6  5054   8241
#> 
#> Type:  probs 
#> Columns: treat, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high

Created on 2024-11-18 with reprex v2.1.1

Regardless, I think you're overstating how different the results are. The effect estimates for full matching and IPW are -202 and 225, respectively. That might seem like a large difference, and the estimates have different signs, but look at their standard errors. The full matching estimate is within half a standard error of the IPW estimate (i.e., the confidence intervals heavily overlap), which suggests the estimates are actually quite close. There isn't really much of a difference when you look at the appropriate scale. It's also worth considering the actual scale of the outcome variable, which is in dollars and ranges from 0 to over 60,000. A difference of about 400 between the estimates is tiny; it corresponds to about .06 standard deviations.
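To put those numbers on the scales I just described, here is a quick sketch (the point estimates and standard error are taken from the reprex output above):

```r
# Put the gap between the two ATE estimates in context
# (values taken from the reprex output above)
est_full <- -202   # full matching estimate
est_ipw  <-  225   # IPW estimate
se_full  <- 1901   # SE of the full matching estimate

(est_ipw - est_full) / se_full   # ~0.22, well within half a standard error

library(cobalt)
data("lalonde")
(est_ipw - est_full) / sd(lalonde$re78)   # ~0.06 SDs of the outcome
```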

Also, neither method achieves balance especially well here, so both can have bias, and the degree to which each method allows imbalance to remain differs somewhat. Still, I would say the methods perform very similarly. That said, not every pair of methods that targets the same estimand will perform similarly, because some methods perform better than others. For example, when targeting the ATT (the canonical estimand for this dataset), nearest neighbor matching without replacement does terribly, but propensity score weighting does very well. Even though both target the same estimand and may use the same propensity score, they perform differently because the methods are different.
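The balance each method achieves can be inspected directly with cobalt (a minimal sketch, reusing the matchit and weightit objects from the reprex above):

```r
library(cobalt)

# Standardized mean differences before and after adjustment;
# un = TRUE also displays balance in the unadjusted sample
bal.tab(PSM_ATE, un = TRUE)   # balance after full matching
bal.tab(W_ATE, un = TRUE)     # balance after IPW
```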

@rduesing
Author

Many thanks for the quick answer, Dr. Greifer!

I didn't know about the reprex package, this is indeed very helpful, thank you.

I think the hint about comparing the effect size to the outcome's range and the standard error points in absolutely the right direction. I was so focused on the "obvious" result, i.e., the mean difference and its sign, that I did not put the value into context. Many thanks for your help! At least I (hopefully) did not misunderstand the approaches completely ;-)

All the best,
Rainer
