Linear vs generic
From the Colombian example we saw that ritest in Python works and is reasonably fast. That benchmark, however, covered linear models with multiple fixed effects. In this section I benchmark simple OLS against Stata's and R's implementations of randomization inference.
Data
The simulated dataset represents a clean, randomized experiment with \(N=10{,}000\) observations, designed to resemble realistic applied data. Treatment is randomly assigned with equal probability (50/50), with no clustering and no stratification. The dataset includes 20 covariates covering demographics, behavior, engagement, and region indicators. These covariates are generated from a small number of latent factors, inducing realistic correlations while remaining fully observed and free of missing values.
The outcome is continuous and driven by covariates plus a very small heterogeneous treatment effect. The individual treatment effect is
\[\tau_i = 0.02 + 0.0075 \cdot h_i\] where \(h_i\) is a standardized index based on baseline spending and satisfaction, so the true average treatment effect is approximately 0.02. Outcome noise is Gaussian with standard deviation \(5\), chosen so that at \(N = 10{,}000\) the problem is nontrivial but stable, with an expected standard error for the treatment coefficient of about \(0.1\). This implies considerable design-based variability, so point estimates and \(p\)-values can differ noticeably across realizations and across implementations. In addition, randomization inference relies on a finite number of permutations, which introduces a small amount of Monte Carlo error. On average, however, inference is expected to provide little evidence against the null hypothesis of no treatment effect.
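As a sketch of this design (covariate names, coefficients, and the exact construction of \(h_i\) are illustrative assumptions; only \(N\), the 50/50 assignment, the effect \(\tau_i = 0.02 + 0.0075 \cdot h_i\), and the noise SD of 5 come from the description above), the simulation could look like:

```python
import numpy as np

rng = np.random.default_rng(23)
n = 10_000

# Exact 50/50 random assignment (an assumption; Bernoulli(0.5) draws would also fit the text)
treat = rng.permutation(np.repeat([0.0, 1.0], n // 2))

# Two illustrative baseline covariates; the real dataset has 20, generated from latent factors
baseline_spend = rng.normal(size=n)
satisfaction_score = rng.normal(size=n)

# Standardized index of baseline spending and satisfaction (construction assumed)
h = (baseline_spend + satisfaction_score) / np.sqrt(2)

# Heterogeneous effect from the text: tau_i = 0.02 + 0.0075 * h_i, so the ATE is about 0.02
tau = 0.02 + 0.0075 * h

# Outcome: covariates (illustrative coefficients) + treatment effect + Gaussian noise with SD 5
y = 2.0 * baseline_spend + 1.5 * satisfaction_score + tau * treat + rng.normal(scale=5.0, size=n)
```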
Stata
Set a local macro with all covariates except the permute variable treat:
local xvars ///
age female education_years log_income household_size urban ///
tenure_months baseline_spend purchases_12m returns_12m ///
support_tickets_6m app_sessions_30d days_since_last_purchase ///
email_opt_in promo_exposure_30d prior_churn credit_score ///
satisfaction_score region_1 region_2 region_3 region_4
Run ritest:
ritest treat _b[treat], reps(2000) seed(23) nodots: reg y treat `xvars'
Source | SS df MS Number of obs = 10,000
-------------+---------------------------------- F(23, 9976) = 701.81
Model | 389011.799 23 16913.5565 Prob > F = 0.0000
Residual | 240422.079 9,976 24.100048 R-squared = 0.6180
-------------+---------------------------------- Adj R-squared = 0.6172
Total | 629433.877 9,999 62.9496827 Root MSE = 4.9092
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
treat | .0819918 .0983118 0.83 0.404 -.1107191 .2747028
... (clipped for brevity)
------------------------------------------------------------------------------
Command: regress y treat age female education_years log_income
household_size urban tenure_months baseline_spend
purchases_12m returns_12m support_tickets_6m
app_sessions_30d days_since_last_purchase email_opt_in
promo_exposure_30d prior_churn credit_score
satisfaction_score region_1 region_2 region_3 region_4
_pm_1: _b[treat]
res. var(s): treat
Resampling: Permuting treat
Clust. var(s): __000001
Clusters: 10000
Strata var(s): none
Strata: 1
------------------------------------------------------------------------------
T | T(obs) c n p=c/n SE(p) [95% Conf. Interval]
-------------+----------------------------------------------------------------
_pm_1 | .0819918 786 2000 0.3930 0.0109 .3715147 .4147996
------------------------------------------------------------------------------
Note: Confidence interval is with respect to p=c/n.
Note: c = #{|T| >= |T(obs)|}
183 seconds, or about 3 minutes.
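The SE(p) and confidence-interval columns in the table above can be reproduced by hand: SE(p) is the binomial standard error \(\sqrt{p(1-p)/n}\), and the reported interval is consistent with an exact (Clopper-Pearson) binomial interval for \(p = c/n\) (that this is the exact method used is an assumption here). A standard-library sketch:

```python
import math

def binom_sf(c, n, p):
    """P(X >= c) for X ~ Binomial(n, p), summed in log space for stability."""
    if p <= 0.0:
        return 1.0 if c <= 0 else 0.0
    if p >= 1.0:
        return 1.0
    total = 0.0
    for k in range(c, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                    + k * math.log(p) + (n - k) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)

def clopper_pearson(c, n, alpha=0.05):
    """Exact (Clopper-Pearson) CI for a binomial proportion c/n,
    obtained by bisecting the binomial tail probabilities."""
    def bisect(f, lo, hi):
        # f is True at lo and False at hi; home in on the crossing point
        for _ in range(40):
            mid = (lo + hi) / 2.0
            if f(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0
    lower = 0.0 if c == 0 else bisect(lambda p: binom_sf(c, n, p) <= alpha / 2, 0.0, c / n)
    upper = 1.0 if c == n else bisect(lambda p: 1.0 - binom_sf(c + 1, n, p) > alpha / 2, c / n, 1.0)
    return lower, upper

# Numbers from the Stata table above: c = 786 extreme draws out of n = 2000
p_hat = 786 / 2000
se_p = math.sqrt(p_hat * (1 - p_hat) / 2000)  # binomial SE of the estimated p-value
lo, hi = clopper_pearson(786, 2000)
```

With these inputs the sketch recovers p = 0.3930, SE(p) ≈ 0.0109, and an interval close to the reported [0.3715, 0.4148].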
R
Estimate a linear model, save in est:
fml <- y ~ treat +
age + female + education_years + log_income + household_size + urban + tenure_months +
baseline_spend + purchases_12m + returns_12m + support_tickets_6m + app_sessions_30d +
days_since_last_purchase + email_opt_in + promo_exposure_30d + prior_churn + credit_score +
satisfaction_score + region_1 + region_2 + region_3 + region_4
est <- lm(fml, data = df)
Run ritest, save results in ri:
ri <- ritest(est, "treat", reps = 2000, seed = 23L)
Print results:
print(ri)
OLS (lm) fit:
Call:
lm(formula = fml, data = df)
Residuals:
Min 1Q Median 3Q Max
-19.0734 -3.3147 0.0069 3.3346 19.3317
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.008e+01 1.414e+00 -14.205 < 2e-16 ***
treat 8.199e-02 9.831e-02 0.834 0.404302
... (clipped for brevity)
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.909 on 9976 degrees of freedom
Multiple R-squared: 0.618, Adjusted R-squared: 0.6172
F-statistic: 701.8 on 23 and 9976 DF, p-value: < 2.2e-16
RI result:
Call: lm(formula = fml, data = df)
Res. var(s): treat
H0: treat=0
Num. reps: 2000
────────────────────────────────────────────────────────────────────────────────
T(obs) c n p=c/n SE(p) CI 2.5% CI 97.5%
0.08199 839 2000 0.4195 0.01815 0.3896 0.4494
────────────────────────────────────────────────────────────────────────────────
Note: Confidence interval is with respect to p=c/n.
Note: c = #{|T| >= |T(obs)|}
11.24 seconds.
Python
Set up the model:
PERMUTE_VAR = "treat"
STAT = "treat"
COVARS = [
"age",
"female",
"education_years",
"log_income",
"household_size",
"urban",
"tenure_months",
"baseline_spend",
"purchases_12m",
"returns_12m",
"support_tickets_6m",
"app_sessions_30d",
"days_since_last_purchase",
"email_opt_in",
"promo_exposure_30d",
"prior_churn",
"credit_score",
"satisfaction_score",
"region_1",
"region_2",
"region_3",
"region_4",
]
FORMULA = "y ~ treat + " + " + ".join(COVARS)
Run ritest:
res = ritest(
df=df,
permute_var=PERMUTE_VAR,
formula=FORMULA,
stat=STAT,
reps=reps,
seed=seed,
)
Print results:
print(res)
Randomization Inference Result
===============================
Coefficient
-----------
Observed effect (β̂): 0.0820
Coefficient CI bounds: [-0.1133, 0.2778]
Coefficient CI band: not computed
Permutation test
----------------
Tail (alternative): two-sided
p-value: 0.3990 (39.9%)
P-value CI @ α=0.050: [0.3775, 0.4208]
As-or-more extreme: 798 / 2000
Test configuration
------------------
Stratified: no
Clustered: no
Weights: no
Settings
--------
alpha: 0.050
seed: 23
ci_method: cp
ci_mode: bounds
n_jobs: 4
3.274 seconds.
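Conceptually, what all three implementations do for this design is the same: permute the treatment indicator, refit the regression, and count how often the permuted coefficient is at least as extreme as the observed one. A minimal (and deliberately unoptimized) NumPy sketch of that loop, with hypothetical names, run here on a tiny synthetic dataset:

```python
import numpy as np

def ri_pvalue(y, X, treat_col, reps=2000, seed=23):
    """Two-sided randomization-inference p-value for one OLS coefficient.
    Illustrative sketch only, not the optimized engine benchmarked above."""
    rng = np.random.default_rng(seed)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    t_obs = beta[treat_col]                     # observed coefficient
    c = 0
    Xp = X.copy()
    for _ in range(reps):
        # Re-randomize treatment (no clusters, no strata), refit, compare
        Xp[:, treat_col] = rng.permutation(X[:, treat_col])
        b = np.linalg.lstsq(Xp, y, rcond=None)[0]
        if abs(b[treat_col]) >= abs(t_obs):
            c += 1
    return t_obs, c / reps                      # p = c / n

# Tiny synthetic check: outcome unrelated to treatment
rng = np.random.default_rng(0)
n = 500
treat = rng.integers(0, 2, n).astype(float)
x1 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 + rng.normal(size=n)
X = np.column_stack([np.ones(n), treat, x1])
t_obs, p = ri_pvalue(y, X, treat_col=1, reps=200, seed=1)
```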
Discussion
The results of the randomization inference align across the three implementations. In terms of performance, the Python implementation is fastest (3.274 s), followed by the R implementation (11.24 s) and the Stata implementation (183 s). Compared to the Colombian example, where the R implementation was faster thanks to its explicit integration with fixest for estimation with multiple fixed effects, the Python implementation is now faster thanks to its fast engine specialized for OLS.
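A quick back-of-the-envelope check, using only the p-values reported above, that the three results are mutually consistent given Monte Carlo error from 2,000 permutations:

```python
import math

reps = 2000
pvals = {"Stata": 0.3930, "R": 0.4195, "Python": 0.3990}

# Binomial Monte Carlo SE of each estimated p-value
se = {k: math.sqrt(p * (1 - p) / reps) for k, p in pvals.items()}

# Largest gap between implementations, in units of the SE of a difference
# between two independent estimates
max_gap = max(pvals.values()) - min(pvals.values())
se_diff = math.sqrt(2) * max(se.values())
z = max_gap / se_diff  # below 2, i.e. within ordinary Monte Carlo noise
```

The largest gap (Stata vs R, about 0.027) is under two standard errors of a difference, so the spread is what one would expect from permutation noise alone.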