Non-linear models
`ritest()` can do randomization inference (RI) for non-linear models by using the generic statistic path:

- You provide `stat_fn(df_perm) -> float`, returning one scalar (the test statistic).
- `ritest()` repeatedly permutes `permute_var` (e.g., treatment `z`) and re-computes the statistic.
- The RI \(p\)-value comes from the permutation distribution of that statistic (see the sketch after this list).
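For intuition, here is a minimal hand-rolled sketch of how such a two-sided permutation \(p\)-value is formed. The helper name `permutation_pvalue`, the statistic `my_stat`, and the column name `"z"` are illustrative only and are not part of the `ritest` API:

```python
import numpy as np
import pandas as pd

def permutation_pvalue(df: pd.DataFrame, my_stat, reps: int = 1000, seed: int = 0) -> float:
    # Illustrative sketch, not ritest internals.
    rng = np.random.default_rng(seed)
    observed = my_stat(df)                       # statistic on the real assignment
    draws = np.empty(reps)
    for r in range(reps):
        df_perm = df.copy()
        # Reshuffle the treatment column under the sharp null of no effect.
        df_perm["z"] = rng.permutation(df_perm["z"].to_numpy())
        draws[r] = my_stat(df_perm)
    # Two-sided: share of permuted statistics at least as extreme as the observed one
    # (assuming the statistic is centered at zero under the null).
    return float(np.mean(np.abs(draws) >= abs(observed)))
```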
All snippets of code in this section come from this script.
Data
Each example uses its own toy data, designed to be appropriate for the corresponding model.
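The `design(...)` helper that every wrapper below calls is defined in the script rather than shown here. A plausible minimal version, together with illustrative binary-outcome toy data for the logit example, could look like this (the sample size, coefficients, and data-generating process are assumptions, not the script's actual data):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def design(df: pd.DataFrame) -> pd.DataFrame:
    # Constant + treatment + covariate; the "z" column name must match what stat_fn extracts.
    return sm.add_constant(df[["z", "x"]])

# Illustrative binary-outcome toy data for the logit example.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
z = rng.integers(0, 2, size=n)                            # randomized treatment
p = 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * z + 0.8 * x)))     # true logit model
df = pd.DataFrame({"y": rng.binomial(1, p), "z": z, "x": x})
```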
Logit
Define the function `stat_logit(...)`:

```python
def stat_logit(df_perm: pd.DataFrame) -> float:
    X = design(df_perm)
    y = df_perm["y"]
    fit = sm.GLM(
        y, X,
        family=sm.families.Binomial(link=sm.families.links.Logit()),
    ).fit(disp=False)
    return float(fit.params["z"])
```

Call `ritest()`:
```python
res = ritest(
    df=df,
    permute_var="z",
    stat_fn=stat_logit,
    alternative="two-sided",
    reps=1000,
    seed=1,
    ci_mode="none",
)
```

```text
Randomization inference result (ritest)
--------------------------------------
Coefficient
-----------
Observed effect (β̂): 0.9404
Coefficient CI bounds: not computed
Coefficient CI band: not computed
Permutation test
----------------
Tail (alternative): two-sided
p-value: 0.0180 (1.8%)
P-value CI @ α=0.050: [0.0083, 0.0339]
As-or-more extreme: 9 / 500
```
Probit
Define the function `stat_probit(...)`:

```python
def stat_probit(df_perm: pd.DataFrame) -> float:
    X = design(df_perm)
    y = df_perm["y"]
    fit = sm.GLM(
        y, X,
        family=sm.families.Binomial(link=sm.families.links.Probit()),
    ).fit(disp=False)
    return float(fit.params["z"])
```

`ritest` call:
```python
res = ritest(
    df=df,
    permute_var="z",
    stat_fn=stat_probit,
    alternative="two-sided",
    reps=1000,
    seed=2,
    ci_mode="none",
)
```

```text
Randomization inference result (ritest)
--------------------------------------
Coefficient
-----------
Observed effect (β̂): 0.5865
Coefficient CI bounds: not computed
Coefficient CI band: not computed
Permutation test
----------------
Tail (alternative): two-sided
p-value: 0.0320 (3.2%)
P-value CI @ α=0.050: [0.0184, 0.0514]
As-or-more extreme: 16 / 500
```
Poisson
Define `stat_poisson(...)`:

```python
def stat_poisson(df_perm: pd.DataFrame) -> float:
    X = design(df_perm)
    y = df_perm["y"]
    fit = sm.GLM(y, X, family=sm.families.Poisson()).fit(disp=False)
    return float(fit.params["z"])
```

`ritest` call:
```python
res = ritest(
    df=df,
    permute_var="z",
    stat_fn=stat_poisson,
    alternative="two-sided",
    reps=1000,
    seed=3,
    ci_mode="none",
)
```

```text
Randomization inference result (ritest)
--------------------------------------
Coefficient
-----------
Observed effect (β̂): 0.3636
Coefficient CI bounds: not computed
Coefficient CI band: not computed
Permutation test
----------------
Tail (alternative): two-sided
p-value: 0.0200 (2.0%)
P-value CI @ α=0.050: [0.0096, 0.0365]
As-or-more extreme: 10 / 500
```
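For concreteness, count-outcome toy data of the kind this example needs could be generated roughly as follows (purely illustrative; not the script's actual data):

```python
import numpy as np
import pandas as pd

# Illustrative only: counts drawn from a Poisson model with a treatment effect on the log scale.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
z = rng.integers(0, 2, size=n)
mu = np.exp(0.2 + 0.35 * z + 0.5 * x)   # mean on the original scale
df = pd.DataFrame({"y": rng.poisson(mu), "z": z, "x": x})
```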
Negative Binomial
Define `stat_negbin_fixed_alpha(...)`:

```python
def stat_negbin_fixed_alpha(df_perm: pd.DataFrame, *, alpha: float = 1.0) -> float:
    X = design(df_perm)
    y = df_perm["y"]
    fit = sm.GLM(
        y, X,
        family=sm.families.NegativeBinomial(alpha=alpha),
    ).fit(disp=False)
    return float(fit.params["z"])
```

`ritest` call with a lambda function to handle the extra parameter:
```python
res = ritest(
    df=df,
    permute_var="z",
    stat_fn=lambda d: stat_negbin_fixed_alpha(d, alpha=1.0),
    alternative="two-sided",
    reps=1000,
    seed=4,
    ci_mode="none",
)
```

```text
Randomization inference result (ritest)
--------------------------------------
Coefficient
-----------
Observed effect (β̂): 0.1609
Coefficient CI bounds: not computed
Coefficient CI band: not computed
Permutation test
----------------
Tail (alternative): two-sided
p-value: 0.4840 (48.4%)
P-value CI @ α=0.050: [0.4394, 0.5288]
As-or-more extreme: 242 / 500
```
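If you prefer to avoid the lambda, `functools.partial` binds the extra `alpha` argument just as well; the call below is a stylistic alternative, otherwise identical to the one above:

```python
from functools import partial

# Equivalent to the lambda: fixes alpha=1.0 so df_perm is the only remaining argument.
res = ritest(
    df=df,
    permute_var="z",
    stat_fn=partial(stat_negbin_fixed_alpha, alpha=1.0),
    alternative="two-sided",
    reps=1000,
    seed=4,
    ci_mode="none",
)
```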
Fractional logit
Define `stat_fractional_logit(...)`:

```python
def stat_fractional_logit(df_perm: pd.DataFrame) -> float:
    X = design(df_perm)
    y = df_perm["y"]
    fit = sm.GLM(
        y, X,
        family=sm.families.Binomial(link=sm.families.links.Logit()),
    ).fit(disp=False)
    return float(fit.params["z"])
```

RI call:
```python
res = ritest(
    df=df,
    permute_var="z",
    stat_fn=stat_fractional_logit,
    alternative="two-sided",
    reps=1000,
    seed=5,
    ci_mode="none",
)
```

```text
Randomization inference result (ritest)
--------------------------------------
Coefficient
-----------
Observed effect (β̂): 0.5334
Coefficient CI bounds: not computed
Coefficient CI band: not computed
Permutation test
----------------
Tail (alternative): two-sided
p-value: 0.0000 (0.0%)
P-value CI @ α=0.050: [0.0000, 0.0074]
As-or-more extreme: 0 / 500
```
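Fractional-outcome toy data, i.e., shares in \([0, 1]\), might be simulated along these lines for this example (illustrative only; the script's actual data may differ):

```python
import numpy as np
import pandas as pd

# Illustrative only: outcomes are proportions in [0, 1], fit with a binomial GLM (fractional logit).
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
z = rng.integers(0, 2, size=n)
mean = 1.0 / (1.0 + np.exp(-(-0.2 + 0.5 * z + 0.6 * x)))     # mean share on (0, 1)
y = np.clip(mean + rng.normal(scale=0.1, size=n), 0.0, 1.0)  # add noise, keep within [0, 1]
df = pd.DataFrame({"y": y, "z": z, "x": x})
```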
Tobit
Define `stat_tobit(...)`:

```python
from scipy.optimize import minimize
from scipy.stats import norm

def stat_tobit(df_perm: pd.DataFrame) -> float:
    y = df_perm["y"].to_numpy(dtype=float)
    X = design(df_perm).to_numpy(dtype=float)
    is_cens = y <= 0.0
    is_obs = ~is_cens

    # theta = [beta0, beta_z, beta_x, log_sigma]
    def nll(theta: np.ndarray) -> float:
        beta = theta[:-1]
        log_sigma = theta[-1]
        sigma = np.exp(log_sigma)
        mu = X @ beta
        z = (y - mu) / sigma
        ll = 0.0
        if np.any(is_obs):
            ll += np.sum(norm.logpdf(z[is_obs]) - log_sigma)
        if np.any(is_cens):
            ll += np.sum(norm.logcdf((0.0 - mu[is_cens]) / sigma))
        return -float(ll)

    theta0 = np.zeros(X.shape[1] + 1, dtype=float)  # simple start
    opt = minimize(nll, theta0, method="BFGS", options={"maxiter": 120})
    beta_hat = opt.x[:-1]
    return float(beta_hat[1])  # [const, z, x] -> pick z
```

`ritest` call:
```python
res = ritest(
    df=df,
    permute_var="z",
    stat_fn=stat_tobit,
    alternative="two-sided",
    reps=500,  # MLE wrappers are slower; start smaller
    seed=6,
    ci_mode="none",
)
```

```text
Randomization inference result (ritest)
--------------------------------------
Coefficient
-----------
Observed effect (β̂): 0.9919
Coefficient CI bounds: not computed
Coefficient CI band: not computed
Permutation test
----------------
Tail (alternative): two-sided
p-value: 0.0000 (0.0%)
P-value CI @ α=0.050: [0.0000, 0.0074]
As-or-more extreme: 0 / 500
```
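The wrapper above treats observations with `y <= 0` as left-censored at zero, so suitable toy data would censor a latent normal outcome from below. One illustrative way to build such data (not the script's actual data):

```python
import numpy as np
import pandas as pd

# Illustrative only: latent outcome y* is normal; observed y is censored from below at zero.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
z = rng.integers(0, 2, size=n)
y_latent = 0.5 + 1.0 * z + 0.8 * x + rng.normal(scale=1.0, size=n)
df = pd.DataFrame({"y": np.maximum(y_latent, 0.0), "z": z, "x": x})
```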
Practical notes
- Your wrapper should be deterministic given `df_perm`.
- Keep the wrapper minimal: compute the statistic and return it.
- `ritest` assumes that `stat_fn` works, i.e., that it always returns a scalar. This is validated before any heavy computation begins, but you should still make sure your statistic function runs correctly before passing it to `ritest` (see the sketch below).
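A cheap sanity check before launching the permutations is to call your wrapper once on the unpermuted data and confirm it returns a finite scalar. The sketch below reuses `stat_logit` and `df` from the logit example:

```python
import math

# Run the wrapper once on the original data before handing it to ritest.
observed = stat_logit(df)
assert isinstance(observed, float) and math.isfinite(observed), "stat_fn must return a finite float"
print(f"Observed statistic: {observed:.4f}")
```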