2024-02-09

James Belmonte, Staff Data Scientist – Causal Inference, Covera Health

"Iterative Orthogonal Regression For The Identification Of Direct Effect Modifiers"

Advances in causal machine learning have produced methods that non-parametrically predict conditional average treatment effects (CATE), facilitating effect modifier identification by testing associations between variables and estimated CATE. However, even with perfect CATE estimation, this approach cannot differentiate between effect modifiers that are merely associated with effect heterogeneity and those that cause it, i.e., “direct” effect modifiers (DEMs). In many applications, this distinction is unimportant. For example, when the goal is to predict which patients are most likely to benefit from a risky medical intervention, learning associations between patient characteristics and treatment benefit is sufficient. However, when the research goal is to better understand causal mechanisms, investigate policy bias, or optimize an existing program, the identification of DEMs may be required. To this end, we introduce Iterative Orthogonal Regression (IOR), a novel framework for the identification of DEMs. Inspired by the Frisch-Waugh-Lovell theorem, IOR identifies DEMs by iteratively testing the association between estimated CATE and the residuals of each effect modifier, where each effect modifier is residualized on the others. We simulated 250 data-generating processes in which 20% of effect modifiers were DEMs. Under the traditional approach, DEM status was predicted from the p-value of each effect modifier’s coefficient when CATE was regressed on that modifier individually. IOR was then applied, and DEM status was re-predicted. The results demonstrate IOR’s ability to identify DEMs with high precision and recall, whereas the traditional approach cannot differentiate DEMs from other effect modifiers (IOR vs. traditional): precision 0.87 vs. 0.27, recall 0.99 vs. 1, AUPRC 0.90 vs. 0.34. IOR is validated as a novel framework to identify DEMs for binary, categorical, or continuous exposures and can be applied to tasks such as uncovering causal mechanisms, researching health equity, and program optimization.
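The residualization idea in the abstract can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the speaker's implementation: the CATE is simulated directly instead of being estimated by a causal learner, the data-generating process and the helper names `slope_p_value` and `residualize` are hypothetical, p-values use a normal approximation, and the code makes a single pass over the modifiers because the abstract does not specify the iterative scheme.

```python
import math
import numpy as np

def slope_p_value(x, y):
    """Two-sided p-value for the slope of y ~ x (normal approximation)."""
    x = x - x.mean()
    y = y - y.mean()
    beta = (x @ y) / (x @ x)
    resid = y - beta * x
    se = math.sqrt((resid @ resid) / (len(x) - 2) / (x @ x))
    return math.erfc(abs(beta / se) / math.sqrt(2))

def residualize(X, j):
    """Residual of column j after OLS on an intercept and all other columns."""
    others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
    coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
    return X[:, j] - others @ coef

# Hypothetical data: x0 drives effect heterogeneity directly;
# x1 and x2 are only associated with it through x0.
rng = np.random.default_rng(0)
n = 2000
x0 = rng.normal(size=n)
x1 = 0.8 * x0 + 0.6 * rng.normal(size=n)
x2 = 0.8 * x0 + 0.6 * rng.normal(size=n)
X = np.column_stack([x0, x1, x2])

# Stand-in for an estimated CATE (assumption: noise mimics estimation error).
cate = 1.0 + 2.0 * x0 + 0.5 * rng.normal(size=n)

# Traditional approach: regress CATE on each modifier individually.
marginal_p = [slope_p_value(X[:, j], cate) for j in range(X.shape[1])]

# Residualized approach: regress CATE on each modifier's residual
# after partialling out the other modifiers (Frisch-Waugh-Lovell idea).
residual_p = [slope_p_value(residualize(X, j), cate) for j in range(X.shape[1])]

for j in range(X.shape[1]):
    print(f"x{j}: marginal p = {marginal_p[j]:.3g}, residualized p = {residual_p[j]:.3g}")
```

In this toy setup all three modifiers look significant when tested individually, while only `x0` remains significant after residualization, which mirrors the contrast between the traditional approach and IOR described above.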
