How to Run a Propensity Score Matching Study in Python

#PropensityScoreMatch #Python #Statistics

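I recently started tidying up files from earlier research. In one retrospective cohort study, the treatment and control groups turned out to be badly imbalanced in size: the control group was more than 10 times larger than the treatment group. One way to reduce the imbalance between the groups is propensity score matching, which pairs each treated subject with a control subject who had a similar estimated probability of receiving the treatment.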

Here are two ways to handle this in Python:

## Method 1

### Code

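    from sklearn.neighbors import NearestNeighbors
    from sklearn.preprocessing import StandardScaler


    def get_matching_pairs(treated_df, non_treated_df, scaler=True):
        """Return one nearest control row for each treated row."""
        treated_x = treated_df.values
        non_treated_x = non_treated_df.values

        # scaler=True means "use a default StandardScaler"; any scaler object
        # with fit/transform can be passed instead, or False to skip scaling.
        if scaler is True:
            scaler = StandardScaler()
        if scaler:
            # Fit on the treated group, then apply the same transform to both.
            scaler.fit(treated_x)
            treated_x = scaler.transform(treated_x)
            non_treated_x = scaler.transform(non_treated_x)

        # Nearest control (Euclidean distance) for every treated subject.
        # A control can be picked more than once (matching with replacement).
        nbrs = NearestNeighbors(n_neighbors=1, algorithm='ball_tree').fit(non_treated_x)
        distances, indices = nbrs.kneighbors(treated_x)
        indices = indices.reshape(indices.shape[0])
        matched = non_treated_df.iloc[indices]
        return matched


    # treated_df / non_treated_df hold only the covariate columns to match on.
    matched_df = get_matching_pairs(treated_df, non_treated_df)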
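The call above assumes `treated_df` and `non_treated_df` already exist and contain only the covariates to match on. As a rough sketch, they could be split out of a single cohort table like this. The data here is synthetic, and the column names `treated`, `age`, `male`, and `edu` are borrowed from the formula in Method 2 purely for illustration; they are not the study's actual variables.

```python
import numpy as np
import pandas as pd

# Synthetic cohort for illustration only; not data from the study.
rng = np.random.default_rng(0)
n = 1100
df = pd.DataFrame({
    "age": rng.normal(60, 10, n),
    "male": rng.integers(0, 2, n),
    "edu": rng.integers(6, 19, n),
})

# Mark the first 100 rows as treated, giving roughly the 1:10 ratio
# described in the introduction.
df["treated"] = (df.index < 100).astype(int)

# Keep only the covariate columns; the treatment flag itself is not matched on.
covariates = ["age", "male", "edu"]
treated_df = df.loc[df["treated"] == 1, covariates]
non_treated_df = df.loc[df["treated"] == 0, covariates]

print(treated_df.shape, non_treated_df.shape)  # (100, 3) (1000, 3)
```

With frames shaped like these, `get_matching_pairs(treated_df, non_treated_df)` returns one control row per treated row, so the matched control group has the same number of rows as the treated group.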

### Explanation and thoughts

## Method 2

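    import statsmodels.formula.api as smf

    # Logistic regression: treatment assignment as a function of the covariates.
    model = 'treated ~ age + male + edu'
    propensity = smf.logit(formula=model, data=df).fit()
    propensity.summary()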
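The model above only estimates the propensity score; the matching step itself is not shown. Below is a minimal sketch of how one might continue, assuming 1:1 nearest-neighbour matching with replacement on the predicted score. The synthetic data, the `pscore` column name, and the choice of `NearestNeighbors` for the matching step are all assumptions made for illustration, not necessarily what the original study used.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.neighbors import NearestNeighbors

# Synthetic data for illustration only (not the study's data).
rng = np.random.default_rng(42)
n = 2000
df = pd.DataFrame({
    "age": rng.normal(60, 10, n),
    "male": rng.integers(0, 2, n),
    "edu": rng.integers(6, 19, n),
})
# Treatment is more likely for older and male subjects, so the two
# groups differ before matching.
logit_p = -2.5 + 0.05 * (df["age"] - 60) + 0.3 * df["male"]
df["treated"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# Step 1: fit the propensity model from the text (disp=0 silences the log).
propensity = smf.logit("treated ~ age + male + edu", data=df).fit(disp=0)

# Step 2: predicted probability of treatment for every subject.
df["pscore"] = propensity.predict(df)

# Step 3: for each treated subject, pick the control with the closest score.
# The same control may be chosen more than once (matching with replacement).
treated = df[df["treated"] == 1]
control = df[df["treated"] == 0]
nbrs = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]].to_numpy())
_, idx = nbrs.kneighbors(treated[["pscore"]].to_numpy())
matched_control = control.iloc[idx.ravel()]

# Step 4: compare the mean propensity score gap before and after matching.
gap_before = abs(treated["pscore"].mean() - control["pscore"].mean())
gap_after = abs(treated["pscore"].mean() - matched_control["pscore"].mean())
print(f"mean pscore gap before: {gap_before:.4f}, after: {gap_after:.4f}")
```

Matching on the single score rather than on the raw covariates is the main difference from Method 1: the logistic model collapses `age`, `male`, and `edu` into one number, so the nearest-neighbour search runs in one dimension.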

### Thoughts
