Chi-Square Test Example (Python)

Do Black and white applicants face racial discrimination when looking for a job?

import pandas as pd
import numpy as np
from scipy import stats

data = pd.read_stata('us_job_market_discrimination.dta')
data.head()

blacks = data[data.race == 'b']
whites = data[data.race == 'w']

Summary of the call column for Black applicants:

blacks.call.describe()

count    2435.000000
mean        0.064476
std         0.
min         0.000000
25%         0.000000
50%         0.000000
75%         0.000000
max         1.000000
Name: call, dtype: float64

Summary of the call column for white applicants:

whites.call.describe()

count    2435.000000
mean        0.096509
std         0.
min         0.000000
25%         0.000000
50%         0.000000
75%         0.000000
max         1.000000
Name: call, dtype: float64
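Because call is a 0/1 column, its mean is the callback rate of the group. The sketch below illustrates this on a stand-in table rather than the real .dta file. The callback counts 157 and 235 are an assumption of this example, inferred from mean × count in the two summaries above; the name toy is hypothetical.

```python
import pandas as pd

# Counts implied by the describe() output (an assumption of this sketch):
# 2435 resumes per group; mean * count gives about 157 and 235 callbacks.
n_per_group = 2435
callbacks = {"b": 157, "w": 235}

# Rebuild a stand-in for the race and call columns, one row per resume.
rows = []
for race, n_called in callbacks.items():
    rows += [(race, 1.0)] * n_called
    rows += [(race, 0.0)] * (n_per_group - n_called)
toy = pd.DataFrame(rows, columns=["race", "call"])

# For a 0/1 column the mean is the share of ones, i.e. the callback rate.
rates = toy.groupby("race")["call"].mean()
counts = toy.groupby("race")["call"].sum()
print(rates.round(6))
print(counts)
```

The rounded rates agree with the means reported by describe(), which is what makes the inferred counts plausible.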

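Chi-square test

  • White applicants who received a callback
  • White applicants who were rejected
  • Black applicants who received a callback
  • Black applicants who were rejected

Hypothesis test

  • H0: race has no significant effect on the job-search outcome
  • H1: race has an effect on the job-search outcome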
blacks_called = len(blacks[blacks['call'] == True])       # Black applicants who received a callback
blacks_not_called = len(blacks[blacks['call'] == False])  # Black applicants who were rejected
whites_called = len(whites[whites['call'] == True])       # white applicants who received a callback
whites_not_called = len(whites[whites['call'] == False])  # white applicants who were rejected

observed = pd.DataFrame({
    'blacks': {'called': blacks_called, 'not_called': blacks_not_called},
    'whites': {'called': whites_called, 'not_called': whites_not_called},
})
observed
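The same 2x2 table can also be built in one step with pd.crosstab. This is a sketch on stand-in data, not the author's code: the frame toy is hypothetical, and its cell counts (157 and 235 callbacks out of 2435 resumes per group) are assumptions inferred from the group means shown earlier.

```python
import pandas as pd

# Hypothetical stand-in data; counts inferred from the group means
# (157 and 235 callbacks out of 2435 resumes per group).
toy = pd.DataFrame({
    "race": ["b"] * 2435 + ["w"] * 2435,
    "call": [1.0] * 157 + [0.0] * 2278 + [1.0] * 235 + [0.0] * 2200,
})

# Rows: callback outcome; columns: race.
# margins=True adds an "All" row and column holding the totals.
table = pd.crosstab(toy["call"], toy["race"], margins=True)
print(table)
```

The "All" margins give the row and column totals that the expected counts are computed from.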

                           

num_called_back = blacks_called + whites_called          # total number of callbacks
num_not_called = blacks_not_called + whites_not_called   # total number of rejections
print(num_called_back)
print(num_not_called)

392
4478
rate_of_callbacks = num_called_back / (num_not_called + num_called_back)
rate_of_callbacks

0.080493 (392 / 4870, rounded)
expected_called = len(data) * rate_of_callbacks
expected_not_called = len(data) * (1 - rate_of_callbacks)
print(expected_called)
print(expected_not_called)

392.0 (to within floating-point rounding)
4478.0
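The totals above are pooled over both groups. To get an expected count for each of the four cells, the usual rule is row total × column total / grand total. Below is a minimal sketch of that rule, assuming the cell counts 157, 235, 2278 and 2200, which are inferred from the printed totals and the group means rather than read from the data file.

```python
import numpy as np

# Observed 2x2 table; the counts are assumptions inferred from the printed
# totals (392 callbacks, 4478 rejections) and the group means.
#                     blacks  whites
observed = np.array([[157,    235],    # called
                     [2278,   2200]])  # not called

row_totals = observed.sum(axis=1)   # callbacks and rejections over both groups
col_totals = observed.sum(axis=0)   # resumes per group
n = observed.sum()                  # all resumes

# Expected count for each cell = row total * column total / grand total.
expected = np.outer(row_totals, col_totals) / n
print(expected)
```

Because both groups contain the same number of resumes, each expected cell here is simply half of the pooled total; the outer-product form also covers groups of unequal size.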
import scipy.stats as stats

# Observed counts
observed_frequencies = [blacks_not_called, whites_not_called, whites_called, blacks_called]
# Expected counts (both groups are the same size, so each gets half of the pooled total)
expected_frequencies = [expected_not_called/2, expected_not_called/2, expected_called/2, expected_called/2]
# Chi-square test
stats.chisquare(f_obs=observed_frequencies, f_exp=expected_frequencies)

Power_divergenceResult(statistic=16.879, pvalue=0.00075) (values rounded)
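As a cross-check, the statistic can be recomputed by hand from the Pearson formula, the sum over cells of (observed − expected)² / expected. The sketch below assumes the four cell counts inferred from the totals and group means (2278, 2200, 235, 157) and the matching expected counts (2239, 2239, 196, 196); it is not run against the original data file.

```python
import numpy as np
from scipy import stats

# Cell counts are assumptions inferred from the totals and group means
# reported earlier; the order matches observed_frequencies.
observed = np.array([2278, 2200, 235, 157], dtype=float)
expected = np.array([2239, 2239, 196, 196], dtype=float)

# Pearson statistic: sum over cells of (observed - expected)^2 / expected.
statistic = ((observed - expected) ** 2 / expected).sum()

# stats.chisquare uses len(f_obs) - 1 - ddof degrees of freedom,
# which is 3 here with the default ddof=0.
p_value = stats.chi2.sf(statistic, df=len(observed) - 1)

result = stats.chisquare(f_obs=observed, f_exp=expected)
print(statistic, p_value)
print(result.statistic, result.pvalue)
```

Both lines should print the same pair of numbers, which confirms that stats.chisquare treats the four cells as a goodness-of-fit problem with 3 degrees of freedom.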

The p-value is below 0.05, so we reject H0, the hypothesis that race has no significant effect on the job-search outcome.
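The call to stats.chisquare above treats the four cells as one goodness-of-fit problem with 3 degrees of freedom. A common alternative for a 2x2 table is a test of independence, which has 1 degree of freedom. The sketch below uses scipy.stats.chi2_contingency for that; it is an alternative formulation rather than the method used in this article, and the cell counts are assumptions inferred from the totals and group means shown earlier.

```python
import numpy as np
from scipy import stats

# Assumed 2x2 counts, inferred from the totals and group means shown earlier.
observed = np.array([[157, 235],      # called:     blacks, whites
                     [2278, 2200]])   # not called: blacks, whites

# correction=False disables Yates' continuity correction, so the statistic
# matches the plain Pearson formula.
chi2, p, dof, expected = stats.chi2_contingency(observed, correction=False)
print(chi2, p, dof)
```

With these counts the statistic is the same as before, but with 1 degree of freedom the p-value is smaller still, so the conclusion does not change.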

 

If you repost this article, please keep the source: https://51itzy.com/kjqy/46347.html