Measuring Diversity with the Gini–Simpson Index


Simpson index

\lambda =\sum_{i=1}^{R}p_{i}^{2}

The measure equals the probability that two entities taken at random from the dataset (with replacement) represent the same type, where p_i is the proportion of entities belonging to type i and R is the total number of types in the dataset.
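As a minimal sketch of the formula above, assuming the data is available as raw counts per type, each proportion p_i is the count of type i divided by the total count. The function name `simpson_index` and the sample counts are illustrative and do not come from the original article.

```python
def simpson_index(counts):
    """Return lambda = sum(p_i ** 2) for an iterable of per-type counts."""
    counts = [c for c in counts if c > 0]  # types with zero count contribute nothing
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one positive value")
    return sum((c / total) ** 2 for c in counts)


# Hypothetical dataset: three types with 50, 30 and 20 entities.
example_lambda = simpson_index([50, 30, 20])
print(example_lambda)  # 0.5**2 + 0.3**2 + 0.2**2, roughly 0.38
```

A dataset containing a single type gives a value of 1, since any two draws are then certain to match.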

 

Gini–Simpson index

The transformation 1-\lambda equals the probability that the two entities represent different types.

The more even the distribution, the higher the index; the more concentrated the distribution, the lower the index.
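This behaviour can be checked numerically. The sketch below compares a perfectly even distribution with a highly concentrated one over the same number of types; the helper name `gini_simpson` and both sample distributions are assumptions made for illustration. For R equally common types, p_i = 1/R, so the index reaches its maximum of 1 - 1/R.

```python
def gini_simpson(counts):
    """Return 1 - sum(p_i ** 2) for an iterable of per-type counts."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)


even = gini_simpson([25, 25, 25, 25])        # four equally common types
concentrated = gini_simpson([97, 1, 1, 1])   # one type dominates

print(even)          # 0.75, i.e. 1 - 1/R with R = 4
print(concentrated)  # about 0.0588
```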

 

Code

import pandas as pd

def gini_calc(df2):
    # Gini-Simpson index: 1 minus the sum of squared type proportions
    total = df2['cnt'].sum()
    props = df2['cnt'] / total
    return 1 - (props ** 2).sum()

df = pd.read_excel('gini.xlsx')
# Total the counts for each (population, subpopulation, type) combination
df = df.groupby(['population', 'subpopulation', 'type'], as_index=False)['cnt'].sum()

a = []
b = []
c = []
# One index value per (population, subpopulation) group
for name, group in df.groupby(['population', 'subpopulation']):
    index = gini_calc(group)
    a.append(name[0])
    b.append(name[1])
    c.append(index)

res = {"population": a, "subpopulation": b, "gini_simpson_index": c}
data = pd.DataFrame(res)
data.to_csv('gini_result.csv')
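To try the per-group calculation without an Excel file, here is a small sketch on an in-memory DataFrame. The column names `population`, `subpopulation`, `type` and `cnt` follow the script above; the sample rows and the helper name `gini_simpson_by_group` are made up for illustration, and the vectorised groupby approach is one possible alternative to the explicit loop, not the article's own method.

```python
import pandas as pd


def gini_simpson_by_group(df):
    """Gini-Simpson index of `cnt` over `type`, per (population, subpopulation)."""
    keys = ["population", "subpopulation"]
    # Collapse duplicate rows so each type appears once per group.
    counts = df.groupby(keys + ["type"])["cnt"].sum()
    # Proportion of each type within its (population, subpopulation) group.
    totals = counts.groupby(level=keys).transform("sum")
    props = counts / totals
    index = 1 - (props ** 2).groupby(level=keys).sum()
    return index.rename("gini_simpson_index").reset_index()


# Hypothetical sample data.
sample = pd.DataFrame({
    "population":    ["A", "A", "A", "A", "B", "B"],
    "subpopulation": ["a1", "a1", "a1", "a2", "b1", "b1"],
    "type":          ["x", "y", "x", "x", "x", "y"],
    "cnt":           [10, 20, 10, 5, 30, 10],
})

result = gini_simpson_by_group(sample)
print(result)
```

In this sample, group (A, a1) has two equally common types and scores 0.5, while group (A, a2) contains a single type and scores 0.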

If you repost this article, please retain the source: https://51itzy.com/kjqy/42350.html