<p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F1b03437bj00qzepyb001cd200oy00d1g00oy00d1.jpg&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65RPI"><strong>Reprinted by Big Data Digest with permission from 法纳斯特</strong></p><p id="08R65RPK">Hi everyone, this is Xiao F~<br/></p><p id="08R65RPM">Data visualization is a key step in data science.</p><p id="08R65RPO">When you need to present data graphically, Python can be a great help.</p><p id="08R65RPQ">Still, plenty of people get stuck on the same questions: which chart to choose, how to build it, and how to write the code.</p><p id="08R65RPS">Today I'd like to introduce a comprehensive Python chart gallery: 40 chart types and about 400 example charts in total.<br/></p><p id="08R65RPU">The charts are grouped into 7 families: distribution, relationship, ranking, part-to-whole, time series, geospatial, and flow.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F5d394b4ap00qzepyb0010d200m800b4g00m800b4.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65RQ2">Documentation<br/></p><p id="08R65RQ3">https://www.python-graph-gallery.com</p><p id="08R65RQ5">GitHub repository</p><p id="08R65RQ6">https://github.com/holtzy/The-Python-Graph-Gallery</p><p id="08R65RQ8">The gallery provides examples together with their code, so you can build the chart you need in a few minutes.</p><p id="08R65RQA">Let's walk through them~</p><p id="08R65RQD"><strong>01. 
Violin plot</strong></p><p id="08R65RQF">A violin plot visualizes the distribution of a numeric variable for one or several groups.</p><p id="08R65RQH">Compared with box plots, which can sometimes hide features of the data, violin plots deserve more attention.</p><p id="08R65RQJ">import seaborn as sns<br/>import matplotlib.pyplot as plt</p><p id="08R65RQK"># Load the data<br/>df = sns.load_dataset('iris', data_home='seaborn-data', cache=True)</p><p id="08R65RQL"># Draw and show the plot<br/>sns.violinplot(x=df["species"], y=df["sepal_length"])<br/>plt.show()<br/></p><p id="08R65RQN">Drawn with Seaborn's violinplot(); the result is shown below.<br/></p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2Fcfe6246cp00qzepyc000sd200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65RQS"><strong><strong>02.</strong> Kernel density estimate plot</strong></p><p id="08R65RQU">A kernel density estimate (KDE) plot is a natural extension of the histogram.</p><p id="08R65RR0">It visualizes the distribution of a numeric variable for one or several groups and works well for large datasets.</p><p id="08R65RR2">import seaborn as sns<br/>import matplotlib.pyplot as plt</p><p id="08R65RR3"># Load the data<br/>df = sns.load_dataset('iris', data_home='seaborn-data', cache=True)</p><p id="08R65RR4"># Draw and show the plot<br/>sns.kdeplot(df['sepal_width'])<br/>plt.show()<br/></p><p id="08R65RR6">Drawn with Seaborn's kdeplot(); the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F852a95d1p00qzepyc000ld200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65RRB"><strong><strong>03.</strong> Histogram</strong></p><p id="08R65RRD">A histogram visualizes the distribution of one or several groups of data.<br/></p><p id="08R65RRF">import seaborn as sns<br/>import matplotlib.pyplot as plt</p><p id="08R65RRG"># Load the data<br/>df = sns.load_dataset('iris', data_home='seaborn-data', cache=True)</p><p id="08R65RRH"># Draw and show the plot (histogram only, no density curve)<br/>sns.histplot(x=df["sepal_length"], kde=False)<br/>plt.show()<br/></p><p id="08R65RRJ">Drawn with Seaborn's histplot(), which replaces distplot() (deprecated since seaborn 0.11); the result is shown below.</p><p class="f_center"><img 
src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2Fe660a3ebp00qzepyc000ad200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65RRO"><strong><strong>04.</strong> Box plot</strong></p><p id="08R65RRQ">A box plot visualizes the distribution of one or several groups of data.</p><p id="08R65RRS">It gives a quick read of the median, quartiles and outliers, but it also hides the individual data points.</p><p id="08R65RRU">import seaborn as sns<br/>import matplotlib.pyplot as plt</p><p id="08R65RRV"># Load the data<br/>df = sns.load_dataset('iris', data_home='seaborn-data', cache=True)</p><p id="08R65RS0"># Draw and show the plot<br/>sns.boxplot(x=df["species"], y=df["sepal_length"])<br/>plt.show()<br/></p><p id="08R65RS2">Drawn with Seaborn's boxplot(); the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F28eff308p00qzepyd000dd200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65RS7"><strong><strong>05.</strong> Ridgeline plot</strong></p><p id="08R65RS9">A ridgeline plot summarizes the distributions of several groups.</p><p id="08R65RSB">Each group is drawn as a density plot, and the density plots overlap one another to use space more efficiently.</p><p id="08R65RSD">import plotly.graph_objects as go<br/>import numpy as np<br/>import pandas as pd</p><p id="08R65RSE"># Read the data<br/>temp = pd.read_csv('2016-weather-data-seattle.csv')<br/># Parse the dates and extract the year<br/>temp['year'] = pd.to_datetime(temp['Date']).dt.year</p><p id="08R65RSF"># Keep only a few years for display<br/>year_list = [1950, 1960, 1970, 1980, 1990, 2000, 2010]<br/>temp = temp[temp['year'].isin(year_list)]</p><p id="08R65RSG"># Build a histogram per year: group by year and mean temperature, then aggregate with 'count'<br/>temp = temp.groupby(['year', 'Mean_TemperatureC']).agg({'Mean_TemperatureC': 'count'}).rename(columns={'Mean_TemperatureC': 'count'}).reset_index()</p><p id="08R65RSH"># Draw the ridgeline plot with Plotly; each trace is the temperature distribution of one year<br/># Store each year's data (temperatures and their counts) in separate arrays, kept in a dict for easy lookup<br/>array_dict = {}<br/>for year in year_list:<br/>&nbsp;&nbsp;&nbsp;&nbsp;# Mean temperatures of the year<br/>&nbsp;&nbsp;&nbsp;&nbsp;array_dict[f'x_{year}'] = temp[temp['year'] == year]['Mean_TemperatureC']<br/>&nbsp;&nbsp;&nbsp;&nbsp;# Count of each temperature in the year<br/>&nbsp;&nbsp;&nbsp;&nbsp;array_dict[f'y_{year}'] = temp[temp['year'] == year]['count']<br/>&nbsp;&nbsp;&nbsp;&nbsp;# Normalize the counts to the range [0, 1]<br/>&nbsp;&nbsp;&nbsp;&nbsp;array_dict[f'y_{year}'] = 
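(array_dict[f'y_{year}'] - array_dict[f'y_{year}'].min()) / (array_dict[f'y_{year}'].max() - array_dict[f'y_{year}'].min())</p><p id="08R65RSI"># Create the figure object<br/>fig = go.Figure()<br/>for index, year in enumerate(year_list):<br/>&nbsp;&nbsp;&nbsp;&nbsp;# Draw a white baseline for this year with add_trace()<br/>&nbsp;&nbsp;&nbsp;&nbsp;fig.add_trace(go.Scatter(<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x=[-20, 40], y=np.full(2, len(year_list) - index),<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;mode='lines',<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;line_color='white'))</p><p id="08R65RSJ">&nbsp;&nbsp;&nbsp;&nbsp;# Draw the year's distribution, filled down to the baseline<br/>&nbsp;&nbsp;&nbsp;&nbsp;fig.add_trace(go.Scatter(<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x=array_dict[f'x_{year}'],<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;y=array_dict[f'y_{year}'] + (len(year_list) - index) + 0.4,<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fill='tonexty',<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;name=f'{year}'))</p><p id="08R65RSK">&nbsp;&nbsp;&nbsp;&nbsp;# Label the ridge with its year<br/>&nbsp;&nbsp;&nbsp;&nbsp;fig.add_annotation(<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x=-20,<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;y=len(year_list) - index,<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;text=f'{year}',<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;showarrow=False,<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;yshift=10)</p><p id="08R65RSL"># Set the title, legend and axis options<br/>fig.update_layout(<br/>title='Mean temperature in Seattle, 1950 to 2010',<br/>showlegend=False,<br/>xaxis=dict(title='Unit: degrees Celsius'),<br/>yaxis=dict(showticklabels=False)<br/>)</p><p id="08R65RSM"># Show the figure (opens in the browser)<br/>fig.show()<br/></p><p id="08R65RSO">Seaborn has no dedicated function for ridgeline plots, although one can be built by calling kdeplot() several times; the example above uses Plotly instead.</p><p id="08R65RSQ">The result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F119749e9p00qzepyd000sd200u000gag00zk00ja.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65RSV"><strong><strong>06.</strong> Scatter plot</strong></p><p id="08R65RT1">A scatter plot shows the relationship between two numeric variables.</p><p id="08R65RT3">import seaborn as sns<br/>import matplotlib.pyplot as plt</p><p id="08R65RT4"># Load the data<br/>df = sns.load_dataset('iris', data_home='seaborn-data', cache=True)</p><p id="08R65RT5"># Draw and show the plot<br/>sns.regplot(x=df["sepal_length"], y=df["sepal_width"])<br/>plt.show()<br/></p><p id="08R65RT7">Drawn with Seaborn's regplot(), which also fits a regression line by default; the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2Ffe0348d7p00qzepye0015d200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65RTA"><strong><strong><br/></strong></strong></p><p 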
id="08R65RTB"><strong><strong><br/></strong></strong></p><p id="08R65RTC"><strong><strong>07.</strong> Heatmap</strong></p><p id="08R65RTE">In a heatmap, every value of a matrix is represented by a color.</p><p id="08R65RTG">import seaborn as sns<br/>import pandas as pd<br/>import numpy as np<br/>import matplotlib.pyplot as plt</p><p id="08R65RTH"># Create a dataset<br/>df = pd.DataFrame(np.random.random((5,5)), columns=["a","b","c","d","e"])</p><p id="08R65RTI"># Default heatmap<br/>p1 = sns.heatmap(df)<br/>plt.show()<br/></p><p id="08R65RTK">Drawn with Seaborn's heatmap(); the result is shown below.<br/></p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F83fe6c00p00qzepye000ad200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65RTN"><strong><strong><br/></strong></strong></p><p id="08R65RTO"><strong><strong><br/></strong></strong></p><p id="08R65RTP"><strong><strong>08.</strong> Correlogram</strong></p><p id="08R65RTR">A correlogram, or correlation matrix plot, analyzes the relationship between every pair of variables.</p><p id="08R65RTT">Each pairwise relationship is drawn as a scatter plot, and the diagonal shows the distribution of each variable as a histogram or density plot.</p><p id="08R65RTV">import seaborn as sns<br/>import matplotlib.pyplot as plt</p><p id="08R65RU0"># Load the data<br/>df = sns.load_dataset('iris', data_home='seaborn-data', cache=True)</p><p id="08R65RU1"># Draw and show the plot<br/>sns.pairplot(df)<br/>plt.show()<br/></p><p id="08R65RU3">Drawn with Seaborn's pairplot(); the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2Fbfaf1f6cp00qzepyf0038d200u000qmg00zk00vj.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65RU6"><strong><strong><br/></strong></strong></p><p id="08R65RU7"><strong><strong><br/></strong></strong></p><p id="08R65RU8"><strong><strong>09.</strong> Bubble plot</strong></p><p id="08R65RUA">A bubble plot is a scatter plot in which the circle size is mapped to the value of a third numeric variable.</p><p id="08R65RUC">import matplotlib.pyplot as plt<br/>import seaborn as sns<br/>from gapminder import gapminder</p><p id="08R65RUD"># Load the data, keeping the year 2007 only<br/>data = gapminder.loc[gapminder.year == 2007]</p><p id="08R65RUE"># Build the bubble plot with scatterplot()<br/>sns.scatterplot(data=data, x="gdpPercap", 
y="lifeExp", size="pop", legend=False, sizes=(20, 2000))</p><p id="08R65RUF"># Show the plot<br/>plt.show()<br/></p><p id="08R65RUH">Drawn with Seaborn's scatterplot(); the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F3e490a7dp00qzepyf000yd200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65RUK"><strong><strong><br/></strong></strong></p><p id="08R65RUL"><strong><strong><br/></strong></strong></p><p id="08R65RUM"><strong><strong>10.</strong> Connected scatter plot</strong></p><p id="08R65RUO">A connected scatter plot is a line chart in which every data point is also drawn as a circle or any other marker.</p><p id="08R65RUQ">import matplotlib.pyplot as plt<br/>import numpy as np<br/>import pandas as pd</p><p id="08R65RUR"># Create the data<br/>df = pd.DataFrame({'x_axis': range(1, 10), 'y_axis': np.random.randn(9) * 80 + range(1, 10)})</p><p id="08R65RUS"># Draw and show the plot<br/>plt.plot('x_axis', 'y_axis', data=df, linestyle='-', marker='o')<br/>plt.show()<br/></p><p id="08R65RUU">Drawn with Matplotlib's plot(); the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2Fc0e41f47p00qzepyf000ld200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65RV1"><strong><strong><br/></strong></strong></p><p id="08R65RV2"><strong><strong><br/></strong></strong></p><p id="08R65RV3"><strong><strong>11.</strong> 2D density plot</strong></p><p id="08R65RV5">A 2D density plot, or 2D histogram, visualizes the joint distribution of two quantitative variables.</p><p id="08R65RV7">One variable is always shown on the X axis and the other on the Y axis, just as in a scatter plot.</p><p id="08R65RV9">The number of observations falling in each region of the 2D space is then counted and shown with a color gradient.</p><p id="08R65RVB">The variants differ in the shape of the bins: hexagons give a hexbin chart, squares give a 2D histogram, and a kernel density estimate gives 2D density plots or contour plots.</p><p id="08R65RVD">import numpy as np<br/>import matplotlib.pyplot as plt<br/>from scipy.stats import kde</p><p id="08R65RVE"># Create the data: 200 points<br/>data = np.random.multivariate_normal([0, 0], [[1, 0.5], [0.5, 3]], 200)<br/>x, y = data.T</p><p id="08R65RVF"># Create the figure with 6 subplots<br/>fig, axes = plt.subplots(ncols=6, nrows=1, figsize=(21, 5))</p><p id="08R65RVG"># First subplot: 
scatter plot<br/>axes[0].set_title('Scatterplot')<br/>axes[0].plot(x, y, 'ko')</p><p id="08R65RVH"># Second subplot: hexagonal bins<br/>nbins = 20<br/>axes[1].set_title('Hexbin')<br/>axes[1].hexbin(x, y, gridsize=nbins, cmap=plt.cm.BuGn_r)</p><p id="08R65RVI"># 2D histogram<br/>axes[2].set_title('2D Histogram')<br/>axes[2].hist2d(x, y, bins=nbins, cmap=plt.cm.BuGn_r)</p><p id="08R65RVJ"># Gaussian KDE evaluated on a regular nbins x nbins grid<br/>k = kde.gaussian_kde(data.T)<br/>xi, yi = np.mgrid[x.min():x.max():nbins * 1j, y.min():y.max():nbins * 1j]<br/>zi = k(np.vstack([xi.flatten(), yi.flatten()]))</p><p id="08R65RVK"># Density plot<br/>axes[3].set_title('Calculate Gaussian KDE')<br/>axes[3].pcolormesh(xi, yi, zi.reshape(xi.shape), shading='auto', cmap=plt.cm.BuGn_r)</p><p id="08R65RVL"># Add shading<br/>axes[4].set_title('2D Density with shading')<br/>axes[4].pcolormesh(xi, yi, zi.reshape(xi.shape), shading='gouraud', cmap=plt.cm.BuGn_r)</p><p id="08R65RVM"># Add contour lines<br/>axes[5].set_title('Contour')<br/>axes[5].pcolormesh(xi, yi, zi.reshape(xi.shape), shading='gouraud', cmap=plt.cm.BuGn_r)<br/>axes[5].contour(xi, yi, zi.reshape(xi.shape))</p><p id="08R65RVN">plt.show()<br/></p><p id="08R65RVP">Drawn with Matplotlib and SciPy; the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2Fce10866dp00qzepyg001md200u000afg00zk00cc.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65RVU"><strong><strong>12.</strong> Bar plot</strong></p><p id="08R65S00">A bar plot shows the numeric values of several categories. Each category is drawn as one bar, and the length of the bar represents its value.<br/></p><p id="08R65S02">import numpy as np<br/>import matplotlib.pyplot as plt</p><p id="08R65S03"># Create the data<br/>height = [3, 12, 5, 18, 45]<br/>bars = ('A', 'B', 'C', 'D', 'E')<br/>y_pos = np.arange(len(bars))</p><p id="08R65S04"># Create the bars<br/>plt.bar(y_pos, height)</p><p id="08R65S05"># Label the x axis<br/>plt.xticks(y_pos, bars)</p><p id="08R65S06"># Show the plot<br/>plt.show()<br/></p><p id="08R65S08">Drawn with Matplotlib's bar(); the result is shown below.</p><p class="f_center"><img 
src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F432dc9aep00qzepyh0006d200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S0D"><strong><strong>13.</strong> Radar chart</strong></p><p id="08R65S0F">A radar chart visualizes the values of one or several series across multiple quantitative variables.</p><p id="08R65S0H">Each variable has its own axis, and all the axes meet at the center of the figure.<br/></p><p id="08R65S0J">import matplotlib.pyplot as plt<br/>import pandas as pd<br/>from math import pi</p><p id="08R65S0K"># Set up the data<br/>df = pd.DataFrame({<br/>'group': ['A', 'B', 'C', 'D'],<br/>'var1': [38, 1.5, 30, 4],<br/>'var2': [29, 10, 9, 34],<br/>'var3': [8, 39, 23, 24],<br/>'var4': [7, 31, 33, 14],<br/>'var5': [28, 15, 32, 14]<br/>})</p><p id="08R65S0L"># Number of variables<br/>categories = list(df)[1:]<br/>N = len(categories)</p><p id="08R65S0M"># Angle of each axis; the first angle is repeated to close the polygon<br/>angles = [n / float(N) * 2 * pi for n in range(N)]<br/>angles += angles[:1]</p><p id="08R65S0N"># Initialize the polar plot<br/>ax = plt.subplot(111, polar=True)</p><p id="08R65S0O"># Put the first axis at the top and go clockwise<br/>ax.set_theta_offset(pi / 2)<br/>ax.set_theta_direction(-1)</p><p id="08R65S0P"># Draw the axis labels and the radial ticks<br/>plt.xticks(angles[:-1], categories)<br/>ax.set_rlabel_position(0)<br/>plt.yticks([10, 20, 30], ["10", "20", "30"], color="grey", size=7)<br/>plt.ylim(0, 40)</p><p id="08R65S0Q"># Plot the data</p><p id="08R65S0R"># First group<br/>values = df.loc[0].drop('group').values.flatten().tolist()<br/>values += values[:1]<br/>ax.plot(angles, values, linewidth=1, linestyle='solid', label="group A")<br/>ax.fill(angles, values, 'b', alpha=0.1)</p><p id="08R65S0S"># Second group<br/>values = df.loc[1].drop('group').values.flatten().tolist()<br/>values += values[:1]<br/>ax.plot(angles, values, linewidth=1, linestyle='solid', label="group B")<br/>ax.fill(angles, values, 'r', alpha=0.1)</p><p id="08R65S0T"># Add the legend<br/>plt.legend(loc='upper right', bbox_to_anchor=(0.1, 0.1))</p><p id="08R65S0U"># Show the plot<br/>plt.show()<br/></p><p id="08R65S10">Drawn with Matplotlib; the result is shown below.</p><p class="f_center"><img 
src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F608623f2p00qzepyh001ad200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S15"><strong>14. Word cloud</strong></p><p id="08R65S17">A word cloud is a visual representation of text data.</p><p id="08R65S19">The items are usually single words, and the importance of each word is shown by its font size or color.</p><p id="08R65S1B">from wordcloud import WordCloud<br/>import matplotlib.pyplot as plt</p><p id="08R65S1C"># The words to draw<br/>text=("Python Python Python Matplotlib Chart Wordcloud Boxplot")</p><p id="08R65S1D"># Create the word cloud object<br/>wordcloud = WordCloud(width=480, height=480, margin=0).generate(text)</p><p id="08R65S1E"># Show the word cloud<br/>plt.imshow(wordcloud, interpolation='bilinear')<br/>plt.axis("off")<br/>plt.margins(x=0, y=0)<br/>plt.show()<br/></p><p id="08R65S1G">Drawn with the wordcloud library; the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F29c7fb3bp00qzepyi000vd200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S1L"><strong>15. 
Parallel coordinates plot</strong></p><p id="08R65S1N">A parallel coordinates plot compares the values of several series across the same set of attributes.</p><p id="08R65S1P">Pandas is probably the best way to draw a parallel coordinates plot.</p><p id="08R65S1R">import seaborn as sns<br/>import matplotlib.pyplot as plt<br/>from pandas.plotting import parallel_coordinates</p><p id="08R65S1S"># Load the data<br/>data = sns.load_dataset('iris', data_home='seaborn-data', cache=True)</p><p id="08R65S1T"># Create the chart<br/>parallel_coordinates(data, 'species', colormap=plt.get_cmap("Set2"))</p><p id="08R65S1U"># Show the plot<br/>plt.show()<br/></p><p id="08R65S20">Drawn with Pandas' parallel_coordinates(); the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F2314b153p00qzepyj002kd200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S25"><strong><strong>16.</strong> Lollipop plot</strong><br/></p><p id="08R65S27">A lollipop plot is a variation of the bar chart in which each bar is replaced by a line segment and a circle.<br/></p><p id="08R65S29">import matplotlib.pyplot as plt<br/>import pandas as pd<br/>import numpy as np</p><p id="08R65S2A"># Create the data<br/>df = pd.DataFrame({'group': list(map(chr, range(65, 85))), 'values': np.random.uniform(size=20) })</p><p id="08R65S2B"># Sort by value and build the x positions<br/>ordered_df = df.sort_values(by='values')<br/>my_range = range(1, len(df.index)+1)</p><p id="08R65S2C"># Create the chart; the stems use the same x positions as the tick labels<br/>plt.stem(my_range, ordered_df['values'])<br/>plt.xticks(my_range, ordered_df['group'])</p><p id="08R65S2D"># Show the plot<br/>plt.show()<br/></p><p id="08R65S2F">Drawn with Matplotlib's stem(); the result is shown below.<br/></p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2Fac3d1191p00qzepyj000ed200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S2K"><strong><strong>17.</strong> Circular bar plot</strong><br/></p><p id="08R65S2M">A circular bar plot is another variation of the bar chart, drawn in polar coordinates instead of Cartesian coordinates.</p><p id="08R65S2O">It takes a bit more work to draw and is harder to read accurately than a bar chart, but it is more eye-catching.<br/></p><p id="08R65S2Q">import pandas as pd<br/>import matplotlib.pyplot as plt<br/>import numpy as np</p><p id="08R65S2R"># Generate the data<br/>df = pd.DataFrame(<br/>{<br/>'Name': ['item ' + str(i) for i in 
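list(range(1, 51)) ],<br/>'Value': np.random.randint(low=10, high=100, size=50)<br/>})</p><p id="08R65S2S"># Sort by value<br/>df = df.sort_values(by=['Value'])</p><p id="08R65S2T"># Initialize the figure<br/>plt.figure(figsize=(20, 10))<br/>ax = plt.subplot(111, polar=True)<br/>plt.axis('off')</p><p id="08R65S2U"># Chart parameters<br/>upperLimit = 100<br/>lowerLimit = 30<br/>labelPadding = 4</p><p id="08R65S2V"># Compute the maximum value<br/>max_value = df['Value'].max()</p><p id="08R65S30"># Map values to bar heights: 0 maps to lowerLimit and the maximum stays unchanged<br/>slope = (max_value - lowerLimit) / max_value<br/>heights = slope * df.Value + lowerLimit</p><p id="08R65S31"># Compute the width of each bar<br/>width = 2*np.pi / len(df.index)</p><p id="08R65S32"># Compute the angle of each bar<br/>indexes = list(range(1, len(df.index)+1))<br/>angles = [element * width for element in indexes]</p><p id="08R65S33"># Draw the bars<br/>bars = ax.bar(<br/>x=angles,<br/>height=heights,<br/>width=width,<br/>bottom=lowerLimit,<br/>linewidth=2,<br/>edgecolor="white",<br/>color="#61a4b2",<br/>)</p><p id="08R65S34"># Add the labels<br/>for bar, angle, height, label in zip(bars,angles, heights, df["Name"]):</p><p id="08R65S35">&nbsp;&nbsp;&nbsp;&nbsp;# Rotation of the label, in degrees<br/>&nbsp;&nbsp;&nbsp;&nbsp;rotation = np.rad2deg(angle)</p><p id="08R65S36">&nbsp;&nbsp;&nbsp;&nbsp;# Flip the labels on one half of the circle so they are not upside down<br/>&nbsp;&nbsp;&nbsp;&nbsp;alignment = ""<br/>&nbsp;&nbsp;&nbsp;&nbsp;if angle >= np.pi/2 and angle < 3*np.pi/2:<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alignment = "right"<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;rotation = rotation + 180<br/>&nbsp;&nbsp;&nbsp;&nbsp;else:<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alignment = "left"</p><p id="08R65S37">&nbsp;&nbsp;&nbsp;&nbsp;# Finally add the label<br/>&nbsp;&nbsp;&nbsp;&nbsp;ax.text(<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x=angle,<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;y=lowerLimit + bar.get_height() + labelPadding,<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;s=label,<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ha=alignment,<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;va='center',<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;rotation=rotation,<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;rotation_mode="anchor")</p><p id="08R65S38">plt.show()<br/></p><p id="08R65S3A">Drawn with Matplotlib; the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2Fe54901ccp00qzepyk0020d200u000k7g00zk00nx.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S3F"><strong><strong>18.</strong> Treemap</strong><br/></p><p id="08R65S3H">A treemap is a common way to visualize hierarchical, tree-structured data.</p><p id="08R65S3J">It encodes values as areas, which makes the important nodes at each level of the tree stand out.</p><p id="08R65S3L">import matplotlib.pyplot as plt<br/>import 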
squarify<br/>import pandas as pd</p><p id="08R65S3M"># Create the data<br/>df = pd.DataFrame({'nb_people': [8, 3, 4, 2], 'group': ["group A", "group B", "group C", "group D"]})</p><p id="08R65S3N"># Draw and show the plot<br/>squarify.plot(sizes=df['nb_people'], label=df['group'], alpha=.8 )<br/>plt.axis('off')<br/>plt.show()<br/></p><p id="08R65S3P">Drawn with the squarify library; the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2Fbd386157p00qzepyk0007d200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S3U"><strong><strong>19.</strong> Venn diagram</strong><br/></p><p id="08R65S40">A Venn diagram shows all the possible relationships between different groups.</p><p id="08R65S42">import matplotlib.pyplot as plt<br/>from matplotlib_venn import venn2</p><p id="08R65S43"># Create the chart<br/>venn2(subsets=(10, 5, 2), set_labels=('Group A', 'Group B'))</p><p id="08R65S44"># Show the plot<br/>plt.show()<br/></p><p id="08R65S46">Drawn with the matplotlib_venn library; the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F3e1f8bfdp00qzepyl000ed200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S4B"><strong><strong>20.</strong> Donut chart</strong><br/></p><p id="08R65S4D">A donut chart is essentially a pie chart with the center cut out.</p><p id="08R65S4F">import matplotlib.pyplot as plt</p><p id="08R65S4G"># Create the data<br/>size_of_groups = [12, 11, 3, 30]</p><p id="08R65S4H"># Draw the pie chart<br/>plt.pie(size_of_groups)</p><p id="08R65S4I"># Add a white circle at the center to turn the pie into a donut<br/>my_circle = plt.Circle((0, 0), 0.7, color='white')<br/>p = plt.gcf()<br/>p.gca().add_artist(my_circle)</p><p id="08R65S4J">plt.show()<br/></p><p id="08R65S4L">Drawn with Matplotlib; the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F3909ae69p00qzepyl000cd200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S4O"><strong><strong><br/></strong></strong></p><p id="08R65S4P"><strong><strong><br/></strong></strong></p><p 
id="08R65S4Q"><strong><strong>21.</strong> Pie chart</strong></p><p id="08R65S4S">The pie chart is one of the most common chart types.</p><p id="08R65S4U">A circle is divided into sectors, and each sector represents a share of the whole.</p><p id="08R65S50">import matplotlib.pyplot as plt</p><p id="08R65S51"># Create the data<br/>size_of_groups = [12, 11, 3, 30]</p><p id="08R65S52"># Draw the pie chart<br/>plt.pie(size_of_groups)<br/>plt.show()<br/></p><p id="08R65S54">Drawn with Matplotlib; the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F6db197d6p00qzepym0009d200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S59"><strong><strong>22.</strong> Dendrogram</strong></p><p id="08R65S5B">A dendrogram visualizes a tree-shaped data structure, a special kind of hierarchy with a single root node that splits into left and right subtrees.</p><p id="08R65S5D">import pandas as pd<br/>from matplotlib import pyplot as plt<br/>from scipy.cluster.hierarchy import dendrogram, linkage</p><p id="08R65S5E"># Read the data<br/>df = pd.read_csv('mtcars.csv')<br/>df = df.set_index('model')</p><p id="08R65S5F"># Cluster the samples hierarchically (Ward linkage) from the distances between them<br/>Z = linkage(df, 'ward')</p><p id="08R65S5G"># Draw the dendrogram<br/>dendrogram(Z, leaf_rotation=90, leaf_font_size=8, labels=df.index)</p><p id="08R65S5H"># Show the plot<br/>plt.show()<br/></p><p id="08R65S5J">Drawn with SciPy; the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F96092d69p00qzepym0017d200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S5O"><strong><strong>23.</strong> Circular packing</strong></p><p id="08R65S5Q">A circular packing chart is a bubble chart that shows a hierarchy together with the size of each value.<br/></p><p id="08R65S5S">import circlify<br/>import matplotlib.pyplot as plt</p><p id="08R65S5T"># Create the figure with a single subplot<br/>fig, ax = plt.subplots(figsize=(14, 14))</p><p id="08R65S5U"># Title<br/>ax.set_title('Repartition of the world population')</p><p id="08R65S5V"># Remove the axes<br/>ax.axis('off')</p><p id="08R65S60"># Population data<br/>data = [{'id': 'World', 'datum': 6964195249, 'children': [<br/>{'id': "North America", 'datum': 450448697,<br/>'children': [<br/>{'id': "United States", 'datum': 308865000},<br/>{'id': 
"Mexico", 'datum': 107550697},<br/>{'id': "Canada", 'datum': 34033000}<br/>]},<br/>{'id': "South America", 'datum': 278095425,<br/>'children': [<br/>{'id': "Brazil", 'datum': 192612000},<br/>{'id': "Colombia", 'datum': 45349000},<br/>{'id': "Argentina", 'datum': 40134425}<br/>]},<br/>{'id': "Europe", 'datum': 209246682,<br/>'children': [<br/>{'id': "Germany", 'datum': 81757600},<br/>{'id': "France", 'datum': 65447374},<br/>{'id': "United Kingdom", 'datum': 62041708}<br/>]},<br/>{'id': "Africa", 'datum': 311929000,<br/>'children': [<br/>{'id': "Nigeria", 'datum': 154729000},<br/>{'id': "Ethiopia", 'datum': 79221000},<br/>{'id': "Egypt", 'datum': 77979000}<br/>]},<br/>{'id': "Asia", 'datum': 2745929500,<br/>'children': [<br/>{'id': "China", 'datum': 1336335000},<br/>{'id': "India", 'datum': 1178225000},<br/>{'id': "Indonesia", 'datum': 231369500}<br/>]}<br/>]}]</p><p id="08R65S61"># 使用circlify()计算, 获取圆的大小, 位置<br/>circles = circlify.circlify(<br/>data,<br/>show_enclosure=False,<br/>target_enclosure=circlify.Circle(x=0, y=0, r=1)<br/>)</p><p id="08R65S62">lim = max(<br/>max(<br/>abs(circle.x) + circle.r,<br/>abs(circle.y) + circle.r,<br/>)<br/>for circle in circles<br/>)<br/>plt.xlim(-lim, lim)<br/>plt.ylim(-lim, lim)</p><p id="08R65S63">for circle in circles:<br/>if circle.level != 2:<br/>continue<br/>x, y, r = circle<br/>ax.add_patch(plt.Circle((x, y), r, alpha=0.5, linewidth=2, color="lightblue"))</p><p id="08R65S64">for circle in circles:<br/>if circle.level != 3:<br/>continue<br/>x, y, r = circle<br/>label = circle.ex["id"]<br/>ax.add_patch(plt.Circle((x, y), r, alpha=0.5, linewidth=2, color="#69b3a2"))<br/>plt.annotate(label, (x, y), ha='center', color="white")</p><p id="08R65S65">for circle in circles:<br/>if circle.level != 2:<br/>continue<br/>x, y, r = circle<br/>label = circle.ex["id"]<br/>plt.annotate(label, (x, y), va='center', ha='center', bbox=dict(facecolor='white', edgecolor='black', boxstyle='round', pad=.5))</p><p id="08R65S66">plt.show()<br/></p><p 
id="08R65S68">Drawn with circlify; the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F6144c3afp00qzepyn001xd200u000ntg00zk00s7.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S6B"><strong><strong><br/></strong></strong></p><p id="08R65S6C"><strong><strong><br/></strong></strong></p><p id="08R65S6D"><strong><strong>24.</strong> Line chart</strong></p><p id="08R65S6F">The line chart is one of the most common chart types.</p><p id="08R65S6H">It connects the individual data points with line segments and is used to show how the data changes over time.</p><p id="08R65S6J">import matplotlib.pyplot as plt<br/>import numpy as np</p><p id="08R65S6K"># Create the data<br/>values = np.cumsum(np.random.randn(1000, 1))</p><p id="08R65S6L"># Draw the chart<br/>plt.plot(values)<br/>plt.show()<br/></p><p id="08R65S6N">Drawn with Matplotlib; the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F975c76e7p00qzepyo001id200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S6S"><strong><strong>25.</strong> Area chart</strong><br/></p><p id="08R65S6U">An area chart is very similar to a line chart; the difference is that the region between the line and the x axis is filled with color.</p><p id="08R65S70">import matplotlib.pyplot as plt</p><p id="08R65S71"># Create the data<br/>x = range(1, 6)<br/>y = [1, 4, 6, 8, 4]</p><p id="08R65S72"># Draw the chart<br/>plt.fill_between(x, y)<br/>plt.show()<br/></p><p id="08R65S74">Drawn with Matplotlib's fill_between(); the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F2946e92bp00qzepyo000dd200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S79"><strong><strong>26.</strong> Stacked area chart</strong><br/></p><p id="08R65S7B">A stacked area chart shows the evolution of several numeric variables.</p><p id="08R65S7D">The series are drawn on top of one another, which makes the total easy to read but the value of each individual series harder to read accurately.<br/></p><p id="08R65S7F">import matplotlib.pyplot as plt</p><p id="08R65S7G"># Create the data<br/>x = range(1, 6)<br/>y1 = [1, 4, 6, 8, 9]<br/>y2 = [2, 2, 7, 10, 12]<br/>y3 = [2, 8, 5, 10, 6]</p><p id="08R65S7H"># Draw the chart<br/>plt.stackplot(x, y1, y2, y3, labels=['A', 'B', 
'C'])<br/>plt.legend(loc='upper left')<br/>plt.show()<br/></p><p id="08R65S7J">Drawn with Matplotlib's stackplot(); the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F6602406ap00qzepyp000hd200u000mig00zk00qo.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S7O"><strong><strong>27.</strong> Streamgraph</strong><br/></p><p id="08R65S7Q">A streamgraph is a special kind of stacked area chart, mainly used to show how events or topics change over a period of time.</p><p id="08R65S7S">The areas are laid out around a central axis and their edges are rounded, which gives the chart its flowing shape.</p><p id="08R65S7U">import matplotlib.pyplot as plt<br/>import numpy as np<br/>from scipy import stats</p><p id="08R65S7V"># Create the data<br/>x = np.arange(1990, 2020)<br/>y = [np.random.randint(0, 5, size=30) for _ in range(5)]</p><p id="08R65S80">def gaussian_smooth(x, y, grid, sd):<br/>&nbsp;&nbsp;&nbsp;&nbsp;"""Smooth y with Gaussian kernels evaluated on grid."""<br/>&nbsp;&nbsp;&nbsp;&nbsp;weights = np.transpose([stats.norm.pdf(grid, m, sd) for m in x])<br/>&nbsp;&nbsp;&nbsp;&nbsp;weights = weights / weights.sum(0)<br/>&nbsp;&nbsp;&nbsp;&nbsp;return (weights * y).sum(1)</p><p id="08R65S81"># Custom colors<br/>COLORS = ["#D0D1E6", "#A6BDDB", "#74A9CF", "#2B8CBE", "#045A8D"]</p><p id="08R65S82"># Create the figure<br/>fig, ax = plt.subplots(figsize=(10, 7))</p><p id="08R65S83"># Draw the chart<br/>grid = np.linspace(1985, 2025, num=500)<br/>y_smoothed = [gaussian_smooth(x, y_, grid, 1) for y_ in y]<br/>ax.stackplot(grid, y_smoothed, colors=COLORS, baseline="sym")</p><p id="08R65S84"># Show the plot<br/>plt.show()<br/></p><p id="08R65S86">The chart is a Matplotlib stacked area chart; setting the baseline parameter of stackplot() (to "sym" here) lays the data out symmetrically around the x axis.</p><p id="08R65S88">The curves are smoothed with a Gaussian kernel built from scipy.stats.norm; the final result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F06042ba5p00qzepyp000td200u000l0g00zk00ow.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S8D"><strong><strong>28.</strong> Time series chart</strong><br/></p><p id="08R65S8F">A time series chart is any chart that shows how values evolve over time.</p><p id="08R65S8H">Examples include line charts, bar charts, area charts and so on.</p><p id="08R65S8J">import numpy as np<br/>import seaborn as sns<br/>import pandas as pd<br/>import matplotlib.pyplot as plt</p><p id="08R65S8K"># Create the data<br/>my_count = 
["France", "Australia", "Japan", "USA", "Germany", "Congo", "China", "England", "Spain", "Greece", "Morocco",<br/>"South Africa", "Indonesia", "Peru", "Chile", "Brazil"]<br/>df = pd.DataFrame({<br/>"country": np.repeat(my_count, 10),<br/>"years": list(range(2000, 2010)) * 16,<br/>"value": np.random.rand(160)<br/>})</p><p id="08R65S8L"># Create the grid of subplots, one per country<br/>g = sns.FacetGrid(df, col='country', hue='country', col_wrap=4, )</p><p id="08R65S8M"># Add the line plot<br/>g = g.map(plt.plot, 'years', 'value')</p><p id="08R65S8N"># Fill the area under each line<br/>g = g.map(plt.fill_between, 'years', 'value', alpha=0.2).set_titles("{col_name} country")</p><p id="08R65S8O"># Subplot titles<br/>g = g.set_titles("{col_name}")</p><p id="08R65S8P"># Overall title<br/>plt.subplots_adjust(top=0.92)<br/>g = g.fig.suptitle('Evolution of the value of stuff in 16 countries')</p><p id="08R65S8Q"># Show the plot<br/>plt.show()<br/></p><p id="08R65S8S">The example here is a time series area chart that shows several groups of data; the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2Fb6c5271ep00qzepyq0027d200u000prg00zk00ui.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S91"><strong><strong>29.</strong> Map</strong><br/></p><p id="08R65S93">It is hard to imagine any geospatial data analysis without a map!</p><p id="08R65S95">import pandas as pd<br/>import folium</p><p id="08R65S96"># Create the map object<br/>m = folium.Map(location=[20, 0], tiles="OpenStreetMap", zoom_start=2)</p><p id="08R65S97"># Create the marker data<br/>data = pd.DataFrame({<br/>'lon': [-58, 2, 145, 30.32, -4.03, -73.57, 36.82, -38.5],<br/>'lat': [-34, 49, -38, 59.93, 5.33, 45.52, -1.29, -12.97],<br/>'name': ['Buenos Aires', 'Paris', 'Melbourne', 'St Petersburg', 'Abidjan', 'Montreal', 'Nairobi', 'Salvador'],<br/>'value': [10, 12, 40, 70, 23, 43, 100, 43]<br/>}, dtype=str)</p><p id="08R65S98"># Add one marker per city<br/>for i in range(0,len(data)):<br/>&nbsp;&nbsp;&nbsp;&nbsp;folium.Marker(<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;location=[data.iloc[i]['lat'], data.iloc[i]['lon']],<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;popup=data.iloc[i]['name'],<br/>&nbsp;&nbsp;&nbsp;&nbsp;).add_to(m)</p><p id="08R65S99"># Save the map<br/>m.save('map.html')<br/></p><p 
id="08R65S9B">Folium draws an interactive map in the style of Google Maps (the tiles here come from OpenStreetMap); the result is shown below.<br/></p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F3c570706p00qzepyr002jd200u000fjg00zk00ie.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S9G"><strong><strong>30.</strong> Choropleth map</strong><br/></p><p id="08R65S9I">In a choropleth map, regions whose values fall in the same range get the same color.<br/></p><p id="08R65S9K">import pandas as pd<br/>import folium</p><p id="08R65S9L"># Create the map object<br/>m = folium.Map(location=[40, -95], zoom_start=4)</p><p id="08R65S9M"># Read the data<br/>state_geo = "us-states.json"<br/>state_unemployment = "US_Unemployment_Oct2012.csv"<br/>state_data = pd.read_csv(state_unemployment)</p><p id="08R65S9N">folium.Choropleth(<br/>geo_data=state_geo,<br/>name="choropleth",<br/>data=state_data,<br/>columns=["State", "Unemployment"],<br/>key_on="feature.id",<br/>fill_color="YlGn",<br/>fill_opacity=0.7,<br/>line_opacity=.1,<br/>legend_name="Unemployment Rate (%)",<br/>).add_to(m)</p><p id="08R65S9O">folium.LayerControl().add_to(m)<br/># Save the map<br/>m.save('choropleth-map.html')<br/></p><p id="08R65S9Q">Drawn with Folium's Choropleth class; the result is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2Fa901bca4p00qzepyr0013d200ia00ayg00ia00ay.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65S9V"><strong><strong>31.</strong> Hexbin map</strong><br/></p><p id="08R65SA1">Hexbin maps are a common sight in coverage of US elections.</p><p id="08R65SA3">import pandas as pd<br/>import geopandas as gpd<br/>import matplotlib.pyplot as plt</p><p id="08R65SA4"># Read the data<br/>file = "us_states_hexgrid.geojson.json"<br/>geoData = gpd.read_file(file)<br/>geoData['centroid'] = geoData['geometry'].apply(lambda x: x.centroid)</p><p id="08R65SA5">mariageData = pd.read_csv("State_mariage_rate.csv")<br/># regex=False so the parentheses are matched literally<br/>geoData['state'] = geoData['google_name'].str.replace(' (United States)','', regex=False)</p><p id="08R65SA6">geoData = geoData.set_index('state').join(mariageData.set_index('state'))</p><p id="08R65SA7"># Initialize the figure<br/>fig, 
ax = plt.subplots(1, figsize=(6, 4))</p><p id="08R65SA8"># Plot<br/>geoData.plot(<br/>    ax=ax,<br/>    column="y_2015",<br/>    cmap="BuPu",<br/>    norm=plt.Normalize(vmin=2, vmax=13),<br/>    edgecolor='black',<br/>    linewidth=.5<br/>)</p><p id="08R65SA9"># Hide the axes<br/>ax.axis('off')</p><p id="08R65SAA"># Title, subtitle, author<br/>ax.annotate('Marriage rate in the US', xy=(10, 340), xycoords='axes pixels', horizontalalignment='left', verticalalignment='top', fontsize=14, color='black')<br/>ax.annotate('Yes, people love to get married in Vegas', xy=(10, 320), xycoords='axes pixels', horizontalalignment='left', verticalalignment='top', fontsize=11, color='#808080')<br/>ax.annotate('xiao F', xy=(400, 0), xycoords='axes pixels', horizontalalignment='left', verticalalignment='top', fontsize=8, color='#808080')</p><p id="08R65SAB"># Label each hexagon<br/>for idx, row in geoData.iterrows():<br/>    ax.annotate(<br/>        row['iso3166_2'],  # label text, passed positionally because newer matplotlib versions no longer accept the s= keyword<br/>        xy=row['centroid'].coords[0],<br/>        horizontalalignment='center',<br/>        va='center',<br/>        color="white"<br/>    )</p><p id="08R65SAC"># Add the color bar<br/>sm = plt.cm.ScalarMappable(cmap='BuPu', norm=plt.Normalize(vmin=2, vmax=13))<br/># ax=ax tells matplotlib which axes the color bar belongs to<br/>fig.colorbar(sm, ax=ax, orientation="horizontal", aspect=50, fraction=0.005, pad=0)</p><p id="08R65SAD"># Show<br/>plt.show()<br/></p><p id="08R65SAF">Drawn with geopandas and matplotlib; the result is below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2Fd6fbbb9ep00qzepys000gd200go00b4g00go00b4.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65SAK"><strong><strong>32. 
Cartogram</strong></strong><br/></p><p id="08R65SAM">As the name suggests, a cartogram is a map whose shapes have been altered.</p><p id="08R65SAO">The shape of each region is distorted according to a numeric value.</p><p id="08R65SAQ">There is no code example for this one, so here is just a picture.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F13f19223p00qzepyt001qd200u000f7g00yt00hm.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65SAV"><strong>33. Connection map</strong></p><p id="08R65SB1">A connection map shows the links between several locations on a map.</p><p id="08R65SB3">The flight-route maps common in aviation are probably an upgraded version of it.</p><p id="08R65SB5">from mpl_toolkits.basemap import Basemap<br/>import matplotlib.pyplot as plt<br/>import pandas as pd</p><p id="08R65SB6"># Data<br/>cities = {<br/>    'city': ["Paris", "Melbourne", "Saint.Petersburg", "Abidjan", "Montreal", "Nairobi", "Salvador"],<br/>    'lon': [2, 145, 30.32, -4.03, -73.57, 36.82, -38.5],<br/>    'lat': [49, -38, 59.93, 5.33, 45.52, -1.29, -12.97]<br/>}<br/>df = pd.DataFrame(cities, columns=['city', 'lon', 'lat'])</p><p id="08R65SB7"># Create the map<br/>m = Basemap(llcrnrlon=-179, llcrnrlat=-60, urcrnrlon=179, urcrnrlat=70, projection='merc')<br/>m.drawmapboundary(fill_color='white', linewidth=0)<br/>m.fillcontinents(color='#f2f2f2', alpha=0.7)<br/>m.drawcoastlines(linewidth=0.1, color="white")</p><p id="08R65SB8"># Connect every pair of cities<br/>for startIndex, startRow in df.iterrows():<br/>    # start at startIndex + 1 so a city is never paired with itself<br/>    for endIndex in range(startIndex + 1, len(df.index)):<br/>        endRow = df.iloc[endIndex]<br/>        m.drawgreatcircle(startRow.lon, startRow.lat, endRow.lon, endRow.lat, linewidth=1, color='#69b3a2')</p><p id="08R65SB9"># Add the city names<br/>for i, row in df.iterrows():<br/>    plt.annotate(row.city, xy=m(row.lon + 3, row.lat), verticalalignment='center')</p><p id="08R65SBA">plt.show()<br/></p><p id="08R65SBC">Drawn with basemap; the result is below.<br/></p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F2508bcc6p00qzepyt000xd200nq00bug00nq00bu.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65SBH"><strong>34. 
Bubble map</strong></p><p id="08R65SBJ">A bubble map uses circles of different sizes to represent the value at each geographic coordinate.<br/></p><p id="08R65SBL">import folium<br/>import pandas as pd</p><p id="08R65SBM"># Create the map object<br/>m = folium.Map(location=[20,0], tiles="OpenStreetMap", zoom_start=2)</p><p id="08R65SBN"># Coordinate data (dtype=str stores every column as text; values are converted back with float() below)<br/>data = pd.DataFrame({<br/>    'lon': [-58, 2, 145, 30.32, -4.03, -73.57, 36.82, -38.5],<br/>    'lat': [-34, 49, -38, 59.93, 5.33, 45.52, -1.29, -12.97],<br/>    'name': ['Buenos Aires', 'Paris', 'Melbourne', 'St Petersbourg', 'Abidjan', 'Montreal', 'Nairobi', 'Salvador'],<br/>    'value': [10, 12, 40, 70, 23, 43, 100, 43]<br/>}, dtype=str)</p><p id="08R65SBO"># Add the bubbles<br/>for i in range(0, len(data)):<br/>    folium.Circle(<br/>        location=[data.iloc[i]['lat'], data.iloc[i]['lon']],<br/>        popup=data.iloc[i]['name'],<br/>        radius=float(data.iloc[i]['value'])*20000,<br/>        color='crimson',<br/>        fill=True,<br/>        fill_color='crimson'<br/>    ).add_to(m)</p><p id="08R65SBP"># Save<br/>m.save('bubble-map.html')<br/></p><p id="08R65SBR">Drawn with Folium's Circle(); the result is below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F82a2b969p00qzepyu002kd200u000g1g00zk00iz.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65SC0"><strong><strong>35.</strong> Chord diagram</strong><br/></p><p id="08R65SC2">A chord diagram shows the flows or connections among several entities (nodes).</p><p id="08R65SC4">Each entity (node) is represented by a segment on the outer edge of the circular layout.</p><p id="08R65SC6">Arcs are then drawn between the entities, with the width of each arc proportional to the size of the flow.</p><p id="08R65SC8">from chord import Chord</p><p id="08R65SC9">matrix = [<br/>    [0, 5, 6, 4, 7, 4],<br/>    [5, 0, 5, 4, 6, 5],<br/>    [6, 5, 0, 4, 5, 5],<br/>    [4, 4, 4, 0, 5, 5],<br/>    [7, 6, 5, 5, 0, 4],<br/>    [4, 5, 5, 5, 4, 0],<br/>]</p><p id="08R65SCA">names = ["Action", "Adventure", "Comedy", "Drama", "Fantasy", "Thriller"]</p><p id="08R65SCB"># Save<br/>Chord(matrix, names).to_html("chord-diagram.html")<br/></p><p id="08R65SCD">Drawn with the Chord library; the result is below.</p><p class="f_center"><img 
src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2Fbc39a2e2p00qzepyu0011d200ik00f3g00ik00f3.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65SCI"><strong><strong>36. Network graph</strong></strong><br/></p><p id="08R65SCK">A network graph shows the connections among a set of entities.</p><p id="08R65SCM">Each entity is represented by a node, and the nodes are joined by line segments.</p><p id="08R65SCO">import pandas as pd<br/>import numpy as np<br/>import networkx as nx<br/>import matplotlib.pyplot as plt</p><p id="08R65SCP"># Create the data (a list plus a NumPy array adds element by element)<br/>ind1 = [5, 10, 3, 4, 8, 10, 12, 1, 9, 4]<br/>ind5 = [1, 1, 13, 4, 18, 5, 2, 11, 3, 8]<br/>df = pd.DataFrame(<br/>    {'A': ind1, 'B': ind1 + np.random.randint(10, size=(10)), 'C': ind1 + np.random.randint(10, size=(10)),<br/>     'D': ind1 + np.random.randint(5, size=(10)), 'E': ind1 + np.random.randint(5, size=(10)), 'F': ind5,<br/>     'G': ind5 + np.random.randint(5, size=(10)), 'H': ind5 + np.random.randint(5, size=(10)),<br/>     'I': ind5 + np.random.randint(5, size=(10)), 'J': ind5 + np.random.randint(5, size=(10))})</p><p id="08R65SCQ"># Compute the correlations<br/>corr = df.corr()</p><p id="08R65SCR"># Reshape the matrix into a long list of (var1, var2, value) rows<br/>links = corr.stack().reset_index()<br/>links.columns = ['var1', 'var2', 'value']</p><p id="08R65SCS"># Keep only correlations above a threshold and drop self-correlations<br/>links_filtered = links.loc[(links['value'] > 0.8) & (links['var1'] != links['var2'])]</p><p id="08R65SCT"># Build the graph<br/>G = nx.from_pandas_edgelist(links_filtered, 'var1', 'var2')</p><p id="08R65SCU"># Draw the network<br/>nx.draw(G, with_labels=True, node_color='orange', node_size=400, edge_color='black', linewidths=1, font_size=15)</p><p id="08R65SCV"># Show<br/>plt.show()<br/></p><p id="08R65SD1">Drawn with the NetworkX library; the result is below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F4134b11ep00qzepyv0008d200hs00dcg00hs00dc.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65SD6"><strong>37. 
Sankey diagram</strong><br/></p><p id="08R65SD8">A Sankey diagram is a special kind of flow diagram.</p><p id="08R65SDA">It is mainly used to show how raw materials, energy, and the like move from their initial form, through intermediate processing and conversion, to their final form.<br/></p><p id="08R65SDC">Plotly is probably the best tool for creating Sankey diagrams: Sankey() produces a chart in a few lines of code.</p><p id="08R65SDE">import plotly.graph_objects as go<br/>import json</p><p id="08R65SDF"># Load the data<br/>with open('sankey_energy.json') as f:<br/>    data = json.load(f)</p><p id="08R65SDG"># Opacity<br/>opacity = 0.4<br/># Colors: links reuse the color of their source node at a lower opacity<br/>data['data'][0]['node']['color'] = ['rgba(255,0,255, 0.8)' if color == "magenta" else color for color in data['data'][0]['node']['color']]<br/>data['data'][0]['link']['color'] = [data['data'][0]['node']['color'][src].replace("0.8", str(opacity))<br/>                                    for src in data['data'][0]['link']['source']]</p><p id="08R65SDH">fig = go.Figure(data=[go.Sankey(<br/>    valueformat=".0f",<br/>    valuesuffix="TWh",<br/>    # Nodes<br/>    node=dict(<br/>        pad=15,<br/>        thickness=15,<br/>        line=dict(color = "black", width = 0.5),<br/>        label=data['data'][0]['node']['label'],<br/>        color=data['data'][0]['node']['color']<br/>    ),<br/>    # Links<br/>    link=dict(<br/>        source=data['data'][0]['link']['source'],<br/>        target=data['data'][0]['link']['target'],<br/>        value=data['data'][0]['link']['value'],<br/>        label=data['data'][0]['link']['label'],<br/>        color=data['data'][0]['link']['color']<br/>    ))])</p><p id="08R65SDI">fig.update_layout(title_text="Energy forecast for 2050&lt;br&gt;Source: Department of Energy &amp; Climate Change, Tom Counsell via Mike Bostock",<br/>                  font_size=10)</p><p id="08R65SDJ"># Save<br/>fig.write_html("sankey-diagram.html")<br/></p><p id="08R65SDL">Drawn with the Plotly library; the result is below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F6bec53e3p00qzepyv0022d200u000***00zk00ig.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65SDQ"><strong><strong>38.</strong> Arc diagram</strong><br/></p><p id="08R65SDS">An arc diagram is a special kind of network graph.</p><p id="08R65SDU">It consists of nodes that represent entities and arcs that show the relationships between them.</p><p id="08R65SE0">In an arc diagram the nodes are laid out along a single axis and connected by circular arcs.</p><p id="08R65SE2">I don't yet know how to build an arc diagram in Python, but R or D3.js can do it.</p><p 
id="08R65SE4">Here is an arc diagram generated with JavaScript.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2Fce262504p00qzepyx0052d200u000ggg00z600j9.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65SE9"><strong><strong>39.</strong> Circular layout relationship diagram</strong><br/></p><p id="08R65SEB">It visualizes the relationships between items and reduces the visual clutter of a complex network.</p><p id="08R65SED">As with the arc diagram, I only know how to draw this one with R or D3.js.</p><p id="08R65SEF">An example drawn with D3.js is shown below.</p><p class="f_center"><img src="https://nimg.ws.126.net/?url=http%3A%2F%2Fdingyue.ws.126.net%2F2021%2F0914%2F294dab7cp00qzepyx0079d200rw00qkg00rw00qk.png&thumbnail=660x2147483647&quality=80&type=jpg"/><br/></p><p id="08R65SEK"><strong><strong>40. Animated chart</strong></strong><br/></p><p id="08R65SEM">An animated chart is essentially a series of static charts shown one after another.</p><p id="08R65SEO">It can show how something changes from one state to another.</p><p id="08R65SEQ">import os<br/>import imageio<br/>import pandas as pd<br/>import matplotlib.pyplot as plt</p><p id="08R65SER"># Load the data<br/>data = pd.read_csv('gapminderData.csv')<br/># Convert the continent column to a categorical type<br/>data['continent'] = pd.Categorical(data['continent'])</p><p id="08R65SES"># Resolution<br/>dpi = 96</p><p id="08R65SET"># Make sure the output folder for the frames exists<br/>os.makedirs('./images', exist_ok=True)<br/>filenames = []<br/># One frame per year<br/>for i in data.year.unique():<br/>    # Turn off interactive plotting<br/>    plt.ioff()</p><p id="08R65SEU">    # Initialize the figure<br/>    fig = plt.figure(figsize=(680 / dpi, 480 / dpi), dpi=dpi)</p><p id="08R65SEV">    # Filter the data for this year<br/>    subsetData = data[data.year == i]</p><p id="08R65SF0">    # Draw the bubble scatter plot<br/>    plt.scatter(<br/>        x=subsetData['lifeExp'],<br/>        y=subsetData['gdpPercap'],<br/>        s=subsetData['pop'] / 200000,<br/>        c=subsetData['continent'].cat.codes,<br/>        cmap="Accent", alpha=0.6, edgecolors="white", linewidth=2)</p><p id="08R65SF1">    # Add labels and axis settings<br/>    plt.yscale('log')<br/>    plt.xlabel("Life Expectancy")<br/>    plt.ylabel("GDP per Capita")<br/>    plt.title("Year: " + str(i))<br/>    # a log axis needs a positive lower limit; 100 is an assumed bound, adjust it to your data<br/>    plt.ylim(100, 100000)<br/>    plt.xlim(30, 90)</p><p id="08R65SF2">    # Save the frame to a local folder<br/>    filename = './images/' + str(i) + '.png'<br/>    filenames.append(filename)<br/>    plt.savefig(fname=filename, 
dpi=96)<br/>    plt.close(fig)</p><p id="08R65SF3"># Build the animated GIF from the saved frames<br/>with imageio.get_writer('result.gif', mode='I', fps=5) as writer:<br/>    for filename in filenames:<br/>        image = imageio.imread(filename)<br/>        writer.append_data(image)<br/></p><p id="08R65SF5">This example builds an animated bubble scatter plot.<br/></p><p id="08R65SF7">The frames are drawn with matplotlib first and then combined into a GIF with imageio; the result is below.<br/></p><p class="f_center"><img src="http://dingyue.ws.126.net/2021/0914/ba523815g00qzepyy0070d200iw00dcg00iw00dc.gif"/><br/></p><p id="08R65SFB">That's all for this installment.</p><p id="08R65SFD">Most of the visualization libraries used here can be installed with pip install.</p><p id="08R65SFF">The code and data files have been uploaded; reply 「<strong>可视化图表</strong>」 to get them.</p><p id="08R65SFL">Everyone who taps 「在看」 gets better-looking!</p>
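<p>A postscript on section 38: the arc diagram there is shown only as a JavaScript rendering. One possible Python route, offered as a minimal sketch and not taken from the gallery, is to draw each arc by hand as a semicircle with matplotlib. The sample nodes and edges and the helper names arc_points and draw_arc_diagram are hypothetical and exist only for this illustration.</p>

```python
import math

# Hypothetical sample data (not from the article): nodes and the links between them.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "C"), ("A", "E"), ("B", "D"), ("C", "D")]


def arc_points(x_start, x_end, n_points=50):
    """Return (x, y) points on the upper semicircle joining two x-axis positions.

    n_points must be at least 2.
    """
    center = (x_start + x_end) / 2
    radius = abs(x_end - x_start) / 2
    points = []
    for k in range(n_points):
        theta = math.pi * k / (n_points - 1)  # sweep the angle from 0 to pi
        points.append((center + radius * math.cos(theta), radius * math.sin(theta)))
    return points


def draw_arc_diagram(nodes, edges, filename="arc-diagram.png"):
    """Place the nodes along one axis and join each edge with a semicircular arc."""
    # Imported here so arc_points() stays usable without matplotlib installed.
    import matplotlib.pyplot as plt

    # Nodes sit at x = 0, 1, 2, ... in the order given.
    position = {name: index for index, name in enumerate(nodes)}

    fig, ax = plt.subplots(figsize=(6, 3))
    for source, target in edges:
        xs, ys = zip(*arc_points(position[source], position[target]))
        ax.plot(xs, ys, color="#69b3a2", linewidth=1)

    # Draw the nodes on top of the arcs and label them just below the axis.
    ax.scatter(list(position.values()), [0] * len(nodes), color="orange", zorder=3)
    for name, x in position.items():
        ax.annotate(name, xy=(x, 0), xytext=(0, -12),
                    textcoords="offset points", ha="center", va="top")

    ax.set_aspect("equal")  # keep the arcs circular instead of stretched
    ax.axis("off")
    fig.savefig(filename, dpi=96)
    plt.close(fig)
```

<p>Calling draw_arc_diagram(nodes, edges) writes arc-diagram.png to the working directory. Because every arc is a semicircle whose diameter is the distance between its two nodes, longer-range links rise higher above the axis, which is what makes them easy to tell apart.</p>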