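Plotly Basics

Basic Plotly Charts

Based on the official documentation: https://plotly.com/python/

Saving images: install the plotly-orca package, then call fig.write_image("images/fig1.png"). (Recent Plotly releases use the kaleido package for static export instead; orca is deprecated.)

Drawing with Plotly follows one of three broad workflows, which I try to summarize here: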

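  1. Plotly Express. As the name suggests, this module lets us draw basic charts quickly and apply some customization.
    • Steps:
    • import plotly.express as px
    • Pick the API that matches the chart you want, e.g. px.scatter() or px.bar()
  2. plotly.graph_objects. This is the more general interface: it offers more chart types and more customization.
    • Drawing with graph_objects (go for short) follows a few patterns. The first is quick and convenient but gets messy as the code grows; the second suits complex figures and supports adding annotations and adding or updating traces, at the cost of more code.
    • Pattern 1: fig = go.Figure(data= …)
      • Everything happens inside go.Figure(), including adding traces, e.g. go.Figure(data=go.Scatter())
    • Pattern 2: fig = go.Figure()
      • Start from an empty canvas, then add and modify step by step
      • fig.add_trace() … adds a trace, e.g. fig.add_trace(go.Scatter(…))
      • fig.update_traces() … modifies traces, e.g. their colors and styles
      • fig.update_layout() … sets the title, annotations, and so on
    • Pattern 3: fig = go.Figure(fig)
      • First build all the elements of fig with methods such as fig.add_traces
      • Then hand the figure to go for rendering with fig = go.Figure(fig)
      • Finally display it with fig.show()
  3. A mix of the two approaches above
    1. Draw a simple chart with px, then update it and polish the layout with the go methods, often to add some text
    2. Or draw a first version with fig = go.Figure(data= …), then refine it with update_traces or update_layout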

In my view, for systematic plotting it is best to stick to the second go pattern. It keeps the code easier to manage and more tolerant of mistakes.

Some reference notes follow.

Parameters of Layout: https://plotly.com/python-api-reference/generated/plotly.graph_objects.Layout.html#plotly.graph_objects.Layout

Commonly used update_layout parameters:

title

yaxis_zeroline xaxis_zeroline

xaxis_title yaxis_title

legend

xaxis yaxis

autosize

margin

showlegend

plot_bgcolor

annotations

barmode bargap bargroupgap

uniformtext_minsize

uniformtext_mode

xaxis_tickangle yaxis_tickangle …

Trace attribute reference for each chart type: https://plotly.com/python-api-reference/plotly.graph_objects.html#simple-traces

Commonly used update_traces parameters (they differ by chart type, though some are shared):

mode

marker_line_width marker_line_color

marker_size : list

hoverinfo : Any combination of [‘label’, ‘text’, ‘value’, ‘percent’, ‘name’] joined with ‘+’ characters

marker_color : list

opacity :float [0,1]

texttemplate

textposition : [‘inside’, ‘outside’, ‘auto’, ‘none’]

textinfo : Any combination of [‘label’, ‘text’, ‘value’, ‘percent’] joined with ‘+’ characters

Colorscale options:

One of the following named colorscales:
['aggrnyl', 'agsunset', 'algae', 'amp', 'armyrose', 'balance',
'blackbody', 'bluered', 'blues', 'blugrn', 'bluyl', 'brbg',
'brwnyl', 'bugn', 'bupu', 'burg', 'burgyl', 'cividis',
'curl', 'darkmint', 'deep', 'delta', 'dense', 'earth',
'edge', 'electric',
'emrld', 'fall', 'geyser','gnbu', 'gray', 'greens',
'greys', 'haline', 'hot', 'hsv', 'ice', 'icefire',
'inferno', 'jet', 'magenta','magma', 'matter', 'mint',
'mrybm', 'mygbm', 'oranges','orrd', 'oryel', 'peach',
'phase', 'picnic', 'pinkyl', 'piyg', 'plasma', 'plotly3',
'portland', 'prgn', 'pubu', 'pubugn','puor',
'purd', 'purp', 'purples','purpor','rainbow', 'rdbu',
'rdgy', 'rdpu', 'rdylbu', 'rdylgn','redor', 'reds',
'solar', 'spectral', 'speed', 'sunset','sunsetdark','teal',
'tealgrn', 'tealrose', 'tempo', 'temps', 'thermal', 'tropic',
'turbid', 'twilight', 'viridis', 'ylgn', 'ylgnbu', 'ylorbr',
'ylorrd'].

Scatter Plots

Parameter reference for go.Scatter: https://plotly.com/python-api-reference/generated/plotly.graph_objects.Scatter.html#plotly.graph_objects.Scatter

Plotly Express

import plotly.express as px
fig = px.scatter(x=[0,1,2,3,4],y=[0,1,4,9,16])
fig.show()

df = px.data.iris() # iris is a pandas DataFrame
fig = px.scatter(df, x="sepal_width", y="sepal_length")
fig.show()

Set size and color with column names

hover_data controls what is shown when the mouse hovers over a point. Setting hover_data=['petal_width'] adds the petal_width column to that hover information.

df = px.data.iris()
fig = px.scatter(
df,
x="sepal_width",
y="sepal_length",
color="species",
size='petal_length',
hover_data=['petal_width'])
fig.show()

go.Scatter

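We can also use the more general go.Scatter class. Whereas plotly.express splits line() and scatter() into two functions, go.Scatter draws both line charts and scatter plots through a single API, selected by the mode attribute. See the go.Scatter reference page for the full list of attributes. A few examples follow. (By default go.Scatter draws lines, adding markers when the trace has fewer than 20 points.)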

Simple Scatter Plot

import plotly.graph_objects as go
import numpy as np

N = 100
t = np.linspace(0, 10, N)  # N evenly spaced points on [0, 10]
y = np.sin(t)

fig = go.Figure(data=go.Scatter(x=t, y=y, mode='markers'))

fig.show()

Line and Scatter Plots

Use mode argument to choose between markers, lines, or a combination of both. For more options about line plots, see also the line charts notebook and the filled area plots notebook.

With mode='markers' we get a scatter plot; with mode='lines' a line chart; with mode='lines+markers' a line chart with the data points marked on it.

np.random.seed(1)

N = 100
random_x = np.linspace(0, 1, N)
random_y0 = np.random.randn(N) + 5
random_y1 = np.random.randn(N)
random_y2 = np.random.randn(N) - 5

fig = go.Figure()

# Add traces
fig.add_trace(go.Scatter(x=random_x, y=random_y0,
mode='markers',
name='markers'))
fig.add_trace(go.Scatter(x=random_x, y=random_y1,
mode='lines+markers',
name='lines+markers'))
fig.add_trace(go.Scatter(x=random_x, y=random_y2,
mode='lines',
name='lines'))
fig.show()

Bubble Scatter Plots

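In a bubble chart, a third dimension of the data is shown through the size of each bubble. The bubble chart notebook covers this in detail.

Here we set the marker attribute by hand, passing a dict that gives each point its own size and color.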

import plotly.graph_objects as go

fig = go.Figure(data=go.Scatter(
x=[1, 2, 3, 4],
y=[10, 11, 12, 13],
mode='markers',
marker=dict(size=[40, 60, 80, 100],
color=[0, 1, 2, 3])
))

fig.show()

Style Scatter Plots

t = np.linspace(0, 10, 100)  # 100 points from 0 to 10

fig = go.Figure()  # start from an empty canvas

fig.add_trace(go.Scatter(  # first trace: sin
    x=t, y=np.sin(t),
    name='sin',
    mode='markers',
    marker_color='rgba(152, 0, 0, .8)'
))

fig.add_trace(go.Scatter(  # second trace: cos
    x=t, y=np.cos(t),
    name='cos',
    marker_color='rgba(255, 182, 193, .9)'
))

# Set options common to all traces with fig.update_traces
fig.update_traces(mode='markers', marker_line_width=2, marker_size=10)
fig.update_layout(title='Styled Scatter',
yaxis_zeroline=False, xaxis_zeroline=False)


fig.show()

Data Labels on Hover

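First run git clone https://github.com/plotly/datasets.git and work from the local copy; otherwise Jupyter raises an error every time it tries to load the data.

The x axis holds the state abbreviations and the y axis each state's population, drawn as a scatter plot. Setting marker_color to the population column colors each point according to its population.

text is the text shown when the mouse hovers over a point.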

import pandas as pd

data = pd.read_csv("./datasets/2014_usa_states.csv")

fig = go.Figure(data=go.Scatter(x=data['Postal'],
y=data['Population'],
mode='markers',
marker_color=data['Population'],
text=data['State'])) # hover text goes here

fig.update_layout(title='Population of USA States')
fig.show()

Scatter with a Color Dimension

Setting colorscale adds a color scale. The marker dict here carries the point size, the point colors (random values), and showscale=True to display the color bar.

fig = go.Figure(data=go.Scatter(
y = np.random.randn(500),
mode='markers',
marker=dict(
size=16,
color=np.random.randn(500), #set color equal to a variable
colorscale='Viridis', # one of plotly colorscales
showscale=True
)
))

fig.show()

Large Data Sets

Plotly can render with WebGL by using Scattergl() in place of Scatter(), which gives more speed, better interactivity, and the ability to plot even more data.

The example below uses 100,000 points.

N = 100_000
fig = go.Figure(data=go.Scattergl(
x = np.random.randn(N),
y = np.random.randn(N),
mode='markers',
marker=dict(
color=np.random.randn(N),
colorscale='Viridis',
line_width=1
)
))

fig.show()

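The example above generates 100,000 random points. Next, we generate 100,000 points confined to a bounded region.

We start by generating 100,000 points inside a circle: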

N = 100000
r = np.random.uniform(0, 1, N)
theta = np.random.uniform(0, 2*np.pi, N)

fig = go.Figure(data=go.Scattergl(
x = r * np.cos(theta), # non-uniform distribution
y = r * np.sin(theta), # zoom to see more points at the center
mode='markers',
marker=dict(
color=np.random.randn(N),
colorscale='Viridis',
line_width=1
)
))

fig.show()
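The comments in the block above note that drawing r uniformly gives a non-uniform distribution, with points crowding the center. As an aside that goes beyond the original example, a common remedy is to take the square root of a uniform variable as the radius, since the area inside radius r grows as r squared. A small NumPy sketch comparing the two:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000
theta = rng.uniform(0, 2 * np.pi, N)

# Radius drawn uniformly: half of all points land within r < 0.5,
# although that inner disk covers only a quarter of the area.
r_clustered = rng.uniform(0, 1, N)

# Radius drawn as sqrt(u): the density is uniform over the disk.
r_uniform = np.sqrt(rng.uniform(0, 1, N))

inner_clustered = np.mean(r_clustered < 0.5)
inner_uniform = np.mean(r_uniform < 0.5)
print("fraction with r < 0.5:", inner_clustered, "vs", inner_uniform)

# Coordinates ready to pass to go.Scattergl(x=x, y=y, mode='markers').
x = r_uniform * np.cos(theta)
y = r_uniform * np.sin(theta)
```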

Line Charts

Plotly Express

fig = px.line(x=t, y=np.cos(t), labels={'x':'t', 'y':'cos(t)'})
fig.show()

Here we use gapminder, one of Plotly's built-in datasets, which records statistics for each country.

df = px.data.gapminder()
df.head()
       country continent  year  lifeExp       pop   gdpPercap iso_alpha  iso_num
0  Afghanistan      Asia  1952   28.801   8425333  779.445314       AFG        4
1  Afghanistan      Asia  1957   30.332   9240934  820.853030       AFG        4
2  Afghanistan      Asia  1962   31.997  10267083  853.100710       AFG        4
3  Afghanistan      Asia  1967   34.020  11537966  836.197138       AFG        4
4  Afghanistan      Asia  1972   36.088  13079460  739.981106       AFG        4

df = px.data.gapminder().query("continent == 'Oceania'")
fig = px.line(df, x='year', y='lifeExp', color='country')
fig.show()

go.Scatter

Simple Line Plot

import plotly.graph_objects as go
import numpy as np

x = np.arange(10)

fig = go.Figure(data=go.Scatter(x=x, y=x**2))
fig.show()

Line Plot Mode: the same three options as for scatter plots, namely lines, markers, and lines+markers.

Style Line Plots

This example styles the color and dash of the traces, adds trace names, modifies line width, and adds plot and axes titles.

# Add data
month = ['January', 'February', 'March', 'April', 'May', 'June', 'July',
'August', 'September', 'October', 'November', 'December']
high_2000 = [32.5, 37.6, 49.9, 53.0, 69.1, 75.4, 76.5, 76.6, 70.7, 60.6, 45.1, 29.3]
low_2000 = [13.8, 22.3, 32.5, 37.2, 49.9, 56.1, 57.7, 58.3, 51.2, 42.8, 31.6, 15.9]
high_2007 = [36.5, 26.6, 43.6, 52.3, 71.5, 81.4, 80.5, 82.2, 76.0, 67.3, 46.1, 35.0]
low_2007 = [23.6, 14.0, 27.0, 36.8, 47.6, 57.7, 58.9, 61.2, 53.3, 48.5, 31.0, 23.6]
high_2014 = [28.8, 28.5, 37.0, 56.8, 69.7, 79.7, 78.5, 77.8, 74.1, 62.6, 45.3, 39.9]
low_2014 = [12.7, 14.3, 18.6, 35.5, 49.9, 58.0, 60.0, 58.6, 51.7, 45.2, 32.2, 29.1]

fig = go.Figure()
# Create and style traces
fig.add_trace(go.Scatter(x=month, y=high_2014, name='High 2014',
                         line=dict(color='firebrick', width=4)))
# Trace 1: High 2014, a solid firebrick line
fig.add_trace(go.Scatter(x=month, y=low_2014, name='Low 2014',
                         line=dict(color='royalblue', width=4)))
# Trace 2: Low 2014, a solid royalblue line
fig.add_trace(go.Scatter(x=month, y=high_2007, name='High 2007',
                         line=dict(color='firebrick', width=4,
                                   dash='dash')
                         # dash options include 'dash', 'dot', and 'dashdot'
                         ))
# Trace 3: High 2007, a dashed firebrick line (dash='dash')

fig.add_trace(go.Scatter(x=month, y=low_2007, name='Low 2007',
                         line=dict(color='royalblue', width=4, dash='dash')))
# Trace 4: Low 2007, a dashed royalblue line
fig.add_trace(go.Scatter(x=month, y=high_2000, name='High 2000',
                         line=dict(color='firebrick', width=4, dash='dot')))
# Trace 5: High 2000, a dotted firebrick line
fig.add_trace(go.Scatter(x=month, y=low_2000, name='Low 2000',
                         line=dict(color='royalblue', width=4, dash='dot')))
# Trace 6: Low 2000, a dotted royalblue line
# ('dashdot' would alternate long and short dashes)
# Edit the layout
fig.update_layout(title='Average High and Low Temperatures in New York',
xaxis_title='Month',
yaxis_title='Temperature (degrees F)')


fig.show()

Connect Data Gaps

When connectgaps=True, Plotly draws the line straight across missing values (None in the data) instead of leaving a gap.

The example below compares the two behaviors.

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

fig = go.Figure()

fig.add_trace(go.Scatter(
x=x,
y=[10, 20, None, 15, 10, 5, 15, None, 20, 10, 10, 15, 25, 20, 10],
name = '<b>No</b> Gaps', # Style name/legend entry with html tags
connectgaps=True # override default to connect the gaps
))
fig.add_trace(go.Scatter(
x=x,
y=[5, 15, None, 10, 5, 0, 10, None, 15, 5, 5, 10, 20, 15, 5],
name='Gaps',
))

fig.show()

Interpolation with Line Plots

The line_shape attribute sets the shape of the line between points.

The available shapes are hv, vh, hvh, vhv, spline, and linear.

x = np.array([1, 2, 3, 4, 5])
y = np.array([1, 3, 2, 3, 1])

fig = go.Figure()
fig.add_trace(go.Scatter(x=x, y=y, name="linear",
line_shape='linear'))
fig.add_trace(go.Scatter(x=x, y=y + 5, name="spline",
text=["tweak line smoothness<br>with 'smoothing' in line object"],
hoverinfo='text+name',
line_shape='spline'))
fig.add_trace(go.Scatter(x=x, y=y + 10, name="vhv",
line_shape='vhv'))
fig.add_trace(go.Scatter(x=x, y=y + 15, name="hvh",
line_shape='hvh'))
fig.add_trace(go.Scatter(x=x, y=y + 20, name="vh",
line_shape='vh'))
fig.add_trace(go.Scatter(x=x, y=y + 25, name="hv",
line_shape='hv'))

fig.update_traces(hoverinfo='text+name', mode='lines+markers')
fig.update_layout(legend=dict(y=0.5, traceorder='reversed', font_size=16))

fig.show()

Label Lines with Annotations

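We start with some preparation: define the title, labels, and colors, then the line widths and marker sizes.

Next we prepare the x and y data, using np.vstack to stack four identical rows of x values.

A loop then adds every line to the figure.

Each line is drawn in two passes: the first draws the line itself; the second draws its endpoints, that is, the first and last data points, as markers.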

title = 'Main Source for News'
labels = ['Television', 'Newspaper', 'Internet', 'Radio']
colors = ['rgb(67,67,67)', 'rgb(115,115,115)', 'rgb(49,130,189)', 'rgb(189,189,189)']

mode_size = [8, 8, 12, 8]
line_size = [2, 2, 4, 2]

x_data = np.vstack((np.arange(2001, 2013),)*4)  # 12 x values per row, matching the 12 y values

y_data = np.array([
[74, 82, 80, 74, 73, 72, 74, 70, 70, 66, 66, 69],
[45, 42, 50, 46, 36, 36, 34, 35, 32, 31, 31, 28],
[13, 14, 20, 24, 20, 24, 24, 40, 35, 41, 43, 50],
[18, 21, 18, 21, 16, 14, 13, 18, 17, 16, 19, 23],
])

fig = go.Figure()

for i in range(0, 4):
    # the line itself
    fig.add_trace(go.Scatter(x=x_data[i], y=y_data[i], mode='lines',
                             name=labels[i],
                             line=dict(color=colors[i], width=line_size[i]),
                             connectgaps=True,
                             ))

    # endpoints
    fig.add_trace(go.Scatter(
        x=[x_data[i][0], x_data[i][-1]],
        y=[y_data[i][0], y_data[i][-1]],
        mode='markers',
        marker=dict(color=colors[i], size=mode_size[i])
    ))

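Next we set the layout attributes to polish the figure.

For the x axis: we keep the axis line but remove its grid lines, keep the small tick marks that stick out from the axis, and set the axis color to a light gray. We also set the tick label font.

This chart does not need a y axis, so all of its basic display options are set to False.

We also set the page margins: l is the left margin, r the right margin, and t the top margin.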

fig.update_layout(
xaxis=dict(
showline=True,
showgrid=False,
showticklabels=True,
linecolor='rgb(204, 204, 204)',
linewidth=2,
ticks='outside',
tickfont=dict(
family='Arial',
size=12,
color='rgb(82, 82, 82)',
),
),
yaxis=dict(
showgrid=False,
zeroline=False,
showline=False,
showticklabels=False,
),
autosize=False,
margin=dict(
autoexpand=False,
l=100,
r=20,
t=110,
),
showlegend=False,
plot_bgcolor='white'
)

Finally, we set up the annotations.


annotations = []

# Adding labels
for y_trace, label, color in zip(y_data, labels, colors):
    # labeling the left_side of the plot
    annotations.append(dict(xref='paper', x=0.05, y=y_trace[0],
                            xanchor='right', yanchor='middle',
                            text=label + ' {}%'.format(y_trace[0]),
                            font=dict(family='Arial',
                                      size=16),
                            showarrow=False))
    # labeling the right_side of the plot
    annotations.append(dict(xref='paper', x=0.95, y=y_trace[11],
                            xanchor='left', yanchor='middle',
                            text='{}%'.format(y_trace[11]),
                            font=dict(family='Arial',
                                      size=16),
                            showarrow=False))
# Title
annotations.append(dict(xref='paper', yref='paper', x=0.0, y=1.05,
                        xanchor='left', yanchor='bottom',
                        text='Main Source for News',
                        font=dict(family='Arial',
                                  size=30,
                                  color='rgb(37,37,37)'),
                        showarrow=False))
# Source
annotations.append(dict(xref='paper', yref='paper', x=0.5, y=-0.1,
                        xanchor='center', yanchor='top',
                        text='Source: PewResearch Center & ' +
                             'Storytelling with data',
                        font=dict(family='Arial',
                                  size=12,
                                  color='rgb(150,150,150)'),
                        showarrow=False))

fig.update_layout(annotations=annotations)

fig.show()

Filled Lines

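In the code below, the first three traces draw the shaded bands, and the last three draw the lines themselves.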

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
x_rev = x[::-1]  # x reversed, i.e. [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

# Line 1
y1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y1_upper = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
y1_lower = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y1_lower = y1_lower[::-1]

# Line 2
y2 = [5, 2.5, 5, 7.5, 5, 2.5, 7.5, 4.5, 5.5, 5]
y2_upper = [5.5, 3, 5.5, 8, 6, 3, 8, 5, 6, 5.5]
y2_lower = [4.5, 2, 4.4, 7, 4, 2, 7, 4, 5, 4.75]
y2_lower = y2_lower[::-1]

# Line 3
y3 = [10, 8, 6, 4, 2, 0, 2, 4, 2, 0]
y3_upper = [11, 9, 7, 5, 3, 1, 3, 5, 3, 1]
y3_lower = [9, 7, 5, 3, 1, -.5, 1, 3, 1, -1]
y3_lower = y3_lower[::-1]


fig = go.Figure()

fig.add_trace(go.Scatter(
x=x+x_rev,
y=y1_upper+y1_lower,
fill='toself',
fillcolor='rgba(0,100,80,0.2)',
line_color='rgba(255,255,255,0)',
showlegend=False,
name='Fair',
))
fig.add_trace(go.Scatter(
x=x+x_rev,
y=y2_upper+y2_lower,
fill='toself',
fillcolor='rgba(0,176,246,0.2)',
line_color='rgba(255,255,255,0)',
name='Premium',
showlegend=False,
))
fig.add_trace(go.Scatter(
x=x+x_rev,
y=y3_upper+y3_lower,
fill='toself',
fillcolor='rgba(231,107,243,0.2)',
line_color='rgba(255,255,255,0)',
showlegend=False,
name='Ideal',
))
fig.add_trace(go.Scatter(
x=x, y=y1,
line_color='rgb(0,100,80)',
name='Fair',
))
fig.add_trace(go.Scatter(
x=x, y=y2,
line_color='rgb(0,176,246)',
name='Premium',
))
fig.add_trace(go.Scatter(
x=x, y=y3,
line_color='rgb(231,107,243)',
name='Ideal',
))

fig.update_traces(mode='lines')
fig.show()

Bar Charts

Parameter reference: https://plotly.com/python-api-reference/generated/plotly.graph_objects.Bar.html#plotly.graph_objects.Bar

Plotly Express

Plotly Express is the easy-to-use, high-level interface to Plotly, which operates on a variety of types of data and produces easy-to-style figures.

With px.bar, each row of the DataFrame is represented as a rectangular mark.

Plotly Express lets us draw basic charts quickly.

import plotly.express as px
data_canada = px.data.gapminder().query("country == 'Canada'")
fig = px.bar(data_canada, x='year', y='pop')
fig.show()

data_canada.head()
    country continent  year  lifeExp       pop    gdpPercap iso_alpha  iso_num
240  Canada  Americas  1952    68.75  14785584  11367.16112       CAN      124
241  Canada  Americas  1957    69.96  17010154  12489.95006       CAN      124
242  Canada  Americas  1962    71.30  18985849  13462.48555       CAN      124
243  Canada  Americas  1967    72.13  20819767  16076.58803       CAN      124
244  Canada  Americas  1972    72.88  22284500  18970.57086       CAN      124

Customize bar chart with Plotly Express

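Plotly Express also supports some simple customization. Here we add the lifeExp and gdpPercap columns to hover_data, use lifeExp to color the bars along a color axis by life expectancy, and set the figure height with height.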

data = px.data.gapminder()

data_canada = data[data.country == 'Canada']
fig = px.bar(data_canada,
x='year',
y='pop',
hover_data=['lifeExp', 'gdpPercap'],
color='lifeExp',
labels={'pop':'population of Canada'},
height=400)
fig.show()

When several rows share the same value of x (here Female or Male), the rectangles are stacked on top of one another by default.

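Below is a stacked bar chart: Dinner and Lunch are shown in different colors and stacked into a single bar. This happens because px.bar stacks bars by default (its default barmode is 'relative', which stacks positive values upward).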

df = px.data.tips()
fig = px.bar(df, x="sex", y="total_bill", color='time')
fig.show()

The barmode attribute changes how the bars are arranged. It has four options: ['stack', 'group', 'overlay', 'relative'].

Switching barmode to group makes the chart easier to read:

fig = px.bar(df, 
x="sex",
y="total_bill",
color='smoker',
barmode='group',
height=400)
fig.show()

Facetted subplots

The facet_row and facet_col arguments create subplots. They work on the same principle as row and col in seaborn.

Use the keyword arguments facet_row (resp. facet_col) to create facetted subplots, where different rows (resp. columns) correspond to different values of the dataframe column specified in facet_row.

fig = px.bar(
df,
x="sex",
y="total_bill",
color="smoker",
barmode="group",
facet_row="time",
facet_col="day",
category_orders={"day": ["Thur", "Fri", "Sat", "Sun"],"time": ["Lunch","Dinner"]})
fig.show()

go.Bar

Plotly Express is only meant for quick plotting; we now draw with plotly.graph_objects.

If Plotly Express does not provide a good starting point, it is also possible to use the more generic go.Bar class from plotly.graph_objects.

import plotly.graph_objects as go
animals=['giraffes', 'orangutans', 'monkeys']

fig = go.Figure(data=[go.Bar(x=animals, y=[20, 14, 23])])
fig.show()

Grouped Bar Chart

Use update_layout to customize the figure.

animals=['giraffes', 'orangutans', 'monkeys']

fig = go.Figure(data=[
go.Bar(name='SF Zoo', x=animals, y=[20, 14, 23]),
go.Bar(name='LA Zoo', x=animals, y=[12, 18, 29])
])
# Change the bar mode
fig.update_layout(barmode='group')
fig.show()

Stacked Bar Chart

animals=['giraffes', 'orangutans', 'monkeys']

fig = go.Figure(data=[
go.Bar(name='SF Zoo', x=animals, y=[20, 14, 23]),
go.Bar(name='LA Zoo', x=animals, y=[12, 18, 29])
])
# Change the bar mode
fig.update_layout(barmode='stack')
fig.show()

Bar Chart with Hover Text

We use update_traces to restyle the bars: marker_color sets the bar color, marker_line_color the outline color, marker_line_width the outline width, and opacity the transparency.

Finally, we add the title.

x = ['Product A', 'Product B', 'Product C']
y = [20, 14, 23]

# Use the hovertext kw argument for hover text
fig = go.Figure(data=[go.Bar(x=x, y=y,
hovertext=['27% market share', '24% market share', '19% market share'])])
# Customize aspect
fig.update_traces(marker_color='rgb(158,202,225)', marker_line_color='rgb(8,48,107)',
marker_line_width=1.5, opacity=0.6)
fig.update_layout(title='January 2013 Sales Report')
fig.show()

Bar Chart with Direct Labels

Setting the text attribute of go.Bar() renders the corresponding y values directly on the bars.

x = ['Product A', 'Product B', 'Product C']
y = [20, 14, 23]

# Use textposition='auto' for direct text
fig = go.Figure(data=[go.Bar(
x=x, y=y,
text=y,
textposition='auto',
)])

fig.show()

Controlling text fontsize with uniformtext

If you want all the text labels to have the same size, you can use the uniformtext layout parameter. The minsize attribute sets the font size, and the mode attribute sets what happens for labels which cannot fit with the desired fontsize: either hide them or show them with overflow. In the example below we also force the text to be outside of bars with textposition.

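In the previous example the numbers sit inside the bars, which does not look great. To make the text float above the bars, we use update_traces to set textposition to outside (the other options are inside, auto, and none).

With auto, Plotly chooses between outside and inside based on the length of the text and the width of the bar.

Next we set the text template, the texttemplate attribute. The format '%{text:.2s}' shows each value with two significant digits and an SI prefix.

Then we tidy things up: uniformtext_minsize=8 sets the minimum label font size to 8, and uniformtext_mode is set to 'show' (the alternative is 'hide').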

# Parentheses let the method chain span several lines
df = (
    px.data.gapminder()
    .query("continent == 'Europe' and year == 2007 and pop > 2.e6")
)
fig = px.bar(
df,
y='pop',
x='country',
text='pop')
fig.update_traces(
texttemplate='%{text:.2s}',
textposition='outside')
fig.update_layout(
uniformtext_minsize=8,
uniformtext_mode='show')
fig.show()

Rotated Bar Chart Labels

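This grouped chart is assembled "by hand", one trace at a time, but the point of interest is the xaxis_tickangle attribute set through update_layout.

It tilts the x-axis tick labels by the given angle; the example below uses -45.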

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

fig = go.Figure()
fig.add_trace(go.Bar(
x=months,
y=[20, 14, 25, 16, 18, 22, 19, 15, 12, 16, 14, 17],
name='Primary Product',
marker_color='indianred'
))
fig.add_trace(go.Bar(
x=months,
y=[19, 14, 22, 14, 16, 19, 15, 14, 10, 12, 12, 16],
name='Secondary Product',
marker_color='lightsalmon'
))

# Here we modify the tickangle of the xaxis, resulting in rotated labels.
fig.update_layout(barmode='group', xaxis_tickangle=-45)
fig.show()

Customizing Individual Bar Colors

Setting marker_color lets us assign a color to each individual bar.

marker_color can be a single value (applied to every bar) or an iterable.

colors = ['lightslategray',] * 5
colors[1] = 'crimson'

fig = go.Figure(data=[go.Bar(
x=['Feature A', 'Feature B', 'Feature C',
'Feature D', 'Feature E'],
y=[20, 14, 23, 25, 22],
marker_color=colors # marker color can be a single color value or an iterable
)])
fig.update_layout(title='Least Used Feature')

Customizing Individual Bar Widths

We can even customize the width of each bar.

fig = go.Figure(data=[go.Bar(
x=[1, 2, 3, 5.5, 10],
y=[10, 8, 6, 4, 2],
width=[0.8, 0.8, 0.8, 3.5, 4] # customize width here
)])

fig.show()

Customizing Individual Bar Base

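We can set base by hand, which is where each bar starts. Below, the bases are -500, -600, and -700 and the bar heights are 500, 600, and 700, so each bar grows from its base and ends at y=0, giving the effect of "flipped" bars.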

years = ['2016','2017','2018']

fig = go.Figure()
fig.add_trace(go.Bar(x=years, y=[500, 600, 700],
base=[-500,-600,-700],
marker_color='crimson',
name='expenses'))
fig.add_trace(go.Bar(x=years, y=[300, 400, 700],
base=0,
marker_color='lightslategrey',
name='revenue'
))

fig.show()

Colored and Styled Bar Chart

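In this example several parameters of the layout are customized, hence it is convenient to use directly the go.Layout(...) constructor instead of calling fig.update.

First we add two traces, Rest of world and China. Then we set the layout.

xaxis_tickfont_size sets the font size of the tick labels under the x axis.

Next come the yaxis settings.

Then the legend: bgcolor and bordercolor make both its background and its border transparent.

Finally, the bars are arranged by group, and we set the gap between groups of bars and the gap between bars within a group.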

years = [1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003,
2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012]

fig = go.Figure()
fig.add_trace(go.Bar(x=years,
y=[219, 146, 112, 127, 124, 180, 236, 207, 236, 263,
350, 430, 474, 526, 488, 537, 500, 439],
name='Rest of world',
marker_color='rgb(55, 83, 109)'
))
fig.add_trace(go.Bar(x=years,
y=[16, 13, 10, 11, 28, 37, 43, 55, 56, 88, 105, 156, 270,
299, 340, 403, 549, 499],
name='China',
marker_color='rgb(26, 118, 255)'
))

fig.update_layout(
title='US Export of Plastic Scrap',
xaxis_tickfont_size=14,
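yaxis=dict(
# title.font replaces the older titlefont attribute, which newer Plotly versions no longer accept
title=dict(text='USD (millions)', font=dict(size=16)),
tickfont_size=14,
),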
legend=dict(
x=0,
y=1.0,
bgcolor='rgba(255, 255, 255, 0)',
bordercolor='rgba(255, 255, 255, 0)'
),
barmode='group',
bargap=0.15, # gap between bars of adjacent location coordinates.
bargroupgap=0.1 # gap between bars of the same location coordinate.
)
fig.show()

Bar Chart with Relative Barmode

With “relative” barmode, the bars are stacked on top of one another, with negative values below the axis, positive values above.

x = [1, 2, 3, 4]

fig = go.Figure()
fig.add_trace(go.Bar(x=x, y=[1, 4, 9, 16]))
fig.add_trace(go.Bar(x=x, y=[6, -8, -4.5, 8]))
fig.add_trace(go.Bar(x=x, y=[-15, -3, 4.5, -8]))
fig.add_trace(go.Bar(x=x, y=[-1, 3, -3, -4]))

fig.update_layout(barmode='relative', title_text='Relative Barmode')
fig.show()

Bar Chart with Sorted or Ordered Categories

Set categoryorder to "category ascending" or "category descending" for the alphanumerical order of the category names, or "total ascending" or "total descending" for numerical order of values. See the categoryorder reference for more information. Note that sorting the bars by a particular trace isn't possible right now; it's only possible to sort by the total values. Of course, you can always sort your data before plotting it if you need more customization.

This example orders the bar chart alphabetically with categoryorder: 'category ascending'

import plotly.graph_objects as go

x=['b', 'a', 'c', 'd']
fig = go.Figure(go.Bar(x=x, y=[2,5,1,9], name='Montreal'))
fig.add_trace(go.Bar(x=x, y=[1, 4, 9, 16], name='Ottawa'))
fig.add_trace(go.Bar(x=x, y=[6, 8, 4.5, 8], name='Toronto'))

fig.update_layout(barmode='stack', xaxis={'categoryorder':'category ascending'})
fig.show()

This example shows how to customise sort ordering by defining categoryorder to “array” to derive the ordering from the attribute categoryarray.

import plotly.graph_objects as go

x=['b', 'a', 'c', 'd']
fig = go.Figure(go.Bar(x=x, y=[2,5,1,9], name='Montreal'))
fig.add_trace(go.Bar(x=x, y=[1, 4, 9, 16], name='Ottawa'))
fig.add_trace(go.Bar(x=x, y=[6, 8, 4.5, 8], name='Toronto'))

fig.update_layout(barmode='stack', xaxis={'categoryorder':'array', 'categoryarray':['d','a','c','b']})
fig.show()

This example orders the bar chart by descending value with categoryorder: 'total descending'

import plotly.graph_objects as go

x=['b', 'a', 'c', 'd']
fig = go.Figure(go.Bar(x=x, y=[2,5,1,9], name='Montreal'))
fig.add_trace(go.Bar(x=x, y=[1, 4, 9, 16], name='Ottawa'))
fig.add_trace(go.Bar(x=x, y=[6, 8, 4.5, 8], name='Toronto'))

fig.update_layout(barmode='stack', xaxis={'categoryorder':'total descending'})
fig.show()

Pie Charts

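Next we turn to pie charts. Parameter reference: https://plotly.com/python-api-reference/generated/plotly.graph_objects.Pie.html#plotly.graph_objects.Pie

Plotly Express

We start by drawing pie charts with Plotly Express. Pass the dataset together with values and names, and px computes each value's share automatically; names supplies the labels shown in the legend beside the chart. Without names, the sectors have no labels.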

In px.pie, data visualized by the sectors of the pie is set in values. The sector labels are set in names.

import plotly.express as px
df = px.data.gapminder().query("year == 2007").query("continent == 'Europe'")
df.loc[df['pop'] < 2.e6, 'country'] = 'Other countries'
# Represent only large countries
fig = px.pie(
df,
values='pop',
names='country',
title='Population of European continent')
fig.show()

Pie chart with repeated labels

Lines of the dataframe with the same value for names are grouped together in the same sector.

# This dataframe has 244 lines, but 4 distinct values for `day`
df = px.data.tips()
fig = px.pie(df, values='tip', names='day')
fig.show()
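To see what the grouping described above amounts to, the following sketch reproduces it by hand with pandas on a tiny made-up table: rows with the same day are summed, and each sum is divided by the grand total. It illustrates the idea only and is not Plotly's internal code.

```python
import pandas as pd

# A small stand-in for the tips data: several rows share the same day.
df = pd.DataFrame({
    "day": ["Sat", "Sun", "Sat", "Thur", "Sun", "Fri"],
    "tip": [2.0, 3.0, 1.0, 2.5, 1.5, 2.0],
})

totals = df.groupby("day")["tip"].sum()   # one value per distinct day
shares = totals / totals.sum()            # fraction of the pie per sector
print(shares)
```

Passing the same DataFrame to px.pie(df, values='tip', names='day') would draw one sector per distinct day, sized by these shares.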

Setting the color of pie sectors

The color_discrete_sequence argument gives the pie chart a different color palette.

df = px.data.tips()
fig = px.pie(df, values='tip', names='day', color_discrete_sequence=px.colors.sequential.RdBu)
fig.show()

Using an explicit mapping for discrete colors

For more information about discrete colors, see the dedicated page.

We can also pass a dict as color_discrete_map to assign a specific color to each value of day.

df = px.data.tips()
fig = px.pie(df, values='tip', names='day', color='day',
color_discrete_map={'Thur':'lightcyan',
'Fri':'cyan',
'Sat':'royalblue',
'Sun':'darkblue'})
fig.show()

Customizing a pie chart created with px.pie

In the example below, we first create a pie chart with px.pie, using some of its options such as hover_data (which columns should appear in the hover) or labels (renaming column names).

For further tuning, we call fig.update_traces to set other parameters of the chart (you can also use fig.update_layout for changing the layout).

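We first create the initial figure with px.pie, setting the title, hover_data, and labels.

The hover_data column is named lifeExp in the dataset, which is a little cryptic, so we set labels={'lifeExp':'life expectancy'}. The hover text then shows life expectancy instead of lifeExp.

Finally, update_traces adds the label text and percentage to the pie chart. textinfo can be any combination of the four options below.

textinfo: any combination of ['label', 'text', 'value', 'percent'] joined with '+' characters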

df = px.data.gapminder().query("year == 2007").query("continent == 'Americas'")
fig = px.pie(
df,
values='pop',
names='country',
title='Population of American continent',
hover_data=['lifeExp'],
labels={'lifeExp':'life expectancy'})
fig.update_traces(textposition='inside', textinfo='percent+label')
fig.show()

go.Pie

Basic Pie Chart with go.Pie

In go.Pie, data visualized by the sectors of the pie is set in values. The sector labels are set in labels. The sector colors are set in marker.colors.

If you’re looking instead for a multilevel hierarchical pie-like chart, go to the Sunburst tutorial.

Next we draw pie charts with go. The minimum needed is labels and values.

import plotly.graph_objects as go

labels = ['Oxygen','Hydrogen','Carbon_Dioxide','Nitrogen']
values = [4500, 2500, 1053, 500]

fig = go.Figure(data=[go.Pie(labels=labels, values=values)])
fig.show()

Styled Pie Chart

Colors can be given as RGB triplets or hexadecimal strings, or with CSS color names as below.

We can also use update_traces to set hoverinfo and textinfo by hand, and pass marker to set the sector colors and outline properties.

colors = ['gold', 'mediumturquoise', 'darkorange', 'lightgreen']

fig = go.Figure(data=[go.Pie(
labels=['Oxygen','Hydrogen','Carbon_Dioxide','Nitrogen'],
values=[4500,2500,1053,500])])
fig.update_traces(
hoverinfo='label+percent',
textinfo='value',
textfont_size=20,
marker=dict(colors=colors, line=dict(color='#000000', width=2)))
fig.show()

Controlling text fontsize with uniformtext

If you want all the text labels to have the same size, you can use the uniformtext layout parameter. The minsize attribute sets the font size, and the mode attribute sets what happens for labels which cannot fit with the desired fontsize: either hide them or show them with overflow.

In the example below we also force the text to be inside with textposition, otherwise text labels which do not fit are displayed outside of pie sectors.

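textposition controls whether the text is placed inside or outside the pie.

We then adjust the layout to make the figure cleaner. With uniformtext_mode='hide', Plotly labels sectors selectively: a label is hidden when its sector is too small to fit it. With uniformtext_mode='show', every label is drawn.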

df = px.data.gapminder().query("continent == 'Asia'")
fig = px.pie(df, values='pop', names='country')
fig.update_traces(textposition='inside')
fig.update_layout(uniformtext_minsize=12, uniformtext_mode='hide')
fig.show()

We can also set textposition='auto' together with uniformtext_mode='show', which keeps every label visible.

Controlling text orientation inside pie sectors

The insidetextorientation attribute controls the orientation of text inside sectors. With “auto” the texts may automatically be rotated to fit with the maximum size inside the slice. Using “horizontal” (resp. “radial”, “tangential”) forces text to be horizontal (resp. radial or tangential)

For a figure fig created with plotly express, use fig.update_traces(insidetextorientation='...') to change the text orientation. (With go, the attribute can instead be passed directly to go.Pie.)

radial: the text radiates outward from the center of the pie. With horizontal, the text always stays horizontal. The other options are tangential and auto.

labels = ['Oxygen','Hydrogen','Carbon_Dioxide','Nitrogen']
values = [4500, 2500, 1053, 500]

fig = go.Figure(data=[go.Pie(
labels=labels,
values=values,
textinfo='label+percent',
insidetextorientation='radial')])
fig.show()

Effect of horizontal:

Effect of tangential:

Donut Chart

Setting the hole attribute leaves an empty circle in the middle of the pie, which produces a donut chart.

labels = ['Oxygen','Hydrogen','Carbon_Dioxide','Nitrogen']
values = [4500, 2500, 1053, 500]

# Use `hole` to create a donut-like pie chart
fig = go.Figure(data=[go.Pie(labels=labels, values=values, hole=.3)])
fig.show()

Pulling sectors out from the center

For a “pulled-out” or “exploded” layout of the pie chart, use the pull argument. It can be a scalar for pulling all sectors or an array to pull only some of the sectors.

Setting the pull attribute "pulls" a sector out from the rest of the pie.

labels = ['Oxygen','Hydrogen','Carbon_Dioxide','Nitrogen']
values = [4500, 2500, 1053, 500]

# pull is given as a fraction of the pie radius
fig = go.Figure(data=[go.Pie(
labels=labels,
values=values,
pull=[0, 0, 0.2, 0])])
fig.show()
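
Rather than typing the pull array by hand, it can be derived from the data. The sketch below is one possible approach (an illustration, not something from the official docs): it pulls out only the largest sector and leaves the others in place. The 0.2 fraction is an arbitrary choice.

```python
labels = ['Oxygen', 'Hydrogen', 'Carbon_Dioxide', 'Nitrogen']
values = [4500, 2500, 1053, 500]

# Pull the largest sector out by 0.2 of the pie radius; keep the rest at 0.
largest = max(values)
pull = [0.2 if v == largest else 0 for v in values]
print(pull)
```

The resulting list can then be passed as pull=pull to go.Pie in place of the literal list used above.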

Pie Charts in subplots

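The following shows how to use subplots.

First, read up on plotly.subplots: https://plotly.com/python/subplots/

Then look at the available subplot types:

specs is an optional argument of make_subplots. It sets the type of each subplot and is written as a two-dimensional list (rows by columns) in which each subplot is described by a dict of key-value pairs.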

By default, the make_subplots function assumes that the traces that will be added to all subplots are 2-dimensional cartesian traces (e.g. scatter, bar, histogram, violin, etc.). Traces with other subplot types (e.g. scatterpolar, scattergeo, parcoords, etc.) are supported by specifying the type subplot option in the specs argument to make_subplots.

Here are the possible values for the type option:

  • "xy": 2D Cartesian subplot type for scatter, bar, etc. This is the default if no type is specified.
  • "scene": 3D Cartesian subplot for scatter3d, cone, etc.
  • "polar": Polar subplot for scatterpolar, barpolar, etc.
  • "ternary": Ternary subplot for scatterternary.
  • "mapbox": Mapbox subplot for scattermapbox.
  • "domain": Subplot type for traces that are individually positioned. pie, parcoords, parcats, etc.
  • trace type: A trace type name (e.g. "bar", "scattergeo", "carpet", "mesh", etc.) which will be used to determine the appropriate subplot type for that trace.
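
Since specs must have exactly one dict per grid cell, writing it out by hand gets tedious for larger grids. The sketch below builds a grid of 'domain' cells, which is the type pie charts need; the helper name domain_specs is hypothetical and not part of plotly.

```python
def domain_specs(rows, cols):
    """Return a rows x cols specs grid in which every cell is a 'domain' subplot."""
    # A fresh dict per cell, so editing one cell later does not affect the others.
    return [[{'type': 'domain'} for _ in range(cols)] for _ in range(rows)]

specs = domain_specs(2, 2)
print(specs)
```

The result can be passed as make_subplots(rows=2, cols=2, specs=specs).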

Then we add traces to the subplot cells we have laid out.

Finally, annotations place text in the center of each donut.

import plotly.graph_objects as go
from plotly.subplots import make_subplots

labels = ["US",
"China",
"European Union",
"Russian Federation",
"Brazil",
"India",
"Rest of World"]

# Create subplots: use 'domain' type for Pie subplot
fig = make_subplots(
rows=1,
cols=2,
specs=[[{'type':'domain'}, {'type':'domain'}]])
fig.add_trace(go.Pie(
labels=labels,
values=[16, 15, 12, 6, 5, 4, 42],
name="GHG Emissions"),1, 1)
fig.add_trace(go.Pie(
labels=labels,
values=[27, 11, 25, 8, 1, 3, 25],
name="CO2 Emissions"),1, 2)

# Use `hole` to create a donut-like pie chart
fig.update_traces(hole=.4, hoverinfo="label+percent+name")

fig.update_layout(
title_text="Global Emissions 1990-2011",
# Add annotations in the center of the donut pies.
annotations=[dict(text='GHG', x=0.18, y=0.5, font_size=20, showarrow=False),
dict(text='CO2', x=0.82, y=0.5, font_size=20, showarrow=False)])
fig.show()

Plot chart with area proportional to total count

First we lay out the subplot grid, here 2 rows by 2 columns.

import plotly.graph_objects as go
from plotly.subplots import make_subplots

labels = ['1st', '2nd', '3rd', '4th', '5th']

# Define color sets of paintings
night_colors = ['rgb(56, 75, 126)',
'rgb(18, 36, 37)',
'rgb(34, 53, 101)',
'rgb(36, 55, 57)',
'rgb(6, 4, 4)']
sunflowers_colors = ['rgb(177, 127, 38)',
'rgb(205, 152, 36)',
'rgb(99, 79, 37)',
'rgb(129, 180, 179)',
'rgb(124, 103, 37)']
irises_colors = ['rgb(33, 75, 99)',
'rgb(79, 129, 102)',
'rgb(151, 179, 100)',
'rgb(175, 49, 35)',
'rgb(36, 73, 147)']
cafe_colors = ['rgb(146, 123, 21)',
'rgb(177, 180, 34)',
'rgb(206, 206, 40)',
'rgb(175, 51, 21)',
'rgb(35, 36, 21)']

# Create subplots, using 'domain' type for pie charts
specs = [
[{'type':'domain'}, {'type':'domain'}],
[{'type':'domain'}, {'type':'domain'}]
]
fig = make_subplots(rows=2, cols=2, specs=specs)

# Define pie charts
fig.add_trace(go.Pie(
labels=labels,
values=[38, 27, 18, 10, 7],
name='Starry Night',
marker_colors=night_colors), 1, 1)
fig.add_trace(go.Pie(
labels=labels,
values=[28, 26, 21, 15, 10],
name='Sunflowers',
marker_colors=sunflowers_colors), 1, 2)
fig.add_trace(go.Pie(
labels=labels,
values=[38, 19, 16, 14, 13],
name='Irises',
marker_colors=irises_colors), 2, 1)
fig.add_trace(go.Pie(
labels=labels,
values=[31, 24, 19, 18, 8],
name='The Night Café',
marker_colors=cafe_colors), 2, 2)

# Tune layout and hover info
fig.update_traces(hoverinfo='label+percent+name', textinfo='none')
fig.update(
layout_title_text='Van Gogh: 5 Most Prominent Colors Shown Proportionally',
layout_showlegend=False)
# Note: the fig.update call above can also be written as
# fig.update_layout(
#     title='Van Gogh: 5 Most Prominent Colors Shown Proportionally',
#     showlegend=False
# )
fig = go.Figure(fig)
fig.show()

Bubble Charts

plotly.express

A bubble chart is a scatter plot in which a third dimension of the data is shown through the size of markers.

We first show a bubble chart example using Plotly Express. Plotly Express is the easy-to-use, high-level interface to Plotly, which operates on a variety of types of data and produces easy-to-style figures. The size of markers is set from the dataframe column given as the size parameter.

First we draw a bubble chart with Plotly Express, which is very convenient: we only need to pass one column to size, and px takes care of the rendering.

import plotly.express as px
df = px.data.gapminder()

fig = px.scatter(
df.query("year==2007"),
x="gdpPercap",
y="lifeExp",
size="pop",
color="continent",
hover_name="country",
log_x=True,
size_max=60)
fig.show()

go.Scatter

Of course, we can also use go.Scatter and set the marker sizes by hand.

Simple Bubble Chart

import plotly.graph_objects as go

fig = go.Figure(data=[go.Scatter(
x=[1, 2, 3, 4],
y=[10, 11, 12, 13],
mode='markers',
marker_size=[40, 60, 80, 100])
])

fig.show()

Setting Marker Size and Color

We can also set the colors by hand.

fig = go.Figure(data=[go.Scatter(
x=[1, 2, 3, 4], y=[10, 11, 12, 13],
mode='markers',
marker=dict(
color=['rgb(93, 164, 214)', 'rgb(255, 144, 14)',
'rgb(44, 160, 101)', 'rgb(255, 65, 54)'],
opacity=[1, 0.8, 0.6, 0.4],
size=[40, 60, 80, 100],
)
)])

fig.show()

Scaling the Size of Bubble Charts

To scale the bubble size, use the attribute sizeref. We recommend using the following formula to calculate a sizeref value:
sizeref = 2. * max(array of size values) / (desired maximum marker size ** 2)
Note that setting ‘sizeref’ to a value greater than 1, decreases the rendered marker sizes, while setting ‘sizeref’ to less than 1, increases the rendered marker sizes. See https://plotly.com/python/reference/#scatter-marker-sizeref for more information. Additionally, we recommend setting the sizemode attribute: https://plotly.com/python/reference/#scatter-marker-sizemode to area.

size = [20, 40, 60, 80, 100, 80, 60, 40, 20, 40]
fig = go.Figure(data=[go.Scatter(
x=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
y=[11, 12, 10, 11, 12, 11, 12, 13, 12, 11],
mode='markers',
marker=dict(
size=size,
sizemode='area',
sizeref=2.*max(size)/(40.**2),
sizemin=4
)
)])

fig.show()
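
The sizeref formula can be checked with a few lines of arithmetic. The sketch below computes sizeref for the size list above with a desired maximum marker size of 40. The rendered_diameter function is an assumption on my part: it encodes my reading of how sizemode='area' turns a size value into a marker diameter, chosen so that the largest value maps exactly to the desired maximum. It is not a plotly API.

```python
import math

size = [20, 40, 60, 80, 100, 80, 60, 40, 20, 40]
desired_max_diameter = 40  # the "desired maximum marker size", in pixels

# Recommended formula from the text above
sizeref = 2.0 * max(size) / (desired_max_diameter ** 2)

def rendered_diameter(value, sizeref):
    # Assumption: with sizemode='area', diameter = 2 * sqrt(value / (2 * sizeref)),
    # so marker area (not diameter) grows linearly with the value.
    return 2.0 * math.sqrt(value / (2.0 * sizeref))

print(sizeref)
print(rendered_diameter(max(size), sizeref))
```

Under this assumption, a value four times smaller gives a marker with half the diameter, which is what makes the areas comparable.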

Hover Text with Bubble Charts

We can add hover text to each marker through the text attribute.

fig = go.Figure(data=[go.Scatter(
x=[1, 2, 3, 4], y=[10, 11, 12, 13],
text=['A<br>size: 40', 'B<br>size: 60', 'C<br>size: 80', 'D<br>size: 100'],
mode='markers',
marker=dict(
color=['rgb(93, 164, 214)',
'rgb(255, 144, 14)',
'rgb(44, 160, 101)',
'rgb(255, 65, 54)'],
size=[40, 60, 80, 100],
)
)])

fig.show()
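
When there are more than a handful of points, the hover strings are easier to generate than to type. A minimal sketch, assuming the point names and sizes are held in two parallel lists (the variable names are illustrative):

```python
names = ['A', 'B', 'C', 'D']
sizes = [40, 60, 80, 100]

# '<br>' is the line break used in the hover text of the example above.
text = [f"{name}<br>size: {s}" for name, s in zip(names, sizes)]
print(text)
```

The list can be passed as text=text and the same sizes list as marker size, so the two stay consistent.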

Bubble Charts with Colorscale

Setting showscale=True in marker adds a color bar for the colorscale.

fig = go.Figure(data=[go.Scatter(
x=[1, 3.2, 5.4, 7.6, 9.8, 12.5],
y=[1, 3.2, 5.4, 7.6, 9.8, 12.5],
mode='markers',
marker=dict(
color=[120, 125, 130, 135, 140, 145],
size=[15, 30, 55, 70, 90, 110],
showscale=True
)
)])

fig.show()

Categorical Bubble Charts

import math
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

# Load data, define hover text and bubble size
data = px.data.gapminder()
df_2007 = data[data['year']==2007]
df_2007 = df_2007.sort_values(['continent', 'country'])

hover_text = []
bubble_size = []

for index, row in df_2007.iterrows():
    hover_text.append(('Country: {country}<br>'+
                       'Life Expectancy: {lifeExp}<br>'+
                       'GDP per capita: {gdp}<br>'+
                       'Population: {pop}<br>'+
                       'Year: {year}').format(country=row['country'],
                                              lifeExp=row['lifeExp'],
                                              gdp=row['gdpPercap'],
                                              pop=row['pop'],
                                              year=row['year']))
    bubble_size.append(math.sqrt(row['pop']))

df_2007['text'] = hover_text
df_2007['size'] = bubble_size
sizeref = 2.*max(df_2007['size'])/(100**2)

# Dictionary with dataframes for each continent
continent_names = ['Africa', 'Americas', 'Asia', 'Europe', 'Oceania']
continent_data = {continent:df_2007.query("continent == '%s'" %continent)
for continent in continent_names}

# Create figure
fig = go.Figure()

for continent_name, continent in continent_data.items():
    fig.add_trace(go.Scatter(
        x=continent['gdpPercap'], y=continent['lifeExp'],
        name=continent_name,
        text=continent['text'],
        marker_size=continent['size'],
    ))

# Tune marker appearance and layout
fig.update_traces(mode='markers', marker=dict(sizemode='area',
sizeref=sizeref,
line_width=2))

fig.update_layout(
title='Life Expectancy v. Per Capita GDP, 2007',
xaxis=dict(
title='GDP per capita (2000 dollars)',
gridcolor='white',
type='log',
gridwidth=2,
),
yaxis=dict(
title='Life Expectancy (years)',
gridcolor='white',
gridwidth=2,
),
paper_bgcolor='rgb(243, 243, 243)',
plot_bgcolor='rgb(243, 243, 243)',
)
fig.show()

Sunburst Plots

Sunburst plots are one of the more eye-catching chart types; they can be seen as an enhanced version of pie charts.

Sunburst plots visualize hierarchical data spanning outwards radially from root to leaves. The sunburst sector hierarchy is determined by the entries in labels (names in px.sunburst) and in parents. The root starts from the center and children are added to the outer rings.

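Main arguments. The three most important ones are: labels, the text shown on the sunburst sectors; parents, where if B sits in the ring just outside A (and A in the ring just inside B), then A is the parent of B; and values, which determine the proportions.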

  1. labels (names in px.sunburst since labels is reserved for overriding columns names): sets the labels of sunburst sectors.
  2. parents: sets the parent sectors of sunburst sectors. An empty string '' is used for the root node in the hierarchy. In this example, the root is “Eve”.
  3. values: sets the values associated with sunburst sectors, determining their width (See the branchvalues section below for different modes for setting the width).
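
The labels/parents encoding is just a tree written as two parallel lists. The sketch below, which uses only the standard library and the "Eve" data from the examples in this section, rebuilds the parent-to-children mapping so the rings can be read off directly. It is an illustration of the data layout, not something plotly requires you to do.

```python
from collections import defaultdict

labels = ["Eve", "Cain", "Seth", "Enos", "Noam", "Abel", "Awan", "Enoch", "Azura"]
parents = ["", "Eve", "Eve", "Seth", "Seth", "Eve", "Eve", "Awan", "Eve"]

# Map each parent to the list of its direct children.
children = defaultdict(list)
for label, parent in zip(labels, parents):
    children[parent].append(label)

roots = children[""]              # sectors whose parent is the empty string
second_ring = children["Eve"]     # the ring just outside the root
print(roots, second_ring)
```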

plotly.express

With px.sunburst,each row of the DataFrame is represented as a sector of the sunburst.

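To make parents easier to understand, here are a few examples:

The parent of Eve is "" because Eve is the innermost ring and has no parent.

The parent of Cain is "Eve" because Cain sits in the ring just outside Eve.

The parent of Noam is "Seth" because Noam sits in the ring just outside Seth.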

import plotly.express as px
data = dict(
character=["Eve", "Cain", "Seth", "Enos", "Noam", "Abel", "Awan", "Enoch", "Azura"],
parent=["", "Eve", "Eve", "Seth", "Seth", "Eve", "Eve", "Awan", "Eve" ],
value=[10, 14, 12, 10, 2, 6, 6, 4, 4])

fig =px.sunburst(
data,
names='character',
parents='parent',
values='value',
)
fig.show()

Sunburst of a rectangular DataFrame

Hierarchical data are often stored as a rectangular dataframe, with different columns corresponding to different levels of the hierarchy. px.sunburst can take a path parameter corresponding to a list of columns. Note that id and parent should not be provided if path is given.

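This is a simpler approach: set the path argument. path is an iterable of column names, ordered from the innermost ring to the outermost.

For example, the figure below uses the tips dataset, from which we pick three columns: day, time and sex.

So the innermost ring is day, split into Thur, Fri, Sat and Sun; the middle ring is time, split into Dinner and Lunch; and the outer ring is sex, split into Female and Male.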

df = px.data.tips()
fig = px.sunburst(
df,
path=['day', 'time', 'sex'],
values='total_bill')
fig.show()
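
To see how a rectangular table relates to the labels/parents form, the sketch below flattens a few rows into ids, parents and summed values. This is my own illustration and not how px.sunburst is implemented: the function name flatten_path, the "/" used to join ids, and the three sample rows are all made up for the example (the rows are not taken from the tips dataset).

```python
# Made-up sample rows with the same column names as the tips example
rows = [
    {'day': 'Sun', 'time': 'Dinner', 'sex': 'Male', 'total_bill': 20.0},
    {'day': 'Sun', 'time': 'Dinner', 'sex': 'Female', 'total_bill': 10.0},
    {'day': 'Thur', 'time': 'Lunch', 'sex': 'Male', 'total_bill': 5.0},
]

def flatten_path(rows, path, value_key):
    """Turn rows plus an inner-to-outer column path into ids, parents, values."""
    totals = {}     # node id -> sum of value_key over all rows under that node
    parent_of = {}  # node id -> id of its parent ('' for the innermost ring)
    for row in rows:
        parent = ''
        for column in path:
            # Prefix with the parent id so equal labels in different branches stay distinct
            node = f"{parent}/{row[column]}" if parent else row[column]
            totals[node] = totals.get(node, 0) + row[value_key]
            parent_of[node] = parent
            parent = node
    ids = list(totals)
    return ids, [parent_of[i] for i in ids], [totals[i] for i in ids]

ids, parents, values = flatten_path(rows, ['day', 'time', 'sex'], 'total_bill')
for i, p, v in zip(ids, parents, values):
    print(i, '<-', repr(p), v)
```

Because every node's value already includes its descendants, these lists correspond to the branchvalues "total" convention described later in this section.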

Sunburst of a rectangular DataFrame with continuous color argument

If a color argument is passed, the color of a node is computed as the average of the color values of its children, weighted by their values.

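color_continuous_midpoint is the value at the middle of the color scale. Here it is set to the world average life expectancy weighted by population, which shows up as the white region of the figure, at roughly 70 years.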

import numpy as np
df = px.data.gapminder().query("year == 2007")
fig = px.sunburst(
df,
path=['continent', 'country'],
values='pop',
color='lifeExp',
hover_data=['iso_alpha'],
color_continuous_scale='RdBu',
color_continuous_midpoint=np.average(df['lifeExp'], weights=df['pop']))
fig.show()
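
The midpoint above is a population-weighted average, which can differ noticeably from a plain mean. The sketch below shows the arithmetic in plain Python with three made-up countries; the numbers are invented for illustration and are not gapminder values.

```python
def weighted_average(values, weights):
    # Sum of value * weight, divided by the total weight
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

life_exp = [80.0, 60.0, 70.0]                    # made-up life expectancies
population = [1_000_000, 3_000_000, 1_000_000]   # made-up populations

midpoint = weighted_average(life_exp, population)
plain_mean = sum(life_exp) / len(life_exp)
print(midpoint, plain_mean)
```

Here the populous country with the lowest life expectancy drags the weighted average below the plain mean, which is the same effect the weights=df['pop'] argument has in the example above.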

Sunburst of a rectangular DataFrame with discrete color argument

Here is how the official documentation describes the color argument:

When the argument of color corresponds to non-numerical data, discrete colors are used. If a sector has the same value of the color column for all its children, then the corresponding color is used, otherwise the first color of the discrete color sequence is used.

df = px.data.tips()
fig = px.sunburst(
df,
path=['sex', 'day', 'time'],
values='total_bill',
color='day')
fig.show()

In the example below the color of Saturday and Sunday sectors is the same as Dinner because there are only Dinner entries for Saturday and Sunday. However, for Female -> Friday there are both lunches and dinners, hence the “mixed” color (blue here) is used.

df = px.data.tips()
fig = px.sunburst(
df,
path=['sex', 'day', 'time'],
values='total_bill',
color='time')
fig.show()

Using an explicit mapping for discrete colors

For more information about discrete colors, see the dedicated page.

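We can also pass a dict to color_discrete_map to specify which color each value should use.

The '(?)' key stands for sectors whose value is not determined, such as the "mixed" sectors mentioned above.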

df = px.data.tips()
fig = px.sunburst(
df,
path=['sex', 'day', 'time'],
values='total_bill',
color='time',
color_discrete_map={'(?)':'black', 'Lunch':'gold', 'Dinner':'darkblue'})
fig.show()

Rectangular data with missing values

If the dataset is not fully rectangular, missing values should be supplied as None. Note that the parents of None entries must be a leaf, i.e. it cannot have other children than None (otherwise a ValueError is raised).

If part of the data is missing, the missing entries are given as None.

import pandas as pd
vendors = ["A", "B", "C", "D", None, "E", "F", "G", "H", None]
sectors = ["Tech", "Tech", "Finance", "Finance", "Other",
"Tech", "Tech", "Finance", "Finance", "Other"]
regions = ["North", "North", "North", "North", "North",
"South", "South", "South", "South", "South"]
sales = [1, 3, 2, 4, 1, 2, 2, 1, 4, 1]
df = pd.DataFrame(
dict(vendors=vendors, sectors=sectors, regions=regions, sales=sales)
)
print(df)
fig = px.sunburst(df, path=['regions', 'sectors', 'vendors'], values='sales')
fig.show()

go.Sunburst

If Plotly Express does not provide a good starting point, it is also possible to use the more generic go.Sunburst class from plotly.graph_objects.

Next we draw with go.Sunburst.

As before, we need to provide the three basic arguments: labels, parents and values.

import plotly.graph_objects as go

fig =go.Figure(go.Sunburst(
labels=["Eve", "Cain", "Seth", "Enos", "Noam", "Abel", "Awan", "Enoch", "Azura"],
parents=["", "Eve", "Eve", "Seth", "Seth", "Eve", "Eve", "Awan", "Eve" ],
values=[10, 14, 12, 10, 2, 6, 6, 4, 4],
))
# Update layout for tight margin
# See https://plotly.com/python/creating-and-updating-figures/
fig.update_layout(margin = dict(t=0, l=0, r=0, b=0))

fig.show()

Sunburst with Repeated Labels

fig =go.Figure(go.Sunburst(
ids=[
"North America", "Europe", "Australia", "North America - Football", "Soccer",
"North America - Rugby", "Europe - Football", "Rugby",
"Europe - American Football","Australia - Football", "Association",
"Australian Rules", "Australia - American Football", "Australia - Rugby",
"Rugby League", "Rugby Union"
],
labels= [
"North<br>America", "Europe", "Australia", "Football", "Soccer", "Rugby",
"Football", "Rugby", "American<br>Football", "Football", "Association",
"Australian<br>Rules", "American<br>Football", "Rugby", "Rugby<br>League",
"Rugby<br>Union"
],
parents=[
"", "", "", "North America", "North America", "North America", "Europe",
"Europe", "Europe","Australia", "Australia - Football", "Australia - Football",
"Australia - Football", "Australia - Football", "Australia - Rugby",
"Australia - Rugby"
],
))
fig.update_layout(margin = dict(t=0, l=0, r=0, b=0))

fig.show()

Branchvalues

With branchvalues "total", the value of the parent represents the width of its wedge. In the example below, "Enoch" is 4 and "Awan" is 6 and so Enoch's width is 4/6ths of Awan's. With branchvalues "remainder", the parent's width is determined by its own value plus those of its children. So, Enoch's width is 4/10ths of Awan's (4 / (6 + 4)).

Note that this means that the sum of the values of the children cannot exceed the value of their parent when branchvalues is set to “total”. When branchvalues is set to “remainder” (the default), children will not take up all of the space below their parent (unless the parent is the root and it has a value of 0).

fig =go.Figure(go.Sunburst(
labels=[ "Eve", "Cain", "Seth", "Enos", "Noam", "Abel", "Awan", "Enoch", "Azura"],
parents=["", "Eve", "Eve", "Seth", "Seth", "Eve", "Eve", "Awan", "Eve" ],
values=[ 65, 14, 12, 10, 2, 6, 6, 4, 4],
branchvalues="total",
))
fig.update_layout(margin = dict(t=0, l=0, r=0, b=0))

fig.show()
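
The constraint for branchvalues "total" (children may not sum to more than their parent) is easy to check before plotting. The sketch below is a hypothetical helper, not a plotly function, and it assumes that labels are unique so they can serve as ids. It shows why the root value was raised to 65 in the example above: with the earlier value of 10, Eve's children would exceed her.

```python
def total_mode_violations(labels, parents, values):
    """Return the parents whose children sum to more than the parent's own value."""
    value_of = dict(zip(labels, values))
    child_sum = {}
    for label, parent in zip(labels, parents):
        if parent:  # skip the root, whose parent is ''
            child_sum[parent] = child_sum.get(parent, 0) + value_of[label]
    return [p for p, s in child_sum.items() if s > value_of[p]]

labels = ["Eve", "Cain", "Seth", "Enos", "Noam", "Abel", "Awan", "Enoch", "Azura"]
parents = ["", "Eve", "Eve", "Seth", "Seth", "Eve", "Eve", "Awan", "Eve"]

ok = total_mode_violations(labels, parents, [65, 14, 12, 10, 2, 6, 6, 4, 4])
bad = total_mode_violations(labels, parents, [10, 14, 12, 10, 2, 6, 6, 4, 4])
print(ok, bad)
```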

Large Number of Slices

This example uses a plotly grid attribute for the subplots. Reference the row and column destination using the domain attribute.

Below we load two CSV files and plot them.

We specify ids, labels and parents, and use the domain attribute to place each trace in its subplot cell.

import pandas as pd

df1 = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/718417069ead87650b90472464c7565dc8c2cb1c/sunburst-coffee-flavors-complete.csv')
df2 = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/718417069ead87650b90472464c7565dc8c2cb1c/coffee-flavors.csv')

fig = go.Figure()

fig.add_trace(go.Sunburst(
ids=df1.ids,
labels=df1.labels,
parents=df1.parents,
domain=dict(column=0)
))

fig.add_trace(go.Sunburst(
ids=df2.ids,
labels=df2.labels,
parents=df2.parents,
domain=dict(column=1),
maxdepth=2
))

fig.update_layout(
grid= dict(columns=2, rows=1),
margin = dict(t=0, l=0, r=0, b=0)
)

fig.show()

Controlling text orientation inside sunburst sectors

The insidetextorientation attribute controls the orientation of text inside sectors. With “auto” the texts may automatically be rotated to fit with the maximum size inside the slice. Using “horizontal” (resp. “radial”, “tangential”) forces text to be horizontal (resp. radial or tangential). Note that plotly may reduce the font size in order to fit the text with the requested orientation.

For a figure fig created with plotly express, use fig.update_traces(insidetextorientation='...') to change the text orientation.

Now we polish the right-hand chart from above by setting insidetextorientation='horizontal' so that the text is laid out horizontally, which improves readability.

df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/718417069ead87650b90472464c7565dc8c2cb1c/coffee-flavors.csv')

fig = go.Figure()

fig.add_trace(go.Sunburst(
ids=df.ids,
labels=df.labels,
parents=df.parents,
domain=dict(column=1),
maxdepth=2,
insidetextorientation='horizontal'
))

fig.update_layout(
margin = dict(t=10, l=10, r=10, b=10),
uniformtext=dict(minsize=16, mode='hide'),height=1000,width=1000
)

fig.show()

Controlling text fontsize with uniformtext

If you want all the text labels to have the same size, you can use the uniformtext layout parameter. The minsize attribute sets the font size, and the mode attribute sets what happens for labels which cannot fit with the desired fontsize: either hide them or show them with overflow.

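The coffee-flavor chart above is densely packed. We can set its uniformtext attribute with mode='hide', so that only the labels that fit are shown; the others are hidden for now and appear once there is enough room.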

df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/718417069ead87650b90472464c7565dc8c2cb1c/sunburst-coffee-flavors-complete.csv')

fig = go.Figure(go.Sunburst(
ids = df.ids,
labels = df.labels,
parents = df.parents))
fig.update_layout(uniformtext=dict(minsize=10, mode='hide'))
fig.show()

                        ids            labels         parents
0                    Aromas            Aromas             NaN
1                    Tastes            Tastes             NaN
2          Aromas-Enzymatic         Enzymatic          Aromas
3     Aromas-Sugar Browning    Sugar Browning          Aromas
4   Aromas-Dry Distillation  Dry Distillation          Aromas
..                      ...               ...             ...
91            Pungent-Thyme             Thyme   Spicy-Pungent
92             Smokey-Tarry             Tarry  Carbony-Smokey
93      Smokey-Pipe Tobacco      Pipe Tobacco  Carbony-Smokey
94               Ashy-Burnt             Burnt    Carbony-Ashy
95             Ashy-Charred           Charred    Carbony-Ashy

Sunburst chart with a continuous colorscale

The example below visualizes a breakdown of sales (corresponding to sector width) and call success rate (corresponding to sector color) by region, county and salesperson level. For example, when exploring the data you can see that although the East region is behaving poorly, the Tyler county is still above average — however, its performance is reduced by the poor success rate of salesperson GT.

In the right subplot which has a maxdepth of two levels, click on a sector to see its breakdown to lower levels.

from plotly.subplots import make_subplots

df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/sales_success.csv')
print(df.head())

levels = ['salesperson', 'county', 'region']
# levels used for the hierarchical chart
color_columns = ['sales', 'calls']
value_column = 'calls'

def build_hierarchical_dataframe(
        df,
        levels,
        value_column,
        color_columns=None):
    """
    Build a hierarchy of levels for Sunburst or Treemap charts.

    Levels are given starting from the bottom to the top of the hierarchy,
    ie the last level corresponds to the root.
    """
    # Collect one frame per level and concatenate once at the end.
    # DataFrame.append was removed in pandas 2.0, so pd.concat is used instead.
    frames = []
    for i, level in enumerate(levels):
        df_tree = pd.DataFrame(columns=['id', 'parent', 'value', 'color'])
        # Sum the numeric columns for this level and every level above it
        dfg = df.groupby(levels[i:]).sum(numeric_only=True)
        dfg = dfg.reset_index()
        df_tree['id'] = dfg[level].copy()
        if i < len(levels) - 1:
            df_tree['parent'] = dfg[levels[i+1]].copy()
        else:
            df_tree['parent'] = 'total'
        df_tree['value'] = dfg[value_column]
        df_tree['color'] = dfg[color_columns[0]] / dfg[color_columns[1]]
        frames.append(df_tree)
    # Root node: totals over the whole dataset
    total = pd.DataFrame([dict(
        id='total',
        parent='',
        value=df[value_column].sum(),
        color=df[color_columns[0]].sum() / df[color_columns[1]].sum())])
    frames.append(total)
    return pd.concat(frames, ignore_index=True)


df_all_trees = build_hierarchical_dataframe(df, levels, value_column, color_columns)
average_score = df['sales'].sum() / df['calls'].sum()

fig = make_subplots(1, 2, specs=[[{"type": "domain"}, {"type": "domain"}]],)

fig.add_trace(go.Sunburst(
labels=df_all_trees['id'],
parents=df_all_trees['parent'],
values=df_all_trees['value'],
branchvalues='total',
marker=dict(
colors=df_all_trees['color'],
colorscale='RdBu',
cmid=average_score),
hovertemplate='<b>%{label} </b> <br> Sales: %{value}<br> Success rate: %{color:.2f}',
name=''
), 1, 1)

fig.add_trace(go.Sunburst(
labels=df_all_trees['id'],
parents=df_all_trees['parent'],
values=df_all_trees['value'],
branchvalues='total',
marker=dict(
colors=df_all_trees['color'],
colorscale='RdBu',
cmid=average_score),
hovertemplate='<b>%{label} </b> <br> Sales: %{value}<br> Success rate: %{color:.2f}',
maxdepth=2
), 1, 2)

fig.update_layout(margin=dict(t=10, b=10, r=10, l=10))
fig.show()
